From patchwork Fri Mar 24 23:25:26 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 19726
Delivered-To: mailing list libc-alpha@sourceware.org
Message-ID: <1490397926.19074.73.camel@caviumnetworks.com>
Subject: [Patch] aarch64: Thunderx specific memcpy and memmove
From: Steve Ellcey
To: libc-alpha
Cc: Siddhesh Poyarekar, Adhemerval Zanella
Date: Fri, 24 Mar 2017 16:25:26 -0700

Now that the IFUNC infrastructure for aarch64 is in place, here is a
patch that uses it to create ThunderX-specific versions of memcpy and
memmove.  This was part of my original patch before it was split in
two, and a couple of issues were raised at that time.

Siddhesh Poyarekar wanted to separate the generic and thunderx copies
of memcpy/memmove instead of using ifdefs in a combined source file.
I prefer the ifdef version as a cleaner implementation with less code
duplication, but I can change it if that is the consensus.

Also, Adhemerval Zanella did some benchmarking that showed the
prefetching done in the thunderx version might be appropriate for the
generic version.  However, if you look at the prefetching, we only do
it every other time through the loop.  This is because the loop copies
64 bytes per iteration and the ThunderX cache line size is 128 bytes.
If other aarch64 chips have a 64-byte cache line, they might want a
different prefetching setup.

If people think we should use the ThunderX version of memcpy for all
aarch64 systems, I am happy to drop this patch and create one that just
changes memcpy.S to do the ThunderX-style prefetches for all aarch64
systems.

Steve Ellcey
sellcey@cavium.com


2017-03-24  Steve Ellcey

	* sysdeps/aarch64/memcpy.S (MEMMOVE, MEMCPY): New macros.
	(memmove): Use MEMMOVE for name.
	(memcpy): Use MEMCPY for name.  Add loop with prefetching
	under USE_THUNDERX macro.
	* sysdeps/aarch64/multiarch/Makefile: New file.
	* sysdeps/aarch64/multiarch/ifunc-impl-list.c: Likewise.
	* sysdeps/aarch64/multiarch/init-arch.h: Likewise.
	* sysdeps/aarch64/multiarch/memcpy.c: Likewise.
	* sysdeps/aarch64/multiarch/memcpy_generic.S: Likewise.
	* sysdeps/aarch64/multiarch/memcpy_thunderx.S: Likewise.
	* sysdeps/aarch64/multiarch/memmove.c: Likewise.

diff --git a/sysdeps/aarch64/memcpy.S b/sysdeps/aarch64/memcpy.S
index 29af8b1..74444b4 100644
--- a/sysdeps/aarch64/memcpy.S
+++ b/sysdeps/aarch64/memcpy.S
@@ -59,7 +59,14 @@
    Overlapping large forward memmoves use a loop that copies backwards.
 */
 
-ENTRY_ALIGN (memmove, 6)
+#ifndef MEMMOVE
+# define MEMMOVE memmove
+#endif
+#ifndef MEMCPY
+# define MEMCPY memcpy
+#endif
+
+ENTRY_ALIGN (MEMMOVE, 6)
 
 	DELOUSE (0)
 	DELOUSE (1)
@@ -71,9 +78,9 @@ ENTRY_ALIGN (memmove, 6)
 	b.lo	L(move_long)
 
 	/* Common case falls through into memcpy.  */
-END (memmove)
-libc_hidden_builtin_def (memmove)
-ENTRY (memcpy)
+END (MEMMOVE)
+libc_hidden_builtin_def (MEMMOVE)
+ENTRY (MEMCPY)
 
 	DELOUSE (0)
 	DELOUSE (1)
@@ -158,10 +165,22 @@ L(copy96):
 
 	.p2align 4
 L(copy_long):
+
+#ifdef USE_THUNDERX
+
+	/* On thunderx, large memcpy's are helped by software prefetching.
+	   This loop is identical to the one below it but with prefetching
+	   instructions included.  For loops that are less than 32768 bytes,
+	   the prefetching does not help and slows the code down so we only
+	   use the prefetching loop for the largest memcpys.  */
+
+	cmp	count, #32768
+	b.lo	L(copy_long_without_prefetch)
 	and	tmp1, dstin, 15
 	bic	dst, dstin, 15
 	ldp	D_l, D_h, [src]
 	sub	src, src, tmp1
+	prfm	pldl1strm, [src, 384]
 	add	count, count, tmp1	/* Count is now 16 too large.  */
 	ldp	A_l, A_h, [src, 16]
 	stp	D_l, D_h, [dstin]
@@ -169,7 +188,10 @@ L(copy_long):
 	ldp	C_l, C_h, [src, 48]
 	ldp	D_l, D_h, [src, 64]!
 	subs	count, count, 128 + 16	/* Test and readjust count.  */
-	b.ls	2f
+
+L(prefetch_loop64):
+	tbz	src, #6, 1f
+	prfm	pldl1strm, [src, 512]
 1:
 	stp	A_l, A_h, [dst, 16]
 	ldp	A_l, A_h, [src, 16]
@@ -180,12 +202,40 @@ L(copy_long):
 	stp	D_l, D_h, [dst, 64]!
 	ldp	D_l, D_h, [src, 64]!
 	subs	count, count, 64
-	b.hi	1b
+	b.hi	L(prefetch_loop64)
+	b	L(last64)
+
+L(copy_long_without_prefetch):
+#endif
+
+	and	tmp1, dstin, 15
+	bic	dst, dstin, 15
+	ldp	D_l, D_h, [src]
+	sub	src, src, tmp1
+	add	count, count, tmp1	/* Count is now 16 too large.  */
+	ldp	A_l, A_h, [src, 16]
+	stp	D_l, D_h, [dstin]
+	ldp	B_l, B_h, [src, 32]
+	ldp	C_l, C_h, [src, 48]
+	ldp	D_l, D_h, [src, 64]!
+	subs	count, count, 128 + 16	/* Test and readjust count.  */
+	b.ls	L(last64)
+L(loop64):
+	stp	A_l, A_h, [dst, 16]
+	ldp	A_l, A_h, [src, 16]
+	stp	B_l, B_h, [dst, 32]
+	ldp	B_l, B_h, [src, 32]
+	stp	C_l, C_h, [dst, 48]
+	ldp	C_l, C_h, [src, 48]
+	stp	D_l, D_h, [dst, 64]!
+	ldp	D_l, D_h, [src, 64]!
+	subs	count, count, 64
+	b.hi	L(loop64)
 
 	/* Write the last full set of 64 bytes.  The remainder is at most 64
 	   bytes, so it is safe to always copy 64 bytes from the end even if
 	   there is just 1 byte left.  */
-2:
+L(last64):
 	ldp	E_l, E_h, [srcend, -64]
 	stp	A_l, A_h, [dst, 16]
 	ldp	A_l, A_h, [srcend, -48]
@@ -256,5 +306,5 @@ L(move_long):
 	stp	C_l, C_h, [dstin]
 3:	ret
 
-END (memcpy)
-libc_hidden_builtin_def (memcpy)
+END (MEMCPY)
+libc_hidden_builtin_def (MEMCPY)
diff --git a/sysdeps/aarch64/multiarch/Makefile b/sysdeps/aarch64/multiarch/Makefile
index e69de29..78d52c7 100644
--- a/sysdeps/aarch64/multiarch/Makefile
+++ b/sysdeps/aarch64/multiarch/Makefile
@@ -0,0 +1,3 @@
+ifeq ($(subdir),string)
+sysdep_routines += memcpy_generic memcpy_thunderx
+endif
diff --git a/sysdeps/aarch64/multiarch/ifunc-impl-list.c b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
index e69de29..c4f23df 100644
--- a/sysdeps/aarch64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
@@ -0,0 +1,51 @@
+/* Enumerate available IFUNC implementations of a function.  AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* Maximum number of IFUNC implementations.  */
+#define MAX_IFUNC	2
+
+size_t
+__libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
+			size_t max)
+{
+  assert (max >= MAX_IFUNC);
+
+  size_t i = 0;
+
+  INIT_ARCH ();
+
+  /* Support sysdeps/aarch64/multiarch/memcpy.c and memmove.c.  */
+  IFUNC_IMPL (i, name, memcpy,
+	      IFUNC_IMPL_ADD (array, i, memcpy, IS_THUNDERX (midr),
+			      __memcpy_thunderx)
+	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
+  IFUNC_IMPL (i, name, memmove,
+	      IFUNC_IMPL_ADD (array, i, memmove, IS_THUNDERX (midr),
+			      __memmove_thunderx)
+	      IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_generic))
+
+  return i;
+}
diff --git a/sysdeps/aarch64/multiarch/init-arch.h b/sysdeps/aarch64/multiarch/init-arch.h
index e69de29..e690e00 100644
--- a/sysdeps/aarch64/multiarch/init-arch.h
+++ b/sysdeps/aarch64/multiarch/init-arch.h
@@ -0,0 +1,22 @@
+/* This file is part of the GNU C Library.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+#define INIT_ARCH()				\
+  uint64_t __attribute__((unused)) midr =	\
+    GLRO(dl_aarch64_cpu_features).midr_el1;
diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
index e69de29..4e3f251 100644
--- a/sysdeps/aarch64/multiarch/memcpy.c
+++ b/sysdeps/aarch64/multiarch/memcpy.c
@@ -0,0 +1,39 @@
+/* Multiple versions of memcpy.  AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* Define multiple versions only for the definition in libc.  */
+
+#if IS_IN (libc)
+/* Redefine memcpy so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include
+# include
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_thunderx attribute_hidden;
+
+libc_ifunc (__libc_memcpy,
+	    IS_THUNDERX (midr) ? __memcpy_thunderx : __memcpy_generic);
+
+#undef memcpy
+strong_alias (__libc_memcpy, memcpy);
+#endif
diff --git a/sysdeps/aarch64/multiarch/memcpy_generic.S b/sysdeps/aarch64/multiarch/memcpy_generic.S
index e69de29..50e1a1c 100644
--- a/sysdeps/aarch64/multiarch/memcpy_generic.S
+++ b/sysdeps/aarch64/multiarch/memcpy_generic.S
@@ -0,0 +1,42 @@
+/* A Generic Optimized memcpy implementation for AARCH64.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* The actual memcpy and memmove code is in ../memcpy.S.  If we are
+   building libc this file defines __memcpy_generic and __memmove_generic.
+   Otherwise the include of ../memcpy.S will define the normal __memcpy
+   and __memmove entry points.  */
+
+#include
+
+#if IS_IN (libc)
+
+#define MEMCPY __memcpy_generic
+#define MEMMOVE __memmove_generic
+
+/* Do not hide the generic versions of memcpy and memmove, we use them
+   internally.  */
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+/* It doesn't make sense to send libc-internal memcpy calls through a PLT.  */
+	.globl __GI_memcpy; __GI_memcpy = __memcpy_generic
+	.globl __GI_memmove; __GI_memmove = __memmove_generic
+
+#endif
+
+#include "../memcpy.S"
diff --git a/sysdeps/aarch64/multiarch/memcpy_thunderx.S b/sysdeps/aarch64/multiarch/memcpy_thunderx.S
index e69de29..ee971c8 100644
--- a/sysdeps/aarch64/multiarch/memcpy_thunderx.S
+++ b/sysdeps/aarch64/multiarch/memcpy_thunderx.S
@@ -0,0 +1,32 @@
+/* A Thunderx Optimized memcpy implementation for AARCH64.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* The actual thunderx optimized code is in ../memcpy.S under the USE_THUNDERX
+   ifdef.  If we are not building libc then we do not build anything when
+   compiling this file and __memcpy is defined by memcpy_generic.S.  */
+
+#include
+
+#if IS_IN (libc)
+
+#define MEMCPY __memcpy_thunderx
+#define MEMMOVE __memmove_thunderx
+#define USE_THUNDERX
+#include "../memcpy.S"
+
+#endif
diff --git a/sysdeps/aarch64/multiarch/memmove.c b/sysdeps/aarch64/multiarch/memmove.c
index e69de29..8d7a146 100644
--- a/sysdeps/aarch64/multiarch/memmove.c
+++ b/sysdeps/aarch64/multiarch/memmove.c
@@ -0,0 +1,39 @@
+/* Multiple versions of memmove.  AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* Define multiple versions only for the definition in libc.  */
+
+#if IS_IN (libc)
+/* Redefine memmove so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memmove
+# define memmove __redirect_memmove
+# include
+# include
+
+extern __typeof (__redirect_memmove) __libc_memmove;
+
+extern __typeof (__redirect_memmove) __memmove_generic attribute_hidden;
+extern __typeof (__redirect_memmove) __memmove_thunderx attribute_hidden;
+
+libc_ifunc (__libc_memmove,
+	    IS_THUNDERX (midr) ? __memmove_thunderx : __memmove_generic);
+
+#undef memmove
+strong_alias (__libc_memmove, memmove);
+#endif
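
For readers who do not read aarch64 assembly, the "prefetch every other time through the loop" behaviour described in the cover letter can be sketched in C. This is only an illustration under stated assumptions, not the code in the patch: the function name copy_with_sparse_prefetch and the three constants are hypothetical names, __builtin_prefetch is the GCC/Clang builtin standing in for the prfm instruction, and the extra bound check that keeps the prefetch address inside the source buffer is an addition of this sketch (the assembly does not have it).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOOP_BYTES     64     /* bytes copied per loop iteration, as in the patch */
#define PREFETCH_AHEAD 512    /* prefetch distance used inside the patch's loop */
#define PREFETCH_MIN   32768  /* the patch skips prefetching below this size */

/* Hypothetical helper: copies n bytes from src to dst and returns the
   number of prefetches issued, so the pattern can be observed.  */
static size_t
copy_with_sparse_prefetch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t prefetches = 0;
  int use_prefetch = n >= PREFETCH_MIN;

  while (n >= LOOP_BYTES)
    {
      /* Mirrors "tbz src, #6, 1f": the prefetch is skipped when bit 6 of
         the source address is clear.  The address advances by 64 each
         iteration, so bit 6 toggles and only every second iteration
         prefetches, i.e. once per 128 bytes.  The n > PREFETCH_AHEAD
         test is specific to this sketch and keeps s + PREFETCH_AHEAD
         inside the source buffer.  */
      if (use_prefetch && ((uintptr_t) s & 64) != 0 && n > PREFETCH_AHEAD)
        {
          __builtin_prefetch (s + PREFETCH_AHEAD, 0, 0);
          prefetches++;
        }
      memcpy (d, s, LOOP_BYTES);
      d += LOOP_BYTES;
      s += LOOP_BYTES;
      n -= LOOP_BYTES;
    }

  memcpy (d, s, n);  /* Remaining tail of fewer than 64 bytes.  */
  return prefetches;
}
```

On a core with 64-byte cache lines the same sketch would presumably drop the bit-6 test and prefetch on every iteration, which is the "different prefetching setup" the cover letter mentions.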
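
The selection done by libc_ifunc and IS_THUNDERX (midr) in memcpy.c and memmove.c can be illustrated outside of glibc with a plain function-pointer selector. This is a sketch under assumptions: the definition of IS_THUNDERX is not part of this patch, so the check below assumes the architectural MIDR_EL1 layout (implementer in bits [31:24], part number in bits [15:4]) together with implementer 'C' and part number 0x0a1 for ThunderX. The stub functions and select_memcpy are hypothetical names, and no real IFUNC relocation is involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed MIDR_EL1 field layout: implementer in bits [31:24],
   part number in bits [15:4].  */
#define MIDR_IMPLEMENTER(midr) (((midr) >> 24) & 0xff)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)

/* Assumed identification values for ThunderX: implementer 'C' (0x43)
   and part number 0x0a1.  Not taken from the patch.  */
static int
is_thunderx (uint64_t midr)
{
  return MIDR_IMPLEMENTER (midr) == 'C' && MIDR_PARTNUM (midr) == 0x0a1;
}

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Hypothetical stand-ins for __memcpy_generic and __memcpy_thunderx.  */
static void *
memcpy_generic_stub (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

static void *
memcpy_thunderx_stub (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Same decision as the libc_ifunc expression in the patch, made once
   from a MIDR value passed in by the caller.  */
static memcpy_fn
select_memcpy (uint64_t midr)
{
  return is_thunderx (midr) ? memcpy_thunderx_stub : memcpy_generic_stub;
}
```

In the patch itself the resolver runs inside the dynamic loader and reads midr from GLRO(dl_aarch64_cpu_features) via INIT_ARCH; the sketch takes the value as a parameter so it can be exercised on any host.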