Add ifunc memcpy and memmove for aarch64

Message ID 1484850162.4759.24.camel@caviumnetworks.com
State New, archived

Steve Ellcey, Jan. 19, 2017, 6:22 p.m. UTC

This patch adds ifunc versions of memcpy and memmove for aarch64.  I
know this isn't appropriate for 2.25, but I wanted to submit it and get
it reviewed for 2.26.  The basic change is to add software prefetching
to large memcpys on thunderx, which can speed up those routines by
around 2X.  For memcpys under 32K bytes I found that software
prefetching did not help (and sometimes hurt).  I wasn't really
interested in speeding up memmove, but since memcpy and memmove are
implemented in one file it seemed easier to make memmove an ifunc
along with memcpy rather than try to split them up.  memmove does get
a speedup when it uses the memcpy code.
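
As a rough illustration of the idea (the patch itself does this in
assembly in sysdeps/aarch64/memcpy.S), here is a minimal C sketch of a
copy loop that issues prefetch hints only for large sizes.  The name
copy_with_prefetch, the 512-byte prefetch distance and the 64-byte
chunk size are assumptions made for the example; only the 32K
threshold comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tuning constants; only the 32K threshold is taken from
   the patch description.  */
#define PREFETCH_THRESHOLD (32 * 1024)
#define PREFETCH_DISTANCE  512  /* bytes ahead of the read pointer (assumed) */
#define CHUNK              64   /* bytes copied per iteration (assumed) */

/* Copy N bytes from SRC to DST (the regions must not overlap),
   prefetching ahead of the read pointer when the copy is large.  */
void *
copy_with_prefetch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Small copies: prefetching did not pay off, so use the plain path.  */
  if (n < PREFETCH_THRESHOLD)
    return memcpy (dst, src, n);

  assert (dst != NULL && src != NULL);

  while (n >= CHUNK)
    {
      /* Hint that data PREFETCH_DISTANCE bytes ahead will be read soon.
         Skip the hint near the end so the address stays inside SRC.  */
      if (n > PREFETCH_DISTANCE)
        __builtin_prefetch (s + PREFETCH_DISTANCE, 0, 0);
      memcpy (d, s, CHUNK);
      d += CHUNK;
      s += CHUNK;
      n -= CHUNK;
    }

  /* Copy the tail that does not fill a whole chunk.  */
  memcpy (d, s, n);
  return dst;
}
```

The size check keeps the small-copy path free of prefetch overhead,
which matches the observation that prefetching sometimes hurt below
32K.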

The ifunc code depends on the mrs instruction, which is a privileged
instruction, but the 4.11 version of the Linux kernel will have
emulation for it (https://lkml.org/lkml/2017/1/10/816).  Since it is
emulated, I added code to save its value rather than read it every time
we execute an ifunc selection function.  I also saved a flag that
records whether the platform is thunderx, so that glibc does not have
to do multiple logical operations on the mrs value in each ifunc
selection function to determine whether it is on a thunderx platform.
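
The caching scheme can be sketched in C roughly as follows.  This is
not the code from the patch: the names init_arch, select_memcpy and the
two stub implementations are hypothetical, a plain function pointer
stands in for a real ifunc resolver, and I am assuming the register
read is midr_el1 with ThunderX identified by implementer 'C' and part
number 0x0a1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field extraction for the MIDR_EL1 layout: implementer in bits
   [31:24], part number in bits [15:4].  */
#define MIDR_IMPLEMENTER(midr) (((midr) >> 24) & 0xff)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)

/* Assumed identification values for ThunderX.  */
#define IS_THUNDERX(midr) \
  (MIDR_IMPLEMENTER (midr) == 'C' && MIDR_PARTNUM (midr) == 0x0a1)

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Hypothetical cache: the raw register value plus the derived flag.  */
static uint64_t cached_midr;
static int cached_is_thunderx;
static int cache_valid;

/* Stand-ins for the two implementations; both just call memcpy.  */
void *memcpy_thunderx_stub (void *d, const void *s, size_t n)
{ return memcpy (d, s, n); }
void *memcpy_generic_stub (void *d, const void *s, size_t n)
{ return memcpy (d, s, n); }

/* Read the ID register once.  On other hosts return 0 so that the
   sketch still builds; 0 never matches ThunderX.  */
uint64_t read_midr (void)
{
#ifdef __aarch64__
  uint64_t midr;
  __asm__ ("mrs %0, midr_el1" : "=r" (midr));
  return midr;
#else
  return 0;
#endif
}

/* Save the register value and compute the thunderx flag once.  */
void init_arch (uint64_t midr)
{
  cached_midr = midr;
  cached_is_thunderx = IS_THUNDERX (midr);
  cache_valid = 1;
}

/* Selection function: only tests the saved flag, with no further
   register reads or bit operations.  */
memcpy_fn select_memcpy (void)
{
  if (!cache_valid)
    init_arch (read_midr ());
  assert (cache_valid);
  return cached_is_thunderx ? memcpy_thunderx_stub : memcpy_generic_stub;
}
```

With this shape each selection function costs one load and one
compare, and the (emulated, hence slow) mrs is executed at most once
per process.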

I have attached the bench-memcpy.out, bench-memcpy-large.out,
bench-memmove.out and bench-memmove-large.out files to show the
performance difference.  Most of the difference is seen in the large
versions, as the smaller ones only use prefetching on a couple of
inputs.
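
To read the attached numbers as speedups, divide the generic time by
the thunderx time for the same row.  A minimal helper, assuming both
columns are mean times in the same units (the function name speedup is
made up for this example):

```c
#include <assert.h>

/* Speedup of a candidate over a baseline, given times in the same
   units; values above 1.0 mean the candidate is faster.  */
double
speedup (double baseline_time, double candidate_time)
{
  assert (candidate_time > 0.0);
  return baseline_time / candidate_time;
}
```

For example, the length 65536, alignment 0/0 row (15286.1 for
__memcpy_generic against 7863.91 for __memcpy_thunderx) gives a ratio
of about 1.9, and the length 33554439, alignment 0/0 row (2.62947e+07
against 7.34141e+06) gives about 3.6.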

Steve Ellcey
sellcey@caviumnetworks.com


2017-01-19  Steve Ellcey  <sellcey@caviumnetworks.com>

	* sysdeps/aarch64/memcpy.S (MEMMOVE, MEMCPY): New macros.
	(memmove): Use MEMMOVE for name.
	(memcpy): Use MEMCPY for name.  Add loop with prefetching
	under USE_THUNDERX macro.
	* sysdeps/aarch64/multiarch/Makefile: New file.
	* sysdeps/aarch64/multiarch/ifunc-impl-list.c: Ditto.
	* sysdeps/aarch64/multiarch/init-arch.h: Ditto.
	* sysdeps/aarch64/multiarch/memcpy.c: Ditto.
	* sysdeps/aarch64/multiarch/memcpy_generic.S: Ditto.
	* sysdeps/aarch64/multiarch/memcpy_thunderx.S: Ditto.
	* sysdeps/unix/sysv/linux/aarch64/configure.ac (arch_minimum_kernel):
	Set to 4.11.0 if building with multi_arch.
	* sysdeps/unix/sysv/linux/aarch64/configure: Regenerate.

bench-memcpy.out:

	builtin_memcpy	simple_memcpy	__memcpy_thunderx	__memcpy_generic
Length    1, alignment  0/ 0:	40.4688	21.25	21.0938	21.5625
Length    1, alignment  0/ 0:	24.0625	21.5625	21.4062	21.0938
Length    1, alignment  0/ 0:	23.9062	15.7812	20.7812	20.9375
Length    1, alignment  0/ 0:	24.0625	15.7812	20.7812	20.9375
Length    2, alignment  0/ 0:	24.0625	24.375	20.9375	20.9375
Length    2, alignment  1/ 0:	24.0625	23.75	20.7812	20.9375
Length    2, alignment  0/ 1:	24.0625	22.9688	20.9375	20.9375
Length    2, alignment  1/ 1:	24.0625	22.9688	20.7812	20.7812
Length    4, alignment  0/ 0:	22.9688	24.6875	19.0625	19.5312
Length    4, alignment  2/ 0:	22.0312	23.9062	18.9062	19.2188
Length    4, alignment  0/ 2:	22.0312	23.4375	18.9062	18.9062
Length    4, alignment  2/ 2:	22.0312	23.4375	18.9062	18.9062
Length    8, alignment  0/ 0:	21.875	35.3125	17.9688	18.125
Length    8, alignment  3/ 0:	30.3125	34.375	26.0938	26.25
Length    8, alignment  0/ 3:	31.875	33.5938	27.8125	27.6562
Length    8, alignment  3/ 3:	39.375	33.5938	35.1562	35.3125
Length   16, alignment  0/ 0:	20.7812	67.9688	17.6562	17.6562
Length   16, alignment  4/ 0:	30.3125	67.5	26.25	26.25
Length   16, alignment  0/ 4:	31.875	67.3438	27.6562	27.6562
Length   16, alignment  4/ 4:	39.375	67.5	35.1562	35.3125
Length   32, alignment  0/ 0:	22.3438	99.375	17.6562	17.6562
Length   32, alignment  5/ 0:	31.5625	99.2188	27.3438	27.3438
Length   32, alignment  0/ 5:	31.5625	99.2188	27.3438	27.3438
Length   32, alignment  5/ 5:	40.4688	99.2188	36.4062	36.4062
Length   64, alignment  0/ 0:	23.9062	179.219	19.6875	19.375
Length   64, alignment  6/ 0:	40.3125	179.219	36.0938	36.0938
Length   64, alignment  0/ 6:	41.7188	179.375	37.6562	37.5
Length   64, alignment  6/ 6:	57.5	179.219	53.4375	53.4375
Length  128, alignment  0/ 0:	30.4688	339.219	27.0312	25.3125
Length  128, alignment  7/ 0:	67.8125	339.219	64.375	63.125
Length  128, alignment  0/ 7:	69.8438	339.375	66.4062	64.375
Length  128, alignment  7/ 7:	72.3438	339.375	68.9062	67.3438
Length  256, alignment  0/ 0:	40.1562	659.219	35.9375	37.3438
Length  256, alignment  8/ 0:	108.594	659.219	105.156	106.094
Length  256, alignment  0/ 8:	110.469	659.375	107.188	107.5
Length  256, alignment  8/ 8:	81.4062	659.375	78.125	80
Length  512, alignment  0/ 0:	57.9688	1299.38	53.9062	52.3438
Length  512, alignment  9/ 0:	194.844	1299.53	191.719	190.156
Length  512, alignment  0/ 9:	196.875	1299.53	193.75	191.719
Length  512, alignment  9/ 9:	99.375	1299.53	96.0938	94.5312
Length 1024, alignment  0/ 0:	97.0312	2579.53	93.2812	91.4062
Length 1024, alignment 10/ 0:	372.812	2579.53	369.688	368.125
Length 1024, alignment  0/10:	375.156	2579.38	371.562	369.531
Length 1024, alignment 10/10:	139.062	2579.38	135.469	134.219
Length 2048, alignment  0/ 0:	194.375	5393.91	193.281	188.125
Length 2048, alignment 11/ 0:	713.906	5139.84	710.781	709.375
Length 2048, alignment  0/11:	715.938	5139.84	712.812	710.938
Length 2048, alignment 11/11:	235.781	5139.69	233.281	230.156
Length 4096, alignment  0/ 0:	369.062	10260.9	366.406	360.781
Length 4096, alignment 12/ 0:	1403.44	10260.9	1400.16	1398.59
Length 4096, alignment  0/12:	1405.16	10435.2	1402.66	1399.84
Length 4096, alignment 12/12:	410.781	10260.3	408.125	402.344
Length 8192, alignment  0/ 0:	717.344	20503.6	714.844	703.906
Length 8192, alignment 13/ 0:	2781.41	20740.5	2778.12	2776.72
Length 8192, alignment  0/13:	2783.28	20503.1	2779.38	2776.56
Length 8192, alignment 13/13:	759.375	20654.2	757.031	746.406
Length 16384, alignment  0/ 0:	1430.16	41006.7	1429.84	1408.59
Length 16384, alignment 14/ 0:	5563.91	41004.4	5550.16	5734.38
Length 16384, alignment  0/14:	5549.06	41002.2	5547.03	5693.75
Length 16384, alignment 14/14:	1461.88	40995.2	1461.41	1440.47
Length 32768, alignment  0/ 0:	3426.72	82858.4	3555.16	4450.62
Length 32768, alignment 15/ 0:	12141.2	83634.5	12128.8	12733.9
Length 32768, alignment  0/15:	12190.8	83705	12164.4	12682.2
Length 32768, alignment 15/15:	3548.28	83734.7	3546.41	3970
Length 65536, alignment  0/ 0:	7917.97	174364	7863.91	15286.1
Length 65536, alignment 16/ 0:	8100.16	174329	7898.44	15577.5
Length 65536, alignment  0/16:	7956.56	174425	7906.09	15322
Length 65536, alignment 16/16:	7952.34	174475	7907.34	15613.8
Length    0, alignment  0/ 0:	25.3125	17.0312	21.7188	21.7188
Length    0, alignment  0/ 0:	24.375	16.5625	21.4062	21.5625
Length    0, alignment  0/ 0:	24.375	16.4062	21.4062	21.25
Length    0, alignment  0/ 0:	24.5312	16.4062	21.25	21.4062
Length    1, alignment  0/ 0:	24.375	16.4062	21.25	20.9375
Length    1, alignment  1/ 0:	24.0625	15.9375	20.9375	21.0938
Length    1, alignment  0/ 1:	24.0625	15.9375	20.9375	21.0938
Length    1, alignment  1/ 1:	24.2188	15.9375	20.9375	20.9375
Length    2, alignment  0/ 0:	24.0625	23.4375	20.9375	21.0938
Length    2, alignment  2/ 0:	24.0625	23.4375	20.9375	21.0938
Length    2, alignment  0/ 2:	24.0625	23.125	20.9375	21.0938
Length    2, alignment  2/ 2:	24.0625	23.2812	20.9375	21.0938
Length    3, alignment  0/ 0:	24.0625	22.5	20.9375	20.9375
Length    3, alignment  3/ 0:	24.0625	21.875	20.9375	20.9375
Length    3, alignment  0/ 3:	24.0625	20.7812	20.9375	20.9375
Length    3, alignment  3/ 3:	24.2188	20.9375	20.9375	20.9375
Length    4, alignment  0/ 0:	22.3438	23.75	19.0625	18.9062
Length    4, alignment  4/ 0:	22.1875	23.5938	18.9062	19.0625
Length    4, alignment  0/ 4:	22.0312	23.4375	19.0625	18.9062
Length    4, alignment  4/ 4:	22.0312	23.4375	18.9062	19.0625
Length    5, alignment  0/ 0:	22.0312	27.1875	18.9062	18.9062
Length    5, alignment  5/ 0:	30.625	26.4062	27.5	27.3438
Length    5, alignment  0/ 5:	32.1875	25.9375	28.9062	29.0625
Length    5, alignment  5/ 5:	39.6875	25.9375	36.25	36.4062
Length    6, alignment  0/ 0:	22.0312	29.375	18.9062	18.9062
Length    6, alignment  6/ 0:	26.0938	28.75	22.9688	22.8125
Length    6, alignment  0/ 6:	27.5	28.5938	24.375	24.5312
Length    6, alignment  6/ 6:	31.0938	28.4375	27.9688	27.9688
Length    7, alignment  0/ 0:	21.875	32.5	18.9062	18.9062
Length    7, alignment  7/ 0:	25.9375	31.5625	22.9688	22.9688
Length    7, alignment  0/ 7:	27.5	30.9375	24.375	24.375
Length    7, alignment  7/ 7:	31.0938	31.0938	27.9688	27.9688
Length    8, alignment  0/ 0:	21.4062	34.0625	17.9688	17.8125
Length    8, alignment  8/ 0:	20.9375	33.9062	17.6562	17.8125
Length    8, alignment  0/ 8:	20.9375	33.75	17.8125	17.8125
Length    8, alignment  8/ 8:	20.9375	33.75	17.6562	17.8125
Length    9, alignment  0/ 0:	31.4062	37.0312	27.1875	27.3438
Length    9, alignment  9/ 0:	35.4688	36.5625	31.0938	31.25
Length    9, alignment  0/ 9:	35.4688	36.25	31.0938	31.4062
Length    9, alignment  9/ 9:	39.375	36.4062	35.3125	35.3125
Length   10, alignment  0/ 0:	31.4062	39.2188	27.3438	27.1875
Length   10, alignment 10/ 0:	35.3125	38.9062	31.25	31.25
Length   10, alignment  0/10:	35.4688	38.75	31.25	31.25
Length   10, alignment 10/10:	39.375	173.438	35.625	35.3125
Length   11, alignment  0/ 0:	31.4062	41.4062	27.3438	27.3438
Length   11, alignment 11/ 0:	35.3125	41.25	31.25	31.25
Length   11, alignment  0/11:	35.4688	41.0938	31.25	31.25
Length   11, alignment 11/11:	39.5312	41.0938	35.3125	35.3125
Length   12, alignment  0/ 0:	31.4062	44.0625	27.1875	27.1875
Length   12, alignment 12/ 0:	31.875	43.75	27.8125	27.8125
Length   12, alignment  0/12:	30.9375	43.5938	26.875	26.7188
Length   12, alignment 12/12:	30.9375	43.5938	26.7188	26.7188
Length   13, alignment  0/ 0:	31.4062	46.25	27.3438	27.1875
Length   13, alignment 13/ 0:	35.4688	46.25	31.25	31.25
Length   13, alignment  0/13:	35.3125	46.0938	31.25	31.25
Length   13, alignment 13/13:	39.375	46.25	35.3125	35.3125
Length   14, alignment  0/ 0:	31.4062	52.6562	27.3438	27.1875
Length   14, alignment 14/ 0:	35.4688	52.5	31.25	31.4062
Length   14, alignment  0/14:	35.3125	52.5	31.25	31.25
Length   14, alignment 14/14:	39.375	52.5	35.1562	35.3125
Length   15, alignment  0/ 0:	31.4062	65	27.3438	27.3438
Length   15, alignment 15/ 0:	35.3125	64.8438	31.25	31.25
Length   15, alignment  0/15:	35.4688	64.8438	31.0938	31.25
Length   15, alignment 15/15:	39.5312	64.8438	35.3125	35.3125
Length   16, alignment  0/ 0:	20.9375	67.5	17.6562	17.6562
Length   16, alignment 16/ 0:	23.2812	67.5	17.8125	17.6562
Length   16, alignment  0/16:	20.9375	67.3438	17.6562	17.6562
Length   16, alignment 16/16:	20.9375	67.3438	17.8125	17.6562
Length   17, alignment  0/ 0:	32.9688	62.0312	28.5938	28.4375
Length   17, alignment 17/ 0:	36.5625	61.875	32.5	32.5
Length   17, alignment  0/17:	36.5625	61.875	32.5	32.3438
Length   17, alignment 17/17:	40.625	61.875	36.4062	36.4062
Length   18, alignment  0/ 0:	32.5	64.375	28.4375	28.2812
Length   18, alignment 18/ 0:	36.5625	64.375	32.3438	32.3438
Length   18, alignment  0/18:	36.5625	64.375	32.5	32.3438
Length   18, alignment 18/18:	40.4688	64.375	36.4062	36.4062
Length   19, alignment  0/ 0:	32.5	66.875	28.4375	28.2812
Length   19, alignment 19/ 0:	36.7188	66.875	32.3438	32.5
Length   19, alignment  0/19:	36.5625	66.875	32.3438	32.3438
Length   19, alignment 19/19:	40.625	66.875	36.5625	36.4062
Length   20, alignment  0/ 0:	32.5	69.375	28.4375	28.2812
Length   20, alignment 20/ 0:	36.5625	69.375	32.5	32.3438
Length   20, alignment  0/20:	36.5625	69.375	32.5	32.3438
Length   20, alignment 20/20:	40.625	69.375	36.5625	36.25
Length   21, alignment  0/ 0:	32.5	71.875	28.4375	28.2812
Length   21, alignment 21/ 0:	36.5625	71.875	32.5	32.3438
Length   21, alignment  0/21:	36.5625	71.875	32.3438	32.3438
Length   21, alignment 21/21:	40.625	71.875	36.4062	36.4062
Length   22, alignment  0/ 0:	32.6562	74.375	28.4375	28.4375
Length   22, alignment 22/ 0:	36.5625	74.375	32.3438	32.3438
Length   22, alignment  0/22:	36.7188	74.375	32.3438	32.3438
Length   22, alignment 22/22:	40.4688	74.375	36.4062	36.4062
Length   23, alignment  0/ 0:	32.5	76.875	28.4375	28.2812
Length   23, alignment 23/ 0:	36.5625	76.875	32.3438	32.3438
Length   23, alignment  0/23:	36.5625	76.875	32.3438	32.3438
Length   23, alignment 23/23:	40.4688	76.875	36.4062	36.4062
Length   24, alignment  0/ 0:	32.6562	79.375	28.4375	28.4375
Length   24, alignment 24/ 0:	32.0312	79.375	27.9688	27.9688
Length   24, alignment  0/24:	32.0312	79.375	27.9688	27.9688
Length   24, alignment 24/24:	31.5625	79.375	27.5	27.3438
Length   25, alignment  0/ 0:	32.5	81.875	28.4375	28.4375
Length   25, alignment 25/ 0:	36.5625	81.875	32.5	32.3438
Length   25, alignment  0/25:	36.5625	81.875	32.5	32.3438
Length   25, alignment 25/25:	40.625	81.875	36.4062	36.4062
Length   26, alignment  0/ 0:	32.6562	84.375	28.4375	28.4375
Length   26, alignment 26/ 0:	36.5625	84.375	32.3438	32.3438
Length   26, alignment  0/26:	36.5625	84.2188	32.5	32.3438
Length   26, alignment 26/26:	40.625	84.375	36.4062	36.25
Length   27, alignment  0/ 0:	32.6562	86.7188	28.4375	28.2812
Length   27, alignment 27/ 0:	36.5625	86.875	32.3438	32.3438
Length   27, alignment  0/27:	36.7188	86.875	32.3438	32.3438
Length   27, alignment 27/27:	40.4688	86.875	36.5625	36.4062
Length   28, alignment  0/ 0:	32.6562	89.375	28.4375	28.4375
Length   28, alignment 28/ 0:	36.5625	89.375	32.5	32.3438
Length   28, alignment  0/28:	36.5625	89.2188	32.5	32.3438
Length   28, alignment 28/28:	40.625	89.375	36.4062	36.4062
Length   29, alignment  0/ 0:	32.6562	91.7188	28.4375	28.4375
Length   29, alignment 29/ 0:	36.5625	91.875	32.3438	32.3438
Length   29, alignment  0/29:	36.5625	91.875	32.3438	32.3438
Length   29, alignment 29/29:	40.4688	91.875	36.5625	36.4062
Length   30, alignment  0/ 0:	32.5	94.375	28.4375	28.4375
Length   30, alignment 30/ 0:	36.5625	94.375	32.3438	32.5
Length   30, alignment  0/30:	36.5625	94.375	32.3438	32.5
Length   30, alignment 30/30:	40.625	95.625	36.4062	36.4062
Length   31, alignment  0/ 0:	32.5	96.875	28.4375	28.2812
Length   31, alignment 31/ 0:	36.5625	96.7188	32.5	32.3438
Length   31, alignment  0/31:	36.5625	96.875	32.3438	32.3438
Length   31, alignment 31/31:	40.4688	96.875	36.4062	36.4062
Length   48, alignment  0/ 0:	22.9688	139.219	19.2188	19.2188
Length   48, alignment  3/ 0:	40.1562	139.219	36.25	36.0938
Length   48, alignment  0/ 3:	41.7188	139.375	37.6562	37.5
Length   48, alignment  3/ 3:	57.5	139.219	53.4375	53.4375
Length   80, alignment  0/ 0:	25.1562	219.219	20.7812	20.4688
Length   80, alignment  5/ 0:	51.5625	219.375	47.8125	48.2812
Length   80, alignment  0/ 5:	51.0938	219.219	46.875	46.875
Length   80, alignment  5/ 5:	78.5938	219.219	74.375	74.375
Length   96, alignment  0/ 0:	24.0625	259.375	20.4688	20.4688
Length   96, alignment  6/ 0:	51.5625	259.219	47.9688	47.9688
Length   96, alignment  0/ 6:	51.0938	259.219	46.875	46.875
Length   96, alignment  6/ 6:	78.4375	259.219	74.375	74.2188
Length  112, alignment  0/ 0:	30	299.375	26.7188	24.5312
Length  112, alignment  7/ 0:	67.8125	299.219	64.375	62.9688
Length  112, alignment  0/ 7:	69.8438	299.375	66.4062	64.375
Length  112, alignment  7/ 7:	72.3438	299.219	68.9062	67.1875
Length  144, alignment  0/ 0:	29.8438	379.375	26.4062	24.5312
Length  144, alignment  9/ 0:	67.8125	379.219	64.5312	62.9688
Length  144, alignment  0/ 9:	90.4688	379.375	86.7188	85.1562
Length  144, alignment  9/ 9:	76.5625	379.375	73.4375	72.1875
Length  160, alignment  0/ 0:	34.2188	419.375	30.9375	28.9062
Length  160, alignment 10/ 0:	87.0312	419.219	83.9062	82.3438
Length  160, alignment  0/10:	89.0625	419.375	86.0938	83.75
Length  160, alignment 10/10:	76.7188	419.375	73.4375	71.7188
Length  176, alignment  0/ 0:	34.2188	459.219	31.0938	28.75
Length  176, alignment 11/ 0:	87.1875	459.375	83.9062	82.1875
Length  176, alignment  0/11:	89.0625	459.375	85.9375	83.9062
Length  176, alignment 11/11:	76.7188	459.375	73.4375	71.7188
Length  192, alignment  0/ 0:	34.0625	499.375	31.0938	28.9062
Length  192, alignment 12/ 0:	87.1875	499.219	84.0625	82.1875
Length  192, alignment  0/12:	89.0625	499.219	85.9375	83.75
Length  192, alignment 12/12:	76.5625	499.375	73.5938	71.875
Length  208, alignment  0/ 0:	34.2188	539.375	30.9375	28.9062
Length  208, alignment 13/ 0:	87.0312	539.219	83.9062	82.3438
Length  208, alignment  0/13:	111.25	539.219	107.812	108.75
Length  208, alignment 13/13:	81.4062	539.219	78.125	80.1562
Length  224, alignment  0/ 0:	38.9062	579.375	35.7812	37.0312
Length  224, alignment 14/ 0:	108.906	579.375	105.625	106.875
Length  224, alignment  0/14:	110.938	579.375	107.812	108.438
Length  224, alignment 14/14:	81.4062	579.219	78.2812	80
Length  240, alignment  0/ 0:	38.9062	619.375	35.7812	37.0312
Length  240, alignment 15/ 0:	108.906	619.219	105.625	107.031
Length  240, alignment  0/15:	110.781	619.375	107.656	108.438
Length  240, alignment 15/15:	81.5625	619.375	78.125	80
Length  272, alignment  0/ 0:	38.9062	699.375	35.625	37.0312
Length  272, alignment 17/ 0:	108.906	699.375	105.781	106.875
Length  272, alignment  0/17:	133.125	864.688	129.531	127.812
Length  272, alignment 17/17:	85.7812	699.375	82.5	81.5625
Length  288, alignment  0/ 0:	43.4375	739.375	40.1562	38.125
Length  288, alignment 18/ 0:	130.312	739.375	127.031	125.469
Length  288, alignment  0/18:	132.344	739.375	129.062	127.031
Length  288, alignment 18/18:	85.9375	739.375	82.5	81.0938
Length  304, alignment  0/ 0:	43.2812	779.531	40	38.125
Length  304, alignment 19/ 0:	130.312	779.375	127.031	125.625
Length  304, alignment  0/19:	132.344	779.531	129.062	127.031
Length  304, alignment 19/19:	85.7812	779.375	82.5	81.0938
Length  320, alignment  0/ 0:	43.4375	819.375	40.1562	37.9688
Length  320, alignment 20/ 0:	130.312	819.375	127.031	125.469
Length  320, alignment  0/20:	132.344	819.531	128.906	127.031
Length  320, alignment 20/20:	85.9375	819.375	82.5	81.0938
Length  336, alignment  0/ 0:	43.4375	859.375	40.1562	38.125
Length  336, alignment 21/ 0:	130.312	859.375	127.031	125.625
Length  336, alignment  0/21:	155	859.375	150.781	149.375
Length  336, alignment 21/21:	90.1562	859.375	86.875	85.7812
Length  352, alignment  0/ 0:	47.6562	899.531	44.5312	42.5
Length  352, alignment 22/ 0:	151.562	899.375	148.438	146.875
Length  352, alignment  0/22:	153.594	899.375	150.469	148.281
Length  352, alignment 22/22:	90.1562	899.375	87.0312	85.4688
Length  368, alignment  0/ 0:	47.8125	939.375	44.5312	42.5
Length  368, alignment 23/ 0:	151.562	939.375	148.438	146.875
Length  368, alignment  0/23:	153.75	939.375	150.625	148.438
Length  368, alignment 23/23:	90.1562	939.375	87.0312	85.4688
Length  384, alignment  0/ 0:	47.8125	979.375	44.5312	42.5
Length  384, alignment 24/ 0:	151.562	979.375	147.969	146.406
Length  384, alignment  0/24:	153.125	979.531	149.844	147.969
Length  384, alignment 24/24:	90.1562	979.531	87.0312	85.4688
Length  400, alignment  0/ 0:	47.6562	1019.53	44.5312	42.5
Length  400, alignment 25/ 0:	151.719	1019.53	148.438	147.031
Length  400, alignment  0/25:	176.25	1019.53	172.188	170.781
Length  400, alignment 25/25:	94.8438	1187.5	91.4062	90
Length  416, alignment  0/ 0:	52.1875	1059.53	49.0625	46.875
Length  416, alignment 26/ 0:	173.125	1059.53	170	168.438
Length  416, alignment  0/26:	175.156	1059.53	172.031	169.844
Length  416, alignment 26/26:	94.6875	1059.53	91.5625	90
Length  432, alignment  0/ 0:	52.1875	1099.53	49.0625	47.0312
Length  432, alignment 27/ 0:	173.125	1099.69	170	168.594
Length  432, alignment  0/27:	175.156	1099.53	171.875	169.844
Length  432, alignment 27/27:	94.6875	1099.53	91.4062	90
Length  448, alignment  0/ 0:	52.1875	1139.53	49.0625	47.0312
Length  448, alignment 28/ 0:	173.125	1139.38	170	168.438
Length  448, alignment  0/28:	175.156	1139.53	171.875	169.844
Length  448, alignment 28/28:	94.6875	1139.53	91.4062	90
Length  464, alignment  0/ 0:	52.1875	1179.53	49.0625	47.0312
Length  464, alignment 29/ 0:	173.281	1179.53	170	168.438
Length  464, alignment  0/29:	197.344	1179.53	193.75	191.875
Length  464, alignment 29/29:	99.5312	1179.53	96.0938	94.6875
Length  480, alignment  0/ 0:	56.7188	1219.53	53.5938	51.5625
Length  480, alignment 30/ 0:	194.688	1219.53	191.562	190.156
Length  480, alignment  0/30:	196.875	1219.53	193.594	191.562
Length  480, alignment 30/30:	99.5312	1219.38	96.0938	94.5312
Length  496, alignment  0/ 0:	56.7188	1259.53	53.75	51.5625
Length  496, alignment 31/ 0:	194.688	1259.38	191.562	190
Length  496, alignment  0/31:	197.031	1259.53	193.594	191.719
Length  496, alignment 31/31:	99.375	1259.53	96.0938	94.5312
Length 1024, alignment  0/ 0:	96.4062	2579.38	93.125	91.25
Length 1024, alignment 32/ 0:	96.0938	2579.53	94.2188	91.0938
Length 1024, alignment  0/32:	96.25	2579.53	93.2812	90.9375
Length 1024, alignment 32/32:	96.4062	2579.53	92.9688	91.0938
Length 1056, alignment  0/ 0:	104.688	2659.53	101.094	99.2188
Length 1056, alignment 33/ 0:	391.719	2659.38	388.125	386.719
Length 1056, alignment  0/33:	393.594	2659.53	390.156	388.125
Length 1056, alignment 33/33:	143.75	2659.53	140.156	145.781
Length 1088, alignment  0/ 0:	101.094	2739.53	97.5	102.812
Length 1088, alignment 34/ 0:	391.719	2739.53	388.125	386.562
Length 1088, alignment  0/34:	393.594	2739.38	390	388.125
Length 1088, alignment 34/34:	143.594	2739.53	140	145.625
Length 1120, alignment  0/ 0:	105.312	2819.53	102.031	107.031
Length 1120, alignment 35/ 0:	413.125	2819.53	409.688	408.125
Length 1120, alignment  0/35:	415.156	2819.38	411.719	409.688
Length 1120, alignment 35/35:	148.125	2819.53	144.531	150.156
Length 1152, alignment  0/ 0:	105.781	3025.62	102.5	107.188
Length 1152, alignment 36/ 0:	413.281	2899.38	409.844	408.125
Length 1152, alignment  0/36:	415.156	2899.53	411.719	409.844
Length 1152, alignment 36/36:	148.125	2899.53	144.531	150.156
Length 1184, alignment  0/ 0:	110.156	2979.53	106.562	111.719
Length 1184, alignment 37/ 0:	434.688	2979.53	431.25	429.688
Length 1184, alignment  0/37:	436.719	2979.53	433.281	431.094
Length 1184, alignment 37/37:	152.656	2979.53	149.219	154.688
Length 1216, alignment  0/ 0:	110	3059.53	106.719	111.719
Length 1216, alignment 38/ 0:	434.688	3059.53	431.25	429.844
Length 1216, alignment  0/38:	436.719	3059.38	433.125	431.25
Length 1216, alignment 38/38:	152.812	3059.53	149.219	154.688
Length 1248, alignment  0/ 0:	114.531	3139.53	110.938	109.219
Length 1248, alignment 39/ 0:	456.25	3264.84	453.125	451.25
Length 1248, alignment  0/39:	458.281	3139.38	454.688	452.812
Length 1248, alignment 39/39:	157.188	3139.53	153.594	152.188
Length 1280, alignment  0/ 0:	114.531	3219.38	111.094	109.219
Length 1280, alignment 40/ 0:	456.094	3219.38	452.5	451.094
Length 1280, alignment  0/40:	458.125	3219.53	454.688	452.656
Length 1280, alignment 40/40:	157.188	3219.53	153.594	152.188
Length 1312, alignment  0/ 0:	121.562	3299.84	115.625	113.594
Length 1312, alignment 41/ 0:	477.812	3299.53	474.375	472.812
Length 1312, alignment  0/41:	479.844	3299.53	476.25	474.375
Length 1312, alignment 41/41:	161.562	3299.53	158.125	156.719
Length 1344, alignment  0/ 0:	119.219	3379.53	115.781	113.75
Length 1344, alignment 42/ 0:	477.812	3379.53	636.406	473.281
Length 1344, alignment  0/42:	479.844	3379.53	476.094	474.375
Length 1344, alignment 42/42:	161.562	3379.53	158.125	156.719
Length 1376, alignment  0/ 0:	123.594	3459.53	120.156	118.281
Length 1376, alignment 43/ 0:	499.219	3459.53	495.938	494.375
Length 1376, alignment  0/43:	501.406	3459.53	497.812	495.625
Length 1376, alignment 43/43:	166.25	3459.38	162.656	161.25
Length 1408, alignment  0/ 0:	123.75	3539.53	120.156	118.281
Length 1408, alignment 44/ 0:	499.219	3539.53	495.781	494.375
Length 1408, alignment  0/44:	501.406	3539.53	497.969	496.094
Length 1408, alignment 44/44:	166.25	3539.38	162.656	161.094
Length 1440, alignment  0/ 0:	128.281	3619.53	124.531	122.812
Length 1440, alignment 45/ 0:	520.938	3750.94	517.812	515.938
Length 1440, alignment  0/45:	522.969	3619.53	519.531	517.5
Length 1440, alignment 45/45:	170.625	3619.38	167.188	165.625
Length 1472, alignment  0/ 0:	128.281	3699.53	124.688	122.812
Length 1472, alignment 46/ 0:	521.094	3699.53	517.5	516.094
Length 1472, alignment  0/46:	522.812	3699.53	519.375	517.5
Length 1472, alignment 46/46:	170.781	3699.53	167.188	165.625
Length 1504, alignment  0/ 0:	132.812	3779.38	129.219	127.188
Length 1504, alignment 47/ 0:	542.344	3779.53	538.906	537.188
Length 1504, alignment  0/47:	545.156	3779.53	540.469	538.594
Length 1504, alignment 47/47:	175.312	3779.38	171.875	170.469
Length 1536, alignment  0/ 0:	132.812	3982.5	129.531	127.344
Length 1536, alignment 48/ 0:	132.812	3859.53	129.375	127.344
Length 1536, alignment  0/48:	133.281	3859.53	129.844	127.812
Length 1536, alignment 48/48:	133.281	3859.53	129.844	127.812
Length 1568, alignment  0/ 0:	138.438	3939.38	134.219	132.031
Length 1568, alignment 49/ 0:	563.594	3939.53	560	558.594
Length 1568, alignment  0/49:	565.469	3939.38	562.031	560
Length 1568, alignment 49/49:	180.625	3939.53	177.031	174.844
Length 1600, alignment  0/ 0:	137.5	4019.53	133.906	132.031
Length 1600, alignment 50/ 0:	563.438	4019.53	560	558.594
Length 1600, alignment  0/50:	565.469	4019.53	562.031	683.125
Length 1600, alignment 50/50:	180.625	4019.53	177.188	175.469
Length 1632, alignment  0/ 0:	142.5	4099.53	138.75	137.031
Length 1632, alignment 51/ 0:	585	4099.53	581.406	580
Length 1632, alignment  0/51:	587.031	4099.53	583.594	581.562
Length 1632, alignment 51/51:	188.75	4099.53	185.469	182.5
Length 1664, alignment  0/ 0:	142.5	4179.53	138.906	136.875
Length 1664, alignment 52/ 0:	585	4179.53	581.406	580
Length 1664, alignment  0/52:	587.031	4179.53	583.438	581.562
Length 1664, alignment 52/52:	188.281	4179.53	181.406	180.625
Length 1696, alignment  0/ 0:	154.688	4259.38	146.406	143.438
Length 1696, alignment 53/ 0:	732.5	4259.84	603.125	601.562
Length 1696, alignment  0/53:	608.594	4259.53	605	602.969
Length 1696, alignment 53/53:	199.375	4259.53	201.562	197.656
Length 1728, alignment  0/ 0:	148.906	4339.53	146.875	143.125
Length 1728, alignment 54/ 0:	606.406	4339.53	603.125	601.562
Length 1728, alignment  0/54:	608.438	4339.53	605	603.125
Length 1728, alignment 54/54:	197.5	4339.53	190.312	186.094
Length 1760, alignment  0/ 0:	153.906	4419.38	150.938	148.125
Length 1760, alignment 55/ 0:	627.969	4419.53	624.375	623.125
Length 1760, alignment  0/55:	630	4542.03	626.719	624.531
Length 1760, alignment 55/55:	213.906	4419.53	211.25	208.125
Length 1792, alignment  0/ 0:	152.969	4499.53	152.969	146.406
Length 1792, alignment 56/ 0:	628.125	4499.53	624.531	623.125
Length 1792, alignment  0/56:	630	4499.53	626.406	624.531
Length 1792, alignment 56/56:	212.031	4499.38	209.688	206.562
Length 1824, alignment  0/ 0:	164.844	4579.38	173.125	158.438
Length 1824, alignment 57/ 0:	649.531	4579.53	646.094	644.531
Length 1824, alignment  0/57:	651.562	4579.53	647.969	645.938
Length 1824, alignment 57/57:	218.281	4702.34	218.438	212.812
Length 1856, alignment  0/ 0:	168.281	4659.53	162.344	162.5
Length 1856, alignment 58/ 0:	649.531	4659.53	645.938	644.531
Length 1856, alignment  0/58:	651.406	4659.53	647.969	646.094
Length 1856, alignment 58/58:	218.281	4659.53	215.625	212.656
Length 1888, alignment  0/ 0:	168.594	4739.53	167.656	168.438
Length 1888, alignment 59/ 0:	671.094	4739.38	667.5	666.094
Length 1888, alignment  0/59:	672.969	4739.53	669.531	667.5
Length 1888, alignment 59/59:	224.844	4739.38	222.344	218.906
Length 1920, alignment  0/ 0:	173.594	4974.38	174.531	162.812
Length 1920, alignment 60/ 0:	670.938	4819.53	667.5	665.938
Length 1920, alignment  0/60:	672.969	4819.38	669.531	667.5
Length 1920, alignment 60/60:	224.688	4819.53	222.656	219.219
Length 1952, alignment  0/ 0:	190.156	4899.38	187.344	183.906
Length 1952, alignment 61/ 0:	692.5	4899.53	688.906	687.5
Length 1952, alignment  0/61:	694.531	4899.38	690.938	689.062
Length 1952, alignment 61/61:	229.062	4899.53	226.562	222.969
Length 1984, alignment  0/ 0:	190.156	4979.53	187.344	184.062
Length 1984, alignment 62/ 0:	692.5	4981.25	689.844	687.5
Length 1984, alignment  0/62:	694.531	4979.53	691.094	688.906
Length 1984, alignment 62/62:	229.375	4979.53	226.562	222.969
Length 2016, alignment  0/ 0:	194.375	5059.53	191.562	188.281
Length 2016, alignment 63/ 0:	713.906	5059.84	710.781	709.375
Length 2016, alignment  0/63:	715.938	5059.69	712.812	710.781
Length 2016, alignment 63/63:	235.625	5059.69	233.438	230
Length 4096, alignment  0/ 0:	369.219	10398	367.812	360.938

bench-memcpy-large.out:

	__memcpy_thunderx	__memcpy_generic
Length 65543, alignment  0/ 0:	8083.75	15541.2
Length 65551, alignment  0/ 3:	24720.6	32405
Length 65567, alignment  3/ 0:	24486.2	33813.8
Length 65599, alignment  3/ 5:	24731.9	32400.6
Length 131079, alignment  0/ 0:	15959.4	31031.9
Length 131087, alignment  0/ 3:	49411.9	64778.1
Length 131103, alignment  3/ 0:	49505.6	66046.2
Length 131135, alignment  3/ 5:	49401.9	64780
Length 262151, alignment  0/ 0:	31648.1	62405
Length 262159, alignment  0/ 3:	98538.8	129344
Length 262175, alignment  3/ 0:	97577.5	132878
Length 262207, alignment  3/ 5:	99119.4	129346
Length 524295, alignment  0/ 0:	63994.4	124135
Length 524303, alignment  0/ 3:	199494	259969
Length 524319, alignment  3/ 0:	198194	264828
Length 524351, alignment  3/ 5:	199118	259842
Length 1048583, alignment  0/ 0:	152811	259784
Length 1048591, alignment  0/ 3:	422906	529983
Length 1048607, alignment  3/ 0:	424201	540640
Length 1048639, alignment  3/ 5:	422879	529976
Length 2097159, alignment  0/ 0:	276857	501925
Length 2097167, alignment  0/ 3:	812467	1.04305e+06
Length 2097183, alignment  3/ 0:	810185	1.06351e+06
Length 2097215, alignment  3/ 5:	812467	1.04307e+06
Length 4194311, alignment  0/ 0:	524355	986463
Length 4194319, alignment  0/ 3:	1.59268e+06	2.06977e+06
Length 4194335, alignment  3/ 0:	1.5818e+06	2.11026e+06
Length 4194367, alignment  3/ 5:	1.59222e+06	2.06932e+06
Length 8388615, alignment  0/ 0:	1.12852e+06	3.00444e+06
Length 8388623, alignment  0/ 3:	3.17872e+06	5.16414e+06
Length 8388639, alignment  3/ 0:	3.15213e+06	5.23659e+06
Length 8388671, alignment  3/ 5:	3.179e+06	5.1543e+06
Length 16777223, alignment  0/ 0:	3.54774e+06	1.30525e+07
Length 16777231, alignment  0/ 3:	6.8e+06	1.77641e+07
Length 16777247, alignment  3/ 0:	6.72802e+06	1.7955e+07
Length 16777279, alignment  3/ 5:	6.80436e+06	1.77679e+07
Length 33554439, alignment  0/ 0:	7.34141e+06	2.62947e+07
Length 33554447, alignment  0/ 3:	1.36974e+07	3.57826e+07
Length 33554463, alignment  3/ 0:	1.37467e+07	3.6138e+07
Length 33554495, alignment  3/ 5:	1.36981e+07	3.57831e+07

bench-memmove.out:

	simple_memmove	__memmove_thunderx	__memmove_generic
Length    1, alignment  0/32:	37.1875	26.875	22.6562
Length    1, alignment 32/ 0:	18.2812	22.0312	22.0312
Length    1, alignment  0/ 0:	17.3438	21.875	21.875
Length    1, alignment  0/ 0:	16.875	21.875	21.875
Length    2, alignment  0/32:	22.0312	21.875	21.875
Length    2, alignment 32/ 0:	20.4688	21.875	21.875
Length    2, alignment  0/ 1:	21.25	21.875	21.875
Length    2, alignment  1/ 0:	20	21.875	21.875
Length    4, alignment  0/32:	27.0312	20.7812	20.3125
Length    4, alignment 32/ 0:	25.4688	19.8438	19.8438
Length    4, alignment  0/ 2:	26.4062	19.8438	19.6875
Length    4, alignment  2/ 0:	24.6875	19.8438	19.6875
Length    8, alignment  0/32:	37.1875	19.0625	18.75
Length    8, alignment 32/ 0:	35.625	18.125	18.2812
Length    8, alignment  0/ 3:	35.9375	28.75	28.5938
Length    8, alignment  3/ 0:	34.375	37.1875	37.1875
Length   16, alignment  0/32:	66.875	18.2812	18.125
Length   16, alignment 32/ 0:	69.0625	18.4375	18.125
Length   16, alignment  0/ 4:	66.25	28.5938	28.5938
Length   16, alignment  4/ 0:	68.2812	37.1875	37.0312
Length   32, alignment  0/32:	102.188	19.5312	19.2188
Length   32, alignment 32/ 0:	100.312	18.9062	19.0625
Length   32, alignment  0/ 5:	102.344	28.5938	28.5938
Length   32, alignment  5/ 0:	100.156	33.5938	33.5938
Length   64, alignment  0/32:	182.344	21.25	21.0938
Length   64, alignment 32/ 0:	180.312	20.625	20.4688
Length   64, alignment  0/ 6:	182.188	38.9062	38.5938
Length   64, alignment  6/ 0:	180.312	47.3438	47.3438
Length  128, alignment  0/32:	342.344	26.25	25.625
Length  128, alignment 32/ 0:	340.156	28.125	26.7188
Length  128, alignment  0/ 7:	342.344	65.1562	65
Length  128, alignment  7/ 0:	340.156	70.9375	69.2188
Length  256, alignment  0/32:	662.344	35.4688	35.3125
Length  256, alignment 32/ 0:	660.312	37.1875	36.0938
Length  256, alignment  0/ 8:	662.344	106.406	106.719
Length  256, alignment  8/ 0:	660.312	111.094	109.688
Length  512, alignment  0/32:	1302.34	53.75	53.75
Length  512, alignment 32/ 0:	1300.47	55.3125	53.9062
Length  512, alignment  0/ 9:	1302.34	192.656	192.5
Length  512, alignment  9/ 0:	1300.31	197.656	195.938
Length 1024, alignment  0/32:	2582.34	93.125	93.125
Length 1024, alignment 32/ 0:	2580.31	94.375	93.2812
Length 1024, alignment  0/10:	2582.34	367.188	366.875
Length 1024, alignment 10/ 0:	2580.47	375.625	371.094
Length 2048, alignment  0/32:	5537.97	193.75	190.469
Length 2048, alignment 32/ 0:	5142.81	192.812	189.375
Length 2048, alignment  0/11:	5142.66	712.031	712.031
Length 2048, alignment 11/ 0:	5141.09	716.719	715
Length 4096, alignment  0/32:	10263.1	366.094	362.031
Length 4096, alignment 32/ 0:	10261.7	367.031	361.25
Length 4096, alignment  0/12:	10263.3	1400.94	1564.69
Length 4096, alignment 12/ 0:	10261.6	1405.31	1403.75
Length    0, alignment  0/32:	19.6875	23.2812	22.9688
Length    0, alignment 32/ 0:	17.8125	22.3438	22.3438
Length    0, alignment  0/ 0:	17.5	22.1875	22.1875
Length    0, alignment  0/ 0:	17.1875	22.1875	22.1875
Length    1, alignment  0/32:	19.2188	22.0312	22.0312
Length    1, alignment 32/ 0:	17.1875	22.0312	21.875
Length    1, alignment  0/ 1:	19.0625	21.7188	21.875
Length    1, alignment  1/ 0:	17.0312	21.875	21.7188
Length    2, alignment  0/32:	21.4062	21.875	21.875
Length    2, alignment 32/ 0:	19.5312	21.875	22.0312
Length    2, alignment  0/ 2:	21.0938	21.875	21.875
Length    2, alignment  2/ 0:	19.5312	21.875	21.875
Length    3, alignment  0/32:	23.9062	21.7188	21.875
Length    3, alignment 32/ 0:	22.9688	21.875	21.875
Length    3, alignment  0/ 3:	23.2812	21.875	21.875
Length    3, alignment  3/ 0:	22.1875	21.875	21.875
Length    4, alignment  0/32:	26.25	20	19.8438
Length    4, alignment 32/ 0:	24.375	19.8438	19.6875
Length    4, alignment  0/ 4:	26.0938	19.8438	19.6875
Length    4, alignment  4/ 0:	24.5312	19.8438	19.6875
Length    5, alignment  0/32:	29.5312	19.6875	19.6875
Length    5, alignment 32/ 0:	27.9688	19.8438	19.6875
Length    5, alignment  0/ 5:	28.75	29.6875	29.6875
Length    5, alignment  5/ 0:	27.1875	28.2812	28.2812
Length    6, alignment  0/32:	35.4688	19.6875	19.6875
Length    6, alignment 32/ 0:	30.3125	19.8438	19.6875
Length    6, alignment  0/ 6:	31.25	25.3125	25.1562
Length    6, alignment  6/ 0:	29.8438	23.75	23.75
Length    7, alignment  0/32:	34.375	19.8438	19.6875
Length    7, alignment 32/ 0:	32.5	19.8438	19.6875
Length    7, alignment  0/ 7:	33.4375	25.3125	25.1562
Length    7, alignment  7/ 0:	32.1875	23.75	23.75
Length    8, alignment  0/32:	36.4062	18.2812	18.4375
Length    8, alignment 32/ 0:	35	18.2812	18.2812
Length    8, alignment  0/ 8:	36.0938	18.125	18.125
Length    8, alignment  8/ 0:	34.5312	18.2812	18.125
Length    9, alignment  0/32:	38.9062	28.2812	28.125
Length    9, alignment 32/ 0:	37.5	28.125	28.125
Length    9, alignment  0/ 9:	37.9688	37.0312	37.1875
Length    9, alignment  9/ 0:	37.0312	32.1875	32.1875
Length   10, alignment  0/32:	41.4062	28.125	28.125
Length   10, alignment 32/ 0:	40	28.125	28.125
Length   10, alignment  0/10:	41.0938	37.1875	37.0312
Length   10, alignment 10/ 0:	39.6875	32.1875	32.0312
Length   11, alignment  0/32:	43.4375	28.125	28.125
Length   11, alignment 32/ 0:	42.3438	28.125	27.9688
Length   11, alignment  0/11:	42.9688	37.1875	37.0312
Length   11, alignment 11/ 0:	41.875	32.1875	32.0312
Length   12, alignment  0/32:	46.25	28.125	28.125
Length   12, alignment 32/ 0:	44.5312	28.125	28.125
Length   12, alignment  0/12:	45.9375	27.6562	27.5
Length   12, alignment 12/ 0:	44.375	28.5938	28.5938
Length   13, alignment  0/32:	48.5938	28.2812	28.125
Length   13, alignment 32/ 0:	46.875	28.125	28.125
Length   13, alignment  0/13:	48.125	32.1875	32.0312
Length   13, alignment 13/ 0:	46.875	32.1875	32.1875
Length   14, alignment  0/32:	50.7812	28.2812	28.125
Length   14, alignment 32/ 0:	53.9062	28.125	28.125
Length   14, alignment  0/14:	50.7812	32.1875	32.0312
Length   14, alignment 14/ 0:	53.75	32.0312	32.0312
Length   15, alignment  0/32:	62.3438	28.125	28.125
Length   15, alignment 32/ 0:	60.4688	28.2812	28.125
Length   15, alignment  0/15:	62.3438	32.1875	32.0312
Length   15, alignment 15/ 0:	60.3125	32.1875	32.0312
Length   16, alignment  0/32:	66.25	18.125	18.125
Length   16, alignment 32/ 0:	66.875	18.2812	18.125
Length   16, alignment  0/16:	66.25	18.2812	18.125
Length   16, alignment 16/ 0:	66.875	18.4375	18.125
Length   17, alignment  0/32:	63.5938	29.8438	29.8438
Length   17, alignment 32/ 0:	70.7812	29.8438	29.6875
Length   17, alignment  0/17:	63.4375	33.75	33.5938
Length   17, alignment 17/ 0:	70.7812	33.5938	33.75
Length   18, alignment  0/32:	67.3438	29.5312	29.5312
Length   18, alignment 32/ 0:	73.4375	29.6875	29.6875
Length   18, alignment  0/18:	67.1875	33.5938	33.5938
Length   18, alignment 18/ 0:	73.2812	33.5938	33.5938
Length   19, alignment  0/32:	68.5938	29.6875	29.5312
Length   19, alignment 32/ 0:	75.9375	29.5312	29.5312
Length   19, alignment  0/19:	68.2812	33.5938	33.5938
Length   19, alignment 19/ 0:	75.9375	35.3125	33.5938
Length   20, alignment  0/32:	72.3438	29.6875	29.5312
Length   20, alignment 32/ 0:	70.3125	29.6875	29.5312
Length   20, alignment  0/20:	72.3438	33.75	33.75
Length   20, alignment 20/ 0:	70.3125	33.75	33.75
Length   21, alignment  0/32:	73.5938	29.6875	29.5312
Length   21, alignment 32/ 0:	72.9688	29.6875	29.5312
Length   21, alignment  0/21:	73.2812	33.5938	33.5938
Length   21, alignment 21/ 0:	72.8125	33.5938	33.5938
Length   22, alignment  0/32:	77.3438	29.6875	29.6875
Length   22, alignment 32/ 0:	75.3125	29.6875	29.6875
Length   22, alignment  0/22:	77.3438	33.75	33.5938
Length   22, alignment 22/ 0:	75.3125	33.75	33.5938
Length   23, alignment  0/32:	78.5938	29.6875	29.6875
Length   23, alignment 32/ 0:	77.8125	29.6875	29.6875
Length   23, alignment  0/23:	78.4375	33.5938	33.5938
Length   23, alignment 23/ 0:	77.9688	33.5938	33.5938
Length   24, alignment  0/32:	82.1875	29.6875	29.6875
Length   24, alignment 32/ 0:	80.4688	29.6875	29.6875
Length   24, alignment  0/24:	82.3438	29.2188	29.0625
Length   24, alignment 24/ 0:	80.3125	29.2188	29.0625
Length   25, alignment  0/32:	83.5938	29.6875	29.5312
Length   25, alignment 32/ 0:	82.8125	30.1562	30
Length   25, alignment  0/25:	83.4375	33.5938	33.5938
Length   25, alignment 25/ 0:	82.8125	33.75	33.5938
Length   26, alignment  0/32:	87.3438	29.6875	29.6875
Length   26, alignment 32/ 0:	85.4688	29.6875	29.5312
Length   26, alignment  0/26:	87.3438	33.75	33.5938
Length   26, alignment 26/ 0:	85.3125	33.75	33.5938
Length   27, alignment  0/32:	88.4375	29.6875	29.6875
Length   27, alignment 32/ 0:	87.8125	29.6875	29.5312
Length   27, alignment  0/27:	88.2812	33.5938	33.5938
Length   27, alignment 27/ 0:	87.8125	33.75	33.5938
Length   28, alignment  0/32:	92.3438	29.6875	29.6875
Length   28, alignment 32/ 0:	90.3125	29.6875	29.5312
Length   28, alignment  0/28:	92.1875	33.75	33.5938
Length   28, alignment 28/ 0:	90.3125	33.75	33.5938
Length   29, alignment  0/32:	93.5938	29.6875	29.5312
Length   29, alignment 32/ 0:	92.8125	29.6875	29.6875
Length   29, alignment  0/29:	93.2812	33.5938	33.5938
Length   29, alignment 29/ 0:	92.9688	33.5938	33.5938
Length   30, alignment  0/32:	97.1875	29.6875	29.6875
Length   30, alignment 32/ 0:	95.3125	29.6875	29.5312
Length   30, alignment  0/30:	97.3438	33.5938	33.5938
Length   30, alignment 30/ 0:	95.3125	33.75	33.5938
Length   31, alignment  0/32:	98.5938	29.6875	29.5312
Length   31, alignment 32/ 0:	97.9688	29.6875	29.5312
Length   31, alignment  0/31:	98.2812	33.75	33.75
Length   31, alignment 31/ 0:	97.8125	33.5938	33.5938
Length   48, alignment  0/32:	142.188	20.4688	20.4688
Length   48, alignment 32/ 0:	140.469	20.4688	20.4688
Length   48, alignment  0/ 3:	142.344	38.9062	38.75
Length   48, alignment  3/ 0:	140.312	47.3438	47.3438
Length   80, alignment  0/32:	222.188	22.5	22.3438
Length   80, alignment 32/ 0:	220.312	21.875	21.7188
Length   80, alignment  0/ 5:	222.344	48.125	48.125
Length   80, alignment  5/ 0:	220.312	49.375	49.0625
Length   96, alignment  0/32:	262.188	21.5625	21.5625
Length   96, alignment 32/ 0:	260.312	21.5625	21.5625
Length   96, alignment  0/ 6:	262.188	48.125	48.125
Length   96, alignment  6/ 0:	260.312	49.375	49.0625
Length  112, alignment  0/32:	302.344	25.9375	25.4688
Length  112, alignment 32/ 0:	300.156	27.8125	26.0938
Length  112, alignment  0/ 7:	302.188	65	64.8438
Length  112, alignment  7/ 0:	300.312	70.625	68.9062
Length  144, alignment  0/32:	382.344	31.0938	30.9375
Length  144, alignment 32/ 0:	380.312	27.3438	25.9375
Length  144, alignment  0/ 9:	382.344	85.4688	85
Length  144, alignment  9/ 0:	380.156	70.3125	68.75
Length  160, alignment  0/32:	422.344	30.625	30.3125
Length  160, alignment 32/ 0:	420.312	32.6562	31.0938
Length  160, alignment  0/10:	422.5	85	84.6875
Length  160, alignment 10/ 0:	420.312	105.156	103.75
Length  176, alignment  0/32:	462.344	30.4688	30.3125
Length  176, alignment 32/ 0:	460.312	32.0312	30.4688
Length  176, alignment  0/11:	462.344	84.8438	84.8438
Length  176, alignment 11/ 0:	460.312	89.8438	88.4375
Length  192, alignment  0/32:	502.344	30.4688	30.3125
Length  192, alignment 32/ 0:	500.312	32.0312	30.4688
Length  192, alignment  0/12:	502.344	84.8438	84.6875
Length  192, alignment 12/ 0:	500.312	90	88.4375
Length  208, alignment  0/32:	708.281	35.4688	35.1562
Length  208, alignment 32/ 0:	540.312	32.0312	30.4688
Length  208, alignment  0/13:	542.5	106.406	106.562
Length  208, alignment 13/ 0:	540.312	89.8438	88.4375
Length  224, alignment  0/32:	582.344	34.8438	35
Length  224, alignment 32/ 0:	580.312	36.5625	35.3125
Length  224, alignment  0/14:	582.344	106.406	106.406
Length  224, alignment 14/ 0:	580.312	126.25	125.156
Length  240, alignment  0/32:	622.344	34.8438	35
Length  240, alignment 32/ 0:	620.312	36.4062	35
Length  240, alignment  0/15:	622.344	106.406	106.406
Length  240, alignment 15/ 0:	620.312	111.25	109.844
Length  272, alignment  0/32:	702.344	40	39.8438
Length  272, alignment 32/ 0:	700.469	36.25	35
Length  272, alignment  0/17:	702.344	127.969	127.969
Length  272, alignment 17/ 0:	700.312	106.25	104.844
Length  288, alignment  0/32:	742.344	39.6875	39.375
Length  288, alignment 32/ 0:	740.312	42.3438	40.1562
Length  288, alignment  0/18:	742.344	128.125	127.812
Length  288, alignment 18/ 0:	740.781	128.594	126.406
Length  304, alignment  0/32:	782.188	39.375	39.2188
Length  304, alignment 32/ 0:	780.312	41.0938	39.2188
Length  304, alignment  0/19:	782.188	127.969	127.812
Length  304, alignment 19/ 0:	780.469	128.125	126.25
Length  320, alignment  0/32:	822.344	39.5312	39.375
Length  320, alignment 32/ 0:	820.469	40.9375	39.375
Length  320, alignment  0/20:	822.344	127.969	127.812
Length  320, alignment 20/ 0:	820.312	127.969	126.25
Length  336, alignment  0/32:	862.188	44.6875	44.375
Length  336, alignment 32/ 0:	860.312	41.0938	39.375
Length  336, alignment  0/21:	862.344	149.531	149.375
Length  336, alignment 21/ 0:	860.469	127.969	126.094
Length  352, alignment  0/32:	902.188	43.9062	43.9062
Length  352, alignment 32/ 0:	900.312	46.5625	44.6875
Length  352, alignment  0/22:	902.344	149.375	149.219
Length  352, alignment 22/ 0:	900.312	149.531	148.281
Length  368, alignment  0/32:	942.344	43.9062	43.75
Length  368, alignment 32/ 0:	940.469	45.3125	43.9062
Length  368, alignment  0/23:	942.344	149.531	149.219
Length  368, alignment 23/ 0:	940.469	149.375	147.969
Length  384, alignment  0/32:	982.344	43.9062	43.9062
Length  384, alignment 32/ 0:	980.469	45.4688	43.9062
Length  384, alignment  0/24:	1129.22	149.219	149.375
Length  384, alignment 24/ 0:	980.312	148.75	147.344
Length  400, alignment  0/32:	1022.19	49.0625	49.0625
Length  400, alignment 32/ 0:	1020.31	45.3125	44.0625
Length  400, alignment  0/25:	1022.34	171.094	170.938
Length  400, alignment 25/ 0:	1020.47	149.375	147.969
Length  416, alignment  0/32:	1062.34	48.75	48.2812
Length  416, alignment 32/ 0:	1060.31	51.0938	49.375
Length  416, alignment  0/26:	1062.34	171.094	170.781
Length  416, alignment 26/ 0:	1060.47	170.938	169.844
Length  432, alignment  0/32:	1102.34	48.75	48.2812
Length  432, alignment 32/ 0:	1100.47	49.8438	48.4375
Length  432, alignment  0/27:	1102.34	171.094	170.781
Length  432, alignment 27/ 0:	1100.47	170.938	169.531
Length  448, alignment  0/32:	1142.34	48.5938	48.2812
Length  448, alignment 32/ 0:	1140.47	49.8438	48.5938
Length  448, alignment  0/28:	1142.34	170.938	170.938
Length  448, alignment 28/ 0:	1140.31	170.781	169.531
Length  464, alignment  0/32:	1182.34	53.2812	53.2812
Length  464, alignment 32/ 0:	1180.47	49.8438	48.4375
Length  464, alignment  0/29:	1182.34	192.5	192.344
Length  464, alignment 29/ 0:	1180.47	170.938	169.531
Length  480, alignment  0/32:	1222.34	52.9688	52.8125
Length  480, alignment 32/ 0:	1220.31	54.6875	53.2812
Length  480, alignment  0/30:	1222.34	192.5	192.188
Length  480, alignment 30/ 0:	1220.47	192.5	190.938
Length  496, alignment  0/32:	1262.34	53.125	52.8125
Length  496, alignment 32/ 0:	1260.31	54.2188	52.6562
Length  496, alignment  0/31:	1262.5	192.5	192.188
Length  496, alignment 31/ 0:	1260.47	192.344	190.781
Length 1024, alignment  0/ 0:	2580.47	18.4375	18.4375
Length 1024, alignment 32/ 0:	2702.97	94.8438	92.6562
Length 1024, alignment  0/32:	2582.5	92.9688	92.8125
Length 1024, alignment 32/32:	2580.16	18.2812	18.2812
Length 1056, alignment  0/ 0:	2660.47	18.125	18.125
Length 1056, alignment 33/ 0:	2660.78	389.375	387.656
Length 1056, alignment  0/33:	2662.81	389.688	389.531
Length 1056, alignment 33/33:	2660.62	18.125	18.125
Length 1088, alignment  0/ 0:	2740.78	18.125	18.125
Length 1088, alignment 34/ 0:	2740.78	389.375	387.656
Length 1088, alignment  0/34:	2742.66	389.531	389.531
Length 1088, alignment 34/34:	2740.78	18.2812	18.125
Length 1120, alignment  0/ 0:	2820.78	18.125	18.125
Length 1120, alignment 35/ 0:	2820.62	410.781	409.219
Length 1120, alignment  0/35:	2822.81	411.094	410.938
Length 1120, alignment 35/35:	2820.78	18.125	18.125
Length 1152, alignment  0/ 0:	2900.78	18.2812	18.125
Length 1152, alignment 36/ 0:	2900.62	410.781	409.062
Length 1152, alignment  0/36:	3014.69	411.562	411.25
Length 1152, alignment 36/36:	2900.78	18.2812	17.9688
Length 1184, alignment  0/ 0:	2980.78	18.125	18.125
Length 1184, alignment 37/ 0:	2980.78	432.344	430.625
Length 1184, alignment  0/37:	2982.81	432.5	432.5
Length 1184, alignment 37/37:	2980.78	18.125	18.125
Length 1216, alignment  0/ 0:	3062.03	18.4375	18.125
Length 1216, alignment 38/ 0:	3060.16	432.188	430.625
Length 1216, alignment  0/38:	3062.34	432.656	432.344
Length 1216, alignment 38/38:	3060.47	18.2812	18.125
Length 1248, alignment  0/ 0:	3140.47	18.125	18.125
Length 1248, alignment 39/ 0:	3140.31	453.906	452.188
Length 1248, alignment  0/39:	3142.34	454.219	453.906
Length 1248, alignment 39/39:	3140.47	18.125	17.9688
Length 1280, alignment  0/ 0:	3220.31	18.125	17.9688
Length 1280, alignment 40/ 0:	3378.75	453.75	452.188
Length 1280, alignment  0/40:	3222.5	454.062	453.906
Length 1280, alignment 40/40:	3220.47	18.125	18.125
Length 1312, alignment  0/ 0:	3300.31	18.125	18.125
Length 1312, alignment 41/ 0:	3300.47	475.312	473.594
Length 1312, alignment  0/41:	3302.34	475.625	475.469
Length 1312, alignment 41/41:	3300.31	18.2812	18.125
Length 1344, alignment  0/ 0:	3380.31	18.125	18.125
Length 1344, alignment 42/ 0:	3380.47	475.312	473.75
Length 1344, alignment  0/42:	3382.34	475.625	475.469
Length 1344, alignment 42/42:	3380.47	18.125	18.125
Length 1376, alignment  0/ 0:	3460.31	18.2812	18.125
Length 1376, alignment 43/ 0:	3460.31	496.875	495
Length 1376, alignment  0/43:	3462.5	497.031	619.219
Length 1376, alignment 43/43:	3460.62	18.125	18.125
Length 1408, alignment  0/ 0:	3540.47	18.2812	17.9688
Length 1408, alignment 44/ 0:	3540.31	496.719	495.156
Length 1408, alignment  0/44:	3542.34	497.031	496.875
Length 1408, alignment 44/44:	3540.47	18.125	18.125
Length 1440, alignment  0/ 0:	3620.31	18.2812	18.125
Length 1440, alignment 45/ 0:	3620.31	518.281	516.719
Length 1440, alignment  0/45:	3622.34	518.594	518.438
Length 1440, alignment 45/45:	3620.47	18.125	18.125
Length 1472, alignment  0/ 0:	3700.47	18.2812	18.125
Length 1472, alignment 46/ 0:	3700.31	518.281	516.719
Length 1472, alignment  0/46:	3702.5	518.438	518.438
Length 1472, alignment 46/46:	3700.47	19.8438	17.9688
Length 1504, alignment  0/ 0:	3781.25	18.125	17.9688
Length 1504, alignment 47/ 0:	3780.47	539.688	538.125
Length 1504, alignment  0/47:	3782.81	540.156	540
Length 1504, alignment 47/47:	3780.47	18.125	18.125
Length 1536, alignment  0/ 0:	3860.31	18.125	18.125
Length 1536, alignment 48/ 0:	3860.47	130.469	128.75
Length 1536, alignment  0/48:	3862.5	128.906	128.906
Length 1536, alignment 48/48:	3860.47	18.125	17.9688
Length 1568, alignment  0/ 0:	3940.62	18.125	17.9688
Length 1568, alignment 49/ 0:	3940.94	561.25	559.531
Length 1568, alignment  0/49:	3942.81	561.719	561.562
Length 1568, alignment 49/49:	3940.47	18.125	18.125
Length 1600, alignment  0/ 0:	4164.53	19.2188	18.125
Length 1600, alignment 50/ 0:	4021.72	561.25	559.688
Length 1600, alignment  0/50:	4022.81	561.719	561.562
Length 1600, alignment 50/50:	4020.47	18.125	18.125
Length 1632, alignment  0/ 0:	4100.78	18.125	18.125
Length 1632, alignment 51/ 0:	4100.62	582.812	581.094
Length 1632, alignment  0/51:	4102.66	583.281	582.969
Length 1632, alignment 51/51:	4100.47	18.2812	18.125
Length 1664, alignment  0/ 0:	4180.78	18.125	18.125
Length 1664, alignment 52/ 0:	4180.62	582.812	581.25
Length 1664, alignment  0/52:	4182.66	583.125	582.969
Length 1664, alignment 52/52:	4315.94	18.2812	18.125
Length 1696, alignment  0/ 0:	4260.78	18.2812	17.9688
Length 1696, alignment 53/ 0:	4260.62	604.219	602.656
Length 1696, alignment  0/53:	4262.81	604.688	604.531
Length 1696, alignment 53/53:	4260.31	18.125	18.125
Length 1728, alignment  0/ 0:	4340.31	18.2812	17.9688
Length 1728, alignment 54/ 0:	4340.78	604.375	602.5
Length 1728, alignment  0/54:	4342.81	604.531	604.531
Length 1728, alignment 54/54:	4340.47	18.2812	18.2812
Length 1760, alignment  0/ 0:	4420.31	18.125	18.125
Length 1760, alignment 55/ 0:	4420.78	625.625	624.062
Length 1760, alignment  0/55:	4422.66	760.312	626.25
Length 1760, alignment 55/55:	4420.31	18.125	17.9688
Length 1792, alignment  0/ 0:	4500.62	18.125	18.125
Length 1792, alignment 56/ 0:	4500.62	625.781	624.062
Length 1792, alignment  0/56:	4502.81	625.938	625.938
Length 1792, alignment 56/56:	4500.47	18.125	18.125
Length 1824, alignment  0/ 0:	4580.62	18.2812	18.125
Length 1824, alignment 57/ 0:	4582.5	647.656	646.562
Length 1824, alignment  0/57:	4582.19	647.5	647.344
Length 1824, alignment 57/57:	4580.47	18.9062	18.75
Length 1856, alignment  0/ 0:	4661.09	18.125	18.125
Length 1856, alignment 58/ 0:	4834.06	647.812	646.25
Length 1856, alignment  0/58:	4663.12	647.656	647.5
Length 1856, alignment 58/58:	4660.31	18.75	18.75
Length 1888, alignment  0/ 0:	4741.09	18.125	18.125
Length 1888, alignment 59/ 0:	4741.09	668.75	667.031
Length 1888, alignment  0/59:	4742.34	668.906	668.906
Length 1888, alignment 59/59:	4740.31	18.75	18.75
Length 1920, alignment  0/ 0:	4821.09	18.125	18.75
Length 1920, alignment 60/ 0:	4821.09	668.75	667.031
Length 1920, alignment  0/60:	4822.34	669.062	668.906
Length 1920, alignment 60/60:	4963.28	18.9062	18.75
Length 1952, alignment  0/ 0:	4900.94	18.9062	18.75
Length 1952, alignment 61/ 0:	4901.09	690.156	688.75
Length 1952, alignment  0/61:	4902.5	690.625	690.469
Length 1952, alignment 61/61:	4900.47	18.75	18.75
Length 1984, alignment  0/ 0:	4981.09	18.125	18.75
Length 1984, alignment 62/ 0:	4980.94	690.156	688.75
Length 1984, alignment  0/62:	4982.5	690.625	690.469
Length 1984, alignment 62/62:	4980.47	18.75	18.75
Length 2016, alignment  0/ 0:	5060.94	18.2812	18.75
Length 2016, alignment 63/ 0:	5194.38	712.5	710.938
Length 2016, alignment  0/63:	5063.44	712.188	712.031
Length 2016, alignment 63/63:	5060.62	19.2188	19.0625
__memmove_thunderx	__memmove_generic
Length 4103, alignment  0/64:	525.625	418.125
Length 4111, alignment  0/ 3:	1465	1464.38
Length 4127, alignment  3/ 0:	1502.5	1497.5
Length 4159, alignment  3/ 7:	1485	1482.5
Length 4223, alignment  9/ 5:	1509.38	1510.62
Length 8199, alignment  0/64:	768.75	756.25
Length 8207, alignment  0/ 3:	2848.12	2846.88
Length 8223, alignment  3/ 0:	2881.25	2879.38
Length 8255, alignment  3/ 7:	2868.12	2868.12
Length 8319, alignment  9/ 5:	2893.75	2890
Length 16391, alignment  0/64:	1508.75	1468.75
Length 16399, alignment  0/ 3:	5630.62	5645
Length 16415, alignment  3/ 0:	6587.5	5696.88
Length 16447, alignment  3/ 7:	5665.62	5671.25
Length 16511, alignment  9/ 5:	5680	5715.62
Length 32775, alignment  0/64:	3404.38	3379.38
Length 32783, alignment  0/ 3:	11950	11948.8
Length 32799, alignment  3/ 0:	12272.5	12196.2
Length 32831, alignment  3/ 7:	11932.5	11937.5
Length 32895, alignment  9/ 5:	12278.1	12600.6
Length 65543, alignment  0/64:	15153.8	15145.6
Length 65551, alignment  0/ 3:	32700.6	32701.9
Length 65567, alignment  3/ 0:	24323.1	32856.9
Length 65599, alignment  3/ 7:	32333.1	32333.1
Length 65663, alignment  9/ 5:	24330.6	32889.4
Length 131079, alignment  0/64:	30443.8	30994.4
Length 131087, alignment  0/ 3:	65536.2	65528.8
Length 131103, alignment  3/ 0:	48560	65695
Length 131135, alignment  3/ 7:	65275.6	64768.8
Length 131199, alignment  9/ 5:	48574.4	66727.5
Length 262151, alignment  0/64:	61041.2	61043.1
Length 262159, alignment  0/ 3:	131192	132792
Length 262175, alignment  3/ 0:	97861.9	131354
Length 262207, alignment  3/ 7:	129631	130307
Length 262271, alignment  9/ 5:	97690	131389
Length 524295, alignment  0/64:	121656	122274
Length 524303, alignment  0/ 3:	262468	262572
Length 524319, alignment  3/ 0:	193571	262575
Length 524351, alignment  3/ 7:	259253	259378
Length 524415, alignment  9/ 5:	193574	262574
Length 1048583, alignment  0/64:	242923	242967
Length 1048591, alignment  0/ 3:	524019	524000
Length 1048607, alignment  3/ 0:	386996	524172
Length 1048639, alignment  3/ 7:	517732	517734
Length 1048703, alignment  9/ 5:	386961	524196
Length 2097159, alignment  0/64:	485096	484989
Length 2097167, alignment  0/ 3:	1.04767e+06	1.04724e+06
Length 2097183, alignment  3/ 0:	772307	1.04781e+06
Length 2097215, alignment  3/ 7:	1.03466e+06	1.03465e+06
Length 2097279, alignment  9/ 5:	772771	1.04743e+06
Length 4194311, alignment  0/64:	969175	969163
Length 4194319, alignment  0/ 3:	2.09413e+06	2.09371e+06
Length 4194335, alignment  3/ 0:	1.54408e+06	2.09386e+06
Length 4194367, alignment  3/ 7:	2.06856e+06	2.06895e+06
Length 4194431, alignment  9/ 5:	1.54376e+06	2.09388e+06
Length 8388615, alignment  0/64:	1.95435e+06	1.95496e+06
Length 8388623, alignment  0/ 3:	4.19442e+06	4.19203e+06
Length 8388639, alignment  3/ 0:	3.08706e+06	4.19641e+06
Length 8388671, alignment  3/ 7:	4.14203e+06	4.14215e+06
Length 8388735, alignment  9/ 5:	3.08678e+06	4.19257e+06
Length 16777223, alignment  0/64:	6.15746e+06	6.13601e+06
Length 16777231, alignment  0/ 3:	1.07097e+07	1.06811e+07
Length 16777247, alignment  3/ 0:	6.44117e+06	1.10652e+07
Length 16777279, alignment  3/ 7:	1.06051e+07	1.05801e+07
Length 16777343, alignment  9/ 5:	6.43968e+06	1.10505e+07

Comments

Adhemerval Zanella Jan. 19, 2017, 7:41 p.m. UTC | #1
Hi Steve,

On 19/01/2017 16:22, Steve Ellcey wrote:
> +extern uint64_t __midr attribute_hidden;
> +extern bool __is_thunderx attribute_hidden;
> +
> +#define INIT_ARCH()						\
> +  {								\
> +    if (__midr == 0)						\
> +      {								\
> +	asm volatile ("mrs %0, midr_el1" : "=r"(__midr));	\
> +	__is_thunderx = IS_THUNDERX(__midr);			\
> +      }								\
> +  }

I think that, to avoid potentially multiple kernel traps at load or PLT-resolve
time, a better solution would be to issue the mrs instruction once at
loader/program startup, fill in an internal structure with the required
information, and use it later during ifunc resolution.  This is similar to the
cpu-features/cacheinfo strategy for x86.
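
A minimal sketch of that read-once approach, in C.  All names here
(struct cpu_features, init_cpu_features, read_midr, cpu_is_thunderx) are
hypothetical and not part of the patch.  read_midr is a stub that returns a
sample ThunderX-style MIDR value so the sketch builds on any host; on AArch64
it would wrap the mrs instruction.  The implementer (0x43) and part number
(0x0a1) tests are assumptions standing in for IS_THUNDERX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for CPU data; not the glibc API.  */
struct cpu_features
{
  uint64_t midr;     /* Raw MIDR_EL1 value, read once.  */
  bool is_thunderx;  /* Derived once so resolvers do no bit tests.  */
  bool initialized;  /* Explicit flag instead of a midr == 0 sentinel.  */
};

static struct cpu_features features;
static unsigned int midr_reads;  /* Counts (emulated, slow) register reads.  */

/* Stand-in for the trapping "mrs %0, midr_el1".  Returns a sample value
   (assumption): implementer 0x43, part number 0x0a1.  */
static uint64_t
read_midr (void)
{
  ++midr_reads;
  return UINT64_C (0x431f0a11);
}

/* Called once at loader/program startup; later calls are no-ops.  */
static void
init_cpu_features (void)
{
  if (features.initialized)
    return;
  features.midr = read_midr ();
  features.is_thunderx = (((features.midr >> 24) & 0xff) == 0x43
			  && ((features.midr >> 4) & 0xfff) == 0x0a1);
  features.initialized = true;
}

/* What an ifunc resolver would consult: a plain load, no trap.  */
static bool
cpu_is_thunderx (void)
{
  assert (features.initialized);
  return features.is_thunderx;
}
```

With this shape the resolvers never touch the register themselves; they only
read the cached flag, however many ifunc symbols are resolved.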


> diff --git a/sysdeps/unix/sysv/linux/aarch64/configure.ac b/sysdeps/unix/sysv/linux/aarch64/configure.ac
> index 211fa9c..684cb46 100644
> --- a/sysdeps/unix/sysv/linux/aarch64/configure.ac
> +++ b/sysdeps/unix/sysv/linux/aarch64/configure.ac
> @@ -1,6 +1,11 @@
>  GLIBC_PROVIDES dnl See aclocal.m4 in the top level source directory.
>  # Local configure fragment for sysdeps/unix/sysv/linux/aarch64.
>  
> -arch_minimum_kernel=3.7.0
> +# For multi-arch support we need a kernel that emulates the mrs instruction.
> +if test x$multi_arch = xyes; then
> +    arch_minimum_kernel=4.11.0
> +else
> +    arch_minimum_kernel=3.7.0
> +fi

I do not think this suffices to prevent the multiarch version from being used
on a system with old installed kernel headers.  The check only takes effect if
you explicitly pass --enable-multi-arch; however, multiarch is enabled by
default in configure.ac (configure.ac:877).  So building with old kernel
headers will break the runtime.

We need to make sure that glibc built against older kernel headers (or with
--enable-kernel=x.y.z) does not use the mrs instruction, and that glibc built
against a newer kernel, which may use mrs, fails on loading with
DL_SYSDEP_OSCHECK.

According to the documentation in the latest patch iteration [1], the kernel
provides the HWCAP_CPUID bit in hwcap to indicate that it supports mrs
emulation.  So, building on my previous suggestion, I would recommend:

  1. Remove any configure check or restriction.
  2. Add a cpu_features module, similar to x86, that sets a global state with
     the cpu information obtained from the kernel.  It will first check the
     HWCAP_CPUID bit in hwcap and, if it is set, issue the mrs instruction.  It
     will then populate the global state with the required cpu information.
  3. Use the cpu information to select the correct ifunc.

This has the further advantage of avoiding the extra complexity of different
glibc builds having different minimum required kernels.
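
The three steps could be sketched in C roughly as follows.  This is only an
illustration: probe_cpu, select_memcpy, struct cpu_info and
sample_midr_reader are hypothetical names, the hwcap value is passed in as a
parameter (in a real build it would come from the auxiliary vector), and the
register read goes through a function pointer so the sketch builds on any
host.  SKETCH_HWCAP_CPUID assumes the kernel's HWCAP_CPUID is bit 11, and the
0x43/0x0a1 tests are assumptions standing in for IS_THUNDERX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's HWCAP_CPUID bit on arm64.  */
#define SKETCH_HWCAP_CPUID (UINT64_C (1) << 11)

enum memcpy_impl { MEMCPY_GENERIC, MEMCPY_THUNDERX };

/* Global-state stand-in: midr is only meaningful if have_midr is set.  */
struct cpu_info
{
  bool have_midr;
  uint64_t midr;
};

typedef uint64_t (*midr_reader) (void);

static unsigned int sample_reads;

/* Sample reader (assumption): pretends to be "mrs %0, midr_el1" and
   returns a ThunderX-style value.  */
static uint64_t
sample_midr_reader (void)
{
  ++sample_reads;
  return UINT64_C (0x431f0a11);
}

/* Step 2: check the hwcap bit first, and read the register only if the
   kernel advertises mrs emulation.  */
static struct cpu_info
probe_cpu (uint64_t hwcap, midr_reader read_midr)
{
  struct cpu_info info = { false, 0 };
  assert (read_midr != 0);
  if ((hwcap & SKETCH_HWCAP_CPUID) != 0)
    {
      info.midr = read_midr ();
      info.have_midr = true;
    }
  return info;
}

/* Step 3: pick the implementation from the cached information only.  */
static enum memcpy_impl
select_memcpy (const struct cpu_info *info)
{
  if (info->have_midr
      && ((info->midr >> 24) & 0xff) == 0x43
      && ((info->midr >> 4) & 0xfff) == 0x0a1)
    return MEMCPY_THUNDERX;
  return MEMCPY_GENERIC;
}
```

On a kernel without the hwcap bit the register is never read and the generic
routine is selected, so no configure-time restriction is needed.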

Joseph Myers Jan. 19, 2017, 9:04 p.m. UTC | #2
On Thu, 19 Jan 2017, Adhemerval Zanella wrote:

> We need to make sure that glibc built against older kernel headers (or with
> --enable-kernel=x.y.z) does not use the mrs instruction, and that glibc built
> against a newer kernel, which may use mrs, fails on loading with
> DL_SYSDEP_OSCHECK.

Agreed.  That is, I think that either the configured minimum kernel 
version or the kernel support at runtime (or both, with the configured 
minimum kernel allowing runtime tests to be disabled) should be what 
determines whether these implementations can be used - rather than 
enabling multi-arch changing the minimum kernel version.
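
One way to picture the "both" case is a runtime test that compiles away when
the configured minimum kernel is already new enough.  This is a hedged sketch
in C, not code from the patch: SKETCH_MIN_KERNEL, SKETCH_MRS_KERNEL,
SKETCH_HWCAP_CPUID and mrs_usable are hypothetical names, the version
encoding (major << 16 | minor << 8 | patch) is chosen for illustration, and
bit 11 for the hwcap flag is an assumption about the kernel's HWCAP_CPUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: configured minimum kernel, 3.7.0 unless overridden.  */
#ifndef SKETCH_MIN_KERNEL
# define SKETCH_MIN_KERNEL 0x030700
#endif

/* First kernel expected to emulate mrs, per the patch description.  */
#define SKETCH_MRS_KERNEL 0x040b00

/* Assumption: mirrors the kernel's HWCAP_CPUID bit on arm64.  */
#define SKETCH_HWCAP_CPUID (UINT64_C (1) << 11)

static_assert (SKETCH_MRS_KERNEL == ((4 << 16) | (11 << 8) | 0),
	       "4.11.0 in major/minor/patch encoding");

/* Returns true if the mrs-based implementations may be selected.  */
static bool
mrs_usable (uint64_t hwcap)
{
#if SKETCH_MIN_KERNEL >= SKETCH_MRS_KERNEL
  /* The configured minimum kernel guarantees emulation; no runtime test.  */
  (void) hwcap;
  return true;
#else
  /* Older kernels are allowed, so rely on what the running kernel says.  */
  return (hwcap & SKETCH_HWCAP_CPUID) != 0;
#endif
}
```

Building with -DSKETCH_MIN_KERNEL=0x040b00 removes the hwcap test; the
default leaves the decision to the running kernel, independent of whether
multi-arch is enabled.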

Patch

diff --git a/sysdeps/aarch64/memcpy.S b/sysdeps/aarch64/memcpy.S
index 29af8b1..74444b4 100644
--- a/sysdeps/aarch64/memcpy.S
+++ b/sysdeps/aarch64/memcpy.S
@@ -59,7 +59,14 @@ 
    Overlapping large forward memmoves use a loop that copies backwards.
 */
 
-ENTRY_ALIGN (memmove, 6)
+#ifndef MEMMOVE
+#  define MEMMOVE memmove
+#endif
+#ifndef MEMCPY
+#  define MEMCPY memcpy
+#endif
+
+ENTRY_ALIGN (MEMMOVE, 6)
 
 	DELOUSE (0)
 	DELOUSE (1)
@@ -71,9 +78,9 @@  ENTRY_ALIGN (memmove, 6)
 	b.lo	L(move_long)
 
 	/* Common case falls through into memcpy.  */
-END (memmove)
-libc_hidden_builtin_def (memmove)
-ENTRY (memcpy)
+END (MEMMOVE)
+libc_hidden_builtin_def (MEMMOVE)
+ENTRY (MEMCPY)
 
 	DELOUSE (0)
 	DELOUSE (1)
@@ -158,10 +165,22 @@  L(copy96):
 
 	.p2align 4
 L(copy_long):
+
+#ifdef USE_THUNDERX
+
+	/* On thunderx, large memcpy's are helped by software prefetching.
+	   This loop is identical to the one below it but with prefetching
+	   instructions included.  For copies of less than 32768 bytes,
+	   the prefetching does not help and slows the code down, so we only
+	   use the prefetching loop for the largest memcpys.  */
+
+	cmp	count, #32768
+	b.lo	L(copy_long_without_prefetch)
 	and	tmp1, dstin, 15
 	bic	dst, dstin, 15
 	ldp	D_l, D_h, [src]
 	sub	src, src, tmp1
+	prfm	pldl1strm, [src, 384]
 	add	count, count, tmp1	/* Count is now 16 too large.  */
 	ldp	A_l, A_h, [src, 16]
 	stp	D_l, D_h, [dstin]
@@ -169,7 +188,10 @@  L(copy_long):
 	ldp	C_l, C_h, [src, 48]
 	ldp	D_l, D_h, [src, 64]!
 	subs	count, count, 128 + 16	/* Test and readjust count.  */
-	b.ls	2f
+
+L(prefetch_loop64):
+	tbz	src, #6, 1f
+	prfm	pldl1strm, [src, 512]
 1:
 	stp	A_l, A_h, [dst, 16]
 	ldp	A_l, A_h, [src, 16]
@@ -180,12 +202,40 @@  L(copy_long):
 	stp	D_l, D_h, [dst, 64]!
 	ldp	D_l, D_h, [src, 64]!
 	subs	count, count, 64
-	b.hi	1b
+	b.hi	L(prefetch_loop64)
+	b	L(last64)
+
+L(copy_long_without_prefetch):
+#endif
+
+	and	tmp1, dstin, 15
+	bic	dst, dstin, 15
+	ldp	D_l, D_h, [src]
+	sub	src, src, tmp1
+	add	count, count, tmp1	/* Count is now 16 too large.  */
+	ldp	A_l, A_h, [src, 16]
+	stp	D_l, D_h, [dstin]
+	ldp	B_l, B_h, [src, 32]
+	ldp	C_l, C_h, [src, 48]
+	ldp	D_l, D_h, [src, 64]!
+	subs	count, count, 128 + 16	/* Test and readjust count.  */
+	b.ls	L(last64)
+L(loop64):
+	stp	A_l, A_h, [dst, 16]
+	ldp	A_l, A_h, [src, 16]
+	stp	B_l, B_h, [dst, 32]
+	ldp	B_l, B_h, [src, 32]
+	stp	C_l, C_h, [dst, 48]
+	ldp	C_l, C_h, [src, 48]
+	stp	D_l, D_h, [dst, 64]!
+	ldp	D_l, D_h, [src, 64]!
+	subs	count, count, 64
+	b.hi	L(loop64)
 
 	/* Write the last full set of 64 bytes.  The remainder is at most 64
 	   bytes, so it is safe to always copy 64 bytes from the end even if
 	   there is just 1 byte left.  */
-2:
+L(last64):
 	ldp	E_l, E_h, [srcend, -64]
 	stp	A_l, A_h, [dst, 16]
 	ldp	A_l, A_h, [srcend, -48]
@@ -256,5 +306,5 @@  L(move_long):
 	stp	C_l, C_h, [dstin]
 3:	ret
 
-END (memcpy)
-libc_hidden_builtin_def (memcpy)
+END (MEMCPY)
+libc_hidden_builtin_def (MEMCPY)
diff --git a/sysdeps/aarch64/multiarch/Makefile b/sysdeps/aarch64/multiarch/Makefile
index e69de29..78d52c7 100644
--- a/sysdeps/aarch64/multiarch/Makefile
+++ b/sysdeps/aarch64/multiarch/Makefile
@@ -0,0 +1,3 @@ 
+ifeq ($(subdir),string)
+sysdep_routines += memcpy_generic memcpy_thunderx
+endif
diff --git a/sysdeps/aarch64/multiarch/ifunc-impl-list.c b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
index e69de29..c6d63f6 100644
--- a/sysdeps/aarch64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
@@ -0,0 +1,61 @@ 
+/* Enumerate available IFUNC implementations of a function.  AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <string.h>
+#include <wchar.h>
+#include <ldsodefs.h>
+#include <ifunc-impl-list.h>
+#include <init-arch.h>
+#include <stdio.h>
+
+/* Access to the midr_el1 register is emulated by the linux kernel and
+   is slow so we save it in __midr after it is read once.  We also save
+   the value of IS_THUNDERX in __is_thunderx so it does not need to be
+   recomputed by checking multiple bits from __midr.  */
+
+uint64_t __midr attribute_hidden = 0;
+bool __is_thunderx attribute_hidden;
+
+/* Maximum number of IFUNC implementations.  */
+#define MAX_IFUNC	2
+
+size_t
+__libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
+			size_t max)
+{
+  assert (max >= MAX_IFUNC);
+
+  size_t i = 0;
+
+  INIT_ARCH ();
+
+#ifdef SHARED
+  /* Support sysdeps/aarch64/multiarch/memcpy.c.  */
+  IFUNC_IMPL (i, name, memcpy,
+	      IFUNC_IMPL_ADD (array, i, memcpy, __is_thunderx,
+			      __memcpy_thunderx)
+	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
+  IFUNC_IMPL (i, name, memmove,
+	      IFUNC_IMPL_ADD (array, i, memmove, __is_thunderx,
+			      __memmove_thunderx)
+	      IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_generic))
+#endif
+
+  return i;
+}
diff --git a/sysdeps/aarch64/multiarch/init-arch.h b/sysdeps/aarch64/multiarch/init-arch.h
index e69de29..e12ba61 100644
--- a/sysdeps/aarch64/multiarch/init-arch.h
+++ b/sysdeps/aarch64/multiarch/init-arch.h
@@ -0,0 +1,55 @@ 
+/* This file is part of the GNU C Library.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <elf.h>
+#include <stdint.h>
+#include <stdbool.h>
+
+#define MIDR_REVISION_MASK	0xf
+#define MIDR_REVISION(__midr)	((__midr) & MIDR_REVISION_MASK)
+#define MIDR_PARTNUM_SHIFT	4
+#define MIDR_PARTNUM_MASK	(0xfff << MIDR_PARTNUM_SHIFT)
+#define MIDR_PARTNUM(__midr)	\
+	(((__midr) & MIDR_PARTNUM_MASK) >> MIDR_PARTNUM_SHIFT)
+#define MIDR_ARCHITECTURE_SHIFT	16
+#define MIDR_ARCHITECTURE_MASK	(0xf << MIDR_ARCHITECTURE_SHIFT)
+#define MIDR_ARCHITECTURE(__midr)	\
+	(((__midr) & MIDR_ARCHITECTURE_MASK) >> MIDR_ARCHITECTURE_SHIFT)
+#define MIDR_VARIANT_SHIFT	20
+#define MIDR_VARIANT_MASK	(0xf << MIDR_VARIANT_SHIFT)
+#define MIDR_VARIANT(__midr)	\
+	(((__midr) & MIDR_VARIANT_MASK) >> MIDR_VARIANT_SHIFT)
+#define MIDR_IMPLEMENTOR_SHIFT	24
+#define MIDR_IMPLEMENTOR_MASK	(0xffU << MIDR_IMPLEMENTOR_SHIFT)
+#define MIDR_IMPLEMENTOR(__midr)	\
+	(((__midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT)
+
+#define IS_THUNDERX(__midr) (MIDR_IMPLEMENTOR(__midr) == 'C'	\
+			     && MIDR_PARTNUM(__midr) == 0x0a1)
+
+
+extern uint64_t __midr attribute_hidden;
+extern bool __is_thunderx attribute_hidden;
+
+#define INIT_ARCH()						\
+  {								\
+    if (__midr == 0)						\
+      {								\
+	asm volatile ("mrs %0, midr_el1" : "=r"(__midr));	\
+	__is_thunderx = IS_THUNDERX(__midr);			\
+      }								\
+  }
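
Review note on the MIDR decoding above: the field extraction can be checked on its own, without executing mrs. What follows is a minimal sketch, not part of the patch. It assumes the field layout implied by the macros in init-arch.h (revision at bit 0, part number at bit 4, architecture at bit 16, variant at bit 20, implementor at bit 24). The helper names midr_field, make_midr, is_thunderx and midr_self_check are hypothetical, and the variant and revision values passed to make_midr are illustrative only. It shifts before masking on a 64-bit value, so no mask is ever built by left-shifting a signed int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract one field: shift down first, then mask, all in uint64_t.  */
static inline uint32_t
midr_field (uint64_t midr, unsigned int shift, uint64_t mask)
{
  return (uint32_t) ((midr >> shift) & mask);
}

/* Build a MIDR-style value from its fields (hypothetical helper, used
   only to produce test inputs without reading the real register).  */
static inline uint64_t
make_midr (uint64_t implementor, uint64_t variant, uint64_t architecture,
	   uint64_t partnum, uint64_t revision)
{
  return ((implementor & 0xff) << 24)
	 | ((variant & 0xf) << 20)
	 | ((architecture & 0xf) << 16)
	 | ((partnum & 0xfff) << 4)
	 | (revision & 0xf);
}

/* Same test as IS_THUNDERX in the patch: implementor 'C', part 0x0a1.  */
static inline bool
is_thunderx (uint64_t midr)
{
  return midr_field (midr, 24, 0xff) == 'C'
	 && midr_field (midr, 4, 0xfff) == 0x0a1;
}

/* Small self-check with illustrative variant/revision values.  */
static inline void
midr_self_check (void)
{
  uint64_t midr = make_midr ('C', 0x1, 0xf, 0x0a1, 0x0);
  assert (is_thunderx (midr));
  assert (!is_thunderx (make_midr ('A', 0x0, 0xf, 0xd03, 0x4)));
}
```

Shifting first keeps every intermediate value unsigned, which avoids relying on what (0xff << 24) does as a signed int.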
diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
index e69de29..b2b587b 100644
--- a/sysdeps/aarch64/multiarch/memcpy.c
+++ b/sysdeps/aarch64/multiarch/memcpy.c
@@ -0,0 +1,41 @@ 
+/* Multiple versions of memcpy. AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Define multiple versions only for the definition in libc and for
+   the DSO.  In static binaries we need memcpy before initialization
+   has happened.  */
+
+#if defined SHARED && IS_IN (libc)
+/* Redefine memcpy so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include <string.h>
+# include <init-arch.h>
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_thunderx attribute_hidden;
+
+libc_ifunc (__libc_memcpy,
+            __is_thunderx ? __memcpy_thunderx : __memcpy_generic);
+
+#undef memcpy
+strong_alias (__libc_memcpy, memcpy);
+#endif
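
Review note on the selector above: the choice made by the libc_ifunc expression can be modelled in portable C with an ordinary function pointer. This is a rough sketch under stated assumptions, not how glibc implements IFUNC (the real mechanism resolves the symbol at relocation time). The names copy_fn, copy_generic, copy_thunderx and select_copy are hypothetical stand-ins, and the two copy routines are placeholders for __memcpy_generic and __memcpy_thunderx rather than the optimized assembly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Placeholder for the generic implementation.  */
static void *
copy_generic (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Placeholder for the ThunderX implementation; a plain byte loop here,
   where the real routine would add software prefetching.  */
static void *
copy_thunderx (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Mirrors "__is_thunderx ? __memcpy_thunderx : __memcpy_generic".  */
static copy_fn
select_copy (bool is_thunderx)
{
  return is_thunderx ? copy_thunderx : copy_generic;
}
```

A caller would evaluate select_copy once and keep the returned pointer, which corresponds to the resolver running once per symbol rather than once per call.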
diff --git a/sysdeps/aarch64/multiarch/memcpy_generic.S b/sysdeps/aarch64/multiarch/memcpy_generic.S
index e69de29..c0e3462 100644
--- a/sysdeps/aarch64/multiarch/memcpy_generic.S
+++ b/sysdeps/aarch64/multiarch/memcpy_generic.S
@@ -0,0 +1,42 @@ 
+/* A Generic Optimized memcpy implementation for AARCH64.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The actual memcpy and memmove code is in ../memcpy.S.  If we are
+   building a shared libc using IFUNC this file defines __memcpy_generic
+   and __memmove_generic.  Otherwise the include of ../memcpy.S will
+   define the normal __memcpy and __memmove entry points.  */
+
+#include <sysdep.h>
+
+#if defined SHARED && IS_IN (libc)
+
+#define MEMCPY __memcpy_generic
+#define MEMMOVE __memmove_generic
+
+/* Do not hide the generic versions of memcpy and memmove; we use them
+   internally.  */
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(name)
+
+/* It doesn't make sense to send libc-internal memcpy calls through a PLT. */
+	.globl __GI_memcpy; __GI_memcpy = __memcpy_generic
+	.globl __GI_memmove; __GI_memmove = __memmove_generic
+
+#endif
+
+#include "../memcpy.S"
diff --git a/sysdeps/aarch64/multiarch/memcpy_thunderx.S b/sysdeps/aarch64/multiarch/memcpy_thunderx.S
index e69de29..df5e959 100644
--- a/sysdeps/aarch64/multiarch/memcpy_thunderx.S
+++ b/sysdeps/aarch64/multiarch/memcpy_thunderx.S
@@ -0,0 +1,33 @@ 
+/* A Thunderx Optimized memcpy implementation for AARCH64.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The actual thunderx optimized code is in ../memcpy.S under the USE_THUNDERX
+   ifdef.  If we are not building a shared libc with IFUNC then we do not
+   build anything when compiling this file and __memcpy is defined by
+   memcpy_generic.S.  */
+
+#include <sysdep.h>
+
+#if defined SHARED && IS_IN (libc)
+
+#define MEMCPY __memcpy_thunderx
+#define MEMMOVE __memmove_thunderx
+#define USE_THUNDERX
+#include "../memcpy.S"
+
+#endif
diff --git a/sysdeps/aarch64/multiarch/memmove.c b/sysdeps/aarch64/multiarch/memmove.c
index e69de29..c08c763 100644
--- a/sysdeps/aarch64/multiarch/memmove.c
+++ b/sysdeps/aarch64/multiarch/memmove.c
@@ -0,0 +1,40 @@ 
+/* Multiple versions of memmove. AARCH64 version.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Define multiple versions only for the definition in libc and for
+   the DSO.  In static binaries we need memmove before initialization
+   has happened.  */
+#if defined SHARED && IS_IN (libc)
+/* Redefine memmove so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memmove
+# define memmove __redirect_memmove
+# include <string.h>
+# include <init-arch.h>
+
+extern __typeof (__redirect_memmove) __libc_memmove;
+
+extern __typeof (__redirect_memmove) __memmove_generic attribute_hidden;
+extern __typeof (__redirect_memmove) __memmove_thunderx attribute_hidden;
+
+libc_ifunc (__libc_memmove,
+            __is_thunderx ? __memmove_thunderx : __memmove_generic);
+
+#undef memmove
+strong_alias (__libc_memmove, memmove);
+#endif
diff --git a/sysdeps/unix/sysv/linux/aarch64/configure.ac b/sysdeps/unix/sysv/linux/aarch64/configure.ac
index 211fa9c..684cb46 100644
--- a/sysdeps/unix/sysv/linux/aarch64/configure.ac
+++ b/sysdeps/unix/sysv/linux/aarch64/configure.ac
@@ -1,6 +1,11 @@ 
 GLIBC_PROVIDES dnl See aclocal.m4 in the top level source directory.
 # Local configure fragment for sysdeps/unix/sysv/linux/aarch64.
 
-arch_minimum_kernel=3.7.0
+# For multi-arch support we need a kernel that emulates the mrs instruction.
+if test x$multi_arch = xyes; then
+    arch_minimum_kernel=4.11.0
+else
+    arch_minimum_kernel=3.7.0
+fi
 
 LIBC_SLIBDIR_RTLDDIR([lib64], [lib])