Speed up strcoll by inlining

Message ID 54686492.4000402@web.de
State Committed

Commit Message

Leonhard Holz Nov. 16, 2014, 8:47 a.m. UTC
  This patch improves the performance of strcoll_l by inlining the two 
helper functions get_next_seq and do_compare.

measurement	old      	new      	diff
glibc files	202,393,000	138,597,000	-31.52%
vi_VN.UTF-8	2,772,150	2,055,920	-25.84%
en_US.UTF-8	2,741,380	1,952,770	-28.77%
ar_SA.UTF-8	3,062,180	2,484,940	-18.85%
zh_CN.UTF-8	917,198  	875,756  	-4.52%
cs_CZ.UTF-8	3,455,670	2,465,530	-28.65%
en_GB.UTF-8	3,334,790	2,420,540	-27.42%
da_DK.UTF-8	2,805,660	2,023,430	-27.88%
pl_PL.UTF-8	2,640,710	2,014,960	-23.70%
fr_FR.UTF-8	3,504,700	2,642,280	-24.61%
pt_PT.UTF-8	3,542,390	2,599,250	-26.62%
el_GR.UTF-8	4,529,580	3,881,700	-14.30%
ru_RU.UTF-8	3,527,070	2,806,480	-20.43%
iw_IL.UTF-8	3,047,060	2,530,360	-16.96%
es_ES.UTF-8	3,089,990	2,376,410	-23.09%
hi_IN.UTF-8	222,487,000	223,397,000	0.41%
sv_SE.UTF-8	2,724,630	2,019,420	-25.88%
hu_HU.UTF-8	4,446,990	3,658,830	-17.72%
tr_TR.UTF-8	2,966,180	2,200,790	-25.80%
is_IS.UTF-8	2,559,480	2,012,190	-21.38%
it_IT.UTF-8	3,301,190	2,527,230	-23.44%
sr_RS.UTF-8	2,973,150	2,322,010	-21.90%
ja_JP.UTF-8	985,042  	1,044,980	6.08%

2014-11-16  Leonhard Holz  <leonhard.holz@web.de>

     * string/strcoll_l.c (get_next_seq): __always_inline.
     * string/strcoll_l.c (do_compare): __always_inline.
  

Comments

Andrew Pinski Nov. 16, 2014, 8:50 a.m. UTC | #1
On Sun, Nov 16, 2014 at 12:47 AM, Leonhard Holz <leonhard.holz@web.de> wrote:
> This patch improves the performance of strcoll_l by inlining the two helper
> functions get_next_seq and do_compare.


Can you say which version of GCC you are using, and on what target?
Also, maybe this should be analysed by the GCC folks to figure out why
those two functions are not inlined.

Thanks,
Andrew

  
Leonhard Holz Nov. 16, 2014, 9:33 a.m. UTC | #2
My development machine is a virtual 32-bit Ubuntu 14.10 Linux:

gcc --version: gcc (Ubuntu 4.9.1-16ubuntu6) 4.9.1
uname -a: Linux leo-VirtualBox 3.16.0-24-generic #32-Ubuntu SMP Tue Oct 28 13:13:18 UTC 2014 i686 i686 i686 GNU/Linux

Best,
Leonhard

On 16.11.2014 09:50, Andrew Pinski wrote:
> On Sun, Nov 16, 2014 at 12:47 AM, Leonhard Holz <leonhard.holz@web.de> wrote:
>> This patch improves the performance of strcoll_l by inlining the two helper
>> functions get_next_seq and do_compare.
>
>
> Can you say which version of GCC you are using?  And on what target.
> Also maybe this should be analysed by the GCC folks to figure out why
> those two functions are not inlined.
>
> Thanks,
> Andrew
>
  
Ondrej Bilka Nov. 16, 2014, 11:09 a.m. UTC | #3
On Sun, Nov 16, 2014 at 10:33:35AM +0100, Leonhard Holz wrote:
> My development machine is a virtual 32-bit Ubuntu 14.10 Linux:
> 
> gcc --version: gcc (Ubuntu 4.9.1-16ubuntu6) 4.9.1
> uname -a: Linux leo-VirtualBox 3.16.0-24-generic #32-Ubuntu SMP Tue
> Oct 28 13:13:18 UTC 2014 i686 i686 i686 GNU/Linux
> 
> Best,
> Leonhard
> 
> On 16.11.2014 09:50, Andrew Pinski wrote:
> >On Sun, Nov 16, 2014 at 12:47 AM, Leonhard Holz <leonhard.holz@web.de> wrote:
> >>This patch improves the performance of strcoll_l by inlining the two helper
> >>functions get_next_seq and do_compare.
> >
> >
> >Can you say which version of GCC you are using?  And on what target.
> >Also maybe this should be analysed by the GCC folks to figure out why
> >those two functions are not inlined.
> >
That is not surprising: get_next_seq is 120 lines long and is called twice,
so gcc will not inline it because of its size. As do_compare is called only
once, it should already be inlined, so the difference comes entirely from
get_next_seq.
  
Siddhesh Poyarekar Nov. 18, 2014, 12:42 p.m. UTC | #4
On Sun, Nov 16, 2014 at 09:47:14AM +0100, Leonhard Holz wrote:
> 2014-11-16  Leonhard Holz  <leonhard.holz@web.de>
> 
>     * string/strcoll_l.c (get_next_seq): __always_inline.
>     * string/strcoll_l.c (do_compare): __always_inline.

Thanks, I'll commit this.

Siddhesh
  

Patch

diff --git a/string/strcoll_l.c b/string/strcoll_l.c
index 7a2d066..ddac7e5 100644
--- a/string/strcoll_l.c
+++ b/string/strcoll_l.c
@@ -63,7 +63,7 @@  typedef struct
 } coll_seq;
 
 /* Get next sequence.  Traverse the string as required.  */
-static void
+static __always_inline void
 get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets,
 	      const USTRING_TYPE *weights, const int32_t *table,
 	      const USTRING_TYPE *extra, const int32_t *indirect,
@@ -194,7 +194,7 @@  get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets,
 }
 
 /* Compare two sequences.  */
-static int
+static __always_inline int
 do_compare (coll_seq *seq1, coll_seq *seq2, int position,
 	    const USTRING_TYPE *weights)
 {