[1/1] msort: optimizing merge sort.

Message ID 1499313544-18244-1-git-send-email-ayush.m@samsung.com
State New, archived

Commit Message

Ayush Mittal July 6, 2017, 3:59 a.m. UTC
  This patch improves the performance of merge sort when most of the elements are already sorted.
It speeds up the merge procedure by skipping the comparison and copying of elements
when the last element of the first array and the first element of the second array are already in order.

When we simulated the optimized merge sort, performance increased by 84% when merge sort
was run on already-sorted elements.
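The idea can be sketched in isolation: before merging two adjacent sorted runs, compare the last element of the left run with the first element of the right run; if they are already in order, the whole merge (and the copy through the temporary buffer) can be skipped. A minimal, self-contained sketch of the technique — not the glibc code; all names here are illustrative:

```c
#include <string.h>
#include <stddef.h>

/* Merge two adjacent sorted runs a[0..n1) and a[n1..n1+n2) of ints,
   using tmp as scratch space, but skip all work when the runs are
   already in order (the fast path this patch adds to msort).  */
static void
merge_ints (int *a, size_t n1, size_t n2, int *tmp)
{
  /* Fast path: last of the left run <= first of the right run.  */
  if (n1 == 0 || n2 == 0 || a[n1 - 1] <= a[n1])
    return;

  size_t i = 0, j = n1, k = 0;
  while (i < n1 && j < n1 + n2)
    tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
  while (i < n1)
    tmp[k++] = a[i++];
  while (j < n1 + n2)
    tmp[k++] = a[j++];
  memcpy (a, tmp, (n1 + n2) * sizeof *a);
}

/* Top-down merge sort built on the merge above.  */
static void
sort_ints (int *a, size_t n, int *tmp)
{
  if (n <= 1)
    return;
  size_t n1 = n / 2;
  sort_ints (a, n1, tmp);
  sort_ints (a + n1, n - n1, tmp);
  merge_ints (a, n1, n - n1, tmp);
}
```

On fully sorted input, every `merge_ints` call takes the early return after a single comparison, which is where the reported speedup comes from; the cost on other inputs is that one extra comparison per merge.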

Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Ayush Mittal <ayush.m@samsung.com>
---
 stdlib/msort.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
  

Comments

Siddhesh Poyarekar July 6, 2017, 4:25 a.m. UTC | #1
On Thursday 06 July 2017 09:29 AM, Ayush Mittal wrote:
> This patch improves the performance of merge sort when most of the elements are already sorted.
> It speeds up the merge procedure by skipping the comparison and copying of elements
> when the last element of the first array and the first element of the second array are already in order.
> 
> When we simulated the optimized merge sort, performance increased by 84% when merge sort
> was run on already-sorted elements.

Please also include the benchmark you used to measure this.  Take a look
at benchtests/README to know ways in which you can include your
benchmark in glibc.

Also, please take a moment to review the contribution checklist[1] to
know the kind of information you need to include in your patch
submission.  For example this submission is missing a ChangeLog.

Thanks,
Siddhesh

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist
  
Siddhesh Poyarekar July 7, 2017, 11:38 a.m. UTC | #2
On Thursday 06 July 2017 04:38 PM, Ayush Mittal wrote:
> We have shared a test case for this change, with performance results.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=21719

That test only shows the amount of improvement in the case that you've
optimized for, not the inputs that could get worse due to the additional
check.  You need a benchmark that exercises different kinds of input
arrays to determine what the impact is on those loads.  A good set of
inputs for this would be based on an existing and commonly used
application program that uses msort extensively on different kinds of
inputs.  If that is not possible, a set of inputs that exercises all of
the major branches in the code proportionately would be acceptable.  The
fast paths (such as this one, where the array is almost sorted)
typically get lower weight compared to other paths through the function.

Finally, the benchmark should be included in the patch as an addition to
benchmarks in the benchtests directory.  There is a benchtests/README
that describes how you can do that.
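A benchmark input set of the kind asked for above could be sketched as follows — this is only an illustration of the idea, not the benchtests format; the pattern names and the helper are hypothetical:

```c
#include <stdlib.h>
#include <stddef.h>

/* Fill buf[0..n) with one of several input patterns so that a sort
   benchmark exercises more than just the presorted fast path.  */
enum pattern { SORTED, REVERSED, RANDOM, MOSTLY_SORTED };

static void
fill_pattern (int *buf, size_t n, enum pattern p, unsigned int seed)
{
  srand (seed);
  for (size_t i = 0; i < n; i++)
    switch (p)
      {
      case SORTED:
        buf[i] = (int) i;
        break;
      case REVERSED:
        buf[i] = (int) (n - i);
        break;
      case RANDOM:
        buf[i] = rand ();
        break;
      case MOSTLY_SORTED:
        /* Sorted, with roughly one in sixteen elements perturbed.  */
        buf[i] = (int) i + (rand () % 16 == 0 ? rand () % 8 : 0);
        break;
      }
}
```

Timing the sort over each pattern (and over a range of sizes) would show both the gain on the presorted case and any regression the extra comparison causes on the other paths.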

>> Also, please take a moment to review the contribution checklist[1] to
>> know the kind of information you need to include in your patch
>> submission.  For example this submission is missing a ChangeLog.
> 
> Regarding ChangeLog we are not sure about which ChangeLog to change.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=21719

The contribution checklist I linked to below tells you what a ChangeLog
is; see "12. Properly Formatted GNU ChangeLog".

>>
>> [1] https://sourceware.org/glibc/wiki/Contribution%20checklist

Siddhesh
  

Patch

diff --git a/stdlib/msort.c b/stdlib/msort.c
index 02ef28b..498de09 100644
--- a/stdlib/msort.c
+++ b/stdlib/msort.c
@@ -39,7 +39,7 @@  static void msort_with_tmp (const struct msort_param *p, void *b, size_t n);
 static void
 msort_with_tmp (const struct msort_param *p, void *b, size_t n)
 {
-  char *b1, *b2;
+  char *b1, *b2, *b3;
   size_t n1, n2;
 
   if (n <= 1)
@@ -49,6 +49,7 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
   n2 = n - n1;
   b1 = b;
   b2 = (char *) b + (n1 * p->s);
+  b3 = (char *) b + ((n1 - 1) * p->s);
 
   msort_with_tmp (p, b1, n1);
   msort_with_tmp (p, b2, n2);
@@ -60,6 +61,8 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
   switch (p->var)
     {
     case 0:
+      if ((*cmp) (b3, b2, arg) <= 0)
+	return;
       while (n1 > 0 && n2 > 0)
 	{
 	  if ((*cmp) (b1, b2, arg) <= 0)
@@ -78,6 +81,8 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
 	}
       break;
     case 1:
+      if ((*cmp) (b3, b2, arg) <= 0)
+	return;
       while (n1 > 0 && n2 > 0)
 	{
 	  if ((*cmp) (b1, b2, arg) <= 0)
@@ -96,6 +101,8 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
 	}
       break;
     case 2:
+      if ((*cmp) (b3, b2, arg) <= 0)
+	return;
       while (n1 > 0 && n2 > 0)
 	{
 	  unsigned long *tmpl = (unsigned long *) tmp;
@@ -119,6 +126,8 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
 	}
       break;
     case 3:
+      if ((*cmp) (*(const void **) b3, *(const void **) b2, arg) <= 0)
+	return;
       while (n1 > 0 && n2 > 0)
 	{
 	  if ((*cmp) (*(const void **) b1, *(const void **) b2, arg) <= 0)
@@ -137,6 +146,8 @@  msort_with_tmp (const struct msort_param *p, void *b, size_t n)
 	}
       break;
     default:
+      if ((*cmp) (b3, b2, arg) <= 0)
+	return;
       while (n1 > 0 && n2 > 0)
 	{
 	  if ((*cmp) (b1, b2, arg) <= 0)