[1/1] msort : optimizing merge sort.
Commit Message
This patch improves the performance of merge sort when most of the elements are already sorted .
It improves performace of merge procedure by skipping comparison and copying of elements
when last element of first array and first element of second array is already sorted .
When we simulated optimzed merge sort,the performace increases by 84% when mergesort
is used for already sorted elements .
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Ayush Mittal <ayush.m@samsung.com>
---
stdlib/msort.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
Comments
On Thursday 06 July 2017 09:29 AM, Ayush Mittal wrote:
> This patch improves the performance of merge sort when most of the elements are already sorted .
> It improves performace of merge procedure by skipping comparison and copying of elements
> when last element of first array and first element of second array is already sorted .
>
> When we simulated optimzed merge sort,the performace increases by 84% when mergesort
> is used for already sorted elements .
Please also include the benchmark you used to measure this. Take a look
at benchtests/README to know ways in which you can include your
benchmark in glibc.
Also, please take a moment to review the contribution checklist[1] to
know the kind of information you need to include in your patch
submission. For example this submission is missing a ChangeLog.
Thanks,
Siddhesh
[1] https://sourceware.org/glibc/wiki/Contribution%20checklist
On Thursday 06 July 2017 04:38 PM, Ayush Mittal wrote:
> We have shared test case for this change with results(performance) .
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=21719
That test only shows the amount of improvement in the case that you've
optimized for, not the inputs that could get worse due to the additional
check. You need a benchmark that exercises different kinds of input
arrays to determine what the impact is on those loads. A good set of
inputs for this would be based on an existing and commonly used
application program that uses msort extensively on different kinds of
inputs. If that is not possible, a set of inputs that exercises all of
the major branches in the code proportionately would be acceptable. The
fast paths (such as this one, where the array is almost sorted)
typically get lower weight compared to other paths through the function.
Finally, the benchmark should be included in the patch as an addition to
benchmarks in the benchtests directory. There is a benchtests/README
that describes how you can do that.
>> Also, please take a moment to review the contribution checklist[1] to
>> know the kind of information you need to include in your patch
>> submission. For example this submission is missing a ChangeLog.
>
> Regarding ChangeLog we are not sure about which ChangeLog to change.
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=21719
The contribution checklist I linked to below tells you what a ChangeLog
is; see "12. Properly Formatted GNU ChangeLog"
>>
>> [1] https://sourceware.org/glibc/wiki/Contribution%20checklist
Siddhesh
@@ -39,7 +39,7 @@ static void msort_with_tmp (const struct msort_param *p, void *b, size_t n);
static void
msort_with_tmp (const struct msort_param *p, void *b, size_t n)
{
- char *b1, *b2;
+ char *b1, *b2, *b3;
size_t n1, n2;
if (n <= 1)
@@ -49,6 +49,7 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
n2 = n - n1;
b1 = b;
b2 = (char *) b + (n1 * p->s);
+ b3 = (char *) b + ((n1-1) * p->s);
msort_with_tmp (p, b1, n1);
msort_with_tmp (p, b2, n2);
@@ -60,6 +61,8 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
switch (p->var)
{
case 0:
+ if((*cmp) (b3, b2, arg) <= 0)
+ return ;
while (n1 > 0 && n2 > 0)
{
if ((*cmp) (b1, b2, arg) <= 0)
@@ -78,6 +81,8 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
}
break;
case 1:
+ if((*cmp) (b3, b2, arg) <= 0)
+ return ;
while (n1 > 0 && n2 > 0)
{
if ((*cmp) (b1, b2, arg) <= 0)
@@ -96,6 +101,8 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
}
break;
case 2:
+ if((*cmp) (b3, b2, arg) <= 0)
+ return ;
while (n1 > 0 && n2 > 0)
{
unsigned long *tmpl = (unsigned long *) tmp;
@@ -119,6 +126,8 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
}
break;
case 3:
+ if((*cmp) (*(const void **)b3,*(const void **) b2, arg) <= 0)
+ return ;
while (n1 > 0 && n2 > 0)
{
if ((*cmp) (*(const void **) b1, *(const void **) b2, arg) <= 0)
@@ -137,6 +146,8 @@ msort_with_tmp (const struct msort_param *p, void *b, size_t n)
}
break;
default:
+ if((*cmp) (b3, b2, arg) <= 0)
+ return ;
while (n1 > 0 && n2 > 0)
{
if ((*cmp) (b1, b2, arg) <= 0)