manual: Correct guarantee about pointers compared by qsort()

Message ID alpine.DEB.2.10.1412110456190.37899@buzzword-bingo.mit.edu
State Superseded
Headers

Commit Message

Anders Kaseorg Dec. 11, 2014, 9:57 a.m. UTC
  C99, C11, POSIX, and the glibc implementation do guarantee that the
pointers passed to the qsort comparison function lie within the array.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
---
 manual/search.texi | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
  

Comments

Florian Weimer Dec. 11, 2014, 10:25 a.m. UTC | #1
On 12/11/2014 10:57 AM, Anders Kaseorg wrote:
> +Although the object addresses passed to the comparison function lie
> +within the array, they need not correspond with the original locations
> +of those objects, because the sorting algorithm may swap around

No comma before “because”.

> +objects in the array before making some comparisons.  The only way to
> +perform a stable sort with @var{qsort} is to first augment the objects
> +with a monotonic counter of some kind.

“@code{qsort}” instead of “@var{qsort}”.

Otherwise looks fine.
  
Rich Felker Dec. 11, 2014, 4:28 p.m. UTC | #2
On Thu, Dec 11, 2014 at 04:57:42AM -0500, Anders Kaseorg wrote:
> C99, C11, POSIX, and the glibc implementation do guarantee that the
> pointers passed to the qsort comparison function lie within the array.
> 
> Signed-off-by: Anders Kaseorg <andersk@mit.edu>
> ---
>  manual/search.texi | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/manual/search.texi b/manual/search.texi
> index 8aff574..9ebcb68 100644
> --- a/manual/search.texi
> +++ b/manual/search.texi
> @@ -180,10 +180,12 @@ This can make a difference when the comparison considers only part of
>  the elements.  Two elements with the same sort key may differ in other
>  respects.
>  
> -The addresses passed to the comparison function need not correspond with
> -the original location of the objects, and need not even lie within the
> -original array.  The only way to perform a stable sort with @var{qsort}
> -is to first augment the objects with a monotonic counter of some kind.
> +Although the object addresses passed to the comparison function lie
> +within the array, they need not correspond with the original locations
> +of those objects, because the sorting algorithm may swap around
> +objects in the array before making some comparisons.  The only way to
> +perform a stable sort with @var{qsort} is to first augment the objects
> +with a monotonic counter of some kind.

I think "the only way" is mistaken. I would simply add to the previous
sentence:

"and therefore, the addresses of the objects may not be used as part
of the comparison for producing a stable sort"

or similar. If the next sentence is also desired, it could say "One
way", or "The canonical way" rather than "The only way". Note that
another way, often optimal in time anyway due to the smaller amount of
data to be moved, is to create a new array of pointers to objects in
the original array. Then the array of pointers can be sorted using the
address (not the address of the pointer, but the address stored in the
pointer) as a factor, and then the objects in the original array can
be moved into the resulting order if desired (often it's not needed to
do so anyway).

Rich
  
Anders Kaseorg Dec. 11, 2014, 11 p.m. UTC | #3
On Thu, 11 Dec 2014, Rich Felker wrote:
> > -The addresses passed to the comparison function need not correspond with
> > -the original location of the objects, and need not even lie within the
> > -original array.  The only way to perform a stable sort with @var{qsort}
> > -is to first augment the objects with a monotonic counter of some kind.
> > +Although the object addresses passed to the comparison function lie
> > +within the array, they need not correspond with the original locations
> > +of those objects, because the sorting algorithm may swap around
> > +objects in the array before making some comparisons.  The only way to
> > +perform a stable sort with @var{qsort} is to first augment the objects
> > +with a monotonic counter of some kind.
> 
> I think "the only way" is mistaken.

Note that the second sentence is unchanged in this patch.  The goal of 
this patch is to fix an incorrect claim in the first sentence.

> Note that another way, often optimal in time anyway due to the smaller 
> amount of data to be moved, is to create a new array of pointers to 
> objects in the original array.  Then the array of pointers can be sorted 
> using the address (not the address of the pointer, but the address 
> stored in the pointer) as a factor,

That’s just another way to augment the objects with a monotonic counter, 
where the monotonic counter is the address.

> and then the objects in the original array can be moved into the 
> resulting order if desired (often it's not needed to do so anyway).

And this means using an additional nontrivial algorithm outside of qsort.  
So I’m not sure it counts as a way to “perform a stable sort with 
@var{qsort}”.

Anyway, the reason the second sentence is there at all is to explicitly 
contradict an incorrect recipe for stable qsort in previous versions of 
the manual (https://sourceware.org/bugzilla/show_bug.cgi?id=10672).  If 
you imagine you’re a programmer who wants to do a stable sort, and search 
the web for related keywords, you find the current version of the manual 
http://www.gnu.org/software/libc/manual/html_node/Array-Sort-Function.html 
(which actually still contains that incorrect claim, but will be fixed 
with the next release), along with some previous versions of the manual 
(which will probably never be updated).  So I think it’s important that 
the current version lets the programmer easily decide that the previous 
version was wrong.

If you’d still like the second sentence changed, can I suggest doing it in 
a separate patch?

Anders
  
Rich Felker Dec. 11, 2014, 11:40 p.m. UTC | #4
On Thu, Dec 11, 2014 at 06:00:18PM -0500, Anders Kaseorg wrote:
> On Thu, 11 Dec 2014, Rich Felker wrote:
> > > -The addresses passed to the comparison function need not correspond with
> > > -the original location of the objects, and need not even lie within the
> > > -original array.  The only way to perform a stable sort with @var{qsort}
> > > -is to first augment the objects with a monotonic counter of some kind.
> > > +Although the object addresses passed to the comparison function lie
> > > +within the array, they need not correspond with the original locations
> > > +of those objects, because the sorting algorithm may swap around
> > > +objects in the array before making some comparisons.  The only way to
> > > +perform a stable sort with @var{qsort} is to first augment the objects
> > > +with a monotonic counter of some kind.
> > 
> > I think "the only way" is mistaken.
> 
> Note that the second sentence is unchanged in this patch.  The goal of 
> this patch is to fix an incorrect claim in the first sentence.

Sorry, I missed that.

> > Note that another way, often optimal in time anyway due to the smaller 
> > amount of data to be moved, is to create a new array of pointers to 
> > objects in the original array.  Then the array of pointers can be sorted 
> > using the address (not the address of the pointer, but the address 
> > stored in the pointer) as a factor,
> 
> That’s just another way to augment the objects with a monotonic counter, 
> where the monotonic counter is the address.
> 
> > and then the objects in the original array can be moved into the 
> > resulting order if desired (often it's not needed to do so anyway).
> 
> And this means using an additional nontrivial algorithm outside of qsort.  
> So I’m not sure it counts as a way to “perform a stable sort with 
> @var{qsort}”.
> 
> Anyway, the reason the second sentence is there at all is to explicitly 
> contradict an incorrect recipe for stable qsort in previous versions of 
> the manual (https://sourceware.org/bugzilla/show_bug.cgi?id=10672).  If 
> you imagine you’re a programmer who wants to do a stable sort, and search 
> the web for related keywords, you find the current version of the manual 
> http://www.gnu.org/software/libc/manual/html_node/Array-Sort-Function.html 
> (which actually still contains that incorrect claim, but will be fixed 
> with the next release), along with some previous versions of the manual 
> (which will probably never be updated).  So I think it’s important that 
> the current version lets the programmer easily decide that the previous 
> version was wrong.
> 
> If you’d still like the second sentence changed, can I suggest doing it in 
> a separate patch?

Thanks for the detailed explanation. In that case I think it's best to
just keep the emphasis on the wrongness of that approach. The text
could perhaps be improved but it's not high-importance IMO. I was
working based on the wrong assumption that it was newly proposed text.

Rich
  

Patch

diff --git a/manual/search.texi b/manual/search.texi
index 8aff574..9ebcb68 100644
--- a/manual/search.texi
+++ b/manual/search.texi
@@ -180,10 +180,12 @@  This can make a difference when the comparison considers only part of
 the elements.  Two elements with the same sort key may differ in other
 respects.
 
-The addresses passed to the comparison function need not correspond with
-the original location of the objects, and need not even lie within the
-original array.  The only way to perform a stable sort with @var{qsort}
-is to first augment the objects with a monotonic counter of some kind.
+Although the object addresses passed to the comparison function lie
+within the array, they need not correspond with the original locations
+of those objects, because the sorting algorithm may swap around
+objects in the array before making some comparisons.  The only way to
+perform a stable sort with @var{qsort} is to first augment the objects
+with a monotonic counter of some kind.
 
 Here is a simple example of sorting an array of doubles in numerical
 order, using the comparison function defined above (@pxref{Comparison