Fix wordsize-32 mmap offset for negative value (BZ#18877)
Commit Message
This patch fixes the default wordsize-32 mmap implementation offset
calculation for negative values. Current code uses signed shift
operation to calculate the multiple size to use with syscall and
it is implementation defined. Change it to use a division base
on mmap page size (default being as before, 4096).
Tested on armv7hf.
[BZ #18877]
* misc/Makefile (tests): Add tst-mmap
* sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix
offset calculation for negative values.
* misc/tst-mmap.c: New file.
--
--
Comments
On Thu, Aug 27, 2015 at 05:13:13PM -0300, Adhemerval Zanella wrote:
> This patch fixes the default wordsize-32 mmap implementation offset
> calculation for negative values. Current code uses signed shift
> operation to calculate the multiple size to use with syscall and
> it is implementation defined. Change it to use a division base
> on mmap page size (default being as before, 4096).
>
> Tested on armv7hf.
>
> [BZ #18877]
> * misc/Makefile (tests): Add tst-mmap
> * sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix
> offset calculation for negative values.
> * misc/tst-mmap.c: New file.
As we have posix/tst-mmap.c already, my idea was to call it
tst-mmap-offset.c,
> + char fname[] = "tst-mmap-offset-XXXXXX";
hence the name of this temporary file.
> - offset >> MMAP_PAGE_SHIFT);
> + offset/MMAP_PAGE_UNIT);
offset / MMAP_PAGE_UNIT
I'm looking at failures of this new test for MIPS, but am puzzled by the
descriptions of this issue in terms of things being incorrect and fixed
without any statement of what is actually correct or why that is correct.
The expectation in the test seems to be that negative offsets passed to
mmap should be interpreted as large unsigned values. But I can't find
anything in POSIX, or in the glibc manual, or in the Linux man-pages
collection, to justify those semantics. So what is the basis for that
interpretation as part of the glibc API (and for its being compatible with
POSIX), rather than negative arguments being considered invalid?
(The MIPS mmap syscall uses a signed argument and does a signed arithmetic
shift on it, meaning that matching the MIPS semantics to other
architectures will involve using mmap2 for o32 and making mmap not an
alias for mmap64 for n32 because mmap for n32 will need to zero-extend its
offset argument before calling the syscall.)
On 20-01-2016 19:44, Joseph Myers wrote:
> I'm looking at failures of this new test for MIPS, but am puzzled by the
> descriptions of this issue in terms of things being incorrect and fixed
> without any statement of what is actually correct or why that is correct.
>
> The expectation in the test seems to be that negative offsets passed to
> mmap should be interpreted as large unsigned values. But I can't find
> anything in POSIX, or in the glibc manual, or in the Linux man-pages
> collection, to justify those semantics. So what is the basis for that
> interpretation as part of the glibc API (and for its being compatible with
> POSIX), rather than negative arguments being considered invalid?
>
You are correct, I bluntly used the bz report testcase as the one for the
patch. And checking both Linux and POSIX negative offset are not really
specified, so the issue at BZ#18877 can't really being considered a regression
since the offset usage is undefined behaviour. The fix itself on the
wordsize-32 I still consider valid since it was using a shift arithmetic
signed is implementation-defined behaviour (and the divide operation is
assumed to not be implementation-defined).
About the test I think we should just remove it and state that invalid
offsets are undefined or architecture define.
> (The MIPS mmap syscall uses a signed argument and does a signed arithmetic
> shift on it, meaning that matching the MIPS semantics to other
> architectures will involve using mmap2 for o32 and making mmap not an
> alias for mmap64 for n32 because mmap for n32 will need to zero-extend its
> offset argument before calling the syscall.)
I think this can be a future cleanup to simplify implementations.
On Thu, 21 Jan 2016, Adhemerval Zanella wrote:
> You are correct, I bluntly used the bz report testcase as the one for the
> patch. And checking both Linux and POSIX negative offset are not really
> specified, so the issue at BZ#18877 can't really being considered a regression
> since the offset usage is undefined behaviour. The fix itself on the
> wordsize-32 I still consider valid since it was using a shift arithmetic
> signed is implementation-defined behaviour (and the divide operation is
> assumed to not be implementation-defined).
The shift is fully defined in GNU C, which is what is used by glibc. The
actual fix isn't the use of the divide (if the low bits are zero, signed
division and signed arithmetic right shift have the same effect) - it's
that you're dividing by an *unsigned* quantity whose type is such that C
type promotion rules cause the signed offset to be converted to unsigned
before the division (and converting to an unsigned type before shifting
right would have just the same effect).
> About the test I think we should just remove it and state that invalid
> offsets are undefined or architecture define.
I don't think saying they are architecture-defined makes sense - in
general the glibc API should be consistent between architectures (and we
test for the glibc API, not just for the POSIX API). And if it's
undefined (as opposed to invalid and requiring an error), again, what is
the basis for that in POSIX?
On 21-01-2016 12:12, Joseph Myers wrote:
> On Thu, 21 Jan 2016, Adhemerval Zanella wrote:
>
>> You are correct, I bluntly used the bz report testcase as the one for the
>> patch. And checking both Linux and POSIX negative offset are not really
>> specified, so the issue at BZ#18877 can't really being considered a regression
>> since the offset usage is undefined behaviour. The fix itself on the
>> wordsize-32 I still consider valid since it was using a shift arithmetic
>> signed is implementation-defined behaviour (and the divide operation is
>> assumed to not be implementation-defined).
>
> The shift is fully defined in GNU C, which is what is used by glibc. The
> actual fix isn't the use of the divide (if the low bits are zero, signed
> division and signed arithmetic right shift have the same effect) - it's
> that you're dividing by an *unsigned* quantity whose type is such that C
> type promotion rules cause the signed offset to be converted to unsigned
> before the division (and converting to an unsigned type before shifting
> right would have just the same effect).
Right I oversee the unsigned promotion rule, so seems the right signed
shift should be safe.
>
>> About the test I think we should just remove it and state that invalid
>> offsets are undefined or architecture define.
>
> I don't think saying they are architecture-defined makes sense - in
> general the glibc API should be consistent between architectures (and we
> test for the glibc API, not just for the POSIX API). And if it's
> undefined (as opposed to invalid and requiring an error), again, what is
> the basis for that in POSIX?
>
OK I am open to suggestion in such case to how to handle negative offsets.
Does anyone else have comments on my question
<https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the
desired architecture-independent mmap semantics should be for negative
offsets?
On Wed, Jan 20, 2016 at 09:44:26PM +0000, Joseph Myers wrote:
> I'm looking at failures of this new test for MIPS, but am puzzled by the
> descriptions of this issue in terms of things being incorrect and fixed
> without any statement of what is actually correct or why that is correct.
>
> The expectation in the test seems to be that negative offsets passed to
> mmap should be interpreted as large unsigned values. But I can't find
> anything in POSIX, or in the glibc manual, or in the Linux man-pages
> collection, to justify those semantics. So what is the basis for that
> interpretation as part of the glibc API (and for its being compatible with
> POSIX), rather than negative arguments being considered invalid?
Is there any 32-bit kernel currently supported by glibc where underlying
system call doesn't operate with 4K or pagesize units for offsets?
On 28-01-2016 15:45, Joseph Myers wrote:
> Does anyone else have comments on my question
> <https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the
> desired architecture-independent mmap semantics should be for negative
> offsets?
>
I recheck Linux handling of mmap syscall and I would say we should handle
it an unsigned integer since it is what general mmap implementation is
being based current (mmap_pgoff - mm/mmap.c).
The problem is some architecture implements mmap (either default or
for compatibility) as signed and others as unsigned. As far I could
check mmap2 is handled as unsigned for all the ports.
Now from GLIBC side I see we need to start use mmap2 where is possible
for the ports that already are not using it (mips for instance). The
problem is ports that only provides mmap with signed offset and for
such cases I am not sure how we will enforce architecture-independent
semantics for negative offset.
@@ -76,7 +76,8 @@ install-lib := libg.a
gpl2lgpl := error.c error.h
tests := tst-dirname tst-tsearch tst-fdset tst-efgcvt tst-mntent tst-hsearch \
- tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1
+ tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1 \
+ tst-mmap
ifeq ($(run-built-tests),yes)
tests-special += $(objpfx)tst-error1-mem.out
endif
new file mode 100644
@@ -0,0 +1,67 @@
+/* mmap offset test.
+
+ Copyright (C) 2015 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+static int
+printmsg (int rc, const char *msg)
+{
+ printf ("%s failed: %m\n", msg);
+ return rc;
+}
+
+/* Check if negative offset are handled correctly by mmap. */;
+static int
+do_test_bz18877 (void)
+{
+ const int prot = PROT_READ | PROT_WRITE;
+ const int flags = MAP_SHARED;
+ const unsigned long length = 0x10000;
+ const unsigned long offset = 0xace00000;
+ const unsigned long size = offset + length;
+ void *addr;
+ int fd;
+ char fname[] = "tst-mmap-offset-XXXXXX";
+
+ fd = mkstemp64 (fname);
+ if (fd < 0)
+ return printmsg (1, "mkstemp");
+
+ if (unlink (fname))
+ return printmsg (1, "unlink");
+
+ if (ftruncate64 (fd, size))
+ return printmsg (0, "ftruncate64");
+
+ addr = mmap (NULL, length, prot, flags, fd, offset);
+ if (MAP_FAILED == addr)
+ return printmsg (1, "mmap");
+
+ /* This memcpy is likely to SIGBUS if mmap has messed up with offset. */
+ memcpy (addr, fname, sizeof (fname));
+
+ return 0;
+}
+
+#define TEST_FUNCTION do_test_bz18877 ()
+#include "../test-skeleton.c"
@@ -21,20 +21,20 @@
#include <errno.h>
#include <sysdep.h>
-#ifndef MMAP_PAGE_SHIFT
-#define MMAP_PAGE_SHIFT 12
+#ifndef MMAP_PAGE_UNIT
+#define MMAP_PAGE_UNIT 4096UL
#endif
__ptr_t
__mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset)
{
- if (offset & ((1 << MMAP_PAGE_SHIFT) - 1))
+ if (offset & (MMAP_PAGE_UNIT - 1))
{
__set_errno (EINVAL);
return MAP_FAILED;
}
return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd,
- offset >> MMAP_PAGE_SHIFT);
+ offset/MMAP_PAGE_UNIT);
}
weak_alias (__mmap, mmap)