[COMMITTED] manual/filesys.texi: O_WRONLY files also fail with EBADF when used with posix_fallocate emulation.

Message ID 560AAF09.7010407@redhat.com
State Committed
Headers

Commit Message

Carlos O'Donell Sept. 29, 2015, 3:32 p.m. UTC
  In the posix_fallocate description in the manual we list various
drawbacks with the emulation, including the fact that a file opened
with O_APPEND fails with EBADF. Similarly a file opened with O_WRONLY
fails with EBADF. We must be able to emulate a compare-and-swap via
pread/compare/pwrite in order to make the emulation as safe as possible.
It is not acceptable to ignore the read failure because it could result
in significant data loss across all of the blocks. There is no other way
to make this work without a true atomic CAS and SIGBUS handler (which
is looking more attractive as a way to remove the race condition).

This patch adds O_WRONLY to the manual as another bullet to clarify the
limits of the emulation.

Manual looks good in PDF.

Committed.

Cheers,
Carlos.

2015-09-29  Carlos O'Donell  <carlos@redhat.com>

	* manual/filesys.texi (Storage Allocation): Document that
	posix_fallocate emulation fails when fd is open with O_WRONLY.
  

Comments

Carlos O'Donell Sept. 29, 2015, 3:47 p.m. UTC | #1
On 09/29/2015 11:32 AM, Carlos O'Donell wrote:
> In the posix_fallocate description in the manual we list various
> drawbacks with the emulation, including the fact that a file opened
> with O_APPEND fails with EBADF. Similarly a file opened with O_WRONLY
> fails with EBADF. We must be able to emulate a compare-and-swap via
> pread/compare/pwrite in order to make the emulation as safe as possible.
> It is not acceptable to ignore the read failure because it could result
> in significant data loss across all of the blocks. There is no other way
> to make this work without a true atomic CAS and SIGBUS handler (which
> is looking more attractive as a way to remove the race condition).

Just a note here, this still doesn't solve the O_WRONLY problem, because
such an fd can't be mapped PROT_WRITE since it violates the permissions
of file descriptor. We would need an fdreopen syscall, which doesn't exist,
and that would open a race window where other threads might actually
succeed at reading a file that should be write only (potential security
risk).

Cheers,
Carlos.
  
Florian Weimer Oct. 22, 2015, 7:42 p.m. UTC | #2
On 09/29/2015 05:47 PM, Carlos O'Donell wrote:
> On 09/29/2015 11:32 AM, Carlos O'Donell wrote:
>> In the posix_fallocate description in the manual we list various
>> drawbacks with the emulation, including the fact that a file opened
>> with O_APPEND fails with EBADF. Similarly a file opened with O_WRONLY
>> fails with EBADF. We must be able to emulate a compare-and-swap via
>> pread/compare/pwrite in order to make the emulation as safe as possible.
>> It is not acceptable to ignore the read failure because it could result
>> in significant data loss across all of the blocks. There is no other way
>> to make this work without a true atomic CAS and SIGBUS handler (which
>> is looking more attractive as a way to remove the race condition).
> 
> Just a note here, this still doesn't solve the O_WRONLY problem, because
> such an fd can't be mapped PROT_WRITE since it violates the permissions
> of file descriptor. We would need an fdreopen syscall, which doesn't exist,
> and that would open a race window where other threads might actually
> succeed at reading a file that should be write only (potential security
> risk).

fdreopen can be emulated through /proc on Linux, but it does not solve
the problem with POSIX fcntl locking semantics (because *any* close
operation by the process on the same file, not file descriptor, releases
the locks).

Florian
  
Rich Felker Oct. 24, 2015, 6:07 a.m. UTC | #3
On Thu, Oct 22, 2015 at 09:42:35PM +0200, Florian Weimer wrote:
> On 09/29/2015 05:47 PM, Carlos O'Donell wrote:
> > On 09/29/2015 11:32 AM, Carlos O'Donell wrote:
> >> In the posix_fallocate description in the manual we list various
> >> drawbacks with the emulation, including the fact that a file opened
> >> with O_APPEND fails with EBADF. Similarly a file opened with O_WRONLY
> >> fails with EBADF. We must be able to emulate a compare-and-swap via
> >> pread/compare/pwrite in order to make the emulation as safe as possible.
> >> It is not acceptable to ignore the read failure because it could result
> >> in significant data loss across all of the blocks. There is no other way
> >> to make this work without a true atomic CAS and SIGBUS handler (which
> >> is looking more attractive as a way to remove the race condition).
> > 
> > Just a note here, this still doesn't solve the O_WRONLY problem, because
> > such an fd can't be mapped PROT_WRITE since it violates the permissions
> > of file descriptor. We would need an fdreopen syscall, which doesn't exist,
> > and that would open a race window where other threads might actually
> > succeed at reading a file that should be write only (potential security
> > risk).
> 
> fdreopen can be emulated through /proc on Linux, but it does not solve
> the problem with POSIX fcntl locking semantics (because *any* close
> operation by the process on the same file, not file descriptor, releases
> the locks).

This can presumably be avoided by using a child process (or possibly a
thread that does unshare(CLONE_FILES) or something) to do the work.
Rather hideous, but this whole emulation mess is hideous to begin with.

Rich
  

Patch

diff --git a/manual/filesys.texi b/manual/filesys.texi
index ed4f5fd..4064885 100644
--- a/manual/filesys.texi
+++ b/manual/filesys.texi
@@ -3300,6 +3300,10 @@  underlying file in the to-be-allocated area.  Non-null bytes could be
 overwritten with null bytes.
 
 @item
+If @var{fd} has been opened with the @code{O_WRONLY} flag, the function
+will fail with an @code{errno} value of @code{EBADF}.
+
+@item
 If @var{fd} has been opened with the @code{O_APPEND} flag, the function
 will fail with an @code{errno} value of @code{EBADF}.