[v2] Fix offset caching for streams and use it for ftell (BZ #16680)

Message ID 20140311124615.GA2182@spoyarek.pnq.redhat.com
State Committed
Headers

Commit Message

Siddhesh Poyarekar March 11, 2014, 12:46 p.m. UTC
  Here's take 2:

The ftell implementation was made conservative to ensure that
incorrectly cached offsets never affect it.  However, this causes
problems for append mode when a file stream is rewound.  Additionally,
the 'clever' trick of using stat to get position for append mode files
caused more problems than it solved and broke some old behavior.  I
have described below the various problems that it caused and then
finally the solution.

For a and a+ mode files, rewinding the stream should result in ftell
returning 0 as the offset, but the stat() trick caused it to
(incorrectly) always return the end of file.  Now I couldn't find
anything in POSIX that specifies the stream position after rewind()
for a file opened in 'a' mode, but for 'a+' mode it should be set to
0.  For 'a' mode too, it probably makes sense to keep it set to 0 in
the interest of retaining old behavior.

The initial file position for append mode files is implementation
defined, so the implementation could either retain the current file
position or move the position to the end of file.  The earlier ftell
implementation would move the offset to end of file for append-only
mode, but retain the old offset for a+ mode.  It would also cache the
offset (this detail is important).  My patch broke this and would set
the initial position to end of file for both append modes, thus
breaking old behavior.  I was ignorant enough to write an incorrect
test case for it too.

The Change:

I have now brought back the behavior of seeking to end of file for
append-only streams, but with a slight difference.  I don't cache the
offset though, since we would want ftell to query the current file
position through lseek while the stream is not active.  Since the
offset is moved to the end of file, we can rely on the file position
reported by lseek and we don't need to resort to the stat() nonsense.

Finally, the cache is always reliable, except when there are unflished
writes in an append mode stream (i.e. both a and a+).  In the latter
case, it is safe to just do an lseek to SEEK_END.  The value can be
safely cached too, since the file handle is already active at this
point.  Incidentally, this is the only state change we affect in the
file handle (apart from taking locks of course).

I have also updated the test case to correct my impression of the
initial file position for a+ streams to the initial behavior.  I have
verified that this does not break any existing tests in the testsuite
and also passes with the new tests.

Siddhesh

	[BZ #16680]
	* libio/fileops.c (_IO_file_open): Seek to end of file but
	don't cache the offset.
	(get_file_offset): Remove function.
	(do_ftell): Use cached offset when available.
	* libio/iofdopen.c (_IO_new_fdopen): Seek to end of file but
	don't cache the offset.
	* libio/tst-ftell-active-handler.c (do_rewind_test): New test
	case.
	(do_one_test): Call it.
	(do_ftell_test): Fix up expected old offset for a+ mode.
	* libio/wfileops.c (do_ftell_wide): Used cached offset when
	available.

---
 libio/fileops.c                  | 102 +++++++++++++----------------------
 libio/iofdopen.c                 |   9 ++++
 libio/tst-ftell-active-handler.c | 112 +++++++++++++++++++++++++++++++++++++--
 libio/wfileops.c                 |  47 ++++++++--------
 4 files changed, 178 insertions(+), 92 deletions(-)
  

Comments

Siddhesh Poyarekar March 12, 2014, 6:11 p.m. UTC | #1
Carlos suggested that I write up a brief description of ftell and its
interaction with the offset cache, so here it is:

https://sourceware.org/glibc/wiki/File%20offsets%20in%20a%20stdio%20stream%20and%20ftell

Siddhesh

On Tue, Mar 11, 2014 at 06:16:16PM +0530, Siddhesh Poyarekar wrote:
> Here's take 2:
> 
> The ftell implementation was made conservative to ensure that
> incorrectly cached offsets never affect it.  However, this causes
> problems for append mode when a file stream is rewound.  Additionally,
> the 'clever' trick of using stat to get position for append mode files
> caused more problems than it solved and broke some old behavior.  I
> have described below the various problems that it caused and then
> finally the solution.
> 
> For a and a+ mode files, rewinding the stream should result in ftell
> returning 0 as the offset, but the stat() trick caused it to
> (incorrectly) always return the end of file.  Now I couldn't find
> anything in POSIX that specifies the stream position after rewind()
> for a file opened in 'a' mode, but for 'a+' mode it should be set to
> 0.  For 'a' mode too, it probably makes sense to keep it set to 0 in
> the interest of retaining old behavior.
> 
> The initial file position for append mode files is implementation
> defined, so the implementation could either retain the current file
> position or move the position to the end of file.  The earlier ftell
> implementation would move the offset to end of file for append-only
> mode, but retain the old offset for a+ mode.  It would also cache the
> offset (this detail is important).  My patch broke this and would set
> the initial position to end of file for both append modes, thus
> breaking old behavior.  I was ignorant enough to write an incorrect
> test case for it too.
> 
> The Change:
> 
> I have now brought back the behavior of seeking to end of file for
> append-only streams, but with a slight difference.  I don't cache the
> offset though, since we would want ftell to query the current file
> position through lseek while the stream is not active.  Since the
> offset is moved to the end of file, we can rely on the file position
> reported by lseek and we don't need to resort to the stat() nonsense.
> 
> Finally, the cache is always reliable, except when there are unflished
> writes in an append mode stream (i.e. both a and a+).  In the latter
> case, it is safe to just do an lseek to SEEK_END.  The value can be
> safely cached too, since the file handle is already active at this
> point.  Incidentally, this is the only state change we affect in the
> file handle (apart from taking locks of course).
> 
> I have also updated the test case to correct my impression of the
> initial file position for a+ streams to the initial behavior.  I have
> verified that this does not break any existing tests in the testsuite
> and also passes with the new tests.
> 
> Siddhesh
> 
> 	[BZ #16680]
> 	* libio/fileops.c (_IO_file_open): Seek to end of file but
> 	don't cache the offset.
> 	(get_file_offset): Remove function.
> 	(do_ftell): Use cached offset when available.
> 	* libio/iofdopen.c (_IO_new_fdopen): Seek to end of file but
> 	don't cache the offset.
> 	* libio/tst-ftell-active-handler.c (do_rewind_test): New test
> 	case.
> 	(do_one_test): Call it.
> 	(do_ftell_test): Fix up expected old offset for a+ mode.
> 	* libio/wfileops.c (do_ftell_wide): Used cached offset when
> 	available.
> 
> ---
>  libio/fileops.c                  | 102 +++++++++++++----------------------
>  libio/iofdopen.c                 |   9 ++++
>  libio/tst-ftell-active-handler.c | 112 +++++++++++++++++++++++++++++++++++++--
>  libio/wfileops.c                 |  47 ++++++++--------
>  4 files changed, 178 insertions(+), 92 deletions(-)
> 
> diff --git a/libio/fileops.c b/libio/fileops.c
> index 2e7bc8d..cf68dbf 100644
> --- a/libio/fileops.c
> +++ b/libio/fileops.c
> @@ -232,13 +232,18 @@ _IO_file_open (fp, filename, posix_mode, prot, read_write, is32not64)
>      return NULL;
>    fp->_fileno = fdesc;
>    _IO_mask_flags (fp, read_write,_IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
> -  if ((read_write & _IO_IS_APPENDING) && (read_write & _IO_NO_READS))
> -    if (_IO_SEEKOFF (fp, (_IO_off64_t)0, _IO_seek_end, _IOS_INPUT|_IOS_OUTPUT)
> -	== _IO_pos_BAD && errno != ESPIPE)
> -      {
> -	close_not_cancel (fdesc);
> -	return NULL;
> -      }
> +  /* For append mode, send the file offset to the end of the file.  Don't
> +     update the offset cache though, since the file handle is not active.  */
> +  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
> +      == (_IO_IS_APPENDING | _IO_NO_READS))
> +    {
> +      _IO_off64_t new_pos = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
> +	{
> +	  close_not_cancel (fdesc);
> +	  return NULL;
> +	}
> +    }
>    _IO_link_in ((struct _IO_FILE_plus *) fp);
>    return fp;
>  }
> @@ -929,43 +934,13 @@ _IO_file_sync_mmap (_IO_FILE *fp)
>    return 0;
>  }
>  
> -/* Get the current file offset using a system call.  This is the safest method
> -   to get the current file offset, since we are sure that we get the current
> -   state of the file.  Before the stream handle is activated (by using fread,
> -   fwrite, etc.), an application may alter the state of the file descriptor
> -   underlying it by calling read/write/lseek on it.  Using a cached offset at
> -   this point will result in returning the incorrect value.  Same is the case
> -   when one switches from reading in a+ mode to writing, where the buffer has
> -   not been flushed - the cached offset would reflect the reading position
> -   while the actual write position would be at the end of the file.
> -
> -   do_ftell and do_ftell_wide may resort to using the cached offset in some
> -   special cases instead of calling get_file_offset, but those cases should be
> -   thoroughly described.  */
> -_IO_off64_t
> -get_file_offset (_IO_FILE *fp)
> -{
> -  if ((fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING)
> -    {
> -      struct stat64 st;
> -      bool ret = (_IO_SYSSTAT (fp, &st) == 0 && S_ISREG (st.st_mode));
> -      if (ret)
> -	return st.st_size;
> -      else
> -	return EOF;
> -    }
> -  else
> -    return _IO_SYSSEEK (fp, 0, _IO_seek_cur);
> -}
> -
> -
> -/* ftell{,o} implementation.  Don't modify any state of the file pointer while
> -   we try to get the current state of the stream.  */
> +/* ftell{,o} implementation.  The only time we modify the state of the stream
> +   is when we have unflushed writes.  In that case we seek to the end and
> +   record that offset in the stream object.  */
>  static _IO_off64_t
>  do_ftell (_IO_FILE *fp)
>  {
> -  _IO_off64_t result = 0;
> -  bool use_cached_offset = false;
> +  _IO_off64_t result, offset = 0;
>  
>    /* No point looking at unflushed data if we haven't allocated buffers
>       yet.  */
> @@ -974,39 +949,37 @@ do_ftell (_IO_FILE *fp)
>        bool was_writing = (fp->_IO_write_ptr > fp->_IO_write_base
>  			  || _IO_in_put_mode (fp));
>  
> +      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
> +
> +      /* When we have unflushed writes in append mode, seek to the end of the
> +	 file and record that offset.  This is the only time we change the file
> +	 stream state and it is safe since the file handle is active.  */
> +      if (was_writing && append_mode)
> +	{
> +	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +	  if (result == _IO_pos_BAD)
> +	    return EOF;
> +	  else
> +	    fp->_offset = result;
> +	}
> +
>        /* Adjust for unflushed data.  */
>        if (!was_writing)
> -	result -= fp->_IO_read_end - fp->_IO_read_ptr;
> +	offset -= fp->_IO_read_end - fp->_IO_read_ptr;
>        else
> -	result += fp->_IO_write_ptr - fp->_IO_read_end;
> -
> -      /* It is safe to use the cached offset when available if there is
> -	 unbuffered data (indicating that the file handle is active) and the
> -	 handle is not for a file open in a+ mode.  The latter condition is
> -	 because there could be a scenario where there is a switch from read
> -	 mode to write mode using an fseek to an arbitrary position.  In this
> -	 case, there would be unbuffered data due to be appended to the end of
> -	 the file, but the offset may not necessarily be the end of the
> -	 file.  It is fine to use the cached offset when the a+ stream is in
> -	 read mode though, since the offset is maintained correctly in that
> -	 case.  Note that this is not a comprehensive set of cases when the
> -	 offset is reliable.  The offset may be reliable even in some cases
> -	 where there is no unflushed input and the handle is active, but it's
> -	 just that we don't have a way to identify that condition reliably.  */
> -      use_cached_offset = (result != 0 && fp->_offset != _IO_pos_BAD
> -			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
> -			       == (_IO_IS_APPENDING | _IO_NO_READS)
> -			       && was_writing));
> +	offset += fp->_IO_write_ptr - fp->_IO_read_end;
>      }
>  
> -  if (use_cached_offset)
> -    result += fp->_offset;
> +  if (fp->_offset != _IO_pos_BAD)
> +    result = fp->_offset;
>    else
> -    result += get_file_offset (fp);
> +    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);
>  
>    if (result == EOF)
>      return result;
>  
> +  result += offset;
> +
>    if (result < 0)
>      {
>        __set_errno (EINVAL);
> @@ -1016,7 +989,6 @@ do_ftell (_IO_FILE *fp)
>    return result;
>  }
>  
> -
>  _IO_off64_t
>  _IO_new_file_seekoff (fp, offset, dir, mode)
>       _IO_FILE *fp;
> diff --git a/libio/iofdopen.c b/libio/iofdopen.c
> index 3f266f7..364e175 100644
> --- a/libio/iofdopen.c
> +++ b/libio/iofdopen.c
> @@ -167,6 +167,15 @@ _IO_new_fdopen (fd, mode)
>    _IO_mask_flags (&new_f->fp.file, read_write,
>  		  _IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
>  
> +  /* For append mode, send the file offset to the end of the file.  Don't
> +     update the offset cache though, since the file handle is not active.  */
> +  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
> +      == (_IO_IS_APPENDING | _IO_NO_READS))
> +    {
> +      _IO_off64_t new_pos = _IO_SYSSEEK (&new_f->fp.file, 0, _IO_seek_end);
> +      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
> +	return NULL;
> +    }
>    return &new_f->fp.file;
>  }
>  libc_hidden_ver (_IO_new_fdopen, _IO_fdopen)
> diff --git a/libio/tst-ftell-active-handler.c b/libio/tst-ftell-active-handler.c
> index 5d5fc26..aef9cc0 100644
> --- a/libio/tst-ftell-active-handler.c
> +++ b/libio/tst-ftell-active-handler.c
> @@ -88,6 +88,107 @@ static size_t file_len;
>  typedef int (*fputs_func_t) (const void *data, FILE *fp);
>  fputs_func_t fputs_func;
>  
> +/* Test that ftell output after a rewind is correct.  */
> +static int
> +do_rewind_test (const char *filename)
> +{
> +  int ret = 0;
> +  struct test
> +    {
> +      const char *mode;
> +      int fd_mode;
> +      size_t old_off;
> +      size_t new_off;
> +    } test_modes[] = {
> +	  {"w", O_WRONLY, 0, data_len},
> +	  {"w+", O_RDWR, 0, data_len},
> +	  {"r+", O_RDWR, 0, data_len},
> +	  /* The new offsets for 'a' and 'a+' modes have to factor in the
> +	     previous writes since they compulsorily append to the end of the
> +	     file.  */
> +	  {"a", O_WRONLY, 0, 3 * data_len},
> +	  {"a+", O_RDWR, 0, 4 * data_len},
> +    };
> +
> +  /* Empty the file before the test so that our offsets are simple to
> +     calculate.  */
> +  FILE *fp = fopen (filename, "w");
> +  if (fp == NULL)
> +    {
> +      printf ("Failed to open file for emptying\n");
> +      return 1;
> +    }
> +  fclose (fp);
> +
> +  for (int j = 0; j < 2; j++)
> +    {
> +      for (int i = 0; i < sizeof (test_modes) / sizeof (struct test); i++)
> +	{
> +	  FILE *fp;
> +	  int fd;
> +	  int fileret;
> +
> +	  printf ("\trewind: %s (file, \"%s\"): ", j == 0 ? "fdopen" : "fopen",
> +		  test_modes[i].mode);
> +
> +	  if (j == 0)
> +	    fileret = get_handles_fdopen (filename, fd, fp,
> +					  test_modes[i].fd_mode,
> +					  test_modes[i].mode);
> +	  else
> +	    fileret = get_handles_fopen (filename, fd, fp, test_modes[i].mode);
> +
> +	  if (fileret != 0)
> +	    return fileret;
> +
> +	  /* Write some content to the file, rewind and ensure that the ftell
> +	     output after the rewind is 0.  POSIX does not specify what the
> +	     behavior is when a file is rewound in 'a' mode, so we retain
> +	     current behavior, which is to keep the 0 offset.  */
> +	  size_t written = fputs_func (data, fp);
> +
> +	  if (written == EOF)
> +	    {
> +	      printf ("fputs[1] failed to write data\n");
> +	      ret |= 1;
> +	    }
> +
> +	  rewind (fp);
> +	  long offset = ftell (fp);
> +
> +	  if (offset != test_modes[i].old_off)
> +	    {
> +	      printf ("Incorrect old offset.  Expected %zu, but got %ld, ",
> +		      test_modes[i].old_off, offset);
> +	      ret |= 1;
> +	    }
> +	  else
> +	    printf ("old offset = %ld, ", offset);
> +
> +	  written = fputs_func (data, fp);
> +
> +	  if (written == EOF)
> +	    {
> +	      printf ("fputs[1] failed to write data\n");
> +	      ret |= 1;
> +	    }
> +
> +	  /* After this write, the offset in append modes should factor in the
> +	     implicit lseek to the end of file.  */
> +	  offset = ftell (fp);
> +	  if (offset != test_modes[i].new_off)
> +	    {
> +	      printf ("Incorrect new offset.  Expected %zu, but got %ld\n",
> +		      test_modes[i].new_off, offset);
> +	      ret |= 1;
> +	    }
> +	  else
> +	    printf ("new offset = %ld\n", offset);
> +	}
> +    }
> +  return ret;
> +}
> +
>  /* Test that the value of ftell is not cached when the stream handle is not
>     active.  */
>  static int
> @@ -107,11 +208,13 @@ do_ftell_test (const char *filename)
>  	  {"w", O_WRONLY, 0, data_len},
>  	  {"w+", O_RDWR, 0, data_len},
>  	  {"r+", O_RDWR, 0, data_len},
> -	  /* For 'a' and 'a+' modes, the initial file position should be the
> +	  /* For the 'a' mode, the initial file position should be the
>  	     current end of file. After the write, the offset has data_len
> -	     added to the old value.  */
> +	     added to the old value.  For a+ mode however, the initial file
> +	     position is the file position of the underlying file descriptor,
> +	     since it is initially assumed to be in read mode.  */
>  	  {"a", O_WRONLY, data_len, 2 * data_len},
> -	  {"a+", O_RDWR, 2 * data_len, 3 * data_len},
> +	  {"a+", O_RDWR, 0, 3 * data_len},
>      };
>    for (int j = 0; j < 2; j++)
>      {
> @@ -157,7 +260,7 @@ do_ftell_test (const char *filename)
>  	  if (off != test_modes[i].new_off)
>  	    {
>  	      printf ("Incorrect new offset.  Expected %zu but got %ld\n",
> -		      test_modes[i].old_off, off);
> +		      test_modes[i].new_off, off);
>  	      ret |= 1;
>  	    }
>  	  else
> @@ -322,6 +425,7 @@ do_one_test (const char *filename)
>    ret |= do_ftell_test (filename);
>    ret |= do_write_test (filename);
>    ret |= do_append_test (filename);
> +  ret |= do_rewind_test (filename);
>  
>    return ret;
>  }
> diff --git a/libio/wfileops.c b/libio/wfileops.c
> index 8b2e108..3199861 100644
> --- a/libio/wfileops.c
> +++ b/libio/wfileops.c
> @@ -597,12 +597,12 @@ done:
>  }
>  
>  /* ftell{,o} implementation for wide mode.  Don't modify any state of the file
> -   pointer while we try to get the current state of the stream.  */
> +   pointer while we try to get the current state of the stream except in one
> +   case, which is when we have unflushed writes in append mode.  */
>  static _IO_off64_t
>  do_ftell_wide (_IO_FILE *fp)
>  {
>    _IO_off64_t result, offset = 0;
> -  bool use_cached_offset = false;
>  
>    /* No point looking for offsets in the buffer if it hasn't even been
>       allocated.  */
> @@ -615,6 +615,20 @@ do_ftell_wide (_IO_FILE *fp)
>  			   > fp->_wide_data->_IO_write_base)
>  			  || _IO_in_put_mode (fp));
>  
> +      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
> +
> +      /* When we have unflushed writes in append mode, seek to the end of the
> +	 file and record that offset.  This is the only time we change the file
> +	 stream state and it is safe since the file handle is active.  */
> +      if (was_writing && append_mode)
> +	{
> +	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +	  if (result == _IO_pos_BAD)
> +	    return EOF;
> +	  else
> +	    fp->_offset = result;
> +	}
> +
>        /* XXX For wide stream with backup store it is not very
>  	 reasonable to determine the offset.  The pushed-back
>  	 character might require a state change and we need not be
> @@ -703,37 +717,24 @@ do_ftell_wide (_IO_FILE *fp)
>  	     position is fp._offset - (_IO_read_end - new_write_ptr).  */
>  	  offset -= fp->_IO_read_end - fp->_IO_write_ptr;
>  	}
> -
> -      /* It is safe to use the cached offset when available if there is
> -	 unbuffered data (indicating that the file handle is active) and
> -	 the handle is not for a file open in a+ mode.  The latter
> -	 condition is because there could be a scenario where there is a
> -	 switch from read mode to write mode using an fseek to an arbitrary
> -	 position.  In this case, there would be unbuffered data due to be
> -	 appended to the end of the file, but the offset may not
> -	 necessarily be the end of the file.  It is fine to use the cached
> -	 offset when the a+ stream is in read mode though, since the offset
> -	 is maintained correctly in that case.  Note that this is not a
> -	 comprehensive set of cases when the offset is reliable.  The
> -	 offset may be reliable even in some cases where there is no
> -	 unflushed input and the handle is active, but it's just that we
> -	 don't have a way to identify that condition reliably.  */
> -      use_cached_offset = (offset != 0 && fp->_offset != _IO_pos_BAD
> -			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
> -			       == (_IO_IS_APPENDING | _IO_NO_READS)
> -			       && was_writing));
>      }
>  
> -  if (use_cached_offset)
> +  if (fp->_offset != _IO_pos_BAD)
>      result = fp->_offset;
>    else
> -    result = get_file_offset (fp);
> +    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);
>  
>    if (result == EOF)
>      return result;
>  
>    result += offset;
>  
> +  if (result < 0)
> +    {
> +      __set_errno (EINVAL);
> +      return EOF;
> +    }
> +
>    return result;
>  }
>  
> -- 
> 1.8.3.1
>
  
Rich Felker March 12, 2014, 6:18 p.m. UTC | #2
On Wed, Mar 12, 2014 at 11:41:27PM +0530, Siddhesh Poyarekar wrote:
> Carlos suggested that I write up a brief description of ftell and its
> interaction with the offset cache, so here it is:
> 
> https://sourceware.org/glibc/wiki/File%20offsets%20in%20a%20stdio%20stream%20and%20ftell

At least one line in yout table is wrong: ftell can never change the
offset when the handle is inactive. You have it changing the offset in
this case for append-mode files. I don't see why you want to do this.

Rich
  
Patchwork Bot March 12, 2014, 6:24 p.m. UTC | #3
On 12 March 2014 23:48, Rich Felker <dalias@aerifal.cx> wrote:
> At least one line in yout table is wrong: ftell can never change the
> offset when the handle is inactive. You have it changing the offset in
> this case for append-mode files. I don't see why you want to do this.

Thanks for pointing that out.  I was supposed to add a note there
clarifying that it's fopen/fdopen that does the offset change there
and I forgot to do that.  I've updated the document now to hopefully
be a bit clearer.

Siddhesh
  
Rich Felker March 13, 2014, 1:35 a.m. UTC | #4
On Tue, Mar 11, 2014 at 06:16:16PM +0530, Siddhesh Poyarekar wrote:
> The Change:
> 
> I have now brought back the behavior of seeking to end of file for
> append-only streams, but with a slight difference.  I don't cache the
> offset though, since we would want ftell to query the current file
> position through lseek while the stream is not active.  Since the
> offset is moved to the end of file, we can rely on the file position
> reported by lseek and we don't need to resort to the stat() nonsense.

Sounds good. While I haven't done a real code review of your patch
(I'm honestly not familiar with glibc's stdio internals and just
looking at them gives me a headache) your description of the behavior
you're implemented sounds correct to me now.

> Finally, the cache is always reliable, except when there are unflished
> writes in an append mode stream (i.e. both a and a+).  In the latter
> case, it is safe to just do an lseek to SEEK_END.  The value can be
> safely cached too, since the file handle is already active at this
> point.  Incidentally, this is the only state change we affect in the
> file handle (apart from taking locks of course).

Caching is valid for append mode with unflushed writes too. The first
time you get in this situation (transitioning from being an inactive
handle or from read mode) you don't have a valid cached offset to use
and you have to lseek to SEEK_END, but after you have a valid cached
offset, you can use it (being the active handle means that it's UB for
anything else to extend the file and change where the next write would
occur).

Of course one consideration is whether real-world programs actually
obey the rules for active handles. I suspect there might be
(especially log-file-related) cases where legacy software expects
append-mode to work with multiple writers all behaving as if they're
the active handle (this has some race conditions that cause
interleaving but "mostly works" historically) so it may be best not to
try to optimize this case heavily anyway.

Rich
  
Carlos O'Donell March 17, 2014, 8:17 a.m. UTC | #5
On 03/11/2014 08:46 AM, Siddhesh Poyarekar wrote:
> Here's take 2:

Nobody has touched the core parts of this code in years, and I appreciate
that you have taken the initiative to do so. There is no gain though
without some pain, and we see it here as we improve the implementation.
Thanks for working through this.
 
> The ftell implementation was made conservative to ensure that
> incorrectly cached offsets never affect it.  However, this causes
> problems for append mode when a file stream is rewound.  Additionally,
> the 'clever' trick of using stat to get position for append mode files
> caused more problems than it solved and broke some old behavior.  I
> have described below the various problems that it caused and then
> finally the solution.
> 
> For a and a+ mode files, rewinding the stream should result in ftell
> returning 0 as the offset, but the stat() trick caused it to
> (incorrectly) always return the end of file.  Now I couldn't find
> anything in POSIX that specifies the stream position after rewind()
> for a file opened in 'a' mode, but for 'a+' mode it should be set to
> 0.  For 'a' mode too, it probably makes sense to keep it set to 0 in
> the interest of retaining old behavior.

Right, so I was thinking along the same lines. If you didn't switch
from write to read then there is no reason why the offset can't
be the end of the file after rewind. However, that subtle change
is a change in our implementation and I agree that it would be best
if we retained the old behaviour.

Does a `write; rewind; read' work now with the patch? Was the current
patch causing this to fail also or did the `read' start reading at 
offset 0?
 
> The initial file position for append mode files is implementation
> defined, so the implementation could either retain the current file
> position or move the position to the end of file.  The earlier ftell
> implementation would move the offset to end of file for append-only
> mode, but retain the old offset for a+ mode.  It would also cache the
> offset (this detail is important).  My patch broke this and would set
> the initial position to end of file for both append modes, thus
> breaking old behavior.  I was ignorant enough to write an incorrect
> test case for it too.
> 
> The Change:
> 
> I have now brought back the behavior of seeking to end of file for
> append-only streams, but with a slight difference.  I don't cache the
> offset though, since we would want ftell to query the current file
> position through lseek while the stream is not active.  Since the
> offset is moved to the end of file, we can rely on the file position
> reported by lseek and we don't need to resort to the stat() nonsense.

OK, but I wouldn't call it nonsense, you either stat() or lseek() and
both cost a syscall. Your patch inlines the lseek from get_file_offset
and removes the stat. That's fine.

Does this reinstitution of lseek cause any problems for the use of ftell
on inactive streams? For example is it really correct to have _IO_file_open
seek to the end of the fully flushed file in append mode? What about
other users of the fd that might expect the underlying offset to remain
the same?

> Finally, the cache is always reliable, except when there are unflished
> writes in an append mode stream (i.e. both a and a+).  In the latter
> case, it is safe to just do an lseek to SEEK_END.  The value can be
> safely cached too, since the file handle is already active at this
> point.  Incidentally, this is the only state change we affect in the
> file handle (apart from taking locks of course).

When you say unflushed writes, I assume those are writes in the
stream buffers that have not been passed to the kernel write syscall,
as opposed to writes that the kernel itself has not flushed to disk
from its own buffers.

I don't understand this part. The cached offset *could* be reliable
but you would have to take into account the unflushed writes, but we
don't so it isn't? Why else is it unreliable?
 
> I have also updated the test case to correct my impression of the
> initial file position for a+ streams to the initial behavior.  I have
> verified that this does not break any existing tests in the testsuite
> and also passes with the new tests.

Thanks.

> Siddhesh
> 
> 	[BZ #16680]
> 	* libio/fileops.c (_IO_file_open): Seek to end of file but
> 	don't cache the offset.
> 	(get_file_offset): Remove function.
> 	(do_ftell): Use cached offset when available.
> 	* libio/iofdopen.c (_IO_new_fdopen): Seek to end of file but
> 	don't cache the offset.
> 	* libio/tst-ftell-active-handler.c (do_rewind_test): New test
> 	case.
> 	(do_one_test): Call it.
> 	(do_ftell_test): Fix up expected old offset for a+ mode.
> 	* libio/wfileops.c (do_ftell_wide): Used cached offset when
> 	available.
>

OK to checkin after fixing some comment nits and answering my questions.

> ---
>  libio/fileops.c                  | 102 +++++++++++++----------------------
>  libio/iofdopen.c                 |   9 ++++
>  libio/tst-ftell-active-handler.c | 112 +++++++++++++++++++++++++++++++++++++--
>  libio/wfileops.c                 |  47 ++++++++--------
>  4 files changed, 178 insertions(+), 92 deletions(-)
> 
> diff --git a/libio/fileops.c b/libio/fileops.c
> index 2e7bc8d..cf68dbf 100644
> --- a/libio/fileops.c
> +++ b/libio/fileops.c
> @@ -232,13 +232,18 @@ _IO_file_open (fp, filename, posix_mode, prot, read_write, is32not64)
>      return NULL;
>    fp->_fileno = fdesc;
>    _IO_mask_flags (fp, read_write,_IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
> -  if ((read_write & _IO_IS_APPENDING) && (read_write & _IO_NO_READS))
> -    if (_IO_SEEKOFF (fp, (_IO_off64_t)0, _IO_seek_end, _IOS_INPUT|_IOS_OUTPUT)
> -	== _IO_pos_BAD && errno != ESPIPE)
> -      {
> -	close_not_cancel (fdesc);
> -	return NULL;
> -      }
> +  /* For append mode, send the file offset to the end of the file.  Don't
> +     update the offset cache though, since the file handle is not active.  */

OK, by using _IO_SYSSEEK directly you bypass any cache update.

> +  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
> +      == (_IO_IS_APPENDING | _IO_NO_READS))
> +    {
> +      _IO_off64_t new_pos = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
> +	{
> +	  close_not_cancel (fdesc);
> +	  return NULL;
> +	}
> +    }

OK, though this is not the minimal change here. You rewrote
the conditionals to be more compact, but AFAICT they didn't
change their meaning and no boolean coercion. So OK.

>    _IO_link_in ((struct _IO_FILE_plus *) fp);
>    return fp;
>  }
> @@ -929,43 +934,13 @@ _IO_file_sync_mmap (_IO_FILE *fp)
>    return 0;
>  }
>  
> -/* Get the current file offset using a system call.  This is the safest method
> -   to get the current file offset, since we are sure that we get the current
> -   state of the file.  Before the stream handle is activated (by using fread,
> -   fwrite, etc.), an application may alter the state of the file descriptor
> -   underlying it by calling read/write/lseek on it.  Using a cached offset at
> -   this point will result in returning the incorrect value.  Same is the case
> -   when one switches from reading in a+ mode to writing, where the buffer has
> -   not been flushed - the cached offset would reflect the reading position
> -   while the actual write position would be at the end of the file.
> -
> -   do_ftell and do_ftell_wide may resort to using the cached offset in some
> -   special cases instead of calling get_file_offset, but those cases should be
> -   thoroughly described.  */
> -_IO_off64_t
> -get_file_offset (_IO_FILE *fp)
> -{
> -  if ((fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING)
> -    {
> -      struct stat64 st;
> -      bool ret = (_IO_SYSSTAT (fp, &st) == 0 && S_ISREG (st.st_mode));
> -      if (ret)
> -	return st.st_size;
> -      else
> -	return EOF;
> -    }
> -  else
> -    return _IO_SYSSEEK (fp, 0, _IO_seek_cur);
> -}

OK, removing get_file_offset because we've just inlined the
_IO_SYSSEEK directly into the callers but only when we know
no writing has happened.

> -
> -
> -/* ftell{,o} implementation.  Don't modify any state of the file pointer while
> -   we try to get the current state of the stream.  */
> +/* ftell{,o} implementation.  The only time we modify the state of the stream
> +   is when we have unflushed writes.  In that case we seek to the end and
> +   record that offset in the stream object.  */
>  static _IO_off64_t
>  do_ftell (_IO_FILE *fp)
>  {
> -  _IO_off64_t result = 0;
> -  bool use_cached_offset = false;
> +  _IO_off64_t result, offset = 0;

OK.

>  
>    /* No point looking at unflushed data if we haven't allocated buffers
>       yet.  */
> @@ -974,39 +949,37 @@ do_ftell (_IO_FILE *fp)
>        bool was_writing = (fp->_IO_write_ptr > fp->_IO_write_base
>  			  || _IO_in_put_mode (fp));
>  
> +      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
> +
> +      /* When we have unflushed writes in append mode, seek to the end of the
> +	 file and record that offset.  This is the only time we change the file
> +	 stream state and it is safe since the file handle is active.  */
> +      if (was_writing && append_mode)
> +	{
> +	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +	  if (result == _IO_pos_BAD)
> +	    return EOF;
> +	  else
> +	    fp->_offset = result;
> +	}

OK.

> +
>        /* Adjust for unflushed data.  */
>        if (!was_writing)
> -	result -= fp->_IO_read_end - fp->_IO_read_ptr;
> +	offset -= fp->_IO_read_end - fp->_IO_read_ptr;

OK.

>        else
> -	result += fp->_IO_write_ptr - fp->_IO_read_end;
> -
> -      /* It is safe to use the cached offset when available if there is
> -	 unbuffered data (indicating that the file handle is active) and the
> -	 handle is not for a file open in a+ mode.  The latter condition is
> -	 because there could be a scenario where there is a switch from read
> -	 mode to write mode using an fseek to an arbitrary position.  In this
> -	 case, there would be unbuffered data due to be appended to the end of
> -	 the file, but the offset may not necessarily be the end of the
> -	 file.  It is fine to use the cached offset when the a+ stream is in
> -	 read mode though, since the offset is maintained correctly in that
> -	 case.  Note that this is not a comprehensive set of cases when the
> -	 offset is reliable.  The offset may be reliable even in some cases
> -	 where there is no unflushed input and the handle is active, but it's
> -	 just that we don't have a way to identify that condition reliably.  */
> -      use_cached_offset = (result != 0 && fp->_offset != _IO_pos_BAD
> -			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
> -			       == (_IO_IS_APPENDING | _IO_NO_READS)
> -			       && was_writing));
> +	offset += fp->_IO_write_ptr - fp->_IO_read_end;

OK. It always seems kind of odd that simulating the buffering effects
using the various pointers and their outcome on the offset is repeated
among all the functions. What would be nicer, and I'm just talking out loud
is a "account for buffering" function that did all of this simulation itself.

>      }
>  
> -  if (use_cached_offset)
> -    result += fp->_offset;
> +  if (fp->_offset != _IO_pos_BAD)
> +    result = fp->_offset;
>    else
> -    result += get_file_offset (fp);
> +    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);

OK.

>  
>    if (result == EOF)
>      return result;
>  
> +  result += offset;
> +

OK.

>    if (result < 0)
>      {
>        __set_errno (EINVAL);
> @@ -1016,7 +989,6 @@ do_ftell (_IO_FILE *fp)
>    return result;
>  }
>  
> -
>  _IO_off64_t
>  _IO_new_file_seekoff (fp, offset, dir, mode)
>       _IO_FILE *fp;
> diff --git a/libio/iofdopen.c b/libio/iofdopen.c
> index 3f266f7..364e175 100644
> --- a/libio/iofdopen.c
> +++ b/libio/iofdopen.c
> @@ -167,6 +167,15 @@ _IO_new_fdopen (fd, mode)
>    _IO_mask_flags (&new_f->fp.file, read_write,
>  		  _IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
>  
> +  /* For append mode, send the file offset to the end of the file.  Don't

s/send/set/g

> +     update the offset cache though, since the file handle is not active.  */
> +  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
> +      == (_IO_IS_APPENDING | _IO_NO_READS))
> +    {
> +      _IO_off64_t new_pos = _IO_SYSSEEK (&new_f->fp.file, 0, _IO_seek_end);
> +      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
> +	return NULL;
> +    }

OK.

>    return &new_f->fp.file;
>  }
>  libc_hidden_ver (_IO_new_fdopen, _IO_fdopen)
> diff --git a/libio/tst-ftell-active-handler.c b/libio/tst-ftell-active-handler.c
> index 5d5fc26..aef9cc0 100644
> --- a/libio/tst-ftell-active-handler.c
> +++ b/libio/tst-ftell-active-handler.c
> @@ -88,6 +88,107 @@ static size_t file_len;
>  typedef int (*fputs_func_t) (const void *data, FILE *fp);
>  fputs_func_t fputs_func;
>  
> +/* Test that ftell output after a rewind is correct.  */
> +static int
> +do_rewind_test (const char *filename)
> +{
> +  int ret = 0;
> +  struct test
> +    {
> +      const char *mode;
> +      int fd_mode;
> +      size_t old_off;
> +      size_t new_off;
> +    } test_modes[] = {
> +	  {"w", O_WRONLY, 0, data_len},
> +	  {"w+", O_RDWR, 0, data_len},
> +	  {"r+", O_RDWR, 0, data_len},
> +	  /* The new offsets for 'a' and 'a+' modes have to factor in the
> +	     previous writes since they compulsorily append to the end of the

s/compulsorily/always/g

> +	     file.  */
> +	  {"a", O_WRONLY, 0, 3 * data_len},
> +	  {"a+", O_RDWR, 0, 4 * data_len},
> +    };
> +
> +  /* Empty the file before the test so that our offsets are simple to
> +     calculate.  */
> +  FILE *fp = fopen (filename, "w");
> +  if (fp == NULL)
> +    {
> +      printf ("Failed to open file for emptying\n");
> +      return 1;
> +    }
> +  fclose (fp);
> +
> +  for (int j = 0; j < 2; j++)
> +    {
> +      for (int i = 0; i < sizeof (test_modes) / sizeof (struct test); i++)
> +	{
> +	  FILE *fp;
> +	  int fd;
> +	  int fileret;
> +
> +	  printf ("\trewind: %s (file, \"%s\"): ", j == 0 ? "fdopen" : "fopen",
> +		  test_modes[i].mode);
> +
> +	  if (j == 0)
> +	    fileret = get_handles_fdopen (filename, fd, fp,
> +					  test_modes[i].fd_mode,
> +					  test_modes[i].mode);
> +	  else
> +	    fileret = get_handles_fopen (filename, fd, fp, test_modes[i].mode);
> +
> +	  if (fileret != 0)
> +	    return fileret;
> +
> +	  /* Write some content to the file, rewind and ensure that the ftell
> +	     output after the rewind is 0.  POSIX does not specify what the
> +	     behavior is when a file is rewound in 'a' mode, so we retain
> +	     current behavior, which is to keep the 0 offset.  */
> +	  size_t written = fputs_func (data, fp);
> +
> +	  if (written == EOF)
> +	    {
> +	      printf ("fputs[1] failed to write data\n");
> +	      ret |= 1;
> +	    }
> +
> +	  rewind (fp);
> +	  long offset = ftell (fp);
> +
> +	  if (offset != test_modes[i].old_off)
> +	    {
> +	      printf ("Incorrect old offset.  Expected %zu, but got %ld, ",
> +		      test_modes[i].old_off, offset);
> +	      ret |= 1;
> +	    }
> +	  else
> +	    printf ("old offset = %ld, ", offset);
> +
> +	  written = fputs_func (data, fp);
> +
> +	  if (written == EOF)
> +	    {
> +	      printf ("fputs[1] failed to write data\n");
> +	      ret |= 1;
> +	    }
> +
> +	  /* After this write, the offset in append modes should factor in the
> +	     implicit lseek to the end of file.  */
> +	  offset = ftell (fp);
> +	  if (offset != test_modes[i].new_off)
> +	    {
> +	      printf ("Incorrect new offset.  Expected %zu, but got %ld\n",
> +		      test_modes[i].new_off, offset);
> +	      ret |= 1;
> +	    }
> +	  else
> +	    printf ("new offset = %ld\n", offset);
> +	}
> +    }
> +  return ret;
> +}
> +
>  /* Test that the value of ftell is not cached when the stream handle is not
>     active.  */
>  static int
> @@ -107,11 +208,13 @@ do_ftell_test (const char *filename)
>  	  {"w", O_WRONLY, 0, data_len},
>  	  {"w+", O_RDWR, 0, data_len},
>  	  {"r+", O_RDWR, 0, data_len},
> -	  /* For 'a' and 'a+' modes, the initial file position should be the
> +	  /* For the 'a' mode, the initial file position should be the
>  	     current end of file. After the write, the offset has data_len
> -	     added to the old value.  */
> +	     added to the old value.  For a+ mode however, the initial file
> +	     position is the file position of the underlying file descriptor,
> +	     since it is initially assumed to be in read mode.  */
>  	  {"a", O_WRONLY, data_len, 2 * data_len},
> -	  {"a+", O_RDWR, 2 * data_len, 3 * data_len},
> +	  {"a+", O_RDWR, 0, 3 * data_len},

OK.

>      };
>    for (int j = 0; j < 2; j++)
>      {
> @@ -157,7 +260,7 @@ do_ftell_test (const char *filename)
>  	  if (off != test_modes[i].new_off)
>  	    {
>  	      printf ("Incorrect new offset.  Expected %zu but got %ld\n",
> -		      test_modes[i].old_off, off);
> +		      test_modes[i].new_off, off);
>  	      ret |= 1;
>  	    }
>  	  else
> @@ -322,6 +425,7 @@ do_one_test (const char *filename)
>    ret |= do_ftell_test (filename);
>    ret |= do_write_test (filename);
>    ret |= do_append_test (filename);
> +  ret |= do_rewind_test (filename);
>  
>    return ret;
>  }
> diff --git a/libio/wfileops.c b/libio/wfileops.c
> index 8b2e108..3199861 100644
> --- a/libio/wfileops.c
> +++ b/libio/wfileops.c
> @@ -597,12 +597,12 @@ done:
>  }
>  
>  /* ftell{,o} implementation for wide mode.  Don't modify any state of the file
> -   pointer while we try to get the current state of the stream.  */
> +   pointer while we try to get the current state of the stream except in one
> +   case, which is when we have unflushed writes in append mode.  */
>  static _IO_off64_t
>  do_ftell_wide (_IO_FILE *fp)
>  {
>    _IO_off64_t result, offset = 0;
> -  bool use_cached_offset = false;
>  
>    /* No point looking for offsets in the buffer if it hasn't even been
>       allocated.  */
> @@ -615,6 +615,20 @@ do_ftell_wide (_IO_FILE *fp)
>  			   > fp->_wide_data->_IO_write_base)
>  			  || _IO_in_put_mode (fp));
>  
> +      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
> +
> +      /* When we have unflushed writes in append mode, seek to the end of the
> +	 file and record that offset.  This is the only time we change the file
> +	 stream state and it is safe since the file handle is active.  */
> +      if (was_writing && append_mode)
> +	{
> +	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
> +	  if (result == _IO_pos_BAD)
> +	    return EOF;
> +	  else
> +	    fp->_offset = result;
> +	}
> +

OK.

>        /* XXX For wide stream with backup store it is not very
>  	 reasonable to determine the offset.  The pushed-back
>  	 character might require a state change and we need not be
> @@ -703,37 +717,24 @@ do_ftell_wide (_IO_FILE *fp)
>  	     position is fp._offset - (_IO_read_end - new_write_ptr).  */
>  	  offset -= fp->_IO_read_end - fp->_IO_write_ptr;
>  	}
> -
> -      /* It is safe to use the cached offset when available if there is
> -	 unbuffered data (indicating that the file handle is active) and
> -	 the handle is not for a file open in a+ mode.  The latter
> -	 condition is because there could be a scenario where there is a
> -	 switch from read mode to write mode using an fseek to an arbitrary
> -	 position.  In this case, there would be unbuffered data due to be
> -	 appended to the end of the file, but the offset may not
> -	 necessarily be the end of the file.  It is fine to use the cached
> -	 offset when the a+ stream is in read mode though, since the offset
> -	 is maintained correctly in that case.  Note that this is not a
> -	 comprehensive set of cases when the offset is reliable.  The
> -	 offset may be reliable even in some cases where there is no
> -	 unflushed input and the handle is active, but it's just that we
> -	 don't have a way to identify that condition reliably.  */
> -      use_cached_offset = (offset != 0 && fp->_offset != _IO_pos_BAD
> -			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
> -			       == (_IO_IS_APPENDING | _IO_NO_READS)
> -			       && was_writing));
>      }
>  
> -  if (use_cached_offset)
> +  if (fp->_offset != _IO_pos_BAD)
>      result = fp->_offset;
>    else
> -    result = get_file_offset (fp);
> +    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);

OK.

>  
>    if (result == EOF)
>      return result;
>  
>    result += offset;
>  
> +  if (result < 0)
> +    {
> +      __set_errno (EINVAL);
> +      return EOF;
> +    }

OK.

> +
>    return result;
>  }
>  
> 

Cheers,
Carlos.
  
Rich Felker March 17, 2014, 10:32 a.m. UTC | #6
On Mon, Mar 17, 2014 at 04:17:46AM -0400, Carlos O'Donell wrote:
> > I have now brought back the behavior of seeking to end of file for
> > append-only streams, but with a slight difference.  I don't cache the
> > offset though, since we would want ftell to query the current file
> > position through lseek while the stream is not active.  Since the
> > offset is moved to the end of file, we can rely on the file position
> > reported by lseek and we don't need to resort to the stat() nonsense.
> 
> OK, but I wouldn't call it nonsense, you either stat() or lseek() and
> both cost a syscall. Your patch inlines the lseek from get_file_offset
> and removes the stat. That's fine.

It is nonsense. For example stat does the wrong thing on device nodes.

> Does this reinstitution of lseek cause any problems for the use of ftell
> on inactive streams? For example is it really correct to have _IO_file_open
> seek to the end of the fully flushed file in append mode? What about
> other users of the fd that might expect the underlying offset to remain
> the same?

It's perfectly correct for fopen to perform the seek. For fdopen I'm
not so clear; I think it's wrong for fdopen to change the position
except in the case where O_APPEND wasn't set and fdopen adds it (in
this case the implementation can do whatever it wants, no?).

> > Finally, the cache is always reliable, except when there are unflished
> > writes in an append mode stream (i.e. both a and a+).  In the latter
> > case, it is safe to just do an lseek to SEEK_END.  The value can be
> > safely cached too, since the file handle is already active at this
> > point.  Incidentally, this is the only state change we affect in the
> > file handle (apart from taking locks of course).
> 
> When you say unflushed writes, I assume those are writes in the
> stream buffers that have not been passed to the kernel write syscall,
> as opposed to writes that the kernel itself has not flushed to disk
> from its own buffers.

Of course. From an application perspective kernel buffers don't exist.
They are not supposed to be observable in any way except for
performance.

> I don't understand this part. The cached offset *could* be reliable
> but you would have to take into account the unflushed writes, but we
> don't so it isn't? Why else is it unreliable?

It's not necessarily unreliable in all cases, but in two important
ones:

1. Transitioning from reading or seeking to writing, and the current
offset on the underlying fd is not at the end of the file.

2. Transitioning from being an inactive handle to being the active
handle, and the current offset on the underlying fd is not at the end
of the file.

In both cases, formally, the first write moved the position to the end
of the file (note: O_APPEND does this for you), but this isn't seen
yet because there hasn't been an actual write to the fd; the data is
stored in the FILE buffer. So you need to determine what position the
buffered data is going to get written to. lseek(fd,0,SEEK_END) does
this. And you don't have to worry that the result will be stale by the
time the underlying write happens, since the active handle rules
forbid any operations on the fd that would make it stale.

Rich
  

Patch

diff --git a/libio/fileops.c b/libio/fileops.c
index 2e7bc8d..cf68dbf 100644
--- a/libio/fileops.c
+++ b/libio/fileops.c
@@ -232,13 +232,18 @@  _IO_file_open (fp, filename, posix_mode, prot, read_write, is32not64)
     return NULL;
   fp->_fileno = fdesc;
   _IO_mask_flags (fp, read_write,_IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
-  if ((read_write & _IO_IS_APPENDING) && (read_write & _IO_NO_READS))
-    if (_IO_SEEKOFF (fp, (_IO_off64_t)0, _IO_seek_end, _IOS_INPUT|_IOS_OUTPUT)
-	== _IO_pos_BAD && errno != ESPIPE)
-      {
-	close_not_cancel (fdesc);
-	return NULL;
-      }
+  /* For append mode, send the file offset to the end of the file.  Don't
+     update the offset cache though, since the file handle is not active.  */
+  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
+      == (_IO_IS_APPENDING | _IO_NO_READS))
+    {
+      _IO_off64_t new_pos = _IO_SYSSEEK (fp, 0, _IO_seek_end);
+      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
+	{
+	  close_not_cancel (fdesc);
+	  return NULL;
+	}
+    }
   _IO_link_in ((struct _IO_FILE_plus *) fp);
   return fp;
 }
@@ -929,43 +934,13 @@  _IO_file_sync_mmap (_IO_FILE *fp)
   return 0;
 }
 
-/* Get the current file offset using a system call.  This is the safest method
-   to get the current file offset, since we are sure that we get the current
-   state of the file.  Before the stream handle is activated (by using fread,
-   fwrite, etc.), an application may alter the state of the file descriptor
-   underlying it by calling read/write/lseek on it.  Using a cached offset at
-   this point will result in returning the incorrect value.  Same is the case
-   when one switches from reading in a+ mode to writing, where the buffer has
-   not been flushed - the cached offset would reflect the reading position
-   while the actual write position would be at the end of the file.
-
-   do_ftell and do_ftell_wide may resort to using the cached offset in some
-   special cases instead of calling get_file_offset, but those cases should be
-   thoroughly described.  */
-_IO_off64_t
-get_file_offset (_IO_FILE *fp)
-{
-  if ((fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING)
-    {
-      struct stat64 st;
-      bool ret = (_IO_SYSSTAT (fp, &st) == 0 && S_ISREG (st.st_mode));
-      if (ret)
-	return st.st_size;
-      else
-	return EOF;
-    }
-  else
-    return _IO_SYSSEEK (fp, 0, _IO_seek_cur);
-}
-
-
-/* ftell{,o} implementation.  Don't modify any state of the file pointer while
-   we try to get the current state of the stream.  */
+/* ftell{,o} implementation.  The only time we modify the state of the stream
+   is when we have unflushed writes.  In that case we seek to the end and
+   record that offset in the stream object.  */
 static _IO_off64_t
 do_ftell (_IO_FILE *fp)
 {
-  _IO_off64_t result = 0;
-  bool use_cached_offset = false;
+  _IO_off64_t result, offset = 0;
 
   /* No point looking at unflushed data if we haven't allocated buffers
      yet.  */
@@ -974,39 +949,37 @@  do_ftell (_IO_FILE *fp)
       bool was_writing = (fp->_IO_write_ptr > fp->_IO_write_base
 			  || _IO_in_put_mode (fp));
 
+      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
+
+      /* When we have unflushed writes in append mode, seek to the end of the
+	 file and record that offset.  This is the only time we change the file
+	 stream state and it is safe since the file handle is active.  */
+      if (was_writing && append_mode)
+	{
+	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
+	  if (result == _IO_pos_BAD)
+	    return EOF;
+	  else
+	    fp->_offset = result;
+	}
+
       /* Adjust for unflushed data.  */
       if (!was_writing)
-	result -= fp->_IO_read_end - fp->_IO_read_ptr;
+	offset -= fp->_IO_read_end - fp->_IO_read_ptr;
       else
-	result += fp->_IO_write_ptr - fp->_IO_read_end;
-
-      /* It is safe to use the cached offset when available if there is
-	 unbuffered data (indicating that the file handle is active) and the
-	 handle is not for a file open in a+ mode.  The latter condition is
-	 because there could be a scenario where there is a switch from read
-	 mode to write mode using an fseek to an arbitrary position.  In this
-	 case, there would be unbuffered data due to be appended to the end of
-	 the file, but the offset may not necessarily be the end of the
-	 file.  It is fine to use the cached offset when the a+ stream is in
-	 read mode though, since the offset is maintained correctly in that
-	 case.  Note that this is not a comprehensive set of cases when the
-	 offset is reliable.  The offset may be reliable even in some cases
-	 where there is no unflushed input and the handle is active, but it's
-	 just that we don't have a way to identify that condition reliably.  */
-      use_cached_offset = (result != 0 && fp->_offset != _IO_pos_BAD
-			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
-			       == (_IO_IS_APPENDING | _IO_NO_READS)
-			       && was_writing));
+	offset += fp->_IO_write_ptr - fp->_IO_read_end;
     }
 
-  if (use_cached_offset)
-    result += fp->_offset;
+  if (fp->_offset != _IO_pos_BAD)
+    result = fp->_offset;
   else
-    result += get_file_offset (fp);
+    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);
 
   if (result == EOF)
     return result;
 
+  result += offset;
+
   if (result < 0)
     {
       __set_errno (EINVAL);
@@ -1016,7 +989,6 @@  do_ftell (_IO_FILE *fp)
   return result;
 }
 
-
 _IO_off64_t
 _IO_new_file_seekoff (fp, offset, dir, mode)
      _IO_FILE *fp;
diff --git a/libio/iofdopen.c b/libio/iofdopen.c
index 3f266f7..364e175 100644
--- a/libio/iofdopen.c
+++ b/libio/iofdopen.c
@@ -167,6 +167,15 @@  _IO_new_fdopen (fd, mode)
   _IO_mask_flags (&new_f->fp.file, read_write,
 		  _IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING);
 
+  /* For append mode, send the file offset to the end of the file.  Don't
+     update the offset cache though, since the file handle is not active.  */
+  if ((read_write & (_IO_IS_APPENDING | _IO_NO_READS))
+      == (_IO_IS_APPENDING | _IO_NO_READS))
+    {
+      _IO_off64_t new_pos = _IO_SYSSEEK (&new_f->fp.file, 0, _IO_seek_end);
+      if (new_pos == _IO_pos_BAD && errno != ESPIPE)
+	return NULL;
+    }
   return &new_f->fp.file;
 }
 libc_hidden_ver (_IO_new_fdopen, _IO_fdopen)
diff --git a/libio/tst-ftell-active-handler.c b/libio/tst-ftell-active-handler.c
index 5d5fc26..aef9cc0 100644
--- a/libio/tst-ftell-active-handler.c
+++ b/libio/tst-ftell-active-handler.c
@@ -88,6 +88,107 @@  static size_t file_len;
 typedef int (*fputs_func_t) (const void *data, FILE *fp);
 fputs_func_t fputs_func;
 
+/* Test that ftell output after a rewind is correct.  */
+static int
+do_rewind_test (const char *filename)
+{
+  int ret = 0;
+  struct test
+    {
+      const char *mode;
+      int fd_mode;
+      size_t old_off;
+      size_t new_off;
+    } test_modes[] = {
+	  {"w", O_WRONLY, 0, data_len},
+	  {"w+", O_RDWR, 0, data_len},
+	  {"r+", O_RDWR, 0, data_len},
+	  /* The new offsets for 'a' and 'a+' modes have to factor in the
+	     previous writes since they compulsorily append to the end of the
+	     file.  */
+	  {"a", O_WRONLY, 0, 3 * data_len},
+	  {"a+", O_RDWR, 0, 4 * data_len},
+    };
+
+  /* Empty the file before the test so that our offsets are simple to
+     calculate.  */
+  FILE *fp = fopen (filename, "w");
+  if (fp == NULL)
+    {
+      printf ("Failed to open file for emptying\n");
+      return 1;
+    }
+  fclose (fp);
+
+  for (int j = 0; j < 2; j++)
+    {
+      for (int i = 0; i < sizeof (test_modes) / sizeof (struct test); i++)
+	{
+	  FILE *fp;
+	  int fd;
+	  int fileret;
+
+	  printf ("\trewind: %s (file, \"%s\"): ", j == 0 ? "fdopen" : "fopen",
+		  test_modes[i].mode);
+
+	  if (j == 0)
+	    fileret = get_handles_fdopen (filename, fd, fp,
+					  test_modes[i].fd_mode,
+					  test_modes[i].mode);
+	  else
+	    fileret = get_handles_fopen (filename, fd, fp, test_modes[i].mode);
+
+	  if (fileret != 0)
+	    return fileret;
+
+	  /* Write some content to the file, rewind and ensure that the ftell
+	     output after the rewind is 0.  POSIX does not specify what the
+	     behavior is when a file is rewound in 'a' mode, so we retain
+	     current behavior, which is to keep the 0 offset.  */
+	  size_t written = fputs_func (data, fp);
+
+	  if (written == EOF)
+	    {
+	      printf ("fputs[1] failed to write data\n");
+	      ret |= 1;
+	    }
+
+	  rewind (fp);
+	  long offset = ftell (fp);
+
+	  if (offset != test_modes[i].old_off)
+	    {
+	      printf ("Incorrect old offset.  Expected %zu, but got %ld, ",
+		      test_modes[i].old_off, offset);
+	      ret |= 1;
+	    }
+	  else
+	    printf ("old offset = %ld, ", offset);
+
+	  written = fputs_func (data, fp);
+
+	  if (written == EOF)
+	    {
+	      printf ("fputs[1] failed to write data\n");
+	      ret |= 1;
+	    }
+
+	  /* After this write, the offset in append modes should factor in the
+	     implicit lseek to the end of file.  */
+	  offset = ftell (fp);
+	  if (offset != test_modes[i].new_off)
+	    {
+	      printf ("Incorrect new offset.  Expected %zu, but got %ld\n",
+		      test_modes[i].new_off, offset);
+	      ret |= 1;
+	    }
+	  else
+	    printf ("new offset = %ld\n", offset);
+	}
+    }
+  return ret;
+}
+
 /* Test that the value of ftell is not cached when the stream handle is not
    active.  */
 static int
@@ -107,11 +208,13 @@  do_ftell_test (const char *filename)
 	  {"w", O_WRONLY, 0, data_len},
 	  {"w+", O_RDWR, 0, data_len},
 	  {"r+", O_RDWR, 0, data_len},
-	  /* For 'a' and 'a+' modes, the initial file position should be the
+	  /* For the 'a' mode, the initial file position should be the
 	     current end of file. After the write, the offset has data_len
-	     added to the old value.  */
+	     added to the old value.  For a+ mode however, the initial file
+	     position is the file position of the underlying file descriptor,
+	     since it is initially assumed to be in read mode.  */
 	  {"a", O_WRONLY, data_len, 2 * data_len},
-	  {"a+", O_RDWR, 2 * data_len, 3 * data_len},
+	  {"a+", O_RDWR, 0, 3 * data_len},
     };
   for (int j = 0; j < 2; j++)
     {
@@ -157,7 +260,7 @@  do_ftell_test (const char *filename)
 	  if (off != test_modes[i].new_off)
 	    {
 	      printf ("Incorrect new offset.  Expected %zu but got %ld\n",
-		      test_modes[i].old_off, off);
+		      test_modes[i].new_off, off);
 	      ret |= 1;
 	    }
 	  else
@@ -322,6 +425,7 @@  do_one_test (const char *filename)
   ret |= do_ftell_test (filename);
   ret |= do_write_test (filename);
   ret |= do_append_test (filename);
+  ret |= do_rewind_test (filename);
 
   return ret;
 }
diff --git a/libio/wfileops.c b/libio/wfileops.c
index 8b2e108..3199861 100644
--- a/libio/wfileops.c
+++ b/libio/wfileops.c
@@ -597,12 +597,12 @@  done:
 }
 
 /* ftell{,o} implementation for wide mode.  Don't modify any state of the file
-   pointer while we try to get the current state of the stream.  */
+   pointer while we try to get the current state of the stream except in one
+   case, which is when we have unflushed writes in append mode.  */
 static _IO_off64_t
 do_ftell_wide (_IO_FILE *fp)
 {
   _IO_off64_t result, offset = 0;
-  bool use_cached_offset = false;
 
   /* No point looking for offsets in the buffer if it hasn't even been
      allocated.  */
@@ -615,6 +615,20 @@  do_ftell_wide (_IO_FILE *fp)
 			   > fp->_wide_data->_IO_write_base)
 			  || _IO_in_put_mode (fp));
 
+      bool append_mode = (fp->_flags & _IO_IS_APPENDING) == _IO_IS_APPENDING;
+
+      /* When we have unflushed writes in append mode, seek to the end of the
+	 file and record that offset.  This is the only time we change the file
+	 stream state and it is safe since the file handle is active.  */
+      if (was_writing && append_mode)
+	{
+	  result = _IO_SYSSEEK (fp, 0, _IO_seek_end);
+	  if (result == _IO_pos_BAD)
+	    return EOF;
+	  else
+	    fp->_offset = result;
+	}
+
       /* XXX For wide stream with backup store it is not very
 	 reasonable to determine the offset.  The pushed-back
 	 character might require a state change and we need not be
@@ -703,37 +717,24 @@  do_ftell_wide (_IO_FILE *fp)
 	     position is fp._offset - (_IO_read_end - new_write_ptr).  */
 	  offset -= fp->_IO_read_end - fp->_IO_write_ptr;
 	}
-
-      /* It is safe to use the cached offset when available if there is
-	 unbuffered data (indicating that the file handle is active) and
-	 the handle is not for a file open in a+ mode.  The latter
-	 condition is because there could be a scenario where there is a
-	 switch from read mode to write mode using an fseek to an arbitrary
-	 position.  In this case, there would be unbuffered data due to be
-	 appended to the end of the file, but the offset may not
-	 necessarily be the end of the file.  It is fine to use the cached
-	 offset when the a+ stream is in read mode though, since the offset
-	 is maintained correctly in that case.  Note that this is not a
-	 comprehensive set of cases when the offset is reliable.  The
-	 offset may be reliable even in some cases where there is no
-	 unflushed input and the handle is active, but it's just that we
-	 don't have a way to identify that condition reliably.  */
-      use_cached_offset = (offset != 0 && fp->_offset != _IO_pos_BAD
-			   && ((fp->_flags & (_IO_IS_APPENDING | _IO_NO_READS))
-			       == (_IO_IS_APPENDING | _IO_NO_READS)
-			       && was_writing));
     }
 
-  if (use_cached_offset)
+  if (fp->_offset != _IO_pos_BAD)
     result = fp->_offset;
   else
-    result = get_file_offset (fp);
+    result = _IO_SYSSEEK (fp, 0, _IO_seek_cur);
 
   if (result == EOF)
     return result;
 
   result += offset;
 
+  if (result < 0)
+    {
+      __set_errno (EINVAL);
+      return EOF;
+    }
+
   return result;
 }