From patchwork Mon Apr 13 12:28:36 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rasmus Villemoes
X-Patchwork-Id: 6181
From: Rasmus Villemoes
To: libc-alpha@sourceware.org
Cc: Jeff King, Rasmus Villemoes
Subject: [RFC/PoC 3/4] getdelim: Introduce getdelim_append
Date: Mon, 13 Apr 2015 14:28:36 +0200
Message-Id: <1428928117-8643-4-git-send-email-rv@rasmusvillemoes.dk>
In-Reply-To: <1428928117-8643-1-git-send-email-rv@rasmusvillemoes.dk>
References: <1428928117-8643-1-git-send-email-rv@rasmusvillemoes.dk>

If
getdelim fails (e.g. due to ENOMEM), the contents which have already
been read from the stream and copied to the output buffer are
effectively lost: the bytes exist in the output buffer and *n
faithfully reflects the allocated size of that buffer, but the caller
has no way of knowing how many bytes actually came from the stream, and
how many might be random malloc/realloc junk. This means that there is
no way for the application to free some memory and retry the getdelim
call, or to fall back to some other method (e.g. a slow getc loop).

One way to solve this is to introduce getdelim_append, which has an
extra in/out parameter allowing the caller to indicate the initial
offset in the buffer to start writing at. getdelim_append updates this
parameter every time content is copied to the output buffer.

There are other use cases apart from allowing the application to try
to recover from an error. For example, one can imagine reading a file
format where a set of header lines is delimited by a blank line.
Reading the entire header can then be done without maintaining both a
line buffer and a final buffer, copying from one to the other:

  char *buf = NULL;
  size_t cap = 0, len = 0, old_len;
  ssize_t ret;
  do {
    old_len = len;
    ret = getdelim_append(&buf, &cap, '\n', f, &len);
  } while (ret > 0 && len-old_len > 1);

(this could probably just be ret > 1, but in more complicated
situations one could use old_len with ret and/or len to inspect the
last record read).

Signed-off-by: Rasmus Villemoes
---
 libio/iogetdelim.c | 30 +++++++++++++++++++++++-------
 libio/libioP.h     |  1 +
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/libio/iogetdelim.c b/libio/iogetdelim.c
index eeda0eb..1d20594 100644
--- a/libio/iogetdelim.c
+++ b/libio/iogetdelim.c
@@ -37,14 +37,14 @@
    null terminator), or -1 on error or EOF.  */
 
 _IO_ssize_t
-_IO_getdelim (lineptr, n, delimiter, fp)
+_IO_getdelim_append (lineptr, n, delimiter, fp, cur_len)
      char **lineptr;
      _IO_size_t *n;
      int delimiter;
      _IO_FILE *fp;
+     _IO_size_t *cur_len;
 {
   _IO_ssize_t result = 0;
-  _IO_size_t cur_len = 0;
   _IO_ssize_t len;
 
   if (lineptr == NULL || n == NULL)
@@ -89,7 +89,7 @@ _IO_getdelim (lineptr, n, delimiter, fp)
       t = (char *) memchr ((void *) fp->_IO_read_ptr, delimiter, len);
       if (t != NULL)
         len = (t - fp->_IO_read_ptr) + 1;
-      if (__glibc_unlikely (len >= SIZE_MAX - cur_len) ||
+      if (__glibc_unlikely (len >= SIZE_MAX - *cur_len) ||
           __glibc_unlikely (len >= SSIZE_MAX - result))
         {
           __set_errno (EOVERFLOW);
@@ -97,7 +97,7 @@ _IO_getdelim (lineptr, n, delimiter, fp)
           goto unlock_return;
         }
       /* Make enough space for len+1 (for final NUL) bytes.  */
-      needed = cur_len + len + 1;
+      needed = *cur_len + len + 1;
       if (needed > *n)
         {
           char *new_lineptr;
@@ -113,15 +113,15 @@ _IO_getdelim (lineptr, n, delimiter, fp)
           *lineptr = new_lineptr;
           *n = needed;
         }
-      memcpy (*lineptr + cur_len, (void *) fp->_IO_read_ptr, len);
+      memcpy (*lineptr + *cur_len, (void *) fp->_IO_read_ptr, len);
       fp->_IO_read_ptr += len;
-      cur_len += len;
+      *cur_len += len;
       result += len;
       if (t != NULL || __underflow (fp) == EOF)
         break;
       len = fp->_IO_read_end - fp->_IO_read_ptr;
     }
-  (*lineptr)[cur_len] = '\0';
+  (*lineptr)[*cur_len] = '\0';
 
 unlock_return:
   _IO_release_lock (fp);
@@ -129,6 +129,22 @@ unlock_return:
 }
 
 #ifdef weak_alias
+weak_alias (_IO_getdelim_append, __getdelim_append)
+weak_alias (_IO_getdelim_append, getdelim_append)
+#endif
+
+_IO_ssize_t
+_IO_getdelim (lineptr, n, delimiter, fp)
+     char **lineptr;
+     _IO_size_t *n;
+     int delimiter;
+     _IO_FILE *fp;
+{
+  _IO_size_t offset = 0;
+  return _IO_getdelim_append (lineptr, n, delimiter, fp, &offset);
+}
+
+#ifdef weak_alias
 weak_alias (_IO_getdelim, __getdelim)
 weak_alias (_IO_getdelim, getdelim)
 #endif
diff --git a/libio/libioP.h b/libio/libioP.h
index d8604ca..73f9597 100644
--- a/libio/libioP.h
+++ b/libio/libioP.h
@@ -688,6 +688,7 @@ extern _IO_size_t _IO_getline_info (_IO_FILE *,char *, _IO_size_t,
                                     int, int, int *);
 libc_hidden_proto (_IO_getline_info)
 extern _IO_ssize_t _IO_getdelim (char **, _IO_size_t *, int, _IO_FILE *);
+extern _IO_ssize_t _IO_getdelim_append (char **, _IO_size_t *, int, _IO_FILE *, _IO_size_t *);
 extern _IO_size_t _IO_getwline (_IO_FILE *,wchar_t *, _IO_size_t, wint_t, int);
 extern _IO_size_t _IO_getwline_info (_IO_FILE *,wchar_t *, _IO_size_t,
                                      wint_t, int, wint_t *);
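
For anyone who wants to try the proposed calling convention without
rebuilding libc, here is a rough, self-contained sketch. Nothing in it
is part of the patch: getdelim_append_sketch is a hypothetical stand-in
written as a slow getc loop, and read_header is a hypothetical helper
wrapping the header-reading loop from the commit message. The stand-in
only tries to mirror the contract described above (append at *cur_len,
update *cur_len as bytes are stored, return the number of bytes read by
this call or -1); it does no stream locking and omits the overflow
checks.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical stand-in for the proposed getdelim_append.  Bytes are
   appended at (*lineptr)[*cur_len], and *cur_len is bumped after every
   byte stored, so it stays meaningful even if the call fails part-way.
   Overflow checks and stream locking are omitted in this sketch.  */
static ssize_t
getdelim_append_sketch (char **lineptr, size_t *n, int delimiter, FILE *fp,
                        size_t *cur_len)
{
  ssize_t result = 0;
  int c;

  if (lineptr == NULL || n == NULL || cur_len == NULL)
    {
      errno = EINVAL;
      return -1;
    }
  if (*lineptr == NULL)
    *n = 0;

  while ((c = getc (fp)) != EOF)
    {
      /* Need room for the new byte plus the final NUL.  */
      if (*cur_len + 2 > *n)
        {
          size_t new_n = *n < 64 ? 128 : 2 * *n;
          char *p;

          if (new_n < *cur_len + 2)
            new_n = *cur_len + 2;
          p = realloc (*lineptr, new_n);
          if (p == NULL)
            {
              ungetc (c, fp);   /* Keep the byte for a later retry.  */
              errno = ENOMEM;
              return -1;
            }
          *lineptr = p;
          *n = new_n;
        }
      (*lineptr)[(*cur_len)++] = (char) c;
      result++;
      if (c == delimiter)
        break;
    }
  if (result == 0)
    return -1;                  /* EOF or read error before any byte.  */
  (*lineptr)[*cur_len] = '\0';
  return result;
}

/* Hypothetical helper: read header lines up to and including the first
   blank line into a single buffer, using the loop from the commit
   message.  Returns the buffer (caller frees; NULL if nothing was read)
   and stores its length in *out_len.  */
static char *
read_header (FILE *f, size_t *out_len)
{
  char *buf = NULL;
  size_t cap = 0, len = 0, old_len;
  ssize_t ret;

  assert (f != NULL && out_len != NULL);
  do
    {
      old_len = len;
      ret = getdelim_append_sketch (&buf, &cap, '\n', f, &len);
    }
  while (ret > 0 && len - old_len > 1);

  *out_len = len;
  return buf;
}
```

After read_header returns, the stream is positioned at the first byte
following the blank line, so the body can be read with whatever method
the application prefers. On an ENOMEM failure inside the stand-in, len
still counts exactly the bytes that came from the stream, which is the
property the commit message argues plain getdelim cannot offer.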