On 2/24/2013 6:47 AM, Noob wrote:
> Noob wrote:
>
>> Here's my attempt at writing a "get_line" implementation, which
>> reads an entire line from a file stream, dynamically allocating
>> the space needed to store said line.
>
> I went back to the drawing board, with all your comments and suggestions
> in mind.
>
> As far as I can tell, there are 5 "situations" to deal with:
>
> 1) non-empty line
> 2) empty line
> 3) end of stream
> 4) stream error
> 5) out of memory
>
> For "typical" text files, getline will deal mostly with 1 and 2, and
> one necessary 3 at the end of the stream. 4 and 5 are exceptional
> error conditions.
>
> With the aim of keeping the common case simple, and given that I've
> stuck with a pointer return value, the simplest strategy seems to be
> to return
> - a valid pointer for 1 and 2
> - NULL for 3, 4, 5
>
> and let the user tell 3, 4, 5 apart using
> feof for 3, ferror for 4, otherwise 5
>
> So here's the "formal" description:
>
> char *mygetline(FILE *stream)
>
> mygetline dynamically allocates enough space (using malloc and friends) to
> store the next complete line (a valid NUL-terminated string) from 'stream'.
> The string must be free'd by the user when it is no longer needed.
> mygetline may return NULL
> 1) when it has reached the end of the stream
> 2) when there is an error reading from the stream
> 3) when malloc fails
> The user may use feof and ferror to distinguish between these cases
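
The three-way test described here could be sketched roughly as follows. The names why_null and enum null_reason are invented for illustration and are not part of the posted code; the function is meant to be called only after the reader has returned NULL.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
enum null_reason { REASON_EOF, REASON_IO_ERROR, REASON_NO_MEMORY };

/* Call only after the line reader has returned NULL for 'stream'. */
static enum null_reason why_null(FILE *stream)
{
    assert(stream != NULL);
    if (ferror(stream))
        return REASON_IO_ERROR;    /* case 4: stream error */
    if (feof(stream))
        return REASON_EOF;         /* case 3: end of stream */
    return REASON_NO_MEMORY;       /* case 5: neither flag is set */
}
```

One caveat, as far as I can tell: in the posted code the final shrinking realloc runs after getc has already returned EOF, so a failure at that point would be classified as end of stream by this test.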
>
> Here's the code (valid C89 according to gcc)
>
> #include <stdlib.h>
> #include <stdio.h>
>
> static char *wrap_realloc(char *s, size_t len)
> {
>     char *temp = realloc(s, len);
>     if (temp == NULL) free(s);
>     return temp;
> }
>
> char *mygetline(FILE *stream)
> {
>     char *s = NULL;
>     size_t len = 500;
>     while ( 1 )
>     {
>         size_t max = len*2;

If you're worried about wrap-around (a possibility, albeit
a remote one), at this point you could add something like

    if (max <= len) {
        max = (size_t) -1;  /* last gasp: try SIZE_MAX */
        if (max <= len) {
            free(s);
            return NULL;
        }
    }

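
The same guard can also be pulled out into a small helper. This is only a sketch: the name next_capacity is made up for illustration, and it assumes len is nonzero (with len == 0 the "max <= len" test would fire as well).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: choose the next buffer size after 'len'.
 * Returns 0 when the buffer cannot grow any further. */
static size_t next_capacity(size_t len)
{
    size_t max;

    assert(len > 0);            /* assumption: caller starts from a nonzero size */
    max = len * 2;              /* unsigned arithmetic wraps instead of trapping */
    if (max <= len) {           /* the doubling wrapped around */
        max = (size_t) -1;      /* last gasp: the largest size_t value */
        if (max <= len)
            return 0;           /* already at the limit */
    }
    return max;
}
```

The caller would treat a return of 0 like an allocation failure: free the buffer and return NULL.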
>         s = wrap_realloc(s, max);
>         if (s == NULL) return NULL;
>         while (len < max)
>         {
>             int c = getc(stream);
>             if (c == EOF || c == '\n')
>             {
>                 s[len] = '\0';
>                 return wrap_realloc(s, len+1);

I'm of two minds about this realloc(). On the one hand, it's
good not to tie up more memory than you must; if the caller is just
stuffing the lines into a linked list or something, the memory
saving is worthwhile. But it seems more likely that the caller
will read a line, extract bits and pieces of it and store them
(rather than the whole line), and then just call free(). In that
use, the realloc() is just a waste of effort.

My own edition of this thing (everybody writes one eventually)
takes the idea a step further: It re-uses the existing buffer on
the next call, meaning that the caller needn't (in fact, mustn't)
call free(), and that the caller must extract and stash information
before reading the next line. Whether that's a good idea is open
to debate, but it works for me. YMMV.

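
For illustration only, here is a minimal sketch of the buffer-reuse idea. It is not the edition described above: the name reuse_getline, the starting size of 128, and the use of a single static buffer are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reusable-buffer reader. The returned pointer belongs to
 * this function and is overwritten by the next call: do not free() it. */
static char *reuse_getline(FILE *stream)
{
    static char *buf = NULL;    /* kept between calls */
    static size_t cap = 0;      /* allocated size of buf */
    size_t len = 0;             /* characters stored for this line */
    int c;

    assert(stream != NULL);
    for (;;) {
        if (len + 1 > cap) {    /* need room for one more char or the '\0' */
            size_t newcap = cap ? cap * 2 : 128;
            char *temp;
            if (newcap <= cap)
                return NULL;    /* size wrapped around */
            temp = realloc(buf, newcap);
            if (temp == NULL)
                return NULL;    /* old buffer is kept for later calls */
            buf = temp;
            cap = newcap;
        }
        c = getc(stream);
        if (c == EOF || c == '\n')
            break;
        buf[len++] = (char) c;
    }
    if (c == EOF && len == 0)
        return NULL;            /* end of stream or read error, nothing read */
    buf[len] = '\0';
    return buf;
}
```

Because the buffer is static, a sketch like this is not reentrant, and lines read from two different streams overwrite each other.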
>             }
>             s[len++] = c;
>         }
>     }
> }
>
> P.S. Eric, I did note your remark that len*2 may wrap-around, I'm just
> not sure what to do in this situation...
>
> Again, suggestions and criticism are welcome.
In addition to the comments above, I suggest you test this
code: You may be in for a surprise. (Hint: When the caller
examines the returned buffer, where will it find the very first
input character?)
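
For comparison once you have run that test, here is one possible shape for a reader that keeps the count of stored characters separate from the allocated size. It is a sketch under my own assumptions, not the posted code: the name mygetline_fixed is invented, and it returns NULL when end of stream is hit before any character is read, or when a read error occurs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant: 'len' counts characters stored so far and starts
 * at 0, 'cap' is the allocated size. The caller must free() the result. */
static char *mygetline_fixed(FILE *stream)
{
    char *s = NULL, *temp;
    size_t len = 0, cap = 0;
    int c;

    assert(stream != NULL);
    for (;;) {
        if (len + 1 > cap) {    /* need room for one more char or the '\0' */
            size_t newcap = cap ? cap * 2 : 500;
            if (newcap <= cap) {            /* size wrapped around */
                free(s);
                return NULL;
            }
            temp = realloc(s, newcap);
            if (temp == NULL) {             /* out of memory */
                free(s);
                return NULL;
            }
            s = temp;
            cap = newcap;
        }
        c = getc(stream);
        if (c == EOF || c == '\n')
            break;
        s[len++] = (char) c;    /* the first character lands in s[0] */
    }
    if (c == EOF && (len == 0 || ferror(stream))) {
        free(s);                /* end of stream or read error */
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

This version skips the final shrinking realloc, which trades some memory for one less call per line.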
--
Eric Sosman