getline problem
Hello
I have been writing a getline-type function. It should read an arbitrary length line from a file and return it in a buffer. Here is my code

    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>

    void
    GL (char **buf, size_t * sz, FILE * fp) {
      size_t off = 0;
      char *new;
      for(;;)
        {
          if(*buf)
            {
              fgets (*buf + off, *sz - off, fp);
              if(strchr (*buf + off, '\n'))
                break;
              off += *sz - 1;
            }
          else
            *sz = 1;
          new = realloc (*buf, *sz <<= 1);
          if(!new)
            {
              free (*buf);
              *buf = *sz = 0;
              return;
            }
          *buf = new;
        }
    }

    int main (void) {
      char *p = 0;
      size_t sz = 0;
      while(!feof (stdin))
        {
          GL (&p, &sz, stdin);
          printf ("%d\n", strlen (p));
        }
    }

However when I run this with the following test file as input:

    ==begin input==
    hello
    world
    everyone

    bye
    ==end input==

the results are as follows:

    3
    6
    9
    1
    4
    4

instead I would expect to get:

    5
    5
    8
    0
    3

Can anyone see what the problem is?

Thanks
Re: getline problem
On Thu, 10 Feb 2011 19:48:18 +0000, Gand Alf wrote:
> I have been writing a getline-type function. It should read an
> arbitrary length line from a file and return it in a buffer.

[snip #includes]

> void
> GL (char **buf, size_t * sz, FILE * fp) {
>   size_t off = 0;
>   char *new;
>   for(;;)

What, you can't spell 'while' ?

>     {
>       if(*buf)
>         {
>           fgets (*buf + off, *sz - off, fp);
>           if(strchr (*buf + off, '\n'))
>             break;
>           off += *sz - 1;

^ You have an extra character there.

>         }

[snip realloc & most of main]

Note, your main starts off properly enough with "int main (void)", but at the end has no "return 0".

> However when I run this with the following test file as input: ==begin
> input==
> hello
> world
> everyone
>
> bye
> ==end input==
> the results are as follows:
>
> 3
> 6
> 9
> 1
> 4
> 4
>
> instead I would expect to get:
> 5
> 5
> 8
> 0
> 3
>
> Can anyone see what the problem is?

(a) Wrong expectation -- Per man fgets, "If a newline is read, it is stored into the buffer", and (eg) strlen("world\n") is 6.

(b) Mishandled EOF test -- Again per man, "fgets() returns ... NULL ... when end of file occurs while no characters have been read", but you've discarded the value of fgets(), so can't put a \0 at **buf, etc

--
jiw
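A minimal sketch of what points (a) and (b) imply, assuming a hypothetical helper named read_line (it is not the original poster's GL). It tests the return value of fgets, advances the offset by the number of characters actually stored, and keeps the newline in the buffer as fgets does. The return codes and the starting size of 16 are illustrative choices, not anything stated in the thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length into *buf, growing
   the buffer with realloc as needed. The newline, if any, is kept.
   Returns 1 if a line was read (the last line may lack a newline),
   0 at end of file with nothing read, -1 if allocation fails. */
int read_line(char **buf, size_t *sz, FILE *fp)
{
    size_t off = 0;   /* characters stored so far for this line */

    if (*buf == NULL || *sz < 2) {
        char *p = realloc(*buf, 16);
        if (p == NULL)
            return -1;
        *buf = p;
        *sz = 16;
    }
    (*buf)[0] = '\0';

    for (;;) {
        /* fgets returns NULL at end of file or on a read error. */
        if (fgets(*buf + off, (int)(*sz - off), fp) == NULL)
            return off > 0;

        /* Advance by what was actually stored, not by the buffer size. */
        off += strlen(*buf + off);
        if (off > 0 && (*buf)[off - 1] == '\n')
            return 1;

        /* No newline yet: if the buffer is full, double it and go on. */
        if (off + 1 >= *sz) {
            size_t nsz = *sz * 2;
            char *p = realloc(*buf, nsz);
            if (p == NULL)
                return -1;
            *buf = p;
            *sz = nsz;
        }
    }
}
```

With the test input from the first post, a loop such as `while (read_line(&p, &sz, stdin) == 1)` would see strings of length 6, 6, 9, 1 and 4, since each one includes its newline, and the loop ends at end of file without a repeated last line. Printing a size_t portably takes `%zu` (C99) rather than `%d`.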
Re: getline problem
pete <pfiland@mindspring.com> writes:
> James Waldby wrote:
>>
>> On Thu, 10 Feb 2011 19:48:18 +0000, Gand Alf wrote:
>
>> > for(;;)
>> What, you can't spell 'while' ?
>
> for(;;) is the special construct for infinite loops.

It's *a* construct for infinite loops.

> On some compilers, while(1) generates a warning.

while (1) and for (;;) are equally valid. Compilers can warn about anything they like.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Re: getline problem
In article <ij1jrl$i7p$1@news.eternal-september.org>,
James Waldby <not@valid.invalid> wrote:

> On Thu, 10 Feb 2011 19:48:18 +0000, Gand Alf wrote:
[...]
> > for(;;)
> What, you can't spell 'while' ?

I forgot around the 999th time I changed `while(1)' to `for(<initialize>; <test>; <iterate>)'.

--
Michael Press
Re: getline problem
@ OP
<problematic snippet>
if(strchr (*buf + off, '\n'))
    break;
</problematic snippet>

There is indeed an extra character counted as part of the string by the time control passes back to main: the newline. Replacing it with NUL hands main the NUL-terminated C string it expects, without the newline.

<corrected snippet>
char *ptrToNewLineChar = strchr (*buf + off, '\n');
if( ptrToNewLineChar != NULL )
{
    *ptrToNewLineChar = '\0';
    break;
}
</corrected snippet>

In addition, fgets does not always read the number of characters that the current update assumes, so the offset (variable off) has to be advanced by the number of characters actually read. Since what fgets stores at *buf + off is a C string, I would suggest strlen() for this: it gives the exact count, and it still does so once the newline has been chopped off as shown above.

<problematic snippet>
off += *sz - 1;
</problematic snippet>

<corrected snippet>
off += strlen(*buf + off);
</corrected snippet>

Cheers.
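The newline handling suggested in this post can be isolated into a small function. This is only a sketch; the name strip_newline and its return convention are hypothetical, chosen for illustration.

```c
#include <string.h>

/* Hypothetical helper: overwrites the first newline in s, if there is
   one, with a terminating NUL. Returns 1 if a newline was found (the
   line was complete) and 0 if not (more of the line may remain). */
int strip_newline(char *s)
{
    char *nl = strchr(s, '\n');

    if (nl == NULL)
        return 0;
    *nl = '\0';
    return 1;
}
```

Called on the piece that fgets just stored at *buf + off, a return of 1 means the line is finished; a return of 0 means the offset should advance by strlen(*buf + off) before the next read.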
Re: getline problem
On Feb 10, 7:48 pm, Gand Alf <nos...@nospam.com> wrote:
> Hello
>
> I have been writing a getline-type function. It should read an arbitrary
> length line from a file and return it in a buffer.

....

> Can anyone see what the problem is?

Just an FYI. fgets is OK for limited use, but it has some issues that make it unsuitable unless you can guarantee what input it will be given. See

http://codewiki.wikispaces.com/xbuf.c

and look at the section "What's wrong with fgets?"

James
Re: getline problem
On Feb 11, 8:03 pm, James Harris <james.harri...@googlemail.com> wrote:
>
> and look at the section What's wrong with fgets?
>

This doesn't mention the real problem, which is that fgets() is just too difficult to use correctly if a program must process all input with absolutely no errors.

You have to call strchr() to check for the newline. If it is absent, a partial read has occurred, so you need to take action to ensure that the characters remaining in the stream are not treated as a whole line. All very fiddly.

However, in most applications you can just assume that a partial read will generate a parse error on the next line.
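The fiddly part described here can be sketched as follows for a fixed-size buffer. The function name read_fixed_line and its return codes are hypothetical, and discarding the rest of an over-long line is just one possible policy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (n must be at least 2)
   and strips the newline.
   Returns 1 if the whole line fit, 2 if it was too long and the rest
   was discarded, 0 at end of file with nothing read. */
int read_fixed_line(char *buf, int n, FILE *fp)
{
    char *nl;
    int c;

    if (fgets(buf, n, fp) == NULL)
        return 0;

    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';
        return 1;
    }

    /* No newline was stored. Peek at what follows: if it is the
       newline or end of file, nothing was lost after all. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;

    /* Partial read: consume the rest so that it is not mistaken for
       a separate line on the next call. */
    do {
        c = getc(fp);
    } while (c != '\n' && c != EOF);
    return 2;
}
```

A caller that must not lose data would treat a return of 2 as an error, or switch to a growing buffer instead of discarding.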
Re: getline problem
On Feb 13, 12:25 pm, Malcolm McLean <malcolm.mcle...@btinternet.com> wrote:
> On Feb 11, 8:03 pm, James Harris <james.harri...@googlemail.com>
> wrote:
>
> > and look at the section What's wrong with fgets?
>
> This doesn't mention the real problem, which is that fgets() is just
> too difficult to use correctly if a program must process all input
> with absolutely no errors.
>
> You have to call strchr() to check for the newline. If it is
> absent, a partial read has occurred. So you need to take action to
> ensure that the characters remaining in the stream are not treated as
> a whole line. All very fiddly.
> However in most applications you can just assume that a partial read
> will generate a parse error on the next line.

It's worse than I thought!

Incidentally, it occurred to me that another, more modern way to read a line from a file may be to mmap the file and use memchr or similar to scan it. Maybe the fastest option? As a slight downside, the code might need to account for very large files that exceed address space.

James
[OT] Re: getline problem
On 2/13/2011 2:56 PM, James Harris wrote:
> [...]
> Incidentally, it occurred to me that another, more modern way to read
> a line from a file may be to mmap the file and use memchr or similar
> to scan it. Maybe the fastest option? As a slight downside, the code
> might need to account for very large files that exceed address space.

<off-topic reason="requires extensions">

That's one downside. Another is that the system's line-ending conventions won't be translated to '\n' for you; you'll need to interpret them yourself. Still another downside is that not all input sources are mmap-able: Try it on your keyboard, for instance, or on a pipe or socket.

</off-topic>

--
Eric Sosman
esosman@ieee-dot-org.invalid
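A rough sketch of the memchr scan under discussion, working on any block of bytes already in memory. It assumes the caller obtained the block some other way (for example from the POSIX mmap call, which is not shown and is outside standard C). The name next_line is hypothetical, and treating "\r\n" as a line ending is just one example of interpreting the conventions yourself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: examines the line starting at data, within a
   block of len bytes. Stores the line's length, without its "\n" or
   "\r\n" terminator, in *line_len and returns the number of bytes the
   line occupies including the terminator. Returns 0 when len is 0. */
size_t next_line(const char *data, size_t len, size_t *line_len)
{
    const char *nl;
    size_t n;

    *line_len = 0;
    if (len == 0)
        return 0;

    nl = memchr(data, '\n', len);
    if (nl == NULL) {
        /* Last line has no terminator: it runs to the end of the block. */
        *line_len = len;
        return len;
    }

    n = (size_t)(nl - data);
    /* Nothing translates "\r\n" to "\n" here, so drop the '\r' by hand. */
    *line_len = (n > 0 && data[n - 1] == '\r') ? n - 1 : n;
    return n + 1;
}
```

A caller would advance its pointer by the returned count and shrink len by the same amount until the function returns 0. The bytes are not NUL-terminated, so each line has to be handled as a pointer plus a length.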
Re: getline problem
On Feb 13, 9:00 pm, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> On 2/13/2011 2:56 PM, James Harris wrote:
>
> > [...]
> > Incidentally, it occurred to me that another, more modern way to read
> > a line from a file may be to mmap the file and use memchr or similar
> > to scan it. Maybe the fastest option? As a slight downside, the code
> > might need to account for very large files that exceed address space.
>
> <off-topic reason="requires extensions">
>
> That's one downside. Another is that the system's line-ending
> conventions won't be translated to '\n' for you; you'll need to
> interpret them yourself.

True.

> Still another downside is that not all
> input sources are mmap-able: Try it on your keyboard, for instance,
> or on a pipe or socket.

Sure. I did say "file".

James