Velocity Reviews - Computer Hardware Reviews

Analogue of ReadLine() in C

 
 
Morris Keesan
      10-20-2011
On Thu, 20 Oct 2011 16:32:06 -0400, Michael Angelo Ravera
<(E-Mail Removed)> wrote:

> The reason that your code fails is that you can't generally realloc what
> you haven't [m|c|0]alloc'd.


Wrong. From the C99 standard, ISO/IEC 9899:1999 (E)

7.20.3.4 The realloc function

Synopsis
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
....
If ptr is a null pointer, the realloc function behaves like the
malloc function for the specified size.
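
To illustrate the clause quoted above, here is a minimal sketch of a buffer that starts out as a null pointer and is only ever grown through realloc(). The names struct strbuf and append_char are made up for this example; they do not come from the original poster's code.

```c
#include <stdlib.h>

/* Hypothetical growable string; buf deliberately starts as NULL. */
struct strbuf {
    char  *buf;   /* NULL until the first append */
    size_t len;   /* characters stored, not counting the '\0' */
    size_t cap;   /* bytes allocated */
};

/* Append one character, growing with realloc() as needed.
   Returns 0 on success, -1 if memory ran out (buffer left intact). */
int append_char(struct strbuf *sb, char ch)
{
    if (sb->len + 2 > sb->cap) {               /* room for ch and '\0' */
        size_t newcap = sb->cap ? sb->cap * 2 : 16;
        char *p = realloc(sb->buf, newcap);    /* first call: realloc(NULL, 16) */
        if (p == NULL)
            return -1;
        sb->buf = p;
        sb->cap = newcap;
    }
    sb->buf[sb->len++] = ch;
    sb->buf[sb->len] = '\0';
    return 0;
}
```

Initialise with `struct strbuf sb = { NULL, 0, 0 };` and call free(sb.buf) when done. No separate malloc() call is needed for the first allocation.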

--
Morris Keesan -- (E-Mail Removed)
 
BartC
      10-20-2011


"Malcolm McLean" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Oct 20, 3:31 pm, "BartC" <(E-Mail Removed)> wrote:
>>
>> Someone doing 'ReadLine' is probably interested in line-oriented files.
>>
>> A file containing a single multi-GB line is likely not
>> line-oriented!
>> Or is not using the expected newline sequence.
>>

> The problem is that it is impossible to define a reasonable line
> length.


A good guide might be how long you are prepared to spend scrolling from one
end to another.

My rough calculations show that a line 30,000 characters long would take
about a minute to scroll from one end to the other (paging a full width,
twice a second). A limit of 30,000 characters seems perfectly reasonable
(I use 2048 myself and rarely have problems).

(30,000 characters also requires a virtual screen 200 feet wide. A line
buffer 1 billion characters long requires a virtual screen some 1300 miles
across and would take 3 weeks to scroll. No-one can possibly pretend that
that is still one line of text!)
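
The figures above can be checked with a few lines of C. The page width of 250 characters and the density of 150 characters per foot are not stated in the post; they are assumptions back-derived from "30,000 characters in about a minute at two pages a second" and "200 feet wide".

```c
/* Assumed constants, back-derived from the figures in the post. */
#define CHARS_PER_PAGE   250.0   /* 30,000 chars / (60 s * 2 pages/s) */
#define PAGES_PER_SECOND 2.0     /* "paging a full width, twice a second" */
#define CHARS_PER_FOOT   150.0   /* 30,000 chars / 200 feet */
#define FEET_PER_MILE    5280.0

/* Seconds needed to page from one end of a line to the other. */
double scroll_seconds(double chars)
{
    return chars / CHARS_PER_PAGE / PAGES_PER_SECOND;
}

/* Width in feet of a virtual screen showing the whole line at once. */
double screen_feet(double chars)
{
    return chars / CHARS_PER_FOOT;
}
```

Under these assumptions scroll_seconds(1e9) comes to 2,000,000 seconds, a little over 23 days, and screen_feet(1e9) / FEET_PER_MILE comes to roughly 1,263 miles, which agrees with the rounded numbers above.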

If a limit of a few thousand characters is not enough, then a different
approach is needed; something in between ReadLine and ReadFile; ReadBlock
perhaps, as the file cannot be considered line-oriented, not in the
conventional, human-readable sense.

--
Bartc


 
Keith Thompson
      10-20-2011
"BartC" <(E-Mail Removed)> writes:
> "Malcolm McLean" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> On Oct 20, 3:31 pm, "BartC" <(E-Mail Removed)> wrote:
>>>
>>> Someone doing 'ReadLine' is probably interested in line-oriented files.
>>>
>>> A file containing a single multi-GB line is likely not
>>> line-oriented!
>>> Or is not using the expected newline sequence.
>>>

>> The problem is that it is impossible to define a reasonable line
>> length.

>
> A good guide might be how long you are prepared to spend scrolling from one
> end to another.

[snip]
> If a limit of a few thousand characters is not enough, then a different
> approach is needed; something in between ReadLine and ReadFile; ReadBlock
> perhaps, as the file cannot be considered line-oriented, not in the
> conventional, human-readable sense.


Text isn't necessarily meant to be human-readable.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Ian Collins
      10-21-2011
On 10/21/11 11:34 AM, Keith Thompson wrote:
> "BartC"<(E-Mail Removed)> writes:
>> "Malcolm McLean"<(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...
>>> On Oct 20, 3:31 pm, "BartC"<(E-Mail Removed)> wrote:
>>>>
>>>> Someone doing 'ReadLine' is probably interested in line-oriented files.
>>>>
>>>> A file containing a single multi-GB line is likely not
>>>> line-oriented!
>>>> Or is not using the expected newline sequence.
>>>>
>>> The problem is that it is impossible to define a reasonable line
>>> length.

>>
>> A good guide might be how long you are prepared to spend scrolling from one
>> end to another.

> [snip]
>> If a limit of a few thousand characters is not enough, then a different
>> approach is needed; something in between ReadLine and ReadFile; ReadBlock
>> perhaps, as the file cannot be considered line-oriented, not in the
>> conventional, human-readable sense.

>
> Text isn't necessarily meant to be human-readable.


Indeed. The longest single "line" I have in a file is 1.3GB (a JSON
representation of a rather large filesystem).

--
Ian Collins
 
Jorgen Grahn
      10-21-2011
On Thu, 2011-10-20, Ben Bacarisse wrote:
> Markus Wichmann <(E-Mail Removed)> writes:

....

> I am not sure how you decided that this suits the OP's use case. I
> think the original posted code was just a cut down example, so it's not
> obvious that silently splitting lines that are longer than 255
> characters suits the OP.
>
>> No dynamic allocation where it isn't necessary. It is not really
>> necessary in I/O, because there is never a buffer big enough for that,
>> so don't program as though there were!

>
> I think you'd be annoyed if, say, awk behaved like this. Given a big
> enough buffer, your shell might get away with it, but I expect it
> doesn't try.


I'm too lazy to look for the context, but I assume you are talking
about varying input line lengths here.

I think it's a bit more complicated than that. Assuming some
line-oriented input format:

- It's usually wrong to set a limit (say, 8192 bytes) and pretend
anything longer is two or more lines.

- It's usually wrong to accept *any* length by using malloc()/realloc(),
and then start swapping and crashing when someone feeds you a
10GB line.

I assume a decent awk sets some limit (far higher than any reasonable
input) and exits gracefully with an error message when it gets
anything larger.

- Many file/data formats put a limit on the line length anyway,
NNTP and SMTP for example. IIRC the C standard also allows a compiler
to bail out on lines longer than N characters.
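
A middle road between the first two points could look like the sketch below: grow the buffer with realloc(), but give up with a distinct error code once a caller-supplied cap is exceeded. The name read_line_capped, the RL_* codes and the choice of cap are assumptions for illustration; none of this is taken from awk or any existing library.

```c
#include <stdio.h>
#include <stdlib.h>

enum { RL_OK = 0, RL_EOF = 1, RL_TOO_LONG = 2, RL_NO_MEM = 3, RL_IO = 4 };

/* Read one line into a freshly allocated string stored in *out
   (newline stripped). Lines longer than max_len characters are
   rejected with RL_TOO_LONG instead of being split or buffered whole.
   On any result other than RL_OK, *out is left as NULL. */
int read_line_capped(FILE *fp, size_t max_len, char **out)
{
    char *buf = NULL;
    size_t len = 0, cap = 0;
    int c;

    *out = NULL;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len == max_len) {                 /* over the limit: give up */
            free(buf);
            return RL_TOO_LONG;
        }
        if (len + 2 > cap) {                  /* room for c and '\0' */
            size_t newcap = cap ? cap * 2 : 64;
            char *p = realloc(buf, newcap);
            if (p == NULL) {
                free(buf);
                return RL_NO_MEM;
            }
            buf = p;
            cap = newcap;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && ferror(fp)) {
        free(buf);
        return RL_IO;
    }
    if (c == EOF && len == 0)
        return RL_EOF;                        /* nothing read, buf is NULL */
    if (buf == NULL) {                        /* empty line: '\n' alone */
        buf = malloc(1);
        if (buf == NULL)
            return RL_NO_MEM;
    }
    buf[len] = '\0';
    *out = buf;
    return RL_OK;
}
```

After RL_TOO_LONG the rest of the offending line is still unread, which is fine if the caller prints an error message and exits, as suggested above. The caller frees the string returned through *out.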

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
BartC
      10-21-2011
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "BartC" <(E-Mail Removed)> writes:


>> If a limit of a few thousand characters is not enough, then a different
>> approach is needed; something in between ReadLine and ReadFile; ReadBlock
>> perhaps, as the file cannot be considered line-oriented, not in the
>> conventional, human-readable sense.

>
> Text isn't necessarily meant to be human-readable.


In that case there should be no pretence that a ReadLine function actually
reads a 'line' at all, but any sequence of characters of any length.

That's why I suggested such a function should be called ReadBlock, with all
the problems that that would introduce (such as possibly bringing down the
system in the case of a rogue file with 'lines' too long to fit into either
physical or virtual memory).

Then 'ReadLine' can be used for the straightforward cases such as reading
configuration files or command lines. Lines which are too long are reported
as errors (which in all probability they are).
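
For these straightforward cases a fixed-size buffer plus an explicit "line too long" result is enough. The sketch below uses fgets(); the name read_line_fixed, the LINE_* codes and the decision to discard the remainder of an overlong line are assumptions made for this example, not something specified above.

```c
#include <stdio.h>
#include <string.h>

enum { LINE_OK = 0, LINE_EOF = 1, LINE_TOO_LONG = 2 };

/* Read one line into buf (size bytes, newline stripped).
   An overlong line is reported as LINE_TOO_LONG and its remainder
   is skipped, so the next call starts on the following line.
   size must be at least 2; smaller sizes also yield LINE_EOF. */
int read_line_fixed(FILE *fp, char *buf, size_t size)
{
    size_t n;
    int c;

    if (size < 2 || fgets(buf, (int)size, fp) == NULL)
        return LINE_EOF;                  /* end of file or read error */

    n = strlen(buf);
    if (n > 0 && buf[n - 1] == '\n') {
        buf[n - 1] = '\0';                /* complete line */
        return LINE_OK;
    }
    if (n < size - 1)
        return LINE_OK;                   /* last line, no trailing newline */

    /* Buffer is full and no newline was seen: peek at what follows. */
    c = fgetc(fp);
    if (c == EOF || c == '\n')
        return LINE_OK;                   /* the line fitted exactly */
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;                                 /* discard the rest of the line */
    return LINE_TOO_LONG;
}
```

With `char buf[2048];` a configuration-file reader would call read_line_fixed(fp, buf, sizeof buf) in a loop and report the line number whenever LINE_TOO_LONG comes back.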

--
Bartc

 
Stefan Ram
      10-21-2011
Jorgen Grahn <(E-Mail Removed)> writes:
>- It's usually wrong to set a limit (say, 8192 bytes) and pretend
> anything longer is two or more lines.


Untested:

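#include <stdio.h>

/* Passes one line from source to target, a character at a time.
   Returns 0 at a newline, 2 at end of file, 3 on a read error,
   4 if target() returned nonzero. */
int readline
( FILE * const source,
  int( * const target )( int ch, void * obj ),
  void * const object )
{ int c; int result = 1; while( result == 1 )
  { c = fgetc( source ); result =
      c == EOF ? ( ferror( source )? 3 : 2 ) :
      c == '\n' ? 0 :
      target( c, object )? 4 : result; }
  return result; }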
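
One way to drive a function of this shape is a callback that copies characters into a bounded buffer and refuses once the buffer is full. Apart from the readline() interface and its result codes, everything here (struct linebuf, store_char, the 64-byte size) is an assumption for illustration. The copy of readline() below is restated in a plainer layout and tests ferror() only after fgetc() has returned EOF, so that a read error yields 3 rather than 2.

```c
#include <stdio.h>

/* Same interface and result codes as the function above:
   0 newline seen, 2 end of file, 3 read error, 4 target refused. */
int readline(FILE *source, int (*target)(int ch, void *obj), void *object)
{
    for (;;) {
        int c = fgetc(source);
        if (c == EOF)
            return ferror(source) ? 3 : 2;
        if (c == '\n')
            return 0;
        if (target(c, object))
            return 4;
    }
}

/* Hypothetical target: a fixed-size buffer that rejects overflow. */
struct linebuf {
    char   text[64];
    size_t len;
};

/* Store one character; a nonzero return makes readline() return 4. */
int store_char(int ch, void *obj)
{
    struct linebuf *lb = obj;
    if (lb->len + 1 >= sizeof lb->text)
        return 1;                         /* no room left for ch and '\0' */
    lb->text[lb->len++] = (char)ch;
    lb->text[lb->len] = '\0';
    return 0;
}
```

The caller resets len to 0 before each call to readline(f, store_char, &lb). A final line without a newline comes back as 2 with its characters already delivered to the callback, so the buffer still has to be checked at end of file.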

 
Malcolm McLean
      10-21-2011
On Oct 21, 11:30 am, "BartC" <(E-Mail Removed)> wrote:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
>
> > Text isn't necessarily meant to be human-readable.

>
> In that case there should be no pretence that a ReadLine function actually
> reads a 'line' at all, but any sequence of characters of any length.
>

There's a difference between human-inspectable and human-readable.

In the case of my program, no-one can really make sense of a matrix
with 1000 factors and 6000 cases. However you can certainly look at it
to see that the factors have a reasonable balance of set and missing,
the labels are right, there's the right number of cases (one per
gene), and so on.
--
Visit my website
http://www.malcolmmclean.site11.com/www


 
Keith Thompson
      10-21-2011
"BartC" <(E-Mail Removed)> writes:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> "BartC" <(E-Mail Removed)> writes:

>
>>> If a limit of a few thousand characters is not enough, then a different
>>> approach is needed; something in between ReadLine and ReadFile; ReadBlock
>>> perhaps, as the file cannot be considered line-oriented, not in the
>>> conventional, human-readable sense.

>>
>> Text isn't necessarily meant to be human-readable.

>
> In that case there should be no pretence that a ReadLine function
> actually reads a 'line' at all, but any sequence of characters of any
> length.
>
> That's why I suggested such a function should be called Readblock,
> with all the problems that that would introduce (such as possibly
> bringing down the system in the case of a rogue file with 'lines' too
> long to fit into either physical or virtual memory).
>
> Then 'ReadLine' can be used for the straightforward cases such as
> reading configuration files or command lines. Lines which are too long
> are reported as errors (which in all probability they are).


A block of characters terminated by a newline character is what we call
a "line".

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Ben Bacarisse
      10-21-2011
(E-Mail Removed)-berlin.de (Stefan Ram) writes:

> Jorgen Grahn <(E-Mail Removed)> writes:
>>- It's usually wrong to set a limit (say, 8192 bytes) and pretend
>> anything longer is two or more lines.

>
> Untested:
>
> int readline
> ( FILE * const source,
> int( * const target )( int ch, void * obj ),
> void * const object )
> { int c; int result = 1; while( result == 1 )
> { c = fgetc( source ); result =
> c == EOF ? 2 :
> ferror( source )? 3 :
> c == '\n' ? 0 :
> target( c, object )? 4 : result; }
> return result; }


Unread.

--
Ben.
 