File handling question

 
 
Anunay
 
      05-17-2006
Hi all,

Suppose a text file is given and we have to write a program that will
return a random line from the file. This can be easily done. But what
if the text file is too big and can't fit into the main memory
completely? In this case, how will we modify our code?

Also, if we are given a stream, instead of a file, then what changes
are required?

Thanks,
Anunay

 
 
 
 
 
Richard Heathfield
 
      05-17-2006
Anunay said:

> Hi all,
>
> Suppose a text file is given and we have to write a program that will
> return a random line from the file. This can be easily done. But what
> if the text file is too big and can't fit into the main memory
> completely? In this case, how will we modify our code?


Not at all, if you write it properly first time. You just need storage for
two lines: the currently "saved" line, and the most recently read line.
When you read line N into memory, copy it into the "saved" buffer with
probability 1/N. At the end of this process, the "saved" buffer will
contain a randomly selected line.
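
A minimal sketch of this idea, for readers who want to see it in code. It is not Richard's code: the names `pick_random_line`, `one_in`, and `MAX_LINE` are made up for illustration, and it assumes no line is longer than `MAX_LINE - 1` characters (longer lines would be treated as several lines by `fgets`). The selection is uniform because line k is copied with probability 1/k and then survives each later line j with probability (j-1)/j, and the product of those terms for a file of N lines is 1/N.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 4096   /* assumed upper bound on line length */

/* Returns nonzero with probability roughly 1/n (n >= 1).
   rand() % n is slightly biased and only meaningful while n is
   well below RAND_MAX; good enough for a sketch. */
static int one_in(unsigned long n)
{
    return (unsigned long)rand() % n == 0;
}

/* Reads `in` to the end, keeping one randomly chosen line in `saved`.
   Only two buffers are used, however large the input is.
   Returns the number of lines read; 0 means `saved` was not written. */
unsigned long pick_random_line(FILE *in, char saved[MAX_LINE])
{
    char current[MAX_LINE];
    unsigned long n = 0;

    while (fgets(current, sizeof current, in) != NULL) {
        n++;
        if (one_in(n))              /* line 1 is always kept: 1/1 */
            strcpy(saved, current);
    }
    return n;
}
```

A caller would seed the generator once (for example with `srand`), pass an open stream, and print `saved` if the return value is nonzero.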

> Also, if we are given a stream, instead of a file, then what changes
> are required?


None at all, if the way you wrote the first program was to take the input
from stdin if no command line argument was forthcoming.
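
One way to arrange that fallback, as a hedged sketch. The helper name `open_input` is hypothetical; the point is only that the rest of the program works on a `FILE *` and never needs to know whether it came from `fopen` or is `stdin`.

```c
#include <stdio.h>

/* Returns the stream to read from: the file named by argv[1] if an
   argument was given, otherwise stdin. Returns NULL if the named
   file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc > 1)
        return fopen(argv[1], "r");
    return stdin;
}
```

In `main`, the result would be checked against NULL, handed to the line-selection code, and closed afterwards only if it is not `stdin`.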

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
 
 
 
 
 
Friedrich Dominicus
 
      05-17-2006
"Anunay" <(E-Mail Removed)> writes:

> Hi all,
>
> Suppose a text file is given and we have to write a program that will
> return a random line from the file. This can be easily done. But what
> if the text file is too big and can't fit into the main memory
> completely?

What is the problem? You read the file line by line until you reach
the line with your random number.

>In this case, how will we modify our code?

I do not have to change a line of code.
>
> Also, if we are given a stream, instead of a file, then what changes
> are required?

What do you mean by this?

Friedrich


--
Please remove just-for-news- to reply via e-mail.
 
 
Ico
 
      05-17-2006
Friedrich Dominicus wrote:
> "Anunay" writes:
>
>> Hi all,
>>
>> Suppose a text file is given and we have to write a program that will
>> return a random line from the file. This can be easily done. But what
>> if the text file is too big and can't fit into the main memory
>> completely?

> What is the problem? You read the file line by line until you reach
> the line with your random number.


That requires two passes over the file: one to count the lines and
one to read up to the randomly chosen line. This works when reading a
file (although it is not optimal, of course), but it is not possible
with streams.
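
For comparison, a rough sketch of that two-pass approach. The function name is invented for illustration, and it assumes every line ends with a newline and that the line count is well below `RAND_MAX`. The `fseek` call is exactly the step that fails on a pipe or terminal, which is why this does not carry over to streams.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pass 1 counts newline-terminated lines; pass 2 copies one randomly
   chosen line to `out`. Returns 0 on success, -1 if the input is
   empty or cannot be repositioned to its start. */
int print_random_line_two_pass(FILE *in, FILE *out)
{
    unsigned long total = 0, target, seen = 0;
    int c;

    while ((c = getc(in)) != EOF)
        if (c == '\n')
            total++;
    if (total == 0 || fseek(in, 0L, SEEK_SET) != 0)
        return -1;                      /* empty input, or not seekable */

    target = (unsigned long)rand() % total;     /* 0-based line index */
    while ((c = getc(in)) != EOF) {
        if (seen == target)
            putc(c, out);               /* inside the chosen line */
        if (c == '\n' && ++seen > target)
            break;                      /* chosen line fully copied */
    }
    return 0;
}
```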

>> Also, if we are given a stream, instead of a file, then what changes
>> are required?


> What do you mean by this?


I assume the OP means reading from standard input, instead of a file.

--
:wq
^X^Cy^K^X^C^C^C^C
 
 
Robert Latest
 
      05-17-2006
On Wed, 17 May 2006 07:07:01 +0000,
Richard Heathfield wrote:

[...]

homework assignment DONE.

robert

 
 
Anunay
 
      05-17-2006

Richard Heathfield wrote:
> Not at all, if you write it properly first time. You just need storage for
> two lines: the currently "saved" line, and the most recently read line.
> When you read line N into memory, copy it into the "saved" buffer with
> probability 1/N. At the end of this process, the "saved" buffer will
> contain a randomly selected line.


Hi Richard,

Can you please elaborate a little on this? I am confused about what will
happen at the fopen() function call. As the file is too big to fit fully
into main memory, what will fopen() return?
If fopen() returns SUCCESS, then where does the portion of the file that
could not be loaded into main memory get stored?

Thanks.

 
 
Richard Heathfield
 
      05-17-2006
Robert Latest said:

> On Wed, 17 May 2006 07:07:01 +0000,
> Richard Heathfield wrote:
>
> [...]
>
> homework assignment DONE.


No, I just did the thinking. He still gets to do the coding. If he can't,
and it looks very much as if this is the case, then presumably he still
won't get the marks.

 
 
Richard Heathfield
 
      05-17-2006
Anunay said:

>
> Richard Heathfield wrote:
>> Not at all, if you write it properly first time. You just need storage
>> for two lines: the currently "saved" line, and the most recently read
>> line. When you read line N into memory, copy it into the "saved" buffer
>> with probability 1/N. At the end of this process, the "saved" buffer will
>> contain a randomly selected line.

>
> Hi Richard,
>
> Can you please elaborate a little on this?


I am very reluctant to do that.

> I am confused about what will happen at the fopen() function call.


Either the file will open, or fopen will return NULL.

> As the file is too big to fit fully into main memory, what
> will fopen() return?


You seem to be confused about the purpose of fopen. It does not read a file
into main memory.

> If fopen() returns SUCCESS, then where does the portion of the file
> that could not be loaded into main memory get stored?


fopen() never returns SUCCESS. It returns either NULL or a pointer to an
in-memory semi-opaque data structure describing an open file.
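
To make that concrete, here is a small sketch (the function name `count_lines` and the 256-byte buffer size are arbitrary choices for illustration). `fopen` only sets up the stream; data is brought into the program's buffer by the read calls, one piece at a time, and whatever has not been read yet simply stays in the file on disk.

```c
#include <stdio.h>
#include <string.h>

/* Counts newline-terminated lines in the named file using a single
   fixed-size buffer, so memory use does not depend on file size.
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    char chunk[256];
    long lines = 0;
    FILE *fp = fopen(path, "r");    /* NULL on failure, else a FILE * */

    if (fp == NULL)
        return -1;
    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t len = strlen(chunk);
        if (len > 0 && chunk[len - 1] == '\n')
            lines++;                /* this chunk completed a line */
    }
    fclose(fp);
    return lines;
}
```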

 
 
CBFalconer
 
      05-17-2006
Richard Heathfield wrote:
> Anunay said:
>>
>> Suppose a text file is given and we have to write a program that
>> will return a random line from the file. This can be easily done.
>> But what if the text file is too big and can't fit into the main
>> memory completely? In this case, how will we modify our code?

>
> Not at all, if you write it properly first time. You just need
> storage for two lines: the currently "saved" line, and the most
> recently read line. When you read line N into memory, copy it
> into the "saved" buffer with probability 1/N. At the end of this
> process, the "saved" buffer will contain a randomly selected line.
>
>> Also, if we are given a stream, instead of a file, then what
>> changes are required?

>
> None at all, if the way you wrote the first program was to take the input
> from stdin if no command line argument was forthcoming.


Using Richard's clever algorithm and my ggets routine you can
probably reduce it to a few lines and a suitable function to decide
the "probability 1/N" result.

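char *saved = NULL;         /* line currently selected, or NULL if none yet */
char *buffer;               /* line just read; allocated by ggets */
size_t N = 0;               /* number of lines read so far */
size_t savedlinenum = 0;    /* 1-based number of the selected line */

while (0 == ggets(&buffer)) {       /* loop while ggets reports success */
    N++;
    if (probfunction(N)) {          /* true with probability 1/N */
        free(saved);                /* discard the previous selection */
        saved = buffer;             /* keep the new line instead */
        savedlinenum = N;
    }
    else free(buffer);
}
if (saved) puts(saved);
free(saved);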

For probfunction, read the comp.lang.c FAQ. You can find ggets at:

<http://cbfalconer.home.att.net/download/ggets.zip>
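
The post leaves probfunction undefined, so what follows is only one possible sketch of it, using the usual idiom of scaling rand() into a range rather than taking a remainder. It is an assumption about what such a function could look like, not CBFalconer's code, and its accuracy degrades as n approaches RAND_MAX because rand() cannot produce finer-grained probabilities than that.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical probfunction: returns nonzero with probability
   about 1/n, for n >= 1. */
int probfunction(size_t n)
{
    /* Scale rand() to a value in [0, n) and test whether it lands
       in the first of the n equal slots. For n == 1 this is always
       true, so the first line is always saved. */
    return (size_t)((double)rand() / ((double)RAND_MAX + 1.0) * (double)n) == 0;
}
```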
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>


 
 
Anunay
 
      05-17-2006

Richard Heathfield wrote:
> > As the file is too big to fit fully into main memory, what
> > will fopen() return?

>
> You seem to be confused about the purpose of fopen. It does not read a file
> into main memory.


Okay, that was a bad question. I meant to ask: where will the
remaining portion of the file be stored, since it could not be loaded
in one go?

 