Deleting first N lines from a text file

 
 
pozz
Guest
Posts: n/a
 
      11-16-2011
On 16 Nov, 03:48, Eric Sosman <(E-Mail Removed)> wrote:
> On 11/15/2011 7:57 PM, Giuseppe wrote:
> > Ok, I implemented the "temporary file" solution and it works well. The
> > only disadvantage is time: when the file is big (1000 lines of about
> > 50 bytes each), the time to delete the first line could be very high.

>
> Fifty K shouldn't take long. Even on a system from forty years
> ago it didn't take long. Even on paper tape, for goodness' sake, it
> took less than a minute!


100ms (see my answer to Keith above). It's not too much, but I was
thinking about improvements.
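For reference, a minimal sketch of the temporary-file approach under
discussion, assuming a POSIX-style rename() that replaces an existing
destination; the file name and function name here are made up for
illustration:

#include <stdio.h>

/* Sketch: drop the first n lines of the file at 'path' by copying the
   remainder to a temporary file and renaming it over the original.
   Error handling is abbreviated. */
static int delete_first_lines(const char *path, int n)
{
    FILE *f = fopen(path, "r");
    FILE *tmp = fopen("shrink.tmp", "w");
    int c;

    if (f == NULL || tmp == NULL)
        return -1;

    while (n > 0 && (c = fgetc(f)) != EOF)   /* skip the first n lines */
        if (c == '\n')
            n--;

    while ((c = fgetc(f)) != EOF)            /* copy the rest verbatim */
        fputc(c, tmp);

    fclose(f);
    if (fclose(tmp) != 0)
        return -1;

    /* POSIX rename() replaces an existing file atomically; on Windows
       the original would have to be removed first. */
    return rename("shrink.tmp", path) == 0 ? 0 : -1;
}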


> For "really big" files (terabytes), copying most of the file from
> one place to another could take an unacceptably long time. Also, the
> need to find space for a second nearly complete copy could be
> troublesome. In such cases you'd be justified in seeking fancier
> solutions -- but I sincerely doubt that "slide all those terabytes
> a couple hundred positions leftward" would produce a savings. More
> likely it would produce a slowdown, plus the risks you've already
> mentioned about data loss in the event of an error. No, the fancier
> solution would probably involve some kind of an index external to the
> file, describing which parts of the file were "live" and which "dead,"
> and fancier routines to read just the live parts.


Ok.
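A toy illustration of that fancier index idea, here kept inside the
file itself rather than external to it: a fixed-width header stores
the offset of the first live byte, so "deleting" lines merely rewrites
the header and no data moves until some later compaction pass. All
names are invented for the sketch, which assumes the stream was opened
in "r+" mode:

#include <stdio.h>

#define HDR_LEN 11   /* "%010ld\n": ten digits plus a newline */

/* Position the stream at the first live byte, per the header. */
static int seek_live(FILE *f)
{
    long off;
    rewind(f);
    if (fscanf(f, "%10ld", &off) != 1 || off < HDR_LEN)
        return -1;
    return fseek(f, off, SEEK_SET);
}

/* "Delete" n lines by advancing the stored offset. */
static int drop_lines(FILE *f, int n)
{
    long off;
    int c;

    if (seek_live(f) != 0)
        return -1;
    while (n > 0 && (c = fgetc(f)) != EOF)
        if (c == '\n')
            n--;
    off = ftell(f);
    rewind(f);   /* a positioning call is required between read and write */
    return fprintf(f, "%010ld\n", off) < 0 ? -1 : 0;
}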


> > Do you think the process could be reduced by launching an external
> > script (for example, 'head' based) with system()? If I redirect the
> > output to the original filename I could avoid the time-consuming
> > process of copying the original to the temporary file.

>
> First, just what do you imagine the "head" program does, hmmm?
>
> However, on the systems I've encountered that provide a "head"
> utility and support "redirection," your solution is likely to run
> very quickly indeed. And save a lot of disk space, too! (Hint:
> Try it yourself: `head <foo.txt >foo.txt', then `ls -l foo.txt',
> and then you get to test your backups ...)





> But all this is mostly beside the point. You are worried about
> the time to copy 50K bytes: Have you *measured* the time? Have you
> actually found it to be a problem for your application? Or are you
> just imagining monsters under your bed? The fundamental theorem of
> all optimization is There Are No Monsters Until You've Measured Them.

 
 
 
 
 
Phil Carmody
Guest
Posts: n/a
 
      11-16-2011
Acid Washed China Blue Jeans <(E-Mail Removed)> writes:
> In article <(E-Mail Removed)>,
> Roberto Waltman <(E-Mail Removed)> wrote:
> > pozz wrote:
> > >I want to delete the first N lines from a text file.
> > >...
> > >The second approach is simpler,...
> > >...
> > >What do you think about those thoughts?

> >
> > Only that the second approach is not simpler.
> > Also, depending on the underlying OS, it may not be possible to read
> > from and write to the same file as you propose.

>
> Fopen with "r+". If fopen succeeds, the library has promised that you
> are allowed to read and write an existing file.


Being allowed to write to it at the point that you open the file
doesn't mean that it's possible to write to the file at any point
later in time.

Think wire-cutters.
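In other words: the stream can stop being writable (disk full, media
removed, cable cut) long after fopen() succeeded. A paranoid copy loop
checks every write and the final flush, along these lines (a sketch;
the function name is invented):

#include <stdio.h>

/* Sketch: copy 'in' to 'out', reporting failure instead of silently
   leaving a truncated output file. */
static int checked_copy(FILE *in, FILE *out)
{
    int c;

    while ((c = fgetc(in)) != EOF)
        if (fputc(c, out) == EOF)
            return -1;        /* write failed partway through */
    if (ferror(in))
        return -1;            /* EOF was really a read error */
    if (fflush(out) != 0)
        return -1;            /* buffered bytes never reached the file */
    return 0;
}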

Phil
--
Unix is simple. It just takes a genius to understand its simplicity
-- Dennis Ritchie (1941-2011), Unix Co-Creator
 
 
 
 
 
jgharston
Guest
Posts: n/a
 
      11-16-2011
pozz wrote:
> It takes about 100ms to finish the shrink procedure. It's not a long
> time on a desktop PC, but I'm working on embedded Linux based on an
> ARM9 processor.


Are you doing it byte by byte? Try buffering: even chunks of 16 bytes
at a time will speed it up significantly. What's the biggest chunk of
memory you can claim, use, and release without memory fragmentation
hurting your program more than is acceptable?

JGH
 
 
-.-
Guest
Posts: n/a
 
      11-16-2011
jacob navia was trying to save the world with his stuff:

> Using the containers library (and if your file fits in memory)
>
> #include <containers.h>


You self-celebrating ****o. Nothing exists for you but your own
things: that silly lcc-win and your funny containers.
Stop making this newsgroup your personal advertising page.


 
 
jacob navia
Guest
Posts: n/a
 
      11-16-2011
On 16/11/11 14:01, -.- wrote:
> jacob navia was trying to save the world with his stuff:
>
>> Using the containers library (and if your file fits in memory)
>>
>> #include <containers.h>

>
> You self-celebrating ****o.


That is why you hide behind a pseudonym: so much for the courage of
your opinions...

 
 
BartC
Guest
Posts: n/a
 
      11-16-2011


"pozz" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On 16 Nov, 03:48, Eric Sosman <(E-Mail Removed)> wrote:
>> On 11/15/2011 7:57 PM, Giuseppe wrote:
>> > Ok, I implemented the "temporary file" solution and it works well. The
>> > only disadvantage is time: when the file is big (1000 lines of about
>> > 50 bytes each), the time to delete the first line could be very high.

>>
>> Fifty K shouldn't take long. Even on a system from forty years
>> ago it didn't take long. Even on paper tape, for goodness' sake, it
>> took less than a minute!


(That's a fast paper tape reader. The last one I used would have taken
nearly 3 hours.)

> 100ms (see my answer to Keith above). It's not too much, but I was
> thinking about improvements.


How long for a file containing ten lines instead of 1000? How long for
double the number of lines?

That will tell you the overheads involved and the fastest speed achievable.

While you're at it, how long does it take to create a file, write 50,000
bytes to it (of anything), and close it? And how long to read such a file back?

Take care when taking measurements, to eliminate the effects of
disk-caching.
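One way to take those measurements is with the standard clock(); note
that clock() reports processor time, so for I/O-bound work a wall-clock
source (e.g. POSIX clock_gettime with CLOCK_MONOTONIC) is more telling.
A sketch, where shrink_file() is just a placeholder for whatever routine
is being timed:

#include <stdio.h>
#include <time.h>

extern void shrink_file(void);   /* placeholder for the code under test */

int main(void)
{
    clock_t t0, t1;

    t0 = clock();
    shrink_file();
    t1 = clock();

    printf("elapsed: %.1f ms (CPU time)\n",
           1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC);
    return 0;
}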

--
Bartc

 
 
jgharston
Guest
Posts: n/a
 
      11-16-2011
Try replacing:
> while ((c = fgetc(f)) != EOF) {
>     fputc(c, ftmp);
> }

with:
size_t bsize, numread;
char *buff;

bsize = m_free(0);     /* nonstandard -- see my follow-up below */
buff = m_alloc(bsize);
numread = (size_t)-1;  /* nonzero, so the loop runs at least once */

while (numread) {
    numread = fread(buff, 1, bsize, f);
    fwrite(buff, 1, numread, ftmp);
}
m_free(buff);

As is usenet tradition, completely untested.

JGH
 
 
jgharston
Guest
Posts: n/a
 
      11-16-2011
jgharston wrote:
> bsize = m_free(0);
> buff = m_alloc(bsize);


Following up my own post: that call to m_free(0) is supposed to return
the size of a free block that can subsequently be claimed with
m_alloc(). A bit of a skim through the web shows that functionality
isn't in any of the malloc libraries documented there. All I can say is
that it worked 25 years ago, and it inspired me to include that
functionality in my own malloc library.

Just replace bsize = m_free(0) with a suitable bsize = (some method of
deciding how much memory to claim).
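For instance, a purely illustrative fallback that starts at an
arbitrary 64 KiB and halves until malloc() succeeds:

#include <stdlib.h>

/* Inside whatever function does the copy: */
size_t bsize = 64 * 1024;   /* arbitrary starting size */
char *buff = malloc(bsize);

while (buff == NULL && bsize > 512) {
    bsize /= 2;             /* halve until an allocation fits */
    buff = malloc(bsize);
}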

JGH
 
 
Keith Thompson
Guest
Posts: n/a
 
      11-16-2011
jgharston <(E-Mail Removed)> writes:
> Try replacing:
>> while ((c = fgetc(f)) != EOF) {
>>     fputc(c, ftmp);
>> }

>
> with:
> size_t bsize, numread;
> char *buff;
>
> bsize = m_free(0);
> buff = m_alloc(bsize);
> numread = (size_t)-1;
>
> while (numread) {
>     numread = fread(buff, 1, bsize, f);
>     fwrite(buff, 1, numread, ftmp);
> }
> m_free(buff);
>
> As is usenet tradition, completely untested.


Leaving aside the m_free and m_alloc calls, why do you assume that this
will be significantly faster than the fgetc/fputc loop? stdio does its
own buffering.
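And if stdio's default BUFSIZ buffer really were the bottleneck, the
streams could simply be given bigger buffers with the standard
setvbuf() rather than hand-rolling fread/fwrite. A sketch, with an
arbitrary 64 KiB size and an invented function name:

#include <stdio.h>

static char inbuf[64 * 1024], outbuf[64 * 1024];

/* Must be called after fopen() and before any other I/O on the streams. */
void use_big_buffers(FILE *f, FILE *ftmp)
{
    setvbuf(f, inbuf, _IOFBF, sizeof inbuf);
    setvbuf(ftmp, outbuf, _IOFBF, sizeof outbuf);
}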

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
jgharston
Guest
Posts: n/a
 
      11-16-2011
Keith Thompson wrote:
> Leaving aside the m_free and m_alloc calls, why do you assume that this
> will be significantly faster than the fgetc/fputc loop? stdio does its
> own buffering.


As I recall, this was a standard exam question back when I worra
litt'un. When doing bulk data copying, the program's own buffer is
likely to be bigger than stdio's, and a bulk read/write cycle is more
efficient for simply chucking large lumps of data from one place to
another; one part of the saving is skipping fgetc's unget machinery.

JGH
 
 
 
 