Deleting first N lines from a text file

 
 
pozz
      11-15-2011
I want to delete the first N lines from a text file. I imagine two
approaches:
- use a temporary file to copy the last lines only
- use the same file to move characters starting from line N+1 to the
beginning

The temporary file could be more complex to write (at the end I have to
delete the original file and rename the temporary file), but at any
moment I have a coherent text file. So this approach is safe if the
application crashes during the deleting process. If the application
crashes just after deleting the original text file but before renaming
the temporary file, during initialization I can detect this situation
and proceed with the renaming.

The second approach is simpler, but leaves a malformed text file on
the filesystem if the application crashes during the deleting process.

What do you think about those thoughts? Do you agree with me?

My "deleting first N lines" function is:

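/* filename and tmpfilename are file-scope strings defined elsewhere. */
int text_delete(unsigned int N) {
    FILE *f;
    FILE *ftmp;
    int c;
    int failed;

    f = fopen(filename, "r");
    if (f == NULL) return -1;
    ftmp = fopen(tmpfilename, "w");
    if (ftmp == NULL) {
        fclose(f);
        return -1;
    }
    /* Skip the first N lines; N == 0 skips nothing. */
    while (N > 0 && (c = fgetc(f)) != EOF) {
        if (c == '\n') N--;
    }
    /* Copy the remainder to the temporary file. */
    while ((c = fgetc(f)) != EOF) {
        if (fputc(c, ftmp) == EOF) break;
    }
    failed = ferror(f) || ferror(ftmp);
    fclose(f);
    if (fclose(ftmp) == EOF) failed = 1;
    if (failed) {
        remove(tmpfilename);
        return -1;
    }
    /* remove() and rename() return nonzero on failure. */
    if (remove(filename) != 0) return -1;
    if (rename(tmpfilename, filename) != 0) return -1;
    return 0;
}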

At initialization I try to open the text file or the temporary file:

int text_init(void) {
    FILE *f;

    f = fopen(filename, "r");
    if (f == NULL) {
        /* Does the temporary file exist? */
        f = fopen(tmpfilename, "r");
        if (f != NULL) {
            /* Yes: a previous run stopped between remove() and rename(). */
            fclose(f);
            if (rename(tmpfilename, filename) != 0) return -1;
        } else {
            /* Create an empty log file... */
            f = fopen(filename, "w");
            if (f == NULL) return -1;
            fclose(f);
        }
    } else {
        fclose(f);
    }
    return 0;
}
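
The window in which neither file exists comes from calling remove() before rename(). ISO C leaves rename() onto an existing file implementation-defined, but POSIX specifies that it replaces the destination atomically, so on a POSIX system the remove() step can probably be dropped. The following is only a sketch under that assumption; `replace_without_first_lines` and its parameters are illustrative names, not part of the code above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>   /* fsync: POSIX, not ISO C */

/* Hypothetical helper: write `path` minus its first n lines to `tmp_path`,
 * then rename the temporary file over the original.
 * Returns 0 on success, -1 on failure. */
int replace_without_first_lines(const char *path, const char *tmp_path,
                                unsigned int n)
{
    FILE *in = fopen(path, "r");
    if (in == NULL) return -1;
    FILE *out = fopen(tmp_path, "w");
    if (out == NULL) { fclose(in); return -1; }

    int c;
    /* Skip the first n lines. */
    while (n > 0 && (c = fgetc(in)) != EOF) {
        if (c == '\n') n--;
    }
    /* Copy the rest. */
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) break;
    }

    int failed = ferror(in) || ferror(out);
    fclose(in);
    /* Push the data to the device before the rename makes it visible. */
    if (fflush(out) == EOF || fsync(fileno(out)) != 0) failed = 1;
    if (fclose(out) == EOF) failed = 1;
    if (failed) { remove(tmp_path); return -1; }

    /* Assumes POSIX: the old file is replaced in one step, no remove(). */
    return rename(tmp_path, path) == 0 ? 0 : -1;
}
```

With this variant a crash leaves either the old file or the new one under the original name, so the recovery branch in text_init() would only ever find a stale temporary file to discard.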
 
Roberto Waltman
      11-15-2011
pozz wrote:
>I want to delete the first N lines from a file text.
>...
>The second approach is simpler,...
>...
>What do you think about those thoughts?


Only that the second approach is not simpler.
Also, depending on the underlying OS, it may not be possible to read
from and write to the same file as you propose.

--
Roberto Waltman

[ Please reply to the group,
return address is invalid ]
 
Ben Pfaff
      11-15-2011
Acid Washed China Blue Jeans <> writes:

> In article <>,
> Roberto Waltman <> wrote:
>
>> pozz wrote:
>> >I want to delete the first N lines from a file text.
>> >...
>> >The second approach is simpler,...
>> >...
>> >What do you think about those thoughts?

>>
>> Only that the second approach is not simpler.
>> Also, depending on the underlying OS, it may not be possible to read
>> from and write to the same file as you propose.

>
> Fopen with "r+". If fopen succeeds, the library has promised
> you you are allowed to read and write an existing file.


However, writing in a text file may truncate it, see 7.19.3
"Files":

Whether a write on a text stream causes the associated file
to be truncated beyond that point is implementation-defined.
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
 
Roberto Waltman
      11-15-2011
Acid Washed China Blue Jeans wrote:

>Fopen with "r+". If fopen succeeds, the library has promised you you are allowed
>to read and write an existing file.


In the general case, a write may truncate the file at the end of the
written data, so it may be OK to read from a location before the last
location written, but not after it.

And there may be environments in which fopen(..., "r+") always fails.

--
Roberto Waltman

[ Please reply to the group,
return address is invalid ]
 
Eric Sosman
      11-15-2011
On 11/14/2011 7:02 PM, pozz wrote:
> I want to delete the first N lines from a file text. I imagine two
> approaches:
> - use a temporary file to copy the last lines only


Do this.

> - use the same file to move characters starting from N+1 line to the
> beginning


Don't do this.

> The temporary file could be more complex to write (at last I have to
> delete the original file and rename the temporary file), but at any
> moment I have a coherent text file. So this approach is safe if the
> application crashes during the deleting process. If the application
> crashes just after deleting the original text file but before renaming
> the temporary file, during initialization I can detect this situation
> and proceed with the renaming.
>
> The second approach is simpler, but leaves a malformed text file on
> the filesystem if the application crashes during the deleting process.
>
> What do you think about those thoughts? Do you agree with me?


No, not at all. One problem with your supposedly simpler
solution: How do you tell subsequent readers of the file that they
should stop before reaching the end? Observe that <stdio.h> offers
no way to shorten an existing file to any length other than zero.
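
To make that objection concrete, here is a rough sketch of what the in-place approach would involve. It is not code from this thread: `delete_lines_in_place` is a hypothetical name, and the final step relies on POSIX ftruncate() precisely because ISO C has no equivalent.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>   /* ftruncate: POSIX, not ISO C */

/* Hypothetical sketch: slide everything after the first n lines to the
 * start of the file, then cut off the stale tail. Returns 0 or -1. */
int delete_lines_in_place(const char *path, unsigned int n)
{
    FILE *f = fopen(path, "r+b");
    if (f == NULL) return -1;

    int c;
    while (n > 0 && (c = fgetc(f)) != EOF) {
        if (c == '\n') n--;
    }

    long rd = ftell(f);   /* next byte to read */
    long wr = 0;          /* next byte to write */
    if (rd < 0) { fclose(f); return -1; }

    char buf[4096];
    size_t got;
    int failed = 0;
    for (;;) {
        /* A seek is required when switching between reading and writing. */
        if (fseek(f, rd, SEEK_SET) != 0) { failed = 1; break; }
        got = fread(buf, 1, sizeof buf, f);
        if (got == 0) { failed = ferror(f); break; }
        if (fseek(f, wr, SEEK_SET) != 0 ||
            fwrite(buf, 1, got, f) != got) { failed = 1; break; }
        rd += (long)got;
        wr += (long)got;
    }

    if (!failed && (fseek(f, wr, SEEK_SET) != 0 || fflush(f) == EOF))
        failed = 1;
    /* The step <stdio.h> cannot do: shorten the file to wr bytes. */
    if (!failed && ftruncate(fileno(f), (off_t)wr) != 0) failed = 1;
    if (fclose(f) == EOF) failed = 1;
    return failed ? -1 : 0;
}
```

Even with the non-standard truncation available, this is longer than the temporary-file version, and a crash inside the loop leaves a file that is partly old and partly shifted.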

--
Eric Sosman
 
jacob navia
      11-15-2011
Using the containers library (and if your file fits in memory)

#include <stdio.h>
#include <stdlib.h>
#include <containers.h>

int main(int argc, char *argv[])
{
    if (argc != 3) {
        printf("Usage: deletelines <file> <N>\n");
        return -1;
    }
    /* Load the file into a string collection. */
    strCollection *data = istrCollection.CreateFromFile(argv[1]);
    if (data == NULL) return -1;
    /* Drop the first N entries and write the rest back. */
    istrCollection.RemoveRange(data, 0, atoi(argv[2]));
    istrCollection.WriteToFile(data, argv[1]);
    istrCollection.Finalize(data);
    return 0;
}

 
Giuseppe
      11-16-2011
On 15 Nov, 04:06, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> > What do you think about those thoughts? Do you agree with me?

>
> No, not at all. One problem with your supposedly simpler
> solution: How do you tell subsequent readers of the file that they
> should stop before reaching the end? Observe that <stdio.h> offers
> no way to shorten an existing file to any length other than zero.

Ok, I implemented the "temporary file" solution and it works well.
The only disadvantage is time: when the file is big (1000 lines of
about 50 bytes each), the time to delete the first line could be very
high.

Do you think the process could be reduced launching an external script
(for example, 'head' based) with system()? If I redirect the output
to the original filename I could avoid the time consuming process of
copying the original to the temporary file.
 
Keith Thompson
      11-16-2011
Giuseppe <> writes:
> On 15 Nov, 04:06, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>> > What do you think about those thoughts? Do you agree with me?

>>
>> No, not at all. One problem with your supposedly simpler
>> solution: How do you tell subsequent readers of the file that they
>> should stop before reaching the end? Observe that <stdio.h> offers
>> no way to shorten an existing file to any length other than zero.

>
> Ok, I implemented the "temporary file" solution and it works well.
> The only disadvantage is time: when the file is big (1000 lines of
> about 50 bytes each), the time to delete the first line could be very
> high.


A text file of 1000 lines of 50 bytes each really isn't all that big.
The time to copy and rename it probably won't even be noticeable.

> Do you think the process could be reduced launching an external script
> (for example, 'head' based) with system()? If I redirect the output
> to the original filename I could avoid the time consuming process of
> copying the original to the temporary file.


The behavior of external programs is outside the scope of the C language.

(But I'll mention that on Unix-like systems, running a command with its
input and output directed to the same file can cause serious problems;
it can easily end up reading a partially modified version of the file
instead of the original. And even if it works, it's likely going to be
doing the same thing you would have done in your program.)

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Eric Sosman
      11-16-2011
On 11/15/2011 7:57 PM, Giuseppe wrote:
> On 15 Nov, 04:06, Eric Sosman<esos...@ieee-dot-org.invalid> wrote:
>>> What do you think about those thoughts? Do you agree with me?

>>
>> No, not at all. One problem with your supposedly simpler
>> solution: How do you tell subsequent readers of the file that they
>> should stop before reaching the end? Observe that <stdio.h> offers
>> no way to shorten an existing file to any length other than zero.

>
> Ok, I implemented the "temporary file" solution and it works well.
> The only disadvantage is time: when the file is big (1000 lines of
> about 50 bytes each), the time to delete the first line could be very
> high.


Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!

For "really big" files (terabytes) copying most of the file from
one place to another could take an unacceptably long time. Also, the
need to find space for a second nearly complete copy could be
troublesome. In such cases you'd be justified in seeking fancier
solutions -- but I sincerely doubt that "slide all those terabytes
a couple hundred positions leftward" would produce a savings. More
likely it would produce a slowdown, plus the risks you've already
mentioned about data loss in the event of an error. No, the fancier
solution would probably involve some kind of an index external to the
file, describing which parts of the file were "live" and which "dead,"
and fancier routines to read just the live parts.
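
One very reduced form of such an index, sketched here purely as an illustration, is a sidecar file holding the byte offset of the first "live" byte. The function names and the one-number index format are assumptions made for this example, not something proposed in the thread.

```c
#include <stdio.h>

/* Hypothetical index format: a single decimal byte offset. */
static long read_live_offset(const char *idx_path)
{
    long off = 0;
    FILE *idx = fopen(idx_path, "r");
    if (idx == NULL) return 0;              /* no index: whole file is live */
    if (fscanf(idx, "%ld", &off) != 1 || off < 0) off = 0;
    fclose(idx);
    return off;
}

/* Open the data file positioned at its first live byte. */
FILE *open_live(const char *path, const char *idx_path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;
    if (fseek(f, read_live_offset(idx_path), SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}

/* Mark the first n live lines as dead by advancing the stored offset.
 * The data file itself is never rewritten. Returns 0 or -1. */
int logical_delete_lines(const char *path, const char *idx_path,
                         unsigned int n)
{
    FILE *f = open_live(path, idx_path);
    if (f == NULL) return -1;
    int c;
    while (n > 0 && (c = fgetc(f)) != EOF) {
        if (c == '\n') n--;
    }
    long off = ftell(f);
    fclose(f);
    if (off < 0) return -1;

    FILE *idx = fopen(idx_path, "w");
    if (idx == NULL) return -1;
    int ok = fprintf(idx, "%ld\n", off) > 0;
    if (fclose(idx) == EOF) ok = 0;
    return ok ? 0 : -1;
}
```

Deleting lines then costs one small write regardless of file size, at the price that every reader must go through open_live() and that the dead prefix still occupies disk space until some later compaction.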

> Do you think the process could be reduced launching an external script
> (for example, 'head' based) with system()? If I redirect the output
> to the original filename I could avoid the time consuming process of
> copying the original to the temporary file.


First, just what do you imagine the "head" program does, hmmm?

However, on the systems I've encountered that provide a "head"
utility and support "redirection," your solution is likely to run
very quickly indeed. And save a lot of disk space, too! (Hint:
Try it yourself: `head <foo.txt >foo.txt', then `ls -l foo.txt',
and then you get to test your backups ...)

But all this is mostly beside the point. You are worried about
the time to copy 50K bytes: Have you *measured* the time? Have you
actually found it to be a problem for your application? Or are you
just imagining monsters under your bed? The fundamental theorem of
all optimization is There Are No Monsters Until You've Measured Them.
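
Measuring need not be elaborate. As a minimal sketch, assuming a POSIX system that provides clock_gettime() with CLOCK_MONOTONIC (the helper name `now_ms` is made up for this example):

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Monotonic wall-clock time in milliseconds, or -1.0 if unavailable. */
double now_ms(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) return -1.0;
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}
```

Calling now_ms() immediately before and after the deletion routine and printing the difference, ideally over several runs on the real target, gives a number to argue about. Wall-clock time is used rather than clock() because time spent waiting on the storage device is probably what matters here.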

--
Eric Sosman
 
pozz
      11-16-2011
On 16 Nov, 02:50, Keith Thompson <ks...@mib.org> wrote:
> Giuseppe <giuseppe.modu...@gmail.com> writes:
> > Ok, I implemented the "temporary file" solution and it works well.
> > The only disadvantage is time: when the file is big (1000 lines of
> > about 50 bytes each), the time to delete the first line could be very
> > high.

>
> A text file of 1000 lines of 50 bytes each really isn't all that big.
> The time to copy and rename it probably won't even be noticeable.


It takes about 100 ms to finish the shrink procedure. It's not a long
time on a desktop PC, but I'm working on embedded Linux based on an ARM9
processor.

This is the slowest part of my application. Anyway, I'm wondering whether
there are some simple improvements to reduce the time taken by this task.
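
One simple change that might be worth timing is to copy the remainder in blocks with fread()/fwrite() instead of one character at a time. This is only a sketch: `copy_rest` is a hypothetical helper, the 4096-byte block size is an arbitrary choice, and since stdio already buffers fgetc()/fputc() the gain may turn out to be small next to the cost of writing to the storage device.

```c
#include <stdio.h>

/* Copy everything from the current position of `in` to `out` in blocks.
 * Returns 0 on success, -1 on a read or write error. */
int copy_rest(FILE *in, FILE *out)
{
    char buf[4096];   /* arbitrary block size; tune by measuring */
    size_t got;
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, got, out) != got) return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

It would replace the second fgetc()/fputc() loop of the deletion function, after the first N lines have been skipped.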


> > Do you think the process could be reduced launching an external script
> > (for example, 'head' based) with system()? If I redirect the output
> > to the original filename I could avoid the time consuming process of
> > copying the original to the temporary file.

>
> The behavior of external program is outside the scope of the C language.


Oh, I know, I was asking for an "off-topic" opinion.


> (But I'll mention that on Unix-like systems, running a command with its
> input and output directed to the same file can cause serious problems;
> it can easily end up reading a partially modified version of the file
> instead of the original. And even if it works, it's likely going to be
> doing the same thing you would have done in your program.)


Ok, I'll not try.
 