On 11/15/2011 7:57 PM, Giuseppe wrote:
> On 15 Nov, 04:06, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>>> What do you think about those thoughts? Do you agree with me?
>>
>> No, not at all. One problem with your supposedly simpler
>> solution: How do you tell subsequent readers of the file that they
>> should stop before reaching the end? Observe that <stdio.h> offers
>> no way to shorten an existing file to any length other than zero.
>
> Ok, I implemented the "temporary file" solution and it works well. The
> only disadvantage is time: when the file is big (1000 lines of about
> 50 bytes each), the time to delete the first line could be very high.
Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!
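
For reference, here is a minimal sketch of what a "temporary file"
solution might look like. It is not your code, only an illustration:
the function name drop_first_line and both of its parameters are
hypothetical, and error handling is kept to the bare minimum.

```c
#include <stdio.h>

/* Copy everything after the first line of `path` into `tmp_path`,
 * then put the copy in place of the original.
 * Returns 0 on success, -1 on failure.
 * (Hypothetical helper; the names are made up for this sketch.) */
int drop_first_line(const char *path, const char *tmp_path)
{
    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;

    FILE *out = fopen(tmp_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    int c;

    /* Skip the first line, including its terminating newline. */
    while ((c = getc(in)) != EOF && c != '\n')
        ;

    /* Copy the remainder of the file unchanged. */
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            break;
    }

    int failed = ferror(in) || ferror(out);
    fclose(in);
    if (fclose(out) == EOF)
        failed = 1;

    if (failed) {
        remove(tmp_path);   /* leave the original untouched */
        return -1;
    }

    /* ISO C leaves rename() onto an existing file implementation-
     * defined, so the original is removed first.  If rename() then
     * fails, the data survives only under tmp_path. */
    if (remove(path) != 0 || rename(tmp_path, path) != 0)
        return -1;
    return 0;
}
```

A call such as drop_first_line("data.txt", "data.tmp") would leave
data.txt holding every line but the first. On POSIX systems rename()
replaces an existing file in one step, so the remove() call could be
dropped there to close the window in which neither name exists.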
For "really big" files (terabytes) copying most of the file from
one place to another could take an unacceptably long time. Also, the
need to find space for a second nearly complete copy could be
troublesome. In such cases you'd be justified in seeking fancier
solutions -- but I sincerely doubt that "slide all those terabytes
a couple hundred positions leftward" would produce a savings. More
likely it would produce a slowdown, plus the risks you've already
mentioned about data loss in the event of an error. No, the fancier
solution would probably involve some kind of an index external to the
file, describing which parts of the file were "live" and which "dead,"
and fancier routines to read just the live parts.
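
As a rough illustration of the external-index idea, the sketch below
keeps a single number in a side file: the byte offset where the live
data begins. This is the simplest index I can think of, not a
recommendation; a real one would probably track several live and dead
ranges. All the names (read_live_start, mark_first_line_dead,
open_live, and the side-file format) are invented for the example.

```c
#include <stdio.h>

/* Return the offset of the first live byte recorded in the index
 * file, or 0 if the index is missing or unreadable. */
long read_live_start(const char *idx_path)
{
    long off = 0;
    FILE *idx = fopen(idx_path, "r");
    if (idx != NULL) {
        if (fscanf(idx, "%ld", &off) != 1 || off < 0)
            off = 0;
        fclose(idx);
    }
    return off;
}

/* "Delete" the first live line by advancing the recorded offset
 * past it.  The data file itself is never rewritten.
 * Returns 0 on success, -1 on failure. */
int mark_first_line_dead(const char *data_path, const char *idx_path)
{
    long off = read_live_start(idx_path);

    /* Binary mode, so that offsets are plain byte counts. */
    FILE *data = fopen(data_path, "rb");
    if (data == NULL)
        return -1;
    if (fseek(data, off, SEEK_SET) != 0) {
        fclose(data);
        return -1;
    }

    int c;
    while ((c = getc(data)) != EOF && c != '\n')
        ;
    long new_off = ftell(data);
    fclose(data);
    if (new_off < 0)
        return -1;

    FILE *idx = fopen(idx_path, "w");
    if (idx == NULL)
        return -1;
    fprintf(idx, "%ld\n", new_off);
    return fclose(idx) == EOF ? -1 : 0;
}

/* Open the data file positioned at the first live byte.
 * Returns NULL on failure. */
FILE *open_live(const char *data_path, const char *idx_path)
{
    FILE *data = fopen(data_path, "rb");
    if (data != NULL
        && fseek(data, read_live_start(idx_path), SEEK_SET) != 0) {
        fclose(data);
        data = NULL;
    }
    return data;
}
```

The cost of a deletion here is reading one line and writing a few
digits, no matter how large the data file is. The price is that every
reader must go through open_live() (or something like it) instead of
a plain fopen(), and the dead space is never reclaimed.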
> Do you think the process could be reduced launching an external script
> (for example, 'head' based) with system()? If I redirect the output to
> the original filename I could avoid the time consuming process of
> copying the original to the temporary file.
First, just what do you imagine the "head" program does, hmmm?
However, on the systems I've encountered that provide a "head"
utility and support "redirection," your solution is likely to run
very quickly indeed. And save a lot of disk space, too! (Hint:
Try it yourself: `head <foo.txt >foo.txt', then `ls -l foo.txt',
and then you get to test your backups ...)
But all this is mostly beside the point. You are worried about
the time to copy 50K bytes: Have you *measured* the time? Have you
actually found it to be a problem for your application? Or are you
just imagining monsters under your bed? The fundamental theorem of
all optimization is There Are No Monsters Until You've Measured Them.
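
Measuring need not be elaborate. The sketch below times one call of
an arbitrary function with C11's timespec_get(), which reads the
wall clock (clock() would report CPU time, which is not what matters
for file I/O). The names time_call and write_sample are made up for
the example, and write_sample is only a stand-in workload that writes
1000 lines of 50 bytes; substitute whatever operation you are
actually worried about.

```c
#include <stdio.h>
#include <time.h>

/* Run fn(arg) once and return the elapsed wall-clock time in
 * seconds.  Requires C11 for timespec_get(). */
double time_call(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);
    fn(arg);
    timespec_get(&t1, TIME_UTC);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Stand-in workload: write 1000 lines of 49 digits plus a newline
 * to the file named by arg. */
void write_sample(void *arg)
{
    FILE *f = fopen((const char *)arg, "w");
    if (f == NULL)
        return;
    for (int i = 0; i < 1000; i++)
        fprintf(f, "%049d\n", i);
    fclose(f);
}
```

Something like printf("%.6f s\n", time_call(write_sample,
"sample.txt")) prints the figure. A single run is noisy, so repeat it
a few times before drawing conclusions.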
--
Eric Sosman