Velocity Reviews - Computer Hardware Reviews


Splitting text files?

 
 
MM
      07-08-2003
Hi

I have never written any C programs before, but it seems that I need to do
so now. Hope some of you out there can spend a few minutes and help me by
writing a simple example of something fairly similar to what I need. I
really think it is a simple matter if you know C programming, but to me it
is not easy at all. An example from some "professional" C programmer will
probably give me all I need to complete it into exactly what I need.

Basically I need it to split, in a specific way, large text files containing
experimental data (stored in a known "form", see example below) into some
smaller files. I will later use MATLAB to handle the smaller files.
Theoretically I could use MATLAB to do it all (split the data file as well),
but when I tried this it took WAY too long (not possible, since I will
use this in another system). MATLAB is not really optimized to read/write
large text files (if the files are not structured in some ways...). And yes,
I need to do it all in C (not C++, VB, Fortran, Perl...).

Below is an example of the structure of the type of text file I will need to
split. Suppose the file name of this file is "simdata.txt". Opening this file
for reading is probably one of the first things to do.

First there are some header lines. The header ends when the word "\Data:"
(without quotes) is found. All header lines are to be saved into a new file,
say "header.dat".

When "\Data:" has been identified, the first word "Time" is to be
identified. It probably follows on the next row (after "\Data:"), but one
cannot be absolutely sure of this. However, "Time" can be assumed to be the
first word in its row. So, when the word "Time" is identified, the first
data block starts (including that row!). This block ends when the next
block is identified in a similar way. Each data block is to be saved as
an individual file, say "data1.dat", "data2.dat", and "data3.dat". We can
assume there are three blocks.

Hope this information is sufficient and that someone can help me with this.
I really need it, and cannot do much more without it.

Best regards,

MM

########################################
########### Example of file to split ###########
########################################

header line 1
header line 2
header line 3
.......
.......
.......
header line (last one)
\Data:
Time parameter2 parameter3 parameter4 ...
....... This is data block 1
....... This is data block 1
....... This is data block 1
....... This is data block 1
....... This is data block 1
....... This is data block 1
....... This is data block 1
....... This is data block 1
Time parameter5 parameter6 parameter7 ...
....... This is data block 2
....... This is data block 2
....... This is data block 2
....... This is data block 2
....... This is data block 2
....... This is data block 2
....... This is data block 2
....... This is data block 2
Time parameter8 parameter9 parameter10 ...
....... This is data block 3
....... This is data block 3
....... This is data block 3
....... This is data block 3
....... This is data block 3
....... This is data block 3
....... This is data block 3
....... This is data block 3

########################################
############# End of example #############
########################################



 
Pieter Droogendijk
      07-08-2003
On Tue, 8 Jul 2003 15:55:05 +0200, MM wrote:
> "Tom St Denis" <(E-Mail Removed)> wrote in message
> news:_FzOa.83651$(E-Mail Removed) le.rogers.com...
> > MM wrote:
> > > Hi

> >
> > How is summer school going?
> >
> > Fail much?
> >
> > Tom
> >

> So, what's wrong with you? Tired of your tedious job? I'm not, which
> is why I take on (for me) challenging tasks in my job.


Like asking a newsgroup to solve your problem?

Oh, and top posting is severely frowned upon.

--
main(int c,char*k,char*s){c>0?main(0,"adceoX$_k6][^hn","-7\
0#05&'40$.6'+).3+1%30"),puts(""):*s?c=!c?-*sputchar(45),c
),putchar(main(c,k+=*s-c*-1,s+1))s=0);return!s?10:10+*k;}
 
David Rubin
      07-08-2003
MM wrote:

The following is untested...

[snip - split this]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
    FILE *fp = NULL;     /* current output file; NULL until the first "Time" line */
    char fname[4+2+4+1]; /* dataNN.txt; assumes fewer than 100 blocks */
    char buf[256];       /* max line length is 255 characters */
    int i = 0;

    /* find start of data segment */
    while(fgets(buf, sizeof buf, stdin) != 0){
        /* fgets keeps the newline, so compare only the first 6 characters */
        if(strncmp("\\Data:", buf, 6) == 0)
            break;
    }

    while(fgets(buf, sizeof buf, stdin) != 0){
        /* lines starting with '#' are skipped as comments */
        /* blank lines are also skipped */
        if(buf[0] == '#' || buf[0] == '\n')
            continue;

        /* write each block to a separate file */
        if(strncmp("Time", buf, 4) == 0){
            if(fp != NULL)
                fclose(fp);
            sprintf(fname, "data%02d.txt", ++i);
            if((fp=fopen(fname, "w")) == 0){
                perror(fname);
                exit(EXIT_FAILURE);
            }
        }
        /* lines between "\Data:" and the first "Time" line are ignored */
        if(fp != NULL)
            fputs(buf, fp);
    }
    if(fp != NULL)
        fclose(fp);
    return 0;
}
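
A caveat on the 255-character line limit mentioned in the comments above: when a line is longer than the buffer, fgets returns it in several pieces, and a later piece that happens to begin with "Time" would be mistaken for the start of a new block. One possible way around this, sketched below, is to remember whether the previous read ended in a newline. The helper name count_line_starts and the deliberately small buffer are assumptions made for illustration, not part of the posted program, and the prefix is assumed to be shorter than the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in `in` that begin with `prefix`,
   even when a line is longer than the read buffer.
   Assumes strlen(prefix) is smaller than the buffer size. */
int count_line_starts(FILE *in, const char *prefix)
{
    char buf[16];            /* deliberately small, to force chunked reads */
    int at_line_start = 1;   /* does the next chunk begin a new line? */
    int count = 0;
    size_t plen = strlen(prefix);

    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);

        /* only a chunk that really starts a line may match the prefix */
        if (at_line_start && strncmp(buf, prefix, plen) == 0)
            count++;

        /* a chunk ending in '\n' completes the current line */
        at_line_start = (len > 0 && buf[len - 1] == '\n');
    }
    return count;
}
```

Calling count_line_starts(stdin, "Time") would report how many blocks the input contains. The same at_line_start flag could be added to the splitting loop so that only true line starts open a new output file.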

HTH,

/david

--
Andre, a simple peasant, had only one thing on his mind as he crept
along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
-- unknown
 
Bertrand Mollinier Toublet
      07-08-2003
MM wrote:
> Hi
>
> I have never written any C programs before, but it seems that I need to do
> so now. Hope some of you out there can spend a few minutes and help me by
> writing a simple example of something fairly similar to what I need. I
> really think it is a simple matter if you know C programming, but to me it
> is not easy at all. An example from some "professional" C programmer will
> probably give me all I need to complete it into exactly what I need.
>
> Basically I need it to split, in a specific way, large text files containing
> experimental data (stored in a known "form", see example below) into some
> smaller files. I will later use MATLAB to handle the smaller files.
> Theoretically I could use MATLAB to do it all (split the data file as well),
> but when I tried this it took WAY too long (not possible, since I will
> use this in another system). MATLAB is not really optimized to read/write
> large text files (if the files are not structured in some ways...). And yes,
> I need to do it all in C (not C++, VB, Fortran, Perl...).
>

Don't pay too much attention to Tom St Denis, he has a pretty big mouth.

As others have pointed out, bottom-posting is the rule in c.l.c, and so
is not doing people's work for them. On the other hand, here's a handful
of advice:

- it might be presumptuous to take on a C project without having a
  few basic notions of the language. If you are as serious as you claim
  about your job and taking on challenging tasks, do get Kernighan &
  Ritchie 2nd ed. to learn about the language. I would even think that
  when you are through with the book, you should be well able to solve
  your little problem by yourself.
- nonetheless, if you want to skip the concepts part and start
  fighting with your little program, you should definitely explore the
  functions fopen, fgets, strcmp, fputs, fclose. Have a look at, say, the
  ggets library, if only to get an idea of the common issues involved with
  I/O in C.
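
As a rough illustration of how the functions named above fit together, the sketch below copies the header lines of one file into a second file and stops at a marker line. The function name copy_until_marker, its return convention and the 255-character line limit are assumptions made for this example only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy lines from src_path to dst_path until a line
   equal to `marker` is read. Returns the number of lines copied, or -1 if
   either file cannot be opened. */
long copy_until_marker(const char *src_path, const char *dst_path,
                       const char *marker)
{
    char buf[256];    /* assumes lines of at most 255 characters */
    char line[256];   /* copy of buf without its line ending */
    long copied = 0;
    FILE *in, *out;

    in = fopen(src_path, "r");
    if (in == NULL)
        return -1;
    out = fopen(dst_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    while (fgets(buf, sizeof buf, in) != NULL) {
        /* fgets keeps the newline, so strip it before the exact comparison */
        strcpy(line, buf);
        line[strcspn(line, "\r\n")] = '\0';
        if (strcmp(line, marker) == 0)
            break;
        fputs(buf, out);
        copied++;
    }

    fclose(out);
    fclose(in);
    return copied;
}
```

For the file described in the original post, the call would look like copy_until_marker("simdata.txt", "header.dat", "\\Data:"). Whether the marker line itself belongs in the header file is not specified there; this sketch leaves it out.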

--
Bertrand Mollinier Toublet
"Reality exists" - Richard Heathfield, 1 July 2003

 
MM
      07-08-2003
Ok, I get it. But, the alternative for me would be to say "Now, I cannot do
this - it will have to wait until after summer". Of course there are people
in my company that could help me with this, but since it is summer and
pretty much everyone is on holidays, then I have to try to find other ways
to solve the problems I encounter. I thought one way was to ask people who
really know C programming. Maybe I was wrong... But I still hope that there
ARE people who can understand what I need and are willing to help me.

"Pieter Droogendijk" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) g...
> Like asking a newsgroup to solve your problem?
>
> Oh, and top posting is severely frowned upon.
>
> --
> main(int c,char*k,char*s){c>0?main(0,"adceoX$_k6][^hn","-7\
> 0#05&'40$.6'+).3+1%30"),puts(""):*s?c=!c?-*sputchar(45),c
> ),putchar(main(c,k+=*s-c*-1,s+1))s=0);return!s?10:10+*k;}



 
MM
      07-08-2003
Many thanks to both David for his code (I will have a look at it and see if
I can get it all to work) and Bertrand (yes, I will get to learn much more
of C, starting right away) for his advice.

If I had had a lot of time I would not have asked the NG for all this.
Instead I would have begun trying to write the program all from the
beginning myself, and only asked the NG for specific parts. But I really
don't have the time now.

By the way, what is "bottom-posting"?

MM


 
Pieter Droogendijk
      07-08-2003
Evil top-posted text.

On Tue, 8 Jul 2003 17:04:56 +0200, MM wrote:
> Many thanks to both David for his code (I will have a look at it and
> see if I can get it all to work) and Bertrand (yes, I will get to
> learn much more of C, starting right away) for his advice.


Good Non-top-posted text.

> If I had had a lot of time I would not have asked the NG for all
> this. Instead I would have begun trying to write the program all from
> the beginning myself, and only asked the NG for specific parts. But I
> really don't have the time now.
>
> By the way, what is "bottom-posting"?
>
> MM


Bottom-posting (as in the opposite of top-posting) is replying to a post
so that your own comments appear BELOW some amount of quoted text. Like
this.

--
main(int c,char*k,char*s){c>0?main(0,"adceoX$_k6][^hn","-7\
0#05&'40$.6'+).3+1%30"),puts(""):*s?c=!c?-*sputchar(45),c
),putchar(main(c,k+=*s-c*-1,s+1))s=0);return!s?10:10+*k;}
 
Bertrand Mollinier Toublet
      07-08-2003
This is top-posting (my reply above yours), frowned upon in c.l.c.

MM wrote:
> Many thanks to both David for his code (I will have a look at it and see if
> I can get it all to work) and Bertrand (yes, I will get to learn much more
> of C, starting right away) for his advice.
>
> If I had had a lot of time I would not have asked the NG for all this.
> Instead I would have begun trying to write the program all from the
> beginning myself, and only asked the NG for specific parts. But I really
> don't have the time now.
>
> By the way, what is "bottom-posting"?
>

This is bottom-posting (my reply below yours), de facto standard in c.l.c.


--
Bertrand Mollinier Toublet
"Reality exists" - Richard Heathfield, 1 July 2003

 
Mike Wahler
      07-08-2003

MM <(E-Mail Removed)> wrote in message
news:WUAOa.828$(E-Mail Removed)...
> Ok, I get it. But, the alternative for me would be to say "Now, I cannot
> do this - it will have to wait until after summer". Of course there are
> people in my company that could help me with this, but since it is summer
> and pretty much everyone is on holidays, then I have to try to find other
> ways to solve the problems I encounter. I thought one way was to ask people
> who really know C programming. Maybe I was wrong... But I still hope that
> there ARE people who can understand what I need and are willing to help me.


Again, please don't top post.

Then please note that most folks don't consider
'helping' and 'doing it for you' to be the same
thing.

Post the code of your best attempt, and then I
suspect you'll get plenty of assistance.

-Mike



 
David Rubin
      07-08-2003
David Rubin wrote:
>
> MM wrote:
>
> The following is untested...
>
> [snip - split this]
>
> #include <stdio.h>


#include <stdlib.h>

> #include <string.h>


> int
> main(void)
> {
> FILE *fp;
> char fname[4+2+4+1]; /* dataNN.txt */
> char buf[256]; /* max line length is 255 characters */
> int i = 0;


> /* find start of data segment */
> while(fgets(buf, sizeof buf, stdio) != 0){


while(fgets(buf, sizeof buf, stdin) != 0){

> if(strcmp("\\Data:", buf) == 0)


if(strncmp("\\Data:", buf, 6) == 0)

> break;
> }
>
> while(fgets(buf, sizeof buf, stdio) != 0){


while(fgets(buf, sizeof buf, stdin) != 0){
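
The change from strcmp to strncmp matters because fgets keeps the trailing newline: the buffer holds "\Data:" followed by '\n', which strcmp does not consider equal to "\Data:" alone. Comparing only the leading characters avoids this. A minimal sketch, where starts_with is a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: does `line` (as returned by fgets, newline included)
   begin with `word`? Returns 1 if so, 0 otherwise. */
int starts_with(const char *line, const char *word)
{
    return strncmp(line, word, strlen(word)) == 0;
}
```

Note that a prefix test is looser than an exact match: starts_with("Timestamp ...", "Time") is also true, which is acceptable only if no data line can begin that way.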

/david

--
Andre, a simple peasant, had only one thing on his mind as he crept
along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
-- unknown
 