writing binary files

 
 
mermadak
      08-04-2005
I am trying to convert an ANSI-encoded ASCII text file to a binary file. I
have looked at the b2a_qp(data[, quotetabs, istext, header]) function at
http://aspn.activestate.com/ASPN/doc...-binascii.html
but I am not sure whether it will do what I need or how to set it up to take
the data.

Also, what really makes this an issue is that the data is coming off a DOS
machine (so endianness is a concern here, right?) and is a rather large text
file with a ton of scientific data points (the files range from 500 KB to
5 MB).

Any help would be greatly appreciated.

Thanks,
Dennis Aust


 
John Bokma
      08-04-2005
"mermadak" <(E-Mail Removed)> wrote:

> I am trying to convert an ANSI encoded ASCII text file to a binary
> file. I have looked at the b2a_qp( data[, quotetabs, istext, header])
> function at
> http://aspn.activestate.com/ASPN/doc...thon/lib/module-binascii.html
> but I am not sure if it will do what I need it to or
> how set it up to take the data.


Python... hmmm....

> Also, the parts of this that really make it an issue is that the data
> is coming off of a DOS machine (so endian is a concern here right?)
> and is a rather large text file with a ton of scientific data points
> (from 500k to 5MB files).


So basically you want to convert numbers in a text file to some short
binary notation?

5MB... you are aware that the current year is 2005?

--
John
Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html

 
mermadak
      08-04-2005

"John Bokma" <(E-Mail Removed)> wrote in message
news:Xns96A8BA198CF8castleamber@130.133.1.4...
>> Also, the parts of this that really make it an issue is that the data
>> is coming off of a DOS machine (so endian is a concern here right?)
>> and is a rather large text file with a ton of scientific data points
>> (from 500k to 5MB files).

>
> So basically you want to convert numbers in a text file to some short
> binary notation?


Exactly... any ideas?

> 5MB... you are aware that the current year is 2005?


Does that mean 5MB shouldn't be a problem???
I originally tried writing a program to simply manipulate these files in my
native programming languages, VB and C++, but it would hang due to the size
of these files. I finally found a PERL script that would handle parsing this
much data.

Dennis Aust


 
John Bokma
      08-04-2005
"mermadak" <(E-Mail Removed)> wrote:

>
> "John Bokma" <(E-Mail Removed)> wrote in message
> news:Xns96A8BA198CF8castleamber@130.133.1.4...
>>> Also, the parts of this that really make it an issue is that the
>>> data is coming off of a DOS machine (so endian is a concern here
>>> right?) and is a rather large text file with a ton of scientific
>>> data points (from 500k to 5MB files).

>>
>> So basically you want to convert numbers in a text file to some short
>> binary notation?

>
> Exactly... any ideas?


Python or Perl? I ask because your post referred to Python.

>> 5MB... you are aware that the current year is 2005?

>
> Does that mean 5MB shouldn't be a problem???


Yup, your computer probably has 100 times as much memory.

> I originally tried writing a program to simply maniplute these files
> in my native programming languages of VB and C++ which would hang due
> to the size of these files.


If a C++ program would hang on 5MB files, how can programs handle 10M
MP3 files, or 700 MB movies?

> I finally found a PERL script that would


PERL is not an acronym

> handle parsing this much data.


Again: 5MB is not much. My best guess is that you should rethink your
algorithm(s).

mermadak
      08-06-2005

> Python or Perl, since your post referred to Python


Perl... preferably. My point there is that I am grasping at straws at this
point...

I looked at the pack function that was also recommended, but I am not sure
how to use it. Could anyone possibly give me an example? Mainly, it looks as
though my data can only contain strings, floating-point decimals, or
fixed-point decimals, but not a combination thereof. My data is in ASCII
format, but would it be considered string data for purposes of conversion to
raw binary format, even though a line of data may look like
"2005-08-05, 13:36:06, 3236.453232, 11123.456, 0.0, 21, 224.332"?

Also, the function says it calls for a TEMPLATE variable to be passed to it.
Is this required? And this looks as though it would require a template
character to be passed for every character in the file??? That seems like it
would be very processor-intensive as well as nearly impractical from a
code-writing perspective, as I would have to build an array of the TEMPLATE
characters, build a comparison function to check which character matches the
TEMPLATE designation, and then convert each character to binary at that
point. Am I way off base here? It just seems like there would be a more
practical way to achieve this.

> Yup, your computer probably has 100 times as much memory.


True... but what does that have to do with process intensity and the
capabilities of the tools?

> If a C++ program would hang on 5MB files, how can programs handle 10M
> MP3 files, or 700 MB movies?


I agree with your point. Admittedly it was probably due to poor programming.
I have only been coding for 3 years now and only part time at that. But I
would be glad to send you the programs I was working on to see if you can
make them work. Although, I did finally get that covered with Perl, so it is
not much of a concern at the moment.

> Again: 5MB is not much. My best guess is that you should rethink your
> algoritm(s).


Agreed, see above. Thank you for pointing out all of the obvious problems
here. Perhaps you would be so kind as to make some suggestions on how I
could actually accomplish this now?


 
John Bokma
      08-06-2005
"mermadak" <(E-Mail Removed)> wrote:

>
>> Python or Perl, since your post referred to Python

>
> Perl... preferrably. My point there is that I am grasping at straws at
> this point...
>
> I looked at the pack function that was also recommended but I am not
> sure how to use it. Could anyone possibly give me an example? Mainly
> it looks as though my data can only contain strings, floating point
> decimals, or fixed point decimals but not a combination there of. My
> data is ASCII format but would it be considered string data even
> though a data string may look like "2005-08-05, 13:36:06, 3236.453232,
> 11123.456, 0.0, 21, 224.332" for purposes of conversion to raw binary
> format?


A better question is: is compression really required? What is causing
the current problem(s)? I am sure it's not managing 5 MB of data, which
on a recent PC is close to nothing.

> Also, the function says it calls for a TEMPLATE variable to be
> passed to it. Is this required?


The whole idea of pack is that it packs data according to a TEMPLATE, so
yes, it is required.

> And this looks as though it would
> require a template character to be passed for every character in the
> file???


More or less, yes: one template item for every value you pack, though not
for every single character.

> This seems like it will be very processor intensive as well as
> nearly impractical from a code writing perspective, as I would have to
> build an array of the TEMPLATE characters and then build a comparison
> function to check which character matches the TEMPLATE designation and
> then convert each character to binary at that point. Am I way off base
> here? Just seems like there would be a more practical way to achieve
> this.


Yup: the most practical step is to find the real bottleneck of your
problem. If you just require compression, use a compression solution.
Pack indeed needs to "know" what is in the data you want packed.
So if you want to pack a date followed by 3 floats on line 1 and 4
floats and a fixed-point number on line 2, you have to provide the correct
template to pack for each line.
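
As a rough sketch of what such a template could look like for the sample
line quoted above: the field layout (two fixed-width strings, four doubles
and one 32-bit integer) and the little-endian byte order are assumptions,
not something the data format dictates, and the `<` modifier needs Perl 5.10
or later.

```perl
use strict;
use warnings;

# Sample record from the thread; the field types below are assumed.
my $line = "2005-08-05, 13:36:06, 3236.453232, 11123.456, 0.0, 21, 224.332";

# One template item per field, not per character:
#   A10 = date as 10 space-padded ASCII bytes
#   A8  = time as 8 space-padded ASCII bytes
#   d<  = double-precision float, little-endian
#   l<  = signed 32-bit integer, little-endian
my $template = "A10 A8 d< d< d< l< d<";

# Split on commas, ignoring the whitespace around them.
my @fields = split /\s*,\s*/, $line;
my $record = pack $template, @fields;

printf "text: %d bytes, packed: %d bytes\n", length($line), length($record);

# unpack with the same template recovers the values.
my @back = unpack $template, $record;
print join(", ", @back), "\n";
```

With this layout each record packs to 54 bytes (10 + 8 + 4 x 8 + 4), against
62 characters for the text line, so the saving from packing alone is modest.
Stating the byte order in the template also takes care of the endianness
question, because the file then no longer depends on the machine that wrote
it.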

>> Yup, your computer probably has 100 times as much memory.

>
> True... but what does that have to do with process intensity and the
> capabilities of the tools?


That there shouldn't be any problem reading 5 MB of data into memory and
using it.

Regarding pack: if your lines don't follow a fixed format (e.g. a date
followed by exactly 5 floats and 2 fixed-point numbers), you already have
to do some parsing in your program. You can use the same parsing setup
to compress/convert your data to binary. If you only want to use the
output in Perl, you might consider writing out the compact version using
Storable.
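
A minimal sketch of the Storable route, assuming the lines have already been
parsed into one array reference per record. The `@records` contents and the
temporary file are placeholders for illustration only.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical parsed records: one array reference per input line.
my @records = (
    [ "2005-08-05", "13:36:06", 3236.453232, 11123.456, 0.0, 21, 224.332 ],
    [ "2005-08-05", "13:36:07", 3236.453240, 11123.460, 0.0, 22, 224.335 ],
);

# A temporary file stands in for the real output path.
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# nstore writes in network byte order, so the file can be read
# on a machine with a different endianness.
nstore(\@records, $file);

# retrieve returns a reference to a copy of the stored structure.
my $copy = retrieve($file);
printf "%d records restored, first date: %s\n",
    scalar @$copy, $copy->[0][0];
```

The resulting file is only meant to be read back by Perl with Storable, so
this fits the case where no other program has to understand the output.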

If you have access to the program that creates those "big" files, and
it's written in Perl, you just have to tweak the output part, since that
part decides the structure of the output file. If it's not written in
Perl, you have to create a compatible binary output format (which is not
that hard). However, I recommend sticking with ASCII, especially if your
files are around 5 MB. It's human-readable.
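
If a binary file is wanted anyway, the conversion could be sketched roughly
as follows. The record layout in `$template` is an assumption based on the
sample line earlier in the thread, and the two temporary files stand in for
the real input and output paths.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumed record layout, one template item per field (little-endian).
my $template = "A10 A8 d< d< d< l< d<";

# Stand-in input file with two sample lines; real data would come from disk.
my ($in_fh, $in_name) = tempfile(UNLINK => 1);
print {$in_fh} "2005-08-05, 13:36:06, 3236.453232, 11123.456, 0.0, 21, 224.332\n";
print {$in_fh} "2005-08-05, 13:36:07, 3236.453240, 11123.460, 0.0, 22, 224.335\n";
close $in_fh or die "close failed: $!";

my ($out_fh, $out_name) = tempfile(UNLINK => 1);
binmode $out_fh;    # no newline translation on DOS/Windows

open my $text, '<', $in_name or die "Cannot open $in_name: $!";
my $count = 0;
while (my $line = <$text>) {
    $line =~ s/\r?\n\z//;          # strip a DOS or Unix line ending
    next unless length $line;      # skip blank lines
    my @fields = split /\s*,\s*/, $line;
    print {$out_fh} pack($template, @fields);
    $count++;
}
close $text;
close $out_fh or die "close failed: $!";

printf "%d records, %d bytes written\n", $count, (-s $out_name);
```

The binmode call is the part that matters most on a DOS or Windows machine:
without it, any byte in the packed data that happens to equal a newline is
expanded to a carriage return plus newline, which corrupts the records. The
loop reads one line at a time, so the size of the input file does not affect
memory use.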

>> If a C++ program would hang on 5MB files, how can programs handle 10M
>> MP3 files, or 700 MB movies?

>
> I agree with your point. Admittedly it was probably due to poor
> programming. I have only been coding for 3 years now and only part
> time at that. But I would be glad to send you the programs I was
> working on and see if you make them work.


No problem. I do such things professionally (i.e. for money). It
might save you a lot of time and trouble.

> Although, I did finally
> get that covered with Perl so it not much of a concern at the moment.
>
>> Again: 5MB is not much. My best guess is that you should rethink your
>> algoritm(s).

>
> Agreed, see above. Thank you for pointing out all of the obvious
> problems here. Perhaps you would be so kind as to make some
> suggestions on how I could actually accomplish this now?


If handling 5 MB of data is a problem for your program, why is it a
problem?
