
C++ - Binary file IO: Converting imported sequences of chars to desired type

10-28-2009, 12:42 PM   #21
James Kanze
Re: Binary file IO: Converting imported sequences of chars to desired type


On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
> On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:


> I haven't invested in text or XML marshalling because
> I think binary formats are going to prevail.


Which binary format? There are quite a few to choose from.

> With the portability edge taken away from text, there won't be
> much reason to use text.


The main reason to use text is that it's an order of magnitude
easier to debug. And that's not likely to change.

--
James Kanze


10-28-2009, 02:55 PM   #22
Rune Allnor
On 28 Okt, 13:40, James Kanze <james.ka...@gmail.com> wrote:

> The code was written very quickly, with no tricks or anything.


Just out of curiosity - would it be possible to see your code?
As far as I can tell, you haven't posted it (If you have, I have
missed it).

Rune


10-28-2009, 04:53 PM   #23
mzdude
On Oct 28, 8:42 am, James Kanze <james.ka...@gmail.com> wrote:
> The main reason to use text is that it's an order of magnitude
> easier to debug. And that's not likely to change.
>

Is that text 8 bit ASCII, 16 bit, wchar_t, MBCS, UNICODE ... :^)


10-28-2009, 06:14 PM   #24
Mick
mzdude wrote:
> On Oct 28, 8:42 am, James Kanze <james.ka...@gmail.com> wrote:
>> The main reason to use text is that it's an order of magnitude
>> easier to debug. And that's not likely to change.
>>

> Is that text 8 bit ASCII, 16 bit, wchar_t, MBCS, UNICODE ... :^)


Quill & Parchment.

--
------------
< I'm Karmic >
------------
\
\
___
{~._.~}
( Y )
()~*~()
(_)-(_)


10-28-2009, 08:38 PM   #25
Brian
On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:
> On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
>
> > On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:
> > I haven't invested in text or XML marshalling because
> > I think binary formats are going to prevail.

>
> Which binary format? There are quite a few to choose from.
>


I'm only aware of a few of them. I don't know if
it matters much to me which one is selected. It's
more that there's a standard.


> > With the portability edge taken away from text, there won't be
> > much reason to use text.

>
> The main reason to use text is that it's an order of magnitude
> easier to debug. And that's not likely to change.
>


I was thinking that having a standard for binary would
help with debugging. I guess it is a tradeoff between
development costs and bandwidth costs.


Brian Wood
http://webEbenezer.net


10-28-2009, 09:19 PM   #26
Brian
On Oct 28, 3:38 pm, Brian <c...@mailvault.com> wrote:
> On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:
>
> > On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:


> > > With the portability edge taken away from text, there won't be
> > > much reason to use text.

>
> > The main reason to use text is that it's an order of magnitude
> > easier to debug. And that's not likely to change.

>
> I was thinking that having a standard for binary would
> help with debugging. I guess it is a tradeoff between
> development costs and bandwidth costs.
>


Does this perspective seem accurate? Assuming the order
of magnitude is correct, then the question becomes
something like if language A takes 10 times longer to
learn than language B, but once you learn A you can
communicate in 1/3 the time it takes for those using B.
So those who learn how to use A have an advantage over
those who don't.


Brian Wood


10-28-2009, 09:23 PM   #27
Gerhard Fiedler
Rune Allnor wrote:

> Here is a test I wrote in matlab a few years ago, to demonstrate
> the problem (WinXP, 2.4GHz, no idea about disk):
>
> [... Matlab code]
>
> Output:
> ------------------------------------
> Wrote ASCII data in 24.0469 seconds
> Read ASCII data in 42.2031 seconds
> Wrote binary data in 0.10938 seconds
> Read binary data in 0.32813 seconds
> ------------------------------------
>
> Binary writes are 24.0/0.1 = 240x faster than text write.
> Binary reads are 42.2/0.32 = 130x faster than text read.


In Matlab. This doesn't say much if anything about any other program.
Possibly Matlab has a lousy (in terms of speed) text IO.

Re the precision issue: When writing out text, there isn't really a need
to go decimal, too. Hex or octal numbers are also text. Speeds up the
conversion (probably not by much, but still) and provides a way to write
out the exact value that is in memory (and recreate that exact value --
no matter the involved precisions).

Gerhard


10-29-2009, 10:00 AM   #28
James Kanze
On Oct 28, 2:55 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 28 Okt, 13:40, James Kanze <james.ka...@gmail.com> wrote:


> > The code was written very quickly, with no tricks or anything.


> Just out of curiosity - would it be possible to see your code?
> As far as I can tell, you haven't posted it (If you have, I
> have missed it).


I haven't posted it because it's on my machine at home (in
France), and I'm currently working in London, and don't have
immediate access to it. Redoing it here (from memory):

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

//  Common base: opens test_<type>.dat and reports the elapsed
//  wall-clock time (one second resolution) on destruction.
class FileOutput
{
protected:
    std::string my_type;
    std::ofstream my_file;
    time_t my_start;
    time_t my_end;
public:
    FileOutput( std::string const& type, bool is_binary = true )
        : my_type( type )
        , my_file( ("test_" + type + ".dat").c_str(),
                   is_binary ? std::ios::out | std::ios::binary
                             : std::ios::out )
    {
        my_start = time( NULL );
    }
    ~FileOutput()
    {
        my_end = time( NULL ) ;
        my_file.close();
        std::cout << my_type << ": "
                  << (my_end - my_start) << " sec." << std::endl;
    }

    virtual void output( double d ) = 0;
};

//  Raw: dump the in-memory image of the double, host byte order.
class RawOutput : public FileOutput
{
public:
    RawOutput() : FileOutput( "raw" ) {}
    virtual void output( double d )
    {
        my_file.write( reinterpret_cast< char* >(&d), sizeof(d) );
    }
};

//  Cooked: write the 64 bits of the double one byte at a time,
//  most significant byte first, independent of host byte order.
class CookedOutput : public FileOutput
{
public:
    CookedOutput() : FileOutput( "cooked" ) {}
    virtual void output( double d )
    {
        //  Assumes a 64 bit double and a 64 bit unsigned long long.
        unsigned long long const& tmp
            = reinterpret_cast< unsigned long long const& >(d);
        int shift = 64 ;
        while ( shift > 0 ) {
            shift -= 8 ;
            my_file.put( (tmp >> shift) & 0xFF );
        }
    }
};

//  Text: scientific notation, precision 17, one value per line.
class TextOutput : public FileOutput
{
public:
    TextOutput() : FileOutput( "text", false )
    {
        my_file.setf( std::ios::scientific,
                      std::ios::floatfield );
        my_file.precision( 17 );
    }
    virtual void output( double d )
    {
        my_file << d << '\n';
    }
};

//  Write all of the values using the given output class; the
//  destructor of dest prints the time when the function returns.
template< typename File >
void
test( std::vector< double > const& values )
{
    File dest;
    for ( std::vector< double >::const_iterator iter = values.begin();
            iter != values.end();
            ++ iter ) {
        dest.output( *iter );
    }
}

int
main()
{
    size_t const size = 10000000;
    std::vector< double >
                        v;
    //  Fill with pseudo-random values in [0, 1].
    while ( v.size() != size ) {
        v.push_back( (double)( rand() ) / (double)( RAND_MAX ) );
    }
    test< TextOutput >( v );
    test< CookedOutput >( v );
    test< RawOutput >( v );
    return 0;
}

Compiled with "cl /EHs /O2 timefmt.cc". On my local disk here,
I get:
text: 90 sec.
cooked: 31 sec.
raw: 9 sec.
The last is, of course, not significant, except that it is very
small. (I can't run it on the networked disk, where any real
data would normally go, because it would use too much network
bandwidth, possibly interfering with others. Suffice it to say
that the networked disk is about 5 or more times slower, so the
relative differences would be reduced by that amount.) I'm not
sure what's different in the code above (or the environment---I
suspect that the disk bandwidth is higher here, since I'm on a
professional PC, and not a "home computer") compared to my tests
at home (under Windows); at home, there was absolutely no
difference in the times for raw and cooked. (Cooked is, of
course, XDR format, at least on a machine like the PC, which
uses IEEE floating point.)

--
James Kanze


10-29-2009, 10:03 AM   #29
James Kanze
On Oct 28, 8:38 pm, Brian <c...@mailvault.com> wrote:
> On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:


> > On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:


> > > On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:
> > > I haven't invested in text or XML marshalling because
> > > I think binary formats are going to prevail.


> > Which binary format? There are quite a few to choose from.


> I'm only aware of a few of them. I don't know if
> it matters much to me which one is selected. It's
> more that there's a standard.


> > > With the portability edge taken away from text, there
> > > won't be much reason to use text.


> > The main reason to use text is that it's an order of
> > magnitude easier to debug. And that's not likely to change.


> I was thinking that having a standard for binary would help
> with debugging.


It might. It would certainly encourage tools for reading it.
On the other hand: we already have a couple of standards for
binary, and I haven't seen that many tools. Part of the reason
might be because one of the most common standards, XDR, is
basically untyped, so the tools wouldn't really know how to read
it anyway. (There are tools which display certain specific uses
of XDR in human readable format, e.g. tcpdump.)

--
James Kanze


10-29-2009, 10:09 AM   #30
James Kanze
On Oct 28, 9:23 pm, Gerhard Fiedler <geli...@gmail.com> wrote:
> Rune Allnor wrote:
> > Here is a test I wrote in matlab a few years ago, to
> > demonstrate the problem (WinXP, 2.4GHz, no idea about disk):


> > [... Matlab code]


> > Output:
> > ------------------------------------
> > Wrote ASCII data in 24.0469 seconds
> > Read ASCII data in 42.2031 seconds
> > Wrote binary data in 0.10938 seconds
> > Read binary data in 0.32813 seconds
> > ------------------------------------


> > Binary writes are 24.0/0.1 = 240x faster than text write.
> > Binary reads are 42.2/0.32 = 130x faster than text read.


> In Matlab. This doesn't say much if anything about any other
> program. Possibly Matlab has a lousy (in terms of speed) text
> IO.


Obviously, not possibly. I get a factor of between 3 and 10,
depending on the compiler and the system. I get a significant
difference simply running what I think is the same program (more
or less) on two different machines, using the same compiler and
having the same architecture---one probably has a much higher
speed IO bus than the other, and that makes the difference.

> Re the precision issue: When writing out text, there isn't
> really a need to go decimal, too. Hex or octal numbers are
> also text. Speeds up the conversion (probably not by much, but
> still) and provides a way to write out the exact value that is
> in memory (and recreate that exact value -- no matter the
> involved precisions).


But it defeats one of the major reasons for using text: human
readability.

--
James Kanze

