C++ - Binary file IO: Converting imported sequences of chars to desired type

#21

On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
> On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:

> I haven't invested in text or XML marshalling because
> I think binary formats are going to prevail.

Which binary format? There are quite a few to choose from.

> With the portability edge taken away from text, there won't be
> much reason to use text.

The main reason to use text is that it's an order of magnitude
easier to debug. And that's not likely to change.

--
James Kanze

#22

On 28 Oct, 13:40, James Kanze <james.ka...@gmail.com> wrote:
> The code was written very quickly, with no tricks or anything.

Just out of curiosity - would it be possible to see your code?
As far as I can tell, you haven't posted it (if you have, I have
missed it).

Rune Allnor

#23

On Oct 28, 8:42 am, James Kanze <james.ka...@gmail.com> wrote:
> The main reason to use text is that it's an order of magnitude
> easier to debug. And that's not likely to change.
>

Is that text 8 bit ASCII, 16 bit, wchar_t, MBCS, UNICODE ... :^)

mzdude

#24

mzdude wrote:
> On Oct 28, 8:42 am, James Kanze <james.ka...@gmail.com> wrote:
>> The main reason to use text is that it's an order of magnitude
>> easier to debug. And that's not likely to change.
>>
> Is that text 8 bit ASCII, 16 bit, wchar_t, MBCS, UNICODE ... :^)

Quill & Parchment.

--
 ------------
< I'm Karmic >
 ------------
    \
     \   ___
       {~._.~}
        ( Y )
       ()~*~()
       (_)-(_)

Mick

#25

On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:
> On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
>
> > On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:
> > I haven't invested in text or XML marshalling because
> > I think binary formats are going to prevail.
>
> Which binary format? There are quite a few to choose from.
>

I'm only aware of a few of them. I don't know if it matters
much to me which one is selected. It's more that there's a
standard.

> > With the portability edge taken away from text, there won't be
> > much reason to use text.
>
> The main reason to use text is that it's an order of magnitude
> easier to debug. And that's not likely to change.
>

I was thinking that having a standard for binary would help
with debugging. I guess it is a tradeoff between development
costs and bandwidth costs.

Brian Wood
http://webEbenezer.net

#26

On Oct 28, 3:38 pm, Brian <c...@mailvault.com> wrote:
> On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:
>
> > On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
> > > With the portability edge taken away from text, there won't be
> > > much reason to use text.
>
> > The main reason to use text is that it's an order of magnitude
> > easier to debug. And that's not likely to change.
>
> I was thinking that having a standard for binary would
> help with debugging. I guess it is a tradeoff between
> development costs and bandwidth costs.
>

Does this perspective seem accurate? Assuming the order of
magnitude is correct, the question becomes something like this:
language A takes 10 times longer to learn than language B, but
once you have learned A you can communicate in 1/3 the time it
takes those using B. So those who learn how to use A have an
advantage over those who don't.

Brian Wood

#27

Rune Allnor wrote:
> Here is a test I wrote in matlab a few years ago, to demonstrate
> the problem (WinXP, 2.4GHz, no idea about disk):
>
> [... Matlab code]
>
> Output:
> ------------------------------------
> Wrote ASCII data in 24.0469 seconds
> Read ASCII data in 42.2031 seconds
> Wrote binary data in 0.10938 seconds
> Read binary data in 0.32813 seconds
> ------------------------------------
>
> Binary writes are 24.0/0.1 = 240x faster than text write.
> Binary reads are 42.2/0.32 = 130x faster than text read.

In Matlab. This doesn't say much if anything about any other
program. Possibly Matlab has a lousy (in terms of speed) text IO.

Re the precision issue: When writing out text, there isn't really a
need to go decimal, too. Hex or octal numbers are also text. Speeds
up the conversion (probably not by much, but still) and provides a
way to write out the exact value that is in memory (and recreate
that exact value -- no matter the involved precisions).

Gerhard Fiedler
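The hex-as-text idea can be sketched in C++ roughly as follows. This is only an illustration, not code from the thread: the function names `to_hex_text` and `from_hex_text` are made up for the example, and it assumes a C++11 library in which `snprintf` supports the `%a` conversion and `strtod` accepts hexadecimal floating-point input.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a double as hexadecimal floating-point text.
// The mantissa bits are written out directly, so no decimal rounding occurs.
std::string to_hex_text(double value)
{
    char buffer[64];
    int const written = std::snprintf(buffer, sizeof buffer, "%a", value);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, static_cast<std::size_t>(written));
}

// Hypothetical helper: parse text produced by to_hex_text back into a double.
double from_hex_text(std::string const& text)
{
    return std::strtod(text.c_str(), nullptr);
}
```

For a finite value `d`, `from_hex_text(to_hex_text(d))` should compare equal to `d`, whatever precision the stream would otherwise have been set to.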

#28

On Oct 28, 2:55 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 28 Oct, 13:40, James Kanze <james.ka...@gmail.com> wrote:
> > The code was written very quickly, with no tricks or anything.
> Just out of curiosity - would it be possible to see your code?
> As far as I can tell, you haven't posted it (If you have, I
> have missed it).

I haven't posted it because it's on my machine at home (in
France), and I'm currently working in London, and don't have
immediate access to it. Redoing it here (from memory):

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    #include <stddef.h>
    #include <stdlib.h>
    #include <time.h>

    // Opens test_<type>.dat and reports the elapsed wall-clock time
    // (in whole seconds) when the object is destroyed.
    class FileOutput
    {
    protected:
        std::string my_type;
        std::ofstream my_file;
        time_t my_start;
        time_t my_end;

    public:
        FileOutput( std::string const& type, bool is_binary = true )
            : my_type( type )
            , my_file( ("test_" + type + ".dat").c_str(),
                       is_binary ? std::ios::out | std::ios::binary
                                 : std::ios::out )
        {
            my_start = time( NULL );
        }
        ~FileOutput()
        {
            my_end = time( NULL ) ;
            my_file.close();
            std::cout << my_type << ": " << (my_end - my_start)
                      << " sec." << std::endl;
        }
        virtual void output( double d ) = 0;
    };

    // Writes the in-memory bytes of each double unchanged.
    class RawOutput : public FileOutput
    {
    public:
        RawOutput() : FileOutput( "raw" ) {}
        virtual void output( double d )
        {
            my_file.write( reinterpret_cast< char* >(&d), sizeof(d) );
        }
    };

    // Writes each double as 8 bytes, most significant byte first.
    class CookedOutput : public FileOutput
    {
    public:
        CookedOutput() : FileOutput( "cooked" ) {}
        virtual void output( double d )
        {
            unsigned long long const& tmp
                = reinterpret_cast< unsigned long long const& >(d);
            int shift = 64 ;
            while ( shift > 0 ) {
                shift -= 8 ;
                my_file.put( (tmp >> shift) & 0xFF );
            }
        }
    };

    // Writes each double as decimal text, one value per line.
    class TextOutput : public FileOutput
    {
    public:
        TextOutput() : FileOutput( "text", false )
        {
            my_file.setf( std::ios::scientific, std::ios::floatfield );
            my_file.precision( 17 );
        }
        virtual void output( double d )
        {
            my_file << d << '\n';
        }
    };

    template< typename File >
    void test( std::vector< double > const& values )
    {
        File dest;
        for ( std::vector< double >::const_iterator iter = values.begin();
                iter != values.end();
                ++ iter ) {
            dest.output( *iter );
        }
    }

    int main()
    {
        size_t const size = 10000000;
        std::vector< double > v;
        while ( v.size() != size ) {
            v.push_back( (double)( rand() ) / (double)( RAND_MAX ) );
        }
        test< TextOutput >( v );
        test< CookedOutput >( v );
        test< RawOutput >( v );
        return 0;
    }

Compiled with "cl /EHs /O2 timefmt.cc". On my local disk here, I
get:

    text: 90 sec.
    cooked: 31 sec.
    raw: 9 sec.

The last is, of course, not significant, except that it is very
small. (I can't run it on the networked disk, where any real
data would normally go, because it would use too much network
bandwidth, possibly interfering with others. Suffice it to say
that the networked disk is about 5 or more times slower, so the
relative differences would be reduced by that amount.) I'm not
sure what's different in the code above (or the environment; I
suspect that the disk bandwidth is higher here, since I'm on a
professional PC, and not a "home computer") compared to my tests
at home (under Windows); at home, there was absolutely no
difference in the times for raw and cooked.

(Cooked is, of course, XDR format, at least on a machine like
the PC, which uses IEEE floating point.)

--
James Kanze
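For the reverse direction, which is what the thread title asks about, a reader of the "cooked" file has to turn each run of 8 chars back into a double. The following is a minimal sketch under stated assumptions, not the poster's code: `append_cooked` and `decode_cooked` are hypothetical names, the data is assumed to have been read into a `std::vector<unsigned char>` already, and `double` is assumed to be a 64-bit IEEE 754 type, as on a PC. It uses `memcpy` to move the bit pattern between `double` and integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Append one double to the buffer, most significant byte first
// (the same byte order CookedOutput writes).
void append_cooked(std::vector<unsigned char>& buffer, double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bit pattern
    for (int shift = 56; shift >= 0; shift -= 8) {
        buffer.push_back(static_cast<unsigned char>((bits >> shift) & 0xFF));
    }
}

// Rebuild the double stored in the 8 bytes starting at offset.
double decode_cooked(std::vector<unsigned char> const& buffer,
                     std::size_t offset)
{
    assert(offset + 8 <= buffer.size());
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        bits = (bits << 8) | buffer[offset + i];
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the bytes are assembled with shifts, the decoding does not depend on the byte order of the machine doing the reading; only the assumption about the floating-point representation remains.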

#29

On Oct 28, 8:38 pm, Brian <c...@mailvault.com> wrote:
> On Oct 28, 7:42 am, James Kanze <james.ka...@gmail.com> wrote:
> > On Oct 26, 9:50 pm, Brian <c...@mailvault.com> wrote:
> > > On Oct 26, 12:06 pm, James Kanze <james.ka...@gmail.com> wrote:
> > > I haven't invested in text or XML marshalling because
> > > I think binary formats are going to prevail.
> > Which binary format? There are quite a few to choose from.
> I'm only aware of a few of them. I don't know if
> it matters much to me which one is selected. It's
> more that there's a standard.
> > > With the portability edge taken away from text, there
> > > won't be much reason to use text.
> > The main reason to use text is that it's an order of
> > magnitude easier to debug. And that's not likely to change.
> I was thinking that having a standard for binary would help
> with debugging.

It might. It would certainly encourage tools for reading it.
On the other hand: we already have a couple of standards for
binary, and I haven't seen that many tools. Part of the reason
might be that one of the most common standards, XDR, is
basically untyped, so the tools wouldn't really know how to read
it anyway. (There are tools which display certain specific uses
of XDR in human readable format, e.g. tcpdump.)

--
James Kanze

#30

On Oct 28, 9:23 pm, Gerhard Fiedler <geli...@gmail.com> wrote:
> Rune Allnor wrote:
> > Here is a test I wrote in matlab a few years ago, to
> > demonstrate the problem (WinXP, 2.4GHz, no idea about disk):
> > [... Matlab code]
> > Output:
> > ------------------------------------
> > Wrote ASCII data in 24.0469 seconds
> > Read ASCII data in 42.2031 seconds
> > Wrote binary data in 0.10938 seconds
> > Read binary data in 0.32813 seconds
> > ------------------------------------
> > Binary writes are 24.0/0.1 = 240x faster than text write.
> > Binary reads are 42.2/0.32 = 130x faster than text read.
> In Matlab. This doesn't say much if anything about any other
> program. Possibly Matlab has a lousy (in terms of speed) text
> IO.

Obviously, not possibly. I get a factor of between 3 and 10,
depending on the compiler and the system. I get a significant
difference simply running what I think is the same program (more
or less) on two different machines, using the same compiler and
having the same architecture; one probably has a much higher
speed IO bus than the other, and that makes the difference.

> Re the precision issue: When writing out text, there isn't
> really a need to go decimal, too. Hex or octal numbers are
> also text. Speeds up the conversion (probably not by much, but
> still) and provides a way to write out the exact value that is
> in memory (and recreate that exact value -- no matter the
> involved precisions).

But it defeats one of the major reasons for using text: human
readability.

--
James Kanze
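Decimal text can stay human readable and still reproduce the value exactly, provided enough digits are written: 17 significant decimal digits are sufficient for a 64-bit IEEE 754 double. The sketch below illustrates this under that assumption. The names `to_decimal_text` and `from_decimal_text` are hypothetical and do not come from the thread, and it relies on the C++11 constant `std::numeric_limits<double>::max_digits10`.

```cpp
#include <cassert>
#include <cstdlib>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format a double as readable decimal text that
// carries enough digits to recover the exact value.
std::string to_decimal_text(double value)
{
    std::ostringstream out;
    // In scientific notation the precision counts digits after the decimal
    // point, so max_digits10 - 1 yields max_digits10 significant digits.
    out << std::scientific
        << std::setprecision(std::numeric_limits<double>::max_digits10 - 1)
        << value;
    return out.str();
}

// Hypothetical helper: parse text produced by to_decimal_text.
double from_decimal_text(std::string const& text)
{
    char* end = nullptr;
    double const value = std::strtod(text.c_str(), &end);
    assert(end != text.c_str());  // at least one character was converted
    return value;
}
```

For a finite value `d`, `from_decimal_text(to_decimal_text(d))` should compare equal to `d` on an implementation whose conversions are correctly rounded, which is the usual case on current compilers.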