C++ - Binary file IO: Converting imported sequences of chars to desired type

 
Old 11-01-2009, 06:54 AM   #41
Re: Binary file IO: Converting imported sequences of chars to desired type


On Oct 30, 11:08 am, James Kanze <james.ka...@gmail.com> wrote:
> On Oct 30, 9:37 am, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > On 30 Okt, 09:44, James Kanze <james.ka...@gmail.com> wrote:

> [...]
> > So what does text-based formats actually buy you?

>
> Shorter development times, less expensive development, greater
> reliability...
>
> In sum, lower cost.
>



Since a message in a text format is generally longer than the
same message in a binary format, text leaves systems more
vulnerable to network problems caused by storms, cyber attacks, etc.
I won't argue the point about it being easier to use text,
but I think it's a little like buying an SUV. If the price of
gas goes way up, many wish they had never bought an SUV.
Using binary might be a way to mitigate the pain caused by
volatile markets/conditions.
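
To put a rough number on the size claim, here is a minimal sketch that encodes one double both ways. The helper names to_text and to_binary are made up for this illustration, and the text form assumes you want enough digits (max_digits10) to read the value back exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <limits>
#include <string>

// Text form (hypothetical helper): enough significant digits to
// round-trip the value when it is parsed again.
std::string to_text(double value) {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%.*g",
                          std::numeric_limits<double>::max_digits10, value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}

// Binary form (hypothetical helper): the object representation of the
// double on this machine, copied byte for byte.
std::string to_binary(double value) {
    char buf[sizeof value];
    std::memcpy(buf, &value, sizeof value);
    return std::string(buf, sizeof buf);
}
```

Comparing to_text(x).size() with to_binary(x).size() for your own data shows how big the gap really is. It depends on the values: something like 0.1 needs far more characters than sizeof(double), while 2.0 prints as the single character "2".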


Brian Wood
Ebenezer Enterprises
http://webEbenezer.net


Old 11-01-2009, 08:32 PM   #42
Gerhard Fiedler
 
Re: Binary file IO: Converting imported sequences of chars to desired type
Brian wrote:

> Since a message using a text format is generally longer than binary
> formats, text leaves systems more vulnerable to network problems
> caused by storms, cyber attacks, etc. I won't argue the point about
> it being easier to use text, but think it's a little like buying an
> SUV. If the price of gas goes way up, many wish they had never
> bought an SUV. Using binary might be a way to mitigate the pain
> caused by volatile markets/conditions.


If you're talking about sending something over a potentially unstable
network connection, simple binary is pretty bad. With text encoding
(could be e.g. base64 encoded binary, or pretty much anything else
that's guaranteed not to use all available symbols), you have a few
symbols left that you can use for stream synchronization. This is in
general much more important than a few bytes more to transmit. This may
even be important when storing data on disk: the chances of recovering
data if there's a problem are much higher if you have sync symbols in the
data stream.

There's a point for (simple) binary protocols when all you have is an
8-bit microcontroller with 100 bytes of RAM and 1k of Flash. But you
typically don't program these in standard-compliant C++.

IMO this has nothing to do with SUVs... more with seat belts, if you
really want an automotive analogy. While they add weight to the vehicle,
and on (very) rare occasions may complicate things if there's a problem,
in most problem cases they can save your face, and more. (Which, back to
programming, may save your job -- and with it the payments for your SUV.
Now here we're back to the SUV.)

Gerhard


Old 11-02-2009, 10:12 AM   #43
James Kanze
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On Oct 31, 2:19 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 30 Okt, 17:08, James Kanze <james.ka...@gmail.com> wrote:


> > On Oct 30, 9:37 am, Rune Allnor <all...@tele.ntnu.no> wrote:


> > > On 30 Okt, 09:44, James Kanze <james.ka...@gmail.com> wrote:

> > [...]
> > > So what does text-based formats actually buy you?


> > Shorter development times, less expensive development, greater
> > reliability...


> > In sum, lower cost.


> As long as you keep two factors in mind:


> 1) The user's time is not yours (the programmer) to waste.
> 2) The user's storage facilities (disk space, network
> bandwidth etc) are not yours (the programmer) to waste.


The user pays for your time. Spending it to do something which
results in a less reliable program, and that he doesn't need, is
irresponsible, and borders on fraud.

> Those who want easy, not awfully challenging jobs might be
> better off flipping burgers.


Writing the most reliable programs for the lowest cost is
challenging enough without going out of your way to make it
harder. If you're an amateur, doing this for fun, do whatever
amuses you the most. If you're a professional, selling your
services, professional deontology requires providing the best
service possible at the lowest price possible.

--
James Kanze


Old 11-02-2009, 08:04 PM   #44
Brian
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On Nov 1, 2:32 pm, Gerhard Fiedler <geli...@gmail.com> wrote:
> Brian wrote:
> > Since a message using a text format is generally longer than binary
> > formats, text leaves systems more vulnerable to network problems
> > caused by storms, cyber attacks, etc. I won't argue the point about
> > it being easier to use text, but think it's a little like buying an
> > SUV. If the price of gas goes way up, many wish they had never
> > bought an SUV. Using binary might be a way to mitigate the pain
> > caused by volatile markets/conditions.

>
> If you're talking about sending something over a potentially unstable
> network connection, simple binary is pretty bad. With text encoding
> (could be e.g. base64 encoded binary, or pretty much everything else
> that's guaranteed not to use all available symbols), you have a few
> symbols left that you can use for stream synchronization. This is in
> general much more important than a few bytes more to transmit. This may
> even be important when storing data on disk: the chances of recovering
> data if there's a problem are much higher if you have sync symbols in the
> data stream.
>


If it were just a "few bytes more" I wouldn't be saying
anything. Likewise the difference between an SUV and
a fuel efficient vehicle isn't trivial. People wouldn't
be wishing they had never bought an SUV if that were
the case.


Brian Wood
Ebenezer Enterprises
http://webEbenezer.net





Old 11-02-2009, 09:05 PM   #45
Brian
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On Nov 2, 4:12 am, James Kanze <james.ka...@gmail.com> wrote:
> On Oct 31, 2:19 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > On 30 Okt, 17:08, James Kanze <james.ka...@gmail.com> wrote:
> > > On Oct 30, 9:37 am, Rune Allnor <all...@tele.ntnu.no> wrote:
> > > > On 30 Okt, 09:44, James Kanze <james.ka...@gmail.com> wrote:
> > > [...]
> > > > So what does text-based formats actually buy you?
> > > Shorter development times, less expensive development, greater
> > > reliability...
> > > In sum, lower cost.
>
> > As long as you keep two factors in mind:
> > 1) The user's time is not yours (the programmer) to waste.
> > 2) The user's storage facilities (disk space, network
> >    bandwidth etc) are not yours (the programmer) to waste.
>
> The user pays for your time. Spending it to do something which
> results in a less reliable program, and that he doesn't need, is
> irresponsible, and borders on fraud.
>
> > Those who want easy, not awfully challenging jobs might be
> > better off flipping burgers.
>
> Writing the most reliable programs for the lowest cost is
> challenging enough without going out of your way to make it
> harder. If you're an amateur, doing this for fun, do whatever
> amuses you the most. If you're a professional, selling your
> services, professional deontology requires providing the best
> service possible at the lowest price possible.
>



I'm interested in binary in this context as an
alternative to text because I believe markets and
conditions are likely to continue to be volatile for
a while. If I had more confidence in various
officials, B.O. (Obama), Putin, Ahmadinejad, etc.,
I'd be less likely to think things are going to be
volatile. I like what Rabbi Michael Healer said
when he met the governor of Texas -- Rick Perry --
a few years ago: "I didn't vote for you and I
don't trust you." I didn't vote for B.O. and I
don't trust him either.


Brian Wood
Ebenezer Enterprises
http://webEbenezer.net




Old 11-02-2009, 11:39 PM   #46
Brian
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On Nov 2, 3:05 pm, Brian <c...@mailvault.com> wrote:
> I'm interested in binary in this context as an
> alternative to text because I believe markets and
> conditions are likely to continue to be volatile for
> a while.


This is interesting --

http://stackoverflow.com/questions/1...-binary-format

M. Troyer, who I think is still around the Boost list,
considered using binary to be "essential."

http://lists.boost.org/Archives/boost/2002/11/39601.php

I'm not sure if those participating in this thread
come from a scientific application background as Troyer
does.


Brian Wood
Ebenezer Enterprises
http://webEbenezer.net




Old 11-03-2009, 10:14 AM   #47
Gerhard Fiedler
 
Re: Binary file IO: Converting imported sequences of chars to desired type
Brian wrote:

> On Nov 1, 2:32 pm, Gerhard Fiedler <geli...@gmail.com> wrote:
>> Brian wrote:
>>> Since a message using a text format is generally longer than binary
>>> formats, text leaves systems more vulnerable to network problems
>>> caused by storms, cyber attacks, etc. I won't argue the point about
>>> it being easier to use text, but think it's a little like buying an
>>> SUV. If the price of gas goes way up, many wish they had never
>>> bought an SUV. Using binary might be a way to mitigate the pain
>>> caused by volatile markets/conditions.

>>
>> If you're talking about sending something over a potentially
>> unstable network connection, simple binary is pretty bad. With text
>> encoding (could be e.g. base64 encoded binary, or pretty much
>> everything else that's guaranteed not to use all available symbols),
>> you have a few symbols left that you can use for stream
>> synchronization. This is in general much more important than a few
>> bytes more to transmit. This may even be important when storing data
>> on disk: the chances of recovering data if there's a problem are much
>> higher if you have sync symbols in the data stream.

>
> If it were just a "few bytes more" I wouldn't be saying anything.
> Likewise the difference between an SUV and a fuel efficient vehicle
> isn't trivial. People wouldn't be wishing they had never bought an
> SUV if that were the case.


It is longer, but you were talking about unreliable networks. And
resyncing a binary stream is by design very problematic. Since you often
don't know beforehand the length of records (think strings), you have
length information encoded in your binary stream. If one length field is
bad and unrecoverable, pretty much the complete rest of the stream is
unreadable because you're out of sync from that point on. The same
applies to data on disk.

Now, if you used an encoding with a few unused symbols, you can use
those symbols to add synchronization markers (records, whatever), and
even if a length field is bad, you may have lost a record but not the whole
remainder of the stream.
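
A minimal sketch of that idea, assuming a made-up record layout of "<decimal length>:<payload>" terminated by a newline, with the payload encoded so that it never contains a newline. The functions frame and recover are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical framing: "<decimal length>:<payload>\n". The newline is the
// reserved sync symbol, so the payload must not contain it.
std::string frame(const std::string& payload) {
    assert(payload.find('\n') == std::string::npos);
    return std::to_string(payload.size()) + ":" + payload + "\n";
}

// Splits the stream on the sync symbol and keeps only the records whose
// length field matches the payload actually found. A damaged record is
// dropped and parsing resumes at the next marker.
std::vector<std::string> recover(const std::string& stream) {
    std::vector<std::string> good;
    std::size_t start = 0;
    while (start < stream.size()) {
        std::size_t end = stream.find('\n', start);
        if (end == std::string::npos) break;  // incomplete tail, no marker yet
        std::string rec = stream.substr(start, end - start);
        start = end + 1;

        std::size_t colon = rec.find(':');
        if (colon == std::string::npos || colon == 0) continue;

        // Parse the decimal length field by hand; reject non-digits.
        std::size_t len = 0;
        bool digits_ok = true;
        for (std::size_t i = 0; i < colon; ++i) {
            if (rec[i] < '0' || rec[i] > '9') { digits_ok = false; break; }
            len = len * 10 + static_cast<std::size_t>(rec[i] - '0');
        }
        if (digits_ok && rec.size() - colon - 1 == len)
            good.push_back(rec.substr(colon + 1));
    }
    return good;
}
```

If the length field of the middle record in frame("alpha") + frame("beta") + frame("gamma") is overwritten, recover still returns "alpha" and "gamma"; only the damaged record is lost.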

On unreliable networks, I'd take that any day over the size advantage of
raw binary. Of course, this is not about text vs binary, this is about
whether raw binary is the best choice for unreliable networks. It isn't.

If you want both (speed and reliability), you'd create a custom encoding
that leaves only a few symbols unused that you then can use for syncing.
But raw binary is not a good choice over unreliable networks.
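
One way such a custom encoding could look is byte stuffing, in the style of HDLC-like framing: two byte values are reserved and escaped wherever they occur in the data. This is only a sketch of one possible scheme, not something proposed in this thread; the names stuff and unstuff and the type alias Bytes are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bytes = std::vector<unsigned char>;

constexpr unsigned char FLAG = 0x7E;  // reserved: marks the end of a frame
constexpr unsigned char ESC  = 0x7D;  // reserved: introduces an escaped byte
constexpr unsigned char MASK = 0x20;  // XORed into a byte that gets escaped

// Encodes one payload so that FLAG appears only as the frame terminator.
Bytes stuff(const Bytes& payload) {
    Bytes out;
    for (unsigned char b : payload) {
        if (b == FLAG || b == ESC) {
            out.push_back(ESC);
            out.push_back(static_cast<unsigned char>(b ^ MASK));
        } else {
            out.push_back(b);
        }
    }
    out.push_back(FLAG);
    return out;
}

// Decodes every complete frame in the stream. A frame that ends right after
// an ESC is treated as damaged and dropped; decoding continues with the
// next frame because FLAG can never occur inside an encoded payload.
std::vector<Bytes> unstuff(const Bytes& stream) {
    std::vector<Bytes> frames;
    Bytes current;
    bool escaped = false;
    for (unsigned char b : stream) {
        if (b == FLAG) {
            if (!escaped) frames.push_back(current);
            current.clear();
            escaped = false;
        } else if (escaped) {
            current.push_back(static_cast<unsigned char>(b ^ MASK));
            escaped = false;
        } else if (b == ESC) {
            escaped = true;
        } else {
            current.push_back(b);
        }
    }
    return frames;
}
```

The overhead is one extra byte per frame plus one per occurrence of a reserved value in the payload, so for most data it stays close to raw binary in size while still giving the receiver an unambiguous point to resync on.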

And I still think that this has nothing to do with SUVs. How many people
do you know that are wishing they never had used a text protocol? How
many are there wishing they never had used raw binary over an unreliable
network link?

Gerhard


Old 11-03-2009, 05:35 PM   #48
Brian
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On Nov 3, 4:14 am, Gerhard Fiedler <geli...@gmail.com> wrote:
> It is longer, but you were talking about unreliable networks. And
> resyncing a binary stream is by design very problematic. Since you often
> don't know beforehand the length of records (think strings), you have
> length information encoded in your binary stream.


Yes.

> If one length field is
> bad and unrecoverable, pretty much the complete rest of the stream is
> unreadable because you're out of sync from that point on. This is also
> valid for data on disks.


I think there are ways to avoid that. Sentinel values are
often used in binary streams. If you get to the end of a
message and don't find the sentinel, you can scan until
you do find it. It's true that you may find a false
positive with binary, but the whole stream isn't lost.
Additionally, the message length can be embedded two times.
If the two lengths match, then an errant sublength within
the message won't cause any trouble to the whole stream,
but it may make it impossible to interpret one message.
If the two message lengths don't match then you have to
do some checking. If you have a max message length, you
check both values against that. If both are less than
that you would have to proceed with caution.
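
A rough sketch of the scheme described above, under assumptions made up for the example: a one-byte length, the layout len, payload, len, sentinel, a sentinel value of 0xA5 and a maximum payload of 200 bytes. The names encode and decode are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bytes = std::vector<unsigned char>;

constexpr unsigned char SENTINEL = 0xA5;  // hypothetical end-of-message marker
constexpr std::size_t   MAX_LEN  = 200;   // hypothetical maximum payload size

// Assumed layout for this sketch: len, payload[len], len, SENTINEL.
Bytes encode(const Bytes& payload) {
    assert(payload.size() <= MAX_LEN);
    Bytes out;
    out.push_back(static_cast<unsigned char>(payload.size()));
    out.insert(out.end(), payload.begin(), payload.end());
    out.push_back(static_cast<unsigned char>(payload.size()));
    out.push_back(SENTINEL);
    return out;
}

// Accepts a message only if both lengths agree, the length is within the
// maximum, and the sentinel sits where it should. Otherwise it scans forward
// to just past the next sentinel byte and tries again from there.
std::vector<Bytes> decode(const Bytes& s) {
    std::vector<Bytes> msgs;
    std::size_t pos = 0;
    while (pos < s.size()) {
        std::size_t len  = s[pos];
        std::size_t tail = pos + 1 + len;  // index of the second length byte
        bool ok = len <= MAX_LEN && tail + 1 < s.size()
                  && s[tail] == len && s[tail + 1] == SENTINEL;
        if (ok) {
            msgs.emplace_back(s.begin() + pos + 1, s.begin() + tail);
            pos = tail + 2;
        } else {
            while (pos < s.size() && s[pos] != SENTINEL) ++pos;
            ++pos;  // resume right after the sentinel (or stop at the end)
        }
    }
    return msgs;
}
```

With three messages in a row and the first length byte of the second one corrupted, decode returns the first and third. As noted above, the sentinel value can also occur inside a payload, so a scan may land on a false positive; the length and sentinel checks on the next attempt are what catch most of those.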

>
> Now, if you used an encoding with a few unused symbols, you can use
> those symbols to add synchronization markers (records, whatever), and
> even if a length field is bad, you maybe lost a record but not the whole
> remainder of the stream.
>
> On unreliable networks, I take that any day over the size advantage of
> raw binary. Of course, this is not about text vs binary, this is about
> whether raw binary is the best choice for unreliable networks. It isn't.


Just saying "it isn't" doesn't convince me.


>
> If you want both (speed and reliability), you'd create a custom encoding
> that leaves only a few symbols unused that you then can use for syncing.
> But raw binary is not a good choice over unreliable networks.
>
> And I still think that this has nothing to do with SUVs. How many people
> do you know that are wishing they never had used a text protocol? How
> many are there wishing they never had used raw binary over an unreliable
> network link?


I don't know any in either of those two categories.
Some predicted spiking oil prices 10 years ago and
they based their decisions on those predictions.
Something similar may happen with bandwidth prices.


Brian Wood
Ebenezer Enterprises
http://webEbenezer.net


I read today of a man who was fired for saying,
"I think homosexuality is bad stuff."
http://www.wnd.com/index.php?fa=PAGE.view&pageId=114779
I agree with him - it is bad stuff.


Old 11-03-2009, 07:09 PM   #49
Rune Allnor
 
Re: Binary file IO: Converting imported sequences of chars to desired type
On 3 Nov, 00:39, Brian <c...@mailvault.com> wrote:
> This is interesting --
>
> http://stackoverflow.com/questions/1...ization-perfor...
>
> M. Troyer, who I think is still around the Boost list,
> considered using binary to be "essential."
>
> http://lists.boost.org/Archives/boost/2002/11/39601.php
>
> I'm not sure if those participating in this thread
> come from a scientific application background as Troyer
> does.


I used to be involved with seismic data processing. About 12
years ago, the company I worked for got the first TByte disk
stack nationwide. Before that time, the guys who went offshore
came back with truckloads of Exabyte tapes. Just loading the
tapes to the disk drives took weeks.

The application I'm working with has to do with bathymetry
map processing. 'Bathymetry' just means 'underwater terrain',
so the end product is a map of the sea floor.

There are huge amounts of data flowing through (I wouldn't
be surprised if present-day 'simple' mapping tasks are comparable
to late-'80s seismic processing, as far as computational throughput
is concerned), and the job is essentially real-time: A directive
to discontinue present survey activities might be received at any
time (surveying is done from general-purpose vessels), in which
case the vessel and crew need to shut down all activities and
switch focus to whatever assignment is coming up, in a matter
of minutes or hours. At best one might accept a couple of hours'
latency on the processed result after a new batch of survey
data is available, but that's it. Since any survey can go on
for an indefinite length of time, one needs to be able to process
each data batch faster than it took to measure, or one will
accumulate backlog.

The processing is done in multiple stages, so one just can't
wait for text-based file IO to complete. Those who base their
data flow on text files are not able to complete even the
shortest survey processing within the time it takes to survey
the data - which is the essential aspect of a real-time operation.
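
For readers who want to see what binary IO means in the sense of this thread's subject line, here is a minimal sketch: doubles are written as raw bytes and the imported chars are copied back into doubles with memcpy, with no parsing or formatting step. The function names are made up, and the format is assumed to be the host's own representation, so it is not portable between machines with different byte order or floating-point formats. No timing figures are implied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Writes the samples as raw bytes in the host's own representation.
void write_binary(std::ostream& out, const std::vector<double>& samples) {
    if (samples.empty()) return;
    out.write(reinterpret_cast<const char*>(samples.data()),
              static_cast<std::streamsize>(samples.size() * sizeof(double)));
}

// Reads up to `count` samples: the bytes are imported as chars first and
// then copied into doubles with memcpy, which avoids aliasing problems.
// Returns only as many samples as were completely read.
std::vector<double> read_binary(std::istream& in, std::size_t count) {
    std::vector<char> raw(count * sizeof(double));
    in.read(raw.data(), static_cast<std::streamsize>(raw.size()));
    std::size_t got = static_cast<std::size_t>(in.gcount()) / sizeof(double);
    std::vector<double> samples(got);
    if (got != 0)
        std::memcpy(samples.data(), raw.data(), got * sizeof(double));
    return samples;
}
```

With a real file you would open the stream with std::ios::binary; a std::stringstream opened the same way is enough to check that a round trip gives back exactly the values that were written.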

Rune


Old 11-03-2009, 08:14 PM   #50
Gerhard Fiedler
 
Re: Binary file IO: Converting imported sequences of chars to desired type
Brian wrote:

>> And I still think that this has nothing to do with SUVs. How many
>> people do you know that are wishing they never had used a text
>> protocol? How many are there wishing they never had used raw binary
>> over an unreliable network link?

>
> I don't know any in either of those two categories.


Wasn't it you who wrote "People wouldn't be wishing they had never
bought an SUV if that were the case", while using the analogy of text
format and SUVs? I thought you'd know at least "people" who wished they
had used binary -- if not, how do you get to the analogy in the first
place?

> Some predicted spiking oil prices 10 years ago and they based their
> decisions on those predictions. Something similar may happen with
> bandwidth prices.


Right, may. In general, when programming, I don't base my decisions on
such "predictions". If you take all the predictions made, you probably
get more misses than hits. I tend to try to get more hits than
misses when programming... this is better for the near-term financial
situation, and I can know this without making any shaky predictions.
Gerhard

