Interplatform (interprocess, interlanguage) communication

 
 
Lew
 
      02-08-2012
BGB wrote:
> ...
> an example is this:
> <foo> <bar value="3"/> </foo>
> and:
> (foo (bar 3))
>
> now, consider one wants to add a new field to 'foo' (say 'ln').
> <foo ln="15"> <bar value="3"/> </foo>
> and:
> (foo 15 (bar 3))
>
> a difference here is that existing code will probably not even notice
> the new XML attribute, whereas the positional nature of most


Ahem. You mean other than failing schema validation?

> S-Expressions makes the latter far more likely to break something (and


More likely than failing schema validation was for that well-designed XML-based
application?

> there is no good way to "annotate" an S-Exp, whereas with XML it is
> fairly solidly defined that one can simply add new attributes).


Attributes in XML are not annotation (with or without quotes). That role is filled by the actual 'annotation' element
http://www.w3schools.com/schema/el_annotation.asp

> note: my main way of working with XML is typically via DOM-style
> interfaces (if I am using it, it is typically because I am directly
> working with the data structure, and not as the result of some dumb-ass
> "data binding" crud...).


Sorry, "dumb-ass 'data-binding' crud"?

Why the extreme pejoratives? I would not say that there's anything wrong with
XML data-binding /per se/, although as with document-oriented approaches it
can be done very badly.

> typically, the "internal representation" and "concrete serialization"
> are different:


I don't understand what you mean here. You cite these terms in quotes as though
they are a standard terminology for some specific things, but use them in their
ordinary meaning. The internal representation of what? The serialization
("concrete" or otherwise) of what? I don't mean to be obtuse here, but I am not
grokking the referents.

> I may use a textual XML serialization, or just as easily, I could use a
> binary format;
> likewise for S-Exps (actually, I probably far more often represent
> S-Exps as a binary format of one form or another than I use them in a
> form externally serialized as text).
>
> all hail the mighty DOM-node or CONS-cell...


WTF?

--
Lew
 
 
 
 
 
BGB
 
      02-09-2012
On 2/8/2012 3:02 PM, Lew wrote:
> BGB wrote:
>> ...
>> an example is this:
>> <foo> <bar value="3"/> </foo>
>> and:
>> (foo (bar 3))
>>
>> now, consider one wants to add a new field to 'foo' (say 'ln').
>> <foo ln="15"> <bar value="3"/> </foo>
>> and:
>> (foo 15 (bar 3))
>>
>> a difference here is that existing code will probably not even notice
>> the new XML attribute, whereas the positional nature of most

>
> Ahem. You mean other than failing schema validation?
>


many of us don't use schemas with our XML.

I think the issue is that one particular technology, XML, is used in
significantly different ways by different people and for different reasons.

many people use XML for data-binding, and many other people who use it
couldn't care less about data-binding.


some people may use XML for similar purposes to how people using Lisp
would use lists (never-mind if this is kind of awkward, it does work).

like, doing Lisp type stuff in Java using DOM-nodes in place of
cons-based lists... +1 now that Java also (sort of) has closures.


>> S-Expressions makes the latter far more likely to break something (and

>
> More likely than failing schema validation was for that well-designed XML-based
> application?
>


as noted, many people neither use schemas nor any sort of schema
validation. in many use-cases, schemas are overly constraining to the
ability of using XML to represent free-form data, or using them
otherwise would offer little particular advantage.

say, if one is using XML for compiler ASTs or similar (say, the XML is
used to represent a just-parsed glob of source-code), do they really
need any sort of schema?

http://en.wikipedia.org/wiki/Abstract_syntax_tree


>> there is no good way to "annotate" an S-Exp, whereas with XML it is
>> fairly solidly defined that one can simply add new attributes).

>
> Attributes in XML are not annotation (with or without quotes). That role is filled by the actual 'annotation' element
> http://www.w3schools.com/schema/el_annotation.asp
>


they can be used for annotating the nodes in many sane use cases...

a lot depends on how one is using the XML in a given context.


>> note: my main way of working with XML is typically via DOM-style
>> interfaces (if I am using it, it is typically because I am directly
>> working with the data structure, and not as the result of some dumb-ass
>> "data binding" crud...).

>
> Sorry, "dumb-ass 'data-binding' crud"?
>
> Why the extreme pejoratives? I would not say that there's anything wrong with
> XML data-binding /per se/, although as with document-oriented approaches it
> can be done very badly.
>


yeah, this may have been stated overly strongly.

personally, IMO, data-binding is probably one of the worse and
technically more pointless ways of using XML (as, IMO, it leads to such
similarly ill-designed technologies as SOAP and similar...).

not that data-binding is itself necessarily pointless, but doing
it via overly verbose namespace-ridden XML is probably one of the worse
ways of doing it (vs either specialized file-formats, or the use of
binary data-binding formats, which IMO should also not be used for data
interchange).


admittedly, I also partly dislike traditional ways of using data-binding
as it often ties things which are theoretically internal to the app,
namely structural data representation (via classes/...), to things
which should theoretically be isolated from the internal data
representation: file formats.

or, IOW: a file-format (or protocol/...) should express the data in
itself, and not express how it is physically represented within the
application.

likewise, data going into or coming out of a piece of code should be
ideally documented and defined in a form separate from the component in
question.

otherwise, data-binding is not that much different than a more modern
variant of writing raw structures and arrays to files.


>> typically, the "internal representation" and "concrete serialization"
>> are different:

>
> I don't understand what you mean here. You cite these terms in quotes as though
> they are a standard terminology for some specific things, but use them in their
> ordinary meaning. The internal representation of what? The serialization
> ("concrete" or otherwise) of what? I don't mean to be obtuse here, but I am not
> grokking the referents.
>


the internal representation of the data within the application code.

if one knows which objects or classes exist, what sorts of members they
contain, ... then one is essentially exposing data which should not be
visible, or for that matter relied upon for data interchange (or, for
that matter, relevant).

ideally, any data represented externally should be defined in terms of
its semantics: something will be present if it is relevant to the
meaning of the data. the serialization will then be defined in terms of
expressing the structure and semantics of the data, which may bear very
little resemblance to how the data is represented in the actual
classes/arrays/whatever which make up how the data is represented
internally to the application.

similarly, file formats should be as much abstracted from the
application code as is reasonably possible, with a "concrete"
specification for the file-format or data-representation being written
instead.


both XML and S-Expressions can be used as structured ways of
representing semantics, rather than as ways of representing the contents
of a given data-object.


>> I may use a textual XML serialization, or just as easily, I could use a
>> binary format;
>> likewise for S-Exps (actually, I probably far more often represent
>> S-Exps as a binary format of one form or another than I use them in a
>> form externally serialized as text).
>>
>> all hail the mighty DOM-node or CONS-cell...

>
> WTF?
>


DOM nodes can be very powerful (and are probably a much better way of
using XML than using it as some sort of data-binding thing).


cons-cells are pairs of dynamically-typed values, typically called "car"
and "cdr" and used to implement lists and similar (and are the main
building block of "everything" in languages like Lisp and Scheme, well,
along with "symbols" and "fixnums" and similar).

http://en.wikipedia.org/wiki/Cons_cell

they can also be implemented in C, C++, and Java without too much
trouble, and can be a fairly useful way of building various sorts of
data structures (although, sadly, they aren't nearly as efficient in
Java as they could be, but OTOH it is also sort of a pain to build a
dynamic type-system in C, so it probably evens out...).

then one can proceed to build logic based mostly on building and
processing lists.

or, conceptually, they can be regarded as a type of linked-list based
containers, however the ways they are traditionally used are
significantly different from traditional ways of using containers (they
are typically used as ways of building tree-structures, rather than
usually as ways of storing a collection of items).
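
As a rough illustration of the cons cells described above, here is a minimal Java sketch. The class and method names (`Cons`, `list`, `show`) are made up for this example and do not come from any library; it builds the `(foo (bar 3))` tree from earlier in the thread out of car/cdr pairs.

```java
/** A minimal cons cell: a pair of dynamically typed values (illustrative sketch only). */
final class Cons {
    final Object car;  // first value of the pair
    final Object cdr;  // rest of the list, or null for the empty list

    Cons(Object car, Object cdr) {
        this.car = car;
        this.cdr = cdr;
    }

    /** Builds a proper list by chaining cons cells from the last item backwards. */
    static Cons list(Object... items) {
        Cons result = null;
        for (int i = items.length - 1; i >= 0; i--) {
            result = new Cons(items[i], result);
        }
        return result;
    }

    /** Renders a value in S-Expression notation. */
    static String show(Object value) {
        if (value == null) return "()";
        if (!(value instanceof Cons)) return String.valueOf(value);
        StringBuilder sb = new StringBuilder("(");
        Object cur = value;
        while (cur instanceof Cons) {
            Cons cell = (Cons) cur;
            sb.append(show(cell.car));
            cur = cell.cdr;
            if (cur != null) sb.append(' ');
        }
        if (cur != null) sb.append(". ").append(show(cur)); // improper list tail
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Cons tree = list("foo", list("bar", 3));
        System.out.println(show(tree)); // prints (foo (bar 3))
    }
}
```

Note that the nested list is simply a cons cell stored in the car of another cell, which is why the same building block serves for both lists and trees.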


it may be worthwhile to look-up information regarding Lisp and Scheme
and similar, not that there is necessarily much reason to actually use
the languages, but there are some ideas and ways of doing things which
can be mapped fairly nicely onto other, more common, languages.

 
 
 
 
 
Arne Vajhøj
 
      02-09-2012
On 2/8/2012 8:49 PM, BGB wrote:
> as noted, many people neither use schemas nor any sort of schema
> validation. in many use-cases, schemas are overly constraining to the
> ability of using XML to represent free-form data, or using them
> otherwise would offer little particular advantage.


xsd:any does provide some flexibility in schemas.
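
For illustration, a minimal sketch of how a wildcard loosens a schema, using the standard `javax.xml.validation` API. The schema text is an assumption written for the `<foo>`/`<bar>` example from this thread, and it uses `xs:anyAttribute` (the attribute counterpart of `xs:any`) because the disputed change was a new attribute. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class WildcardSchemaDemo {
    /** Schema for <foo><bar value="..."/></foo>, optionally open to extra attributes on foo. */
    static String schemaFor(boolean allowExtraAttributes) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='foo'><xs:complexType>"
            + "<xs:sequence><xs:element name='bar'><xs:complexType>"
            + "<xs:attribute name='value' type='xs:int'/>"
            + "</xs:complexType></xs:element></xs:sequence>"
            + (allowExtraAttributes ? "<xs:anyAttribute processContents='lax'/>" : "")
            + "</xs:complexType></xs:element></xs:schema>";
    }

    /** Returns true if the document validates against the schema. */
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // document rejected by the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String extended = "<foo ln='15'><bar value='3'/></foo>";
        System.out.println("strict schema:   " + isValid(schemaFor(false), extended));
        System.out.println("wildcard schema: " + isValid(schemaFor(true), extended));
    }
}
```

Under the strict variant the added `ln` attribute fails validation; with the wildcard it passes, at the price of the schema no longer saying anything about that attribute.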

> say, if one is using XML for compiler ASTs or similar (say, the XML is
> used to represent a just-parsed glob of source-code), do they really
> need any sort of schema?


I would expect syntax trees to follow certain rules and not be free
form.

Arne


 
 
Arne Vajhøj
 
      02-09-2012
On 2/8/2012 2:07 PM, BGB wrote:
> On 2/8/2012 4:19 AM, Arved Sandstrom wrote:
>> On 12-02-08 04:41 AM, BGB wrote:
>>> note: my main way of working with XML is typically via DOM-style
>>> interfaces (if I am using it, it is typically because I am directly
>>> working with the data structure, and not as the result of some dumb-ass
>>> "data binding" crud...).

>>
>> I haven't been able to completely avoid using the DOM, but I loathe the
>> API. If I'm using XML at all, and JAXB suits, I'll use JAXB. More
>> generally I'll use SAX or StAX.
>>

>
> I have rarely done things for which SAX has made sense...
> usually in cases where SAX would make sense, I end up using
> line-oriented text formats instead (because there is often little
> obvious reason for why XML syntax would make much sense).


Non flat structure and validation comes to mind.

Arne
 
 
BGB
 
      02-09-2012
On 2/8/2012 7:16 PM, Arne Vajhøj wrote:
> On 2/8/2012 2:07 PM, BGB wrote:
>> On 2/8/2012 4:19 AM, Arved Sandstrom wrote:
>>> On 12-02-08 04:41 AM, BGB wrote:
>>>> note: my main way of working with XML is typically via DOM-style
>>>> interfaces (if I am using it, it is typically because I am directly
>>>> working with the data structure, and not as the result of some dumb-ass
>>>> "data binding" crud...).
>>>
>>> I haven't been able to completely avoid using the DOM, but I loathe the
>>> API. If I'm using XML at all, and JAXB suits, I'll use JAXB. More
>>> generally I'll use SAX or StAX.
>>>

>>
>> I have rarely done things for which SAX has made sense...
>> usually in cases where SAX would make sense, I end up using
>> line-oriented text formats instead (because there is often little
>> obvious reason for why XML syntax would make much sense).

>
> Non flat structure and validation comes to mind.
>


fair enough.

often, one can implement non-flat structures with line-oriented formats,
for example:
...
groupDef {
...
groupDef {
itemDef {
...
}
...
}
...
}

a lot of the time this may be combined with cosmetic indentation, but this
does not change that it is a line-oriented format; for example, writing:
groupDef
{
...
}

could very-well break the parser.


typically, I have not used validation:
if there is anything to validate, typically this logic will be placed in
the logic to parse the text.

a lot of times, code operates under the assumption that nearly anything
which can be reasonably done is valid de-facto (the code is written,
however, to ideally not do anything compromising).


granted, typically I don't deal a whole lot with anything "security
critical" or where there is much need to worry about "trust" or
"authorization" or similar (or if privacy or money or similar was
involved...). maybe if security were more of a concern, then added
layers of pedantics and validation would make a lot more sense.

in my typical use-cases, the theoretical worst case would probably be if
a 3rd party could somehow break the app and get control of the users' OS
or similar and cause damage, but again, modern Windows is itself partly
designed to try to defend against this (running applications by default
with constrained privileges, ...).

 
 
Lew
 
      02-09-2012
On Wednesday, February 8, 2012 6:14:31 PM UTC-8, Arne Vajhøj wrote:
> On 2/8/2012 8:49 PM, BGB wrote:
> > as noted, many people neither use schemas nor any sort of schema
> > validation. in many use-cases, schemas are overly constraining to the
> > ability of using XML to represent free-form data, or using them
> > otherwise would offer little particular advantage.

>
> xsd:any does provide some flexibility in schemas.
>
> > say, if one is using XML for compiler ASTs or similar (say, the XML is
> > used to represent a just-parsed glob of source-code), do they really
> > need any sort of schema?

>
> I would expect syntax trees to follow certain rules and not be free
> form.


In one breath we're singing the praises of binary formats, in the next we
complain that XML isn't sufficiently flexible.

"Do they really need any sort of schema?" with XML is usually a "yes".

But only if you're interested in clear, unambiguous, readily-parsable and
maintainable XML document formats.

People often excoriate the supposed verbosity of XML as though it were the only
criterion to measure utility.

There is no inherent advantage of a LISP/list-like format over any other, nor vice versa; it's all accordin'. If the convention is agreeable to all parties,
it will work. If all projects were one-off and isolated from the larger world,
we'd never need to adhere to a standard. If we don't mind inventing our own
tools for anything, we'd never have to adopt a standard with extensive tools
support.

Where are the *real* costs of a software system?

--
Lew
 
 
BGB
 
      02-09-2012
On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
> On 2/8/2012 8:49 PM, BGB wrote:
>> as noted, many people neither use schemas nor any sort of schema
>> validation. in many use-cases, schemas are overly constraining to the
>> ability of using XML to represent free-form data, or using them
>> otherwise would offer little particular advantage.

>
> xsd:any does provide some flexibility in schemas.
>


yep, but one can wonder what is the gain of using a schema if one is
just going to use "xsd:any"?...

it is also a mystery how well EXI behaves in this case (admittedly, I
have not personally looked into EXI in-depth, as I only briefly skimmed
over the spec a long time ago).


>> say, if one is using XML for compiler ASTs or similar (say, the XML is
>> used to represent a just-parsed glob of source-code), do they really
>> need any sort of schema?

>
> I would expect syntax trees to follow certain rules and not be free
> form.
>


well, there are some rules, but the question is more if a schema or the
use of validation would offer much advantage to make using it worth the
bother?...

the other possibility would be to make the next compiler stage, upon
seeing invalid data, give an error message essentially like "what the
hell is this?..." and halt compilation (typically this is what happens
if the compiler logic encounters a node type it doesn't know how to do
anything with in a situation where a known node-type is expected, or if
some required node is absent or similar).


so, one can have a schema to validate, say, that ones' "if" node looks like:
<if>
<cond> expr </cond>
<then> statement </then>
<else> statement </else>
</if>

but, OTOH, if upon getting back a null node when looking for "cond" or
"then", it causes an internal-error message to get displayed, it is the
same effect. even if it just ungracefully tries to use the null and
causes the program to crash, it is probably still not a huge loss (apart
from the annoyance that is a crash-prone compiler...).
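
A rough sketch of the "check it in the next compiler stage" approach described above, using the standard DOM API. The helper names (`child`, `require`, `describeIf`) and the error text are hypothetical; a real compiler stage would do far more than build a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class IfNodeCheck {
    /** Returns the first direct child element with the given tag name, or null. */
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Fetches a required child, or fails with an internal-error style message. */
    static Element require(Element parent, String name) {
        Element found = child(parent, name);
        if (found == null) {
            throw new IllegalStateException(
                "internal error: <" + parent.getNodeName() + "> is missing <" + name + ">");
        }
        return found;
    }

    /** Parses a small XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    /** Handles an <if> node without any schema: cond and then are required, else is optional. */
    static String describeIf(Element ifNode) {
        String cond = require(ifNode, "cond").getTextContent().trim();
        String then = require(ifNode, "then").getTextContent().trim();
        Element elseNode = child(ifNode, "else");
        return "if " + cond + " then " + then
            + (elseNode == null ? "" : " else " + elseNode.getTextContent().trim());
    }

    public static void main(String[] args) {
        System.out.println(describeIf(parse("<if><cond> x </cond><then> a </then></if>")));
        describeIf(parse("<if><cond> x </cond></if>")); // throws: <if> is missing <then>
    }
}
```

The structural rule lives in the consuming code instead of a schema, so the failure shows up only when that node is actually processed.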



I think the original point though was more about XML vs S-Expressions in
this case though:
XML allows more easily just stuffing-in new tags or contents for
existing tag-types, if this makes sense (it doesn't necessarily break
existing code or structures, and actually, protocols like XMPP make use
of this property fairly directly). for S-Exps, which are often
essentially positional, this is much less nice, and will often include needing more
node-types to deal with the presence or absence of certain features
(whereas with XML one can use different logic based on whether or not
certain attributes or tags are present or absent).
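
To make the attribute-versus-position point concrete, here is a small Java sketch, assuming no schema validation is in play. The XML reader uses the standard DOM API; the S-Exp side is modelled with plain `java.util.List` values, which is only a stand-in for a real list representation. All names are made up for the example.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class AttributeVsPosition {
    /** Existing XML reader: finds <bar> by name and ignores anything else on <foo>. */
    static int barFromXml(String xml) {
        try {
            Element foo = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            Element bar = (Element) foo.getElementsByTagName("bar").item(0);
            return Integer.parseInt(bar.getAttribute("value"));
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Existing list reader: assumes (foo (bar N)), with (bar N) at index 1. */
    static int barFromList(List<?> foo) {
        List<?> bar = (List<?>) foo.get(1); // fails if index 1 is no longer a list
        return (Integer) bar.get(1);
    }

    public static void main(String[] args) {
        // Adding ln="15" does not disturb the name-based lookup.
        System.out.println(barFromXml("<foo><bar value='3'/></foo>"));
        System.out.println(barFromXml("<foo ln='15'><bar value='3'/></foo>"));

        // Inserting 15 at position 1 shifts (bar 3) and breaks the positional reader.
        System.out.println(barFromList(Arrays.asList("foo", Arrays.asList("bar", 3))));
        try {
            barFromList(Arrays.asList("foo", 15, Arrays.asList("bar", 3)));
        } catch (ClassCastException e) {
            System.out.println("positional reader broke: " + e.getMessage());
        }
    }
}
```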

granted, it does still leave the possibility that one could structure
things more loosely (with S-Exps), say, rather than:
( if /cond/ /then/ /else/ )
one has:
( if (cond /cond/ ) (then /then/ ) (else /else/ ) )

so, gaining a little more flexibility at the cost of a little more
verbosity, which is possibly a reasonable point one could argue (my
client/server frame-delta protocol works more like this, typically using
marker tags before everything in place of lots of fixed argument lists,
although fixed-lists are used in many places as well).
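
The tagged form can be read by name rather than by position. A minimal sketch, again modelling S-Exps as nested `java.util.List` values and assuming every form starts with its head symbol; `field` is a hypothetical helper, not part of the frame-delta protocol mentioned above.

```java
import java.util.Arrays;
import java.util.List;

final class TaggedForm {
    /** Finds the sub-list whose head symbol equals tag, e.g. (then ...); null if absent. */
    static List<?> field(List<?> form, String tag) {
        // Index 0 is the form's own head symbol, so start at 1.
        for (Object item : form.subList(1, form.size())) {
            if (item instanceof List) {
                List<?> sub = (List<?>) item;
                if (!sub.isEmpty() && tag.equals(sub.get(0))) {
                    return sub;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // ( if (ln 15) (cond x) (then a) ) : the extra (ln 15) does not shift anything.
        List<?> form = Arrays.asList("if",
            Arrays.asList("ln", 15),
            Arrays.asList("cond", "x"),
            Arrays.asList("then", "a"));
        System.out.println(field(form, "cond")); // prints [cond, x]
        System.out.println(field(form, "else")); // prints null
    }
}
```

A missing `else` simply comes back as null, so optional parts cost one lookup instead of a separate node type.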


trivia: the frame-delta protocol was originally intended to be
XML-based, but I switched out to S-Expressions at the last minute (just
prior to actually implementing it) mostly on the ground that S-Exps
would have been less effort (and I didn't feel like jerking off with the
added awkwardness using XML would bring at the moment).

a funny irony would be if someone were to devise some sort of schema
system and use it to try to validate their S-Expressions.

it is still an open question as to which is ultimately "better", as each
has strengths, and mostly seems to boil down to a tradeoff between
flexibility and ease-of-use.

 
 
BGB
 
      02-09-2012
On 2/8/2012 9:07 PM, Lew wrote:
> On Wednesday, February 8, 2012 6:14:31 PM UTC-8, Arne Vajhøj wrote:
>> On 2/8/2012 8:49 PM, BGB wrote:
>>> as noted, many people neither use schemas nor any sort of schema
>>> validation. in many use-cases, schemas are overly constraining to the
>>> ability of using XML to represent free-form data, or using them
>>> otherwise would offer little particular advantage.

>>
>> xsd:any does provide some flexibility in schemas.
>>
>>> say, if one is using XML for compiler ASTs or similar (say, the XML is
>>> used to represent a just-parsed glob of source-code), do they really
>>> need any sort of schema?

>>
>> I would expect syntax trees to follow certain rules and not be free
>> form.

>
> In one breath we're singing the praises of binary formats, in the next we
> complain that XML isn't sufficiently flexible.
>


it is not like one can't have both:
have a format which is at the same time a compressed binary format,
and can also retain the full flexibility of representing free-form XML
semantics, ideally without a major drop in compactness (this happens
with WBXML, and IIRC should also happen with EXI about as soon as one
starts encoding nodes which lie outside the schema).

this is partly why I was advocating a sort of pattern-building adaptive
format: it can build the functional analogue of a schema as it encodes
the data, and likewise does not depend on a schema to properly decode
the document. it is mostly a matter of having the format predict when it
doesn't need to specify tag and attribute names (it is otherwise similar
to a traditional data-compressor).

this is functionally similar to the sliding-window as used in deflate
and LZMA (7zip) and similar (in contrast to codebook-based data
compressors). functionally, it would have a little more in common with
LZW+MTF than with LZ77 though.
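
As a very loose illustration of the idea (not the actual format discussed here, whose details are not given), the sketch below applies move-to-front coding to a stream of tag names: a name is written out literally the first time it appears, and afterwards only as a small index into a list of recently used names. The decoder rebuilds the same list as it goes, so no schema or codebook has to be shipped. The class name and token representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TagNameMtf {
    /** Encodes tag names as literals (first sighting) or move-to-front indices. */
    static List<Object> encode(List<String> tags) {
        List<String> recent = new ArrayList<>();
        List<Object> out = new ArrayList<>();
        for (String tag : tags) {
            int idx = recent.indexOf(tag);
            if (idx < 0) {
                out.add(tag);        // unseen name: emit it literally
            } else {
                out.add(idx);        // seen before: emit its current position
                recent.remove(idx);  // remove(int), so the name can move to the front
            }
            recent.add(0, tag);
        }
        return out;
    }

    /** Reverses encode() by maintaining the same recently-used list. */
    static List<String> decode(List<Object> tokens) {
        List<String> recent = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (Object token : tokens) {
            String tag;
            if (token instanceof Integer) {
                int idx = (Integer) token;
                tag = recent.remove(idx);
            } else {
                tag = (String) token;
            }
            recent.add(0, tag);
            out.add(tag);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("face", "face", "brush", "face");
        List<Object> encoded = encode(tags);
        System.out.println(encoded);         // prints [face, 0, brush, 1]
        System.out.println(decode(encoded)); // prints [face, face, brush, face]
    }
}
```

In a real binary format the indices would then be packed into a few bits or fed to an entropy coder; that part is left out here.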

granted, potentially a binary format could incorporate both support for
schemas and the use of adaptive compression.


is XML really the text, or is it actually the structure?
I had operated under the premise that it was the data-structure (tags,
attributes, namespaces, ...), which allows for pretty much anything
which can faithfully encode the structure (without imposing too many
arbitrary restrictions).


> "Do they really need any sort of schema?" with XML is usually a "yes".
>
> But only if you're interested in clear, unambiguous, readily-parsable and
> maintainable XML document formats.
>


fair enough, I have mostly been using it "internally", and as noted, for
some of my file-formats, I had used a custom binary coded variant
(roughly similar to WBXML, but generally more compact and supporting
more features, such as namespaces and similar, which I had called SBXE).
it didn't make use of schemas, and worked by simply encoding the tag
structure into the file, and using basic contextual modeling strategies.

it also compared favorably with XML+GZ in my tests (which IIRC was also
generally smaller than WBXML). remotely possible would also be XML+BZip2
or XML+LZMA.
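
For anyone wanting to repeat the XML+GZ side of such a comparison, a minimal sketch using `java.util.zip.GZIPOutputStream` follows. The sample document is an invented, highly repetitive one, so the ratio it shows says nothing about real data or about the custom format described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class XmlGzipSize {
    /** Compresses a byte array with gzip and returns the compressed bytes. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // stream is closed (and flushed) by now
    }

    /** Builds an artificial XML document with the given number of identical faces. */
    static String sampleDocument(int faces) {
        StringBuilder sb = new StringBuilder("<brush>\n");
        for (int i = 0; i < faces; i++) {
            sb.append("<face plane=\"1 0 0 16\" texture=\"brick/mybrick\"")
              .append(" sdir=\"0 1 0 0\" tdir=\"0 0 1 0\"/>\n");
        }
        return sb.append("</brush>\n").toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleDocument(200).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:  " + raw.length);
        System.out.println("gzip bytes: " + packed.length);
    }
}
```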


I had considered the possibility of a more "advanced" format (with more
advanced predictive modeling), but didn't bother (couldn't see much
point at the time in trying to shave off more bytes, as it
was already working fairly well).


> People often excoriate the supposed verbosity of XML as though it were the only
> criterion to measure utility.
>


well, a lot depends...

for disk files, really, who cares?...
for a link where a several kB message might only take maybe 250-500ms
and is at typical "user-interaction" speeds (say, part of a generic "web
app"), likewise, who cares?...


it may matter a little more in a 3D interactive world where everything
going on in the visible scene has to get through at a 10Hz or 24Hz
clock-tick, and if the connection bogs down the user will be rather
annoyed (as their game world has essentially stalled).

one may have to make do with about 16-24kB/s (or maybe less) to better
ensure a good user experience (there is little to say that the user has a
perfect internet connection either).

so, some sort of compression may be needed in this case.
(yes, XML+GZ would probably be sufficient).

if it were dial-up, probably no one would even consider using XML for
the network protocol in a 3D game.


> There is no inherent advantage of a LISP/list-like format over any other, nor vice versa; it's all accordin'. If the convention is agreeable to all parties,
> it will work. If all projects were one-off and isolated from the larger world,
> we'd never need to adhere to a standard. If we don't mind inventing our own
> tools for anything, we'd never have to adopt a standard with extensive tools
> support.
>


it is possible, it all depends.

a swaying factor in my last choice was the effort tradeoff of writing
the code (because working with DOM is kind of a pain...). IIRC, I may
have also been worrying about performance (mostly passing around lots of
numeric data as ASCII strings, ...).

but, I may eventually need to throw together a basic encoding scheme for
this case (a binary encoder for list-based data), that or just reuse an
existing data serializer of mine (mostly intended for generic data
serialization, which supports lists). it lacks any sort of prediction or
context modeling though, and is used in my stuff mostly as a container
format for bytecode for my VM and similar.


> Where are the *real* costs of a software system?
>


who knows?...

probably delivering the best reasonable user experience?...

for a game:
reasonably good graphics;
reasonably good performance (ideally, consistently over 30fps);
hopefully good gameplay, plot, story, ...

well, that and "getting everything done" (this is the hard one).

 
 
Arved Sandstrom
 
      02-09-2012
On 12-02-08 10:50 PM, BGB wrote:
> On 2/8/2012 7:16 PM, Arne Vajhøj wrote:
>> On 2/8/2012 2:07 PM, BGB wrote:
>>> On 2/8/2012 4:19 AM, Arved Sandstrom wrote:
>>>> On 12-02-08 04:41 AM, BGB wrote:
>>>>> note: my main way of working with XML is typically via DOM-style
>>>>> interfaces (if I am using it, it is typically because I am directly
>>>>> working with the data structure, and not as the result of some
>>>>> dumb-ass
>>>>> "data binding" crud...).
>>>>
>>>> I haven't been able to completely avoid using the DOM, but I loathe the
>>>> API. If I'm using XML at all, and JAXB suits, I'll use JAXB. More
>>>> generally I'll use SAX or StAX.
>>>>
>>>
>>> I have rarely done things for which SAX has made sense...
>>> usually in cases where SAX would make sense, I end up using
>>> line-oriented text formats instead (because there is often little
>>> obvious reason for why XML syntax would make much sense).

>>
>> Non flat structure and validation comes to mind.
>>

>
> fair enough.
>
> often, one can implement non-flat structures with line-oriented formats,
> for example:
> ...
> groupDef {
> ...
> groupDef {
> itemDef {
> ...
> }
> ...
> }
> ...
> }

[ SNIP ]

No need for the braces, if you're going to use those all you gain over
the XML is terseness.

Consider line-oriented files/messages like .properties files: these can
describe hierarchical structures perfectly well if you've got an
understood key=value syntax, specifically with a hierarchy-supporting
syntax for the keys. Easy to read and edit, easy to parse.
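
A small sketch of that idea with `java.util.Properties`, assuming dotted keys as the hierarchy-supporting syntax. The key names are invented for the example (they are not log4j's), and `subtree` is a hypothetical helper.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class HierarchicalProperties {
    /** Loads key=value lines from a string. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the entries below "prefix." with the prefix stripped, i.e. one subtree. */
    static Map<String, String> subtree(Properties props, String prefix) {
        Map<String, String> result = new TreeMap<>();
        String dotted = prefix + ".";
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(dotted)) {
                result.put(key.substring(dotted.length()), props.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = load(
            "group.name=outer\n"
            + "group.item.width=16\n"
            + "group.item.height=8\n");
        System.out.println(subtree(props, "group.item")); // prints {height=8, width=16}
        System.out.println(subtree(props, "group"));
    }
}
```

Each line stands alone, so the file stays line-oriented even though the keys describe a tree.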

As an example take a look at log4j .properties and XML configuration
files. All you gain with the XML is the ability to validate against a
log4j DTD.

> a lot of times, code operates under the assumption that nearly anything
> which can be reasonably done is valid de-facto (the code is written,
> however, to ideally not do anything compromising).
>
> granted, typically I don't deal a whole lot with anything "security
> critical" or where there is much need to worry about "trust" or
> "authorization" or similar (or if privacy or money or similar was
> involved...). maybe if security were more of a concern, then added
> layers of pedantics and validation would make a lot more sense.
>
> in my typical use-cases, the theoretical worst case would probably be if
> a 3rd party could somehow break the app and get control of the users' OS
> or similar and cause damage, but again, modern Windows is itself partly
> designed to try to defend against this (running applications by default
> with constrained privileges, ...).
>

This is a narrow view of application security. Unless you're writing toy
apps, one would expect that your apps are doing *something*, and that
something includes access to databases or files or other resources.
Furthermore, if your app is used by anyone other than yourself, another
asset is in play, and that's your personal, team's or business's
reputation.

Privacy-sensitive data, or financial data, doesn't have to be involved,
and you don't need the actions of a malicious third party, in order to
have an application security problem. If your code is such that it
corrupts any persistent data, say, or is seriously under-performant
under load, or intermittently breaks and the app has to be re-started,
you've managed to trample all over the Integrity [1] and Availability
security attributes of CAI (Confidentiality, Availability,
Integrity)...all without the help of any malicious external threats.

Do you think your users care who or what mangled part of the
organizational data, or who or what is responsible for 20 percent
downtime? Some of your stakeholders will, sure, when culprits are being
sought, but most of your users will just care about proper function.

All application security starts with good coding. That's why so much of
standards like the Java Secure Coding Guidelines, or OWASP
Development/Code Review/Testing guides, have to do with good coding. And
I don't believe you can really relax your standards with some apps and
have high standards in another.

AHS

1. Strictly speaking not an integrity violation if you can detect the
unintended data corruption, ideally know what caused it, and even better
repair it, but in practice once the damage is done you often
*effectively* can't easily recover; the effort of detecting and fixing
is itself punitive.
--
....wherever the people are well informed they can be trusted with their
own government...
-- Thomas Jefferson, 1789
 
 
BGB
 
      02-09-2012
On 2/9/2012 3:24 AM, Arved Sandstrom wrote:
> On 12-02-08 10:50 PM, BGB wrote:
>> On 2/8/2012 7:16 PM, Arne Vajhøj wrote:
>>> On 2/8/2012 2:07 PM, BGB wrote:
>>>> On 2/8/2012 4:19 AM, Arved Sandstrom wrote:
>>>>> On 12-02-08 04:41 AM, BGB wrote:
>>>>>> note: my main way of working with XML is typically via DOM-style
>>>>>> interfaces (if I am using it, it is typically because I am directly
>>>>>> working with the data structure, and not as the result of some
>>>>>> dumb-ass
>>>>>> "data binding" crud...).
>>>>>
>>>>> I haven't been able to completely avoid using the DOM, but I loathe the
>>>>> API. If I'm using XML at all, and JAXB suits, I'll use JAXB. More
>>>>> generally I'll use SAX or StAX.
>>>>>
>>>>
>>>> I have rarely done things for which SAX has made sense...
>>>> usually in cases where SAX would make sense, I end up using
>>>> line-oriented text formats instead (because there is often little
>>>> obvious reason for why XML syntax would make much sense).
>>>
>>> Non flat structure and validation comes to mind.
>>>

>>
>> fair enough.
>>
>> often, one can implement non-flat structures with line-oriented formats,
>> for example:
>> ...
>> groupDef {
>> ...
>> groupDef {
>> itemDef {
>> ...
>> }
>> ...
>> }
>> ...
>> }

> [ SNIP ]
>
> No need for the braces, if you're going to use those all you gain over
> the XML is terseness.
>


well, if the format is still line-oriented, one can still parse the
files using a loop, getting and splitting strings, and checking the
first token of each line.
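
A minimal sketch of such a loop in Java, assuming the `name {` ... `}` convention from the earlier groupDef example: each line is split on whitespace and only its first token (plus a trailing `{`) decides what happens. The class names are made up, and real formats would need quoting rules and better error reporting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class LineFormatParser {
    /** One "name { ... }" block: nested groups plus the plain lines inside it. */
    static final class Group {
        final String name;
        final List<Group> groups = new ArrayList<>();
        final List<String[]> entries = new ArrayList<>();
        Group(String name) { this.name = name; }
    }

    static Group parse(String text) {
        Group root = new Group("<root>");
        Deque<Group> stack = new ArrayDeque<>();
        stack.push(root);
        for (String line : text.split("\n")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens[0].isEmpty()) continue;               // blank line
            if (tokens[0].equals("}")) {                     // close the current group
                if (stack.size() == 1) throw new IllegalArgumentException("unbalanced '}'");
                stack.pop();
            } else if (tokens.length == 2 && tokens[1].equals("{")) {
                Group g = new Group(tokens[0]);              // "name {" opens a group
                stack.peek().groups.add(g);
                stack.push(g);
            } else {
                stack.peek().entries.add(tokens);            // ordinary "key value..." line
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("missing '}'");
        return root;
    }

    public static void main(String[] args) {
        Group root = parse("groupDef {\nwidth 16\ngroupDef {\nitemDef {\nname a\n}\n}\n}\n");
        Group outer = root.groups.get(0);
        System.out.println(outer.name + " has " + outer.entries.size() + " entry and "
            + outer.groups.size() + " nested group");
    }
}
```

Putting the opening brace on its own line is not recognised as opening a group here, so the later `}` is reported as unbalanced, which is the kind of breakage mentioned earlier in the thread.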

parsing XML is a little more involved, since:
items may be split across lines, or multiple items may exist on the same
line;
one can no longer use whitespace or commas as the primary delimiter;
...

granted, yes, one can use SAX or similar, but alas...

one can wonder though, what really would be the gain of using XML syntax
in many such cases, vs the typical "relative niceness" of a line
oriented format.

like, say I have a format which looks like:
{
"classname" "func_door"
"angle" "-1"
...
{
[ 1 0 0 16 ] brick/mybrick [ 0 1 0 0 ] [ 0 0 1 0 ]
[ -1 0 0 16 ] brick/mybrick [ 0 1 0 0 ] [ 0 0 1 0 ]
[ 0 1 0 16 ] brick/mybrick [ 1 0 0 0 ] [ 0 0 1 0 ]
[ 0 -1 0 16 ] brick/mybrick [ 1 0 0 0 ] [ 0 0 1 0 ]
[ 0 0 1 16 ] brick/mybrick [ 1 0 0 0 ] [ 0 1 0 0 ]
[ 0 0 -1 16 ] brick/mybrick [ 1 0 0 0 ] [ 0 1 0 0 ]
}
}

would it really look much better as:
<entity>
<field var="classname" value="func_door"/>
<field var="angle" value="-1"/>
...
<brush>
<face plane="1 0 0 16" texture="brick/mybrick" sdir="0 1 0 0" tdir="0 0 1 0"/>
...
</brush>
</entity>

even despite the parser being more generic, and it being better labeled
what everything is, is it really an improvement WRT, say, readability?...
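
for comparison, a minimal sketch of reading the XML form with SAX, using the JDK's `javax.xml.parsers` and `org.xml.sax` classes. the class name `EntitySaxReader` and its `fields` / `planes` members are hypothetical, and only the `field` and `face` elements from the example above are handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects <field> key/value pairs and <face> planes.
class EntitySaxReader extends DefaultHandler {
    final Map<String, String> fields = new LinkedHashMap<>();
    final List<double[]> planes = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("field")) {
            fields.put(atts.getValue("var"), atts.getValue("value"));
        } else if (qName.equals("face")) {
            String plane = atts.getValue("plane");
            if (plane != null) planes.add(toNumbers(plane));
        }
    }

    // The attribute value is still a space-separated string, so it gets split here.
    private static double[] toNumbers(String s) {
        String[] parts = s.trim().split("\\s+");
        double[] out = new double[parts.length];
        for (int i = 0; i < parts.length; i++) out[i] = Double.parseDouble(parts[i]);
        return out;
    }

    // Parses an XML string; checked parser exceptions are wrapped for brevity.
    static EntitySaxReader read(String xml) {
        EntitySaxReader handler = new EntitySaxReader();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse entity XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        EntitySaxReader r = read("<entity><field var=\"angle\" value=\"-1\"/></entity>");
        System.out.println(r.fields);
    }
}
```

the tokenizing is done by the XML parser here, but the numeric attribute values still need the same split-on-whitespace step as the line-oriented version.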


> Consider line-oriented files/messages like .properties files: these can
> describe hierarchical structures perfectly well if you've got an
> understood key=value syntax, specifically with a hierarchy-supporting
> syntax for the keys. Easy to read and edit, easy to parse.
>


yes, but this defeats your own prior point, namely indirectly asserting
that line-oriented == flat-structure.

point is, one can have hierarchical line-oriented files.
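
a small sketch of the key=value case, assuming dotted keys carry the hierarchy. it uses the standard `java.util.Properties` loader; `PropertyTree` and the `entity.*` keys in `main` are invented for illustration and don't come from any real config format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical tree node built from dotted property keys such as "a.b.c=value".
class PropertyTree {
    final Map<String, PropertyTree> children = new TreeMap<>();
    String value; // stays null when the node only groups children

    static PropertyTree fromProperties(Properties props) {
        PropertyTree root = new PropertyTree();
        for (String key : props.stringPropertyNames()) {
            PropertyTree node = root;
            // Each dot-separated part descends one level, creating nodes as needed.
            for (String part : key.split("\\.")) {
                node = node.children.computeIfAbsent(part, k -> new PropertyTree());
            }
            node.value = props.getProperty(key);
        }
        return root;
    }

    static PropertyTree parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fromProperties(props);
    }

    public static void main(String[] args) {
        PropertyTree t = parse("entity.classname=func_door\nentity.angle=-1\n");
        System.out.println(t.children.get("entity").children.keySet());
    }
}
```

the file stays one-entry-per-line, and the nesting lives entirely in the key names.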


> As an example take a look at log4j .properties and XML configuration
> files. All you gain with the XML is the ability to validate against a
> log4j DTD.
>
>> a lot of times, code operates under the assumption that nearly anything
>> which can be reasonably done is valid de-facto (the code is written,
>> however, to ideally not do anything compromising).
>>
>> granted, typically I don't deal a whole lot with anything "security
>> critical" or where there is much need to worry about "trust" or
>> "authorization" or similar (or if privacy or money or similar was
>> involved...). maybe if security were more of a concern, then added
>> layers of pedantics and validation would make a lot more sense.
>>
>> in my typical use-cases, the theoretical worst case would probably be if
>> a 3rd party could somehow break the app and get control of the users' OS
>> or similar and cause damage, but again, modern Windows is itself partly
>> designed to try to defend against this (running applications by default
>> with constrained privileges, ...).
>>

> This is a narrow view of application security. Unless you're writing toy
> apps, one would expect that your apps are doing *something*, and that
> something includes access to databases or files or other resources.
> Furthermore, if your app is used by anyone other than yourself, another
> asset is in play, and that's your personal, team's or business's
> reputation.
>


"someone steals the user's save-games!", that would be scary, or not
really...

most of the files in a game are generic resource data, but stealing them
is of little concern, and damaging them is more likely to be an
annoyance than an actual threat "oh crap, I might have to reinstall...".


> Privacy-sensitive data, or financial data, doesn't have to be involved,
> and you don't need the actions of a malicious third party, in order to
> have an application security problem. If your code is such that it
> corrupts any persistent data, say, or is seriously under-performant
> under load, or intermittently breaks and the app has to be re-started,
> you've managed to trample all over the Integrity [1] and Availability
> security attributes of the CIA triad (Confidentiality, Integrity,
> Availability)...all without the help of any malicious external threats.
>


typically, crashes are more an annoyance than a major threat.

consider Skyrim: the damn thing can't usually keep going for more than 1
or 2 hours before crashing-to-desktop or similar.

of course, not everyone aspires towards Bethesda levels of stability.


> Do you think your users care who or what mangled part of the
> organizational data, or who or what is responsible for 20 percent
> downtime? Some of your stakeholders will, sure, when culprits are being
> sought, but most of your users will just care about proper function.
>


only likely matters if it is some sort of server-based or business type app.

ok, a game-server crashing could be a bit annoying if one were making
something like an MMORPG or something (like WoW...).


in my case, I am not:
the online play would likely be more for things like user-run deathmatch
servers and similar.


> All application security starts with good coding. That's why so much of
> standards like the Java Secure Coding Guidelines, or OWASP
> Development/Code Review/Testing guides, have to do with good coding. And
> I don't believe you can really relax your standards with some apps and
> have high standards in another.
>


it is more a matter of productivity:
focus on security, code-quality, ... in places where it is important;
otherwise, whatever one can mash together which basically works is
arguably good enough.

granted, it is not like there aren't some things I care about, like I
prefer clean and nice code over a tangled mess, but ultimately this may
be secondary to the greater concern, "get it done" (as, what good is
good code if the product can never get out the door and on the market?).


it is like with art:
some people can be perfectionist, and worry about tiny details which
hardly anyone would ever notice;
other people can try to make something "good enough" and hope users
don't notice or care about any little graphical imperfections.


> AHS
>
> 1. Strictly speaking not an integrity violation if you can detect the
> unintended data corruption, ideally know what caused it, and even better
> repair it, but in practice once the damage is done you often
> *effectively* can't easily recover; the effort of detecting and fixing
> is itself punitive.


potentially, but it depends on the relative costs.

if the worst case is forcing a reinstall, this is much less of an issue
than, say, if it breaks their savegames, which is much less of an issue
than if any "actually important" data is involved (compromises users'
privacy or security, causes damage to their computer, ...).

say, one doesn't want to have their app be a vector for virus delivery,
as this can give a bad reputation.


but, alas...
 