need XML schema to store information in a language neutral format

 
 
AViS
 
      08-16-2006
Hi,
I am building a language translator that must convert input from
source languages to a language-neutral format in XML. This XML must be
read by the target-language translator, which produces the output in the
target language. I am thinking of using a hash map to handle
translations, but I am having trouble deciding on the schema in which the
XML must be stored.


The application must work as follows:
{c translator}   <--->   | X M L |          <--->   {vb translator}
int i;                   stored in                  Dim i as Integer
printf("%d",i);          neutral format             Print i

Proposed XML format:
<translate>
<action index="1">i</action>
<action index="2">i</action>
</translate>

The index attribute of the XML tag action will refer to a hash table
that will aid in translations, thus:
+-------+---------------------+-----------------------+
| index | c                   | vb                    |
+=======+=====================+=======================+
| 1     | int $token          | Dim $token as Integer |
| 2     | printf("%d",$token) | print $token          |
+-------+---------------------+-----------------------+


Are the XML format and translation method I propose sufficient? Please
consider that the conversion is 100% possible (meaning my translator
excludes C's asm, pointers, etc.).
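
For readers following along, the table-driven scheme proposed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the TABLE layout, the render function, and the use of string.Template for $token are hypothetical choices, not something the original post specifies, and the XML uses quoted attribute values because XML requires them.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical translation table: index -> one template per target language.
TABLE = {
    "1": {"c": "int $token;", "vb": "Dim $token As Integer"},
    "2": {"c": 'printf("%d", $token);', "vb": "Print $token"},
}

# The neutral document from the post, with quoted attribute values.
NEUTRAL_XML = """
<translate>
  <action index="1">i</action>
  <action index="2">i</action>
</translate>
"""


def render(xml_text, target):
    """Expand every <action> element into one line of the target language."""
    root = ET.fromstring(xml_text)
    lines = []
    for action in root.findall("action"):
        template = Template(TABLE[action.get("index")][target])
        lines.append(template.substitute(token=action.text))
    return lines


if __name__ == "__main__":
    print("\n".join(render(NEUTRAL_XML, "c")))
    print("\n".join(render(NEUTRAL_XML, "vb")))
```

A flat list of actions like this works for one-token statements; nested constructs such as expressions or blocks would need child elements rather than a single text token.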

 
Stefan Ram
 
      08-16-2006
"AViS" writes:
>I am building a language translator, that must convert input from
>source languages to a language neutral format in XML.


There is no language neutral format. Or - in other words:
A "language neutral format" is just another language.

>I am thinking of using a hashed map to handle translations


Mentioning a low-level implementation detail such as hashing when
talking about a very high-level task seems inappropriate.

>Is the XML format and translation method I propose sufficient.


Even XML is a low-level implementation detail when in fact you
should be talking about annotated trees or similar structures.

It will be possible for you to translate a small, restricted,
and controlled subset of both languages. More might be beyond
the capabilities of most individuals. Very gifted programmers
or organizations might be able to translate a large part of
both languages, but I expect this to be a huge effort.

 
AViS
 
      08-16-2006
> There is no language neutral format. Or - in other words:
> A "language neutral format" is just another language.

"Language Neutral" was meant to be in the sense that I did not want the
source language to be recognizable by looking at the XML. In other words,
if 'printf' in C is translated to 'X' in the XML, then an equivalent
'cout' in C++ must also be translated to the same 'X'.

> Mentioning a low-level implementation detail as a hashing when
> talking about a very high-level task seems inappropriate.

Sorry about the hash map. I just thought it would make more sense to
explain the index attribute of the XML <action> tag along with the hash
map.


> Even XML is a low-level implementation detail when in fact you
> should be talking about annotated trees or similar structures.

Can you explain more about "annotated trees or similar structures"? I
am not able to find much info on the net. Even suggesting some
sites would go a long way.

> It will be possible for you to translate a small restricted
> and controlled subset of both languages...

It is enough if it translates only a subset. However, I need to find a
way to store in the XML the functions held by a C++ class; when
translating from XML to C, the functions will be removed and the class
will be converted to a typedef struct.

 
Stefan Ram
 
      08-16-2006
"AViS" writes:
>"Language Neutral" was meant to be in the sense that I did not
>want the source language to be recognized by looking at the XML,


This is easy: If any code, such as

printf( "%d", i );

is given, and I tell you that it was translated from a language
X, there is no way for you to find out what X is. So /every/
representation will fulfil this requirement.

>in other words if 'printf' in c is translated to 'X' in the
>XML then an equivalent 'cout' in c++ must also be translated
>to the same 'X'


In general, equivalence between two programs is undecidable.

See »Equivalence Problem« in

http://www.cs.rochester.edu/u/nelson...decidable.html

However, for a restricted domain you might indeed succeed
in finding such a representation. One possibility would be
to translate the C++ into C, as early C++ compilers did.

>>Even XML is a low-level implementation detail when in fact you
>>should be talking about annotated trees or similar structures.

>Can you explicate more about "annotated trees or similar structures". I
>am not able to find much info on the net. Even if you can suggest some
>sites, that'll go a long way


Starting points might be

http://en.wikipedia.org/wiki/Abstract_syntax_tree
http://www.cse.iitk.ac.in/users/raj/...tes/lec17.html

An annotated tree is a tree with annotations, which might be
represented as attributes in XML. While "annotated tree" means
the information structure itself, an XML document is one way
to represent such an information structure using a text
document.
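
As a small illustration of that distinction, the Python sketch below builds a tree for the statement "i = i + 1", attaches type annotations as XML attributes, writes it out as text, and reads it back. The element and attribute names (assign, var, add, const, type) are assumptions made for this example, not an established vocabulary.

```python
import xml.etree.ElementTree as ET


def build_tree():
    """Build a tiny tree for 'i = i + 1' with a type annotation on each node."""
    assign = ET.Element("assign", {"type": "int"})
    ET.SubElement(assign, "var", {"name": "i", "type": "int"})
    add = ET.SubElement(assign, "add", {"type": "int"})
    ET.SubElement(add, "var", {"name": "i", "type": "int"})
    ET.SubElement(add, "const", {"value": "1", "type": "int"})
    return assign


def annotations(node, depth=0):
    """Yield (depth, tag, attributes) for every node in document order."""
    yield depth, node.tag, dict(node.attrib)
    for child in node:
        yield from annotations(child, depth + 1)


# The tree is the information structure; the XML text is one encoding of it.
xml_text = ET.tostring(build_tree(), encoding="unicode")
round_trip = ET.fromstring(xml_text)

if __name__ == "__main__":
    for depth, tag, attrs in annotations(round_trip):
        print("  " * depth + tag, attrs)
```

The same tree could equally be stored as JSON or as in-memory objects, which is the sense in which XML is only an implementation detail here.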

Maybe, to ask in this XML newsgroup, you should try to isolate
that part of your question that is directly related to the
language XML from the rest that deals with your algorithm, but
has nothing to do with XML.



 
Joseph Kesselman
 
      08-16-2006
It sounds like you're talking about an XML representation of an
Intermediate Language general enough to cover multiple source languages.
Your first step, therefore, is to find or design that IL; from there,
writing an XML rendering of it is straightforward.

I'd recommend reading any of the standard reference works on compiler
design as a starting point for picking your IL. Note that its required
characteristics are going to depend heavily on exactly what operations
you're going to want to perform against that representation.
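
A minimal sketch of that order of work, in Python: first define the IL as plain data types, then add back ends and an XML rendering on top of it. The node types Declare and PrintInt, the template tables, and the functions emit and to_xml are all hypothetical names for illustration; a real IL would need many more node kinds.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Declare:
    """IL node: declare an integer variable."""
    name: str


@dataclass
class PrintInt:
    """IL node: print the value of an integer variable."""
    name: str


# The IL program that both 'int i; printf("%d", i);' and the VB
# equivalent would be parsed into by their (not shown) front ends.
PROGRAM = [Declare("i"), PrintInt("i")]

# One template per IL node type and target language (assumed spellings).
C_TEMPLATES = {Declare: "int {name};", PrintInt: 'printf("%d", {name});'}
VB_TEMPLATES = {Declare: "Dim {name} As Integer", PrintInt: "Print {name}"}


def emit(program, templates):
    """Render each IL node with the template for its type."""
    return [templates[type(node)].format(name=node.name) for node in program]


def to_xml(program):
    """Render the IL as XML: one element per node, named after its type."""
    root = ET.Element("program")
    for node in program:
        ET.SubElement(root, type(node).__name__, {"name": node.name})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(emit(PROGRAM, C_TEMPLATES))
    print(emit(PROGRAM, VB_TEMPLATES))
    print(to_xml(PROGRAM))
```

Because the element names come from the IL node types and not from C or VB keywords, the XML does not reveal which source language produced it.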


--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 
AViS
 
      08-17-2006
Thanks, Stefan and Joseph.
The IL in XML was intended to be my proof of concept for a much bigger
initiative.
I shall try to keep this thread updated with the latest.


Thanks again.

 
Jean-François Michaud
 
      08-17-2006

Stefan Ram wrote:
> "AViS" writes:
> >"Language Neutral" was meant to be in the sense that I did not
> >want the source language to be recognized by looking at the XML,

>
> This is easy: If any code, such as
>
> printf( "%d", i );
>
> is given, and I tell you that it was translated from a language
> X, there is no way for you, to find out what X is. So /every/
> representation will fulfil this requirement.
>
> >in other words if 'printf' in c is translated to 'X' in the
> >XML then an equivalent 'cout' in c++ must also be translated
> >to the same 'X'


The idea of using an intermediate language might not be the best way to
go about it, but so what? Let the guy explore; he might find
interesting things and he surely will learn a lot.

> In general, equivalence between two programs is undecidable.


This whole thing seems fishy to me. If 2 languages are Turing complete,
then they can both represent everything that is representable by a
Turing machine which is everything that is computable. This means that
any program representation in the first language DOES have an
equivalent representation in the second language.

Knowing whether 2 given programs written in 2 different languages are
indeed functionally equivalent, if both languages are Turing complete, is
far from being a trivial problem, but it is possible.

>
> See »Equivalence Problem« in
>
> http://www.cs.rochester.edu/u/nelson...decidable.html


Baloney. If the input and output subsets of each program are known for
both programs, then they can be compared to evaluate whether they are
functionally equivalent. The Equivalence Problem speaks of equivalence
in general terms (whatever that means; nothing in context, if you ask
me). The difficulty resides in our inability to track very complex
problems. They are not impossible to solve; they are simply too complex
to apprehend when taken as a whole upfront.

Of course the proof makes sure not to mention any specific languages.
The proof applies to a program that would compute equivalence for ANY 2
programs. No such program can exist in the first place. The guy isn't
trying to translate anything to everything else; he's writing a
translator that goes from one language to another. Quite challenging,
but not impossible.
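
The comparison over known inputs described above can be sketched as an exhaustive check over a finite domain. This Python example is a hedged illustration only: find_counterexample and the three sample functions are hypothetical, and agreement on the tested inputs says nothing about inputs outside the chosen domain.

```python
from itertools import product


def find_counterexample(f, g, domain, arity=1):
    """Return the first argument tuple where f and g differ, or None."""
    for args in product(domain, repeat=arity):
        if f(*args) != g(*args):
            return args
    return None


def double_by_add(x):
    return x + x


def double_by_shift(x):
    return x << 1


def square(x):
    return x * x


if __name__ == "__main__":
    # Same results for every integer in the tested range.
    print(find_counterexample(double_by_add, double_by_shift, range(-100, 101)))
    # Differ as soon as x == 1 (2 versus 1).
    print(find_counterexample(double_by_add, square, range(0, 5)))
```

This is testing over a bounded input set, which fits a translator for a restricted subset; it is not a decision procedure for arbitrary programs.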

[snip]

Regards
Jean-Francois Michaud

 