XML Parser

 
 
an0047@gmail.com
 
      07-02-2007
Hello,

I would like to develop a simple XML parser with my own commands.

The approach is to design a state machine first and then implement it
in C. I looked at some posts about lexical analysers, but the
information I found was not helpful.

I know there are books that cover building tables and complicated
equations for analysing text, but I don't know how to search for them.

Any recommendations, tips on how to implement the parser, or
literature references (books, papers) would be much appreciated.

Best Regards

 
Joseph Kesselman
 
      07-02-2007
Many good XML parsers exist. It sounds like you're going to need to do a
fair amount of homework before constructing your own. Reinventing the
wheel is probably not very useful unless you have a real interest in
learning how parsers function.

One standard reference that covers this topic: "Compilers:
Principles, Techniques, and Tools" (Aho, Ullman, and others). You can
ignore the code-generation and optimization sections, but the parsing
portions of the task are essentially the same, and the typechecking
chapter may be relevant if you want to implement validation.
 
 
 
 
 
Juergen Kahrs
 
      07-03-2007
(E-Mail Removed) wrote:

> Any recommendations, tips on how to implement the parser, or
> literature references (books, papers) would be much appreciated.


This question has been answered here several times.
Google for it. Usually, we warn newbies who want to
write their own parsers: you will be surprised by
the tricky details. Have you ever heard of a BOM (byte order mark)?
Are you prepared to process 32-bit characters?
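
To make the BOM point concrete, here is a minimal sketch in C of how a parser might sniff a byte order mark at the start of its input. The names `encoding_t` and `detect_bom` are made up for this illustration and are not taken from any existing parser. It only recognises the standard Unicode BOM byte sequences; a real parser would also have to look at the XML encoding declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Encodings that can be recognised from a leading byte order mark.
   Hypothetical names, chosen for this sketch only. */
typedef enum {
    ENC_UNKNOWN,    /* no BOM present; the caller must decide by other means */
    ENC_UTF8,
    ENC_UTF16_BE,
    ENC_UTF16_LE,
    ENC_UTF32_BE,
    ENC_UTF32_LE
} encoding_t;

/* Inspect the first bytes of buf and report which BOM, if any, is present.
   *bom_len receives the number of bytes the BOM occupies (0 if none). */
encoding_t detect_bom(const unsigned char *buf, size_t len, size_t *bom_len)
{
    assert(buf != NULL && bom_len != NULL);
    *bom_len = 0;

    /* The 4-byte marks are tested first, because the UTF-32 LE mark
       begins with the same two bytes as the UTF-16 LE mark. */
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 &&
        buf[2] == 0xFE && buf[3] == 0xFF) {
        *bom_len = 4;
        return ENC_UTF32_BE;
    }
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE &&
        buf[2] == 0x00 && buf[3] == 0x00) {
        *bom_len = 4;
        return ENC_UTF32_LE;
    }
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return ENC_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return ENC_UTF16_BE;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return ENC_UTF16_LE;
    }
    return ENC_UNKNOWN;
}
```

The caller would skip `*bom_len` bytes and then decode the rest of the input according to the returned encoding.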
 
 
Joe Kesselman
 
      07-03-2007
Juergen Kahrs wrote:
> Have you ever heard of a BOM ? Are you prepared to process 32-bit-characters ?


The usual estimate is that a complete XML parser is about the right size
to be a serious term project for a college student who already
understands the basics of writing parsers.

You can rattle off a subset in less time than that. But, again, unless
you have very special needs (such as a language where nobody has written
one yet and which can't link to existing parsers), the question is "why".

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
 
an0047@gmail.com
 
      07-03-2007
On 3 Jul., 00:56, Joseph Kesselman <(E-Mail Removed)> wrote:
> Many good XML parsers exist. It sounds like you're going to need to do a
> fair amount of homework before constructing your own. Reinventing the
> wheel is probably not very useful unless you have a real interest in
> learning how parsers function.
>
> One standard reference that covers this topic: "Compilers:
> Principles, Techniques, and Tools" (Aho, Ullman, and others). You can
> ignore the code-generation and optimization sections, but the parsing
> portions of the task are essentially the same, and the typechecking
> chapter may be relevant if you want to implement validation.


Thanks for your answer and the reference! I'm not trying to reinvent the
wheel; I'm trying to write a very simple and reliable parser for
commercial software. The existing ones are very complex and big, and
their licenses have to be taken into consideration. If you know
of a very simple one written in C, please let me know.

I do have a real interest in learning how parsers function; that's
why I asked for a book reference. For now the new state machine
has 5 states and can more or less handle some simple XML tags.
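
For illustration only, a five-state scanner of this kind could look like the sketch below. This is not the poster's actual machine; the state names and `scan_tags` are hypothetical. It accepts only `<name>`, `</name>` and `<name/>` with ASCII names, and it checks nesting by depth only, without comparing tag names. It ignores attributes, comments, entities, CDATA and encodings, so it is a toy subset rather than an XML parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The five states of the toy scanner (hypothetical names). */
typedef enum {
    ST_TEXT,        /* character data between tags          */
    ST_OPEN,        /* just consumed '<'                    */
    ST_START_NAME,  /* inside the name of a start tag       */
    ST_EMPTY,       /* saw '/' in a start tag, expecting '>' */
    ST_END_NAME     /* inside the name of an end tag        */
} state_t;

static int is_name_start(int c) { return isalpha(c) || c == '_'; }
static int is_name_char(int c)  { return isalnum(c) || c == '_' || c == '-'; }

/* Scan s and return 0 if every tag is well formed and the nesting depth
   returns to zero, or -1 otherwise. *max_depth receives the deepest
   nesting level reached (empty elements like <c/> do not add a level). */
int scan_tags(const char *s, int *max_depth)
{
    state_t st = ST_TEXT;
    int depth = 0;
    size_t name_len = 0;

    assert(s != NULL && max_depth != NULL);
    *max_depth = 0;

    for (; *s != '\0'; s++) {
        int c = (unsigned char)*s;
        switch (st) {
        case ST_TEXT:
            if (c == '<') st = ST_OPEN;
            break;
        case ST_OPEN:
            if (c == '/') { st = ST_END_NAME; name_len = 0; }
            else if (is_name_start(c)) st = ST_START_NAME;
            else return -1;
            break;
        case ST_START_NAME:
            if (is_name_char(c)) break;
            if (c == '>') {
                depth++;
                if (depth > *max_depth) *max_depth = depth;
                st = ST_TEXT;
            } else if (c == '/') {
                st = ST_EMPTY;
            } else {
                return -1;
            }
            break;
        case ST_EMPTY:
            if (c != '>') return -1;
            st = ST_TEXT;
            break;
        case ST_END_NAME:
            if (name_len == 0 ? is_name_start(c) : is_name_char(c)) {
                name_len++;
                break;
            }
            /* Only '>' may end the name, and there must be an open element. */
            if (c != '>' || name_len == 0 || depth == 0) return -1;
            depth--;
            st = ST_TEXT;
            break;
        }
    }
    /* Input must not end inside a tag or with elements still open. */
    return (st == ST_TEXT && depth == 0) ? 0 : -1;
}
```

For example, `scan_tags("<a><b>hi</b><c/></a>", &d)` returns 0 and sets `d` to 2, while `scan_tags("<a><b></a>", &d)` returns -1 because one element is left open.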

Regards


 
 
an0047@gmail.com
 
      07-03-2007
On 3 Jul., 09:49, Juergen Kahrs <(E-Mail Removed)>
wrote:
> (E-Mail Removed) wrote:
> > Any recommendations, tips on how to implement the parser, or
> > literature references (books, papers) would be much appreciated.

>
> This question has been answered here several times.
> Google for it. Usually, we warn newbies who want to
> write their own parsers. You will be surprised about
> the tricky details. Have you ever heard of a BOM ?
> Are you prepared to process 32-bit-characters ?


Hi, thanks for your answer and for the warning too. As a newbie
I need and want to learn about parsers. Actually, I don't even know if
I'm posting in the right group; unfortunately I didn't find any
information on the web that answered my question, and that is the reason
for my post. The characters are still 8 bits long and I think they will
remain like that. For your amusement, I had great problems handling
chars and strings in C. I don't know which question you are referring to,
but if you could point me to posts that discuss the implementation
(and state machine) of the kind of parser described above, I would
appreciate it. Have a nice day and best regards
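
As background to the 8-bit versus 32-bit question: in UTF-8 input, one character can occupy one to four bytes, so a byte is not always a character. The sketch below, assuming UTF-8 input, decodes one sequence into a 32-bit code point. The function name `utf8_decode` is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from buf[0..len). On success, store the code
   point in *cp and return the number of bytes consumed (1 to 4). Return 0
   if the sequence is malformed, truncated, overlong, a surrogate, or
   above U+10FFFF. */
size_t utf8_decode(const unsigned char *buf, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes, indexed by n. */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    uint32_t v;

    assert(buf != NULL && cp != NULL);
    if (len == 0) return 0;

    /* The lead byte tells how many bytes the sequence has. */
    if (buf[0] < 0x80)                { n = 1; v = buf[0]; }
    else if ((buf[0] & 0xE0) == 0xC0) { n = 2; v = buf[0] & 0x1F; }
    else if ((buf[0] & 0xF0) == 0xE0) { n = 3; v = buf[0] & 0x0F; }
    else if ((buf[0] & 0xF8) == 0xF0) { n = 4; v = buf[0] & 0x07; }
    else return 0;                    /* stray continuation or invalid lead byte */

    if (len < n) return 0;            /* input ends in the middle of a sequence */

    /* Each continuation byte has the form 10xxxxxx and adds six bits. */
    for (i = 1; i < n; i++) {
        if ((buf[i] & 0xC0) != 0x80) return 0;
        v = (v << 6) | (uint32_t)(buf[i] & 0x3F);
    }

    if (v < min_cp[n]) return 0;                  /* overlong encoding */
    if (v > 0x10FFFF) return 0;                   /* beyond Unicode range */
    if (v >= 0xD800 && v <= 0xDFFF) return 0;     /* UTF-16 surrogate */

    *cp = v;
    return n;
}
```

A scanner that treats every byte as a character still works for pure ASCII documents, but it would need something like this as soon as names or text contain non-ASCII characters.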

 
 
an0047@gmail.com
 
      07-03-2007
On 3 Jul., 14:06, Joe Kesselman <(E-Mail Removed)> wrote:
> Juergen Kahrs wrote:
> > Have you ever heard of a BOM ? Are you prepared to process 32-bit-characters ?

>
> The usual estimate is that a complete XML parser is about the right size
> to be a serious term project for a college student who already
> understands the basics of writing parsers.
>
> You can rattle off a subset in less time than that. But, again, unless
> you have very special needs (such as a language where nobody has written
> one yet and which can't link to existing parsers), the question is "why".


Why? I think the posts above answer that.

>
> --
> () ASCII Ribbon Campaign | Joe Kesselman
> /\ Stamp out HTML e-mail! | System architexture and kinetic poetry



 
 
Joseph Kesselman
 
      07-03-2007
The existing parsers are complex because that's what's required to do a
good job. Supporting a trivial subset of XML is near-trivial, but there
is a lot more that has to be dealt with if you want your code to survive
contact with real data and real users.

Everything should be as simple as possible, but not simpler.

If you want to learn about parsers, implementing a sloppy subset really
won't teach you much.

"Try not. Do! ... Or do not."

There are royalty-free parsers out there, if that's your concern. I
don't know what's available in plain C these days, but Apache's Xerces
parser is available in a C++ version.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 
 
Joseph Kesselman
 
      07-03-2007
(If your parser doesn't support all of XML, it isn't an XML parser.)

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 
 
Joseph Kesselman
 
      07-03-2007
Joseph Kesselman wrote:
> There are royalty-free parsers out there, if that's your concern.


For what it's worth, the W3C's own website just suggests you do a
web search for "XML parser" to get a list of the available parsers.
Adding "in C" and "free" to that search suggests that you might want to
look at libxml2, XMLTok, expat, and possibly others.

(I haven't used any C-based XML parser in years, so I can't offer
opinions on any of these.)
 
 
 
 