"toXHTML()", where's the XML?

 
 
hawat.thufir@gmail.com
      08-08-2005
I'm looking at
<http://xmlserv.com/API/com/xmlserv/app/shared/SODocument.html#toXHTML(javax.xml.transform.Source )>
and <http://jtidy.sourceforge.net/> in trying to get some XML extracted
from XHTML. I'm reading two "for dummies" books, "XHTML for dummies" and
"XML for dummies". They're not the most current, but are sufficient, I'm
sure.

I have some Java code which, using JTidy, reads in a URL and kicks out
the JTidy parsed file. The code is at <http://thufir.lecktronix.net/>,
click on "files" and "JTidy".

Where's the XML? I'm looking for either a "toXHTML" or "toXML" method
(function/routine/sub-program) in JTidy, but can't find it.

At this point I'm staying away from Neko
<http://people.apache.org/~andyc/neko/doc/html/> and Xerces
<http://xml.apache.org/xerces2-j/> simply because I already have something
working with JTidy, although I might switch to Neko later.

Anyhow, in the jar (which can be downloaded from
<http://thufir.lecktronix.net/>), JTidy does parse a URL
(<http://www.yahoo.com/> is hard-coded in) and generates out.html.
Where's the XML, embedded within the XHTML?


thanks,

Thufir

 
Martin Honnen
      08-08-2005


hawat.thufir@gmail.com wrote:


> Anyhow, in the jar (which can be downloaded from
> <http://thufir.lecktronix.net/> JTidy does parse a URL
> (<http://www.yahoo.com/> is hard-coded in) and generates out.html.
> Where's the XML, embedded within the XHTML?


XHTML is XML, so I am not sure what you are looking for. Presumably
out.html is JTidy's attempt to create XHTML from http://www.yahoo.com/
for you.


--

Martin Honnen
http://JavaScript.FAQTs.com/
 
hawat.thufir@gmail.com
      08-08-2005
Martin Honnen wrote:
> hawat.thufir@gmail.com wrote:
>
>
> > Anyhow, in the jar (which can be downloaded from
> > <http://thufir.lecktronix.net/> JTidy does parse a URL
> > (<http://www.yahoo.com/> is hard-coded in) and generates out.html.
> > Where's the XML, embedded within the XHTML?

>
> XHTML is XML so I am not sure what you are looking for, presumably the
> out.html is JTidy's attempt to create XHTML from http://www.yahoo.com/
> for you.
>
>
> --
>
> Martin Honnen
> http://JavaScript.FAQTs.com/


That was my understanding, that "out.html" is XHTML; XHTML being a
super-set of XML. Having said that, I now realize that I've asked the
wrong question, or at least phrased it wrong, sorry.

I want to insert data from "out.html" into a database, such as
Hibernate <http://www.hibernate.org/> or Cocoon
<http://cocoon.apache.org/>.

To do that, I understand that I must first "transform" XML with XSLT,
somehow. I think I need more information about that process in order
to ask useful questions.

So, perhaps better questions are:

What sort of "output" am I looking for from an XSLT transform (in order
to do a database insert)?

Do I need to do the XSLT myself, or can that be done from a database?



Thanks,

Thufir

 
Martin Honnen
      08-09-2005


hawat.thufir@gmail.com wrote:


> That was my understanding, that "out.html" is XHTML; XHTML being a
> super-set of XML.


XHTML is an XML application, meaning that any XHTML document is a
well-formed XML document. But XHTML is not a super-set of XML.
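
A small sketch of what "well-formed XML" means in practice, using only the
JDK's built-in XML parser. The class name and the sample markup are
hypothetical stand-ins for JTidy's out.html, not anything from the actual
jar. The idea is that an XHTML document can be handed to any ordinary XML
parser, while unclosed "tag soup" HTML is rejected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlIsXml {
    // Hypothetical stand-in for the out.html that JTidy writes.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
      + "<head><title>Sample</title></head>"
      + "<body><p>Hello</p></body></html>";

    // Parse a string with a plain, namespace-aware XML parser.
    // Throws a SAXException if the input is not well-formed XML.
    static Document parse(String markup) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(markup)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        // The root element of an XHTML document is <html>.
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

Passing something like "<html><body><br></body></html>" to the same
method should fail, because the unclosed <br> is legal in HTML but not in
XML.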


> What sort of "output" am I looking for from an XSLT transform (in order
> to do a database insert)?


No idea really. The standard XSLT output methods are xml, html, and text,
so you could use an XSLT stylesheet to process an XHTML document and
transform it to XML, HTML, or plain text.
I am not sure why you think XSLT helps with a database insert, unless you
have a database that stores XML natively/directly, or a certain XSLT
extension that allows RDBMS access.
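
To make the xml output method concrete, here is a minimal sketch using the
javax.xml.transform API that ships with the JDK. Everything specific in it
is an assumption for illustration: the class name, the sample XHTML, and
the stylesheet, which simply copies each link into a made-up <links>
vocabulary. Note that XHTML elements live in the XHTML namespace, so the
stylesheet has to bind a prefix (h here) to match them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XhtmlToXml {
    // Hypothetical stand-in for JTidy's out.html.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
      + "<p><a href=\"http://www.yahoo.com/\">Yahoo</a></p>"
      + "</body></html>";

    // Illustrative stylesheet: emits one <link> per XHTML anchor.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:h=\"http://www.w3.org/1999/xhtml\""
      + " exclude-result-prefixes=\"h\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\">"
      + "<links><xsl:for-each select=\"//h:a[@href]\">"
      + "<link href=\"{@href}\"><xsl:value-of select=\".\"/></link>"
      + "</xsl:for-each></links>"
      + "</xsl:template></xsl:stylesheet>";

    // Apply the stylesheet to an XHTML string and return the result as text.
    static String transform(String xhtml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xhtml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE));
    }
}
```

Changing method="xml" to method="text" in the xsl:output element would
switch the same pipeline to plain-text output.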

> Do I need to do the XSLT myself, or can that be done from a database?


I am not familiar with Cocoon or Hibernate. Unless someone else shows up
here with expertise on those, you are probably better off asking in one
of the dedicated mailing lists or forums offered on the web sites of the
products.

--

Martin Honnen
http://JavaScript.FAQTs.com/
 
hawat.thufir@gmail.com
      08-09-2005
Martin Honnen wrote:
....
> No idea really, standard XSLT output methods are xml, html, and text so
> you could use an XSLT stylesheet to process an XHTML document and
> transform it to XML or HTML or plain text.


This is what I want to do, thanks. I'll look further into that.

> Not sure why you think XSLT helps with a data base insert, unless you
> have a data base that stores XML natively/directly, or unless you have a
> certain XSLT extension that allows RDBMS access.


I was browsing the bookstore and came across "Hibernate: A Developer's
Notebook", <http://www.oreilly.com/catalog/hibernate/>.

Perhaps I misunderstood. I'm trying to get XML from XHTML with XSLT.
As I read it, Hibernate and Cocoon will take XML as data. Therefore, I
need to get some XML from the XHTML, then feed the XML to the RDBMS.
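
A rough sketch of one way that last step might look. Hibernate is an
object-relational mapper that persists Java objects, so extracted XML
would typically be read into plain objects before anything reaches the
database. The Link class, the <links> XML shape, and the class name below
are all hypothetical, and the actual save (through Hibernate or JDBC) is
left out because it depends on the mapping and the database in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LinkRows {
    // Hypothetical row type; a real one would mirror the target table.
    static final class Link {
        final String href;
        final String text;
        Link(String href, String text) {
            this.href = href;
            this.text = text;
        }
    }

    // Hypothetical XML, shaped like something an XSLT step might emit.
    static final String SAMPLE =
        "<links><link href=\"http://www.yahoo.com/\">Yahoo</link>"
      + "<link href=\"http://example.org/\">Example</link></links>";

    // Read every /links/link element into a Link object.
    static List<Link> read(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            "/links/link",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<Link> rows = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element link = (Element) nodes.item(i);
            rows.add(new Link(link.getAttribute("href"), link.getTextContent()));
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Each row is now an ordinary object that a persistence layer could store.
        for (Link row : read(SAMPLE)) {
            System.out.println(row.href + "\t" + row.text);
        }
    }
}
```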

....
> I am not familiar with Cocoon or Hibernate, unless someone else shows up
> here with expertise on those you are probably better off asking in one
> of the dedicated mailing lists or forums offered on the web sites of the
> products.

....

Ok, will do.



Thanks,

Thufir

 
hawat.thufir@gmail.com
      08-15-2005
Martin Honnen wrote:
....
> I am not familiar with Cocoon or Hibernate, unless someone else shows up
> here with expertise on those you are probably better off asking in one
> of the dedicated mailing lists or forums offered on the web sites of the
> products.

....

"Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It
for Data"
<http://hacks.oreilly.com/pub/h/2125>

Now to install Cocoon...


-Thufir

 