Converting thousands of pages to XML

 
 
lquast@univ.llu.edu
      04-18-2004
What's the best and fastest way to approach converting a large HTML
site to XML? Thanks.
 
David Dorward
      04-18-2004
(E-Mail Removed) wrote:

> What's the best and fastest way to approach converting a large HTML
> site to XML?


That rather depends on what dialect of XML you wish to convert the HTML to,
what form the HTML is at present, and what your skills are.

I would probably do something involving Perl, File::Find, HTML::Parser or
HTML::TreeBuilder, and one of the many XML modules for Perl.
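
As a rough illustration of that walk, parse, and emit pipeline, here is a minimal sketch. It is an assumption-laden stand-in, not the Perl modules named above: it uses Python's standard library (os.walk in place of File::Find, html.parser in place of HTML::Parser, xml.etree.ElementTree for output). The `document` wrapper element and the function names `html_to_xml` and `convert_site` are hypothetical, and the output keeps the HTML element names rather than mapping them to any particular XML dialect.

```python
import os
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class XmlTreeBuilder(HTMLParser):
    """Build an ElementTree from possibly sloppy HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("document")  # hypothetical wrapper element
        self.stack = [self.root]            # currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "checked") get their own name as value.
        attrib = {name: (name if value is None else value)
                  for name, value in attrs}
        elem = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element;
        # stray end tags are ignored and the root is never popped.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a closed child belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html_text):
    """Return a well-formed XML string for one HTML page."""
    builder = XmlTreeBuilder()
    builder.feed(html_text)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


def convert_site(src_dir, dst_dir):
    """Walk src_dir and write one .xml file per .html/.htm file."""
    for folder, _subdirs, files in os.walk(src_dir):
        for name in files:
            if not name.lower().endswith((".html", ".htm")):
                continue
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, os.path.splitext(rel)[0] + ".xml")
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            with open(src, encoding="utf-8", errors="replace") as fh:
                xml_text = html_to_xml(fh.read())
            with open(dst, "w", encoding="utf-8") as fh:
                fh.write(xml_text)
```

Calling `convert_site("site", "site_xml")` would mirror the directory layout under the (hypothetical) `site_xml` folder. Comments and doctypes are dropped, and unusual attribute names are passed through unchecked, so real pages may still need a clean-up pass first.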

--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
 
Andy Dingley
      04-19-2004
On 18 Apr 2004 08:55:43 -0700, (E-Mail Removed) wrote:

>What's the best and fastest way to approach converting a large HTML
>site to XML? Thanks.


HTML Tidy is a good start (assuming your target is XHTML).

Then go to c.i.w.a.h (comp.infosystems.www.authoring.html) and ask "Why?"

--
Smert' spamionam
 
lquast@univ.llu.edu
      04-20-2004
Andy Dingley <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> On 18 Apr 2004 08:55:43 -0700, (E-Mail Removed) wrote:
>
> >What's the best and fastest way to approach converting a large HTML
> >site to XML? Thanks.

>
> HTML Tidy is a good start (assuming your target is XHTML)
>
> Then go to c.i.w.a.h and ask "Why ?"


Hello,

Thank you for your suggestion regarding converting to XHTML. I am new
to using these groups, however, and just looked up c.i.w.a.h! Very
interesting—and I don't think I'll ask.

Regards

LQ
 
Andy Dingley
      04-20-2004
On 19 Apr 2004 19:27:59 -0700, (E-Mail Removed) wrote:

>Thank you for your suggestion regarding converting to XHTML. I am new
>to using these groups, however, and just looked up c.i.w.a.h! Very
>interesting—and I don't think I'll ask.




c.i.w.a.h is one of the most unfriendly groups I know of, and
certainly the most useless and downright hostile that I still bother
to read. "Converting to XHTML" is a regular topic in there and
searching will show up some interesting discussion of its benefits, or
lack of them. However, many people in there have egos bigger than
their knowledge and will spout the same old party line with more
volume than understanding.

HTML Tidy is open sourced, AFAIR, and if you have a huge number of
files to convert, you can tie the source into your favourite choice of
scripting language.
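
One way to read "tie it into a scripting language" is simply to drive the tidy command-line program over the whole tree, rather than linking against its source. The sketch below assumes a `tidy` executable is on the PATH; the helper names `tidy_command` and `tidy_tree` are hypothetical. It relies on tidy's documented exit codes (0 clean, 1 warnings, 2 errors) to collect the pages that need manual attention.

```python
import subprocess
from pathlib import Path


def tidy_command(src, dst, tidy_bin="tidy"):
    """Build the tidy invocation for one file (XHTML out, UTF-8, numeric entities)."""
    return [tidy_bin, "-quiet", "-asxhtml", "-numeric", "-utf8",
            "-output", str(dst), str(src)]


def tidy_tree(src_dir, dst_dir, run=subprocess.run):
    """Run tidy on every .html/.htm file under src_dir.

    Returns the source paths tidy could not convert (exit code 2 or higher).
    The `run` parameter exists only so the runner can be swapped in tests.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    failed = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in {".html", ".htm"}:
            continue
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        result = run(tidy_command(src, dst))
        if result.returncode >= 2:  # 1 means warnings only, output still written
            failed.append(src)
    return failed
```

A call such as `tidy_tree("site", "site_xhtml")` returns the list of files worth inspecting by hand; everything else has been written, cleaned up, under the output directory.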

--
Smert' spamionam
 
lquast@univ.llu.edu
      04-22-2004
Andy Dingley <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> On 19 Apr 2004 19:27:59 -0700, (E-Mail Removed) wrote:
>
> >Thank you for your suggestion regarding converting to XHTML. I am new
> >to using these groups, however, and just looked up c.i.w.a.h! Very
> >interesting -- and I don't think I'll ask.
>
>
>
> c.i.w.a.h is one of the most unfriendly groups I know of, and
> certainly the most useless and downright hostile that I still bother
> to read. "Converting to XHTML" is a regular topic in there and
> searching will show up some interesting discussion of its benefits, or
> lack of them. However many people in there have egos bigger than
> their knowledge and will spout the same old party line with more
> volume than understanding.
>
> HTML Tidy is open sourced, AFAIR, and if you have a huge number of
> files to convert, you can tie the source into your favourite choice of
> scripting language.


I guess it couldn't hurt to see what they have to say! Thanks again
for the info. HTML Tidy may come in handy.
 