Velocity Reviews - Computer Hardware Reviews

convert html

 
 
Guest
      07-08-2004
Hi:

I want to convert html to xml.

I am doing this:

import sys

from xml.dom.ext.reader import HtmlLib
from xml.dom import ext, Node
from xml.dom.NodeFilter import NodeFilter

def main( argv ):
    # build a DOM tree from the html file named on the command line
    reader = HtmlLib.Reader()
    dom_object = reader.fromUri( argv[1] )

    # getTableInfo is defined elsewhere (not shown in this post)
    info = getTableInfo( dom_object, 9 )

    # free the DOM tree once we are done with it
    reader.releaseNode( dom_object )

if __name__ == "__main__":
    main( sys.argv )

This takes almost a minute for a 6000-line HTML file on a 700 MHz Pentium III with 256 MB of RAM, which is too slow.

Can you suggest another way of doing this in Python?



 
Richard Brodie
Guest
      07-09-2004

<(E-Mail Removed)> wrote in message news:(E-Mail Removed)...
> I want to convert html to xml.
>
> I am doing this:

....
> Can you suggest another way of doing this in Python?


I haven't benchmarked it, but I would imagine using HTML Tidy
(or µTidylib) is as good as any, particularly if your HTML source
is a bit rough.
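
For illustration only, here is a rough sketch of the HTML Tidy route. It assumes the `tidy` command-line program is installed and on the PATH, and it uses modern Python 3 (the `subprocess` module) rather than anything from the original post. The function names `build_tidy_command` and `html_to_xhtml` are made up for this example. It has not been benchmarked against the DOM reader above.

```python
import subprocess


def build_tidy_command(html_path):
    """Return the argument list that asks tidy to convert one file.

    Hypothetical helper for this sketch; the flags are standard tidy options.
    """
    return [
        "tidy",
        "-quiet",    # suppress nonessential output
        "-asxml",    # convert the HTML to well-formed XHTML
        "-numeric",  # output numeric rather than named entities
        html_path,
    ]


def html_to_xhtml(html_path):
    """Run tidy on html_path and return the XHTML it writes, as bytes."""
    result = subprocess.run(build_tidy_command(html_path), capture_output=True)
    # tidy exits with 0 (clean), 1 (warnings) or 2 (errors). With errors it
    # normally writes no output, so treat that case as a failure.
    if result.returncode >= 2:
        raise RuntimeError(result.stderr.decode(errors="replace"))
    return result.stdout
```

Calling `html_to_xhtml("page.html")` (with a file name of your own) should return markup that an ordinary XML parser can read, so table extraction can then run on a normal XML tree instead of going through the HTML DOM reader.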


 

