Velocity Reviews - Computer Hardware Reviews


pdf to HTML conversion program?

 
 
Cliff R.
 
      01-31-2004
Hi, can anyone recommend a good program that converts PDF files to
HTML? I've tried one called PDF to HTML Converter Pro, but the code it
creates isn't what I'm looking for. I really just need it to convert
to basic HTML, keeping bold, italics, paragraph breaks, etc., NOT
styled text that keeps the line breaks exactly the same. In this one,
every single line has this sort of code at the beginning: <div
id="_506:9699" style="position:absolute;top:9699;left:506"><span
id="_11" style="font-size:11px;font-family:Helvetica;color=#000000">
etc., so the code is huge and unnecessarily complicated.

Any ideas on what to use to create clean, basic HTML from mostly
text-based PDFs?

Thanks.


 
 
 
 
 
Leif K-Brooks
 
      01-31-2004
Cliff R. wrote:
> Hi, can anyone recommend a good program that converts PDF files to
> HTML?


rm -f *.pdf
nano foo.html

 
 
 
 
 
Terry
 
      01-31-2004
Leif K-Brooks wrote:

> Cliff R. wrote:
>
>> Hi, can anyone recommend a good program that converts PDF files to
>> HTML?

>
>
> rm -f *.pdf
> nano foo.html
>


tsk... and he asked so politely too!

 
 
Toby A Inkster
 
      01-31-2004
Cliff R. wrote:

> Any ideas on what to use to create clean, basic HTML from mostly
> text-based PDFs?


I dunno about that, but I can go one step better. Ghostscript includes a
tool, "ps2ascii", that can convert PDF and PostScript files to plain text.
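
Once you have plain text, getting to basic HTML is mostly a matter of
wrapping paragraphs. Here is a minimal sketch in Python, assuming the
text file separates paragraphs with blank lines (that is an assumption
about the extracted text, not something ps2ascii promises). The function
name text_to_basic_html is made up for illustration. It won't bring back
bold or italics, since plain text has already lost them.

```python
import html
import re


def text_to_basic_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs of plain text in <p> tags."""
    # Normalise line endings so the paragraph split below is predictable.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a paragraph boundary.
    blocks = re.split(r"\n\s*\n", normalised.strip())
    paragraphs = []
    for block in blocks:
        # Join the hard-wrapped lines of one paragraph into a single line.
        joined = " ".join(
            line.strip() for line in block.splitlines() if line.strip()
        )
        if joined:
            # Escape &, <, > and quotes so the text is safe inside HTML.
            paragraphs.append("<p>" + html.escape(joined) + "</p>")
    return "\n".join(paragraphs)
```

You would read the extracted text file, pass its contents to the
function, and write the result into the body of an HTML page.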

--
Toby A Inkster BSc (Hons) ARCS
Contact Me - http://www.goddamn.co.uk/tobyink/?page=132

 
 
Leif K-Brooks
 
      02-01-2004
Terry wrote:
>>> Hi, can anyone recommend a good program that converts PDF files to
>>> HTML?

>> rm -f *.pdf
>> nano foo.html

> tsk... and he asked so politely too!


It's what I would do. PDF is a (mostly?) presentational format; HTML is
structural. Anything short of true AI won't be able to convert one to the
other well.

 