HTML2JPEG

 
 
katja
 
      12-07-2004
How can I convert a web page to a PDF or JPEG file from a
Perl script under Linux? (I would like to take a screenshot from the
command line.)
Thank you,
Katja
 
Roman M. Parparov
 
      12-07-2004
katja wrote:
> How can I convert a web page to a PDF or JPEG file from a
> Perl script under Linux? (I would like to take a screenshot from the
> command line.)
> Thank you,
> Katja


Since every browser on every platform renders web pages differently for
every user, it is unclear whether your task is well-defined, so a general
answer may not be possible. You can easily download the contents of a web
page as a text file. Document conversion on Linux is often done with the
wv suite (wvText, wvPDF et al.), which drives installed converters such as
latex and distill through shell-script wrappers; note that wv itself is
aimed at Word documents.
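
The download step mentioned above could look roughly like the following in Perl. This is only a sketch under assumptions: it uses HTTP::Tiny (which ships with Perl 5.14 and later; the post itself does not name a module), and the function name save_page and its optional third argument are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# save_page is a hypothetical helper: fetch $url and write the raw
# response body to $file. The optional $http argument lets a caller
# pass in their own HTTP::Tiny-compatible object.
sub save_page {
    my ($url, $file, $http) = @_;
    $http ||= HTTP::Tiny->new;

    my $response = $http->get($url);
    die "Fetch failed: $response->{status} $response->{reason}\n"
        unless $response->{success};

    open(my $out, '>', $file) or die "Cannot write $file: $!\n";
    binmode($out);                      # store the bytes unchanged
    print {$out} $response->{content};
    close($out) or die "Cannot close $file: $!\n";

    return length $response->{content}; # number of bytes saved
}

# Usage: perl save_page.pl URL OUTPUT_FILE
save_page(@ARGV[0, 1]) if @ARGV == 2;
```

Note that this saves the HTML source only; it does not render anything, so it is not yet a screenshot.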

--
Roman M. Parparov - NASA EOSDIS project node at TAU technical manager.
Email: http://www.nasa.proj.ac.il/
Phone/Fax: +972-(0)3-6405205 (work), +972-(0)50-734-18-34 (home)
----------------------------------------------------------------------
The economy depends about as much on economists as the weather does on
weather forecasters.
-- Jean-Paul Kauffmann
 
Katja Zinchenko
 
      12-08-2004
Thanks for the hint, and sorry, yes, my question was too vague. What I wanted to
ask is: Is there a perlish way to save the contents of a web page to an image
file exactly the way a browser (any recent browser) would render it? I'd like to
pass my script a URL and automatically get a kind of screenshot of the web
page. There doesn't seem to be a Perl module (at least not on CPAN) for that
purpose, and as far as I could find out, Mozilla, Firefox, Galeon, Konqueror and
Opera don't support any command-line options to have them print to a file.

Thank you,
Katja
 
henq
 
      12-09-2004
Would http://freshmeat.net/projects/htmldoc/ help?

Regards,

Henk

www.windsurfpedia.com
 
Roman M. Parparov
 
      12-09-2004
Katja Zinchenko wrote:

> Thanks for the hint, and sorry, yes, my question was too vague. What I wanted to
> ask is: Is there a perlish way to save the contents of a web page to an image
> file exactly the way a browser (any recent browser) would render it? I'd like to
> pass my script a url and automatically get a kind of screen shot of the web
> page. There doesn't seem to be a perl module (at least not on CPAN) to that
> purpose, and as far as I could find out, mozilla, firefox, galeon, konqueror and
> opera don't support any command line options to have them print to a file.


> Thank you,
> Katja


But even the same browser would render it differently for different users.
You should also keep in mind that most pages do not fit within one browser
window, which is typically about 1024x768 pixels.

The only way that comes to mind, as an abstract outline:
1) Launch the browser on the requested URL from a Perl script.
Your browser and whatever it displays are an X11 window within your
window manager/desktop environment.

2) Use the Perl library corresponding to your window manager, or even an
X11 Perl interface, to capture the window into an image.

OR

2) Use the scriptlets supplied with your window manager (some have them) that
do the window capture. AFAIK AfterStep has this ability. The standard X11
utilities also include xwd, which can dump a single window or the full
desktop; knowing your browser's X11 geometry, you can then crop out the
browser itself.

The image would usually be saved in the XWD format.

3) Use a Perl image conversion library (or an external application)
to convert XWD to JPG or whatever you need.

Note: all this could be done with a shell script and existing Linux
utilities, without Perl.
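
Steps 2 and 3 above might be sketched in Perl like this. It rests on several assumptions: the browser from step 1 is already running and showing the page, the X11 tool xwd and ImageMagick's convert are installed, and the window title is known exactly. The function names capture_commands and run_commands are invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the two external commands (hypothetical helper):
#   xwd     dumps the X11 window whose title is $window_name to an XWD file
#   convert (ImageMagick) turns the XWD file into a JPEG
sub capture_commands {
    my ($window_name, $basename) = @_;
    my $xwd_file = "$basename.xwd";
    my $jpg_file = "$basename.jpg";
    return (
        [ 'xwd', '-name', $window_name, '-out', $xwd_file ],
        [ 'convert', $xwd_file, $jpg_file ],
    );
}

# Run each command in list form (no shell involved) and stop at the
# first one that fails.
sub run_commands {
    my @commands = @_;
    for my $cmd (@commands) {
        system(@$cmd) == 0
            or die "Command failed: @$cmd (exit status $?)\n";
    }
    return 1;
}

# Usage: perl shot.pl "Exact window title" output-basename
if (@ARGV == 2) {
    run_commands(capture_commands(@ARGV));
}
```

This only captures what is visible in the window, so the caveat about pages that do not fit into one browser window still applies.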

 
Jürgen Exner
 
      12-09-2004
Katja Zinchenko wrote:
> Thanks for the hint, and sorry, yes, my question was too vague. What
> I wanted to ask is: Is there a perlish way to save the contents of a
> web page to an image file exactly the way a browser (any recent
> browser) would render it?


Of course. And it is so trivial that you don't even need Perl.
Just use Lynx. It has an option to write the rendered text to a file:

-dump
dumps the formatted output of the default document or one specified on the
command line to standard out.

For further details, see
http://lynx.isc.org/current/lynx2-8-...ers_guide.html
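
If you do want to call it from Perl anyway, a minimal sketch could look like this, assuming lynx is installed and on the PATH. The function names lynx_dump_command and dump_page_text are made up for the example; only the -dump option comes from Lynx itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the command line for a plain-text dump of $url.
sub lynx_dump_command {
    my ($url) = @_;
    return ('lynx', '-dump', $url);
}

# Run lynx and return the formatted text it writes to standard output.
sub dump_page_text {
    my ($url) = @_;
    my @cmd = lynx_dump_command($url);

    # List-form pipe open: no shell, so the URL needs no quoting.
    open(my $lynx, '-|', @cmd) or die "Cannot run lynx: $!\n";
    local $/;                       # slurp the whole output at once
    my $text = <$lynx>;
    close($lynx) or die "lynx exited with status $?\n";

    return $text;
}

# Usage: perl lynxdump.pl URL > page.txt
print dump_page_text($ARGV[0]) if @ARGV;
```

The result is rendered text only, with no images or layout.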

jue


 
 
Roman M. Parparov
 
      12-09-2004
Jürgen Exner wrote:
> Katja Zinchenko wrote:
> > Thanks for the hint, and sorry, yes, my question was too vague. What
> > I wanted to ask is: Is there a perlish way to save the contents of a
> > web page to an image file exactly the way a browser (any recent
> > browser) would render it?


> Of course. And it is so trivial that you don't even need Perl.
> Just use Lynx. It has an option to write the rendered text to a file:


> -dump
> dumps the formatted output of the default document or one specified on the
> command line to standard out.


I suspect Katja wants to take screenshots of specific pages _including_
graphics, so lynx isn't sufficient.

> Further details see
> http://lynx.isc.org/current/lynx2-8-...ers_guide.html


> jue




 
Jürgen Exner
 
      12-09-2004
Roman M. Parparov wrote:
> Jürgen Exner wrote:
>> Katja Zinchenko wrote:
>>> Thanks for the hint, and sorry, yes, my question was too vague. What
>>> I wanted to ask is: Is there a perlish way to save the contents of a
>>> web page to an image file exactly the way a browser (any recent
>>> browser) would render it?

>
>> Of course. And it is so trivial that you don't even need Perl.
>> Just use Lynx. It has an option to write the rendered text to a file:

>
>> -dump
>> dumps the formatted output of the default document or one specified
>> on the command line to standard out.

>
> I suspect Katja wants to take screenshots of specific pages
> _including_ graphics, so lynx isn't sufficient.


<quote>
the way [...] any recent browser would render it
</quote>

Anything beyond that is pure speculation.

jue


 