python to enable javascript, tried selenium, ghost, pyQt4 already

 
 
Jaiprakash Singh
      01-18-2014
hi,

Can you please suggest a method I could study so that I can scrap a site that has JavaScript behind it?

I have tried Selenium, Ghost, and PyQt4, but they are slow, and since I am working with threads they eat up my RAM very fast.
 
Denis McMahon
      01-18-2014
On Sat, 18 Jan 2014 03:54:17 -0800, Jaiprakash Singh wrote:

> Can you please suggest a method I could study so that I can scrap a
> site that has JavaScript behind it?


Please expand upon the requirement. Are you trying to:

a) replace server-side JavaScript with server-side Python, or
b) replace client-side JavaScript with server-side Python, or
c) replace client-side JavaScript with client-side Python, or
d) something else?

(c) is not possible (you can't guarantee that all clients will have
Python, or that there will be a mechanism for calling it from your
webpages), and (b) doesn't make a lot of sense (you'll be trading CPU in
the client for CPU in the server plus network bandwidth and latency).

--
Denis McMahon, (E-Mail Removed)
 
Chris Angelico
      01-18-2014
On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
<(E-Mail Removed)> wrote:
> hi,
>
> Can you please suggest a method I could study so that I can scrap a
> site that has JavaScript behind it?
>
> I have tried Selenium, Ghost, and PyQt4, but they are slow, and since
> I am working with threads they eat up my RAM very fast.


Do you mean "scrape"? You're trying to retrieve the displayed contents
of a web page that uses JavaScript? If so, that's basically impossible
without actually executing the JS code, which means largely
replicating the web browser.

ChrisA
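
To illustrate the point above, here is a minimal sketch using only the standard library. The page, the "price" element, and the helper names are made up for the example. It parses the markup as the server sends it, without running any script, and shows that text a script would write into the page is simply not there to be scraped.

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is written into the <div> by a script,
# so it only exists after a browser has executed the JavaScript.
PAGE = """
<html><body>
<div id="price"></div>
<script>
document.getElementById("price").textContent = "42.00";
</script>
</body></html>
"""


class DivText(HTMLParser):
    """Collect text found inside <div> elements of the static markup."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <div> elements we are currently inside
        self.chunks = []    # non-empty text fragments seen inside a <div>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def static_div_text(html):
    """Return the text inside <div> elements, with no JavaScript executed."""
    parser = DivText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(static_div_text(PAGE))  # the <div> is empty in the static markup
```

The parser sees an empty div because nothing ever runs the script. Getting the filled-in value requires something that executes JavaScript, which is why tools such as Selenium drive a real browser.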
 
Denis McMahon
      01-18-2014
On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:

> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
> <(E-Mail Removed)> wrote:
>> hi,
>>
>> Can you please suggest a method I could study so that I can scrap a
>> site that has JavaScript behind it?
>>
>> I have tried Selenium, Ghost, and PyQt4, but they are slow, and since
>> I am working with threads they eat up my RAM very fast.
>
> Do you mean "scrape"? You're trying to retrieve the displayed contents
> of a web page that uses JavaScript? If so, that's basically impossible
> without actually executing the JS code, which means largely replicating
> the web browser.


Oh, you think he meant scrape? I thought he was trying to scrap (as in
throw away / replace) an old JavaScript-heavy website with something
using Python instead.

--
Denis McMahon, (E-Mail Removed)
 
Chris Angelico
      01-18-2014
On Sun, Jan 19, 2014 at 8:40 AM, Denis McMahon <(E-Mail Removed)> wrote:
> On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:
>
>> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
>> <(E-Mail Removed)> wrote:
>>> hi,
>>>
>>> Can you please suggest a method I could study so that I can scrap a
>>> site that has JavaScript behind it?
>>>
>>> I have tried Selenium, Ghost, and PyQt4, but they are slow, and since
>>> I am working with threads they eat up my RAM very fast.
>>
>> Do you mean "scrape"? You're trying to retrieve the displayed contents
>> of a web page that uses JavaScript? If so, that's basically impossible
>> without actually executing the JS code, which means largely replicating
>> the web browser.
>
> Oh, you think he meant scrape? I thought he was trying to scrap (as in
> throw away / replace) an old JavaScript-heavy website with something
> using Python instead.


I thought so too at first, but since we had another recent case of
someone confusing the two words, and since "scrape" would make sense
in this context, I figured it'd be worth asking the question.

ChrisA
 
Giorgos Tzampanakis
      01-19-2014
On 2014-01-18, Jaiprakash Singh wrote:

> hi,
>
> Can you please suggest a method I could study so that I can scrap a
> site that has JavaScript behind it?
>
> I have tried Selenium, Ghost, and PyQt4, but they are slow, and since
> I am working with threads they eat up my RAM very fast.


I have tried selenium in the past and I remember it working reasonably
well. I am afraid you can't get around the slowness since you have to have
a web browser running.

--
Improve at backgammon rapidly through addictive quickfire position quizzes:
http://www.bgtrain.com/
 