ClientCookie bug

 
 
Mark Carter
 
      08-18-2003
> You want something like this:
>
> import ClientCookie
> c = ClientCookie.MSIECookieJar(delayload=1)
> c.load_cookie_data("hemscott-cookie.bin")
> url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
> response = ClientCookie.urlopen(url)
>
> print response.read()
> response.close()



It doesn't work on my XP machine at least. This is probably why
people have been doing it all wrong, allegedly.

I'll investigate further. Apologies for the time lag.
 
 
 
 
 
John J. Lee
 
      08-18-2003
(E-Mail Removed) (Mark Carter) writes:

> > You want something like this:
> >
> > import ClientCookie
> > c = ClientCookie.MSIECookieJar(delayload=1)
> > c.load_cookie_data("hemscott-cookie.bin")
> > url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
> > response = ClientCookie.urlopen(url)
> >
> > print response.read()
> > response.close()

>
>
> It doesn't work on my XP machine at least. This is probably why
> people have been doing it all wrong, allegedly.


I don't see how -- the examples in question (the one everybody seems
driven to copy from, and the ones people should copy from) do not
include any reference to any CookieJar.


> I'll investigate further. Apologies for the time lag.


Thanks


John
 
 
 
 
 
John J. Lee
 
      08-18-2003
Gary Feldman <(E-Mail Removed)> writes:

> On 16 Aug 2003 11:33:13 +0100, (E-Mail Removed) (John J. Lee) wrote:
>
> >
> >Take note of this comment from the web page:
> >
> >| # Don't copy this blindly! You probably want to follow the examples
> >| # above, not this one.

>
> Since the purpose of that example seems to be to show how things work under
> the hood, may I suggest putting it on a separate page, and replacing it
> here with something like "Here is a <a href...>lower level example</a> that
> shows how this works, though you would rarely want to implement things at
> this low level."


Hmm, good idea, but I really don't want to split the documentation up
-- one page is simpler -- and the example is instructive for people
who actually want to understand what the module does. And if people
manage to miss that comment (highlighted in orange, for heaven's sake),
well...


John
 
 
Mark Carter
 
      08-18-2003
> I'll investigate further.

Here are the results from running tests in ClientCookie 0.4.4.a:

def go7():
    # ClientCookie 0.4.4.a:
    # works from win xp and win 98

    # I prefer this method as a better way than using load_from_registry()
    # use this method!!

    import ClientCookie
    c = ClientCookie.MSIECookieJar()  # do NOT set delayload
    #c.user_name = "mark carter"
    c.load_cookie_data("hemscott-cookie.bin")
    #c.load_from_registry()
    print c

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    c.extract_cookies(response, request)
    # let's say this next request requires a cookie that was set in response
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    print response2.read()
    response2.close()



def go8():
    # ClientCookie 0.4.4.a:
    # contains bug in Win98 when environ variable is commented out,
    # but works in win98 when environ variable is set
    # Works in win xp, regardless of the USERNAME line

    # this works - I can now import into Hemscott
    import ClientCookie
    c = ClientCookie.MSIECookieJar(delayload=1)
    #c.user_name = "mark carter"
    #os.environ['USERNAME'] = 'mcarter'  # needed by load_from_registry()
    c.load_from_registry()

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    #c.extract_cookies(response, request)
    # let's say this next request requires a cookie that was set in response
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    for line in response2.readlines():  # body
        print line


The upshot of this is that load_cookie_data() now works in win 98 and xp.
load_from_registry() works from win xp; it works from win 98 if and
only if you set the USERNAME environment variable.

I appreciate that all the stuff about request2 and response2 may not
be to your liking - but at the moment I'm just trying to figure out
what works, and what doesn't. We can also worry about the delayload
business later.
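
What the extract_cookies() / add_cookie_header() pair does by hand can be sketched without a network connection. This is only an illustration under assumptions: it uses Python 3's http.cookiejar (the standard-library descendant of ClientCookie) rather than ClientCookie 0.4.4.a itself, FakeResponse and cookie_header_for are hypothetical helper names, and www.example.com is a placeholder host.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response carrying Set-Cookie headers."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            # Assigning to a Message appends a header rather than replacing it.
            self._headers["Set-Cookie"] = value

    def info(self):
        # CookieJar.extract_cookies() reads the headers through info().
        return self._headers


def cookie_header_for(set_url, next_url, set_cookie_values):
    """Store cookies 'received' from set_url, then return the Cookie header
    the jar would attach to a request for next_url (None if there is none)."""
    jar = http.cookiejar.CookieJar()

    # Step 1: pull the cookies out of the (fake) response to the first request.
    request = urllib.request.Request(set_url)
    jar.extract_cookies(FakeResponse(set_cookie_values), request)

    # Step 2: let the jar decide which cookies belong on the next request.
    request2 = urllib.request.Request(next_url)
    jar.add_cookie_header(request2)
    return request2.get_header("Cookie")


if __name__ == "__main__":
    url = "http://www.example.com/corp/page.htm"
    print(cookie_header_for(url, url, ["session=abc123; Path=/"]))
```

The jar only returns a cookie to a host it considers a match, so passing a next_url on an unrelated host yields no Cookie header at all.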

What do you think about the idea of actually setting up an Apache web
page to test these things 'for real'?
 
 
Gary Feldman
 
      08-18-2003
On 18 Aug 2003 14:43:06 +0100, (E-Mail Removed) (John J. Lee) wrote:

>Hmm, good idea, but I really don't want to split the documentation up
>-- one page is simpler -- and the example is instructive for people


Then definitely blockquote it (or indent it some other way), and consider
putting it into a smaller font, or using a grey background, or something
else to indicate that it's a digression. Orange would draw attention to
it; you want the opposite.

Gary

 
 
John J. Lee
 
      08-18-2003
Gary Feldman <(E-Mail Removed)> writes:

> On 18 Aug 2003 14:43:06 +0100, (E-Mail Removed) (John J. Lee) wrote:
>
> >Hmm, good idea, but I really don't want to split the documentation up
> >-- one page is simpler -- and the example is instructive for people

>
> Then definitely blockquote it (or indent it some other way), and consider
> putting it into a smaller font, or using a grey background, or something
> else to indicate that it's a digression.


Again, that would be a good idea if it *were* a digression, but it's
necessary for understanding what the module gets up to. Without that
understanding in the reader's mind, it's hard to explain the code that
one uses in practice if it's any more complicated than urlopen. And
the very top of the page says:

| import ClientCookie
| response = ClientCookie.urlopen("http://foo.bar.com/")
|
|This function behaves identically to urllib2.urlopen, except that it
|deals with cookies automatically. That's probably all you need to
|know.

So you don't even have to read further than that for most purposes. I
can't see how to improve on that, but I'm happy to learn how!
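
For readers on current Python: ClientCookie's cookie handling later entered the standard library as cookielib (Python 2.4), now http.cookiejar. A minimal sketch of the same "deals with cookies automatically" behaviour, assuming Python 3, might look like the following. The local echo server and the names CookieEchoHandler and fetch_twice are hypothetical, added only so the example runs without network access.

```python
import http.cookiejar
import http.server
import threading
import urllib.request


class CookieEchoHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test server: sets a cookie and echoes any Cookie header back."""

    def do_GET(self):
        body = self.headers.get("Cookie", "").encode("ascii")
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice():
    """Return the Cookie header the server saw on the first and second request."""
    server = http.server.HTTPServer(("127.0.0.1", 0), CookieEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    try:
        jar = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any system proxy for localhost
            urllib.request.HTTPCookieProcessor(jar),  # stores and returns cookies
        )
        with opener.open(url) as first:
            seen_first = first.read().decode("ascii")
        with opener.open(url) as second:
            seen_second = second.read().decode("ascii")
        return seen_first, seen_second
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_twice())
```

No call to extract_cookies() or add_cookie_header() appears in the calling code; the HTTPCookieProcessor attached to the opener does both on every request.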


> Orange would draw attention to
> it; you want the opposite.


Only the comment is in emacs-orange (well, my copy of python-mode uses
that kind of rust-orange for Python comments), so it doesn't
particularly draw attention to that block of code more than the rest.
And I *do* want to draw attention to the comment, so people read the
comment before the code.

Admittedly, it doesn't seem to work (on a sample of one
misinterpreter, Mark, so far -- I only just added that comment
recently, though there are several other warnings that cover the same
ground elsewhere).


John
 
 
Anand Pillai
 
      08-19-2003
I am working on a Cookie module which works *with* urllib2 rather
than on top of it like the existing ClientCookie module. It uses
the Cookie module which comes with python standard library.

This module is written as an extension of my Harvestman webcrawler.
The alpha code is ready. We are doing testing right now.

Details will be posted to my website at
http://members.lycos.co.uk/anandpillai within say 2 weeks or so.

-Anand


(E-Mail Removed) (John J. Lee) wrote in message news:<(E-Mail Removed)>...
> (E-Mail Removed) (Mark Carter) writes:
>
> > > I'll investigate further.

> >
> > Here are the results from running tests in ClientCookie 0.4.4.a:

> [...]
> > The upshot of this is that load_cookie_data() now works in win 98 and
> > xp.
> > load_from_registry() works from win xp; it works from win 98 if and
> > only if you set the USERNAME environment variable.

>
> You missed the username argument.
>
> cookiejar.load_from_registry(username="mark")
>
> (should only be required for win9x family)
>
>
> > I appreciate that all the stuff about request2 and response2 may not
> > be to your liking - but at the moment I'm just trying to figure out
> > what works, and what doesn't. We can also worry about the delayload
> > business later.

>
> No really, I wasn't joking: you *never* need to use add_cookie_header
> / extract_cookies if you're using urllib2 (at least, I can't think of
> any possible reason to do so). It can only break things.
>
>
> > What do you think about the idea of actually setting up an Apache web
> > page to test these things 'for real'?

>
> I've done limited testing on Windows with 'fake' cookies from a local
> Apache server, and on wine on linux. As I said, though, I don't have
> a networked Windows OS, so it's inconvenient to test these things in a
> 'real' situation. And my machine currently doesn't boot into Windows
> without physically switching cables around (security & obscure
> hardware issues, not software ones), which means I currently can't be
> bothered to test it on Windows. So, your feedback is appreciated.
>
>
> John

 
 
Mark Carter
 
      08-19-2003
> No really, I wasn't joking: you *never* need to use add_cookie_header
> / extract_cookies if you're using urllib2 (at least, I can't think of
> any possible reason to do so). It can only break things.


I must admit that I don't really know what I am doing. How would you
simplify the following code:

def go8():
    import ClientCookie
    c = ClientCookie.MSIECookieJar(delayload=1)
    c.load_from_registry(username='mcarter')  # only need username for win9x

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    for line in response2.readlines():  # body
        print line
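
One possible shape for the simplification, sketched under assumptions rather than taken from ClientCookie itself: load the saved cookies into a jar, attach the jar to an opener once, and let the opener send the cookies. The sketch uses Python 3's http.cookiejar. MSIECookieJar is not in the standard library, so MozillaCookieJar with a Netscape-format cookie file stands in for it here. The local server and the names EchoCookieHandler, write_cookie_file and fetch_with_saved_cookies are hypothetical.

```python
import http.cookiejar
import http.server
import os
import tempfile
import threading
import time
import urllib.request


class EchoCookieHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test server: replies with whatever Cookie header it received."""

    def do_GET(self):
        body = self.headers.get("Cookie", "").encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def write_cookie_file(path, name, value):
    """Write a one-cookie Netscape-format file valid for 127.0.0.1 for an hour."""
    expires = int(time.time()) + 3600
    # Fields: domain, domain_specified, path, secure, expires, name, value
    fields = ["127.0.0.1", "FALSE", "/", "FALSE", str(expires), name, value]
    with open(path, "w") as f:
        f.write("# Netscape HTTP Cookie File\n")
        f.write("\t".join(fields) + "\n")


def fetch_with_saved_cookies(cookie_path):
    """Load cookies from disk and return the Cookie header the server saw."""
    server = http.server.HTTPServer(("127.0.0.1", 0), EchoCookieHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    try:
        jar = http.cookiejar.MozillaCookieJar()
        jar.load(cookie_path)
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any system proxy for localhost
            urllib.request.HTTPCookieProcessor(jar),
        )
        # A single request: the opener adds the Cookie header by itself.
        with opener.open(url) as response:
            return response.read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    cookie_path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
    write_cookie_file(cookie_path, "session", "abc123")
    print(fetch_with_saved_cookies(cookie_path))
```

Compared with go8() above, the first throwaway request and the explicit add_cookie_header() call are both gone; the jar is consulted as part of opening the URL.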
 
 
Anand Pillai
 
      08-20-2003
Hi John

I wanted to add cookies support to harvestman, your module
looked ideal for it.

'We', nothing royal about it. It is just me and my friend
& co-developer Nirmal Chidambaram. Apparently he has found
a way around some of the bugs in ClientCookie. He has written
a new module using the existing Cookie module of python &
urllib2. One of the problems 'we' had with ClientCookie is that
it uses its own 'urlopen' methods, which do not fit our
application's needs, so 'we' had to find a way around it.

Once the code is ready, I will post it on my webpage, and
of course it is not a module in itself, so I think an
announcement to c.l.py is out of place.

Regards

-Anand


(E-Mail Removed) (John J. Lee) wrote in message news:<(E-Mail Removed)>...
> (E-Mail Removed) (Anand Pillai) writes:
>
> > I am working on a Cookie module which works *with* urllib2 rather
> > than on top of it like the existing ClientCookie module. It uses
> > the Cookie module which comes with python standard library.

>
> Interesting, though I don't know quite what you mean.
>
> First, if there's a way to work more closely with urllib2 than I've
> figured out (which is quite possible), this patch needs to know about
> it, so please post a comment:
>
> http://www.python.org/sf/759792
>
> If I understand what you mean, ClientCookie only works 'on top of'
> rather than 'with' urllib2 to the extent that it currently has to
> cut-n-paste code to add cookie handling to urllib2. That patch is
> designed to remove the need to cut-n-paste, which would mean you'd do
> urllib2.urlopen (after building an OpenerDirector that has an
> HTTPCookieProcessor from ClientCookie) instead of ClientCookie.urlopen
> as is required at present.
>
> Second: is your module intended to do what ClientCookie does
> (ie. figure out what cookies should be set and returned, and do so),
> or is it just a more OO way of getting and returning Cookie headers?
> I guess the latter?
>
>
> > This module is written as an extension of my Harvestman webcrawler.
> > The alpha code is ready. We are doing testing right now.

>
> Is this the Royal We?
>
>
> > Details will be posted to my website at
> > http://members.lycos.co.uk/anandpillai within say 2 weeks or so.

> [...]
>
> Please do post an announcement to c.l.py.announce, or I'll forget.
>
>
> John

 
 
John J. Lee
 
      08-20-2003
(E-Mail Removed) (Anand Pillai) writes:
[...]
> 'We', nothing royal about it. It is just me and my friend
> & co-developer Nirmal Chidambaram. Apparently he has found
> a way around some of the bugs in ClientCookie. He has written


It'd be great if you made me aware what those bugs are!

(BTW, no intent to offend with my comment about your plurality, or
lack thereof -- it's just that the convention of using 'we' in source
code comments is common enough that I've sometimes found myself using
it even when writing code alone, which is funny.)


> a new module using the existing Cookie module of python &
> urllib2. One of the problems 'we' had with ClientCookie is that
> it uses its own 'urlopen' methods, which do not fit our
> application's needs, so 'we' had to find a way around it.


As I said before, if you know how to do that, please comment on the
RFE I referenced in my last post. Jeremy Hylton is planning to look
at the patch associated with that RFE in detail sometime, and you
could save him some time if you know a way to do this without patching
urllib2. And I'd like to know how to do it, too.


> Once the code is ready, I will post it on my webpage, and
> of course it is not a module in itself, so I think an
> announcement to c.l.py is out of place.

[...]

Would you mind sending me an email?

Thanks


John
 
 
 
 