Velocity Reviews - Computer Hardware Reviews

Transparent (redirecting) proxy with BaseHTTPServer

 
 
paul koelle
      01-27-2005
Hi list,

My ultimate goal is to have a small HTTP proxy that can show a
message specific to the client's name/IP/status and then handle the original
request normally, either by redirecting the client or by acting as a proxy.

I started with a modified[1] version of TinyHTTPProxy, posted to this list
by Suzuki Hisao sometime in 2003, and tried to extend it to my needs.
It works quite well if I configure my client to use it, but using the
iptables REDIRECT feature to point clients transparently at the
proxy caused some issues.

Specifically, the "self.path" member variable of BaseHTTPRequestHandler is
missing the scheme and the host (i.e. http://www.python.org) part of the
request line for REDIRECTed connections:

without iptables REDIRECT:
self.path -> GET http://www.python.org/ftp/python/contrib/ HTTP/1.1

with REDIRECT:
self.path -> GET /ftp/python/contrib/ HTTP/1.1

I asked about this on the squid mailing list and was told that this is normal
and that I have to reconstruct the request line from the real destination IP,
the URL path, and the Host header (if any). If the Host header is sent,
it's an (unsafe) no-brainer, but I cannot for the life of me figure out
where to get the "real destination IP". Any ideas?

thanks
Paul

[1] HTTP Debugging Proxy
Modified by Xavier Defrang (http://defrang.com/)
 
aurora
 
      01-27-2005
If you actually want the IP, resolving the Host header would give you that.
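
Resolving the Host header only works when the client sends one, and the resolved address may differ from the one the client actually connected to. On Linux, netfilter keeps the pre-REDIRECT destination for IPv4 connections, and it can usually be read back from the accepted socket with the SO_ORIGINAL_DST socket option. The following is a minimal sketch under those assumptions (Linux, IPv4, an iptables REDIRECT rule in place); the helper names `parse_sockaddr_in` and `original_destination` are made up for illustration, and the constant is spelled out because the `socket` module does not export it.

```python
import socket
import struct

# Value of SO_ORIGINAL_DST from <linux/netfilter_ipv4.h>. The socket module
# does not export it, so it is hard-coded here (Linux only, assumption).
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw):
    """Decode a 16-byte struct sockaddr_in into an (ip, port) tuple."""
    # Layout: 2 bytes family (skipped), 2 bytes port in network order,
    # 4 bytes IPv4 address, 8 bytes padding.
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(packed_ip), port


def original_destination(conn):
    """Return the (ip, port) a REDIRECTed client originally connected to.

    `conn` is the accepted client socket. This only yields a meaningful
    answer on Linux for connections that went through netfilter NAT.
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

Inside a request handler the accepted socket is available as `self.connection`, so a call might look like `ip, port = original_destination(self.connection)`.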

In the redirect case you should get a host header like

Host: www.python.org

From that you can reconstruct the original URL as
http://www.python.org/ftp/python/contrib/. With that you can open it using
urllib and proxy the data to the client.
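
A rough sketch of that reconstruction, written for Python 3 (where BaseHTTPServer became `http.server`). The function name `absolute_url` and the `fallback_host` parameter are illustrative assumptions, not part of TinyHTTPProxy. Note that in the handler, `self.path` holds only the request target, while the full request line is available as `self.requestline`.

```python
def absolute_url(path, host_header, fallback_host=None):
    """Rebuild the absolute URL a transparent proxy should fetch.

    path          -- request target as found in self.path
    host_header   -- value of the Host header, or None if it was not sent
    fallback_host -- hypothetical substitute (for example the original
                     destination IP) used when there is no Host header
    """
    # A client configured to use the proxy already sends an absolute URL.
    if path.lower().startswith(("http://", "https://")):
        return path
    host = (host_header or fallback_host or "").strip()
    if not host:
        raise ValueError("no Host header and no fallback destination")
    # REDIRECTed plain-HTTP traffic: prepend scheme and host to the path.
    return "http://%s%s" % (host, path)
```

In a handler this could be called as `absolute_url(self.path, self.headers.get("Host"))`, and the result passed to `urllib.request.urlopen`. Trusting the Host header blindly is the "unsafe" part mentioned in the question, since the client controls its value.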

The second form of HTTP request, without the host part, exists for
compatibility with the pre-HTTP/1.1 standard. All modern web browsers should
send the Host header.




 
paul koelle
 
      01-27-2005

Thanks, aurora,

aurora wrote:
> If you actually want the IP, resolving the Host header would give you that.

I'm only interested in the hostname.

>
> The second form of HTTP request, without the host part, exists for
> compatibility with the pre-HTTP/1.1 standard. All modern web browsers
> should send the Host header.

How safe is the assumption that the Host header will be there? Is it part
of the HTTP/1.1 spec? And does it mean all "pre-1.1" clients will fail?
Hmm, maybe I should look on the wire to see what's really happening...

thanks again
Paul
 
 
aurora
 
      01-28-2005
It should be very safe to count on the Host header. Some really old
browsers might not send it, but they probably won't work on today's web
anyway. The majority of today's web sites are likely to be virtually
hosted; one Apache instance may be hosting 50 web addresses. If a client
strips the host name and doesn't send the Host header either, the web
server won't know which address it is really looking for. If you catch a
request that doesn't have a Host header, it is a good idea to redirect it
to a browser upgrade page.
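
For reference, HTTP/1.1 (RFC 2616, section 14.23) requires a Host header on every request, while HTTP/1.0 clients may omit it. One way the redirect idea could look in a Python 3 `http.server` handler is sketched below. `UPGRADE_URL`, `missing_host`, and `ProxyHandler` are hypothetical names chosen for this example, and the proxying itself is left out.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical address of the browser upgrade page (assumption).
UPGRADE_URL = "http://proxy.example/upgrade.html"


def missing_host(headers):
    """Return True when the request carries no usable Host header."""
    return not (headers.get("Host") or "").strip()


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if missing_host(self.headers):
            # No Host header: send the client to the upgrade page.
            self.send_response(302)
            self.send_header("Location", UPGRADE_URL)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        # Real proxying would go here; this sketch only reports the target.
        target = "http://%s%s" % (self.headers["Host"].strip(), self.path)
        body = ("would fetch %s\n" % target).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()` should be enough, with `HTTPServer` imported from `http.server`.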



 