Velocity Reviews - Computer Hardware Reviews


Daniel Pitts
Guest
Posts: n/a
 
      01-04-2007

ck wrote:
> Daniel Pitts wrote:
> > John Ersatznom wrote:
> > > ck wrote:
> > > > The catch I was talking about is the need to start the server every 6
> > > > hours for free accounts on eatj.
> > >
> > > Seems pointless. Especially as anyone with even a tiny bit of real
> > > wizardry can quickly knock up a scraper and a cron job to run on their
> > > own PC to automate it.

> >
> > I was thinking the same thing.
> > It would still be a PITA though.

>
> I don't know anything about the cron daemon and how it works, but can a
> cron daemon fix this issue of logging in to the server and restarting
> the app? (I believe the server checks which URL the request is
> generated from. That should be quite an easy way to make sure that the
> requests are not automated. I might be wrong, but that's what I think
> happens.)
> If I am wrong, could you give an idea of how to do that? I tried doing this
> using JavaScript which refreshes every 20 minutes, but that failed
> too.
> By the way, I did try using Commons HttpClient to do the same thing. But
> that was no good either.
>
> Cheers,
> Ck
> http://www.gfour.net


cron just lets you schedule a program or script to run.

As for URLs and referers, those are all set by the client. You can set
the Referer URL to whatever you want. Referer is nothing more than a
header set by the user agent. I've had to write a program that
pretended to be Mozilla, logged into a web site (saving cookies),
did some things, and then logged off. It had to be indistinguishable
from a regular user.

It wasn't too hard to do.
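
For illustration, here is a minimal Java sketch of the point about headers: Referer and User-Agent are ordinary request headers that the client fills in itself. The class name `BrowserLikeRequest`, the user agent string, and the example URLs in the note below are assumptions made up for this sketch; none of them come from eatj or from the program described in this post. Only `java.net.HttpURLConnection` from the JDK is used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class BrowserLikeRequest {
    // Hypothetical browser-style user agent string; the client may send any value.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";

    /**
     * Prepares (but does not yet send) a GET request whose Referer and
     * User-Agent headers are chosen by the caller.
     */
    static HttpURLConnection open(String url, String referer) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", USER_AGENT);
            conn.setRequestProperty("Referer", referer);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `getResponseCode()` on the connection returned by, say, `BrowserLikeRequest.open("http://www.example.com/restart", "http://www.example.com/account")` would actually send the request. The server only ever sees the header values, not what produced them.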

 
 
ck
Guest
Posts: n/a
 
      01-04-2007
Daniel Pitts wrote:
> cron just allows you to schedule a program/script run.


Yeah, I get that. I just read about it. I am not a Linux user myself,
though.

> As for URLs and referers, those are all set by the client. You can set
> the Referer URL to whatever you want. Referer is nothing more than a
> header set by the user agent. I've had to write a program that
> pretended to be Mozilla, logged into a web site (saving cookies),
> did some things, and then logged off. It had to be indistinguishable
> from a regular user.


Yeah, I did not think of that (inexperience): we can set the Referer.
Another way to automate this would be to host a web app somewhere and
let that web app log in to the account and log off. Would that be a bad
idea?

> It wasn't too hard to do


Well, that's a perspective.

Cheers,
Ck

 
 
John Ersatznom
Guest
Posts: n/a
 
      01-05-2007
ck wrote:
> Daniel Pitts wrote:
>
>>John Ersatznom wrote:
>>
>>>ck wrote:
>>>
>>>>The catch I was talking about is the need to start the server every 6
>>>>hours for free accounts on eatj.
>>>
>>>Seems pointless. Especially as anyone with even a tiny bit of real
>>>wizardry can quickly knock up a scraper and a cron job to run on their
>>>own PC to automate it.

>>
>>I was thinking the same thing.
>>It would still be a PITA though.

>
> I don't know anything about the cron daemon and how it works, but can a
> cron daemon fix this issue of logging in to the server and restarting
> the app? (I believe the server checks which URL the request is
> generated from. That should be quite an easy way to make sure that the
> requests are not automated. I might be wrong, but that's what I think
> happens.)


It shouldn't be hard. There are even tools to script IE so you can
puppet it into simulating the user interaction. There is NO way the
remote site can detect that even in principle, especially with random
delays of a few fractions of a second built into the "playback" of the
"macro". In any event, you clearly need to set the Referer (sic) header
to the correct origin page, and maybe even request that page first, and
a few other things. But if a Web browser (driven by certain user input)
can do it, then so can Java code. The user + computer combo at the other
end of the network connection is a black box to the site, and the only
way it can be sure no human's involved is if the client can't get past a
captcha or generates inhumanly large amounts of traffic: asking for a
page every millisecond, or 300 in a day, or something else no normal
human being would do.

Sites shouldn't even try, other than to use captchas to frustrate
spambots trying to actually spam via guestbooks, forums, and blog
comments and to block IPs for the rest of the day that exceed traffic
limits no human would exceed. They can block known-bad IPs and
user-agents too but those are easy for a genuine malefactor to change
and IPs can be hard for an individual of limited resources caught by a
false positive to ditch, so I don't recommend that. Just temporarily
black out excessive traffic sources to effectively throttle per-visitor
bandwidth costs and if actual plagiarised material from your site shows
up elsewhere complain to the hosting provider to get it taken down.
Things like the DMCA are for that, not for restricting users' choice of
tools and automation.
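
As a rough sketch of the "black out excessive traffic sources for the rest of the day" idea, a server could keep a per-IP request count that resets when the date changes. The class name `DailyRequestLimiter`, the limit passed to its constructor, and the choice to key on the IP string are all assumptions for illustration; a real site would also have to deal with proxies, shared addresses, and persistence.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

final class DailyRequestLimiter {
    private final int maxPerDay;                       // hypothetical per-visitor limit
    private final Map<String, Integer> counts = new HashMap<>();
    private LocalDate day;                             // the day the counts belong to

    DailyRequestLimiter(int maxPerDay) {
        this.maxPerDay = maxPerDay;
    }

    /**
     * Records one request from the given source and reports whether that
     * source is still within today's limit.
     */
    synchronized boolean allow(String ip, LocalDate today) {
        if (!today.equals(day)) {
            // New day: forget yesterday's counts, which lifts every block.
            counts.clear();
            day = today;
        }
        int seen = counts.merge(ip, 1, Integer::sum);
        return seen <= maxPerDay;
    }
}
```

A request handler would call `allow(remoteAddress, LocalDate.now())` and answer with an error page when it returns `false`. The block is temporary by construction, so a false positive costs a visitor at most the rest of the day.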
 
 
John Ersatznom
Guest
Posts: n/a
 
      01-06-2007
ck wrote:
> Daniel Pitts wrote:
>
>>cron just allows you to schedule a program/script run.

>
> Yea I get that. Just read about it. I am not a linux user myself
> though.


Very handy. Years ago, I set up a cron job to
retrieve a "picture of the day" from a web site (no, not Wikipedia) and
save it to a fixed location every night at 1 am ... a fixed location on
my Windows desktop, that is. Yep, it can be ported over and even run as
an NT service. Getting it not to try to send me email (and choke when it
failed, since unix mail spooling wasn't set up) was the hard part. The
job called a script that was a thin wrapper around wget. Of course I had
to move, rename, or otherwise deal with the file so the next day's
wouldn't obliterate it. These days, in the name of not being blocked for
the heinous crime of saving a few keystrokes a day, I'd put it on a
random delay (inside the script) of up to a few hours and fake a Mozilla
user agent string of some kind, though.
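
The random delay mentioned above could look roughly like this if the job were written in Java instead of a shell script around wget. This is only a sketch: the class name `JitteredJob` and its method names are made up here, and it assumes something else (cron, or a long-running scheduler) triggers it at the fixed time.

```java
import java.time.Duration;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

final class JitteredJob {
    /** Picks a random delay in the range [0, maxJitter). */
    static Duration randomJitter(Duration maxJitter) {
        long boundMillis = maxJitter.toMillis();
        if (boundMillis <= 0) {
            return Duration.ZERO;
        }
        return Duration.ofMillis(ThreadLocalRandom.current().nextLong(boundMillis));
    }

    /** Runs the job once, after a random delay of less than maxJitter. */
    static ScheduledFuture<?> runAfterRandomDelay(
            ScheduledExecutorService pool, Runnable job, Duration maxJitter) {
        long delayMillis = randomJitter(maxJitter).toMillis();
        return pool.schedule(job, delayMillis, TimeUnit.MILLISECONDS);
    }
}
```

With a pool from `Executors.newSingleThreadScheduledExecutor()`, calling `runAfterRandomDelay(pool, job, Duration.ofHours(3))` at 1 am would fetch the file at some unpredictable point before 4 am.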

 
 
Alex Hunsley
Guest
Posts: n/a
 
      01-10-2007
ck wrote:
> On Jan 1, 10:08 pm, "Andrew Thompson" <(E-Mail Removed)> wrote:
>> ck wrote:
>>> Daniel Pitts wrote:
>>>> danny wrote:
>>>>> Anyone Know of a good free webhosting site that supports java?
>>>> Do you mean Java servlets, or do you mean Applets on the page?
>>>> Applets on the page doesn't require any special hosting.
>>>> You're highly unlikely to get free hosting for a Java servlet, whatever
>>>> the quality.
>>> You could try www.eatj.com. It does provide free hosting (there is an
>>> annoying catch though)
>> What catch? I would expect 'in page' ads.

>> I had a look over their 'free offer' specs., and
>> the limits seem modest, but fine for a small,
>> relatively sedate (not high volume) site.
>> The 'terms and conditions' do not seem
>> beyond reason..
>>
>> Andrew T.

>
> The catch I was talking about is the need to start the server every 6
> hours for free accounts on eatj.


Couldn't we set up a servlet somewhere else to access their web
interface and automatically restart the servlets?

)
 
 
ck
Guest
Posts: n/a
 
      01-10-2007

Alex Hunsley wrote:
> > The catch I was talking about is the need to start the server every 6
> > hours for free accounts on eatj.

>
> Couldn't we set up a servlet somewhere else to access their web
> interface and automatically restart the servlets?
>
> )


Well, that's exactly what I was talking about. I tried it using Commons
HttpClient, but it results in a 401 error (mostly, a redirect by the
server). I have tried enabling follow-redirects, but that results in an
exception.
If anyone is interested, I can post the code that fails.

Cheers,
Ck
http://www.gfour.net

 
 
Daniel Pitts
Guest
Posts: n/a
 
      01-10-2007

ck wrote:
> Alex Hunsley wrote:
> > > The catch I was talking about is the need to start the server every 6
> > > hours for free accounts on eatj.

> >
> > Couldn't we set up a servlet somewhere else to access their web
> > interface and automatically restart the servlets?
> >
> > )

>
> Well that's exactly what I was talking about. I tried it using commons
> http client. Though it results in 401 error. (mostly, redirect by the
> server). I have tried to set follow redirect but that results in
> exception.
> If anyone is interested I can put the code that fails.
>
> Cheers,
> Ck
> http://www.gfour.net

401 is "Unauthorized", meaning you probably need to do some
authentication. This is likely to be a cookie you haven't retrieved or
haven't sent. Make sure your program will actually log in properly
before attempting anything else.
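
To make the cookie point concrete, here is a small sketch using the JDK's own `java.net.CookieManager` rather than Commons HttpClient (a swapped-in technique, not the code from this thread). It shows the two halves that have to happen: remembering the `Set-Cookie` value from the login response, and sending it back as a `Cookie` header on later requests. The class name `SessionCookies`, the helper names, and the cookie name in the note are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.List;
import java.util.Map;

final class SessionCookies {
    /** Creates an in-memory cookie jar that accepts every cookie it is given. */
    static CookieManager newJar() {
        return new CookieManager(null, CookiePolicy.ACCEPT_ALL);
    }

    /** Stores the value of a Set-Cookie header received from the given URI. */
    static void remember(CookieManager jar, URI uri, String setCookieValue) {
        try {
            jar.put(uri, Map.of("Set-Cookie", List.of(setCookieValue)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds the Cookie header value to send with a request to the given URI. */
    static String cookieHeaderFor(CookieManager jar, URI uri) {
        try {
            Map<String, List<String>> headers = jar.get(uri, Map.of());
            return String.join("; ", headers.getOrDefault("Cookie", List.of()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the login response set something like `JSESSIONID=abc123; Path=/`, a later request to the same host should carry `Cookie: JSESSIONID=abc123`. Calling `CookieHandler.setDefault(jar)` once at startup makes `HttpURLConnection` do both steps automatically.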

 
 
Alex Hunsley
Guest
Posts: n/a
 
      01-11-2007
ck wrote:
> Alex Hunsley wrote:
>>> The catch I was talking about is the need to start the server every 6
>>> hours for free accounts on eatj.

>> Couldn't we set up a servlet somewhere else to access their web
>> interface and automatically restart the servlets?
>>
>> )

>
> Well that's exactly what I was talking about. I tried it using commons
> http client. Though it results in 401 error. (mostly, redirect by the
> server). I have tried to set follow redirect but that results in
> exception.
> If anyone is interested I can put the code that fails.


As Daniel pointed out, you may be missing a cookie. The other classic
things they might be looking for are:

1) referrer string - if it's not apparently the correct URL from their
site going to their form submission URL (as referrer), they might block
2) user agent - if it doesn't look like some sort of standard browser,
they might also block

To get under the bonnet and see what HTTP traffic is doing, I recommend
something like the Proxomitron web proxy: you set your browser to use
this proxy, and then check out the Proxomitron log window (turn on
things like 'view posted data' too) - very handy for seeing what cookies
are set, the exact headers flying about, where a redirect happened,
etc., when you visit a site and do things.
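
The same kind of information can also be printed from the Java side, which helps when comparing what the program receives with what the browser receives. The following is a sketch only, using the JDK's `HttpURLConnection`; the class name `ResponseInspector` and its method names are invented for this example. Redirect following is switched off so that the 3xx response and its `Location` header can be looked at directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.List;
import java.util.Map;

final class ResponseInspector {
    /** True for the 3xx statuses that normally carry a Location header. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a possibly relative Location value against the requested URI. */
    static URI resolveLocation(URI requested, String location) {
        return requested.resolve(location);
    }

    /** Formats a status code and header map, one header value per line. */
    static String dump(int status, Map<String, List<String>> headers) {
        StringBuilder out = new StringBuilder("status: " + status + "\n");
        headers.forEach((name, values) -> {
            if (name != null) { // HttpURLConnection keeps the status line under a null key
                for (String value : values) {
                    out.append(name).append(": ").append(value).append('\n');
                }
            }
        });
        return out.toString();
    }

    /** Sends a GET without following redirects and returns the formatted response head. */
    static String inspect(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // look at the 3xx response itself
            return dump(conn.getResponseCode(), conn.getHeaderFields());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Printing `ResponseInspector.inspect(url)` for the login and restart URLs shows the status, any `Set-Cookie` headers, and where a redirect points, which can then be checked against the proxy log.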

Proxomitron is free (as in beer) and funky:
http://www.proxomitron.info/

Also there's a decent Java HTTP proxy called Charles, but it's not free
- you have to buy it after N days trial.

lex
 
 
Daniel Pitts
Guest
Posts: n/a
 
      01-11-2007

Alex Hunsley wrote:
> ck wrote:
> > Alex Hunsley wrote:
> >>> The catch I was talking about is the need to start the server every 6
> >>> hours for free accounts on eatj.
> >> Couldn't we set up a servlet somewhere else to access their web
> >> interface and automatically restart the servlets?
> >>
> >> )

> >
> > Well that's exactly what I was talking about. I tried it using commons
> > http client. Though it results in 401 error. (mostly, redirect by the
> > server). I have tried to set follow redirect but that results in
> > exception.
> > If anyone is interested I can put the code that fails.

>
> As Daniel pointed out, you may be missing a cookie. The other classic
> things they might be looking for are:
>
> 1) referrer string - if it's not apparently the correct URL from their
> site going to their form submission URL (as referrer), they might block
> 2) user agent - if it doesn't look like some sort of standard browser,
> they might also block
>
> To get under the bonnet and see what HTTP traffic is doing, I recommend
> something like the Proxomitron web proxy: you set your browser to use
> this proxy, and then check out the Proxomitron log window (turn on
> things like 'view posted data' too) - very handy for seeing what cookies
> are set, the exact headers flying about, where a redirect happened,
> etc., when you visit a site and do things.
>
> Proxomitron is free (as in beer) and funky:
> http://www.proxomitron.info/
>
> Also there's a decent Java HTTP proxy called Charles, but it's not free
> - you have to buy it after N days trial.
>
> lex

I suggest using Ethereal to monitor traffic. No need to set up a
proxy, and you can monitor other types of communication, not just HTTP.

 
 
Alex Hunsley
Guest
Posts: n/a
 
      01-11-2007
Daniel Pitts wrote:
> Alex Hunsley wrote:
>> ck wrote:
>>> Alex Hunsley wrote:
>>>>> The catch I was talking about is the need to start the server every 6
>>>>> hours for free accounts on eatj.
>>>> Couldn't we set up a servlet somewhere else to access their web
>>>> interface and automatically restart the servlets?
>>>>
>>>> )
>>> Well that's exactly what I was talking about. I tried it using commons
>>> http client. Though it results in 401 error. (mostly, redirect by the
>>> server). I have tried to set follow redirect but that results in
>>> exception.
>>> If anyone is interested I can put the code that fails.

>> As Daniel pointed out, you may be missing a cookie. The other classic
>> things they might be looking for are:
>>
>> 1) referrer string - if it's not apparently the correct URL from their
>> site going to their form submission URL (as referrer), they might block
>> 2) user agent - if it doesn't look like some sort of standard browser,
>> they might also block
>>
>> To get under the bonnet and see what HTTP traffic is doing, I recommend
>> something like the Proxomitron web proxy: you set your browser to use
>> this proxy, and then check out the Proxomitron log window (turn on
>> things like 'view posted data' too) - very handy for seeing what cookies
>> are set, the exact headers flying about, where a redirect happened,
>> etc., when you visit a site and do things.
>>
>> Proxomitron is free (as in beer) and funky:
>> http://www.proxomitron.info/
>>
>> Also there's a decent Java HTTP proxy called Charles, but it's not free
>> - you have to buy it after N days trial.
>>
>> lex

> I suggest using Ethereal to monitor traffic. No need to set up a
> proxy, and you can monitor other types of communication, not just HTTP.


I only just recently found out about Ethereal forking off into a product
called Wireshark. The Ethereal site doesn't mention Wireshark for weird
legal reasons. I believe Wireshark is the more up-to-date/recent project...

Yup, Ethereal/Wireshark is good stuff for monitoring HTTP traffic and
more, though it may be a little harder to use (not bad once you know
how). I prefer Proxomitron for lightweight HTTP monitoring.







 