Velocity Reviews - Computer Hardware Reviews

Web robots

 
 
Paul
Guest
Posts: n/a
 
      08-23-2006
I am tearing my hair out. It appears my website is under attack from
these search engines. I have heard that I can place code in my header
somewhere to stop this. Any help?

The browser information that I have collected shows the following:

Mozilla/5.0 (compatible; Yahoo! Slurp;
http://help.yahoo.com/help/us/ysearch/slurp)

Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)

Please help.

Desmond.

 
 
 
 
 
Andy Dingley
Guest
Posts: n/a
 
      08-23-2006

Paul wrote:

> It appears my website is under attack from these search engines.


Evil Google! No doughnut!

Web or newsgroup search for "robots.txt"

Apart from that, post a URL to your site if you want better advice.
We're not psychic.

 
 
 
 
 
Paul
Guest
Posts: n/a
 
      08-23-2006
The website is www.des-otoole.co.uk
Also, can I add that I do not have any metadata describing the site.
Can someone nominate me to a search engine? They should not have found
me in the first place.

Andy Dingley wrote:
> Paul wrote:
>
> > It appears my website is under attack from these search engines.

>
> Evil Google! No doughnut!
>
> Web or newsgroup search for "robots.txt"
>
> Apart from that, post a URL to your site if you want better advice.
> We're not psychic.


 
 
David Dorward
Guest
Posts: n/a
 
      08-23-2006
Paul wrote:
> Can someone nominate me to a search engine? They should not have found
> me in the first place


Someone could have linked to your site from a site that the search
engines know about.

Please don't top post.

 
 
TreatmentPlant
Guest
Posts: n/a
 
      08-23-2006
David Dorward wrote:
> Paul wrote:
>> Can someone nominate me to a search engine? They should not have found
>> me in the first place

>
> Someone could have linked to your site from a site that the search
> engines know about.
>
> Please don't top post.
>


http://www.google.com/support/webmas....py?topic=8843

http://www.google.com/support/webmas....py?topic=8459


might help?
 
 
Ken Sims
Guest
Posts: n/a
 
      08-23-2006
Hi Paul -

On 23 Aug 2006 03:34:44 -0700, "Paul" <(E-Mail Removed)> wrote:

>The website is www.des-otoole.co.uk


You need a robots.txt text file at the root of the site (e.g.
accessible as <www.des-otoole.co.uk/robots.txt>).

See http://www.robotstxt.org/wc/norobots.html

This robots.txt file tells all robots to not access any part of your
website:

User-agent: *
Disallow: /

Of course bad robots won't bother to even retrieve the file or will
retrieve it and ignore it, but that's another issue.

Google, Yahoo, MSN, etc. will retrieve and obey the robots.txt (though
you may still see some activity for a little while since they use
multiple servers for indexing and it may take a while for any given
server to retrieve an up-to-date copy of robots.txt).
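As a rough way to check what a well-behaved robot would conclude from the two-line file above, here is a minimal Python sketch using the standard library's robots.txt parser. The helper name is_allowed and the index.html page are assumptions made for illustration; this only shows how a compliant crawler interprets the rules and does nothing to stop robots that ignore them.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant robot with this user agent may fetch url.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # index.html is an assumed page name, used only as an example URL.
    page = "http://www.des-otoole.co.uk/index.html"
    for agent in ("Googlebot", "Yahoo! Slurp"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked")
```

With "Disallow: /" under "User-agent: *", both agents should be reported as blocked.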

--
Ken
http://www.kensims.net/
 
 
Nikita the Spider
Guest
Posts: n/a
 
      08-24-2006
In article <(E-Mail Removed) .com>,
"Paul" <(E-Mail Removed)> wrote:

> I am tearing my hair out. It appears my website is under attack from
> these search engines. I have heard that I can place code in my header
> somewhere to stop this. Any help?
>
> The browser information that I have collected shows the following:
>
> Mozilla/5.0 (compatible; Yahoo! Slurp;
> http://help.yahoo.com/help/us/ysearch/slurp)
>
> Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)



Desmond,
Ken has already given you good practical advice to which I have nothing
to add. But I'm wondering what you mean by saying your Web site is
"under attack". Yahoo! Slurp and Googlebot try to be reasonably polite
when spidering a site.

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
 
 
Mike Collins
Guest
Posts: n/a
 
      08-24-2006
On 23 Aug 2006 02:36:09 -0700, "Paul" <(E-Mail Removed)> wrote:

>I am tearing my hair out. It appears my website is under attack from
>these search engines. I have heard that I can place code in my header
>somewhere to stop this. Any help?


http://danielwebb.us/software/bot-trap/

You need a bot-trap. It catches bots that ignore robots.txt and writes
the IP to a blacklist. The one referenced above works with PHP/Apache.
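
To illustrate the general idea only (this is not the code of the PHP/Apache tool linked above), here is a minimal Python sketch of a bot trap. It assumes a hypothetical trap path, /bot-trap/, that is listed under Disallow in robots.txt and linked invisibly from a page, so only robots that ignore robots.txt ever request it. The class name BotTrap, the in-memory set, and the status codes are all assumptions for illustration.

```python
# Hypothetical trap path: disallowed in robots.txt, so polite robots never visit it.
TRAP_PATH = "/bot-trap/"


class BotTrap:
    """Toy bot trap: blacklist any IP that requests the trap path."""

    def __init__(self, trap_path: str = TRAP_PATH) -> None:
        self.trap_path = trap_path
        # A real setup would persist this, e.g. to a file the web server reads.
        self.blacklist: set[str] = set()

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status a server might send for this request."""
        if ip in self.blacklist:
            return 403  # already caught earlier
        if path.startswith(self.trap_path):
            self.blacklist.add(ip)  # robot ignored robots.txt: record its IP
            return 403
        return 200


if __name__ == "__main__":
    trap = BotTrap()
    # 203.0.113.7 is a documentation-only example address.
    print(trap.handle("203.0.113.7", "/index.html"))  # normal request
    print(trap.handle("203.0.113.7", "/bot-trap/"))   # falls into the trap
    print(trap.handle("203.0.113.7", "/index.html"))  # now refused
```

Note that Googlebot and Yahoo! Slurp obey robots.txt, so a trap like this would not catch them; it is aimed at robots that ignore the file.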

>
>The browser information that I have collected shows the following:
>
>Mozilla/5.0 (compatible; Yahoo! Slurp;
>http://help.yahoo.com/help/us/ysearch/slurp)
>
>Mozilla/5.0 (compatible; Googlebot/2.1;
>+http://www.google.com/bot.html)
>
>Please help.
>
>Desmond.

 
 
Mike Collins
Guest
Posts: n/a
 
      08-24-2006
On Thu, 24 Aug 2006 12:42:38 GMT, Mike Collins
<webspammer_@_yaho-o_.com> wrote:

>On 23 Aug 2006 02:36:09 -0700, "Paul" <(E-Mail Removed)> wrote:
>
>>I am tearing my hair out. It appears my website is under attack from
>>these search engines. I have heard that I can place code in my header
>>somewhere to stop this. Any help?

>
>http://danielwebb.us/software/bot-trap/
>
>You need a bot-trap. It catches bots that ignore robots.txt and writes
>the IP to a blacklist. The one referenced above works with PHP/Apache.


http://www.homelandstupidity.us/software/bad-behavior/

bad-behavior will control aggressive scraping bots
 
 
rf
Guest
Posts: n/a
 
      08-24-2006
Mike Collins wrote:

>>You need a bot-trap. It catches bots that ignore robots.txt and writes
>>the IP to a blacklist. The one referenced above works with PHP/Apache.

>
> http://www.homelandstupidity.us/software/bad-behavior/


Hmmm.

"Help contribute directly to Bad Behaviour Development"
followed by a list of monetary amounts in $US, pounds sterling and Euros.

I guess this site does not want my Australian dollars. Fine with me.

(short sighted bastards)

--
Cheers
Richard.


 
 
 
 