Missing robots.txt file

 
 
Joe Blow
 
      08-28-2004
I am experiencing difficulties with a supposedly missing "robots.txt"
file. I receive on average 5-6 notifications per day letting me know
that the file was requested but was missing, when in fact it is there.

Our server logs indicate that the file is also being accessed
successfully, but I am wondering why I am still receiving notifications.

This morning so far:
* NutchCVS/0.06-dev (Nutch; http://www.nutch.org/docs/en/bot.html;
(E-Mail Removed))
* msnbot/0.11 (+http://search.msn.com/msnbot.htm)

Can someone shed some light on what is happening?

The web site is www.wiavic.org.au

Thanks,
 
Nik Coughin
 
      08-29-2004
Joe Blow wrote:
> I am experiencing difficulties with a supposedly missing "robots.txt"
> file. I receive on average 5-6 notifications per day letting me know
> that the file was requested but was missing, when in fact it is there.
>
> Our server logs indicate that the file is also being accessed
> successfully, but I am wondering why I am still receiving
> notifications.


Hate to answer a question with a question (or not answer it as the case may
be) but are spiders only supposed to look for robots.txt in the base
directory, or do they look for it at the entry point from which they start
crawling? Or do spiders check for a copy of robots.txt in every directory
that they crawl?


 
Joe Blow
 
      08-29-2004
The robots.txt file resides in the root directory of your server. The
file instructs the crawler which directories are accessible and which
are not.
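
As a rough illustration of "root directory": a crawler that starts from any page can work out where to look for the file by keeping only the scheme and host of that page's URL. The sketch below assumes Python's standard urllib.parse module; the function name robots_url and the page path are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page path here is hypothetical; only the host comes from the thread.
print(robots_url("http://www.wiavic.org.au/some/dir/page.html"))
# prints: http://www.wiavic.org.au/robots.txt
```

Whatever directory the page lives in, the result points at the top level of the host.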


Nik Coughin wrote:
>
> Joe Blow wrote:
> > I am experiencing difficulties with a supposedly missing "robots.txt"
> > file. I receive on average 5-6 notifications per day letting me know
> > that the file was requested but was missing, when in fact it is there.
> >
> > Our server logs indicate that the file is also being accessed
> > successfully, but I am wondering why I am still receiving
> > notifications.

>
> Hate to answer a question with a question (or not answer it as the case may
> be) but are spiders only supposed to look for robots.txt in the base
> directory, or do they look for it at the entry point from which they start
> crawling? Or do spiders check for a copy of robots.txt in every directory
> that they crawl?

 
Big Bill
 
      08-30-2004
On Mon, 30 Aug 2004 09:47:35 +1200, "Nik Coughin"
<nrkn!no-spam!@woosh.co.nz> wrote:

>Joe Blow wrote:
>> I am experiencing difficulties with a supposedly missing "robots.txt"
>> file. I receive on average 5-6 notifications per day letting me know
>> that the file was requested but was missing, when in fact it is there.
>>
>> Our server logs indicate that the file is also being accessed
>> successfully, but I am wondering why I am still receiving
>> notifications.

>
>Hate to answer a question with a question (or not answer it as the case may
>be) but are spiders only supposed to look for robots.txt in the base
>directory, or do they look for it at the entry point from which they start
>crawling? Or do spiders check for a copy of robots.txt in every directory
>that they crawl?


It should be in the root dir.

BB

 
Big Bill
 
      08-30-2004
On Sun, 29 Aug 2004 22:53:19 GMT, Joe Blow <(E-Mail Removed)> wrote:

>The robots.txt file resides in the root directory of your server. The
>file instructs the crawler which directories are accessible and which
>are not.


Being picky, it just says which are not. All files are deemed
accessible by default unless there's a statement against them in the
robots.txt.
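
A small sketch of that default, assuming Python's standard urllib.robotparser module. The Disallow rule and both page paths are invented for the example; they are not taken from the site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one directory is excluded, nothing else is mentioned.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Not mentioned anywhere in the rules, so it counts as accessible.
print(parser.can_fetch("msnbot", "http://www.wiavic.org.au/index.html"))
# prints: True

# Falls under the Disallow line, so it is off limits.
print(parser.can_fetch("msnbot", "http://www.wiavic.org.au/private/notes.html"))
# prints: False
```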

BB


>Nik Coughin wrote:
>>
>> Joe Blow wrote:
>> > I am experiencing difficulties with a supposedly missing "robots.txt"
>> > file. I receive on average 5-6 notifications per day letting me know
>> > that the file was requested but was missing, when in fact it is there.
>> >
>> > Our server logs indicate that the file is also being accessed
>> > successfully, but I am wondering why I am still receiving
>> > notifications.

>>
>> Hate to answer a question with a question (or not answer it as the case may
>> be) but are spiders only supposed to look for robots.txt in the base
>> directory, or do they look for it at the entry point from which they start
>> crawling? Or do spiders check for a copy of robots.txt in every directory
>> that they crawl?


 
data64
 
      08-30-2004
Joe Blow <(E-Mail Removed)> wrote in news:(E-Mail Removed):

> I am experiencing difficulties with a supposedly missing "robots.txt"
> file. I receive on average 5-6 notifications per day letting me know
> that the file was requested but was missing, when in fact it is there.
>
> Our server logs indicate that the file is also being accessed
> successfully, but I am wondering why I am still receiving notifications.
>
> This morning so far:
> * NutchCVS/0.06-dev (Nutch; http://www.nutch.org/docs/en/bot.html;
> (E-Mail Removed))
> * msnbot/0.11 (+http://search.msn.com/msnbot.htm)
>


Have you looked up the corresponding request in the access_log?
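
One way to do that lookup, as a rough sketch: pull every robots.txt request and its status code out of the log so the successful hits can be compared with the ones that trigger a notification. This assumes the server writes Common or Combined Log Format; the two sample lines, the IP addresses, "ExampleBot" and the function name robots_requests are all made up for illustration and are not from the real log.

```python
import re

# Matches the quoted request and the status code of a Common/Combined
# Log Format line, e.g. "GET /robots.txt HTTP/1.0" 200
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def robots_requests(lines):
    """Yield (path, status) for each request whose path mentions robots.txt."""
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "robots.txt" in match.group("path").lower():
            yield match.group("path"), int(match.group("status"))


# Invented sample lines; with a real log, pass an open file object instead.
sample = [
    '192.0.2.1 - - [30/Aug/2004:09:00:00 +1000] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.2 - - [30/Aug/2004:09:05:00 +1000] "GET /docs/robots.txt HTTP/1.0" '
    '404 210 "-" "ExampleBot/1.0"',
]

for path, status in robots_requests(sample):
    print(status, path)
```

Matching the timestamps of any non-200 entries against the times of the notifications should show which requests are actually behind them.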

data64
 