"strange subdirectories"

 
 
Luigi Donatello Asero
      03-06-2005
It seems that Google's robots visit subdirectories which I do not see in my
www directory, for example:

crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla

That is, subdirectories with several "/" before my "normal" subdirectories
(for example "it").
There are very many of these "strange subdirectories"!
Is this normal?
What can I do?



--
Luigi ( un italiano che vive in Svezia)

https://www.scaiecat-spa-gigi.com/de...-rom-trevi.php






 
Roy Schestowitz
      03-06-2005
Luigi Donatello Asero wrote:

> It seems as Google robots visit subdirectories which I do not see in my
> directory www
> for example
>
> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
>
> That means subdirectories with several "/" before my "normal
> subdirectories (for example "it").
> They are very many "strange subdirectories!
> Is it normal?
> What can I do?


I am not sure about Google, but Yahoo makes very many mistakes. One such
mistake involves mixing structures and files from other sites. If you see
strange filenames, that will be the explanation. Another one that I suffer
from all the time is when Yahoo fails to traverse directories without an
index, that is, directories which invoke the default Apache file
listing. Yahoo descends to a lower level, which is incorrect, and this
triggers many distracting errors.

Google might be making similar mistakes. I noticed that it continuously
fails to deal with frames that come from different domains. Sometimes it
looks for .tex files when a .pdf is found. All in all, I do not totally
trust it.

Roy

--
Roy Schestowitz
http://schestowitz.com
 
Toby Inkster
      03-06-2005
Luigi Donatello Asero wrote:

> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> That means subdirectories with several "/" before my "normal subdirectories
> (for example "it").


Most servers will (by default[1]) consider the following URLs to be
equivalent:

///foo//bar
//foo//bar
/foo//bar
/foo/bar
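
To make that equivalence concrete, here is a minimal Python sketch of the kind of slash-collapsing a server might apply before mapping a request path to a file. The function name `collapse_slashes` is hypothetical, and real servers differ in when and how they normalise paths, so treat this as an illustration of the idea rather than a description of what any particular server does.

```python
import re

def collapse_slashes(path: str) -> str:
    """Replace every run of two or more '/' characters with a single '/'."""
    return re.sub(r"/{2,}", "/", path)

# The four example paths listed above all reduce to the same string.
examples = ["///foo//bar", "//foo//bar", "/foo//bar", "/foo/bar"]
normalised = {collapse_slashes(p) for p in examples}
print(normalised)  # prints {'/foo/bar'}
```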

However, most clients won't consider them the same. (Nor should they![1])

It may well be that some stupid robots, when they are at the main index
page ("/"), see a link like "/it/lamediazionemerci2.html" and wrongly
invent a URL like "//it/lamediazionemerci2.html".

Hence the weird requests.
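
To check how often such requests actually occur, a short script can scan the access log. The sketch below is Python and assumes the log uses the same layout as the quoted Googlebot line (Apache's common/combined format); the helper name `requests_with_repeated_slashes` and the second sample line are made up for illustration.

```python
import re

# Matches the request part of a log line: "METHOD PATH PROTOCOL"
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')

def requests_with_repeated_slashes(lines):
    """Yield the requested path of every log line whose path contains '//'."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and "//" in match.group(1):
            yield match.group(1)

sample = [
    # Log line quoted in the thread (user agent truncated as in the original).
    'crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] '
    '"GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla',
    # Hypothetical "normal" request, added only for contrast.
    'host.example - - [05/Mar/2005:01:04:00 +0100] '
    '"GET /it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla',
]
print(list(requests_with_repeated_slashes(sample)))
```

To run it over a real log, pass an open file object instead of `sample`; the file's name and location depend on the hosting setup.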

____
[1] An easy way to make the URLs "/" and "//" act differently would be:

1. In an .htaccess in your document root, turn on MultiViews;
2. Then create a file "somedir.php" in the document root:

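<?php
// PATH_INFO holds whatever follows the script name in the URL (empty if absent).
$p = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
// Print 'Foo' when the extra path contains a doubled slash, 'Bar' otherwise.
echo (strpos($p, '//') !== false) ? 'Foo' : 'Bar';
?>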

3. Now visit:

http://www.yourdomain.com/somedir/hello/world/
and
http://www.yourdomain.com/somedir/hello//world/

and note the difference.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

 
Luigi Donatello Asero
      03-06-2005

"Toby Inkster" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Luigi Donatello Asero wrote:
>
> > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> > That means subdirectories with several "/" before my "normal subdirectories
> > (for example "it").
>
> Most servers will (by default[1]) consider the following URLs to be
> equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar
>




What does "foo" stand for?


> However, most clients won't consider them the same. (Nor should they![1])
>
> It may well be that some stupid robots, when they are at the main index
> page ("/"), see a link like "/it/lamediazionemerci2.html" and wrongly
> invent a URL like "//it/lamediazionemerci2.html".
>
> Hence the weird requests.
>
> ____
> [1] An easy way to make the URLs "/" and "//" act differently would be:
>
> 1. In an .htaccess in your document root, turn on MultiViews;



How do I turn on MultiViews?
And what is MultiViews, anyway?


> 2. Then create a file "somedir.php" in the document root:
>
> <?php
> $p = $_SERVER['PATH_INFO'];
> echo strstr($p,'//')?'Foo':'Bar';
> ?>



So, where you write "Foo", should I write for example
"it" or
https://www.scaiecat-spa-gigi.com/it/ ?

And should "bar" be "lamediazionemerci2.html"
in the above-mentioned example?

--
Luigi ( un italiano che vive in Svezia)
https://www.scaiecat-spa-gigi.com/de...en-italien.php











 
Steve Pugh
      03-06-2005
"Luigi Donatello Asero" <(E-Mail Removed)> wrote:

>What does "foo" stand for?


http://en.wikipedia.org/wiki/Foo

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <(E-Mail Removed)> <http://steve.pugh.net/>
 
Toby Inkster
      03-06-2005
Luigi Donatello Asero wrote:

> What does "foo" stand for?


Paradoxically, "foo" stands for "****ed up", but that's not important.

"foo" and "bar" are simply example words that can be inserted into any
example when you can't think of a better word to use as an example.

>> [1] An easy way to make the URLs "/" and "//" act differently would be:
>> 1. In an .htaccess in your document root, turn on MultiViews;
>
> How do I turn on MultiViews?
> And what is MultiViews, anyway?


Google for it.

Note: I'm not saying that you *should* do any of those steps -- I am
merely pointing out that it is possible to make /foo//bar be interpreted
differently from /foo/bar. In general, this is probably a bad idea, as
it's counter-intuitive.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Now Playing ~ ./brendan_benson/lapalco/03_folk_singer.ogg

 
Luigi Donatello Asero
      03-06-2005

"Steve Pugh" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "Luigi Donatello Asero" <(E-Mail Removed)> wrote:
>
> >What does "foo" stand for?

>
> http://en.wikipedia.org/wiki/Foo
>
> Steve



Thank you.


--
Luigi ( un italiano che vive in Svezia)
https://www.scaiecat-spa-gigi.com/sv/rom1.html


 
Ken
      03-06-2005
Hi Toby -

On Sun, 06 Mar 2005 11:28:10 +0000, Toby Inkster
<(E-Mail Removed)> wrote:

>Most servers will (by default[1]) consider the following URLs to be
>equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar


I have my server configured to treat // anyplace in the URL as a
security violation.
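
How such a rule is enforced depends on the server, and no configuration is given here. As a sketch of the idea only, in Python: a hypothetical check, `is_suspicious_path`, that an application could run on the request path before serving anything. The 403 status is an assumption, not a description of any actual setup.

```python
def is_suspicious_path(path: str) -> bool:
    """Return True when the request path contains two consecutive slashes."""
    return "//" in path

def status_for(path: str) -> int:
    """Hypothetical policy: refuse any path with a doubled slash, else allow."""
    return 403 if is_suspicious_path(path) else 200

for p in ["/foo/bar", "/foo//bar", "///it/lamediazionemerci2.html"]:
    print(p, status_for(p))
```

Note that a policy like this would also turn away the Googlebot request quoted at the start of the thread.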

--
Ken
http://www.ke9nr.net/
 
R Powell
      03-07-2005
On Sun, 06 Mar 2005 15:01:29 +0000, Toby Inkster scribbled:
> Luigi Donatello Asero wrote:
>
>> What does "foo" stand for?

>
> Paradoxically, "foo" stands for "****ed up"


The Jargon File/New Hacker's Dictionary (the ultimate resource for all
things geeky and unixy) seems to disagree on this:
http://www.catb.org/~esr/jargon/html/F/foo.html
although it too seems rather uncertain what the exact origin is.

 