#1

It seems that Google robots visit subdirectories which I do not see in my
www directory, for example:

crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla

That means subdirectories with several "/" before my normal subdirectories (for example "it").
There are very many of these strange subdirectories!
Is this normal?
What can I do?

--
Luigi (an Italian living in Sweden)
https://www.scaiecat-spa-gigi.com/de...-rom-trevi.php
Luigi Donatello Asero
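
One way to see how many of these requests there are is to scan the access log for request paths that contain consecutive slashes. The following is only a minimal sketch: it assumes the log uses the common Apache log layout shown above, the function name is made up for illustration, and the second sample line is hypothetical.

```python
import re

# Matches the quoted request part of a log line: "METHOD PATH HTTP/x.y"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def paths_with_repeated_slashes(log_lines):
    """Return the request paths that contain two or more consecutive slashes."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and "//" in match.group("path"):
            hits.append(match.group("path"))
    return hits


sample = [
    # Request taken from the log excerpt in the post (user-agent field omitted).
    'crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] '
    '"GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396',
    # Hypothetical "normal" request, added only for contrast.
    'example.invalid - - [05/Mar/2005:01:04:00 +0100] '
    '"GET /it/lamediazionemerci2.html HTTP/1.1" 200 4396',
]

print(paths_with_repeated_slashes(sample))
```

Only the first sample line is reported, because only its path contains "//".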

#2

Luigi Donatello Asero wrote:

> It seems that Google robots visit subdirectories which I do not see in my
> www directory, for example:
>
> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
>
> That means subdirectories with several "/" before my normal
> subdirectories (for example "it").
> There are very many of these strange subdirectories!
> Is this normal?
> What can I do?

I am not sure about Google, but Yahoo makes a great many mistakes. One such mistake involves mixing up structures and files from other sites. If you see strange filenames, that will be the explanation.

Another one that I suffer from all the time is when Yahoo fails to traverse directories without an index, that is, directories which invoke the default Apache file listing. Yahoo descends to a lower level, which is incorrect, and this triggers many distracting errors.

Google might be making similar mistakes. I have noticed that it continually fails to deal with frames that come from different domains. Sometimes it looks for .tex files when a .pdf is found. All in all, I do not totally trust it.

Roy

--
Roy Schestowitz
http://schestowitz.com

#3

Luigi Donatello Asero wrote:

> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> That means subdirectories with several "/" before my normal subdirectories
> (for example "it").

Most servers will (by default[1]) consider the following URLs to be
equivalent:

  ///foo//bar
  //foo//bar
  /foo//bar
  /foo/bar

However, most clients won't consider them the same. (Nor should they![1])

It may well be that some stupid robots, when they are at the main index
page ("/"), see a link like "/it/lamediazionemerci2.html" and wrongly
invent a URL like "//it/lamediazionemerci2.html".

Hence the weird requests.

____
[1] An easy way to make the URLs "/" and "//" act differently would be:

1. In an .htaccess in your document root, turn on MultiViews;

2. Then create a file "somedir.php" in the document root:

  <?php
  $p = $_SERVER['PATH_INFO'];
  echo strstr($p,'//')?'Foo':'Bar';
  ?>

3. Now visit:
  http://www.yourdomain.com/somedir/hello/world/
and
  http://www.yourdomain.com/somedir/hello//world/
and note the difference.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
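
The server-side equivalence described in this post can be pictured as collapsing every run of slashes into a single one before the path is looked up. The following is only a rough sketch of that idea, not how any particular server implements it; the function name is invented for illustration.

```python
import re


def collapse_slashes(path):
    """Replace every run of consecutive slashes with a single slash."""
    return re.sub(r"/{2,}", "/", path)


# The four example URLs from the post.
variants = ["///foo//bar", "//foo//bar", "/foo//bar", "/foo/bar"]

# Collecting the results in a set shows that they all collapse to one path.
normalized = {collapse_slashes(p) for p in variants}
print(normalized)
```

A client has no way of knowing that a given server does this, which is one reason why it should not treat the four forms as the same URL.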

#4

"Toby Inkster" <> wrote in message news
> Luigi Donatello Asero wrote:
>
> > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> > That means subdirectories with several "/" before my normal subdirectories
> > (for example "it").
>
> Most servers will (by default[1]) consider the following URLs to be
> equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar
>

What does "foo" stand for?

> However, most clients won't consider them the same. (Nor should they![1])
>
> It may well be that some stupid robots, when they are at the main index
> page ("/"), see a link like "/it/lamediazionemerci2.html" and wrongly
> invent a URL like "//it/lamediazionemerci2.html".
>
> Hence the weird requests.
>
> ____
> [1] An easy way to make the URLs "/" and "//" act differently would be:
>
> 1. In an .htaccess in your document root, turn on MultiViews;

How do I turn on MultiViews?
And what is MultiViews, anyway?

> 2. Then create a file "somedir.php" in the document root:
>
> <?php
> $p = $_SERVER['PATH_INFO'];
> echo strstr($p,'//')?'Foo':'Bar';
> ?>

So, where you write "Foo", should I write for example "it" or
https://www.scaiecat-spa-gigi.com/it/ ?
And should "bar" be "lamediazionemerci2.html" in the above-mentioned example?

--
Luigi (an Italian living in Sweden)
https://www.scaiecat-spa-gigi.com/de...en-italien.php

#5

"Luigi Donatello Asero" <> wrote:
>What does "foo" stand for?

http://en.wikipedia.org/wiki/Foo

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <> <http://steve.pugh.net/>

#6

Luigi Donatello Asero wrote:

> What does "foo" stand for?

Paradoxically, "foo" stands for "****ed up", but that's not important.
"foo" and "bar" are simply example words that can be inserted into any
example when you can't think of a better word to use.

>> [1] An easy way to make the URLs "/" and "//" act differently would be:
>> 1. In an .htaccess in your document root, turn on MultiViews;
>
> How do I turn on MultiViews?
> And what is MultiViews, anyway?

Google for it.

Note: I'm not saying that you *should* do any of those steps. I am
merely pointing out that it is possible to make /foo//bar be interpreted
differently from /foo/bar. In general, this is probably a bad idea, as
it's counter-intuitive.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Now Playing ~ ./brendan_benson/lapalco/03_folk_singer.ogg

#7

"Steve Pugh" <> wrote in message news:...
> "Luigi Donatello Asero" <> wrote:
>
> >What does "foo" stand for?
>
> http://en.wikipedia.org/wiki/Foo
>
> Steve

Thank you.

--
Luigi (an Italian living in Sweden)
https://www.scaiecat-spa-gigi.com/sv/rom1.html

#8

Hi Toby -

On Sun, 06 Mar 2005 11:28:10 +0000, Toby Inkster <> wrote:

>Most servers will (by default[1]) consider the following URLs to be
>equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar

I have my server configured to treat // anywhere in the URL as a
security violation.

--
Ken
http://www.ke9nr.net/
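
The post does not show Ken's actual configuration, so the following is only a hypothetical sketch of that kind of rule: a check that refuses any request path containing "//". The function name and the choice of a 403 status are assumptions made for illustration.

```python
def check_path(path):
    """Return an HTTP-style status: 403 if the path contains "//", else 200.

    Both the function and the 403 response are illustrative assumptions,
    not taken from any real server configuration.
    """
    if "//" in path:
        return 403  # refuse outright instead of silently collapsing slashes
    return 200


for candidate in ["/foo/bar", "/foo//bar", "///it/lamediazionemerci2.html"]:
    print(candidate, check_path(candidate))
```

Compared with collapsing the slashes, a rule like this means a robot that invents "//" URLs gets an error instead of a duplicate copy of the page.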

#9

On Sun, 06 Mar 2005 15:01:29 +0000, Toby Inkster scribbled:

> Luigi Donatello Asero wrote:
>
>> What does "foo" stand for?
>
> Paradoxically, "foo" stands for "****ed up"

The Jargon File/New Hacker's Dictionary (the ultimate resource for all
things geeky and unixy) seems to disagree on this:
http://www.catb.org/~esr/jargon/html/F/foo.html
although it too seems rather uncertain what the exact origin is.