HTML - "strange subdirectories"

 
Old 03-06-2005, 12:49 AM   #1
Default "strange subdirectories"


It seems that Google's robots visit subdirectories which I do not see in my
www directory,
for example:

crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla

That means subdirectories with several "/" before my "normal" subdirectories
(for example "it").
There are very many of these "strange subdirectories"!
Is this normal?
What can I do?
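
One way to see how many such requests there are is to scan the access log for request paths that contain consecutive slashes. The following is only a minimal sketch: the function name `paths_with_repeated_slashes` and the second sample line (a made-up "normal" request from `host.example`) are assumptions for illustration, and the regular expression looks only at the quoted request part of each line.

```python
import re

# Matches the request part of an access-log line, e.g. "GET /path HTTP/1.1",
# and captures the requested path.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')


def paths_with_repeated_slashes(log_lines):
    """Return the requested paths that contain two or more consecutive slashes."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and "//" in match.group(1):
            hits.append(match.group(1))
    return hits


sample = [
    # The log line from the post (its user-agent field is cut off there too).
    'crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] '
    '"GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla',
    # Hypothetical ordinary request, added only for contrast.
    'host.example - - [05/Mar/2005:01:04:00 +0100] '
    '"GET /it/index.html HTTP/1.1" 200 1234 "-" "Mozilla',
]

print(paths_with_repeated_slashes(sample))
```

Only the first sample line is reported, because its path starts with "///".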



--
Luigi (an Italian living in Sweden)

https://www.scaiecat-spa-gigi.com/de...-rom-trevi.php








Luigi Donatello Asero
Old 03-06-2005, 02:24 AM   #2
Roy Schestowitz
 
Posts: n/a
Default Re: "strange subdirectories"

Luigi Donatello Asero wrote:

> It seems as Google robots visit subdirectories which I do not see in my
> directory www
> for example
>
> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
>
> That means subdirectories with several "/" before my "normal
> subdirectories (for example "it").
> They are very many "strange subdirectories!
> Is it normal?
> What can I do?


I am not sure about Google, but Yahoo makes a great many mistakes. One such
mistake involves mixing up structures and files from other sites. If you see
strange filenames, that will be the explanation. Another one that I suffer
from all the time is when Yahoo fails to traverse directories without an
index, that is, directories which invoke the default Apache file
listing. Yahoo descends to a lower level, which is incorrect, and this
triggers many distracting errors.

Google might be making similar mistakes. I have noticed that it continuously
fails to deal with frames that come from different domains. Sometimes it
looks for .tex files when a .pdf is found. All in all, I do not totally
trust it.

Roy

--
Roy Schestowitz
http://schestowitz.com
Old 03-06-2005, 11:28 AM   #3
Toby Inkster
 
Posts: n/a
Default Re: "strange subdirectories"

Luigi Donatello Asero wrote:

> crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> That means subdirectories with several "/" before my "normal subdirectories
> (for example "it").


Most servers will (by default[1]) consider the following URLs to be
equivalent:

///foo//bar
//foo//bar
/foo//bar
/foo/bar

However, most clients won't consider them the same. (Nor should they![1])
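
To illustrate what "equivalent" means here, this is a rough sketch of the kind of normalisation a server effectively applies when it maps such a path to a file: every run of slashes is collapsed into one. The function name `collapse_slashes` is an assumption for illustration; it is not how any particular server implements it.

```python
import re


def collapse_slashes(path):
    """Collapse every run of consecutive slashes in a URL path into one slash."""
    return re.sub(r"/{2,}", "/", path)


variants = ["///foo//bar", "//foo//bar", "/foo//bar", "/foo/bar"]

# All four variants end up as the same path.
print({collapse_slashes(p) for p in variants})
```

A client, by contrast, compares the URLs as written, so to it these are four different addresses.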

It may well be that some stupid robots, when they are at the main index
page ("/") see a link like "/it/lamediazionemerci2.html" and wrongly
invent a URL like "//it/lamediazionemerci2.html".

Hence the weird requests.
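
As a guess at how a robot could end up there, compare proper resolution of a root-relative link with naive string concatenation. This is a sketch only: `www.example.com` is a placeholder host, and nothing here is known about how any real robot is written.

```python
from urllib.parse import urljoin

base = "http://www.example.com/"       # placeholder address of the index page
href = "/it/lamediazionemerci2.html"   # root-relative link found on that page

# Correct: resolve the link against the base URL.
correct = urljoin(base, href)

# Buggy: glue the two strings together, which doubles the slash.
naive = base + href

print(correct)
print(naive)
```

The first line printed has a single slash before "it", the second has two.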

____
[1] An easy way to make the URLs "/" and "//" act differently would be:

1. In an .htaccess in your document root, turn on MultiViews (Options +MultiViews);
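2. Then create a file "somedir.php" in the document root:

<?php
// PATH_INFO holds whatever follows "/somedir" in the requested URL.
$p = $_SERVER['PATH_INFO'] ?? '';
// Print "Foo" if that part contains a doubled slash, "Bar" otherwise.
echo strpos($p, '//') !== false ? 'Foo' : 'Bar';
?>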

3. Now visit:

http://www.yourdomain.com/somedir/hello/world/
and
http://www.yourdomain.com/somedir/hello//world/

and note the difference.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

Old 03-06-2005, 01:56 PM   #4
Luigi Donatello Asero
 
Posts: n/a
Default Re: "strange subdirectories"


"Toby Inkster" <> wrote in message
news .uk...
> Luigi Donatello Asero wrote:
>
> > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
> > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
> > That means subdirectories with several "/" before my "normal
> > subdirectories (for example "it").

>
> Most servers will (by default[1]) consider the following URLs to be
> equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar
>




What does "foo" stand for?


> However, most clients won't consider them the same. (Nor should they![1])
>
> It may well be that some stupid robots, when they are at the main index
> page ("/") see a link like "/it/lamedeiazionemerci2.html" and wrongly
> invent a URL like "//it/lamedeiazionemerci2.html".
>
> Hence the weird requests.
>
> ____
> [1] An easy way to make the URLs "/" and "//" act differently would be:
>
> 1. In an .htaccess in your document root, turn on MultiViews;



How do I turn on MultiViews?
And what is MultiViews, anyway?


> 2. Then create a file "somedir.php" in the document root:
>
> <?php
> $p = $_SERVER['PATH_INFO'];
> echo strstr($p,'//')?'Foo':'Bar';
> ?>



So, where you write "Foo", should I write for example
"it" or
https://www.scaiecat-spa-gigi.com/it/ ?

And should "bar" be "lamediazionemerci2.html"
in the above-mentioned example?

--
Luigi (an Italian living in Sweden)
https://www.scaiecat-spa-gigi.com/de...en-italien.php











Old 03-06-2005, 02:19 PM   #5
Steve Pugh
 
Posts: n/a
Default Re: "strange subdirectories"

"Luigi Donatello Asero" <> wrote:

>What does "foo" stand for?


http://en.wikipedia.org/wiki/Foo

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <> <http://steve.pugh.net/>
Old 03-06-2005, 03:01 PM   #6
Toby Inkster
 
Posts: n/a
Default Re: "strange subdirectories"

Luigi Donatello Asero wrote:

> What does "foo" stand for?


Paradoxically, "foo" stands for "****ed up", but that's not important.

"foo" and "bar" are simply example words that can be inserted into any
example when you can't think of a better word to use as an example.

>> [1] An easy way to make the URLs "/" and "//" act differently would be:
>> 1. In an .htaccess in your document root, turn on MultiViews;
>
> How do I turn on MultiViews?
> And what is MultiViews, anyway?


Google for it.

Note: I'm not saying that you *should* do any of those steps. I am
merely pointing out that it is possible to make /foo//bar be interpreted
differently from /foo/bar. In general, this is probably a bad idea, as
it's counter-intuitive.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Now Playing ~ ./brendan_benson/lapalco/03_folk_singer.ogg

Old 03-06-2005, 04:33 PM   #7
Luigi Donatello Asero
 
Posts: n/a
Default Re: "strange subdirectories"


"Steve Pugh" <> wrote in message
news:...
> "Luigi Donatello Asero" <> wrote:
>
> >What does "foo" stand for?

>
> http://en.wikipedia.org/wiki/Foo
>
> Steve



Thank you.


--
Luigi (an Italian living in Sweden)
https://www.scaiecat-spa-gigi.com/sv/rom1.html


Old 03-06-2005, 04:36 PM   #8
Ken
 
Posts: n/a
Default Re: "strange subdirectories"

Hi Toby -

On Sun, 06 Mar 2005 11:28:10 +0000, Toby Inkster
<> wrote:

>Most servers will (by default[1]) consider the following URLs to be
>equivalent:
>
> ///foo//bar
> //foo//bar
> /foo//bar
> /foo/bar


I have my server configured to treat // anyplace in the URL as a
security violation.
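
The idea of such a rule can be sketched as follows. This is an illustration in Python, not a server configuration: the function name `status_for_path` and the choice of 403 as the rejection status are assumptions.

```python
def status_for_path(path):
    """Return an HTTP status for a request path: 403 if it contains '//', else 200."""
    # Any doubled slash, wherever it appears in the path, is treated as a violation.
    if "//" in path:
        return 403
    return 200


print(status_for_path("/foo/bar"))
print(status_for_path("///foo//bar"))
```

A stricter or gentler policy (for example redirecting to the single-slash URL instead of refusing) would change only the branch that handles the match.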

--
Ken
http://www.ke9nr.net/
Old 03-07-2005, 03:29 AM   #9
R Powell
 
Posts: n/a
Default Re: "strange subdirectories"

On Sun, 06 Mar 2005 15:01:29 +0000, Toby Inkster scribbled:
> Luigi Donatello Asero wrote:
>
>> What does "foo" stand for?

>
> Paradoxically, "foo" stands for "****ed up"


The Jargon File/New Hacker's Dictionary (the ultimate resource for all
things geeky and unixy) seems to disagree on this:
http://www.catb.org/~esr/jargon/html/F/foo.html
although it too seems rather uncertain what the exact origin is.
