Velocity Reviews - Computer Hardware Reviews


Search Engine Tag?

 
 
Ken
Guest
Posts: n/a
 
      11-22-2005
Hi Scott -

On Tue, 22 Nov 2005 13:24:25 -0600, Scott <(E-Mail Removed)> wrote:

>Steve Pugh wrote:
>>
>> Scott wrote:
>> > > Luigi Donatello Asero wrote:
>> > > > "Scott" <(E-Mail Removed)> wrote in message
>> > > > news:(E-Mail Removed)...
>> > > > >
>> > > > > Is there a tag that I can put on a page that will prevent search
>> > > > > engines from indexing the page?
>> > > >
>> > > > As far as I know you could insert the address of the page into a file
>> > > > called robots.txt and indicate which search engine you do not want to
>> > > > index it.
>> >
>> > OK, I figured out what to write in robots.txt. What I'm wondering is exactly
>> > where to place that file on the host server.

>>
>> At the root of your site.
>>
>> If a spider wants to visit http://www.example.com/foo/bar/page.html
>> then it will look for http://www.example.com/foo/bar/robots.txt,
>> http://www.example.com/foo/robots.txt and
>> http://www.example.com/robots.txt and apply all the rules it finds.
>> From your point of view having a single robots.txt in your root folder
>> makes for easy maintenance.
>>
>> Steve

>
>Steve,
>
>So, you're saying I can just upload the robots.txt file to the same place I
>upload all my website files? In my case, my web account on the server is
>"public_html". And I should configure robots.txt to exclude the one
>particular url that I wish not to be indexed?


In the example that Steve gave, according to the standards the robot
would look ONLY for:
http://www.example.com/robots.txt

I don't recall that I have ever seen a robot look for robots.txt other
than in the host root; certainly not in the last several years.

See http://www.robotstxt.org/wc/exclusion.html. If you don't have
access to the host root, you can try using the "ROBOTS" META tag
within the individual page(s).
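
To make the "host root only" rule concrete, here is a minimal Python sketch of how a compliant robot could derive the one robots.txt URL it requests for any page. The function name robots_txt_url is an assumption made for illustration; it is not part of any standard or library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the single robots.txt URL that applies to page_url.

    Hypothetical helper: keeps the scheme and host, replaces the path
    with /robots.txt, and drops any query string or fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Steve's example page: only the host-root file is consulted.
print(robots_txt_url("http://www.example.com/foo/bar/page.html"))
```

However deep the page sits in the directory tree, the result is always the file at the host root, so a robots.txt placed in /foo/ or /foo/bar/ would simply never be requested.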

--
Ken
http://www.ke9nr.net/
 
 
 
 
 
Scott
Guest
Posts: n/a
 
      11-23-2005



> >
> >Steve,
> >
> >So, you're saying I can just upload the robots.txt file to the same place I
> >upload all my website files? In my case, my web account on the server is
> >"public_html". And I should configure robots.txt to exclude the one
> >particular url that I wish not to be indexed?

>
> In the example that Steve gave, according to the standards the robot
> would look ONLY for:
> http://www.example.com/robots.txt
>
> I don't recall that I have ever seen a robot look for robots.txt other
> than in the host root; certainly not in the last several years.
>
> See http://www.robotstxt.org/wc/exclusion.html. If you don't have
> access to the host root, you can try using the "ROBOTS" META tag
> within the individual page(s).
>
> --
> Ken
> http://www.ke9nr.net/


Ken,

Please pardon my density, but where exactly is the "host root"? Is this the
same place where I upload all my website files to my account on the host's
server?

Thanks!
Scott
 
 
 
 
 
Beauregard T. Shagnasty
Guest
Posts: n/a
 
      11-23-2005
Scott wrote:

> Please pardon my density, but where exactly is the "host root"? Is
> this the same place where I upload all my website files to my account
> on the host's server?


The "root" is your main directory, the place you (usually) have your
main index.html file.

--
-bts
-Warning: I brake for lawn deer
 
 
Scott
Guest
Posts: n/a
 
      11-23-2005


"Beauregard T. Shagnasty" wrote:
>
> Scott wrote:
>
> > Please pardon my density, but where exactly is the "host root"? Is
> > this the same place where I upload all my website files to my account
> > on the host's server?

>
> The "root" is your main directory, the place you (usually) have your
> main index.html file.
>
> --
> -bts
> -Warning: I brake for lawn deer


Bts:

Thanks!!!!

Scott
 
 
Ken
Guest
Posts: n/a
 
      11-23-2005
Hi Scott -

On Tue, 22 Nov 2005 18:02:26 -0600, Scott <(E-Mail Removed)> wrote:

>Please pardon my density, but where exactly is the "host root"? Is this the
>same place where I upload all my website files to my account on the host's
>server?


The host root is wherever the files reside that are served for
http://www.example.com/[file]

The actual location on the hard drive depends on the server software
and configuration.

For example, the host root for my main website
http://www.ke9nr.net/
is
/save/internet/www/sites/www.ke9nr.net

That's not at all standard. The directory layout is the way that it
is because of the way I have the partitions set up and how I want to
do things. I configured Apache to match my directory structure, not
the other way around. (I have my own domains and my own server so I
can do as I please.)
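
As a rough illustration of the mapping between a URL path and a file under the host root, here is a Python sketch. The directory comes from the example above; the helper name file_for_url_path is an assumption, and a real server such as Apache applies further rules (aliases, per-user directories, path sanitizing) that this deliberately ignores.

```python
from pathlib import PurePosixPath

# Host root from the example above; everything else here is illustrative.
DOCUMENT_ROOT = PurePosixPath("/save/internet/www/sites/www.ke9nr.net")


def file_for_url_path(url_path: str) -> PurePosixPath:
    """Map a URL path such as "/robots.txt" to a file under the host root.

    Simplified on purpose: no alias handling and no protection against
    ".." segments, both of which a real web server takes care of.
    """
    return DOCUMENT_ROOT / url_path.lstrip("/")


print(file_for_url_path("/robots.txt"))
```

So a request for /robots.txt on that host is answered from a file sitting directly in the host root directory, wherever the server configuration happens to put it.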

If you don't have your own domain it is unlikely that you will have
access to the host root. E.g. if your ISP were example.net and your
files are accessed at http://www.example.net/~user/, it's unlikely
that you are going to be able to upload a robots.txt file so that it
is accessible at http://www.example.net/robots.txt. Uploading a
robots.txt so that it is accessible at
http://www.example.net/~user/robots.txt isn't going to work.

--
Ken
http://www.ke9nr.net/
 
 
Scott
Guest
Posts: n/a
 
      11-29-2005


Ken wrote:
>
> Hi Scott -
>
> On Tue, 22 Nov 2005 18:02:26 -0600, Scott <(E-Mail Removed)> wrote:
>
> >Please pardon my density, but where exactly is the "host root"? Is this the
> >same place where I upload all my website files to my account on the host's
> >server?

>
> The host root is wherever the files reside that are served for
> http://www.example.com/[file]
>
> The actual location on the hard drive depends on the server software
> and configuration.
>
> For example, the host root for my main website
> http://www.ke9nr.net/
> is
> /save/internet/www/sites/www.ke9nr.net
>
> That's not at all standard. The directory layout is the way that it
> is because of the way I have the partitions set up and how I want to
> do things. I configured Apache to match my directory structure, not
> the other way around. (I have my own domains and my own server so I
> can do as I please.)
>
> If you don't have your own domain it is unlikely that you will have
> access to the host root. E.g. if your ISP were example.net and your
> files are accessed at http://www.example.net/~user/, it's unlikely
> that you are going to be able to upload a robots.txt file so that it
> is accessible at http://www.example.net/robots.txt. Uploading a
> robots.txt so that it is accessible at
> http://www.example.net/~user/robots.txt isn't going to work.
>
> --
> Ken
> http://www.ke9nr.net/


Ken,

Darn. My website is: www.uslink.net/~golden. It's not my own domain,
so it looks like the host root is out of my reach. The page that I don't
want to be indexed is www.uslink.net/~golden/order1.html.

I'm trying not to use a password. It's only this one page that I want to
prevent from being indexed. Everything else on the site is fair game.

What are the chances that <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
will do the job?

Thanks!
Scott
 
 
Ken Sims
Guest
Posts: n/a
 
      11-29-2005
Hi Scott -

On Tue, 29 Nov 2005 14:19:32 -0600, Scott <(E-Mail Removed)> wrote:

>Darn. My website is: www.uslink.net/~golden. It's not my own domain,
>so it looks like the host root is out of my reach. The page that I don't
>want to be indexed is www.uslink.net/~golden/order1.html.


Yes, robots.txt has to be at http://www.uslink.net/robots.txt

Your only option for robots.txt is to see if you can convince USLink
to add a robots.txt with your Disallow. If you click the above link,
you will see that they don't have a robots.txt.
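
For reference, here is a sketch of what such a Disallow entry could look like and how a robot would interpret it, using Python's standard urllib.robotparser. The robots.txt text and the "ExampleBot" user-agent name are assumptions made for illustration; USLink serves no such file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical content that would have to be served from the host root,
# i.e. at http://www.uslink.net/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /~golden/order1.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up robot name; the wildcard record applies to it.
order_page_ok = parser.can_fetch("ExampleBot", "http://www.uslink.net/~golden/order1.html")
index_page_ok = parser.can_fetch("ExampleBot", "http://www.uslink.net/~golden/index.html")
print(order_page_ok, index_page_ok)
```

A single Disallow line is enough to cover the one page while leaving the rest of the site open to robots. Note that robots.txt is only a request: well-behaved robots honor it, but nothing enforces it.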

>I'm trying not to use a password. It's only this one page that I want to
>prevent from being indexed. Everything else on the site is fair game.
>
>What are the chances that <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
>will do the job?


It's better than nothing, but that's all I can say.
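
To show what a robot has to do before the META tag can have any effect, here is a small Python sketch that reads the directives out of a page with the standard html.parser module. The class name RobotsMetaParser and the sample page are assumptions made for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


# Minimal stand-in for order1.html; only the META tag is from the thread.
PAGE = (
    "<html><head>"
    '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'
    "</head><body></body></html>"
)

parser = RobotsMetaParser()
parser.feed(PAGE)
print(sorted(parser.directives))
```

The sketch also shows the limitation: the robot must first fetch and parse the page, and then choose to honor what it finds. Unlike robots.txt, the tag cannot stop the page from being requested in the first place.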

I think you are at the point where you need your domain. Not just so
that you can have a robots.txt but also to make what you are doing
look more professional.

--
Ken
http://www.ke9nr.net/
 
 
Scott
Guest
Posts: n/a
 
      11-30-2005


Ken Sims wrote:
>
> Hi Scott -
>
> On Tue, 29 Nov 2005 14:19:32 -0600, Scott <(E-Mail Removed)> wrote:
>
> >Darn. My website is: www.uslink.net/~golden. It's not my own domain,
> >so it looks like the host root is out of my reach. The page that I don't
> >want to be indexed is www.uslink.net/~golden/order1.html.

>
> Yes, robots.txt has to be at http://www.uslink.net/robots.txt
>
> Your only option for robots.txt is to see if you can convince USLink
> to add a robots.txt with your Disallow. If you click the above link,
> you will see that they don't have a robots.txt.
>
> >I'm trying not to use a password. It's only this one page that I want to
> >prevent from being indexed. Everything else on the site is fair game.
> >
> >What are the chances that <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
> >will do the job?

>
> It's better than nothing, but that's all I can say.
>
> I think you are at the point where you need your domain. Not just so
> that you can have a robots.txt but also to make what you are doing
> look more professional.
>
> --
> Ken
> http://www.ke9nr.net/



Ken,

I agree. In fact, the only reason I'm staying with my ISP-provided webspace
is that I don't want to have to start over being found by the search engines
again (although my Google ranking...under "GNLD" has slipped out of the top 20
this past year, but it's still pretty high with Yahoo). Also, my email address
has been around for nine years. I do have my own domain (www.teamone.net) for
a business site I'm starting to build. Then I'll have more control over things.

Scott
 
 
Ken Sims
Guest
Posts: n/a
 
      11-30-2005
Hi Scott -

On Tue, 29 Nov 2005 18:18:09 -0600, Scott <(E-Mail Removed)> wrote:

>I agree. In fact, the only reason I'm staying with my ISP-provided webspace
>is that I don't want to have to start over being found by the search engines
>again (although my Google ranking...under "GNLD" has slipped out of the top 20
>this past year, but it's still pretty high with Yahoo).


If you can set up 301 redirects, it ought to be pretty smooth, both for
the search engines switching over as they attempt to re-spider the old
site, and for users clicking links that lead to the old site.
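
As a sketch of what a 301 redirect amounts to at the HTTP level, here is a minimal handler built on Python's standard http.server. The redirect table, the www.example.com target domain, and the names REDIRECTS, redirect_target and RedirectHandler are all assumptions made for illustration; on ISP-provided webspace a redirect would normally be set up in the web server's configuration, if the ISP allows it at all.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of old paths and their new permanent locations.
REDIRECTS = {
    "/~golden/": "http://www.example.com/",
    "/~golden/index.html": "http://www.example.com/index.html",
}


def redirect_target(path):
    """Return the new URL for an old path, or None if it has not moved."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer GET requests for moved pages with 301 Moved Permanently."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()
```

To try it locally, pass RedirectHandler to http.server.HTTPServer and call serve_forever(). The important part is the 301 status plus the Location header: that pair tells both browsers and spiders that the page has moved for good.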

>Also, my email address has been around for nine years.


I'm not suggesting that you get rid of your USLink account.

>I do have my own domain (www.teamone.net) for a business site I'm starting to build.
>Then I'll have more control over things.


Control is good. I went from a user website on the ISP's domain (like
what you have with USLink), to my own domain with virtual hosting, to
my own domains on a virtual server, to my own domains on my own
physical server that is about six feet away from me. And this is for
non-income-producing domains.

--
Ken
http://www.ke9nr.net/
 
 
 
 