rejetto forum
Software => HFS ~ HTTP File Server => Topic started by: CBB on December 02, 2007, 05:00:30 PM
-
I'd like to prevent search bots from reading the contents of my site, or at least from following the "folder archive" links.
The most bothersome is the Google bot: it follows the "folder archive" link, downloads about 15 megabytes, interrupts the download, and then repeats this operation several times a day.
My current solution is to ban the Google bot's IPs, but as far as I remember there is a way to add something like "metaname=robot, nofollow" to the links.
-
Add this to the head section of your template ....
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
Add attached file to the root of your vfs and hide.
That's it!
-
You can also place a robots.txt file in the root directory... mine is at www.aguyincookeville.com/robots.txt and that will stop 'em as well!! I also have issues with robots... darn Google, Yahoo and Ask.com robots... but this text file has pretty well stopped them. Enjoy!
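For reference, a minimal robots.txt that tells all compliant crawlers to stay off the whole site looks like this (it must be served from the site root, i.e. at /robots.txt — the per-site contents will of course vary):

```
User-agent: *
Disallow: /
```

A crawler that respects the Robots Exclusion Protocol will fetch this file first and skip everything else.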
tuskenraider
-
Tuskenraider
Doesn't look like you read my reply above ???
-
Thank you, I've followed your recommendations.
But I also think that preventing bots from following "folder archive" links should be done in the default template.
-
Tuskenraider
Doesn't look like you read my reply above ???
d'oh... sorry man... I've not had my morning coffee... many apologies!
Tusken-where the hecks my coffee cup-Raider
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
-
Next beta will include a "stop spiders" option.
It will just serve this standard robots.txt file (but only if no such file exists in the file system).
I'm gonna make this option ON by default, and hidden while in "easy mode". Any opinion is welcome.
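As a quick sanity check that a blanket-deny robots.txt really keeps compliant bots out, Python's standard-library parser can be used (the hostname and the exact rules served by the option are placeholders/assumptions here):

```python
from urllib import robotparser

# The blanket-deny robots.txt a "stop spiders" option would presumably serve
rules = "User-agent: *\nDisallow: /\n"

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler must not fetch anything on the site
print(rp.can_fetch("Googlebot", "http://example.com/some/folder/"))  # False
```

Note this only models *compliant* crawlers; a bot that ignores robots.txt is unaffected.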
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
Please note that here I mean exclusively the "folder archive" links, not other links or directories, so my proposal would not hinder site spidering, only reduce traffic.
-
Next beta will include a "stop spiders" option.
It will just serve this standard robots.txt file (but only if no such file exists in the file system).
I'm gonna make this option ON by default, and hidden while in "easy mode". Any opinion is welcome.
I support this decision, it seems to be very reasonable.
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
Please note that here I mean exclusively the "folder archive" links, not other links or directories, so my proposal would not hinder site spidering, only reduce traffic.
+1
-
i'm not sure i understand what that +1 is referring to
-
i'm not sure i understand what that +1 is referring to
He's agreeing to what CBB said.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
does anyone know what the robots.txt should contain to get this job done?
-
maybe something like /*/~folder.tar under the Disallow section?
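If the archive links really do end in ~folder.tar (an assumption about the template's URL shape), a targeted robots.txt along those lines might look like this. Note that the * wildcard in paths is not part of the original robots.txt standard, though major crawlers such as Googlebot and Bingbot do honor it:

```
User-agent: *
Disallow: /*/~folder.tar
```

Crawlers that don't support wildcards would simply ignore the rule, so this only reduces traffic from the big search engines.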
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
does anyone know what the robots.txt should contain to get this job done?
I meant changing the attributes of the links in the default template, not adding a robots.txt. It would be useful to add rel="nofollow" to the "folder archive" links, see http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html or http://en.wikipedia.org/wiki/Nofollow .
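Concretely, the archive link in the template would just carry the extra attribute, something like the following (the href shown is only a guess at the archive URL shape; the real HFS template generates the link through its own macros):

```html
<!-- hypothetical archive link; the actual href comes from the template engine -->
<a href="/somefolder/~folder.tar" rel="nofollow">folder archive</a>
```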
-
AFAIK nofollow doesn't stop spiders.
it only stops rank propagation.
i think it has truly no effect on folder archives.
-
the stop spiders function doesn't work..
try this http://www.google.com/webmasters/tools/ and you will understand..