rejetto forum
Software => HFS ~ HTTP File Server => Topic started by: CBB on December 02, 2007, 05:00:30 PM
-
I'd like to prevent search bots from reading the contents of my site, or at least from following the "folder archive" links.
The most bothersome is the Google bot: it follows the "folder archive" link, downloads about 15 megabytes, interrupts the download, and then repeats this operation several times a day.
My current solution is to ban the Google bot's IPs, but as far as I remember there is a way to add something like "metaname=robot, nofollow" to the links.
-
Add this to the head section of your template ....
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
Add attached file to the root of your vfs and hide.
That's it!
-
You can also place a robots.txt file in the root directory... mine is at www.aguyincookeville.com/robots.txt and that will stop 'em as well!! I also have issues with robots... darn Google, Yahoo and Ask.com robots... but this text file has pretty well stopped them. Enjoy!
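For reference, a minimal robots.txt that tells all compliant crawlers to stay off the whole site looks like this (it must be served from the site root, i.e. at /robots.txt — the per-site contents will of course vary):

```
User-agent: *
Disallow: /
```

A crawler that respects the Robots Exclusion Protocol will fetch this file first and skip everything else.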
tuskenraider
-
Tuskenraider
Doesn't look like you read my reply above ???
-
Thank you, I've followed your recommendations.
But I also think that preventing bots from following "folder archive" links should be done in the default template.
-
Tuskenraider
Doesn't look like you read my reply above ???
d'oh... sorry man... I've not had my morning coffee... many apologies!
Tusken-where the hecks my coffee cup-Raider
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
-
Next beta will include a "stop spiders" option.
It will just serve this standard robots.txt file (but only if no such file exists in the file system).
I'm gonna make this option ON by default, and hidden while in "easy mode". Any opinion is welcome.
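As a quick sanity check that a blanket-deny robots.txt really keeps compliant bots out, Python's standard-library parser can be used (the hostname and the exact rules served by the option are placeholders/assumptions here):

```python
from urllib import robotparser

# The blanket-deny robots.txt a "stop spiders" option would presumably serve
rules = "User-agent: *\nDisallow: /\n"

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler must not fetch anything on the site
print(rp.can_fetch("Googlebot", "http://example.com/some/folder/"))  # False
```

Note this only models *compliant* crawlers; a bot that ignores robots.txt is unaffected.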
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
Please note that here I mean exclusively the "folder archive" links, not other links or directories, so my proposal would not hinder site spidering, only reduce traffic.
-
Next beta will include a "stop spiders" option.
It will just serve this standard robots.txt file (but only if no such file exists in the file system).
I'm gonna make this option ON by default, and hidden while in "easy mode". Any opinion is welcome.
I support this decision, it seems to be very reasonable.
-
I also think that preventing bots from following "folder archive" links should be done in the default template.
I disagree. It's a matter of personal preference based on the contents of the site. There might be some admins that want their sites spidered.
Please note that here I mean exclusively the "folder archive" links, not other links or directories, so my proposal would not hinder site spidering, only reduce traffic.
+1
-
i'm not sure i understand what that +1 is referring to
-
i'm not sure i understand what that +1 is referring to
He's agreeing to what CBB said.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
does anyone know what the robots.txt should contain to get this job done?
-
maybe something like /*/~folder.tar under the Disallow section?
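If the archive links really do end in ~folder.tar (an assumption about the template's URL shape), a targeted robots.txt along those lines might look like this. Note that the * wildcard in paths is not part of the original robots.txt standard, though major crawlers such as Googlebot and Bingbot do honor it:

```
User-agent: *
Disallow: /*/~folder.tar
```

Crawlers that don't support wildcards would simply ignore the rule, so this only reduces traffic from the big search engines.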
-
i asked what exactly it's referring to, not the meaning of +1 itself.
Stopping spiders from following 'folder archive' links and downloading data, thus wasting precious resources. Your 'stop spiders' option would stop them completely.
does anyone know what the robots.txt should contain to get this job done?
I meant changing the attributes of the links in the default template, not adding a robots.txt. It would be useful to add rel="nofollow" to the "folder archive" links, see http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html or http://en.wikipedia.org/wiki/Nofollow .
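Concretely, the archive link in the template would just carry the extra attribute, something like the following (the href shown is only a guess at the archive URL shape; the real HFS template generates the link through its own macros):

```html
<!-- hypothetical archive link; the actual href comes from the template engine -->
<a href="/somefolder/~folder.tar" rel="nofollow">folder archive</a>
```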
-
AFAIK nofollow doesn't stop spiders.
it only stops rank propagation.
i think it has truly no effect on folder archives.
-
the stop spiders function doesn't work..
try this http://www.google.com/webmasters/tools/ and you will understand..