rejetto forum
Software => HFS ~ HTTP File Server => Topic started by: ninjapimp on May 13, 2010, 04:11:52 AM
-
66.249.68.167
A whois on that IP shows: crawl-66-249-68-167.googlebot.com
5/12/2010 11:06:20 PM 66.249.68.167:52928 Requested GET /WALLPAPERS/H.R.Giger/?sort=e&rev=1
5/12/2010 11:06:20 PM 66.249.68.167:52928 Sent 1460 bytes
5/12/2010 11:06:20 PM 66.249.68.167:52928 Served 53.81 KB
In my options I have Prevent Spiders turned on,
so I'm wondering how Google is getting past it.
I'm using beta build 260 and I don't remember this being a problem in previous builds.
Is there something I can do to block it?
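To see how often the crawler really comes by, the log lines above can be tallied per IP. This is only a rough sketch: the regular expression is guessed from the few log lines pasted in this thread, and `requests_per_ip` and `looks_like_googlebot` are made-up helper names, not part of HFS. The hostname check only looks at the suffix reported by the lookup; a real check would also confirm that the hostname resolves back to the same IP.

```python
import re
from collections import Counter

# Pattern inferred from the HFS log lines pasted above (an assumption,
# not an official format description).
LOG_LINE = re.compile(
    r"^(?P<date>\d+/\d+/\d+) (?P<time>\d+:\d+:\d+ [AP]M) "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+) (?P<event>.*)$"
)


def looks_like_googlebot(hostname: str) -> bool:
    """Return True if a reverse-lookup hostname ends in .googlebot.com."""
    host = hostname.rstrip(".").lower()
    return host.endswith(".googlebot.com")


def requests_per_ip(lines):
    """Count 'Requested ...' events per client IP."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("event").startswith("Requested "):
            counts[match.group("ip")] += 1
    return counts


# Lines copied from the posts in this thread.
sample = [
    "5/12/2010 11:06:20 PM 66.249.68.167:52928 Requested GET /WALLPAPERS/H.R.Giger/?sort=e&rev=1",
    "5/12/2010 11:06:20 PM 66.249.68.167:52928 Sent 1460 bytes",
    "5/12/2010 11:06:20 PM 66.249.68.167:52928 Served 53.81 KB",
    "5/13/2010 9:19:22 AM 66.249.68.197:55185 Connected",
]

if __name__ == "__main__":
    print(requests_per_ip(sample))
    print(looks_like_googlebot("crawl-66-249-68-167.googlebot.com"))
```

Only the "Requested" lines are counted, so "Sent", "Served" and "Connected" events for the same connection do not inflate the numbers.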
-
try putting this in the <head>:
<meta name="robots" content="noindex, nofollow"/>
I have done this to my template and my HFS doesn't seem to end up on Google. Google will still try to crawl it, but if you tell it with a robots.txt and that meta tag, it usually won't list you.
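The two mechanisms do different jobs: the meta tag asks a crawler not to list a page it has already fetched, while robots.txt asks it not to fetch in the first place, and both are requests that a well-behaved crawler follows rather than an actual block. As a small sketch, Python's standard `urllib.robotparser` can show how a compliant crawler would read a "disallow everything" robots.txt. The file content below is an example, not something HFS generates, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Path taken from the log lines earlier in the thread.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "/WALLPAPERS/H.R.Giger/"))
```

For this to have any effect the file would need to be reachable as /robots.txt at the root of the server.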
-
hey great idea
thanks for the tip
-
OK, I tried that and it fails; Googlebot is still coming.
Here is a copy and paste of the changes I made:
<center><p><a href="http://www.devilsrage.com"><img src="/header-logo.jpg" alt="Devils Rage" /></a></p></center>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<!--{.comment|--><h1 style='margin-bottom:100em'>WARNING: this template is only to be used with HFS 2.3 (and macros enabled)</h1> <!--.} -->
<html>
<head>
<meta name="robots" content="noindex, nofollow"/>
After I made those changes I restarted HFS and double-checked that it was loading the proper file.
5/13/2010 9:19:22 AM 66.249.68.197:55185 Connected
5/13/2010 9:19:22 AM 66.249.68.197:55185 Requested GET /FTP/BIT TORRENTS/Tiesto_-_Club_Life_091_(Best_of_2008)-CABLE-12-26-2008-TALiON--TRANCEF.COM--.rar.torrent
5/13/2010 9:19:22 AM 66.249.68.197:55185 Sent 1460 bytes
-
What I use is slightly different
<META content=NOINDEX,NOFOLLOW name=ROBOTS>
The difference probably doesn't matter, but I don't get google bot :)
-
From what you pasted, it looks like you have <center> and other things, and only then <!DOCTYPE.
If that's the order, it's wrong, and it may explain why it's not working.
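A quick way to check the ordering point is to run the top of the template through a parser and see what comes first. The sketch below uses Python's standard `html.parser`; `TemplateCheck` and `check_template` are hypothetical names, and the two sample strings are shortened versions of the template fragments pasted in this thread (with a closing `</head></html>` added so they are complete). It only looks at the order of tags, not at whether a crawler would honor the meta tag.

```python
from html.parser import HTMLParser


class TemplateCheck(HTMLParser):
    """Record what the document starts with and whether a robots meta is in <head>."""

    def __init__(self):
        super().__init__()
        self.first_event = None      # "doctype" or the name of the first tag seen
        self.in_head = False
        self.robots_in_head = False

    def handle_decl(self, decl):
        if self.first_event is None and decl.lower().startswith("doctype"):
            self.first_event = "doctype"

    def handle_starttag(self, tag, attrs):
        if self.first_event is None:
            self.first_event = tag
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            name = dict(attrs).get("name") or ""
            if name.lower() == "robots":
                self.robots_in_head = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_template(html: str):
    """Return (doctype_comes_first, robots_meta_inside_head)."""
    checker = TemplateCheck()
    checker.feed(html)
    checker.close()
    return checker.first_event == "doctype", checker.robots_in_head


DOCTYPE = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">'
META = '<meta name="robots" content="noindex, nofollow"/>'
BAD = "<center><p>logo</p></center>\n" + DOCTYPE + "\n<html>\n<head>\n" + META + "\n</head></html>"
GOOD = DOCTYPE + "\n<html>\n<head>\n" + META + "\n</head></html>"

if __name__ == "__main__":
    print(check_template(BAD))
    print(check_template(GOOD))
```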
-
66.249.65.39 - - [29/May/2010:11:22:31 -0500] "GET /WALLPAPERS/Scotland/Stirling%20Bridge%20and%20National%20Wallace%20Monument.jpg HTTP/1.1" 200 4280807 " "
I've been unable to prevent Google from crawling my site.
It seems to have started when I applied beta build 260.
I have the option to prevent spiders turned on.
I went into my template and added this info, but still no luck.
----copy n paste-----
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<!--{.comment|--><h1 style='margin-bottom:100em'>WARNING: this template is only to be used with HFS 2.3 (and macros enabled)</h1> <!--.} -->
<html>
<head>
<meta name="robots" content="noindex, nofollow"/>
<META content=NOINDEX,NOFOLLOW name=ROBOTS>
<center><p><a href="http://www.devilsrage.com"><img src="/header-logo.jpg" alt="Devils Rage" /></a></p></center>
<title>Devils Rage HFS</title>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<link rel="stylesheet" href="/~style.css" type="text/css">
<link rel="stylesheet" href="/~style.menu.css" type="text/css">
<title>HFS %folder%</title>
<link rel="shortcut icon" href="/favicon.ico">
<!--[if lte IE 5.5]>
<style type="text/css">
.menu ul li a, .menu ul li a:visited { width:151px; w\idth:139px; }
</style>
--------
That's the first few lines...
Any ideas how to stop Google from crawling my sites?
It's doing it on both my HFS SSL site and the non-SSL site.
-
if you fear this is related to a recent build, try to revert to #255 or similar, to confirm your guess
-
I had build 252 prior to upgrading to 260 and it was not an issue.
With 260 I can clearly see Google is crawling my site endlessly,
but with 252 it did not appear to happen at all.
-
your past experience is not enough to say it depends on the build.
you should revert now to confirm this supposition.
-
A bit of a stupid question, but have you enabled the stop spiders option?
Maybe you have unchecked it.
-
Yes, I have.
If I use 252 or below, Google won't crawl.
But with anything above 252 (I went from 252 to 260), Google crawls my site,
even after turning on Prevent Spiders and adding code to the template.
I must be the only person noticing this, so I can only conclude it's something on my end that I've yet to figure out.
Funny thing is, I never made any changes to my template, yet after the upgrade to 260 Google was crawling me big time.
-
I think <meta name="robots" content="noindex, nofollow"/> has to be on every page.
Do you have a "robots.txt" file in the root of your server? Maybe that would work better.
I've removed the meta name=robots method from my pages and am trying the robots.txt for a while.
I just found that in 260, Menu > Limits > Stop Spiders is "greyed out" and I cannot un-check it.
I just changed it in the ini; now it's un-checked, but still "greyed out". Makes me wonder if it was working.
-
ninja, could you state clearly whether you tried recently (that is, after the problem appeared) to REVERT to a previous build like 250?
note: i'm not talking about your past experience from before you updated your hfs