rejetto forum

google bot is getting past the Prevent Spiders option


Offline ninjapimp

  • Occasional poster
  • *
    • Posts: 71
66.249.68.167
a whois on that ip shows: crawl-66-249-68-167.googlebot.com

5/12/2010 11:06:20 PM 66.249.68.167:52928 Requested GET /WALLPAPERS/H.R.Giger/?sort=e&rev=1
5/12/2010 11:06:20 PM 66.249.68.167:52928 Sent 1460 bytes
5/12/2010 11:06:20 PM 66.249.68.167:52928 Served 53.81 KB

on my options I have it set to Prevent Spiders

so i'm wondering, how is google getting past it?
i'm using beta build 260 and i don't remember this being a problem in previous builds

is there something i can do to block it?




Offline TSG

  • Operator
  • Tireless poster
  • *****
    • Posts: 1935
    • RAWR-Designs
try putting this in the <head>:

<meta name="robots" content="noindex, nofollow"/>

I have done this to my template and my HFS doesn't seem to end up on Google. Google will still try to crawl it, but if you tell it not to with a robots.txt and that meta tag, it usually won't list you.
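
For anyone following along, here is a minimal sketch of what a "disallow everything" robots.txt looks like and how a compliant crawler would interpret it, using Python's standard `urllib.robotparser`. The names `ROBOTS_TXT` and `is_allowed` are made up for this example, and it assumes the file is served at the root of the site. Keep in mind that robots.txt is advisory, and that the meta tag can only take effect on HTML pages a crawler has already fetched, so it cannot stop requests for plain files such as images or torrents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that asks every crawler to stay out entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a crawler honouring robots_txt may fetch path."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # With the rules above, a compliant Googlebot should skip this folder.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "/WALLPAPERS/H.R.Giger/"))
```

If the check reports that the path is allowed, the rules are not saying what you think they are saying; if it reports disallowed and the bot still shows up in the log, the next thing to verify is that the file is actually reachable at /robots.txt.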



Offline ninjapimp

  • Occasional poster
  • *
    • Posts: 71
ok, i tried that and it fails; googlebot is still coming.

here is a copy n paste of my changes made:
  <center><p><a href="http://www.devilsrage.com"><img src="/header-logo.jpg" alt="Devils Rage" /></a></p></center>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<!--{.comment|--><h1 style='margin-bottom:100em'>WARNING: this template is only to be used with HFS 2.3 (and macros enabled)</h1> <!--.} -->
<html>
<head>
<meta name="robots" content="noindex, nofollow"/>



and after I made those changes, i restarted HFS and double-checked to make sure it was loading the proper file
5/13/2010 9:19:22 AM 66.249.68.197:55185 Connected
5/13/2010 9:19:22 AM 66.249.68.197:55185 Requested GET /FTP/BIT TORRENTS/Tiesto_-_Club_Life_091_(Best_of_2008)-CABLE-12-26-2008-TALiON--TRANCEF.COM--.rar.torrent
5/13/2010 9:19:22 AM 66.249.68.197:55185 Sent 1460 bytes



Offline r][m

  • Tireless poster
  • ****
    • Posts: 347
What I use is slightly different
Code:
<META content=NOINDEX,NOFOLLOW name=ROBOTS>
The difference probably doesn't matter, but I don't get google bot  :)


Offline rejetto

  • Administrator
  • Tireless poster
  • *****
    • Posts: 13517
from what you pasted, it looks like you have <center> and other things first, and then <!DOCTYPE

if that's the order, it's wrong, and it may explain why it's not working.
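
One way to check the order without eyeballing the template is a small script. This is only a sketch using Python's standard `html.parser`; the class `OrderChecker` and the function `check_template` are names invented for the example, and it assumes the template text has been read into a string. It records any tags that appear before the DOCTYPE and whether a robots meta tag sits inside <head>.

```python
from html.parser import HTMLParser


class OrderChecker(HTMLParser):
    """Track tags seen before the DOCTYPE and a robots meta inside <head>."""

    def __init__(self):
        super().__init__()
        self.seen_doctype = False
        self.tags_before_doctype = []
        self.in_head = False
        self.robots_meta_in_head = False

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE ...>.
        if decl.lower().startswith("doctype"):
            self.seen_doctype = True

    def handle_starttag(self, tag, attrs):
        if not self.seen_doctype:
            self.tags_before_doctype.append(tag)
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            name = dict(attrs).get("name") or ""
            if name.lower() == "robots":
                self.robots_meta_in_head = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_template(text):
    """Parse template text and return the populated checker."""
    checker = OrderChecker()
    checker.feed(text)
    checker.close()
    return checker
```

An empty `tags_before_doctype` list together with `robots_meta_in_head` being true suggests the top of the template is in a sensible order; anything listed before the DOCTYPE should be moved into the body.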


Offline ninjapimp

  • Occasional poster
  • *
    • Posts: 71
66.249.65.39 - - [29/May/2010:11:22:31 -0500] "GET /WALLPAPERS/Scotland/Stirling%20Bridge%20and%20National%20Wallace%20Monument.jpg HTTP/1.1" 200 4280807 " "

i've been unable to prevent google from crawling on my site
it seems to have started when i applied the beta build 260

i have it in option to prevent spiders
i went into my template and added this info but still no luck.
----copy n paste-----
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<!--{.comment|--><h1 style='margin-bottom:100em'>WARNING: this template is only to be used with HFS 2.3 (and macros enabled)</h1> <!--.} -->
<html>
<head>
<meta name="robots" content="noindex, nofollow"/>
<META content=NOINDEX,NOFOLLOW name=ROBOTS>
<center><p><a href="http://www.devilsrage.com"><img src="/header-logo.jpg" alt="Devils Rage" /></a></p></center>
      <title>Devils Rage HFS</title>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <link rel="stylesheet" href="/~style.css" type="text/css">
  <link rel="stylesheet" href="/~style.menu.css" type="text/css">
  <title>HFS %folder%</title>
  <link rel="shortcut icon" href="/favicon.ico">

<!--[if lte IE 5.5]>
<style type="text/css">
.menu ul li a, .menu ul li a:visited { width:151px; w\idth:139px; }

</style>



--------
that's the first few lines...


any ideas how to stop google from crawling my sites?
it's doing it on my HFS ssl and the non-ssl site


Offline rejetto

  • Administrator
  • Tireless poster
  • *****
    • Posts: 13517
if you fear this is related to a recent build, try to revert to #255 or similar, to confirm your guess


Offline ninjapimp

  • Occasional poster
  • *
    • Posts: 71
i had build 252 prior to upgrading to 260 and it was not an issue

with 260 i can clearly see google is crawling my site endlessly
but with 252 it did not appear to happen at all


Offline rejetto

  • Administrator
  • Tireless poster
  • *****
    • Posts: 13517
your past experience is not enough to say it depends on the build.
you should revert now to confirm this supposition.


Offline Rob215

  • Occasional poster
  • *
    • Posts: 13
A bit of a stupid question, but have you enabled the Stop Spiders option?

Maybe you have unchecked it.


Offline ninjapimp

  • Occasional poster
  • *
    • Posts: 71
yes i have.
if i use 252 or below, google won't crawl
but if i use anything above 252 (i went from 252 to 260), google crawls my site
even after turning on the prevent spiders option and adding code to the template.

i must be the only person noticing this, so i can only conclude it's something on my end that i've yet to figure out
funny thing is, i never made any changes to my template, yet after the upgrade to 260 google was crawling me big time.


Offline r][m

  • Tireless poster
  • ****
    • Posts: 347
I think <meta name="robots" content="noindex, nofollow"/> has to be on every page.
Do you have a "robots.txt"  file in the root of your server? Maybe that would work better.

I've removed the meta name=robots method from my pages and am trying the robots.txt
for a while.

I just found that in 260, Menu > Limits > Stop Spiders is "greyed out" and I cannot un-check it.
I just changed it in the ini; now it's un-checked, but still "greyed out". Makes me wonder
if it was working.

« Last Edit: June 14, 2010, 07:20:23 AM by r][m »


Offline rejetto

  • Administrator
  • Tireless poster
  • *****
    • Posts: 13517
ninja, could you state clearly whether you tried recently (that is, after the problem appeared) to REVERT to a previous build like 250?

note: i'm not talking about your past experience from before you updated your hfs