• Welcome back

    From Al@VERT/TRMB to kk4qbn on Tuesday, May 12, 2015 08:46:00
    On 05/12/15, kk4qbn said the following...

    I've been out of the BBS game for close to 10 years. I used to run
    Warzone BBS, my old handle was Mrproper. I've now got back into BBSing,

    Good to see you and warzone back, my situation is much the same.. :)

    Ttyl :-),
    Al

    --- Mystic BBS v1.10 (Linux)
    * Origin: The Rusty MailBox - Penticton, B.C. Canada
  • From kk4qbn@VERT/NWGA_NET to Al on Tuesday, May 12, 2015 23:48:11
    Re: Welcome back
    By: Al to kk4qbn on Tue May 12 2015 08:46 am

    Good to see you and warzone back, my situation is much the same.. :)

    Thanks for the welcome. I'm not going by the name Warzone anymore. I do plan on
    having some door games, but now I want to host mainly Amateur Radio and RC model related stuff. I'm also trying to cater to the local community more with a POTS dial-in line, so I have opted for a more fitting name, which I hope is in
    the origin line.

    I haven't really tested to see if anything works yet. I can get into it myself, and I do think my firewall opened correctly, but my router is weird. I do know NAT is working because of all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to log in using the nwga_net.synchro.net name from the local network, it goes straight to my router setup, both on port 23 and 80.

    Does everyone else get tons of hack attempts, or is this an isolated incident? My computer is locked tighter than Fort Knox, but it bugs me. It looks like most of them are either trying to root the PC or log in as admin, and they also give "busybox" commands, which from what I can tell are mainly used on Android cell phones.




    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From Al@VERT/TRMB to kk4qbn on Wednesday, May 13, 2015 07:52:00
    On 05/13/15, kk4qbn said the following...

    Thanks for the welcome, I'm not going by the name Warzone anymore, I do plan on having some doorgames, but now I'm wanting to host mainly
    Amateur Radio and RC model related stuff, also I'm trying to cater to
    the local community more with a pots dial in line, so I have opted for a more fitting name which I hope is in the origin line. Havent really

    I see Northwest GA Network: and your telnet address and phone number. Looks good.

    tested to see if anything works yet, I can get into it myself, I do
    think my firewall opened correctly, but my router is weird, I do not NAT is working because all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to login using the nwga_net.synchro.net name from the local network It goes straight to my router setup, both on port 23 and 80. Does everyone else get tons of
    hack attempts or is this an isolated incident? my computer is locked tighter than fort knox, but it bugs me. looks like most of them are
    either trying to root the pc, or login as admin, and also give "busybox" commands from what I can tell is mainly used on android cell phones.

    I get tons of it here too. Mystic has auto-banned most of it, so it's not such
    a bother anymore. Most of it seems to come from China and that part of the world. I don't know what they are trying to accomplish, but into the bit bucket they go. I should probably just ban China as a whole since there is no
    Chinese content here, but I've left it open so anyone there who reads English could log in and participate if they wanted to. Don't think that's ever happened, but it could. :)

    Ttyl :-),
    Al

    --- Mystic BBS v1.10 (Linux)
    * Origin: The Rusty MailBox - Penticton, B.C. Canada
  • From kk4qbn@VERT/NWGA_NET to Al on Wednesday, May 13, 2015 14:06:31
    Re: Re: Welcome back
    By: Al to kk4qbn on Wed May 13 2015 07:52 am

    I get tons of it here too. Mystic has auto banned most of it so it's not such a bother anymore. Most of it seems to come from China and that part of the world, don't know what they are trying to accomplish but into the bit bucket they go. I should probably just ban china as a whole since there is no chinese content here but I've left it open so anyone there who reads english could log in and participate if they wanted too. Don't think that's ever happened but it could. :)

    As much as I hate to, I've put China, Russia and others into the host filter; it seems that's where most of them originate. The ones that are trying to gain root access to Android phones seem to all come from the US. They are more than likely drone computers or phones that have been compromised to do the dirty work.




    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From mark lewis@VERT to kk4qbn on Wednesday, May 13, 2015 16:56:41
    On Wed, 13 May 2015, kk4qbn wrote to Al:

    As much as I hate to, I've put china, russia and others into the
    host filter, it seems that's where most of them originate. The
    ones that are trying to gain root access to android phones seem to
    all come from the US, they are more than likely drone computers or
    phones that have been compromised to do the dirty work.

    FWIW: it is easiest to say that all of them are drones... it is doubtful that there are any humans on the other end actually typing away... they only come into play after the drones have found a way in and set up the backdoor...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Lord Time@VERT/TIME to Al on Wednesday, May 13, 2015 22:23:14
    tested to see if anything works yet, I can get into it myself, I do think my firewall opened correctly, but my router is weird, I do not NAT is working because all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to login using the nwga_net.synchro.net name from the local network It goes straight to my router setup, both on port 23 and 80.

    thats why my bbs telnet port is on 24 and my bbs http is on port 81

    Does everyone else get
    tons of hack attempts or is this an isolated incident? my computer is locked tighter than fort knox, but it bugs me. looks like most of them are either trying to root the pc, or login as admin, and also give "busybox" commands from what I can tell is mainly used on android cell phones.

    was it on your ftp side, if so I get it also


    ---

    Rob Starr
    Lord Time SysOp of
    Time Warp of the Future BBS
    Telnet://Time.Darktech.Org:24 or
    Telnet://Time.Synchro.Net:24 (qwk or ftn & e-mail)
    ICQ # 11868133 or # 70398519 Jabber : lordtime2000@gmail.com
    Yahoo : lordtime2000 AIM : LordTime20000 MSN : Lord Time
    Astra : lord_time X-Box : Lord Time 2000 oovoo : lordtime2000
    ---
    þ Synchronet þ Time Warp of the Future BBS - Home of League 10 IBBS Games
  • From kk4qbn@VERT/NWGA_NET to Lord Time on Thursday, May 14, 2015 10:45:35
    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    was it on your ftp side, if so I get it also

    Nah, most of it is telnet and ssh. the only thing fooling with my ftp is googlebot.




    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From KenDB3@VERT/KD3NET to kk4qbn on Thursday, May 14, 2015 19:34:15
    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    was it on your ftp side, if so I get it also

    Nah, most of it is telnet and ssh. the only thing fooling with my ftp is googlebot.

    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From Lord Time@VERT/TIME to kk4qbn on Thursday, May 14, 2015 19:44:43
    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    was it on your ftp side, if so I get it also

    Nah, most of it is telnet and ssh. the only thing fooling with my ftp is googlebot.

    ok, I have them blocked

    66.249.64.~
    66.249.65.~
    66.249.66.~
    66.249.67.~
    66.249.68.~
    66.249.69.~
    66.249.70.~
    66.249.71.~
    66.249.72.~
    66.249.73.~
    66.249.74.~
    66.249.75.~
    66.249.76.~
    66.249.79.~
    66.249.83.~
    72.14.199.~
    209.85.238.~
    203.208.60.~
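
    A minimal sketch of how a list like the one above could be checked against an incoming address. It assumes the trailing "~" stands for "any value in the last octet", which may not be how the BBS filter file actually interprets it; the function names are hypothetical and not part of Synchronet.

```python
import ipaddress

# Patterns copied from the post above. ASSUMPTION: a trailing "~" means
# "any value in the last octet", i.e. the whole /24.
BLOCKED_PATTERNS = [
    "66.249.64.~", "66.249.65.~", "66.249.66.~", "66.249.67.~",
    "66.249.68.~", "66.249.69.~", "66.249.70.~", "66.249.71.~",
    "66.249.72.~", "66.249.73.~", "66.249.74.~", "66.249.75.~",
    "66.249.76.~", "66.249.79.~", "66.249.83.~", "72.14.199.~",
    "209.85.238.~", "203.208.60.~",
]


def pattern_to_network(pattern):
    """Turn 'a.b.c.~' into the network a.b.c.0/24 (hypothetical helper)."""
    prefix = pattern.rstrip("~").rstrip(".")
    return ipaddress.ip_network(prefix + ".0/24")


def is_blocked(ip, patterns=BLOCKED_PATTERNS):
    """Return True if ip falls inside any of the listed /24 ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in pattern_to_network(p) for p in patterns)
```

    For example, is_blocked("66.249.73.128") is true with this list, while an address in 66.249.77.x is not, since that range is not listed.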


    ---

    þ Synchronet þ Time Warp of the Future BBS - Home of League 10 IBBS Games
  • From Mickey@VERT/OXFORDMI to KenDB3 on Friday, May 15, 2015 10:21:33
    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm

    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    was it on your ftp side, if so I get it also

    Nah, most of it is telnet and ssh. the only thing fooling with my ftp is googlebot.

    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    ~KenDB3


    Shhhh..... government sponsored searches for mostly terror and kiddie related files.

    Mickey
    SynchroNET 3.15
    Oxford Mills Remote @ telnet://oxfordmi.synchro.net:23
    Live Music Based IRC Chat @ irc:oxford.synchro.net 6667
    -+- --- ---
    another visitor! Stay awhile..... Stay FOREVER!
    *beep*



    ---
    þ Synchronet þ Oxford Mills Remote - manning.webhop.net
  • From Mro@VERT/BBSESINF to Lord Time on Friday, May 15, 2015 16:13:29
    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    tested to see if anything works yet, I can get into it myself, I do think my firewall opened correctly, but my router is weird, I do
    not NAT is working because all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to login using the nwga_net.synchro.net name from the local network
    It goes straight to my router setup, both on port 23 and 80.

    thats why my bbs telnet port is on 24 and my bbs http is on port 81

    you guys could always get a router that isnt a POS
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From Mro@VERT/BBSESINF to KenDB3 on Friday, May 15, 2015 16:14:07
    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm


    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?


    googlebot is an asshole!
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From KenDB3@VERT/KD3NET to Mickey on Friday, May 15, 2015 20:42:11
    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm

    Re: Re: Welcome back
    By: Lord Time to Al on Wed May 13 2015 10:23 pm

    was it on your ftp side, if so I get it also

    Nah, most of it is telnet and ssh. the only thing fooling with my ftp is googlebot.

    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    ~KenDB3


    Shhhh..... government sponsored searches for mostly terror and kiddie related files.

    Mickey

    So, you're saying I should take down my anarchy related junk? LOL

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From KenDB3@VERT/KD3NET to Mro on Friday, May 15, 2015 20:45:01
    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm


    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?


    googlebot is an asshole!


    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From Lord Time@VERT/TIME to Mro on Friday, May 15, 2015 22:14:35
    tested to see if anything works yet, I can get into it myself, I do think my firewall opened correctly, but my router is weird, I do not NAT is working because all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to login using the nwga_net.synchro.net name from the local network It goes straight to my router setup, both on port 23 and 80.

    thats why my bbs telnet port is on 24 and my bbs http is on port 81

    you guys could always get a router that isnt a POS

    I can't, it's a dsl modem with a router (from the isp - tds)


    ---

    þ Synchronet þ Time Warp of the Future BBS - Home of League 10 IBBS Games
  • From Accession@VERT/PHARCYDE to Lord Time on Saturday, May 16, 2015 06:52:16
    Hello Lord,

    On 15 May 15 22:14, Lord Time wrote to Mro:

    I can't, it'd a dsl modem with a router (from the isp - tds)

    Sure you can. Have a service tech come out and bypass the router on that modem (ie: put it in "bridged" mode). Then you can use whatever router you fancy.

    Regards,
    Nick

    --- GoldED+/LNX 1.1.5-b20130910
    * Origin: thePharcyde_ telnet://bbs.pharcyde.org (Wisconsin) (723:1/701)
    þ Synchronet þ thePharcyde_ telnet://bbs.pharcyde.org (Wisconsin)
  • From Poindexter Fortran@VERT/REALITY to KenDB3 on Saturday, May 16, 2015 09:30:08
    Re: Re: Welcome back
    By: KenDB3 to Mickey on Fri May 15 2015 08:42 pm

    So, you're saying I should take down my anarchy related junk? LOL


    I think I still have the E-911 document from 1991 on my system, and some telco box instructions.

    145BOXES.ZIP [ 0] This is the ultimate box collection, not that other on
    35BOXES.ZIP [ 0] How To Make 35 Boxes
    ACRYLIC.ZIP [ 0] Acrylic box plans.
    AQUA.ZIP [ 0] Aqua Boxing
    BDIAL52.ZIP [ 0] Software Bluebox.
    BEIGE.ARJ [ 0] Beige Boxing for phun
    BLACKBOX.ZIP [ 0] More Black Box info
    BLAST.ARJ [ 0] Blast Box. Stupid kiddie file
    BLKBOX.ZIP [ 0] BLACK BOX SCHEMATICS
    BLOTTO.ARJ [ 0] Blotto Box instructions
    BLUE.ARJ [ 0] Blue Box information
    BLUETIP.ZIP [ 0] blue boxes and tips
    BOX.ZIP [ 0] 5 boxes for the general use
    BOXES.ZIP [ 0] All the Boxes one would ever want
    BOXES2.TXT [ 0] Second Of, Build Various Boxes.
    BOXFREQS.ZIP [ 0] bunch a different box frequencys
    BOXPLANS.ZIP [ 0] planx for many kinds of boxes
    BOXTEXT.ZIP [ 0] All The Boxes You'll Ever Need.
    BOXTONE.EXE [ 0] Just as the names says... Requires Soundblaster.
    BROWN.BOX [ 0] Brown Box
    BUD.BOX [ 0] Bud Box
    C5BOX.ZIP [ 0] learn how to make a unique telephone (phreaking) box
    CANNING.TXT [ 0] How To Go Canning And Build Beige Boxes
    CATAD.TXT G CELLULAR ENCODER BOX
    CAVERN.BOX [ 0] how to make a cavern box. to bybass cbv.
    DAY-GLO.ARJ [ 0] day-glo box? Come on, who thinks of these!
    GOLD.ARJ [ 0] Gold Box
    GREEN.ARJ [ 0] Green Box information
    LUNCH.BOX [ 0] Lunch Box
    MAUVE.ARJ [ 0] Mauve Box information
    NEON.ARJ [ 0] Neon Box
    OLIVE.ARJ [ 0] Olive Box information
    PHONINFO.ZIP [ 0] Info On Payphones, And Colored Boxes
    QUARTER.VOC [ 0] REDBOX TONE OF A QUARTER. Works great in USA!
    RED.ARJ [ 0] Red Box Box red
    SCARLET.ARJ [ 0] Scarlet box - call Hester Prynne
    SILVER.ARJ [ 0] Silver box from hell
    SLUG.ZIP [ 0] The Slug Box!
    TAN.ARJ [ 0] Tan Box
    TRON.BOX [ 0] Tron Box
    ULBOX21.ZIP [ 0] The Ultimate box collection
    WHITE.ARJ [ 0] White Box information


    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
  • From Poindexter Fortran@VERT/REALITY to KenDB3 on Saturday, May 16, 2015 09:31:06
    Re: Re: Welcome back
    By: KenDB3 to Mro on Fri May 15 2015 08:45 pm

    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.

    There should be a flag in ROBOTS.TXT that says "if you're going to ignore this, please have the decency of only using XX number of threads"

    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
  • From Poindexter Fortran@VERT/REALITY to Lord Time on Saturday, May 16, 2015 09:33:28
    Re: Re: Welcome back
    By: Lord Time to Mro on Fri May 15 2015 10:14 pm

    I can't, it'd a dsl modem with a router (from the isp - tds)

    You may be able to get the telco to switch it into Bridge mode, and disable their router. Then, you can use whatever router you want to firewall off your LAN.

    I had to do that with Comcast; I don't think an ISP has a right to put something on the inside of my LAN.

    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
  • From kk4qbn@VERT/NWGA_NET to Mro on Saturday, May 16, 2015 12:54:09
    Re: Re: Welcome back
    By: Mro to Lord Time on Fri May 15 2015 04:13 pm

    you guys could always get a router that isnt a POS

    some people may not be able to get up and spend 150.00 dollars on a router so quickly, until then you just have to make do with what you have.




    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From mark lewis@VERT to KenDB3 on Saturday, May 16, 2015 11:57:35
    On Fri, 15 May 2015, KenDB3 wrote to Mro:

    I had to specifically block Googlebot, it was a constant
    onslaught of connections every other minute for days on end. Does
    anyone know why Googlebot just keeps crawling the FTP site
    endlessly?

    googlebot is an asshole!

    I can certainly agree to that statement! It was just frikken
    relentless. I figured it would eventually give up but days later it
    was still going.

    i don't understand why you guys have such a problem with googlebot... it crawls
    my web sites and ftp server with no problems... sure, at one point it may have been running a complete crawl but once that was done, it was quite well behaved
    and still is...

    another factor is links... if there's more than one link to a file, it will attempt to track all of them... i remember working on one site that had some sort of dynamic linking thing to all their files and pages... it made it seem that there were hundreds of pages with all the same content... all of the bots were ravenous on that site and the owners were complaining that they had no human visitors because of the bots... they blocked the bots and still didn't have any human visitors... why? because they weren't in the indexes... i went and ripped out that linking code and set plain static links to their content... then we allowed the bots back in and indexing the site was done in a very short time... much shorter than had previously been seen... the humans followed after that... they still don't understand the problem that linking code caused... i mean they see the problem and know what it was but they don't understand how it was detrimental to them...

    another thing folks can do is to set access timings in robots.txt for the various bots that recognize them... set "Crawl-delay: 300" for 5 minutes between accesses... i don't find anything specific in any of my robots.txt for googlebot, though... it may recognise it but you'll need to go look that up on the googlebot site to see for sure...

    and yes, placing a robots.txt in your ftp root works... at least for googlebot... it regularly pulls mine and follows the instructions conveyed when
    they are merged into the index... it may take a few days or weeks but it does start following after the file has been pulled and added into the main index...
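
    A small sketch of the Crawl-delay idea described above, using Python's standard urllib.robotparser to show how a crawler that honors the directive would read it. The robots.txt content is illustrative only: the 300 second delay comes from the post, while the Disallow path and the "ExampleBot" name are made up for the example. Crawl-delay is a non-standard extension, so whether a particular bot obeys it has to be checked in that bot's own documentation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt. "Crawl-delay: 300" is the value from the post;
# the Disallow rule is a hypothetical addition for demonstration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 300
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up user agent that falls under the "*" entry.
delay = parser.crawl_delay("ExampleBot")
allowed = parser.can_fetch("ExampleBot", "/files/index.html")
blocked = not parser.can_fetch("ExampleBot", "/private/secret.txt")

print("seconds between requests:", delay)
print("may fetch /files/index.html:", allowed)
print("refused /private/secret.txt:", blocked)
```

    A well-behaved crawler would wait the returned number of seconds between requests; nothing in robots.txt can force a crawler that ignores the file to do so.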

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From mark lewis@VERT to Lord Time on Saturday, May 16, 2015 12:09:08
    On Fri, 15 May 2015, Lord Time wrote to Mro:

    thats why my bbs telnet port is on 24 and my bbs http is on port
    81

    you guys could always get a router that isnt a POS

    I can't, it'd a dsl modem with a router (from the isp - tds)

    you should be able to... my connection is DSL... i run my modem in bridge mode... the WAN IP lands on my real firewall which is a dedicated computer running a FOSS firewall package i downloaded... if your modem doesn't allow for
    such, then buy your own and put it in place... generally speaking, this is very
    easy to do and get operational...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Poindexter Fortran@VERT/REALITY to kk4qbn on Saturday, May 16, 2015 11:29:02
    Re: Re: Welcome back
    By: kk4qbn to Mro on Sat May 16 2015 12:54 pm

    some people may not be able to get up and spend 150.00 dollars on a router so quickly, until then you just have to make do with what you have.


    I just got my favorite router back from my parents' house - a Belkin F5D7230. Took a little work getting DD-WRT on it, but once I did it ended up being a dependable little router. I only replaced it for them after 5 years because they needed something with a little more range, and I had a WRT54G with high-gain antennas I wasn't using.

    The best thing about that little Belkin, besides running a killer OS and being rock-solid? The price. $20 with a $20 mail in rebate. :)

    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
  • From mark lewis@VERT to kk4qbn on Saturday, May 16, 2015 16:48:37
    On Sat, 16 May 2015, kk4qbn wrote to Mro:

    you guys could always get a router that isnt a POS

    some people may not be able to get up and spend 150.00 dollars on a
    router so quickly, until then you just have to make do with what
    you have.

    what $150?? my firewall router came out of the dumpster... free for the taking... PIII 800MHz with 768M RAM, a 20Gig HD and four NICs...

    today's throwaways are PIV 3Ghz with 2-4Gig of RAM and 500Gig HDs... the firewall router software i use has more than enough room in 20Gig so all the rest is wasted...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Joe Delahaye@VERT to Lord Time on Saturday, May 16, 2015 17:53:11
    Re: Re: Welcome back
    By: Lord Time to Mro on Fri May 15 2015 22:14:35

    you guys could always get a router that isnt a POS

    I can't, it'd a dsl modem with a router (from the isp - tds)

    I had one of those as well. I set it in passthru mode, and just used the modem portion. Still doing the same with the modem I purchased when I switched providers.
    --- SBBSecho 2.27-Win32
    * Origin: The Lions Den BBS, Trenton, On, CDN (1:249/303)
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From kk4qbn@VERT/NWGA_NET to Poindexter Fortran on Saturday, May 16, 2015 18:19:30
    Re: Re: Welcome back
    By: Poindexter Fortran to kk4qbn on Sat May 16 2015 11:29 am

    The best thing about that little Belkin, besides running a killer OS and being rock-solid? The price. $20 with a $20 mail in rebate. :)

    Nice, you cannot beat a free router :)



    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From Mro@VERT/BBSESINF to KenDB3 on Saturday, May 16, 2015 20:18:38
    Re: Re: Welcome back
    By: KenDB3 to Mro on Fri May 15 2015 08:45 pm



    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.



    it doesnt stop and it doesnt ALWAYS follow robots.txt, and other methods you can try to block it from http and ftp. best thing to do is block the spiders
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From Mro@VERT/BBSESINF to Lord Time on Saturday, May 16, 2015 20:19:15
    Re: Re: Welcome back
    By: Lord Time to Mro on Fri May 15 2015 10:14 pm


    you guys could always get a router that isnt a POS

    I can't, it'd a dsl modem with a router (from the isp - tds)


    most isps allow you to buy your own dsl modems and cable modems.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From Mro@VERT/BBSESINF to kk4qbn on Saturday, May 16, 2015 20:20:42
    Re: Re: Welcome back
    By: kk4qbn to Mro on Sat May 16 2015 12:54 pm

    Re: Re: Welcome back
    By: Mro to Lord Time on Fri May 15 2015 04:13 pm

    you guys could always get a router that isnt a POS

    some people may not be able to get up and spend 150.00 dollars on a router so quickly, until then you just have to make do with what you have.


    i have a 150+ dollar router, but it's a refurb so i bought it for much less. it's been running great for about 7 years. so that 80 bucks or so was well spent.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From Mro@VERT/BBSESINF to mark lewis on Saturday, May 16, 2015 20:22:00
    Re: Welcome back
    By: mark lewis to KenDB3 on Sat May 16 2015 11:57 am

    i don't understand why you guys have such a problem with googlebot... it crawls my web sites and ftp server with no problems... sure, at one point
    it may have been running a complete crawl but once that was done, it was quite well behaved and still is...


    it shouldnt even be crawling your ftp server, though.

    some people have no issues. other people are just raped non stop by googlebot. and i've logged onto them via vnc or teamviewer and seen this.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From KenDB3@VERT/KD3NET to Poindexter Fortran on Saturday, May 16, 2015 21:34:23
    Re: Re: Welcome back
    By: KenDB3 to Mickey on Fri May 15 2015 08:42 pm

    So, you're saying I should take down my anarchy related junk? LOL


    I think I still have the E-911 document from 1991 on my system, and some telco box instructions.


    Hah! That's awesome. I have some files on DVD somewhere, but I honestly was just joking since I don't have any up on my file base at all. You've got a great collection there!

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From KenDB3@VERT/KD3NET to Poindexter Fortran on Saturday, May 16, 2015 21:35:01
    Re: Re: Welcome back
    By: KenDB3 to Mro on Fri May 15 2015 08:45 pm

    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.

    There should be a flag in ROBOTS.TXT that says "if you're going to ignore this, please have the decency of only using XX number of threads"

    That would be rather nice.

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From KenDB3@VERT/KD3NET to mark lewis on Saturday, May 16, 2015 22:38:43
    On Fri, 15 May 2015, KenDB3 wrote to Mro:

    I had to specifically block Googlebot, it was a constant
    onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    googlebot is an asshole!

    I can certainly agree to that statement! It was just frikken
    relentless. I figured it would eventually give up but days later it was still going.

    i don't understand why you guys have such a problem with googlebot... it crawls my web sites and ftp server with no problems... sure, at one point it may have been running a complete crawl but once that was done, it was quite well behaved and still is...


    I don't really know about other folks, but my problem was that I had 8 whopping files in the file base at the time, and now I have 9, so either way, not much to crawl. And even though there wasn't much content, the crawl went on for *months* before I finally blocked it.

    I wouldn't have even bothered, except that it eventually slowed down browsing of the http site, and it slowed down the terminal access as well and I had noticeable delays logging in, reading messages, launching doors, etc.... But, when I turned off the FTP, all of those delays went away. Admittedly, my sbbs runs on an older machine running XP, mainly because I don't need (or want to spend any money on) any big hardware upgrades, because most of the time it's perfectly fine for what I need.

    I assure you, I'm not complaining because I hated seeing the traffic, I'm complaining because I really didn't want to block it, but had to because I didn't know how long it was going to keep it up.

    another factor is links... if there's more than one link to a file, it will attempt to track all of them... i remember working on one site that had some sort of dynamic linking thing to all their files and pages... it made it seem that there were hundreds of pages with all the same content... all of the bots were ravenous on that site and the owners were complaining that they had no human visitors because of the bots... they blocked the bots and still didn't have a human visitors... why? because they weren't in the indexes... i went and ripped out that linking code and set plain static links to their content... then we allowed the bots back in and indexing the site was done in a very short time... much shorter than previously had been being seen... the humans followed after that... they still don't understand the problem that linking code caused... i mean they see the problem and know what it was but they don't understand how it was detrimental to them...


    I wonder if the way sbbs adds some random text after 00index.html has anything to do with the way googlebot acts. I remember reading somewhere that there was a purpose to the randomized text, but can't remember the intended purpose.

    another thing folks can do is to set access timings in robots.txt for the various bots that recognize them... set "Crawl-delay: 300" for 5 minutes between accesses... i don't find anything specific in any of my robots.txt for googlebot, though... it may recognise it but you'll need to go look that up on the googlebot site to see for sure...


    I didn't know that. I wonder if it would have helped or not. My system was bogged down I think because googlebot was hitting the FTP what looked like every 2 to 4 minutes (give or take).


    00:00:21 1684 CTRL connection accepted from: 66.249.73.128 port 54041
    00:00:21 1684 Hostname: crawl-66-249-73-128.googlebot.com
    00:00:22 1684 Guest: <googlebot@google.com>
    00:00:22 1684 Guest logged in (1 today, 63413 total)
    00:00:22 1684 Guest downloading HTML index for / in passive mode
    00:00:22 1684 Transfer successful: 3621 bytes sent in 0 seconds (7242 cps)
    00:00:22 1684 Guest logged off
    00:00:22 1684 CTRL thread terminated (0 clients and 1 threads remain, 872 served)
    00:03:02 1736 CTRL connection accepted from: 66.249.73.128 port 54480
    00:03:02 1736 Hostname: crawl-66-249-73-128.googlebot.com
    00:03:02 1736 Guest: <googlebot@google.com>
    00:03:02 1736 Guest logged in (2 today, 63414 total)
    00:03:03 1736 Guest downloading HTML index for / in passive mode
    00:03:03 1736 Transfer successful: 3621 bytes sent in 0 seconds (7242 cps)
    00:03:03 1736 Guest logged off
    00:03:03 1736 CTRL thread terminated (0 clients and 1 threads remain, 873 served)
    etc...
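    To put a number on "every 2 to 4 minutes", the gaps between accepted connections can be pulled out of a log like the one above. This is a rough sketch: the regular expression is only inferred from the snippet shown here, and it assumes all lines fall within a single day, since the log lines carry no date.

```python
import re
from datetime import datetime

# A few lines copied from the log excerpt above.
SAMPLE_LOG = """\
00:00:21 1684 CTRL connection accepted from: 66.249.73.128 port 54041
00:00:21 1684 Hostname: crawl-66-249-73-128.googlebot.com
00:00:22 1684 Guest logged off
00:03:02 1736 CTRL connection accepted from: 66.249.73.128 port 54480
00:03:02 1736 Hostname: crawl-66-249-73-128.googlebot.com
00:03:03 1736 Guest logged off
"""

# Pattern inferred from the excerpt: time, socket number, then the accept text.
ACCEPT = re.compile(
    r"^(\d\d:\d\d:\d\d) \d+ CTRL connection accepted from: (\S+) port \d+"
)


def connection_gaps(log_text: str) -> list[int]:
    """Return the seconds between consecutive accepted connections."""
    times = []
    for line in log_text.splitlines():
        match = ACCEPT.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = connection_gaps(SAMPLE_LOG)
print(gaps)  # [161], i.e. 2 minutes 41 seconds between the two visits
```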

    and yes, placing a robots.txt in your ftp root works... at least for googlebot... it regularly pulls mine and follows the instructions conveyed when they are merged into the index... it may take a few days or weeks but it does start following after the file has been pulled and added into the main index...

    Well, food for thought. My BBS has sped up since I blocked it, and it's not like a company web site where I would certainly *want* to be crawled, so I'm pretty happy with my decision. But, I could always try it and open the flood gates again (but again, I'm happier now, so I probably won't lol).

    I would really just love to know why it started on 9/12/2014 (5 times that particular day, with spread out intervals), and then the very next day it crawled 479 times (short intervals). And then about the same amount every day afterwards until 4/14/2015 when I put the block up. It crawled my 8 files for 7 months? Really? Really Really? What caused the loop? And the even better question, why did googlebot not detect a loop?
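    A quick back-of-the-envelope check of those figures, as a sketch only. It assumes the 479-per-day rate held flat from 9/13/2014 until the block on 4/14/2015, which the message only describes as "about the same amount every day".

```python
from datetime import date

CRAWLS_PER_DAY = 479                 # figure quoted for the day after the first visit
FIRST_FULL_DAY = date(2014, 9, 13)   # the day the heavy crawling began
BLOCK_DATE = date(2015, 4, 14)       # the day the block went up

# Average spacing if 479 visits are spread evenly over 24 hours.
minutes_between = 24 * 60 / CRAWLS_PER_DAY

# Assumption: the rate stayed flat for the whole period.
days_crawled = (BLOCK_DATE - FIRST_FULL_DAY).days
estimated_total = CRAWLS_PER_DAY * days_crawled

print(f"{minutes_between:.2f} minutes between visits")
print(f"{days_crawled} days, roughly {estimated_total} connections")
```

    The average spacing works out to about three minutes, which lines up with the "every 2 to 4 minutes" seen in the log.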

    ~KenDB3
    Note: It still tries to connect to this day, but I don't see performance issues when I am blocking it:

    00:02:53 1440 CTRL connection accepted from: 66.249.79.42 port 64042
    00:02:53 1440 Hostname: crawl-66-249-79-42.googlebot.com
    00:02:54 1440 !BLOCKED e-mail address: googlebot@google.com
    00:02:59 1440 CTRL thread terminated (0 clients and 1 threads remain, 468 served)
    00:03:57 1596 CTRL connection accepted from: 66.249.79.55 port 52885
    00:03:57 1596 Hostname: crawl-66-249-79-55.googlebot.com
    00:03:58 1596 !BLOCKED e-mail address: googlebot@google.com
    00:04:03 1596 CTRL thread terminated (0 clients and 1 threads remain, 469 served)

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From mark lewis@VERT to Mro on Sunday, May 17, 2015 12:56:03
    On Sat, 16 May 2015, Mro wrote to mark lewis:

    i don't understand why you guys have such a problem with
    googlebot... it crawls my web sites and ftp server with no
    problems... sure, at one point it may have been running a complete
    crawl but once that was done, it was quite well behaved and still
    is...

    it shouldnt even be crawling your ftp server, though.

    why not??

    some people have no issues. other people are just raped non stop by googlebot. and i've logged onto them via vnc or teamviewer and seen
    this.

    then there's a problem somewhere... it is well known, too, that there are spiders out there that say they are googlebot but are not... many of those do not play nice...
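    One common way to tell a real Googlebot from an impostor is a reverse DNS lookup on the connecting IP followed by a forward lookup on the resulting hostname. The sketch below uses Python's socket module; the function name and the injectable resolver parameters are hypothetical conveniences so the logic can be exercised without network access, and the stub resolvers in the demo return canned answers rather than real DNS data.

```python
import socket

# Hostname suffixes treated as Google-owned for this check.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then confirm forward DNS."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record, so nothing to verify
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The hostname must map back to the same address that connected.
    return ip in addresses


# Demo with stub resolvers (canned data, no network traffic).
def stub_reverse(ip):
    return ("crawl-66-249-73-128.googlebot.com", [], [ip])


def stub_forward(hostname):
    return (hostname, [], ["66.249.73.128"])


def impostor_reverse(ip):
    return ("bot.example.net", [], [ip])


genuine = is_verified_googlebot("66.249.73.128", stub_reverse, stub_forward)
impostor = is_verified_googlebot("203.0.113.5", impostor_reverse, stub_forward)
print(genuine, impostor)
```

    Called with just an IP address, the function falls back to the real resolver and will need working DNS.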

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From mark lewis@VERT to KenDB3 on Sunday, May 17, 2015 12:58:32
    On Sat, 16 May 2015, KenDB3 wrote to mark lewis:

    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later
    it was still going.

    i don't understand why you guys have such a problem with
    googlebot... it crawls my web sites and ftp server with no
    problems... sure, at one point it may have been running a complete
    crawl but once that was done, it was quite well behaved and still
    is...

    I don't really know about other folks, but my problem was that I
    had 8 whopping files in the file base at the time, and now I have
    9, so either way, not much to crawl. And even though there wasn't
    much content, the crawl went on for *months* before I finally
    blocked it.

    i don't see the problem... googlebot visits my site every day... if i had only 10 files, i would expect it to still drop by every day... it only pulls one file at each visit and then comes back some minutes later for another one...

    I wouldn't have even bothered, except that it eventually slowed
    down browsing of the http site, and it slowed down the terminal
    access as well and I had noticeable delays logging in, reading
    messages, launching doors, etc.... But, when I turned off the FTP,
    all of those delays went away. Admittedly, my sbbs runs on an older machine running XP, mainly because I don't need (or want to spend
    any money on) any big hardware upgrades, because most of the time
    it's perfectly fine for what I need.

    nothing wrong with that... all i was saying was that once a site is indexed then the visits are pretty much just spot checks from there on until the next indexing run comes along...

    I assure you, I'm not complaining because I hated seeing the
    traffic, I'm complaining because I really didn't want to block it,
    but had to because I didn't know how long it was going to keep it
    up.

    i can understand that... i'm not complaining or berating, either... granted, when my system was on the PII200mhz that it ran on forever, yeah, things could get slow at times... i used to have googlebot, yahoobot and that old micro$oftbot thing crawling my sites all at the same time when i was dialup... talk about slow access ;) then we got 3M/768K DSL in our area and things are somewhat better...

    another factor is links... if there's more than one link to a file,
    it will attempt to track all of them... i remember working on one
    site that had some sort of dynamic linking thing to all their files
    and pages... it made it seem that there were hundreds of pages with
    all the same content... all of the bots were ravenous on that site
    and the owners were complaining that they had no human visitors
    because of the bots... they blocked the bots and still didn't have any
    human visitors... why? because they weren't in the indexes... i went
    and ripped out that linking code and set plain static links to their content... then we allowed the bots back in and indexing the site
    was done in a very short time... much shorter than previously had
    been being seen... the humans followed after that... they still
    don't understand the problem that linking code caused... i mean they
    see the problem and know what it was but they don't understand how
    it was detrimental to them...

    I wonder if the way sbbs adds some random text after 00index.html
    has anything to do with the way googlebot acts. I remember reading somewhere that there was a purpose to the randomized text, but
    can't remember the intended purpose.

    it might have something to do with it... especially if it is serialization text... depending, it could make it appear that every link has changed the next time the bot comes by so it goes hunting all the new links again... there is a way to handle that but i forget, right now, what it is... i know that on many forums and such there's special clean non-themed pages given to the bots so they can get the meat of the site without all the frilly veggies...

    another thing folks can do is to set access timings in robots.txt
    for the various bots that recognize them... set "Crawl-delay: 300"
    for 5 minutes between accesses... i don't find anything specific in
    any of my robots.txt for googlebot, though... it may recognise it
    but you'll need to go look that up on the googlebot site to see for
    sure...

    I didn't know that. I wonder if it would have helped or not. My
    system was bogged down I think because googlebot was hitting the
    FTP what looked like every 2 to 4 minutes (give or take).

    yeah, that's about average for over here... but then i have a lot more files in a lot more directories... plus i'm speaking of *my* sites which are all hosted on apache (OS/2) and my ftp server is peter moylan's ftpserve...

    the IPs look valid for google's bot range in your log snippet, too...

    and yes, placing a robots.txt in your ftp root works... at least for googlebot... it regularly pulls mine and follows the instructions
    conveyed when they are merged into the index... it may take a few
    days or weeks but it does start following after the file has been
    pulled and added into the main index...

    Well, food for thought. My BBS has sped up since I blocked it, and
    it's not like a company web site where I would certainly *want* to
    be crawled, so I'm pretty happy with my decision. But, I could
    always try it and open the flood gates again (but again, I'm
    happier now, so I probably won't lol).

    something else you might want to do, if you decide to open your site back up to
    google, is to drop by their site admin section and set up a management account... you'll have to place a file in your web root and ftp root for it to find for verification that you do control the site... then you can look at the stats and see what the URLs with errors are...

    I would really just love to know why it started on 9/12/2014 (5
    times that particular day, with spread out intervals), and then the
    very next day it crawled 479 times (short intervals). And then
    about the same amount every day afterwards until 4/14/2015 when I
    put the block up. It crawled my 8 files for 7 months? Really?
    Really Really? What caused the loop? And the even better question,
    why did googlebot not detect a loop?

    this really sounds like the varied text strings on the URLs... that and maybe an index crawl instead of just verification spidering...

    another place you might want to drop by is webmasterworld.com... i've had an account there for years but haven't been by in a while... they taught me a lot about the bots as well as css stuff when i made the move to go from hardcoded html to css and fancy up the place a bit... but i'm about as much a painter as i am an astronaut so it ain't all that pretty anyway :lol: but seriously... webmasterworld.com and check out their google forum... they even have a section
    on Google SEO...

    http://www.webmasterworld.com/home.htm

    the forum is completely custom written and takes a little getting used to but the information provided is excellent...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Mro@VERT/BBSESINF to mark lewis on Monday, May 18, 2015 22:15:02
    Re: Welcome back
    By: mark lewis to Mro on Sun May 17 2015 12:56 pm

    it shouldnt even be crawling your ftp server, though.

    why not??


    because by definition googlebot is a web crawling bot that discovers "pages"

    it shouldnt be on people's ftp servers. it shouldnt be using the ftp protocol.

    then there's a problem somewhere... it is well known, too, that there are spiders out there that say they are googlebot but are not... many of those
    do not play nice...


    i never go by what the bot reports. i look at the ip addresses. fact is, googlebot and other spiders dont always obey what we ask of them.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From mark lewis@VERT to Mro on Tuesday, May 19, 2015 14:52:45
    On Mon, 18 May 2015, Mro wrote to mark lewis:

    it shouldnt even be crawling your ftp server, though.

    why not??

    because by definition googlebot is a web crawling bot that discovers "pages"

    you should tell google that, then...

    it shouldnt be on people's ftp servers. it shouldnt be using the ftp protocol.

    spiders and bots follow links... if a link is using the ftp protocol, they follow it just as well and easily as following a http link...

    then there's a problem somewhere... it is well known, too, that there are spiders out there that say they are googlebot but are not... many of those
    do not play nice...

    i never go by what the bot reports. i look at the ip addresses.
    fact is, googlebot and other spiders dont always obey what we ask of
    them.

    it takes time... that's what folks don't understand... they expect that when they see GB getting their robots.txt every day that they can make a change, the
    bot will grab it and immediately start following it... the bot cannot and does not parse the contents of the robots.txt file... it delivers those contents back to the database for parsing and indexing... when the master database is updated, then and only then can the new instructions in the site's robots.txt be applied and followed...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Digital Man@VERT to KenDB3 on Tuesday, May 19, 2015 16:52:23
    Re: Re: Welcome back
    By: KenDB3 to Mro on Fri May 15 2015 08:45 pm

    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm


    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?


    googlebot is an asshole!


    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.

    There was a recent fix to the ftp-html.js and ftp-web-html.js scripts in CVS (http://cvs.synchro.net/cgi-bin/viewcvs.cgi/exec/) which should help stop Googlebot from forever-indexing Synchronet FTP servers which use these scripts to dynamically generate the 00index.html files.

    digital man

    Synchronet "Real Fact" #20:
    The first commercial sale of Synchronet was to Las Vegas Playground BBS (1992).
    Norco, CA WX: 69.2°F, 55.0% humidity, 15 mph ESE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Digital Man@VERT to Lord Time on Tuesday, May 19, 2015 16:55:40
    Re: Re: Welcome back
    By: Lord Time to Mro on Fri May 15 2015 10:14 pm

    tested to see if anything works yet, I can get into it myself, I do think my firewall opened correctly, but my router is weird, I do not NAT is working because all the hack attempts I'm getting via telnet (TONS compared to the way it used to be), but if I try to login using the nwga_net.synchro.net name from the local network It goes straight to my router setup, both on port 23 and 80.

    thats why my bbs telnet port is on 24 and my bbs http is on port 81

    you guys could always get a router that isnt a POS

    I can't, it's a dsl modem with a router (from the isp - tds)

    The router can usually be defeated/bypassed in those devices so you can use another router. Look into enabling 'bridge mode' or something similar.

    digital man

    Synchronet "Real Fact" #37:
    Synchronet first supported Windows NT-based operating systems w/v3.00b (2000).
    Norco, CA WX: 69.2°F, 55.0% humidity, 15 mph ESE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Digital Man@VERT to KenDB3 on Tuesday, May 19, 2015 17:13:09
    Re: Re: Welcome back
    By: KenDB3 to mark lewis on Sat May 16 2015 10:38 pm

    On Fri, 15 May 2015, KenDB3 wrote to Mro:

    I had to specifically block Googlebot, it was a constant
    onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    googlebot is an asshole!

    I can certainly agree to that statement! It was just frikken relentless. I figured it would eventually give up but days later it was still going.

    i don't understand why you guys have such a problem with googlebot... it crawls my web sites and ftp server with no problems... sure, at one point it may have been running a complete crawl but once that was done, it was quite well behaved and still is...


    I don't really know about other folks, but my problem was that I had 8 whopping files in the file base at the time, and now I have 9, so either way, not much to crawl. And even though there wasn't much content, the crawl went on for *months* before I finally blocked it.

    I wouldn't have even bothered, except that it eventually slowed down browsing of the http site, and it slowed down the terminal access as well and I had noticeable delays logging in, reading messages, launching doors, etc.... But, when I turned off the FTP, all of those delays went away. Admittedly, my sbbs runs on an older machine running XP, mainly because I don't need (or want to spend any money on) any big hardware upgrades, because most of the time it's perfectly fine for what I need.

    I assure you, I'm not complaining because I hated seeing the traffic, I'm complaining because I really didn't want to block it, but had to because I didn't know how long it was going to keep it up.

    another factor is links... if there's more than one link to a file, it will attempt to track all of them... i remember working on one site that had some sort of dynamic linking thing to all their files and pages... it made it seem that there were hundreds of pages with all the same content... all of the bots were ravenous on that site and the owners were complaining that they had no human visitors because of the bots... they blocked the bots and still didn't have any human visitors... why? because they weren't in the indexes... i went and ripped out that linking code and set plain static links to their content... then we allowed the bots back in and indexing the site was done in a very short time... much shorter than had previously been seen... the humans followed after that... they still don't understand the problem that linking code caused... i mean they see the problem and know what it was but they don't understand how it was detrimental to them...


    I wonder if the way sbbs adds some random text after 00index.html has anything to do with the way googlebot acts.

    Yes, it does.

    I remember reading somewhere
    that there was a purpose to the randomized text, but can't remember the intended purpose.

    It's to defeat browser caching which caused the sorting options to not work.

    another thing folks can do is to set access timings in robots.txt for the various bots that recognize them... set "Crawl-delay: 300" for 5 minutes between accesses... i don't find anything specific in any of my robots.txt for googlebot, though... it may recognise it but you'll need to go look that up on the googlebot site to see for sure...


    I didn't know that. I wonder if it would have helped or not. My system was bogged down I think because googlebot was hitting the FTP what looked like every 2 to 4 minutes (give or take).


    00:00:21 1684 CTRL connection accepted from: 66.249.73.128 port 54041
    00:00:21 1684 Hostname: crawl-66-249-73-128.googlebot.com
    00:00:22 1684 Guest: <googlebot@google.com>
    00:00:22 1684 Guest logged in (1 today, 63413 total)
    00:00:22 1684 Guest downloading HTML index for / in passive mode
    00:00:22 1684 Transfer successful: 3621 bytes sent in 0 seconds (7242 cps)
    00:00:22 1684 Guest logged off
    00:00:22 1684 CTRL thread terminated (0 clients and 1 threads remain, 872 served)
    00:03:02 1736 CTRL connection accepted from: 66.249.73.128 port 54480
    00:03:02 1736 Hostname: crawl-66-249-73-128.googlebot.com
    00:03:02 1736 Guest: <googlebot@google.com>
    00:03:02 1736 Guest logged in (2 today, 63414 total)
    00:03:03 1736 Guest downloading HTML index for / in passive mode
    00:03:03 1736 Transfer successful: 3621 bytes sent in 0 seconds (7242 cps)
    00:03:03 1736 Guest logged off
    00:03:03 1736 CTRL thread terminated (0 clients and 1 threads remain, 873 served)
    etc...

    I resolved this for my FTP server by putting googlebot@google.com in my text/email.can file.

    and yes, placing a robots.txt in your ftp root works... at least for googlebot... it regularly pulls mine and follows the instructions conveyed when they are merged into the index... it may take a few days or weeks but it does start following after the file has been pulled and added into the main index...

    Just an FYI, FTP crawlers do not look for or adhere to any robots.txt files. That's for HTTP crawlers only.

    Well, food for thought. My BBS has sped up since I blocked it, and it's not like a company web site where I would certainly *want* to be crawled, so I'm pretty happy with my decision. But, I could always try it and open the flood gates again (but again, I'm happier now, so I probably won't lol).

    I would really just love to know why it started on 9/12/2014 (5 times that particular day, with spread out intervals), and then the very next day it crawled 479 times (short intervals). And then about the same amount every day afterwards until 4/14/2015 when I put the block up. It crawled my 8 files for 7 months? Really? Really Really? What caused the loop? And the even better question, why did googlebot not detect a loop?

    It had to do with the encoding of the random portion of the FTP URLs generated by ftp-html.js and ftp-web-html.js. Deuce recently fixed those issues to stop the infinite crawling by Google. Or at least, that's what I recall.
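    To illustrate why a random suffix can trap a crawler, here is a small sketch. It does not reproduce what ftp-html.js actually generated; the host name, the "?token" form of the suffix and both helper functions are assumptions made for the example. The point is only that a crawler keyed on the raw URL sees every link as a new page, while one that strips the suffix sees a single page.

```python
import random
import string
from urllib.parse import urlsplit, urlunsplit

rng = random.Random(0)  # seeded so the demo is repeatable


def index_link(base="ftp://bbs.example.invalid/00index.html"):
    """Hypothetical cache-busting link: same page, fresh random token."""
    alphabet = string.ascii_lowercase + string.digits
    token = "".join(rng.choices(alphabet, k=8))
    return f"{base}?{token}"


def strip_token(url):
    """Drop the query and fragment so identical pages compare equal."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Simulate the index being generated 100 times.
links = [index_link() for _ in range(100)]

raw_seen = set(links)                            # what a naive crawler queues
normalized_seen = {strip_token(u) for u in links}  # what actually exists

print(len(raw_seen), "distinct URLs queued")
print(len(normalized_seen), "distinct page(s) behind them")
```

    Every regenerated index hands the crawler another "new" URL for the same content, so the queue never empties on its own.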

    digital man

    Synchronet "Real Fact" #24:
    The Digital Dynamics company ceased day-to-day operations in late 1995.
    Norco, CA WX: 68.3°F, 56.0% humidity, 11 mph SE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Mro@VERT/BBSESINF to mark lewis on Tuesday, May 19, 2015 18:56:37
    Re: Welcome back
    By: mark lewis to Mro on Tue May 19 2015 02:52 pm


    because by definition googlebot is a web crawling bot that discovers "pages"

    you should tell google that, then...


    i did!

    it shouldnt be on people's ftp servers. it shouldnt be using the ftp protocol.

    spiders and bots follow links... if a link is using the ftp protocol, they follow it just as well and easily as following a http link...

    like i said, by definition it should not do that.

    it takes time... that's what folks don't understand... they expect that
    when they see GB getting their robots.txt every day that they can make a change, the bot will grab it and immediately start following it... the bot cannot and does not parse the contents of the robots.txt file... it
    delivers those contents back to the database for parsing and indexing... when the master database is updated, then and only then can the new instructions in the site's robots.txt be applied and followed...


    are you SURE, it behaves that way? i thought it first looked for the robots.txt rules, then followed them (but as i said it does not always do that). the method you described sounds pretty ass-backwards.

    i do know it phones home after several rejections. or it's supposed to.

    i'm pretty sure it reads the robots.txt file first thing.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From mark lewis@VERT to Digital Man on Wednesday, May 20, 2015 06:38:26
    On Tue, 19 May 2015, Digital Man wrote to KenDB3:

    and yes, placing a robots.txt in your ftp root works... at least
    for googlebot... it regularly pulls mine and follows the
    instructions conveyed when they are merged into the index... it
    may take a few days or weeks but it does start following after
    the file has been pulled and added into the main index...

    Just an FYI, FTP crawlers do not look for or adhere to any
    robots.txt files. That's for HTTP crawlers only.

    https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt

    Robots.txt Specifications
    Abstract
    Requirements Language
    Basic Definitions
    Applicability
    File location & range of validity
    2nd paragraph

    Google-specific: Google also accepts and follows robots.txt files
    for FTP sites. FTP-based robots.txt files are
    accessed via the FTP protocol using an anonymous
    login.

    HTH

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From mark lewis@VERT to Mro on Wednesday, May 20, 2015 06:45:58
    On Tue, 19 May 2015, Mro wrote to mark lewis:

    it takes time... that's what folks don't understand... they expect
    that when they see GB getting their robots.txt every day that they
    can make a change, the bot will grab it and immediately start
    following it... the bot cannot and does not parse the contents of
    the robots.txt file... it delivers those contents back to the
    database for parsing and indexing... when the master database is
    updated, then and only then can the new instructions in the site's robots.txt be applied and followed...

    are you SURE, it behaves that way?

    yes...

    i thought it first looked for the robots.txt rules, then followed
    them (but as i said it does not always do that).

    the bot has no means of parsing the contents... it takes its instructions from home... everything it scans is sent home for home to parse... then home will tell the bot where else to gather documents from based on that parsing...

    the method you described sounds pretty ass-backwards.

    that's the way most all bots i know of work... they are only data gathering tools... they don't parse... that's done back home during the database processing...

    i do know it phones home after several rejections. or it's supposed
    to.

    i'm pretty sure it reads the robots.txt file first thing.

    i have tons of logs here where googlebot doesn't even look for robots.txt for that day's visits... robots.txt is simply another file like an html document...
    robots.txt is parsed back home and used to direct the next round of crawling and data gathering...

    checkout webmasterworld.com's google area... they can tell ya more about it than i can...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From kk4qbn@VERT/NWGA_NET to Mro on Wednesday, May 20, 2015 12:13:10
    Re: Welcome back
    By: Mro to mark lewis on Sat May 16 2015 08:22 pm

    it shouldnt even be crawling your ftp server, though.

    some people have no issues. other people are just raped non stop by googlebot. and i've logged onto them via vnc or teamviewer and seen this.

    I've blocked the whole range of ip addresses for googlebot that Lord Time provided, which locked googlebot out, but it is still constantly hammering my ftp server. RELENTLESS!

    Hasn't even tried to crawl my webserver afaik.




    Best Regards,

    Tim

    ---
    þ Synchronet þ Northwest GA Network: nwga_net.synchro.net (706)422-9538
  • From Digital Man@VERT to mark lewis on Wednesday, May 20, 2015 14:55:14
    Re: Welcome back
    By: mark lewis to Digital Man on Wed May 20 2015 06:38 am


    On Tue, 19 May 2015, Digital Man wrote to KenDB3:

    and yes, placing a robots.txt in your ftp root works... at least
    for googlebot... it regularly pulls mine and follows the
    instructions conveyed when they are merged into the index... it
    may take a few days or weeks but it does start following after
    the file has been pulled and added into the main index...

    Just an FYI, FTP crawlers do not look for or adhere to any
    robots.txt files. That's for HTTP crawlers only.

    https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt

    Robots.txt Specifications
    Abstract
    Requirements Language
    Basic Definitions
    Applicability
    File location & range of validity
    2nd paragraph

    Google-specific: Google also accepts and follows robots.txt files
    for FTP sites. FTP-based robots.txt files are
    accessed via the FTP protocol using an anonymous
    login.

    Interesting. In my experiments, I never saw the google FTP crawler adhere to the file, so I just ended up blocking it based on the email address used for an
    anonymous-FTP password. Perhaps they added the support for robots.txt via FTP later.

    digital man

    Synchronet "Real Fact" #52:
    Synchronet Blackjack was the first multi-node/multi-user game for Synchronet.
    Norco, CA WX: 70.9°F, 55.0% humidity, 6 mph ESE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From mark lewis@VERT to Digital Man on Wednesday, May 20, 2015 22:49:01
    On Wed, 20 May 2015, Digital Man wrote to mark lewis:

    Google-specific: Google also accepts and follows robots.txt files
    for FTP sites. FTP-based robots.txt files are
    accessed via the FTP protocol using an anonymous
    login.

    Interesting. In my experiments, I never saw the google FTP crawler
    adhere to the file, so I just ended up blocking it based on the
    email address used for an anonymous-FTP password. Perhaps they
    added the support for robots.txt via FTP later.

    i don't know when it came about but i found out about it in the last two or three years when we were working on some issues in peter moylan's ftpserv for OS/2... he was adding a few features we needed (virtual base directories and being able to disallow the display of certain files, eg: htm*) and someone posted about this capability... i set it up and it has worked ever since... granted, it took some weeks to take effect but there's been no problems since then...

    )\/(ark


    * Origin: (1:3634/12)

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Mro@VERT/BBSESINF to kk4qbn on Friday, May 22, 2015 21:49:50
    Re: Welcome back
    By: kk4qbn to Mro on Wed May 20 2015 12:13 pm

    I've blocked the whole range of ip addresses for googlebot that Lord Time provided, which locked googlebot out, but it is still constantly hammering my ftp server. RELENTLESS!


    use peerblock [software firewall, sort of] and create a custom list with googlebot's ip addresses and it will block it so it can not tax your system.
    ---
    þ Synchronet þ ::: BBSES.info - free BBS services :::
  • From Deuce@VERT/SYNCNIX to KenDB3 on Saturday, May 23, 2015 20:52:01
    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm

    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    Yeah, I looked into it, and it was because of the random sequence appended by the index generation. The index has since been updated to not do that anymore, but Google will still try to crawl every random URL it has cached for a very long time (it's been many months since the fix, and my VPS is still getting over 100 queries per minute from Googlebot).

    ---
    http://DuckDuckGo.com/ a better search engine that respects your privacy.
    þ Synchronet þ My Brand-New BBS (All the cool SysOps run STOCK!)
  • From spacesst@VERT/SPACESST to Deuce on Sunday, May 24, 2015 07:35:10
    Re: Re: Welcome back
    By: Deuce to KenDB3 on Sat May 23 2015 20:52:01

    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm

    I had to specifically block Googlebot, it was a constant onslaught of
    connections every other minute for days on end. Does anyone know why
    Googlebot just keeps crawling the FTP site endlessly?

    Yeah, I looked into it, and it was because of the random sequence appended by the index generation. The index has been updated to not do that anymore since, but Google will still try to crawl every random URL it has cached for a very long time (it's been many months since the fix, and my VPS is still getting over 100 queries per minute from Googlebot).


    this is the peerblock block list i use to control googlebot and co.

    YAHOO! SLURP:8.12.144.0-8.12.144.255
    MSNBOT:64.4.0.0-64.4.63.255
    MSNBOT:65.52.0.0-65.55.255.255
    GoogleBot:66.102.0.0-66.102.15.255
    YAHOO! SLURP:66.196.64.0-66.196.127.255
    YAHOO! SLURP:66.228.160.0-66.228.191.255
    GoogleBot:66.233.160.0-66.233.191.255
    GoogleBot:66.249.64.0-66.249.95.255
    YAHOO! SLURP:67.195.0.0-67.195.255.255
    YAHOO! SLURP:68.142.192.0-68.142.255.255
    GoogleBot:72.14.192.0-72.14.255.255
    YAHOO! SLURP:72.30.0.0-72.30.255.255
    YAHOO! SLURP:74.6.0.0-74.6.255.255
    GoogleBot:74.125.0.0-74.125.255.255
    YAHOO! SLURP:98.136.0.0-98.139.255.255
    MSNBOT:131.253.21.0-131.253.47.255
    MSNBOT:157.54.0.0-157.60.255.255
    YAHOO! SLURP:202.160.176.0-202.160.191.255
    MSNBOT:207.46.0.0-207.46.255.255
    MSNBOT:207.68.128.0-207.68.207.255
    GoogleBot:209.85.128.0-209.85.255.255
    YAHOO! SLURP:209.191.64.0-209.191.127.255
    GoogleBot:216.239.32.0-216.239.63.255
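
    For anyone who wants to sanity-check a list like this before loading it, here is a rough Python sketch. It assumes every line has the "LABEL:start-end" layout shown above; the function names parse_blocklist and match are made up for illustration and are not part of PeerBlock. The three sample entries are copied from the list. The parser rejects a range whose start address is higher than its end address, which catches one kind of typo.

```python
import ipaddress

# Three entries copied from the list above, in "LABEL:start-end" layout.
SAMPLE = """\
GoogleBot:66.249.64.0-66.249.95.255
MSNBOT:65.52.0.0-65.55.255.255
YAHOO! SLURP:72.30.0.0-72.30.255.255
"""


def parse_blocklist(text):
    """Return a list of (label, first_ip, last_ip) tuples (hypothetical helper)."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so a label may itself contain colons.
        label, _, span = line.rpartition(":")
        start, _, end = span.partition("-")
        first = ipaddress.IPv4Address(start)
        last = ipaddress.IPv4Address(end)
        if first > last:
            raise ValueError(f"range is reversed: {line}")
        ranges.append((label, first, last))
    return ranges


def match(ranges, ip):
    """Return the label of the first range containing ip, or None."""
    addr = ipaddress.IPv4Address(ip)
    for label, first, last in ranges:
        if first <= addr <= last:
            return label
    return None


if __name__ == "__main__":
    ranges = parse_blocklist(SAMPLE)
    for ip in ("66.249.66.1", "65.54.1.1", "8.8.8.8"):
        print(ip, "->", match(ranges, ip))
```

    Running an address from the telnet or FTP log through match() shows which entry, if any, would catch it.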



    ... Cotton Incorporated Cotton The fabric of our lives

    ---
    þ Synchronet þ SpaceSST BBS
  • From KenDB3@VERT/KD3NET to Deuce on Sunday, May 24, 2015 22:42:49
    Re: Re: Welcome back
    By: Deuce to KenDB3 on Sat May 23 2015 08:52 pm

    Re: Re: Welcome back
    By: KenDB3 to kk4qbn on Thu May 14 2015 07:34 pm

    I had to specifically block Googlebot, it was a constant onslaught of connections every other minute for days on end. Does anyone know why Googlebot just keeps crawling the FTP site endlessly?

    Yeah, I looked into it, and it was because of the random sequence appended by the index generation. The index has been updated to not do that anymore since, but Google will still try to crawl every random URL it has cached for a very long time (it's been many months since the fix, and my VPS is still getting over 100 queries per minute from Googlebot).


    I kind of had a feeling. Thanks for the info! Was there a reason for the randomized appended text previously?

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us
  • From Poindexter Fortran@VERT/REALITY to Mro on Monday, May 25, 2015 08:52:30
    Re: Welcome back
    By: Mro to mark lewis on Mon May 18 2015 10:15 pm

    i never go by what the bot reports. i look at the ip addresses. fact is, googlebot and other spiders don't always obey what we ask of them.

    ...Yeah, and make sure your security levels don't allow anything that you want out of the search index to be browsed by guest.

    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
  • From Stephen Hurd@VERT to KenDB3 on Wednesday, May 27, 2015 18:08:55
    Re: Re: Welcome back
    By: KenDB3 to Deuce on Sun May 24 2015 10:42 pm

    Yeah, I looked into it, and it was because of the random sequence appended by the index generation. The index has been updated to not do that anymore since, but Google will still try to crawl every random URL it has cached for a very long time (it's been many months since the fix, and my VPS is still getting over 100 queries per minute from Googlebot).


    I kind of had a feeling. Thanks for the info! Was there a reason for the randomized appended text previously?

    The reason was to defeat browser caches.
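
    A rough illustration of the trade-off, in Python. This is not the actual index code; the function names and the "?v=" parameter are assumptions made up for the sketch. A fresh random suffix makes every generated link unique, so a browser never reuses a cached copy, but a crawler also sees a new URL each time and keeps every one of them. A suffix derived from the content changes only when the content does.

```python
import hashlib
import secrets


def random_bust(url):
    """Append a fresh random token: every call yields a different URL."""
    return f"{url}?{secrets.token_hex(8)}"


def content_bust(url, body):
    """Append a token derived from the content: stable until the content changes."""
    digest = hashlib.sha256(body).hexdigest()[:8]
    return f"{url}?v={digest}"


if __name__ == "__main__":
    page = b"file listing, revision 1"
    # Two renders of the same page produce two distinct URLs to crawl.
    print(random_bust("/index.ssjs"))
    print(random_bust("/index.ssjs"))
    # Two renders of the same page produce the same URL.
    print(content_bust("/index.ssjs", page))
    print(content_bust("/index.ssjs", page))
```

    With the random variant the set of URLs a crawler can collect grows without bound, which fits the endless crawling described earlier in the thread.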

    ---
    http://DuckDuckGo.com/ a better search engine that respects your privacy.
    * Origin: (1:103/17)
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Deuce@VERT/SYNCNIX to spacesst on Wednesday, May 27, 2015 17:59:34
    Re: Re: Welcome back
    By: spacesst to Deuce on Sun May 24 2015 07:35 am

    this is the peerblock block list i use to control googlebot and co.

    For just Google FTP, you can add googlebot@google.com (or whatever it is) to the email.can file.

    ---
    http://DuckDuckGo.com/ a better search engine that respects your privacy.
    þ Synchronet þ My Brand-New BBS (All the cool SysOps run STOCK!)
  • From KenDB3@VERT/KD3NET to Stephen Hurd on Thursday, May 28, 2015 20:25:26
    Yeah, I looked into it, and it was because of the random sequence appended by the index generation. The index has been updated to not do that anymore since, but Google will still try to crawl every random URL it has cached for a very long time (it's been many months since the fix, and my VPS is still getting over 100 queries per minute from Googlebot).


    I kind of had a feeling. Thanks for the info! Was there a reason for the randomized appended text previously?

    The reason was to defeat browser caches.

    Ahh! Thanks for indulging my curiosity!

    ~KenDB3

    ---
    þ Synchronet þ KD3net-Rhode Island's only BBS about nothing. http://bbs.kd3.us