What are your ideas on why Bing bot is so interested in LonelyCache?
We run an automated process to attempt to detect and block data harvesting from some of our websites. The process watches for "aggressive" web browsing activity from individual IP addresses. In general any IP that requests more than 30 pages per minute averaged over a 5 minute period triggers an alert. In addition to monitoring our own website this system monitors all sites we host.
For some time I've noticed that the Bing bot has been aggressively spidering your Lonely Cache site. Bing bot is often visiting your site from multiple IP addresses simultaneously requesting upwards of 100-150 pages per minute combined. What I thought was odd is that the Bing bot is aggressively spidering your site almost daily. I haven't seen this type of aggressive spidering on any other sites we host. Do you have any idea why Bing bot is so interested in the Lonely Cache Project?
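For readers curious how the alert rule in the email might work, here is a minimal sketch. It assumes the host keeps a per-IP list of request times and averages over a sliding 5 minute window; the class and method names are made up for illustration and are not the hosting company's actual system.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60      # 5 minute averaging period (from the email)
MAX_PAGES_PER_MINUTE = 30    # alert threshold (from the email)


class AggressiveCrawlDetector:
    """Hypothetical sketch: flags an IP whose average request rate is too high."""

    def __init__(self):
        # Request timestamps (in seconds) still inside the window, per IP.
        self._hits = defaultdict(deque)

    def record(self, ip, timestamp):
        """Record one page request; return True if this IP triggers an alert."""
        hits = self._hits[ip]
        hits.append(timestamp)
        # Forget requests that have fallen out of the 5 minute window.
        while hits and hits[0] <= timestamp - WINDOW_SECONDS:
            hits.popleft()
        pages_per_minute = len(hits) / (WINDOW_SECONDS / 60)
        return pages_per_minute > MAX_PAGES_PER_MINUTE


detector = AggressiveCrawlDetector()
# One made-up IP requesting 2 pages per second for 5 minutes.
alerts = [detector.record("203.0.113.7", t * 0.5) for t in range(600)]
print(any(alerts))
```

Note that a per-IP rule like this looks at each address separately, so a bot spread over several IP addresses would only be caught if at least one of them crosses the threshold on its own.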
Bing bot aggressively spidering LCP?
- Corfman Clan
- Global Moderator
- Posts: 914
- Joined: January 17th, 2012, 12:21 am
Bing bot aggressively spidering LCP?
This morning I received an email from the admin of the company hosting LonelyCache and thought it was rather interesting
Re: Bing bot aggressively spidering LCP?
What is Bing bot?
- Corfman Clan
- Global Moderator
- Posts: 914
- Joined: January 17th, 2012, 12:21 am
Re: Bing bot aggressively spidering LCP?
the greenskeeper wrote: What is Bing bot?
Oops, I sometimes forget not everyone knows what these techie terms are.
The internet search facilities, such as Google, Yahoo, Bing, etc., have web bots (or spiders) that basically traverse (crawl) the whole world wide web, gathering information that allows them to return search results quickly and (hopefully) that are worthwhile. So "Bing Bot" is Bing's web bot.
-
chris geertsen
- Posts: 11
- Joined: October 3rd, 2012, 9:21 pm
Re: Bing bot aggressively spidering LCP?
i am not a computer expert so i would not know a lot of these terms. nor do i know how they work
but why is this a bad thing for the site?
-
chris geertsen
- Posts: 11
- Joined: October 3rd, 2012, 9:21 pm
Re: Bing bot aggressively spidering LCP?
so these bots basically help out bing google etc by helping their search page have more links to sites. maybe because this site is so new they're trying to gather as much information as they can so when someone searches in their browser it will show up. i have been to at least one search site where i searched this name and it did not come up at all. there is my wisdom. doubt it's that good.
- Corfman Clan
- Global Moderator
- Posts: 914
- Joined: January 17th, 2012, 12:21 am
Re: Bing bot aggressively spidering LCP?
Of course I want the search engines to know about LonelyCache and include it as results in searches. That is a good thing. The question really is why the Bing bot is hitting LonelyCache as much as it is (way more than any other site the company is hosting).
My response to the email was
I don’t know why the Bing bot would be spidering LonelyCache that much. Perhaps it’s because all the pages in LonelyCache tend to have a lot of hyperlinks to other LonelyCache pages. With the dynamic nature of the site and the number of geocaches & geocachers in the LonelyCache territory, there are essentially millions of pages to navigate through.
That may be the reason why, I don't know. For example, for the points leaderboards, a typical page has over 400 hyperlinks and there are over 425,000 of them just for the LonelyCache Wide region.
Anyway, I was hoping for more whimsical reasons why the Bing bot might be so interested in LonelyCache, such as because it's so awesome how could it not be
- Corfman Clan
- Global Moderator
- Posts: 914
- Joined: January 17th, 2012, 12:21 am
Re: Bing bot aggressively spidering LCP?
Corfman Clan wrote: Anyway, I was hoping for more whimsical reasons why the Bing bot might be so interested in LonelyCache, such as because it's so awesome how could it not be
Or maybe, because LonelyCache is filled with so much Baad Daata it constantly needs re-scanning...
- Team Tierra Buena
- Posts: 8
- Joined: January 18th, 2012, 9:48 pm
Re: Bing bot aggressively spidering LCP?
Corfman Clan wrote: Anyway, I was hoping for more whimsical reasons why the Bing bot might be so interested in LonelyCache, such as because it's so awesome how could it not be
Corfman Clan wrote: Or maybe, because LonelyCache is filled with so much Baad Daata it constantly needs re-scanning...
Or maybe the bots have a website where they get points for visiting lonely websites!
Happy Thanksgiving, everyone!
Team Tierra Buena
Making geocaching needlessly difficult for ourselves since 2001!
-
rocketsciguy
- Posts: 145
- Joined: January 18th, 2012, 9:55 am
Re: Bing bot aggressively spidering LCP?
Corfman Clan wrote: Anyway, I was hoping for more whimsical reasons why the Bing bot might be so interested in LonelyCache, such as because it's so awesome how could it not be
Corfman Clan wrote: Or maybe, because LonelyCache is filled with so much Baad Daata it constantly needs re-scanning...
That's funny!
I think your response to the hosting company is probably right... tons of hyperlinks on every dynamically-generated page, and every page is updated every day. Even if the content of a particular page doesn't change, the time stamp at the bottom of the page changes every update cycle, so if the Bing-Bot is doing a text-comparison of the HTML, it will find differences. Those changes probably tell the Bot to dig deeper. Blame Microsoft for having an overly aggressive, poorly designed web-crawler algorithm.
But please keep all those hyperlinks! They make the site very useful!
I think I remember from somewhere that there's a way to prevent or inhibit spiders from crawling your domain. A "policy" stored as a specially formatted 'spider.txt' file in the root directory or something like that. I bet your hosting company would be happier if the spiders only did their thing once every week or month, or not at all.
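To illustrate the time stamp idea above, here is a small sketch. Nobody outside Microsoft knows how the Bing bot actually compares pages, so this only shows the general effect: a straight comparison of the HTML sees a "changed" page whenever the footer time stamp changes, even if everything else is identical. The page snippets and the "Updated:" footer format are made up for the example.

```python
import hashlib
import re

# Hypothetical footer format; the real LonelyCache time stamp may look different.
TIMESTAMP_RE = re.compile(r"Updated: \d{4}-\d{2}-\d{2} \d{2}:\d{2}")


def fingerprint(html, ignore_timestamp=False):
    """Return a hash of the page, optionally with the time stamp stripped out."""
    if ignore_timestamp:
        html = TIMESTAMP_RE.sub("", html)
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Two made-up copies of the same page from consecutive update cycles.
monday = "<p>Leaderboard</p><footer>Updated: 2012-11-19 03:00</footer>"
tuesday = "<p>Leaderboard</p><footer>Updated: 2012-11-20 03:00</footer>"

# A naive comparison sees two different pages.
print(fingerprint(monday) == fingerprint(tuesday))              # False
# Ignoring the time stamp, the pages are the same.
print(fingerprint(monday, True) == fingerprint(tuesday, True))  # True
```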
-
Ranger Alpha
- Posts: 6
- Joined: March 31st, 2012, 10:47 pm
Re: Bing bot aggressively spidering LCP?
Does LonelyCache have a robots.txt file?
- Corfman Clan
- Global Moderator
- Posts: 914
- Joined: January 17th, 2012, 12:21 am
Re: Bing bot aggressively spidering LCP?
Ranger Alpha wrote: Does LonelyCache have a robots.txt file?
No, it doesn't and at this time I see no compelling reason to add one.
- A web bot may honor a robots.txt file or completely ignore it, so its utility is limited.
- We do want the search engines to know about LonelyCache, so we don't want to direct those web bots to stay away.
- The web hosting company isn't concerned about any adverse effects (performance or otherwise) from the Bing Bot spider. The admin was mostly just curious about what might be going on with it.
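For reference, if the crawl rate ever did become a problem, a robots.txt file does not have to tell bots to stay away. The sketch below uses a made-up robots.txt that only asks bingbot to slow down, and checks it with Python's standard urllib.robotparser module. The Crawl-delay value and the /leaderboard path are assumptions for illustration, and as noted in the list above, any bot is free to ignore the file entirely.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks bingbot to slow down, blocks nothing for anyone.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# bingbot is still allowed to fetch pages...
print(parser.can_fetch("bingbot", "/leaderboard"))  # True
# ...but is asked to wait between requests.
print(parser.crawl_delay("bingbot"))                # 10
# Other bots get no delay request at all.
print(parser.crawl_delay("Googlebot"))              # None
```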