
I added a couple of new charts to the Stats section. Both charts show
search engine stats.
Once again, visual data tells a really cool story -- it's great to see
about 14 months of data summarized in a visual way.
In these charts, I decided to focus on Google, MSN, and Yahoo search
engine bots and referrers. There are quite a few other bots, but
these three make up the vast majority, and even when the other bots
hit, I'm not seeing any interesting traffic generated from their
respective portals. The clearest and easiest solution was to
simply omit them for the time being, but they can be added when/if
needed.
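As a rough illustration of the filtering described above, here is a minimal Python sketch that buckets hits by engine and month. It is not the code behind the charts: the function names, the `(month, user_agent)` record shape, and the user-agent substrings (`googlebot`, `msnbot`, `yahoo! slurp`) are all assumptions made for the example.

```python
from collections import Counter

# Assumed lowercase user-agent substrings for the three crawlers.
BOT_TOKENS = {
    "Google": "googlebot",
    "MSN": "msnbot",
    "Yahoo": "yahoo! slurp",
}


def classify_bot(user_agent):
    """Return the engine name for a known crawler user agent, else None."""
    ua = user_agent.lower()
    for engine, token in BOT_TOKENS.items():
        if token in ua:
            return engine
    return None  # every other bot or browser is simply omitted


def count_bot_hits(records):
    """Count hits per (month, engine) from (month, user_agent) pairs."""
    counts = Counter()
    for month, user_agent in records:
        engine = classify_bot(user_agent)
        if engine is not None:
            counts[(month, engine)] += 1
    return counts
```

Adding another bot later would only mean adding one more entry to `BOT_TOKENS`.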
What's really interesting is watching the ramp up of the various
engines as they index the site over the first few months (September
2004). The number of requests has a direct relationship with the
bandwidth usage of the bots, and it's no secret that MSNBot is
quite, um, err, a bandwidth hog. There, I admitted it.
What I don't know is whether, and to what extent, this behavior reflects site
maturity and content versus changes to the algorithms the search engine
bots use. For example, there's a huge spike in September 2005 (mainly from MSN, but also from the others) --
this may be caused by an increase in blogging activity (notice a
correlation with the
Traffic Analysis
data). For now, though, it's too early to tell since October has
just begun and those numbers will undoubtedly gain quickly on
September's data.
The Search Engine Referrals chart shows the number of referrals I'm
getting from the various sites. I'm not surprised at all to see
Google well above the others; to me, this illustrates Google's
popularity and what is likely a more effective indexing strategy --
Google is certainly taking less bandwidth.
Both charts seem to indicate that October will be a very busy month,
since every data point except the number of hits from MSN's bot has
already surpassed September's numbers for the same time frame. Going
forward, I'd like to drill down into the data to see the frequency and
duration of the crawls -- unfortunately, I do not yet have enough data to
construct much of a detailed history, but it will accumulate over time.
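One possible way to approach that drill-down, sketched in Python: treat a run of hits from one bot as a single crawl, and start a new crawl whenever the gap between hits exceeds some threshold. The function name `split_into_crawls` and the 30-minute gap are assumptions picked for illustration, not anything measured from this site.

```python
from datetime import datetime, timedelta


def split_into_crawls(timestamps, max_gap=timedelta(minutes=30)):
    """Group one bot's hit times into crawls.

    A gap longer than max_gap (an assumed threshold) starts a new crawl.
    Returns a list of (start, end, hit_count) tuples in time order.
    """
    crawls = []
    for ts in sorted(timestamps):
        if crawls and ts - crawls[-1][1] <= max_gap:
            # Close enough to the previous hit: extend the current crawl.
            start, _, hits = crawls[-1]
            crawls[-1] = (start, ts, hits + 1)
        else:
            # First hit, or the gap was too long: start a new crawl.
            crawls.append((ts, ts, 1))
    return crawls


# Usage: five hits on one day split into two crawls.
hits = [
    datetime(2005, 10, 1, 10, 0),
    datetime(2005, 10, 1, 10, 5),
    datetime(2005, 10, 1, 10, 20),
    datetime(2005, 10, 1, 12, 0),
    datetime(2005, 10, 1, 12, 10),
]
for start, end, count in split_into_crawls(hits):
    print(start, end - start, count)
```

The number of crawls per day gives the frequency, and `end - start` gives the duration of each one.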