Ouch. I hate mistakes like this.
I decided over the weekend to take a step back and do some performance monitoring of my site. I've noticed the performance has slowly been degrading over the months, and I wanted to see why. In truth, with shared hosting (which is what I'm on) there is only so much you can do, but there are always optimizations to be made. I have my own analytics engine that makes both a
performance chart and a
traffic chart.
In looking at the charts, the performance started to trend upwards around April of '07 -- which is interesting because traffic also took a big dip that month. It could be the hosting provider had service issues, brought on more clients, or something during that time. The charts don't quite provide enough detail -- so I dove into some queries to look at things. In some future posts, I'd like to do a series on performance tuning .NET and SQL applications, but to keep this post on topic, I realized I had a fairly major bug. Some folks have asked me: how do you measure page performance/generation time in your charts?
What I do is stuff the current time in the HttpContext in an HttpModule's Application_BeginRequest event. Then, in the Application_EndRequest, I create a timespan of the difference. The timespan, then, records the amount of time it took to create the page. I convert the timespan into milliseconds since that's the only value I care about. I happen to log this data -- what you do with it is somewhat up to you -- you could, for example, only log pages that take > 500 milliseconds, so you know you have an issue with a page. Clearly, there are huge scalability issues so you have to be careful with how you log the data.
But I digress.
Here's the relevant code, with extraneous parameters stripped out. Now that you know what it is supposed to do, can you spot the bug?
Application_BeginRequest:System.Web.HttpContext.Current.Items.Add("beginrequest", DateTime.Now);Application_EndRequest:
TimeSpan pageGenTime =
DateTime.Now.Subtract((DateTime)Context.Items["beginrequest"]);
//call to the logging method
LogRequest( ... , ... , pageGenTime.Milliseconds, ... , ...);
Here's a hint: most of the time, I'd say, probably all of the time, the behavior is as expected. I'll follow up shortly with how I (finally, after a year or more) caught it.
EDIT: I highlighted the bug above in red. The mistake is accessing the
Milliseconds property instead of
TotalMilliseconds. Obviously, this only returns the milliseconds portion, when in reality I wanted the total timespan represented in milliseconds. The end result is that no page would ever be more than 999 milliseconds ... worse still, if a page renders in (for example) 00:00:01.002, it would show as 2 milliseconds. How was this eventually caught? Sadly, it took almost 2 years, and was caught while debugging the application. I looked at the log files, and realized that despite sitting on a break point for many seconds, the page miraculously rendered in only a few milliseconds.