Thanks to the folks at Infragistics, there’s a new reporting site for Worldmaps! This Silverlight application not only looks great, but also provides many new ways to drill down into the data. One of the cool things you can do is easily compare two sites – the app has tabs that allow you to add multiple sites (unlike the still-existing old dashboard, which showed only your neighbors in the leaderboard). What’s even cooler is playing around with the Gapminder stats. A Gapminder chart is ideal for animating data over time in a way that a simple 2-D chart can’t. The bottom line: this app is so much fun just to play around with. Many thanks to Jason Beres, Mihail Mateev, Riddhima Shelat, and others (I’m sure I’m leaving people out!) for making this possible! Check out Mihail’s post here on the project!
Worldmaps users might see some volatility in the leaderboard today! In January I posted about a change in the leaderboard, where maps were then rated by hits/day average, not total hits. Overall, this was a much more fluid way to rate maps, as it allows all users to participate on equal ground. The biggest downside to this approach, though, is that sites are subject to volume fluctuations over time. For example, a site might suddenly gain a great deal of new visitors, but the volume might be “hidden” by months of old data factored in. The solution: Worldmaps now only considers the last 30 days of traffic when calculating the hits per day average. This lets swings in traffic have much more impact over the short term. As a side benefit, the data is now aggregated differently and allows for deeper reporting on a daily basis – this is reflected in the new chart on the reporting page.
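To make the “hidden by months of old data” effect concrete, here is a minimal Python sketch comparing an all-time average with a trailing 30-day average. The function name, the per-day dictionary, and the choice to divide by the full window length are all assumptions for illustration; this is not the actual Worldmaps implementation.

```python
from datetime import date, timedelta

def hits_per_day_trailing(daily_hits, today, window_days=30):
    """Average hits per day over the trailing window ending on `today`.

    Assumption: days with no recorded hits count as zero, so the divisor
    is always the full window length.
    """
    start = today - timedelta(days=window_days - 1)
    total = sum(count for day, count in daily_hits.items()
                if start <= day <= today)
    return total / window_days

# Hypothetical site: 290 quiet days at 10 hits/day, then 10 busy days at 1000.
today = date(2010, 6, 30)
daily_hits = {today - timedelta(days=i): (1000 if i < 10 else 10)
              for i in range(300)}

all_time_average = sum(daily_hits.values()) / len(daily_hits)
recent_average = hits_per_day_trailing(daily_hits, today)
print(all_time_average, recent_average)
```

With this made-up data, the recent surge barely moves the all-time average but dominates the 30-day figure.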
With the chaos of the Worldmaps to Azure migration winding down, I decided to take a look at some of the low-pri issues on my to-do list. Those close to the project or who follow their leaderboard ranking closely know that last year, the leaderboard was ranked by Total Hits. This worked pretty well at the time, but it made it impossible for those who joined the site weeks or months after others to compete unless their site had considerably more volume to close the gap. This year, I changed the formula to base the ranking on average hits per day. This adds a bit more excitement because it’s a level playing field regardless of join date, for the most part. The one remaining problem I have with hits per day is that if a site gains popularity over time, the resulting average encourages the user to delete and recreate their map for better rankings. At some point, I’ll further refine this so it looks at hits per day for the last month or two (for example). In the meantime, though, I got some heat from Chris Eargle, who pointed out some problems with my implementation when he looked at his numbers. The way the implementation worked was that it grabbed the earliest hit and the most recent hit, and divided the total hits by the number of days between them. To prevent division by zero and give a minimum of 1 day, 1 was added to the number of days like so:
coalesce(sum(hits.NumHits) / (DATEDIFF(d, MIN(hits.CreateDate), MAX(hits.ModifyDate))+1) ,0) as HitsPerDay
While this worked “okay,” it wasn’t perfect. For starters, using DATEDIFF returns an integer based on the dates passed in. If you pass in “1/21/2010 23:59” and “1/22/2010 00:01,” the DATEDIFF (by days) is 1, even though only 2 minutes have passed. If you pass in “1/22/2010 00:01” and “1/22/2010 23:59,” a zero is returned even though it’s just 2 minutes short of a day. This isn’t wrong, but it’s limited for what we need because the leaderboard is generated throughout the day.
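The boundary-counting behavior is easy to reproduce outside SQL Server. The sketch below is a Python illustration only (the helper name `datediff_days` is made up); it mimics DATEDIFF by day by counting the midnights crossed between two timestamps, and compares that with the time that actually elapsed.

```python
from datetime import datetime

def datediff_days(start, end):
    """Count midnight boundaries crossed, like T-SQL DATEDIFF(d, start, end)."""
    return (end.date() - start.date()).days

# First example from the text: two minutes that straddle midnight.
first_start = datetime(2010, 1, 21, 23, 59)
first_end = datetime(2010, 1, 22, 0, 1)

# Second example: almost a full day, all within one calendar date.
second_start = datetime(2010, 1, 22, 0, 1)
second_end = datetime(2010, 1, 22, 23, 59)

straddle_days = datediff_days(first_start, first_end)
same_date_days = datediff_days(second_start, second_end)
straddle_minutes = (first_end - first_start).total_seconds() / 60
same_date_minutes = (second_end - second_start).total_seconds() / 60

print(straddle_days, straddle_minutes)    # one boundary crossed in two minutes
print(same_date_days, same_date_minutes)  # no boundary crossed in almost a day
```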
Adding 1 can be incorrect because, using the first example, it will now return 2 days (!) when only 2 minutes have elapsed. It’s a pretty extreme example, but possible. I wanted an implementation that, using T-SQL alone, would be a bit more consistent. The criteria I had were: 1) it had to support partial days and 2) it had to give a minimum of 1 day. #2 is because I don’t want a map that was just created (as in the first time example) to get, say, 50 hits in 2 minutes and, due to the math, end up averaged at something like 100,000 hits per day. Using a minimum of 1 day makes the leaderboard a little more stable. The resulting query:
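-- opening cast/FLOOR/CASE and the closing END are reconstructed from the description; the original lines were lost
cast(floor(sum(hits.NumHits) /
CASE
WHEN (DATEDIFF(hh, MIN(hits.CreateDate), MAX(hits.ModifyDate))/cast(24 as float)) < 1 THEN 1
ELSE (DATEDIFF(hh, MIN(hits.CreateDate), MAX(hits.ModifyDate))/cast(24 as float))
END)
as int) as HitsPerDay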
This works out really well. First, we use hours instead of days and simply divide by 24. It’s essentially the same as days, but gives a fractional value. If the value is less than 1, we’ll use just 1 as the minimum. This satisfies the second criterion above and also covers any division by zero cases. The ELSE of the CASE gives us the fractional days, so hits per day is more accurate, with no more +1 tomfoolery. Even though FLOOR returns a whole number, the data type is unchanged, so a cast to an int is necessary for the application.
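For anyone who wants to check the arithmetic without a database, here is a rough Python sketch of the two calculations side by side. It is an illustration under assumptions: the function names are made up, NumHits is assumed to be an integer column (so the original division truncates), and DATEDIFF by hour is modeled as the number of hour boundaries crossed.

```python
import math
from datetime import datetime

def datediff_hours(start, end):
    """Count hour boundaries crossed, like T-SQL DATEDIFF(hh, start, end)."""
    s = start.replace(minute=0, second=0, microsecond=0)
    e = end.replace(minute=0, second=0, microsecond=0)
    return int((e - s).total_seconds() // 3600)

def hits_per_day_original(total_hits, first_hit, last_hit):
    """Original approach: whole days between the dates, plus 1."""
    days = (last_hit.date() - first_hit.date()).days + 1
    return total_hits // days  # integer division, assuming integer hit counts

def hits_per_day_new(total_hits, first_hit, last_hit):
    """Revised approach: fractional days from hours, with a 1-day minimum."""
    days = datediff_hours(first_hit, last_hit) / 24.0
    if days < 1:
        days = 1
    return int(math.floor(total_hits / days))

# A brand-new map: 50 hits in two minutes that straddle midnight.
new_map = (50, datetime(2010, 1, 21, 23, 59), datetime(2010, 1, 22, 0, 1))
# An older map: 360 hits over exactly a day and a half.
older_map = (360, datetime(2010, 1, 1, 0, 0), datetime(2010, 1, 2, 12, 0))

for label, args in (("new map", new_map), ("older map", older_map)):
    print(label, hits_per_day_original(*args), hits_per_day_new(*args))
```

In the first case the 1-day minimum keeps the brand-new map at 50 instead of halving it (or inflating it); in the second, the fractional 1.5 days replaces the 2 days the original formula would have used.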
If we look at the results in Query Analyzer, we can see that the new numbers are closer to the truth:
The “days” and “days_partial” columns show the difference between using days and hours in DATEDIFF, with hours giving us more accuracy. Comparing “HitsPerDay_New” with “HitsPerDay_Original” shows the improved result. In some cases it is unchanged; in others it is much more accurate, as you can see by looking at the “FirstHit” and “MostRecentHit” columns.
Anyway, just some fun playing with T-SQL and if you see some adjustments in the Leaderboard, this is why.
But Chris, it won’t help you get into the Top 10. :)
I admit that site design is something that often takes a back seat when developing Worldmaps, but in the recent migration to Windows Azure, I added a few new minor customizations to the maps. When you log into your account, you should see a list that contains all of your current maps. From this screen, you can either modify a map or create a new one. When creating a new map, you’ll see a simple form to fill out. The only information required for the map is the URL and Leaderboard. The latitude and longitude fields indicate your home location. You can use the home locator map on the bottom of the screen to help in this regard, or you can leave these blank. Without this information, some statistics cannot be calculated. The Leaderboard indicates which category best suits your map. Is it a personal blog? A technology blog? A personal site? Pick the one that best fits your site – this can be changed later. The last box, Invitation Code, is for premium accounts and changes the way the data is stored for the given map. I described this briefly in my last post. For scalability reasons, most new accounts will be under the new scheme – if you need more detailed information (such as # of unique IPs), contact me for an invitation code.
Customizing your Map
Once your map is created, click the edit button next to your map to customize the colors. When the app is drawing your maps, you can control the colors used in drawing the circles. Feel free to experiment with some color choices. The four values you can customize are explained on the form. Use the Silverlight-based Color Picker to find a color and copy that value into the text box of the value you’d like to change. Or, you can enter values directly into the box if you know the hex value of the color you’d like to use. When done, click Save. In general, the maps will be redrawn reasonably quickly, depending on how much work is currently in the queue.
Customizing the colors is a great way to add a little personalization to your maps!
The recent update to Windows Azure went quite well! The site is now using a single Azure web role, a single Azure worker role, Azure Queues for workload, and Azure Blobs for storage. It’s also using SQL Azure as the database. From a user’s point of view, not much has changed, but performance and scalability have been much improved. On the stats page, I implemented a few new stats … First up is the hourly breakdown of hits to a site. Below is Channel 9’s current breakdown. It’s a neat way to tell when the traffic to your site is heaviest. In this case, C9 is busiest at 3pm GMT, or about 9am-4pm EST. In addition, Worldmaps includes country breakdown information. And, Stumbler has been updated a bit, so be sure to check it out and watch traffic in real time! Finally, there’s a change to the registration process. To add some scalability, Worldmaps now stores data in one of two schemes. The older scheme has been migrated to what is called a “plus” or enhanced account. The newer scheme is the default, and it stores data in a much more aggregated way. What determines how information is stored? This is based on an invitation code on the Create Map form. If no invitation code is provided, the newer scheme is used. If a valid invite code is provided, the old, more detailed method is used. If you’d like an invite code, drop me some feedback. What’s the difference? Currently, the difference is pretty small. On the stats page, the current number of unique IPs cannot be calculated under the newer scheme. Future report options are a bit limited as well, but otherwise, all data (and Stumbler) is still available.
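As a rough illustration of the hourly breakdown stat, here is a small Python sketch that buckets hit timestamps by hour of day in GMT/UTC. The function name and the sample timestamps are hypothetical; how Worldmaps actually aggregates this on the server is not shown in this post.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def hourly_breakdown(hit_times):
    """Return a 24-element list of hit counts, indexed by hour of day in UTC.

    Assumes each timestamp is timezone-aware so it can be converted to UTC.
    """
    counts = Counter(t.astimezone(timezone.utc).hour for t in hit_times)
    return [counts.get(hour, 0) for hour in range(24)]

# Made-up sample hits; the last one is recorded in a UTC-5 offset.
utc_minus_5 = timezone(timedelta(hours=-5))
sample_hits = [
    datetime(2010, 1, 22, 15, 5, tzinfo=timezone.utc),
    datetime(2010, 1, 22, 15, 40, tzinfo=timezone.utc),
    datetime(2010, 1, 22, 9, 10, tzinfo=timezone.utc),
    datetime(2010, 1, 22, 10, 30, tzinfo=utc_minus_5),  # 15:30 UTC
]

breakdown = hourly_breakdown(sample_hits)
busiest_hour = max(range(24), key=lambda hour: breakdown[hour])
print(busiest_hour, breakdown[busiest_hour])
```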
For those of you with Worldmaps accounts, the Azure migration is (hopefully) underway. This process will take a while, in part due to the holiday vacation :) but also because of testing and whatnot. Starting today, the accounts section will be closed for changes until the migration is complete. Hopefully, the migration will not cause any breaking changes. However, the biggest thing to check is that you are accessing the service using www in the URL … for example, “http://www.myworldmaps.net….” – if you’re leaving that out, it will have to be added or the map will not work after the migration, due to limitations with DNS CNAMEing in Azure. Stay tuned, and hope for a smooth ride and management approval :)
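If you have a lot of pages to check, something like the following Python sketch can add the missing www to a URL. The helper name `ensure_www` is made up, and it deliberately ignores edge cases such as ports or credentials in the host part; treat it as a starting point rather than a complete fix.

```python
from urllib.parse import urlsplit, urlunsplit

def ensure_www(url):
    """Return `url` with a "www." prefix on the host if it is missing."""
    parts = urlsplit(url)
    host = parts.netloc
    if host and not host.lower().startswith("www."):
        parts = parts._replace(netloc="www." + host)
    return urlunsplit(parts)

without_www = ensure_www("http://myworldmaps.net/sl/WorldmapsStumbler.aspx")
already_ok = ensure_www("http://www.myworldmaps.net/sl/WorldmapsStumbler.aspx")
print(without_www)
print(already_ok)
```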
Hi folks! I wanted to take the opportunity to outline a few changes for Worldmaps – changes in the service, changes in the backend, and new features. First up: Stumbler. A link to Stumbler was added to the nav menu at the top of the page. If you haven’t played around with Stumbler, check it out. It maps users in near real time to websites. One of the biggest changes is the update to the new Bing Maps Silverlight control. This doesn’t change much from the end-user point of view, but a few new features have been added, so the UI in Stumbler is a bit cleaner around pushpins and effects. In addition, there’s a new setting in the settings dialog to choose whether or not to scroll the map automatically. Normally you zoom around the map automatically, but now you can turn that off to stay focused on a certain area. The other big new feature is multiple/extensible leaderboards. Until now, there was a single leaderboard for all users. This “master leaderboard” is still there, but having sub-leaderboards is a lot more interesting. I defined a few leaderboards like “Tech Blogs” and “Personal Sites,” but will add more over time (leave feedback on new leaderboards!). On the leaderboard page, the default view is the all-up leaderboard, while each sub-leaderboard is listed in the left nav. To pick a leaderboard, log in to your account and edit your maps. In the detail section, you’ll see a drop-down that allows you to select a leaderboard (see image to the right). I should also point out that the sub-boards are sorted based on hits per day, not total hits. The other big change to outline is the end-of-year change. At the end of the year, the individual hit data is reset. This means all the red dots on the map will be wiped clean and start over again. Data will be archived and available in some fashion, but this hasn’t been implemented yet. Additionally, next year (2010) the main leaderboard will be sorted based on average hits per day.
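To picture how the sub-leaderboards relate to the master leaderboard, here is a minimal Python sketch that groups maps by their chosen leaderboard and sorts each group by hits per day. The `MapEntry` type, the function name, and the sample sites are all hypothetical; the real schema is not described in this post.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class MapEntry:
    name: str
    leaderboard: str      # e.g. "Tech Blogs" or "Personal Sites"
    total_hits: int
    hits_per_day: float

def build_leaderboards(entries):
    """Group maps by leaderboard, ranking each group by hits per day."""
    boards = defaultdict(list)
    for entry in entries:
        boards[entry.leaderboard].append(entry)
    return {
        board: sorted(members, key=lambda m: m.hits_per_day, reverse=True)
        for board, members in boards.items()
    }

# Made-up entries for illustration.
entries = [
    MapEntry("old-blog", "Tech Blogs", total_hits=900_000, hits_per_day=120.0),
    MapEntry("new-blog", "Tech Blogs", total_hits=15_000, hits_per_day=500.0),
    MapEntry("family-site", "Personal Sites", total_hits=4_000, hits_per_day=12.5),
]

boards = build_leaderboards(entries)
tech_ranking = [entry.name for entry in boards["Tech Blogs"]]
print(tech_ranking)
```

Note how the newer site outranks the older one within its sub-board even though it has far fewer total hits.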
Now to address the biggest architectural change: if you’ve created an account on Worldmaps but haven’t gotten an email yet with the approval, fear not. Unfortunately the volume is such that with limited infrastructure, there’s not much more that can be done. This will soon change. Worldmaps has been moved to Windows Azure, Microsoft’s cloud computing platform. This should give Worldmaps a nice bump in scalability (limited really only by funds). So stay tuned in early 2010 for more info.
The Worldmaps user queue is getting big! Just wanted to thank everyone for their interest, but in response to so many emails, I thought I should explain how to get signed up on Worldmaps. After creating an account, it must be approved before the account can be used. The main reason for this is to slowly ramp up on bandwidth to make sure the service (both website and database) is providing a good experience. During high volume, the service is processing many requests per second – obviously, not a ridiculous load in the scheme of things, but for a small service with no budget, it’s certainly something to keep an eye on. Currently, however, the service is getting more user requests than are approved daily, so there’s a backlog forming. We’ll get through it in time, but it does require culling the sites that don’t fall within the acceptable use parameters. Technical blogs and personal sites will generally get approved first. At this time, commercial sites cannot be approved. Over the coming months, I’ll be looking more at a Windows Azure implementation that will allow more growth and, hopefully, allow just about all sites to “play.” One new feature that will hopefully be released with this implementation is website categories – instead of just a single leaderboard, there will be multiple leaderboards, and users can select the one most appropriate to their domain. Stay tuned!
I’m happy to announce I’ve finally completed a cool little Silverlight app entitled “Worldmaps Stumbler” (thanks Andrew for the name!). So what is it? Worldmaps Stumbler essentially plots Worldmaps data in near-real time using Silverlight and Virtual Earth. While the current stats maps use Virtual Earth and web services to plot data, the experience is largely utilitarian, and due to the nature of the application, I wasn’t able to incorporate any real-time information. With Silverlight, it’s much easier to enhance the experience to include not only real-time data, but also a more stylish presentation. As the application runs, it watches for changes and plots the data on the map. The app will zoom in and show the browser used, the URL of the map, the date/time of the hit, and how long ago that was. When the Worldmaps server records the hit with the time, it bubbles up to the Stumbler to get plotted. When the hit is plotted, the age of the hit is calculated so you can see precisely how long ago the hit was. This process can take anywhere from about 4 seconds to about a minute, depending on a number of factors. The folks at Earthware have posted a great tutorial on how to create a minimap using the Silverlight Virtual Earth control. (They also have a cool demo called Silverlight Twitter Map. Although a word of warning: while it’s a great background show while discussing Virtual Earth and Silverlight integration, a lot of potty-mouth tweets always seem to end up on screen.) There are a few options in the app. The maps button allows you to select which Worldmaps maps to plot. By default it is all of them, but it can be narrowed to your preference. The slider to the right of the minimap controls the delay between plotting hits. Higher is slower. Also, if the dot in the middle of the minimap is blue, the app is running fine. 
Yellow circles indicate network activity (polling for data – this should barely be perceptible) and red means an error of some kind. Hopefully you won’t see red, but if you do, no data will be available. To use the Stumbler, simply visit: http://www.myworldmaps.net/sl/WorldmapsStumbler.aspx If you’re a Worldmaps user and want a customized experience for your maps on startup, your map IDs can be passed in through the maps parameter in the querystring (pipe delimited). For example, to launch the app observing only my two maps, I can use the following URL: http://www.myworldmaps.net/sl/WorldmapsStumbler.aspx?maps=FECB0AFF-083E-4F42-9B08-9A01E3CB714A|495A96ED-A6AC-495B-A134-72C434EEA880 At this time, only the top twenty or so maps are plotted; however, more will be added soon. So go check it out – it’s fun to watch!
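If you want to generate these links from a script, the pipe-delimited maps parameter is simple to build and to read back. The Python sketch below is only an illustration: the helper names are made up, and how the Stumbler itself parses the querystring is not shown in this post.

```python
from urllib.parse import parse_qs, urlsplit

STUMBLER_URL = "http://www.myworldmaps.net/sl/WorldmapsStumbler.aspx"

def build_stumbler_url(map_ids):
    """Build a Stumbler link that observes only the given map IDs."""
    if not map_ids:
        return STUMBLER_URL
    return STUMBLER_URL + "?maps=" + "|".join(map_ids)

def parse_map_ids(url):
    """Read the pipe-delimited map IDs back out of a Stumbler link."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("maps", [""])[0]
    return [map_id for map_id in raw.split("|") if map_id]

# The two map IDs from the example URL in the post.
my_maps = [
    "FECB0AFF-083E-4F42-9B08-9A01E3CB714A",
    "495A96ED-A6AC-495B-A134-72C434EEA880",
]
link = build_stumbler_url(my_maps)
print(link)
print(parse_map_ids(link))
```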
I promised a while ago that I’d revamp the Worldmaps ranking system – classically, Worldmaps rank was always based on total hits. While this rewarded longer-term usage, it gave newcomers virtually no chance of catching up and competing unless they owned a high-volume site. The other problem with total hits was “rank parking” – that is, a website hits the service with a lot of volume, then either stops using the service or pings it infrequently. There are two phases in coming up with a better rank – one is deciding what would be ideal, and the other is deciding if the technical implementation is feasible given the constraints of the current system. (Worldmaps doesn’t store each hit like a log file, for example, so data is aggregated to extrapolate various metrics that you see on the stats pages.) The first go-around was to do hits per day. This metric is nice because suddenly, even day 1 users are in the competition. It makes things more interesting. The problem with both Total Hits and Hits Per Day is that neither is necessarily that compelling of a metric. (For those who have done web analytics in the past, you know that total website hits is hardly meaningful.) There’s nothing to stop someone from putting 100 image tags on their site, for example, so 1 hit registers as 100. (Though, that’s against the TOS. I think. If not, it will be! :)) The problem with Hits Per Day is that the value is averaged out. Suppose you have a site where your volume over a period of 4 months increases substantially. If all you cared about was rank, you’d be better off deleting and recreating the map account to rank higher. Then I thought about World Domination. Let’s face it, it’s the coolest stat Worldmaps has. But is it the right one to use to determine rank? I’m not so sure. 
The only other solution is a compound metric … such as World Domination * Hits Per Day or something similar – that would still give the edge to traffic, but reaching out globally would impact that perhaps significantly. (A compound stat is a bit harder, given the schema, to work in.) Any suggestions?
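To get a feel for how a compound metric might behave, here is a tiny Python sketch of the World Domination * Hits Per Day idea. Everything here is an assumption for discussion: World Domination is treated as a percentage, the sites are made up, and no weighting or normalization is applied.

```python
def compound_score(world_domination_pct, hits_per_day):
    """Hypothetical compound rank: World Domination (as a %) times hits per day."""
    return world_domination_pct * hits_per_day

# Made-up sites: (name, World Domination %, hits per day).
sites = [
    ("high-traffic-local", 10.0, 1000.0),
    ("modest-traffic-global", 40.0, 400.0),
]

ranking = sorted(sites, key=lambda s: compound_score(s[1], s[2]), reverse=True)
for name, domination, hits in ranking:
    print(name, compound_score(domination, hits))
```

In this made-up case, the site with less than half the traffic ranks first because its reach is four times wider, which shows how strongly the global component could sway the result.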