Folding@home with the SMP Client in Windows Azure

Want to run @home with Azure for another team or use a more powerful CPU? For the true geeks out there, running the Folding@home client involves tweaking, high performance computing, and knowing the difference between the GPU and CPU clients. We heard from a couple of folks about maximizing their Windows Azure usage, and Jim made some changes to the client piece to accommodate. In truth, we did a little of this the last time we ran @home, but we didn’t draw much attention to it for fear it would just add confusion – so, this info is optional and not necessary to do @home.

First, when setting up the 90 day trial, accounts have a $0 spending limit – which means, unless you intentionally disable the cap, you will never be charged a dime. It also means your account will be shut down when you reach your monthly usage quota. The 90 day trial allows 750 compute hours per month, which is 1 small instance running 24x7. If you’d like, you can run either 8 single-core instances or a single 8-core instance, but you’ll burn 192 compute hours per day and exhaust the limit in about 4 days. You could also run a dual-core instance for half the month. Visual Studio Ultimate MSDN subscribers, however, receive 1,500 hours/month, so you can run either 2 small instances of @home or, preferably, a single dual-core instance full time – or a quad-core for half the month.

Fewer, larger instances are better for @home, and here’s why: Folding@home rewards speed over quantity. The faster a work unit is completed, the more points are awarded. Consequently, you and your team (by default, the Windows Azure team) do better!

To do this, you first need a passkey. A passkey acts as a unique identifier for a user and is required to get these bonus points. In the project properties, you can add the passkey and specify the team number. 184157 is the Windows Azure team, but we allow you to change this if you’re already on another team.

Next, if you downloaded the bits already, you might need to re-download them. To know for sure, check if you have both clients in the client folder of the AtHomeWebRole project – specifically, you want to see FAH6.34-win32-SMP.exe alongside the original client. If you don’t have both executables, re-download the solution from the get started page.

Within the project properties, you can now configure the app to use whichever VM size is most appropriate for you. The larger VM will run faster and accumulate more points, but it will exhaust your hours (and get shut down) quicker. If you aren’t on a trial or don’t have the spending cap in place, monitor your usage carefully and be sure you’re staying within your plan limits – or be willing to pay for the usage!

That’s all there is to it! Make sure your storage account is configured per the instructions – if you had a deployment already, the new code will start and run automatically, as it picks up its settings from the storage account.

So what kind of results can you expect? My main user, which folds primarily on Azure but also on my home 4-core i5 box, has the following numbers. Looking at these stats, I’m pulling in an average of 146 points per WU (and 146 points per CPU). This is actually a tiny bit better than it should be, because my home machine folds at a much higher rate and contributes to this account. I then deployed some 8-core and a few 4-core instances with a different account and a different passkey: this account is pulling almost 4,000 points per WU!
If we assume they were all 8-core boxes (which they weren’t, so this estimate is conservative), dividing that by 8 gives 474 points per CPU per WU. The bottom line: CPUs working together pull significantly more points than CPUs working alone.

See? I told you this was geeky stuff. In any event, Folding@home is a fantastic project to contribute to, and hopefully a fun way to learn Windows Azure in the process. Finally, if you’re not using Windows Vista, 7, or Server 2008, or otherwise just want to download the package files directly, you can do that by reading the instructions here!
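One more aside for the code-inclined: if you’re curious how a role might pick up the passkey and team number at runtime, here’s a minimal sketch that reads them from the service configuration. The setting names here are hypothetical – check the downloaded solution for the names the project actually uses.

    // Hypothetical sketch: reading folding settings inside a role.
    // The setting names ("Passkey", "TeamNumber") are illustrative only.
    using Microsoft.WindowsAzure.ServiceRuntime;

    public static class FoldingSettings
    {
        // Unique identifier required for Folding@home bonus points
        public static string Passkey
        {
            get { return RoleEnvironment.GetConfigurationSettingValue("Passkey"); }
        }

        // 184157 is the Windows Azure team
        public static string TeamNumber
        {
            get { return RoleEnvironment.GetConfigurationSettingValue("TeamNumber"); }
        }
    }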

Webcast: Intro to @home with Windows Azure

Tomorrow (Thursday, 3/15/2012) at noon ET / 9am PT, we have our first webcast in the @home series: an introduction to the @home distributed computing project! This is the first in a series where we’ll dive into various aspects of Windows Azure – in this first webcast, we’ll keep it 100 level, discussing the platform, how to get started, and what the project is about. From the abstract page:

In this 100-level webcast, we introduce Windows Azure. We look at signing up for a new account, evaluate the offers, and give you a tour of the platform and what it’s all about. Throughout this workshop, we use a real-world application that uses Windows Azure compute cycles to contribute back to Stanford’s Folding@home distributed computing project. We walk through the application, how it works in a Windows Azure virtual machine and makes use of Windows Azure storage, and deploying and monitoring the solution in the cloud.

If you can’t make this one, be sure to check out the rest in the series by watching the @home website – we’ll be diving deeper into various features as the weeks progress, and we’ll post links to the recordings as they become available.

Launching “Learn the Cloud, Make a Difference” DC Effort

Two years ago, Jim O’Neil and I developed a quick Azure training program called “@home with Windows Azure” – a way to learn Windows Azure and have some fun contributing to a well-known distributed computing effort, Folding@home. A few months later, Peter Laudati joined the cloud team and we developed the game RockPaperAzure. RockPaperAzure was a lot of fun and is still active, but we decided to re-launch the @home with Windows Azure project because of all of the changes in the cloud since that effort in 2010.

So, having said all that, welcome to our “Learn the Cloud. Make a Difference” distributed computing project! It’s been updated, as you can see on the page – a much cleaner and nicer layout, maintaining our great stats from the 2010 effort, where a cumulative 6,200+ virtual machines completed 188k work units! (Of course, as happy as I am with those numbers, the Folding@home project has over 400k active CPUs with over 8 petaFLOPS of processing power!)

Stanford University’s Pande Lab has been sponsoring Folding@home for nearly 12 years, during which they’ve used the results of their protein folding simulations (running on thousands of machines worldwide) to provide insight into the causes of diseases such as Alzheimer’s, Mad Cow disease, ALS, and some cancer-related syndromes. When you participate in @home with Windows Azure, you’ll leverage a free, 3-month Windows Azure trial (or your MSDN benefits) to deploy Stanford’s Folding@home application to Windows Azure, where it will execute the protein folding simulations in the cloud, thus contributing to the research effort. Additionally, Microsoft is donating $10 (up to a maximum of $5,000) to Stanford’s Pande Lab for everyone who participates.

We’ve provided a lot of information to get you started, including four short screencasts that will lead you through the process of getting an Azure account, downloading the @home with Windows Azure software, and deploying it to the cloud. And we won’t stop there! We also have a series of webcasts planned to go into more detail about the application and other aspects of Windows Azure that we leveraged to make this effort possible. Here is the webcast schedule – and of course, you can jump in at any time:

3/15/2012 12pm EDT  @home with Azure Overview
3/22/2012 12pm EDT  Windows Azure Roles
3/29/2012 12pm EDT  Azure Storage Options
4/05/2012 12pm EDT  Debugging in the Cloud
4/12/2012 12pm EDT  Async Cloud Patterns

Folding@home SMP Client

Wouldn’t you know it! As soon as we get admin rights in Azure in the form of Startup Tasks and VM Role, the fine folks at Stanford release a new SMP client that doesn’t require administrative rights. This is great news, but let me provide a little background on the problem and why this is good for our @home project.

In the @home project, we leverage Stanford’s console client in the worker roles that run their Folding@home application. The application, however, is single threaded. During our @home webcasts where we’ve built these clients, we’ve walked through how to select the appropriate VM size – for example, a single core (small) instance, all the way up to an 8 core (XL) instance. For our purposes, using a small, single core instance was best. Because the costs are linear (2 single-core instances cost the same as a single dual-core), we might as well just launch 1 small VM for each worker role we need. The extra processors wouldn’t be utilized, so it didn’t matter if we had 1 quad core running 4 instances or 4 small VMs each with their own instance.

The downside to this approach is that the work units assigned to our single core VMs were relatively small, and consequently the points received were very small. In addition, bonus points are awarded based on how fast work is done, which means single core machines won’t be earning bonus points. Indeed, if you look at the number of Work Units our team has done, it’s a pretty impressive number compared to our peers, but our score isn’t all that great. We’ve processed some 180,000 WUs – that would take one of our small VMs, working alone, some 450 years to complete! Points-wise, though, it’s somewhat ho-hum.

Stanford has been putting together High Performance Clients that make use of multiple cores; until now, however, they were difficult to install in Windows Azure. With VM Role and admin startup tasks just announced at PDC, we could now accomplish these tasks inside of Azure – but it turns out Stanford (a few months back, actually) put together a drop-in replacement that is multicore capable. Read their install guide here. This is referred to as the SMP (symmetric multiprocessing) client. The end result is that instead of having (for example) 8 single-core clients running the folding app, we can instead have one 8-core machine. While it will crunch fewer Work Units, the power and point value are far superior.

To test this, I set up a new account with a username of bhitney-test. After a couple of days, this is the result (everyone else is using the non-SMP client): 36 Work Units processed for 97k points, averaging about 2,716 points per WU. That’s significantly higher than a single core, which pulls in about 100 points per WU. The 2,716 average is also quite a bit lower than what the account is doing right now, because bonus points don’t kick in for about the first dozen work units. Had we been able to use the SMP client from the beginning, we’d be sitting at a much higher rating – but that’s ok, it’s not about the points. :)
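For the curious, here’s a rough sketch of what launching the SMP client from a worker role might look like. This is illustrative only, not the project’s actual code – consult Stanford’s install guide for the real command-line switches (the -smp flag shown here is how the v6 console client is typically run in SMP mode):

    // Illustrative sketch only – not the project's actual code.
    // Launches the SMP console client; -smp tells the v6 client
    // to fold with multiple cores (see Stanford's install guide).
    using System.Diagnostics;

    var info = new ProcessStartInfo
    {
        FileName = "FAH6.34-win32-SMP.exe",
        Arguments = "-smp",        // use all available cores
        UseShellExecute = false,
        CreateNoWindow = true
    };

    using (var client = Process.Start(info))
    {
        client.WaitForExit();      // a real role would poll progress instead of blocking
    }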

@home: The Beginning

This post is part of a series diving into the implementation of the @home With Windows Azure project, which formed the basis of a webcast series by Developer Evangelists Brian Hitney and Jim O’Neil. Be sure to read the introductory post for the context of this and subsequent articles in the series.

To give even more background than in the first post … way back in late March (possibly early April), Jim had the idea to start something we originally called “Azure Across America” … not to be confused with AAA :). If you put yourself in our shoes, Azure is a very difficult technology for us to evangelize. It reminds me a little of what it was like to explain the value prop of “WPF/e” back when the first bits were made available, long before it took the name Silverlight. Azure is obviously a pay-for-use model, so what would be an interesting app to build in a webcast series? Preferably something that helps illustrate cloud computing, and not just a “hello world” application.

While we debated what this would look like (I’ll spare the details), the corp team solidified a trial account program that enabled us to get free trial accounts for attendees of the series. This changed the game completely, because now we weren’t hindered by signup costs, deployment costs, etc. In fact, the biggest challenge was doing something interesting enough to be worth your time to deploy.

That’s when we had the idea of a distributed computing project. Contributing back to a well-known distributed computing project would be interesting, useful, demonstrate cloud computing, and not be hindered by the constant fluctuation of apps going online and offline. So now that we had the idea, which project would we choose?

We also had a number of limitations in the Azure platform. Don’t get me wrong: Azure offers a number of strengths as a fully managed PaaS … but we don’t have administrator rights or the ability to remote desktop into the VMs. In essence, we needed whatever we deployed to not require admin access and to be xcopy deployable. Stanford’s Folding@home project was perfect for this. It’s a great cause, and the console version is easy to work with.

What we wanted to do was put together a site that would, in addition to providing the details, how-to’s, and recordings, show stats to track the current progress. In the next posts, I’ll go over the site and some of the issues we faced when developing the app.

@home with Windows Azure: Behind the Scenes

As over a thousand of you know (because you signed up for our webcast series during May and June), my colleague Jim O’Neil and I have been working on a Windows Azure project – known as @home With Windows Azure – to demonstrate a number of features of the platform to you, learn a bit ourselves, and contribute to a medical research project to boot. During the series, it quickly became clear (…like after the first session) that two hours was barely enough time to scratch the surface, and while we hope the series was a useful exercise in introducing you to Windows Azure and allowing you to deploy perhaps your first application to the cloud, we wanted (and had intended) to dive much deeper.

So enter not one but two blog series. This introductory post appears on both of our blogs, but from here on out we’re going to divide and conquer, each of us focusing on one of the two primary aspects of the project. Jim will cover the application you might deploy (and did, if you attended the series), and I will cover the distributed.cloudapp.net application, which also resides in Azure and serves as the ‘mothership’ for @home with Windows Azure. Source code for the project is available, so you’ll be able to crack open the solutions and follow along – and perhaps even add to or improve our design.

You are responsible for monitoring your own Azure account utilization. This project, in particular, can amass significant costs for CPU utilization. We recommend your self-study be limited to using the Azure development fabric on your local machine, unless you have a limited-time trial account or other consumption plan that will cover the costs of execution.

So let’s get started. In this initial post, we’ll cover a few items:

Project history
Folding@home overview
@home with Windows Azure high-level architecture
Prerequisites to follow along

Project history

Jim and I have both been intrigued by Windows Azure and cloud computing in general, but we realized it’s a fairly disruptive technology that can often seem unapproachable for many of you who are focused on your own (typically on-premises) application development projects and just trying to catch up on the latest topical developments in WPF, Silverlight, Entity Framework, WCF, and a host of other technologies that flow from the fire hose at Redmond. Walking through the steps to deploy “Hello World Cloud” to Windows Azure was an obvious choice (and in fact we did that during our webcast), but we wanted an example that’s a bit more interesting in terms of domain, as well as something that wasn’t gratuitously leveraging (or conspicuously ignoring) the power of the cloud.

Originally, we’d considered just doing a blog series, but then our colleague John McClelland had a great idea – run a webcast series (over and over… and over again x9) so we could reach a crop of 100 new Azure-curious viewers each week. With the serendipitous availability of ‘unlimited’-use, two-week Windows Azure trial accounts for the webcast series, we knew we could do something impactful that wouldn’t break anyone’s individual pocketbook – something along the lines of a distributed computing project, such as SETI. SETI may be the most well-known of the efforts, but there are numerous others, and we settled on Folding@home (http://folding.stanford.edu/, sponsored by Stanford University) based on its mission, longevity, and low barrier to entry (read: no login required and minimal software download).
Once we decided on the project, it was just a matter of building something in Windows Azure that would not only harness the computing power of Microsoft’s data centers but also showcase a number of the core concepts of Windows Azure, and indeed cloud computing in general. We weren’t quite sure what to expect in terms of interest in the webcast series, but via the efforts of our amazing marketing team (thank you, Jana Underwood and Susan Wisowaty), we ‘sold out’ each of the webcasts, including the last two, at which we were able to double the registrants – and then some!

For those of you who attended, we thank you. For those who didn’t, each of our presentations was recorded and is available for viewing. As we mentioned at the start of this blog post, the two hours we’d allotted seemed like a lot of time during the planning stages, but in practice we rarely got the chance to look at code or explain some of the application constructs in our implementation. Many of you, too, commented that you’d like to have seen us go deeper, and that’s, of course, where we’re headed with this post and others that will be forthcoming in our blogs.

Overview of Stanford’s Folding@home (FAH) project

Stanford’s Folding@home project (http://folding.stanford.edu/) was launched by the Pande Lab at the Departments of Chemistry and Structural Biology at Stanford University on October 1, 2000, with a goal “to understand protein folding, protein aggregation, and related diseases” – diseases that include Alzheimer’s, cystic fibrosis, BSE (Mad Cow disease) and several cancers. The project is funded by both the National Institutes of Health and the National Science Foundation, and has enjoyed significant corporate sponsorship as well over the last decade. To date, over 5 million CPUs have contributed to the project (310,000 CPUs are currently active), and the effort has spawned over 70 academic research papers and a number of awards.

The project’s Executive Summary answers perhaps the three most frequently asked questions (a more extensive FAQ is also available):

What are proteins and why do they "fold"? Proteins are biology's workhorses -- its "nanomachines." Before proteins can carry out their biochemical function, they remarkably assemble themselves, or "fold." The process of protein folding, while critical and fundamental to virtually all of biology, remains a mystery. Moreover, perhaps not surprisingly, when proteins do not fold correctly (i.e. "misfold"), there can be serious effects, including many well known diseases, such as Alzheimer's, Mad Cow (BSE), CJD, ALS, and Parkinson's disease.

What does Folding@Home do? Folding@Home is a distributed computing project which studies protein folding, misfolding, aggregation, and related diseases. We use novel computational methods and large scale distributed computing, to simulate timescales thousands to millions of times longer than previously achieved. This has allowed us to simulate folding for the first time, and to now direct our approach to examine folding related disease.

How can you help? You can help our project by downloading and running our client software. Our algorithms are designed such that for every computer that joins the project, we get a commensurate increase in simulation speed.

FAH client applications are available for the Macintosh, PC, and Linux, and GPU and SMP clients are also available.
In fact, Sony has developed a FAH client for its PlayStation 3 consoles (it’s included with system version 1.6 and later, and downloadable otherwise) to leverage its CELL microprocessor and provide performance on a 20 GigaFLOP scale. As you’ll note in the architecture overview below, the @home with Windows Azure project specifically leverages the FAH Windows console client.

@home with Windows Azure high-level architecture

The @home with Windows Azure project comprises two distinct Azure applications: the distributed.cloudapp.net site (on the right in the diagram below) and the application you deploy to your own account via the source code we’ve provided (shown on the left). We’ll call the latter the Azure@home application from here on out.

distributed.cloudapp.net has three main purposes:

Serve as the ‘go-to’ site for this effort, with download instructions, webcast recordings, and links to other Azure resources.
Log and reflect the progress made by each of the individual contributors to the project (including the cool Silverlight map depicted below).
Contribute itself to the effort by spawning off Folding@home work units.

I’ll focus mostly on this backend piece and other related bits and pieces, design choices, etc.

The other Azure service in play is the one you can download from distributed.cloudapp.net (in either VS2008 or VS2010 format) – the one we’re referring to as Azure@home. This cloud application contains a web front end and a worker role implementation that wraps the console client downloaded from the http://folding.stanford.edu/English/Download site. When you deploy this application, you will be setting up a small web site including a default page (image at left below) with a Bing Maps UI and a few text fields to kick off the folding activity. Worker roles deployed with the Azure service are responsible for spawning the Folding@home console client application – within a VM in Azure – and reporting the progress to both your local account’s Azure storage and the distributed.cloudapp.net application (via a WCF service call); a conceptual sketch of that loop appears at the end of this post.

Via your own service’s website, you can keep tabs on the contribution your deployment is making to the overall effort (stats at http://folding.stanford.edu/English/Stats; image at right above), and via distributed.cloudapp.net you can view the overall team standing (http://fah-web.stanford.edu/cgi-bin/main.py?qtype=teampage&teamnum=184157) – as I’m writing this, the project is ranked 583 out of over 184,000 teams; that’s roughly the top 0.3% after a little over two months. Not bad! Jim will be exploring the design and implementation of the Azure@home piece via upcoming posts on his blog.

Prerequisites to follow along

Everything you need to know about getting started with @home with Windows Azure is available at the distributed.cloudapp.net site, but here’s a summary:

Operating system:
Windows 7
Windows Server 2008 R2
Windows Server 2008
Windows Vista

Visual Studio development environment:
Visual Studio 2008 SP1 (Standard or above), or Visual Web Developer 2008 Express Edition with SP1, or
Visual Studio 2010 Professional, Premium or Ultimate (trial download), or Visual Web Developer 2010 Express

Windows Azure Tools for Visual Studio (which includes the SDK), with the following prerequisites:
IIS 7 with WCF HTTP Activation enabled
SQL Server 2005 Express Edition (or higher) – you can install SQL Server Express with Visual Studio or download it separately.
Azure@home source code:
Visual Studio 2008 version
Visual Studio 2010 version

Folding@home console client:
For Windows XP/2003/Vista (from Stanford’s site)

In addition to the core pieces listed above, feel free to view one of the webcast recordings or my screencast to learn how to deploy the application. We won’t be focusing so much on deployment in the upcoming blog series, but more on the underlying implementation of the constituent Azure services.

Lastly, we want to reiterate that the Azure@home application requires a minimum of two Azure roles, which tallies as two CPU-hours of Azure consumption for every hour it runs – a default charge of $0.24/hour. Add to that a much smaller amount of Azure storage charges, and you’ll find that if it’s left running 24x7, your monthly charge will be around $172! There are various Azure platform offers available, including an introductory special; however, the introductory special includes only 25 compute hours per month (equating to 12 hours of running the smallest version of Azure@home possible). Most of the units of work assigned by the Folding@home project require at least 24 hours of computation time to complete, so it’s unlikely you can make a substantial contribution to the Stanford effort without leveraging idle CPUs within a paid account or having free access to Azure via a limited-time trial account. You can, of course, utilize the development fabric on your local machine to run and analyze the application, and theoretically run the Folding@home client application locally to contribute to the project on a smaller scale.

That’s it for now. I’ll be following up with the next post within a few days or so; until then, keep your head in the clouds, and your eye on your code.
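P.S. To make the worker-role loop described above a bit more concrete, here’s a conceptual sketch. All names and details here are illustrative (including the poll interval) – the real implementation lives in the downloadable source:

    // Conceptual sketch of the Azure@home worker role pattern – names
    // are illustrative; see the downloadable source for the real code.
    using System;
    using System.Diagnostics;
    using System.Threading;

    public class FoldingWorker
    {
        public void Run(Process foldingClient)
        {
            while (!foldingClient.HasExited)
            {
                // In the real app: parse the console client's progress file
                // for percent complete, write a status row to the local
                // account's Azure table storage, and report the same data to
                // distributed.cloudapp.net via its WCF service.
                Thread.Sleep(TimeSpan.FromMinutes(15));   // poll interval (illustrative)
            }
        }
    }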

@home: Most Common Problems #1

Jim and I are nearly done with the @home with Azure series, but we wanted to document some of the biggest issues we see every week. As we go through the online workshop, many users are deploying an Azure application for the first time after installing the tools and SDK. In some cases, attendees are installing the tools and SDK at the beginning of the workshop.

When installing the tools and SDK, it’s important to make sure all the prerequisites are installed (they’re listed on the download page). The biggest roadblock is typically IIS7 – which basically rules out Windows XP and similar pre-IIS7 operating systems. IIS7 also needs to be installed (by default, it isn’t), which you can verify in Control Panel / Programs and Features.

The first time you hit F5 on an Azure project, development storage and the development fabric are initialized, so this is typically the second hurdle to cross. Development storage relies on SQL Server to house the data for the local development storage simulation. If you have SQL Express installed, this should just work out of the box. If you have SQL Server Standard (or another edition), or a non-default instance of SQL Server, you’ll likely receive an error to the effect of “unable to initialize development storage.”

The Azure SDK includes a tool called DSINIT that can be used to configure development storage for these cases. Using the DSINIT tool, you can point development storage at a default or named instance of SQL Server. With these steps complete, you should be up and running!
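For example, from the Windows Azure SDK command prompt, the DSINIT invocation looks something like this (the named instance below is just an example – substitute your own):

    REM Point development storage at the default SQL Server instance:
    DSInit /sqlinstance:.

    REM Or at a named instance:
    DSInit /sqlinstance:SQLEXPRESS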
