The community approach to funding cyberinfrastructure gives researchers at Purdue University time allocations for funding contributions specified in their various grants. This cost-effective model has expanded capacity while reducing demands on the university’s IT budget.
By James S. Almond
When Purdue University, West Lafayette, Indiana, installed a new cluster supercomputer in 2008, it wasn't the fact that it was the biggest campus supercomputer in the Big Ten that excited Vice President for Information Technology and Chief Information Officer Gerry McCartney.
Nor was it that the university staff worked together to install the computer in just four hours, nor the media attention that resulted in coverage of the event in the New York Times, on National Public Radio, and in most major computing trade publications. Although all these factors contributed to the overall success and high profile of the project, what really excited McCartney was how the supercomputer was funded.
Funding tactics rarely garner a second glance, but this supercomputer, named "Steele" for a former IT administrator, was paid for in a way that generated buzz on campus and had McCartney eager to tell the story to anyone who would listen. "Steele was a major accomplishment and one that deserved the attention it received," McCartney says. "But the community funding model is the part that was unique, and it allowed us to maintain our position as a top research university without doing gymnastics to find a way to pay for it."
Often referred to on campus as a "condo computer," Steele is "owned" by the community of participating researchers, who each receive allocated research time commensurate with his or her funding contribution—not unlike condominium assessments that are calculated on a square-footage basis.
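The condominium analogy can be made concrete with a small calculation. The sketch below is only an illustration: the article does not describe how ITaP actually computes allocations, so the function name `allocate_node_hours`, the lab names, and the dollar figures are all hypothetical. It assumes compute time is split strictly in proportion to each contribution.

```python
def allocate_node_hours(contributions, total_node_hours):
    """Split total_node_hours in proportion to each funding contribution.

    contributions: dict mapping a (hypothetical) researcher name to dollars paid.
    Assumption: allocation is strictly proportional to dollars contributed.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contributions must be positive")
    return {
        name: total_node_hours * amount / total
        for name, amount in contributions.items()
    }


if __name__ == "__main__":
    # Hypothetical figures, not taken from the article.
    shares = allocate_node_hours({"lab_a": 2000, "lab_b": 6000}, 800)
    for name, hours in shares.items():
        print(f"{name}: {hours:.0f} node-hours")
```

In this made-up case a lab that pays a quarter of the total receives a quarter of the node-hours, much as a condominium owner is assessed by square footage.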
Equipment purchases, including computers, are typically included in a budget a researcher submits as part of the grant proposal. In this case, a portion of the funds specified for equipment purchases was used to buy a share of the Steele supercomputer. In only one case did the researcher have to get permission from the funder because of the way the grant was written.
This model maximizes the efficiency and use of the supercomputer and gives each researcher much more computing time than his or her grant funds would have allowed otherwise. We predict that this funding approach will become the model for supercomputer purchasing at universities across the nation.
Form and Function
The supercomputer, named for John Steele, former Purdue staff and faculty member, is a commodity cluster computer made up of 812 Dell servers and is capable of performing 60 trillion operations per second.
In the November 2008 Top 500 Supercomputer Sites rankings (www.top500.org), Steele placed 104th, compared with 319th for Purdue's previous computer on the November 2007 list, and ranked first among Big Ten university systems on the list.
And, while rankings are significant, other productivity and access factors are more important to Purdue's leadership and staff. For example, thanks in large part to the new supercomputer, Purdue increased its aggregate peak computing capacity from 14 teraflops in 2006 to more than 100 teraflops in 2008 (a teraflop is one trillion floating-point operations per second).
"We were able to do this without any notable increase in our budget for research computing," McCartney says. At the same time, the expanded computer capacity enables the university's scientists and engineers to stay at the forefront of discovery in crucial areas such as cancer research, global warming, and affordable energy. In fact, there is enough excess capacity, which computer scientists call "opportunistic cycles," that the extra computing time is made available via the national TeraGrid (an open scientific infrastructure of computing resources at 11 partner sites) through a program funded by the National Science Foundation.
Collaborative Cost Sharing
To begin the move to a shared model, Information Technology at Purdue (ITaP), the central IT organization on campus, identified the heaviest users of computing resources. These were researchers whose work demanded thousands of hours of computing time and who often had to look outside the university for additional resources with which to get their work done.
ITaP staff told several dozen of these top researchers that ITaP was about to retire the campus supercomputers and asked if individuals would be interested in participating in the joint purchase of additional nodes, or servers, to make the incoming supercomputer even larger.
ITaP had already planned to retire two supercomputers on campus, and had budgeted funds for the life-cycle replacement of those two cluster computers. With researchers added into the purchase plan, the faculty members were able to buy more computing power per dollar than they could purchase as individuals.
Basically, the researchers could buy nodes using research grant funds. The obvious selling point was that with a group purchase the researchers could get the hardware at a lower cost than that of the standard "university price" they would typically pay. The funding organizations would also benefit because researchers would be able to buy more with the funds they were given, which translates into more research results in less time.
Ashlie Martini, an assistant professor of mechanical engineering and one of the faculty who helped fund the project, uses the computer's power to study friction at the molecular level. "ITaP was able to get a much better price for the processors, because they were buying so many—much better than I could have if I were just buying a stack of computers to put in my lab," she says.
"We initially estimated that the purchase would be about $1 million," says McCartney, "and we had commitments for $750,000 from faculty. We asked several vendors to give us a bid based on these numbers."
The first round of bids was better than the standard university price that faculty were accustomed to paying for computing hardware. "This encouraged even more faculty to participate, and we did a second bid request for $1.6 million," McCartney says. "Initially, we placed an order for $1.8 million worth of computing equipment, and as faculty support grew, the final purchase was $2.4 million."
That final negotiated price was 25 percent lower than the standard university price, which is already significantly less than the list price for this equipment.
"The deal was so good that faculty members knew they would be foolish to pass up participation," notes McCartney.
In the end, a group of more than 25 university scientists and engineers pooled research grant funds to contribute to the purchase of the supercomputer, with less than 25 percent of the purchase funded by the university's IT budget.
The central IT office paid for cabinetry and networking, while the faculty paid only for the actual nodes, which ranged in price from $2,000 to $2,900 each, depending on the configuration. Researchers purchased differing numbers of nodes, depending on the work they needed to do and their respective budgets. The purchases ranged from that of a researcher who purchased only a single node to that of an electrical engineer who purchased 248 nodes.
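To see what a 25 percent discount means for an individual grant, consider the rough arithmetic below. It is a sketch under stated assumptions: the $60,000 equipment budget is hypothetical, the $2,000 node price is taken to be the negotiated price (the article does not say so explicitly), and the function names are invented for illustration.

```python
def implied_standard_price(negotiated_price, discount=0.25):
    """Back out the pre-discount price, assuming a flat percentage discount."""
    return negotiated_price / (1 - discount)


def nodes_affordable(budget, price_per_node):
    """Whole nodes a fixed budget can buy at a given per-node price."""
    return int(budget // price_per_node)


if __name__ == "__main__":
    budget = 60_000      # hypothetical equipment line in a grant
    negotiated = 2_000   # low end of the article's per-node range (assumed negotiated)
    standard = implied_standard_price(negotiated)

    print(f"implied standard price: ${standard:,.2f}")
    print("nodes at standard price:", nodes_affordable(budget, standard))
    print("nodes at negotiated price:", nodes_affordable(budget, negotiated))
```

Under these assumptions the same budget buys 30 nodes rather than 22, which is the sense in which funders get more research for the money.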
In addition to the reduced cost, faculty benefit from this shared computer model in two other ways:
1. Researchers often don't use all of their available computer capacity, allowing for others to tap into it. "In fact, it's common for CPUs [central processing units] to be in use only about 15 percent of the time," McCartney says. "For Steele this means that there are always extra computing cycles available when the researchers need them. In fact, we have enough cycles that we provide extra opportunistic cycles to researchers nationwide via the TeraGrid."
2. Researchers are spared the cost, time, and aggravation of buying, installing, and maintaining their own computers. Martini says she especially appreciates this beginning-to-end arrangement. "ITaP completely took care of the purchasing, dealing with the vendors, and installing the equipment," she says. "The department maintains the cluster, so my graduate students can be doing what they want to be doing, which is more research."
Of course, resources like Steele are integral to the research of Purdue faculty members who helped pay for the cluster, such as Gerhard Klimeck, professor of electrical and computer engineering. He models the next two or three generations of nanoscale electronic devices, allowing their properties to be understood long before they're ever fabricated.
"You can get what you paid for and more, because there are extra computing cycles available when others aren't using them," Klimeck says. "When you don't need all of your cycles, you can share with others so they can benefit from the community investment. Having a local machine that is of a significant size lets us benchmark and research and test before we go to national computing resources like the TeraGrid or the Track 2 [which is even more powerful than the TeraGrid]."
Rudolf Eigenmann, professor of electrical and computer engineering and interim director of Purdue's Computing Research Institute, says Steele will be used for a wide variety of research activities. "Faculty using this computer will be designing new drugs and materials, modeling weather patterns and the effects of global warming, engineering future aircraft, and making many more discoveries," Eigenmann says. "High-performance computing is essential to conducting research and development, so having one of the world's largest supercomputers here on campus will be a real benefit to our faculty."
Business Office Involvement
The actual purchase process was managed by the IT business office, which is a part of the university's overall business services function. The office coordinated the purchase with the 25 faculty members, their departmental business offices, and the office of sponsored programs to make sure the purchase was allowable under the terms of each faculty member's research grant or start-up funds package. From there we followed normal procedures for purchases involving grant funds.
In doing this, the business office discovered that one of the researchers' grants didn't actually allow for such a purchase within the budget. Fortunately, we were able to request and receive special approval from the program officer—but this did slow down the process for everyone involved, so we modified our procedures to make sure this wouldn't happen again in future purchases.
The first time around, we relied on the participating faculty to notify their respective departmental business offices, but we subsequently learned that the notification didn't happen in every case. So now someone from business services notifies the departmental business office to make sure that this step takes place.
Because some of the nodes cost less than $2,500, the IT business office coordinated with the property accounting department to make sure that the nodes were tagged as capital equipment. Otherwise, facilities and administrative (F&A) overhead could have been charged on them. That was significant, because F&A overhead on federal grants is 52 percent, a number that would take a huge chunk out of the available grant funds if the equipment were mistakenly charged in this way.
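The size of that exposure is easy to estimate. The sketch below uses the 52 percent rate and a $2,000 node price from the article; the function name `cost_to_grant` and the simplifying assumption that overhead is a flat rate applied to the full price of any item not tagged as capital equipment are illustrative only, not a description of Purdue's accounting rules.

```python
FA_RATE = 0.52  # F&A overhead rate on federal grants cited in the article


def cost_to_grant(node_price, tagged_as_capital, fa_rate=FA_RATE):
    """Total charge to the grant for one node.

    Assumption: overhead is fa_rate * price when the item is NOT tagged as
    capital equipment, and zero when it is.
    """
    overhead = 0.0 if tagged_as_capital else node_price * fa_rate
    return node_price + overhead


if __name__ == "__main__":
    price = 2_000  # low end of the node price range in the article
    print(f"tagged as capital:  ${cost_to_grant(price, True):,.2f}")
    print(f"not tagged:         ${cost_to_grant(price, False):,.2f}")
```

On these assumptions a mis-tagged $2,000 node would cost the grant roughly $3,040, so the overhead alone would consume about half the price of another node.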
Putting It All Together
After the computer equipment order was on its way, ITaP was faced with a couple of concerns. First, replacing the existing supercomputers would create an interruption in the scientific computations being run, as well as downtime in the research. Second, the research data center was located in the basement of Purdue's mathematics building, where there wasn't enough room to unload the many boxes of new equipment.
John Campbell, associate vice president for teaching and learning technologies at Purdue, came up with an ingenious solution to both problems, even if it wasn't perceived that way initially by his staff. They would unpack the boxes in a parking lot next to the building and construct the mammoth computer in a single day.
"Our staff thought we were insane when we challenged them to build such a big computer in one day," Campbell says. But after reviewing Campbell's plans, the staff soon began to embrace the idea as a realistic one, and there was real excitement about the project. More than 200 employees gathered on May 5, 2008, to help build the massive machine, which was about the size of a semitrailer when installed.
More than 200 volunteers from IT organizations across Purdue University's campus took shifts scheduled to assemble the semitrailer-sized supercomputer within only four hours. They easily met the deadline.
To generate interest on campus for group assembly of the equipment, Campbell's group of organizers created a movie trailer called Installation Day, a spoof of the movie Independence Day. (Access the video on YouTube at www.youtube.com/watch?v=wVzThRN4QJI.)
This promotional effort prompted the 200 volunteers from IT organizations across the university to participate. The supercomputer vendor contributed breakfast and lunch for the volunteers.
The first shift of workers was scheduled to begin unpacking boxes at 7 a.m., but many employees arrived an hour earlier, eager to begin working. The Purdue workers were surprised by a team of four IT specialists from in-state rival Indiana University, Bloomington, who had heard about the project, made an early morning drive, and arrived to help construct the supercomputer.
By 11 a.m. the installation was essentially complete except for a few nodes that were intentionally held back to be installed at the noon dedication.
"The assembly was finished much faster than we expected," says McCartney, "and by noon we were doing science. The staff was enthusiastic, the weather was great, and there were no problems installing the hardware or software."
By 1 p.m. more than 500 of the 812 nodes that make up the supercomputer were already running 1,400 research jobs from across campus.
"We discovered that a build like this leverages the commodity nature of cluster computing, by using standard computing parts," McCartney says. "By using commodity computer servers to build our supercomputer, we didn't have to fly in engineers or hire specialized technicians. We were able to do it with our own IT staff in about four hours."
Expanding the Condo Complex
Seeing the effectiveness of the purchase and use of Steele, many more Purdue researchers wanted to join in a similar arrangement for high-performance computing resources. So, we plan to build another supercomputer later this year. This one will be named "Coates," after Ben Coates, former head of the College of Electrical and Computer Engineering.
The new computer will be about 50 percent larger than Steele—with 1,200-plus nodes versus Steele's 812 nodes—but it is the business processes that will see the most change.
Although our first community cluster purchase worked well, it's clear that improved communication could make the experience even smoother. We've put several new procedures in place that will enhance communication, particularly with faculty who may be interested in participating in the purchase. The goal is to minimize surprises for everyone.
"With Steele, our faculty pretty much knew what they would be getting, because they'd worked with ITaP so much before," Campbell says. "With Coates, which is targeted to faculty from departments across campus with whom we've had less interaction, researchers won't be as familiar with the process or the benefits."
The communication plan includes a Web site (www.rcac.purdue.edu/userinfo/communityclusters.cfm) and an e-mail list for faculty that keeps them up-to-date about the Coates purchase.
The business units at Purdue are also working to stay in closer communication. The ITaP business office, for example, will reach out directly to departmental business officers and individual researchers to determine whether the grant funds are available and allowable. ITaP is committed to avoiding the unexpected obstacles that arose during the previous supercomputer purchase.
McCartney says that some administrators were surprised when he began talking so soon about a second supercomputer. "It's based on faculty need and making the best use of our resources," he says. "I told them that not only are we doing this again this year, but as long as the demand is there, we might be buying a supercomputer every year."
Despite a 2 percent reduction in the overall university budget, plans for the next supercomputer purchase are moving forward. The societal problems for which we seek answers, and the computation those discoveries require, continue.
JAMES S. ALMOND is vice president of business services and assistant treasurer, Purdue University, West Lafayette, Indiana.