Grid to speed analysis of genomic heritage
Monday, 21 June, 2004
When a joint research team split between Monash University and the University of Sydney recently began calculating its computing requirements for a new genotype analysis project, it soon hit upon a daunting reality: the project was going to require 1279 hours (more than seven weeks) of non-stop computation on the single computer the team had.
The research team, headed by Sydney University's Lars Jermiin and Monash's Paul Hertzog, is modelling the evolution of Type 1 interferons involved in inflammation, some cancers, and other conditions. By understanding how the proteins have evolved, researchers hope they can get a better sense of how the current molecules function and what commonalities exist between different types of molecules.
Having worked out how long the job would take, the team realised it simply didn't have the time to sit and wait for that desktop PC to plod through the work. "This is far, far too long for people in research to be competitive," says Jermiin.
Discussions with Sydney University IT staff eventually led the researchers to IBM, which has been investing heavily in life sciences research and was quick to suggest a better solution.
Calling upon a partnership with distributed computing company United Devices (www.ud.com), IBM, together with researchers at Sydney University's Biological Informatics and Technology Centre (SUBIT) and Monash University, suggested the team use the resources already at hand: standard Windows PCs.
Other universities have previously built massive computing clusters out of Linux-based PCs, but such clusters require additional skills and the reconfiguration of working desktop systems. Utilising the United Devices toolkit, the team realised it could take advantage of the spare computing power of university computers, building a computing 'grid' in which a central node co-ordinates jobs between dozens or hundreds of slave computers.
"You can harvest machines that spend 70 per cent of their time doing nothing but running screen savers," says Albert Zomaya, a professor in the Sydney University School of Information Technology and a key player in the work to modify UD's technology to suit the genotype project at hand. "The idea was to develop software that could be run at a number of sites, so the manager can sit in Sydney while the computers sit in different locations, do their computations, and send the data back to us."
The beauty of the UD toolkit is that the computers belonging to the grid don't need to be located in physical proximity; this means computers in both Melbourne and Sydney can contribute to the research, receiving new jobs from a central controlling system over the internet and broadcasting their results whenever they finish.
There are, of course, unknowns: there's no way to know how often the remote computers will be used or free, and therefore no way to tell exactly how quickly the grid will operate. The key thing about grid computing, however, is that this unpredictability doesn't matter: the controlling software simply uses whatever resources are available to it at any given time.
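The idea that the controller simply uses whatever machines happen to be free can be illustrated with a small pull-based work queue. This is only a sketch under stated assumptions, not the United Devices software: the Controller class, the run function and the machine names are all hypothetical, and desktop availability is modelled as a hand-written list of which machines are idle in each time step.

```python
from collections import deque


class Controller:
    """Hypothetical central node: holds pending jobs and collects results."""

    def __init__(self, jobs):
        self.pending = deque(jobs)
        self.results = {}

    def request_job(self):
        # An idle desktop asks for work; None means nothing is left.
        return self.pending.popleft() if self.pending else None

    def submit(self, job, result):
        self.results[job] = result


def run(controller, availability, compute):
    """Hand out jobs to whichever machines are idle in each time step.

    availability is a list of time steps, each a list of idle machine names
    (an assumed stand-in for unpredictable desktop usage).
    Returns a dict mapping each job to the machine that completed it.
    """
    completed_by = {}
    for idle_machines in availability:
        for machine in idle_machines:
            job = controller.request_job()
            if job is None:
                return completed_by
            controller.submit(job, compute(job))
            completed_by[job] = machine
    return completed_by


# Example: two machines that are idle at irregular times share five jobs.
controller = Controller(range(5))
availability = [["syd-1"], [], ["mel-1", "syd-1"], ["mel-1"], ["syd-1", "mel-1"]]
completed_by = run(controller, availability, lambda job: job * job)
```

Because machines pull work only when they are idle, the controller never needs to predict availability; a busy desktop simply asks for fewer jobs.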
UD has successfully demonstrated the technology through a screen saver that has used the internet to research cancer and smallpox molecules on a global scale, but the current project is distinctive in adapting the technology to a specific academic research project.
Zomaya and the other members of the technical team are nearly finished developing the algorithms necessary to split up, distribute and manage the jobs. Groups of calculations are farmed out to individual desktops, with each group sent to several computers to ensure they are completed and that results can be cross-checked.
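The article does not describe the team's algorithms, but the general pattern (split the work into groups, send each group to several machines, then compare the answers) can be sketched as follows. Every name here is hypothetical, the round-robin assignment and majority vote are assumptions chosen for illustration, and results are assumed to be directly comparable values.

```python
from collections import Counter


def split_into_groups(tasks, group_size):
    """Split a list of tasks into consecutive groups of at most group_size."""
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


def assign_redundantly(groups, machines, copies):
    """Assign each group index to `copies` distinct machines, round-robin."""
    if copies > len(machines):
        raise ValueError("need at least as many machines as copies")
    assignment = {}
    for g in range(len(groups)):
        start = (g * copies) % len(machines)
        assignment[g] = [machines[(start + k) % len(machines)]
                         for k in range(copies)]
    return assignment


def cross_check(results):
    """Return the value reported by a strict majority of machines, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


# Example: ten tasks in groups of four, each group sent to three machines.
groups = split_into_groups(list(range(10)), 4)
assignment = assign_redundantly(groups, ["pc-a", "pc-b", "pc-c", "pc-d"], 3)
agreed = cross_check([42, 42, 41])   # two of three machines agree
disputed = cross_check([1, 2, 3])    # no majority, so the group would be re-run
```

Sending each group to several machines costs extra computation, but it means a desktop that is switched off or returns a bad result does not stall or corrupt the overall job.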
Within weeks, the team expects to kick off a six-month pilot test of the technology that will involve 100 desktop PCs. Assuming everything works as expected, the end result should be a computing grid that churns through the calculations far faster than the single desktop PC could. A successful demonstration of the grid's capabilities should give ammunition to technical staff seeking to expand the job dispatch application across Sydney and Monash universities. Although desktops within faculties of IT are the most obvious members of an evolving computing grid, broader in-principle support from other academic departments could see dozens or hundreds of additional machines added to the cluster.
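A rough, back-of-the-envelope estimate gives a sense of the potential gain. The 1279 hours, the 100 PCs and the 70 per cent idle figure come from the article; everything else is an assumption made purely for illustration: that each desktop is as fast as the original computer, that the work divides perfectly with no network or scheduling overhead, and that each group is sent to three machines (the article says only "several").

```python
def estimated_grid_hours(single_pc_hours, machines, idle_fraction, copies):
    """Idealised wall-clock estimate for a grid of equally fast machines.

    Total work is multiplied by `copies` (redundant execution) and divided
    by the effective number of machines available at any moment.
    """
    effective_machines = machines * idle_fraction
    return single_pc_hours * copies / effective_machines


# Figures from the article, plus an assumed redundancy of three copies.
hours_with_redundancy = estimated_grid_hours(1279, 100, 0.7, copies=3)
hours_without_redundancy = estimated_grid_hours(1279, 100, 0.7, copies=1)
```

Under these assumptions the job would take roughly 55 hours with threefold redundancy, or about 18 hours with none, compared with more than seven weeks on one machine. Real figures would depend on the actual desktops and how busy they are.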
Adapting the software to other departments' needs will advance academic understanding of grid computing, as well as giving biotechnology researchers a much-needed boost in computing power. "The idea is to use this framework to advance science and other disciplines," says Zomaya. "We're seeing more of this convergence [between science and IT] happening as time goes by, and creating new opportunities.
"Many disciplines that never needed to use computers are coming to us, saying 'we have massive amounts of data that we need to analyse', but they're not willing to spend hundreds of thousands of dollars to buy new machines. They've got these machines sitting around, and by having smart software on top of those PCs it could create the smart solution. Once you develop an understanding of the environment you can break it down, map it onto the [grid] environment, and get this performance improvement."