Data visualisation: See what you're doing?
Wednesday, 11 December, 2002
The growing sophistication of data visualisation applications has been a boon for pharmaceutical and biotech researchers across the life and chemical sciences spectrum. Visualisation platforms help computational chemists to model molecules in drug discovery environments and genomic researchers to stitch useable information together from a confusing tangle of data held in different gene sequence databases.
Virtual reality environments that help researchers 'see' the interaction of candidate drug molecules with target receptors are becoming a powerful tool for drug companies. An example is US-headquartered Janssen Pharmaceutica, which believes its use of data visualisation technology fosters new ideas in drug design and stimulates interdisciplinary collaboration. The techniques are helping it to conduct extensive tests on up to 100,000 chemical compounds daily, according to the company.
In the genomics database space, "the amount of data is so vast, you can't do anything without visualisation tools," says veteran US bioinformatician Dr Dan Davison. Associate director of bioinformatics and applied genomics for the Bristol-Myers Squibb pharmaceutical research laboratory in New Jersey, Davison builds tools which view federated gene sequence, transcription expression profiling and protein databases.
Bristol-Myers Squibb, like other big pharmaceutical companies, has developed in-house visualisation tools to give its drug discovery researchers access to the widest possible range of genomic content. They range from bar charting programs (BART) to nearest neighbour sequence searches (KNN). Along with publicly available tools, they aren't as graphically sophisticated as the 3D rotational programs which form part of the structural biologists' armoury, but they clearly fall in the category of visualisation software.
Rosetta Biosoftware and Silicon Genetics are two major commercial suppliers of industrial-strength visualisation programs for researchers dealing with genomic data. Rosetta is a wholly-owned bioinformatics subsidiary of Merck that provides solutions for half of the top 10 pharmaceutical companies. Its flagship product is the Rosetta Resolver System, an enterprise-level gene expression data analysis system. Silicon Genetics' GeneSpring is part of a suite of programs for analysing and visualising high-throughput microarray gene expression data. As well as data visualisation, it includes features for data normalisation, statistical analysis, clustering and pattern matching.
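To give a rough sense of what "normalisation" and "pattern matching" mean for expression data, here is a minimal sketch. It is not how GeneSpring or the Rosetta Resolver System work internally. It assumes a simple z-score rescaling and a Pearson correlation match, and the gene names and expression values are invented for illustration.

```python
import math

def zscore(values):
    """Rescale one gene's expression profile to mean 0 and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    sd = math.sqrt(var)
    if sd == 0:
        # A completely flat profile carries no shape information.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]

def correlation(a, b):
    """Pearson correlation of two profiles, computed from their z-scores."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)

def most_similar(query, profiles):
    """Return the name of the profile whose shape best matches the query."""
    return max(profiles, key=lambda name: correlation(query, profiles[name]))

# Hypothetical expression levels across four samples (made-up numbers).
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [40.0, 30.0, 20.0, 10.0],
    "gene_c": [5.0, 5.5, 5.2, 5.1],
}
query = [10.0, 20.0, 30.0, 40.0]
best = most_similar(query, profiles)
```

Because each profile is rescaled before comparison, a gene measured at low absolute levels can still match a query measured at much higher levels, provided the two rise and fall together across the samples.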
As well as top-ranked commercial offerings and the pharmaceutical companies' proprietary in-house aids, a multitude of other graphics-based tools have been created for and by researchers with less intense genomic and proteomic data requirements. "They almost always run on PCs and tend to be Java-based," says Davison.
Waiting for demand
When it comes to the adoption of visualisation technology, however, the Australian reality departs from the global norm. The local arm of the dominant US vendor of computer visualisation platforms, SGI (Silicon Graphics), is still waiting for demand from the commercial life sciences organisations to ramp up in Australia. The big pharma customers who fill the top end of the market typically channel any equipment orders through their overseas parents, says Alan Ryner, marketing and business development manager for SGI in Australia and New Zealand.
Occupying the bottom end of the market, and not yet ready for the sort of technology SGI has on offer, are the small embryonic biotechs. That leaves a big gap in the middle, populated by mid-size biotechs who have the revenues and the need for visualisation software but have not yet taken the plunge.
"They probably should embrace this sort of technology but they aren't doing so yet," says Ryner. "As time goes on, we expect they will evolve, but at the moment we aren't seeing demand from the mid-size organisations such as ResMed." SGI supports a number of specialised centres around Australia which give commercial customers shared access to levels of visualisation hardware and software they could not afford to install or license on an individual basis. In SGI parlance, they are known as Reality Centres, and SGI's flagship in Australia is sited at RMIT in Melbourne where it is called the Interactive Information Institute.
The centre is co-located with the Victorian Partnership for Advanced Computing (VPAC) and acts as the visualisation node within VPAC's network for clients with special graphics requirements. Such customers are not yet springing up in any numbers within the ranks of Australian bioindustry, according to VPAC business development manager Bill Yeadon. "It is very early days in this area for Australian life sciences and its demand for visualisation is at a very low level," Yeadon says. "Most organisations usually come in to us unaware of what is available in visualisation. Because we have the virtual reality centre as part of our facility, we are trying to make them conscious of how valuable a tool it can be."
Until now the message has been a hard sell, although VPAC has scored success with the Victorian Institute of Animal Science, part of the Department of Natural Resources and Environment. "They initially saw no real need until we took some of our data and showed them how [visualisation] works," says Yeadon.
In general, he says, the advantages bestowed by visualisation "are something people can't imagine until they see their own data sets up there [in visual form]." Yeadon believes VPAC will experience a gradual uptake of its visualisation services over time due to the demands imposed by microarray technology and proteomics research. Computational drug design is an area VPAC will be targeting with its computational and visualisation facilities. VPAC has already run two workshops to acquaint academia and small biotechs with how it can speed aspects of the drug discovery process, particularly in screening and toxicology areas.
"Basically we are now developing a platform as a research sharing facility for academic and industry organisations to take advantage of our supercomputer and software tools at minimal cost," says Yeadon. "Tied into that in a big way are the visualisation aspects."
VPAC is evaluating a number of software packages, including Materials Studio from Accelrys and a UK package called Aneda. "We are trying to see the benefits and contradictions of the software and also see what integration work may be required," Yeadon says.
Who needs it?
A core scientific constituency for computer-aided visualisation systems is composed of computational chemists and structural biologists who need to model molecular-level interactions. Among them is Dr Dave Winkler, senior principal research scientist with CSIRO Molecular Science and a molecular designer who focuses on bioactive compound design.
Well-versed in traditional molecular modelling techniques, Winkler is also seeing his work shift into the areas of genomics, proteomics, and diagnostics. His research requires the entire gamut of hardware platforms, from NEC and Cray vector supercomputers to Silicon Graphics workstations to personal computers. The software is a mix of commercially supplied and self-developed packages.
"There is a lot of public domain software for gene sequence database searching and management but visualisation software tends to be more expensive and proprietary," Winkler says. "Looking at the specific interactions of molecules at the atomic level is much more visual than doing genomic sequence comparisons or alignments, so you need more comprehensive software and hardware."
The expense means that many research organisations "can't get everything they'd like," Winkler says. In-house developed software tends to be for specialised work where no packages are currently available. Even if external packages can be afforded, their provenance is often complicated and their ongoing support can become an issue. Winkler's team, for example, relies on a package called UniChem, which does back-end calculations and "very good front-end visualisation," he says.
It was originally developed to run on Cray supercomputers, then was acquired by SGI and finally by Accelrys. The latter has announced it will continue to support licence-holders but will not develop the program any further. That has left CSIRO and other organisations who rely on UniChem with the task of trying to get it ported to other supercomputers and workstations so it can be kept going.
Different levels of hardware and software come into play for different tasks. Supercomputers are harnessed to calculate molecular properties using molecular orbital methods, a time- and power-consuming task based on first-principles calculations. To investigate the docking activity of small molecule ligands and receptors of peptides and proteins, intermediate systems such as high performance SGI or Sun Microsystems graphics workstations can fill the bill.
This intermediate layer probably sees the greatest use of off-the-shelf commercial software packages. For this level of drug design applications, the list of visualisation vendors is led by US companies like Accelrys and Tripos. The lowest layer is the PC, where much of the software is created in-house and is focused on distilling information out of very large data sets.
Winkler's group has developed a software package built around algorithms which have been patented and for which CSIRO is looking for a technology-licensing partner. Called MolSAR, it is a predictive computational tool which involves data modelling and data mining using neural networks, Bayesian statistics, and special mathematical expressions for molecular properties. CSIRO believes it is applicable to a wide range of human drug discovery and development problems with potential applications in agrochemical and veterinary drug discovery as well.
Beyond visualisation
Taking technology a step beyond visualisation is another CSIRO group.
The Interactive Modelling and Visualisation Systems group is exploring advanced virtual environments which allow users not only to visualise multi-dimensional datasets and models but also to touch them. The Canberra-based team, under research group leader Duncan Stevenson, is creating applications using force-feedback techniques.
Its Haptic Workbench (haptic means 'touch' in Greek) gives users the sense of physically touching and moving a virtual object on a computer screen. The feedback provides both a sense of the object's surface feel and its weight.
The group's experience in data visualisation goes back 20 years and it has been working with 3D data displays for the last decade. It contracts its expertise to industry and is currently helping develop a 3D virtual surgery simulation in conjunction with Perth telemedicine company MedicVision.
The virtual surgery application will be used to tele-train surgeons for hazardous and difficult procedures such as brain surgery and spinal operations. The system allows networking so that doctors in different locations can simultaneously work in the same virtual anatomical space -- an impossibility in real-life training.
On the biotech front, an area the team may enter in future relates to the study of protein structures at the molecular level. It foresees the arrival of molecular modelling software which generates not only visual representations of molecular structures but a sensation of the attractive and repulsive forces between them.
For researchers engaged in evaluating the docking forces between molecules, the advantages of that type of application could prove compelling. Stevenson sees it as an emerging opportunity but concedes that his team's ability to produce such software using its Haptic Workbench suite is still two to three years away.
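The attractive and repulsive forces described above can be illustrated with a textbook model. The sketch below uses the Lennard-Jones potential, a standard simplified description of the force between two neutral particles. It is an assumption chosen for illustration, not the method used by the CSIRO group or by any docking package, and the parameter values and the max_force limit are hypothetical.

```python
def lennard_jones_force(r, epsilon=1.0, sigma=1.0):
    """Radial force between two particles at separation r.

    Derived from U(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6).
    A positive result pushes the particles apart; a negative one pulls
    them together. epsilon and sigma are illustrative values only.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

def clamp_for_device(force, max_force=5.0):
    """Limit the force to what a force-feedback device could render.

    max_force is a hypothetical hardware limit, not a real specification.
    """
    return max(-max_force, min(max_force, force))

# Too close: strong repulsion. Further out: weak attraction.
close = lennard_jones_force(0.9)
far = lennard_jones_force(1.5)
# The force changes sign at r = 2**(1/6) * sigma, the equilibrium separation.
balance = lennard_jones_force(2 ** (1 / 6))
```

A user moving a virtual molecule through such a force field would feel a gentle pull as it approached its partner and a sharp push if it were forced too close, which is the kind of cue the article suggests could help in judging docking.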