Big new home for big data: $30M computing center nears opening

Bayview site will provide more digital storage space to scholars from Johns Hopkins, University of Maryland

Image caption: The new high-performance computing center in Baltimore will be available for use by big data researchers at Johns Hopkins University and the University of Maryland. It is one of the largest academic facilities of its kind in the nation.

Credit: Will Kirk / Johns Hopkins University

Whether they're studying distant galaxies or deadly diseases deep within human cells, big data researchers increasingly need more powerful computers and more digital storage space. To address this demand, two Maryland universities are preparing to open one of the nation's largest academic high-performance computing centers, located at the edge of the Johns Hopkins Bayview Medical Center campus in East Baltimore.

Supported by $30 million in state funding, the Maryland Advanced Research Computing Center (MARCC, pronounced "MAR-see") is expected to provide state-of-the-art digital processing power to a wide array of researchers at Johns Hopkins University and the University of Maryland, College Park. Final testing is under way at the facility, which is expected to be functional by the end of the month.

Thanks to speedy fiber-optic cable connections to the participating campuses, university researchers working with big data will never need to leave their labs or offices to tap into the new computing center.

"Everyone is going to be able to access the new facility on a remote basis," said Jaime Combariza, a Johns Hopkins computational chemist who became director of MARCC in June of last year. "MARCC allows all of Johns Hopkins and the University of Maryland to centralize their computing power."

For participating researchers, he said, the arrangement should lead to significant cost savings and greater efficiency. Instead of requiring individual research groups to use time, money, and space to create their own high-performance computing centers, all participants will share the costs of cooling, networking, and running the single center.

The shared equipment within the nondescript 3,786-square-foot building will be capable of delivering a hefty digital punch. The setup includes more than 19,000 processors and 17 petabytes of storage capacity—that's 17 million gigabytes.

Access to this computing power will be granted to Johns Hopkins researchers from the university's School of Medicine, Bloomberg School of Public Health, Krieger School of Arts and Sciences, and Whiting School of Engineering, and to scholars from the University of Maryland, College Park.

"The deans will have the decision-making power over which of their researchers will be able to use the facility," Combariza said. "The computing resources will be available to researchers from all of these schools."

The users are expected to include astrophysicists who grapple with vast amounts of celestial data from powerful telescopes. Scholars from Biophysics and Materials Science have also inquired about using MARCC for their research.

Alex Szalay, a professor in the Krieger School's Department of Physics and Astronomy who pioneered the use of big data in sky-mapping projects, has also begun to apply his expertise to biomedical research. One new project, slated to run on MARCC computers, involves newly designed software, written in collaboration with scientists from the McKusick-Nathans Institute of Genetic Medicine and the Department of Computer Science. The software is designed to perform a demanding genetics task. With MARCC on board, computations that used to take a day to complete will finish in much less than an hour, enabling Szalay's team to crunch several hundred genomes' worth of data in a matter of days.

Big data projects have also become common in biology and medicine, and they too require significant computing resources. For example, just one experiment comparing gene activity in two types of tissue generates 30 to 40 gigabytes of data. A simulation of the workings of the heart generates one terabyte of data. A single MRI or CT scan creates one to two terabytes. MARCC is expected to speed up the completion of studies involving such information.
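For a rough sense of scale, the dataset sizes quoted above can be set against the center's 17 petabytes of storage. The short Python sketch below is illustrative arithmetic only: it assumes decimal units (1 petabyte = 1,000,000 gigabytes), takes the upper end of each quoted range, and ignores file system overhead, replication, and every other demand on the storage. The variable and function names are hypothetical.

```python
# Illustrative arithmetic only; decimal units assumed (1 PB = 1,000,000 GB).
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000

storage_gb = 17 * GB_PER_PB  # 17 petabytes, as quoted in the article

# Upper-end dataset sizes quoted in the article, in gigabytes.
dataset_sizes_gb = {
    "gene activity experiment": 40,
    "heart simulation": 1 * GB_PER_TB,
    "MRI or CT scan": 2 * GB_PER_TB,
}


def datasets_that_fit(total_gb: int, size_gb: int) -> int:
    """Return how many whole datasets of size_gb fit in total_gb."""
    return total_gb // size_gb


if __name__ == "__main__":
    for name, size_gb in dataset_sizes_gb.items():
        print(f"{name}: {datasets_that_fit(storage_gb, size_gb):,}")
```

Under these assumptions, the storage would hold hundreds of thousands of gene activity experiments but only thousands of scans or heart simulations, which is why the larger data types dominate storage planning.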

Having a central location such as MARCC is also expected to result in less idle time for computers, meaning researchers will spend less time waiting for their results.

Natalia Trayanova, a Johns Hopkins professor of biomedical engineering, leads a team that creates complex simulations of the heart, using everything from MRIs to the latest information on heart-specific proteins. Her team currently uses computing centers at Johns Hopkins' Homewood campus and often must wait for enough processors to become available. If Trayanova's team needs 10, and only nine are available, they have to wait. Now, with thousands of processors in a central location, idle computers can be used by any researchers who need them. Members of Trayanova's team are already participating in beta-testing of the new computing center's equipment.
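The waiting problem described above can be illustrated with a toy model. The Python sketch below is not MARCC's scheduler; it assumes a simple all-or-nothing rule in which a job starts only if its full processor request is free, and the count of idle processors at the shared center is a made-up figure for illustration.

```python
def can_start(requested: int, free: int) -> bool:
    """A job starts only when its whole processor request is free (assumed rule)."""
    return requested <= free


# Smaller cluster: the team needs 10 processors but only nine are free.
small_cluster_free = 9

# Shared center: hypothetical number of idle processors at a given moment.
shared_center_free = 2_500

job_request = 10

if __name__ == "__main__":
    for label, free in [("small cluster", small_cluster_free),
                        ("shared center", shared_center_free)]:
        status = "run" if can_start(job_request, free) else "wait"
        print(f"{label}: {status}")
```

The point of the sketch is that a larger shared pool makes it far less likely that a modest request falls just short of the processors available.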

Even before the center officially opens, 80 percent of its computing power has been allocated. But with enough land for four more identical centers on the lot at Bayview, there's plenty of room to grow if the demand and funding materialize, MARCC administrators say.