The digital revolution hurtles on. The data required to display a single vacation snap on your smartphone would choke a top-of-the-line desktop computer from the 1980s. And the exponential growth of computing power at the consumer level is dwarfed by the scale of the big data used in scientific research. From charting global ocean turbulence to creating 3D maps of the universe, scientists now collect and analyze data by the petabyte, a unit equivalent to roughly 1.5 million CD-ROMs.
Big data is fundamentally changing how we do science, according to JHU computer science Professor Alexander Szalay. And emerging data-intensive approaches to scientific research promise still more change. "If we are starting to trust AI to drive our cars, why can't we trust it to drive our telescopes or our gene sequencers?" Szalay says. But the seemingly limitless potential of computers in scientific research is constrained by one issue: a shortage of science-focused software engineers to write the code.
Good News
Johns Hopkins is part of a new, $40 million effort to address this shortfall. Earlier this year, the Virtual Institute for Scientific Software was launched by Schmidt Futures, a philanthropic initiative founded by Eric Schmidt, former CEO of Google, and his wife, Wendy, president of the Schmidt Family Foundation. Four inaugural centers—Johns Hopkins, the University of Cambridge, the Georgia Institute of Technology, and the University of Washington—will each receive $2 million annually over the course of the five-year funding period. The goal, according to Schmidt Futures, is to "address the growing demand for software engineers with backgrounds in science, complex data, and mathematics who can build dynamic, scalable open software to facilitate accelerated scientific discovery across fields." In other words: Let's bring world-class software engineers to college campuses.
"Just having graduate students write software is not good enough anymore," says Szalay, who is leading the effort at Hopkins. "Much of the existing scientific software was written in Fortran in the 1970s, '80s, and '90s, and we've just been living off those codes." (Fortran is a science- and engineering-focused programming language dating to the era of vacuum-tube computers.)
From 'Good Enough' to Great
The goal, Szalay says, is for scientists to stop cobbling together programs that are "just good enough" for a specific project and instead to create robust software that the broader research community can access and modify as needed. With modern, modular programming languages such as Python, users can easily swap segments of code in and out to perform the calculations they need. The funding allows each virtual center to hire six software engineers, and Szalay hopes Hopkins will have its new programmers on board by the end of the year. Independent of this funding, he says, the university aims to hire roughly two dozen new software engineers over the next couple of years. "The idea is that we build a large pool of software engineering talent who can be assigned to different projects and at different project phases," he says.
Academia often struggles to compete with the business world in hiring software engineers, who, Szalay says, are in high demand and often command much higher salaries than a university can pay.
But computing work done on campuses has special appeal, he notes. "I'm willing to think some software engineers will make a trade-off," Szalay says. "They will want to help discover the cure for cancer rather than, you know, how to better sell toothbrushes."