There is surging demand among AI researchers for access to the high-performance computing (HPC) systems that are necessary to solve tough computational problems on everything from human genetics to the climate. Unfortunately, the supply of HPC resources has not kept up with this growing demand. On December 10, the Center for Data Innovation hosted a panel discussion exploring what steps Congress, the National Science Foundation (NSF), and the U.S. Department of Energy (DOE) should take to increase access to HPC resources for AI researchers in the United States.
Hodan Omaar, policy analyst at the Center for Data Innovation, set up the discussion with a presentation of the Center’s report on this topic, How the United States Can Increase Access to Supercomputing. As Omaar explained, the report identifies, at a high level, three problems that keep AI researchers from accessing the HPC resources they need to solve important problems.
First, a lack of federal funding for HPC prevents NSF and DOE from building an adequate supply of HPC resources for researchers. Currently, the HPC resources these agencies have invested in can support only a third of the researchers who need access to them. Second, access to HPC is concentrated in states that have leading academic institutions with the ability to stand up their own HPC centers or partner with other leading computing centers. This means AI researchers in states with fewer leading universities and less computer science funding, such as Alabama, Indiana, or Utah, are less able to pursue their research goals. Finally, because women and minorities are underrepresented in science and engineering, they are also underrepresented in the HPC workforce. This means they are less able to become the next generation of AI researchers solving tough problems.
To address the first problem, panelists agreed Congress should significantly increase HPC funding for both NSF and DOE. While the report calls for Congress to triple federal funding for HPC to $10 billion over the next five years, Cheryl Martin, director of global business development, higher education, and research at Nvidia, pointed out that this figure will need to grow over time as the use cases for AI applications mushroom into new disciplines. But, according to Andrew Jones, who is planning future HPC and AI capabilities at Microsoft, funding must be matched with an institutional commitment to strategic planning and operational processes, coordinated between public and private stakeholders at both the local and national levels. Echoing a key point from the report, Jones explained that a clear strategy is needed for acquiring state-of-the-art computing infrastructure, software that can make the most of a system, and experts who can make both perform well. Without one, friction between disparate initiatives will keep researchers from being more productive and innovating faster than their competitors.
To address the second problem regarding geographic disparities in HPC access, Sharon Broude Geva, director of advanced research computing in the Office of Research at the University of Michigan, stressed the need for NSF and DOE to diversify the portfolio of HPC resources they make available to AI researchers, including by exploring cloud computing options. As the Center’s report explains, cloud computing provides high-performance computing resources through a convenient network interface, making them available to anyone with an Internet connection rather than only to those who can use a particular facility. However, referencing a survey she co-authored on cloud and on-premises HPC usage at academic institutions, Broude Geva pointed out that 83 percent of academic institutions only have access to on-premises HPC systems. Providing access to cloud services for AI researchers across the country, as the national cloud computing bill calls for, will give more AI researchers the tools they need to solve problems. Cloud-based computing architectures are not well suited to all applications, though, which means increasing access to on-premises systems for AI researchers in the states that lack them is still important. To this end, Martin explained how Nvidia has partnered with the University of Florida to launch an AI supercomputer. The Center’s report identifies Florida as a state that is conducting high levels of AI research but lacking in HPC. This partnership can serve as one model for how universities, industry, and government can combine their respective strengths for the benefit of all.
To address the third problem regarding the additional barriers women and minorities face in becoming the next generation of HPC-enabled AI researchers, Broude Geva, who also serves as director of the Women in HPC organization’s Chapters and Affiliates program, stressed the importance of initiatives aimed at retaining talent from underrepresented groups. She explained that these efforts need to tackle the problem on two fronts. On one hand, long-term initiatives need to address the leaky pipeline that hinders qualified candidates from underrepresented groups with computer science backgrounds from entering the HPC workforce. Martin pointed out that this gap, caused by the gender and diversity imbalance among the people with science and engineering backgrounds who ultimately make up the HPC community, can be addressed by establishing HPC and AI workforce development programs at a broader scale so that they include K-12 and minority-serving institutions (MSIs). On the other hand, Broude Geva said short-term initiatives need to address the fact that existing talent is often not even aware of what resources are available. MSIs tend to be smaller institutions, which typically do not have a central body responsible for advocating for their HPC funding needs or acting as a point of contact for government partnerships. If they did, they could be better represented in larger organizations, such as the Coalition for Academic Scientific Computation that Broude Geva chairs, that bring universities and computing centers together to share HPC knowledge and tools, and that advocate for the needs of the HPC community at large.
Panelists agreed that the core message is that the United States needs to take significant steps today to ensure its AI researchers have the freedom and ability to choose the HPC resources that best equip them to solve problems of economic and social importance.