DoE stuffs $24m into data management R&D

The US Department of Energy is shoveling $24 million into “next generation” data management and visualization research.

Organizations large and small are grappling with the problem of managing large amounts of data, as well as visualizing what it is actually telling them. However, the challenge is particularly acute for the DoE, given its role overseeing some of the US’s most storied scientific labs and associated supercomputers.

So the DoE will allocate $23.9 million to research “next generation data management”. Barbara Helland, DoE Associate Director of Science for Advanced Scientific Computing Research, said in a canned statement: “These efforts will enable data to be processed and stored at higher rates across the edge, cloud, and high-performance computing environments, and develop new visualization methods to explore that data, form hypotheses, and convey conclusions to a broad spectrum of audiences.”

The agency said improvements in data management would “facilitate discovery in a wide range of fields” including climate modelling, clean energy development, and improving energy efficiency while reducing energy consumption.

Part of this will come through “optimizing” the management of data that has to be moved and analyzed, “using sophisticated mathematical techniques, including machine learning”.

Projects are also likely to “advance innovative techniques that exploit smart storage and networking hardware that may provide breakthroughs that address the data challenges scientists and engineers face.”

The results of the research will, presumably, trickle down to the rest of us in time, both as better storage and data management tools and techniques and as breakthroughs in energy creation, management, and efficiency.

The Argonne National Laboratory and Los Alamos National Laboratory will kick off projects on “A Compositional Approach to Harnessing Smart Devices within Elastic Data Services”.

Sandia National Laboratories, Illinois Institute of Technology, and Oak Ridge National Laboratory will examine “Coeus: Accelerating Scientific Insights using Enriched Metadata”.

Brookhaven National Lab, Texas University, and Argonne will research “Scalable Metadata and Provenance Services for Reproducible Hybrid Workflows”.

Lawrence Berkeley National Lab, Northeastern University, and the University of Illinois will carry out research into “End to End Object Focused Software-Defined Data Management for Science”.

Penn State will research “Intelligent File System Interfaces for Computational Storage Devices”.

The balance of the pot will go towards visualization research “on new techniques and theory needed to aid in the development of informative and interactive visualization of complex scientific data of interest to DoE’s mission space.”

The explosion of data creation is itself an energy consumption problem. Energy guzzling by buildings housing computers means some areas are struggling to accommodate buildings housing people.

The DoE last year announced $5 million in funding for research into data reduction techniques and algorithms “to facilitate more efficient analysis and use of massive data sets produced by observations, experiments and simulation”.