Automation offers a big solution to big data in astronomy

A Hubble Space Telescope image of stars forming inside a cloud of cold hydrogen gas and dust in the Carina Nebula, 7,500 light-years away. Credit: Space Telescope Science Institute

More data than anyone can store or process: that's the situation facing the Square Kilometer Array (SKA), a radio telescope planned for Africa and Australia that will have an unprecedented ability to deliver data, lots of data points with lots of detail, on the location and properties of stars, galaxies and giant clouds of hydrogen gas.

In a study published in The Astronomical Journal, a team of scientists at the University of Wisconsin-Madison has developed a new, faster approach to analyzing all that data.

Hydrogen clouds may seem less flashy than other radio telescope targets, like exploding galaxies. But hydrogen is fundamental to understanding the cosmos, as it is the most common substance in existence and also the “stuff” of stars and galaxies.

As astronomers get ready for SKA, which is expected to be fully operational in the mid-2020s, “there are all these discussions about what we are going to do with the data,” says Robert Lindner, who performed the research as a postdoctoral fellow in astronomy and now works as a data scientist in the private sector. “We don't have enough servers to store the data. We don't even have enough electricity to power the servers. And nobody has a clear idea how to process this tidal wave of data so we can make sense out of it.”

Lindner worked in the lab of Associate Professor Snezana Stanimirovic, who studies how hydrogen clouds form and morph into stars, in turn shaping the evolution of galaxies like our own Milky Way.

In many respects, the hydrogen data from SKA will resemble the vastly slower stream coming from existing radio telescopes. The smallest unit, or pixel, will store every bit of information about all hydrogen directly behind a tiny square in the sky. At first, it is not clear if that pixel registers one cloud of hydrogen or many — but answering that question is the basis for knowing the actual location of all that hydrogen.
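To make the problem concrete, here is a minimal sketch, with illustrative names and numbers rather than the authors' code, of the kind of signal hiding behind one pixel: a brightness spectrum along the line of sight, modeled as the sum of several Gaussian-shaped components, one per hydrogen cloud, plus noise.

```python
# A minimal sketch (illustrative values, not the authors' code) of what one
# pixel holds: brightness versus velocity along the line of sight, modeled
# as the sum of Gaussian-shaped components, one per hydrogen cloud, plus noise.
import numpy as np

def gaussian(v, amp, center, width):
    """Brightness profile of a single hydrogen cloud."""
    return amp * np.exp(-0.5 * ((v - center) / width) ** 2)

velocity = np.linspace(-50, 50, 512)   # hypothetical velocity channels (km/s)
clouds = [(1.0, -12.0, 4.0),           # (amplitude, center, width) per cloud
          (0.6, 3.0, 8.0),
          (0.3, 15.0, 2.5)]
spectrum = sum(gaussian(velocity, *c) for c in clouds)
spectrum = spectrum + np.random.default_rng(0).normal(0.0, 0.05, velocity.size)

# The analysis problem: given only `spectrum`, recover how many clouds made it.
```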

People are visually oriented and talented at making this interpretation, but interpreting each pixel requires 20 to 30 minutes of concentration using the best existing models and software. At that rate, a million pixels would demand roughly 400,000 hours of expert attention. So, Lindner asks, how will astronomers interpret hydrogen data from the millions of pixels that SKA will spew? “SKA is so much more sensitive than today's radio telescopes, and so we are making it impossible to do what we have done in the past.”

In the new study, Lindner and colleagues present a computational approach that solves the hydrogen location problem with just a second of computer time.
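As a hedged illustration, not the authors' published pipeline, the sketch below automates the decomposition with a generic multi-Gaussian fit: candidate clouds are guessed from peaks in a smoothed spectrum, then refined together in a single least-squares pass. It reuses the hypothetical `velocity` and `spectrum` arrays from the sketch above.

```python
# A hedged sketch of automating the decomposition: a generic multi-Gaussian
# fit, NOT the authors' published pipeline. Candidate clouds are guessed from
# peaks in a smoothed spectrum, then refined in one least-squares fit.
# Reuses the hypothetical `velocity` and `spectrum` arrays defined above.
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.optimize import curve_fit
from scipy.signal import find_peaks

def multi_gaussian(v, *params):
    """Sum of N Gaussians; params is (amp, center, width) repeated N times."""
    model = np.zeros_like(v)
    for i in range(0, len(params), 3):
        amp, center, width = params[i:i + 3]
        model += amp * np.exp(-0.5 * ((v - center) / width) ** 2)
    return model

smooth = gaussian_filter1d(spectrum, sigma=5)     # suppress channel noise
peaks, _ = find_peaks(smooth, height=0.1)         # candidate cloud centers
guess = []
for p in peaks:
    guess += [smooth[p], velocity[p], 5.0]        # rough amp, center, width

params, _ = curve_fit(multi_gaussian, velocity, spectrum, p0=guess)
print(f"recovered {len(params) // 3} clouds")
for i in range(0, len(params), 3):
    amp, center, width = params[i:i + 3]
    print(f"  amp={amp:.2f}  center={center:+.1f} km/s  width={abs(width):.1f} km/s")
```

On a single spectrum like this, the fit completes in a small fraction of a second, which hints at how an automated approach can take about a second where hand analysis takes half an hour.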

For the study, UW-Madison postdoctoral fellow Carlos Vera-Ciro helped write software that could be trained to interpret the “how many clouds behind the pixel?” problem. The software ran on HTCondor, a high-throughput computing network at UW-Madison. And “graduate student Claire Murray was our 'human,'” Lindner says. “She provided the hand-analysis for comparison.”

Those comparisons showed that as the new system swallows SKA's data deluge, it will be accurate enough to replace manual processing.

Ultimately, the goal is to explore the formation of stars and galaxies, Lindner says. “We're trying to understand the initial conditions of star formation — how, where, when do they start? How do you know a star is going to form here and not there?”

To calculate the overall evolution of the universe, cosmologists rely on crude estimates of initial conditions, Lindner says. By correlating data on hydrogen clouds in the Milky Way with ongoing star formation, data from the new radio telescopes will supply real numbers that can be entered into the cosmological models.

“We are looking at the Milky Way, because that's what we can study in the greatest detail,” Lindner says, “but when astronomers study extremely distant parts of the universe, they need to assume certain things about gas and star formation, and the Milky Way is the only place we can get good numbers on that.”

With automated data processing, “suddenly we are not time-limited,” Lindner says. “Let's take the whole survey from SKA. Even if each pixel is not quite as precise, maybe, as a human calculation, we can do a thousand or a million times more pixels, and so that averages out in our favor.”
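That averaging claim is standard statistics: if each automated measurement carries an independent error of size sigma, the error of the mean over N pixels shrinks as sigma divided by the square root of N. A toy check with illustrative numbers:

```python
# A toy check of the averaging argument (illustrative numbers): with
# independent per-pixel errors of size sigma, the error of the mean over
# N pixels shrinks roughly as sigma / sqrt(N).
import numpy as np

rng = np.random.default_rng(1)
true_value = 1.0
sigma = 0.2                                  # hypothetical per-pixel error
for n_pixels in (1_000, 1_000_000):
    estimates = true_value + rng.normal(0.0, sigma, n_pixels)
    observed = abs(estimates.mean() - true_value)
    expected = sigma / np.sqrt(n_pixels)
    print(f"N={n_pixels:>9,}: error {observed:.5f} (expected ~{expected:.5f})")
```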

###

David Tenenbaum
608-265-8549
djtenenb@wisc.edu

