From Gigabytes to Terabytes: The Astronomical Data Surge
The scale of data collection in astronomy is undergoing a seismic shift. For years, the Hubble Space Telescope set the standard, producing roughly 1 to 2 GB of sensor readings daily. While groundbreaking, that volume is a drop in the bucket compared to the next generation of observatories.
The James Webb Space Telescope has already pushed boundaries, transmitting 57 GB of stunning imagery every day. However, the upcoming launch of the Nancy Grace Roman Space Telescope in September 2026—arriving eight months ahead of schedule—will redefine “big data” in space. Over its lifespan, the Roman telescope is expected to provide astronomers with a staggering 20,000 TB of data.
Simultaneously, the Vera C. Rubin Observatory in the mountains of Chile is preparing to start its survey, which is projected to collect 20 TB of data every single night to help astronomers investigate dark matter. We are moving from an era of observing individual objects to analyzing massive cosmic datasets.
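To put these figures on a common scale, here is a quick back-of-envelope comparison using only the numbers quoted above (decimal units assumed, 1 TB = 1,000 GB):

```python
# Back-of-envelope comparison of the data volumes quoted in this article.
# Decimal units assumed: 1 TB = 1,000 GB.
HUBBLE_GB_PER_DAY = 2        # upper end of Hubble's 1-2 GB daily range
WEBB_GB_PER_DAY = 57
RUBIN_TB_PER_NIGHT = 20
ROMAN_LIFETIME_TB = 20_000

roman_gb = ROMAN_LIFETIME_TB * 1_000

# How long Hubble would need to match Roman's projected lifetime output:
hubble_days = roman_gb / HUBBLE_GB_PER_DAY
print(f"Hubble-equivalent: {hubble_days:,.0f} days (~{hubble_days / 365:,.0f} years)")

# How many Rubin nights equal Roman's lifetime output:
print(f"Rubin-equivalent: {ROMAN_LIFETIME_TB / RUBIN_TB_PER_NIGHT:,.0f} nights")

# One Rubin night expressed in days of Webb downlink:
print(f"One Rubin night = {RUBIN_TB_PER_NIGHT * 1_000 / WEBB_GB_PER_DAY:.0f} days of Webb data")
```

In other words, even at its best daily rate, Hubble would need roughly 27,000 years to produce what Roman is expected to deliver, and a single Rubin night equals almost a year of Webb's downlink.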
The GPU Revolution: Moving Beyond Manual Review
With data volumes exploding, the traditional method of manual review by astronomers has become obsolete. The field is shifting toward GPU acceleration to handle the computational load, having progressed from simple observation, to CPU-based scaling, and now to GPU-accelerated analysis.

Brant Robertson, an astrophysicist at UC Santa Cruz, has spent 15 years collaborating with Nvidia to apply GPUs to space research. This technology is now critical for everything from simulating supernova explosions to processing the torrents of data arriving from new observatories.
The reliance on GPUs isn’t just about speed; it’s about capability. Without this acceleration, the sheer volume of data from the Roman and Rubin projects would be impossible to parse in a human lifetime.
AI-Driven Discovery: Transformers and Generative Models
The integration of deep learning is fundamentally changing how we understand the universe. A prime example is Morpheus, a deep learning model developed by Robertson and Ryan Hausen. Morpheus can scan massive datasets to identify galaxies, and early AI analysis of Webb data has already revealed an unexpected abundance of certain disk-shaped galaxies, challenging existing theories of cosmic development.
The future of these tools lies in the “transformer” architecture—the same technology powering today’s large language models. By transitioning to this architecture, tools like Morpheus can expand their analysis range several times over, drastically accelerating the pace of discovery.
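The internals of Morpheus are not detailed here, but the core mechanism behind any transformer is self-attention: every image patch can draw on context from every other patch. The numpy sketch below is a generic, illustrative single-head attention pass over random "patch embeddings" (all sizes and weights are invented for the example), not a description of Morpheus itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 16 image "patches", each embedded as a 32-dim vector.
n_patches, d = 16, 32
patches = rng.normal(size=(n_patches, d))

# Learned projections (random here) map patches to queries, keys, values.
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
Q, K, V = patches @ Wq, patches @ Wk, patches @ Wv

# Scaled dot-product attention: each patch attends to all patches, so
# context from anywhere in the image can inform each local feature.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)   # softmax over patches
out = weights @ V                               # (16, 32) contextual features

print(out.shape)
```

Because the attention step is one large matrix multiplication, it maps naturally onto GPUs, which is part of why this architecture scales so well to survey-sized datasets.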
The Hardware Hurdle: Funding and Infrastructure
Despite the scientific potential, the path forward is fraught with logistical and financial challenges. Rocket technology currently limits the size of mirrors that can be sent into orbit—making 8-meter mirrors nearly impossible to launch. This makes software optimization for ground-based sites, like the Rubin Observatory, the most viable alternative.
Meanwhile, the demand for GPUs is creating a resource crisis. Research clusters, such as the one built at UC Santa Cruz with National Science Foundation (NSF) support, face rapid hardware obsolescence as researchers pivot to more compute-intensive techniques. The problem is compounded by political instability: recent budget proposals from the Trump administration have suggested cutting NSF funding by 50%.
For those at the technical frontier, a "startup mentality" is now required. With university resources limited and risk-averse, researchers must constantly prove the value of AI and machine learning to secure the GPU power their work requires.
Frequently Asked Questions
How does the Roman Space Telescope differ from Hubble in terms of data?
While Hubble produced 1 to 2 GB of data daily, the Roman Space Telescope is expected to generate a total of 20,000 TB over its lifetime.

What is the role of GPUs in modern astronomy?
GPUs are used to accelerate the analysis of massive datasets, enabling simulations of supernovae and the rapid identification of galaxies that would be impossible with CPUs or manual review.
How is Generative AI helping ground-based telescopes?
Generative AI is used to mitigate the blurring and distortion caused by the Earth’s atmosphere, improving the clarity of images captured from the ground.
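As a toy illustration of the deblurring problem itself (not of the generative approach), classical Wiener deconvolution with a known point-spread function shows the baseline that learned models aim to improve on. Everything below, from the grid size to the regularizer, is invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "sky": two extended Gaussian sources on a 64x64 grid.
n = 64
yy, xx = np.mgrid[0:n, 0:n]
sky = (np.exp(-((xx - 20)**2 + (yy - 24)**2) / (2 * 3.0**2))
       + 0.6 * np.exp(-((xx - 44)**2 + (yy - 40)**2) / (2 * 3.0**2)))

# Atmospheric seeing modeled as convolution with a Gaussian PSF.
y, x = np.mgrid[-n//2:n//2, -n//2:n//2]
psf = np.exp(-(x**2 + y**2) / (2 * 2.0**2))
psf /= psf.sum()
H = np.fft.fft2(np.fft.ifftshift(psf))  # transfer function, kernel at origin

blurred = np.real(np.fft.ifft2(np.fft.fft2(sky) * H))
blurred += rng.normal(scale=1e-4, size=blurred.shape)  # faint sensor noise

# Classical Wiener deconvolution with a known PSF:
#   X_hat(f) = conj(H) / (|H|^2 + K) * Y(f)
K = 1e-3  # noise-to-signal regularizer, hand-tuned for this toy
restored = np.real(np.fft.ifft2(
    np.fft.fft2(blurred) * np.conj(H) / (np.abs(H)**2 + K)))

mse_blur = np.mean((blurred - sky)**2)
mse_rest = np.mean((restored - sky)**2)
print(f"MSE blurred:  {mse_blur:.2e}")
print(f"MSE restored: {mse_rest:.2e}")
```

The restored image lands much closer to the true sky than the blurred one. Real atmospheric correction is far harder because the point-spread function varies across the sky and over time, which is exactly where learned generative models have an edge over this fixed-filter baseline.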
Join the Conversation
Do you think AI will eventually replace the need for larger physical telescopes, or will the two always evolve together? Let us know your thoughts in the comments below or subscribe to our newsletter for more updates on the intersection of AI and space exploration!
