Flash floods are among the deadliest weather events in the world, killing more than 5,000 people each year. They are also notoriously hard to forecast. But Google believes it has cracked the problem from an unlikely direction: by reading the news.
Humans have collected a huge amount of atmospheric data, but flash floods are too brief and too localized to be measured comprehensively the way temperatures or even river flows are continuously monitored. That data gap means the deep learning models that are getting steadily better at weather prediction have had no way to forecast these fast-moving floods.
To close that gap, Google researchers used Gemini, the company's large language model, to comb through 5 million news articles from around the world and extract reports of 2.6 million distinct flood events. Those reports were then converted into a geo-tagged, time-stamped data set called "Groundsource." According to Gila Loike, a product manager at Google Research, this is the first time the company has used language models for a task like this. The results and the data set were released publicly on Thursday morning.
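The article does not describe Google's pipeline in detail, but the basic shape — run an LLM over each article, pull out structured flood reports, then deduplicate them into events — can be sketched roughly. In this sketch, `extract_flood_events`, the `FloodEvent` fields, and the deduplication key are all assumptions standing in for whatever Gemini and the Groundsource pipeline actually do:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class FloodEvent:
    """One geo-tagged, dated flood report (field names are illustrative)."""
    lat: float
    lon: float
    day: date
    source_url: str

def extract_flood_events(article_text: str, url: str) -> list[FloodEvent]:
    """Hypothetical stand-in for the Gemini call that reads an article and
    returns any flood events it describes. Faked here so the pipeline runs."""
    if "flood" not in article_text.lower():
        return []
    return [FloodEvent(lat=12.97, lon=77.59, day=date(2024, 7, 1), source_url=url)]

def build_groundsource(articles: list[tuple[str, str]]) -> set[tuple[float, float, date]]:
    """Many articles can describe the same flood, so events are deduplicated
    on a (rounded location, date) key rather than counted per article."""
    events = set()
    for text, url in articles:
        for ev in extract_flood_events(text, url):
            events.add((round(ev.lat, 1), round(ev.lon, 1), ev.day))
    return events

articles = [
    ("Flash flood hits the city overnight...", "https://example.com/a"),
    ("Flooding reported downtown after storm...", "https://example.com/b"),
    ("Sunny weekend ahead...", "https://example.com/c"),
]
events = build_groundsource(articles)
print(len(events))  # the two flood articles collapse into one deduplicated event
```

The deduplication step matters because the 5 million articles yielded 2.6 million *distinct* events — roughly two reports per flood on average.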
With Groundsource as ground truth, the researchers trained a model based on a Long Short-Term Memory (LSTM) neural network, designed to ingest global weather forecasts and output the probability of a flash flood in a given area.
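At its core, an LSTM consumes a time series step by step while carrying a memory cell forward, which suits this task: rainfall accumulated over preceding hours is what drives a flash flood. The toy single-unit LSTM below illustrates only that data flow; the weights are random and untrained, and the input/output shapes are assumptions, not Google's architecture:

```python
import math
import random

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTM:
    """One hidden unit, scalar input per time step. Illustrates how a forecast
    sequence maps to a flood probability; not a trained or real model."""

    def __init__(self, seed: int = 0):
        rng = random.Random(seed)
        # (input weight, hidden weight, bias) per gate:
        # i = input gate, f = forget gate, o = output gate, g = candidate
        self.w = {name: (rng.uniform(-1, 1), rng.uniform(-1, 1), 0.0)
                  for name in ("i", "f", "o", "g")}
        self.w_out = rng.uniform(-1, 1)  # readout: hidden state -> logit

    def _gate(self, name: str, x: float, h: float) -> float:
        wx, wh, b = self.w[name]
        z = wx * x + wh * h + b
        return math.tanh(z) if name == "g" else sigmoid(z)

    def flood_probability(self, forecast_seq: list[float]) -> float:
        """forecast_seq: e.g. hourly forecast precipitation (mm) for one cell."""
        h = c = 0.0
        for x in forecast_seq:
            i, f, o, g = (self._gate(n, x, h) for n in ("i", "f", "o", "g"))
            c = f * c + i * g      # cell state: memory accumulated across steps
            h = o * math.tanh(c)   # hidden state: gated summary of the sequence
        return sigmoid(self.w_out * h)

model = TinyLSTM()
p = model.flood_probability([0.0, 2.5, 14.0, 31.0, 22.5])  # a heavy-rain forecast
print(0.0 < p < 1.0)  # prints True: the output is always a valid probability
```

In training, the Groundsource events would supply the labels: sequences of past forecasts paired with whether a flood was actually reported there.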
Google’s flash flood model is now highlighting risks for urban areas in 150 countries on the company’s Flood Hub platform, and the company is sharing its data with emergency response agencies around the world. António José Beleza, an emergency operations officer at the Southern African Development Community who piloted the forecasting model with Google, said it helped his organization respond to floods faster.
Still, the model has limitations. It operates at a relatively coarse resolution, flagging hazards in 20-square-kilometer areas. And it is not as accurate as the US National Weather Service’s flood warning system, in part because Google’s model does not incorporate local radar data, which enables real-time tracking of rainfall.
That is partly by design, though: the project was built for places where local governments cannot afford expensive weather-sensing infrastructure or lack good historical weather data.
“Because we’re gathering millions of reports, the Groundsource data set actually helps address geographic imbalances,” Juliet Rothenberg, a program manager on Google’s Resilience team, told reporters earlier this week. “It lets us infer for other places where less data is available.”
Rothenberg said the team hopes that using large language models to build quantitative data sets from textual, descriptive sources can be applied to efforts to assemble data sets on other short-lived but important-to-predict events, such as heat waves and landslides.
Marshall Moutenot, CEO of Upstream Tech, a company that uses similar deep learning systems to forecast river flows for customers such as hydropower companies, said Google’s contribution is part of a growing effort to assemble data for deep-learning-based weather forecasting models. Moutenot co-founded dynamical.org, an organization that maintains an archive of machine-learning-ready weather data for researchers and startups.
“Data scarcity is a huge challenge in geophysics,” Moutenot said. “At the same time, there’s an abundance of Earth data, but when you want ground truth, there isn’t enough of it. This was a really creative way of getting that data.”

