Ag GeoSpatial Data Explorer (AgGeo): A Web-Based Tool for exploring Agricultural, Geospatial, and Climate Data 

In 2023, approximately 282 million people across 59 countries faced severe food insecurity, an increase of 24 million from the previous year. Sub-Saharan Africa (SSA) bears a disproportionate burden of this crisis. Climate change is already reducing cereal production across the region, with projections showing a further 20% decline by 2030. Food import bills in Southern Africa surged from $35 billion to $43 billion between 2019 and 2022. Africa is off track to meet the Sustainable Development Goal of ending hunger by 2030 and the Malabo Commitment to reduce poverty by half by 2025. Climate variability, in the form of rising temperatures, shifting rainfall patterns, and more frequent droughts and floods, is fundamentally reshaping agricultural systems. Responding to climate shocks requires evidence-based policy, yet a significant gap persists between the availability of climate data and its practical usability for agricultural policy analysis.

The Data Access Challenge

High-quality climate datasets are increasingly available. The Climate Hazards Center (CHC) provides CHIRPS precipitation and CHIRTS temperature data at high resolution. NASA and other agencies offer temperature and vegetation data. These datasets are often free and well-documented, but “available” does not mean “accessible.”

Desktop software like GeoCLIM requires users to install QGIS, download several gigabytes of raw data, and have GIS expertise. Web-based alternatives like the Early Warning Explorer eliminate software installation but typically limit users to rectangular map regions and single-year time periods. Tableau needs pre-constructed datasets that require massive storage. Most web tools provide only basic statistics rather than derived indicators that help identify drought-prone areas or understand seasonal dry spells. When researchers cannot easily access climate data, policy-relevant analyses may either not happen or rely on national-level averages that obscure the local variation critical for small-scale producer (SSP) agriculture. The Ag GeoSpatial Data Explorer (AgGeo) addresses these barriers through a web interface that processes data on-demand, follows actual administrative boundaries, and allows custom seasonal definitions that match agricultural calendars.

How the Climate Data Explorer Works

EPAR’s Center on Risk and Inclusion in Food Systems (CRIFS) developed AgGeo to address these challenges. Rather than requiring users to download and process data locally, the platform processes climate data on-demand and delivers results through a web interface. This directly tackles the software installation, bandwidth, and technical expertise barriers.

The platform covers countries across sub-Saharan Africa and South Asia. It is built on public datasets: The CHC’s CHIRPS provides precipitation data over four decades, while GAEZ provides spatial classifications of land use.

Users select their country, layer, and time period through a sidebar interface. The tool offers annual trends, seasonal patterns, monthly details, or quarterly summaries. Users choose between interactive maps (detailed grid values or regional averages with color-coded boundaries) or time series charts that are rendered within seconds. Users can hover over locations for exact values, zoom into regions, compare multiple territories, and download visualizations.

Figure 1: Average Annual Rainfall in India in 2020-2024

What the Tool Provides

AgGeo currently offers three layers that help researchers and policymakers understand climate variability and its agricultural implications.

Rainfall volumes track precipitation. Users can calculate total rainfall for any time period, understand typical rainfall intensity through daily averages, and compare across different regions. This helps identify wet and dry seasons, understand year-to-year variations, and spot long-term trends. Researchers can analyze data annually, seasonally, monthly, or quarterly, depending on their specific questions.
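As a rough illustration of what a seasonal rainfall summary involves, the sketch below totals daily rainfall over a user-defined window of months and reports the daily average. It is not AgGeo's code: the function name `season_summary`, the dictionary input format, and the rainfall values are all assumptions made for the example, and the data are synthetic rather than drawn from CHIRPS.

```python
from datetime import date, timedelta


def season_summary(daily_mm, start_month, end_month):
    """Return {year: (total_mm, mean_mm_per_day)} for days whose month falls
    in the inclusive window start_month..end_month.

    Assumes the season does not wrap around the end of the calendar year.
    """
    totals = {}
    counts = {}
    for day, mm in daily_mm.items():
        if start_month <= day.month <= end_month:
            totals[day.year] = totals.get(day.year, 0.0) + mm
            counts[day.year] = counts.get(day.year, 0) + 1
    return {year: (totals[year], totals[year] / counts[year]) for year in totals}


# Synthetic daily series for 2020 (hypothetical values, not real observations):
# 5 mm on every day from March through August, 0 mm otherwise.
start = date(2020, 1, 1)
daily_mm = {}
for offset in range(366):  # 2020 is a leap year
    day = start + timedelta(days=offset)
    daily_mm[day] = 5.0 if 3 <= day.month <= 8 else 0.0

summary = season_summary(daily_mm, 3, 8)
print(summary)
```

Changing the two month arguments is all it takes to move from a March through August growing season to any other calendar window, which is the kind of flexibility a custom seasonal definition provides.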

Dry days monitor drought patterns by counting days with minimal rainfall. This indicator helps identify drought-prone areas, understand when and where seasonal dry spells occur, and assess water availability during critical growing periods. Knowing how many consecutive dry days occurred during planting or flowering stages reveals much more about potential crop stress than total seasonal rainfall alone. This information helps target drought-resistant varieties or irrigation investments to the areas that need them most.
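To make the distinction between total dry days and consecutive dry days concrete, here is a minimal sketch of both counts. The 1 mm threshold follows the Figure 3 caption; the function names and the rainfall values are hypothetical, and the tool's actual computation may differ.

```python
def count_dry_days(daily_mm, threshold_mm=1.0):
    """Number of days with rainfall strictly below the threshold."""
    return sum(1 for mm in daily_mm if mm < threshold_mm)


def longest_dry_spell(daily_mm, threshold_mm=1.0):
    """Length of the longest run of consecutive days below the threshold."""
    longest = 0
    current = 0
    for mm in daily_mm:
        if mm < threshold_mm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a wet day breaks the spell
    return longest


# Hypothetical eight-day rainfall record in millimetres.
rain = [0.0, 0.2, 5.1, 0.0, 0.0, 0.9, 12.0, 0.0]

dry_days = count_dry_days(rain)
longest = longest_dry_spell(rain)
print(dry_days, longest)
```

Two periods with the same number of dry days can have very different longest spells, which is why the consecutive count says more about crop stress during planting or flowering.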

Agro-ecological zones visualize agricultural potential based on climate, soil, and terrain. The tool shows both current conditions and future projections, helping researchers and policymakers understand which areas are suitable for different crops today and how climate change might shift these zones over time. This supports long-term agricultural planning and adaptation strategies.

Users can explore climate patterns at multiple administrative levels, from entire countries down to individual states, districts, or smaller regions. The ability to compare multiple territories side-by-side helps understand regional differences and identify areas facing similar climate challenges.

Figure 2: Agro-ecological Zones of the Ethiopian Region of Oromia 

Why This Matters for Policy

Accessible climate data enables critical policy-relevant analysis. National planning offices can map rainfall variability to identify regions requiring different crop strategies or irrigation investments. Understanding spatial patterns helps target interventions effectively.

Program evaluations benefit from climate control variables that distinguish program effects from environmental factors. Food security monitoring reliability increases when analysts can quickly compare current rainfall to historical patterns, identifying regions at risk of production shortfalls. Researchers can identify drought-prone areas and analyze long-term trends to support climate adaptation planning.

Figure 3: Number of days with rainfall <1 mm in the Nigerian states of Ebonyi, Kebbi, and Niger during the 2010-2024 growing seasons (March through August)

Upcoming Features

The platform currently provides comprehensive coverage across sub-Saharan Africa and South Asia. Users can generate interactive maps and time series charts, compare multiple territories, and download results in multiple formats.

In the near future, EPAR plans substantial expansion including standardized drought indices, temperature anomalies, growing degree days for crop development tracking, and vegetation health monitoring. Integration plans include adding AgGeo within AgQuery+ and overlaying LSMS-ISA enumeration area coordinates for direct linkage with household survey data.

Getting Started

The Ag GeoSpatial Data Explorer is available at https://agquery.org/aggeo. The interface is designed for users without technical training. Select your parameters, generate visualizations, and download data through your web browser. You can click on the ‘About’ tab in the top-right corner of the app to learn more.

To cite this tool: UW EPAR (2025). Ag GeoSpatial Data Explorer: A platform for constructing and visualizing geospatial indicators for sub-Saharan African and South Asian countries.

EPAR welcomes feedback and suggestions for platform improvements. More information about EPAR’s work on climate and food systems is available at epar.evans.uw.edu.

Blog written by Apurwa Rahulkar and Joaquin Mayorga.

How Well Does Machine Learning Predict Food Insufficiency? A Case Study from Malawi

The World Food Programme estimates (pdf) that approximately 3.5 million Malawians are chronically food insecure. That number rises, and hunger becomes more acute, in the lean season between harvests. Seasonal hunger is further aggravated by limited storage options and extreme weather events, such as the recent weak rains that rendered 25% of the population acutely food insecure. The impacts of crop losses go beyond rural subsistence producers; they also affect urban consumers through the resulting food price shocks. 

In a recent article, we asked whether data related to crop production or markets could allow for greater precision in identifying which communities would face food insufficiency, which we define as a substantial proportion of households reporting “running out” of food in each month. Our data come from four waves of the Malawi Integrated Household Survey (IHS). Our goal was to understand whether combining publicly available satellite-derived data and geo-coded market food prices with machine learning models could produce more accurate predictions than simpler approaches like classical regression or non-modeled human predictions based on past occurrences. To test the models, we first fit them using information from one or more survey rounds, then tested whether they could classify each community in the next survey round as food sufficient or food insufficient using updated predictor variables.

Our key takeaways were:

  • Producing accurate forecasts can require multiple years of data and, even in their simplest form, machine learning models still require some expertise
  • Prices, which reflect multiple factors on the supply and demand side, work as well as weather data
  • When food insufficiency recurs with some spatial and temporal predictability, machine learning models may not add substantial improvements to overall accuracy, but the distribution of communities predicted as food sufficient or food insufficient varies substantially by model, even at similar accuracy rates.

More details, including considerations for constructing indicators, specifying models and determining relevant predictors, are available in our paper.

1. Evaluating the Comparative Accuracy of Machine Learning Models

    Machine learning models can discern subtle patterns in large datasets, but the size requirement can restrict where the models can be effectively used. Using datasets that are too small can lead to low generalizability – fits are highly accurate on the sample used to develop the models, but extrapolations on new data may be weaker.

    To evaluate whether machine learning was performing well given the available public data, we trained ML models on one to three rounds of the survey, tested them on the second through fourth survey rounds, and compared them to classical regression and to a non-modeled approach that checks the food security situation of the nearest neighboring community in the previous year. In addition to accuracy, we compared rates of false positive predictions (classifying food-sufficient communities as food-insufficient) and false negative predictions (classifying food-insufficient communities as food-sufficient) using two additional metrics: recall and precision. Recall is the share of all observed cases of food insufficiency that the model correctly predicted (true positives divided by true positives plus false negatives), and precision is the share of all food-insufficient predictions that were correct (true positives divided by true positives plus false positives). Low recall scores indicate a bias toward producing false negative predictions, and low precision scores indicate a bias toward false positive predictions. We found that machine learning models tended to outperform on recall but under-perform on precision compared to the classical approaches. Model accuracy did not approach that of the simple non-modeled approach until three waves of survey data were used for training, a finding that indicates that although food insufficiency has high recurrence rates, the underlying reasons in a given year may vary.
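For readers less familiar with these metrics, the following sketch computes accuracy, recall, and precision from paired observations and predictions using the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP)). The function name and the eight example communities are made up for illustration and are not taken from our data.

```python
def classification_metrics(observed, predicted):
    """Return (accuracy, recall, precision) for binary labels.

    A value of 1 means food insufficient (the positive class), 0 means food
    sufficient. Recall or precision is None when its denominator is zero.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else None
    precision = tp / (tp + fp) if (tp + fp) else None
    return accuracy, recall, precision


# Hypothetical labels for eight communities.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 1, 1, 0, 0, 1]

accuracy, recall, precision = classification_metrics(observed, predicted)
print(accuracy, recall, precision)
```

In this toy case the model catches three of the four insufficient communities (recall 0.75) but two of its five positive calls are wrong (precision 0.6), the same direction of trade-off we describe for the machine learning models.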

    Figure 1:  Comparison of modeling approaches as the amount of training data increases (top to bottom: model 1: built using survey wave 1 and tested on survey wave 2; model 2: built using data from waves 1 and 2 and tested on wave 3; model 3: built using data from waves 1-3 and tested on wave 4). Blue bars represent simpler approaches (logit and LASSO, a minimal machine-learning example), red bars represent two alternative machine learning approaches, and the green bar represents a simple algorithm that uses the value of the nearest community in the previous survey.
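The non-modeled benchmark is simple enough to sketch in a few lines: for each community, look up the geographically closest community in the previous survey round and reuse its status. The version below is an illustrative assumption about how such a rule could be coded, not the exact procedure from the paper; the great-circle distance, the field names, and the coordinates are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_neighbor_prediction(lat, lon, previous_round):
    """Predict a community's status as that of the nearest community
    observed in the previous survey round."""
    nearest = min(
        previous_round,
        key=lambda c: haversine_km(lat, lon, c["lat"], c["lon"]),
    )
    return nearest["insufficient"]


# Hypothetical previous-round communities (coordinates are illustrative only).
previous_round = [
    {"lat": -13.98, "lon": 33.78, "insufficient": False},
    {"lat": -15.79, "lon": 35.01, "insufficient": True},
]

south = nearest_neighbor_prediction(-15.50, 35.00, previous_round)
central = nearest_neighbor_prediction(-14.00, 33.80, previous_round)
print(south, central)
```

A rule like this needs no training data beyond the previous round, which is part of why it makes a demanding benchmark when food insufficiency recurs in the same places.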

    2. Comparing Prices to Weather as Leading Indicators

    Although the demand for staple crops is relatively stable, prices are affected by international trade, policy, weather, input prices, and other factors that affect supply. Hence, price movements offer a convenient summary of current and forecast changes in the availability of food relative to demand. 

    Using a primary measure of food affordability (observed maize market price over the prior year), combined with indicators of direction (maize price inflation and overall CPI inflation), we found that the predictions had similar or better accuracy compared to models that used variables that might influence agricultural productivity over the previous growing season, such as precipitation and temperature.

    In our datasets, food insufficiency was highest in December, January, and February, typically dropping in March and April. In Wave 4 (2020-2021), March was an unusually severe month due to shocks to production in the previous growing season, and price-based models were slightly better at reacting to that difference than weather-based models, which may have been more reliant on seasonal variations. The price-based models were also more sensitive to the transition back to widespread food insufficiency at the end of the dry season.

    One disadvantage of using machine learning compared to classical regression is that the former lacks easily interpretable coefficients for assessing the relative contribution of each variable. The SHapley Additive exPlanations (SHAP) framework was developed to help demystify modeling results and shows the effect each variable has on the final predicted value for each data point. Applying the Shapley framework to our fitted models suggested that prices in the past month and inflation in the previous year both had substantial, if mixed, influences on predicted values. The most influential variable is the maize price in the previous month, but its impact is mixed: low maize prices could be either a strong or a weak signal of future food insufficiency depending on the month in which the observation took place, while high maize prices generally pushed predictions toward food insufficiency.
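The idea behind Shapley values can be shown on a toy example. The sketch below computes exact Shapley values by averaging each variable's marginal contribution over every possible ordering of the variables, which is feasible only for a handful of features. The scoring function `toy_score`, its coefficients, and the variable names are invented for illustration; this is not our fitted model, and in practice dedicated libraries use approximations instead of brute force.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    across all orderings, moving from the baseline to the instance."""
    features = list(instance)
    contributions = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        current = dict(baseline)
        previous_output = predict(current)
        for name in ordering:
            current[name] = instance[name]  # switch this feature on
            output = predict(current)
            contributions[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in contributions.items()}


def toy_score(x):
    """Hypothetical risk score with a price and lean-season interaction."""
    return (
        2.0 * x["maize_price"]
        + 1.0 * x["inflation"]
        + 4.0 * x["maize_price"] * x["lean_season"]
    )


baseline = {"maize_price": 0.0, "inflation": 0.0, "lean_season": 0.0}
instance = {"maize_price": 1.0, "inflation": 0.5, "lean_season": 1.0}

values = shapley_values(toy_score, instance, baseline)
print(values)
```

The contributions sum to the difference between the instance's score and the baseline score, and the interaction term is split evenly between price and season. That splitting is one way a price variable can carry a different weight depending on the month of observation.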

    3. Recurrence of Food Insufficiency

    Despite the weather and price volatility during the observation period, temporal stability, captured in a dummy variable representing the quarter when the observation occurred, was a highly influential factor in the model predictions. A large portion of the observed variation comes from seasonal swings in food availability. Much of the remainder is associated with the drought in southern Malawi, although spikes in flood-associated shocks to production also occur in the third survey round (Figure 2). These features of the agricultural sector in Malawi create a predictable spatio-temporal path for food insufficiency, making non-modeling prediction approaches based on historical occurrences of food insufficiency effective, although differences exist in which communities are flagged as food insufficient depending on the modeling approach being used.

    Figure 2: Top: Observed food insufficiency over four survey rounds, derived from recall over the twelve months prior to the survey date, and events impacting Malawian agricultural production or overall food sufficiency. Bottom: nominal market maize prices during the survey period.

    Conclusion

    While model accuracy may be higher in the absence of shocks like flooding and cyclones, historical observations of food insufficiency suggest that insufficiency in one year can arise from the consequences of a poor harvest in the previous year. Therefore, where recurring spatial patterns exist, decision makers could already have the information needed to act. Instead, modeling may offer benefits like increasing the utility of ongoing data collection for extrapolating to the rest of the country or providing visualizations of spatial patterns. The decreasing cost and increasing resolution of spatial data products could allow for detailed analyses to help inform policy-making, but only with frequent collection of training data. In resource-constrained environments, timely interventions based on informed priors may be preferable to gathering more information.

    The Data and Policy open-access manuscript is available here.

    Blog written by C. Leigh Anderson, Didier Alia, and Andrew Tomes.