IBM Research is introducing an experimental offering named IBM PAIRS Geoscope (Physical Analytics Integrated Data Repository & Services), a unique cloud-centric geospatial information and analytics service that can accelerate the discovery of new insights.
Terms like big data, analytics, data science, and the Internet of Things (IoT) have arisen in recent years to help explain a world awash in data. Fueled by increasingly sophisticated and affordable electronics, the exponential growth in the amount of data created each day is expected to continue unabated for years to come. Virtually all human activities will be impacted by this age of data, and those who can quickly extract value from this superabundant resource will enjoy a decided advantage.
Extracting value from the vast and ever-growing stores of geospatial-temporal big data poses a significant challenge. This class of big data, so named because of its inherent link to place and time, includes satellite and aerial imagery, global-scale data and models (weather, climate, oceans, etc.), geo-referenced IoT/sensor networks, and big-event data captured on platforms like Twitter and GDELT. Such data is often freely available, but its massive size and the complexities associated with its preparation for use make it difficult to exploit and scale, especially for large areas and time-critical applications.
IBM PAIRS Geoscope arose from a project and engagement a few years ago with the E. & J. Gallo Winery. In an effort to conserve water while improving crop uniformity and yield, IBM and Gallo co-developed a precision irrigation system that incorporated a cloud-based communication network, hundreds of sensors and actuators, satellite imagery to measure the uniformity and health of the greenery, a complex model for estimating water loss from greenery and soil that required numerous meteorological and atmospheric parameters from a variety of sources, and a localized weather model to estimate future irrigation needs. In addition to demonstrating a new form of a potentially commercial water-efficient drip irrigation technology, a two-season trial of this system on a ten-acre test ranch delivered a 26 percent increase in crop yield, a 50 percent increase in crop uniformity, and a doubling of a key crop quality index, all while using up to 22 percent less water.
This experience taught us that rapidly obtaining insights and value from an unwieldy mix of large geospatial-temporal datasets required new thinking on at least two fronts:
Second, geospatial-temporal datasets exhibit a daunting array of complex formats. Understanding and curating this diversity can be an arduous task that hinders rapid analysis. On both fronts, significant and sometimes insurmountable bottlenecks are encountered when attempting to bring the data to the analytics.
IBM scientists Xiaoyan Shao (left) and Conrad Albrecht interact with the IBM PAIRS Geoscope service.
PAIRS Geoscope addresses this problem by reversing the situation: it offers a service that allows clients to bring their analytics to the data. It frees clients from the cumbersome processes that dominate conventional geospatial-temporal data acquisition and preparation, and it provides search-friendly, ready access to a rich, diverse, and growing catalog of historical and continuously updated geospatial-temporal information.
The service is built on a highly scalable, cloud-based repository especially crafted for the complexities of geospatial-temporal information. This repository, currently growing by terabytes per day, can automatically ingest, curate, and seamlessly integrate all forms of geospatial-temporal data. Large, heterogeneous, and complex datasets are tamed into a tidy, aligned, and indexed structure designed for efficient retrieval and query.
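To make the idea of an aligned and indexed structure more concrete, here is a minimal sketch in Python. It is not the PAIRS Geoscope implementation, whose internals are not described here. The cell size, the `cell_key` function, and the `GridIndex` class are all hypothetical names chosen for illustration. The sketch assumes that every observation, whatever its source, is snapped to a shared latitude/longitude grid cell and a UTC day, so that different layers can be looked up with a single key.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone

CELL_DEG = 0.25  # hypothetical grid resolution in degrees (an assumption)


def cell_key(lat, lon, when, cell_deg=CELL_DEG):
    """Snap a point and a timezone-aware timestamp to a (row, col, day) key."""
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    day = when.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (row, col, day)


class GridIndex:
    """Toy in-memory index: one dict of layer values per grid cell and day."""

    def __init__(self):
        self._cells = defaultdict(dict)

    def ingest(self, layer, lat, lon, when, value):
        # Observations from different sources land in the same cell
        # whenever they share a grid square and a UTC day.
        self._cells[cell_key(lat, lon, when)][layer] = value

    def query(self, lat, lon, when):
        # Return a copy of every layer value stored for the matching cell.
        return dict(self._cells.get(cell_key(lat, lon, when), {}))


index = GridIndex()
noon = datetime(2017, 7, 1, 12, tzinfo=timezone.utc)
index.ingest("soil_moisture", 38.49, -121.02, noon, 0.21)  # e.g. a sensor
index.ingest("ndvi", 38.40, -121.10, noon, 0.63)           # e.g. imagery
layers_here = index.query(38.45, -121.05, noon)
```

In this toy example the two observations come from different sources and slightly different coordinates, yet a single lookup returns both because they fall into the same cell. A production system would need multiple resolutions, persistent storage, and careful reprojection, none of which is shown here.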
Clients can now use PAIRS Geoscope at different levels to tap a vast and valuable source of previously underutilized data. As an information service, PAIRS Geoscope can quickly provide a variety of contextual information about a particular place and time. Used as a discovery service, it can identify a set of regions that share a similar set of client-defined characteristics. As an advanced analytics service, it can leverage machine learning and artificial intelligence techniques to make predictions based on a complex mix of parameters, models, and historical data.
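The discovery use case can be pictured with a short Python sketch. It assumes the data has already been aligned so that each region carries a value for each layer. The function name `find_matching_regions`, the region identifiers, the layer names, and the numbers below are all made up for illustration and do not reflect the actual PAIRS Geoscope interface or data.

```python
def find_matching_regions(regions, criteria):
    """Return the ids of regions whose layer values fall inside every range.

    regions:  {region_id: {layer_name: value}}
    criteria: {layer_name: (low, high)}, with inclusive bounds
    """
    matches = []
    for region_id, layers in regions.items():
        if all(
            name in layers and low <= layers[name] <= high
            for name, (low, high) in criteria.items()
        ):
            matches.append(region_id)
    return sorted(matches)


# Hypothetical, hand-written sample values (not real measurements).
sample_regions = {
    "cell_a": {"annual_rain_mm": 520, "mean_temp_c": 16.5, "elevation_m": 80},
    "cell_b": {"annual_rain_mm": 1400, "mean_temp_c": 11.0, "elevation_m": 600},
    "cell_c": {"annual_rain_mm": 480, "mean_temp_c": 17.2, "elevation_m": 120},
    "cell_d": {"annual_rain_mm": 500, "mean_temp_c": 17.0},  # no elevation
}

# Client-defined characteristics: warm, fairly dry, low-lying regions.
client_criteria = {
    "annual_rain_mm": (400, 600),
    "mean_temp_c": (15.0, 18.0),
    "elevation_m": (0, 200),
}

similar = find_matching_regions(sample_regions, client_criteria)
```

Regions that lack a required layer are excluded rather than guessed at, which is one possible design choice. A real service would run this kind of filter over indexed data at far larger scale.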
The ongoing development of PAIRS Geoscope remains intentionally tethered to its real-world origins and is currently in trial deployments with clients in the areas of agriculture, finance, energy, and meteorology. For example, IHI Corporation, a global engineering, construction and manufacturing company that provides a broad range of products in aero engine, space and defense and other business areas, is working with the service to develop a new system for improving the accuracy of long-term (30 days or more) weather forecasts by more than 30 percent over all other techniques.
Specifically, the team uses data from GPS Radio Occultation sensors on satellites, which can yield three-dimensional temperature, pressure, and humidity profiles of the atmosphere. IHI and its customers use PAIRS Geoscope to blend this data with historical and long-term weather forecast data and machine learning techniques to produce improved weather forecast insights.
An interactive web interface for the service allows users to quickly and easily run queries across petabytes of geospatial-temporal data. Results appear as visualizations and can be downloaded in a variety of formats (which in future versions will include Spark DataFrames). A REST API provides developers with a unified cloud-based interface to the technology, allowing them to enhance their applications without replacing or disrupting their preferred set of mapping, visualization, data acquisition, and control platforms.
The world of digital discovery has been revolutionized by the ability to index and rapidly search web pages, social networks, and business transactions. Geospatial-temporal data, due to its size and complexity, has been resistant to this trend and remains underutilized. IBM PAIRS Geoscope takes geospatial-temporal data out of the dark and enables clients to capture the full value of this ever-growing, ubiquitous, and vitally important class of information in their applications. To explore PAIRS Geoscope, visit the landing page, and let us know how you plan to use it and the insights you hope to gain.