IBM Analytics Engine and Watson Studio now support a wide variety of geospatial functions as native components of these services. 

The library, developed by IBM Research, already provides geospatial and temporal functions for various other IBM products and now joins these two cloud services. With this feature, Watson Studio (in its Spark environments) and Analytics Engine support complex geospatial functions and extend data science to location analytics.

Key features of spatiotemporal functions

  • Geodetic function support—all functions are accurate for all geometries without the need for projections, including large geometries like entire countries or hemispheres and geometries that are near the poles or the anti-meridian.
  • In Analytics Engine and Watson Studio Spark environments, all geospatial functions are fully distributable and take advantage of Spark’s native distributed processing capabilities, which improves overall performance.
  • Extensions of Spark distributed joins to perform spatial, temporal, and spatiotemporal joins.
  • Native geohash support for arbitrary geometries that can be used for simple aggregations and object storage spatial indexing, improving cloud storage retrieval.
  • SQL/MM extensions to Spark SQL, providing native SQL support for geospatial functions.
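
To make the geohash idea concrete, here is a minimal sketch of the standard public geohash encoding for a single point, written in plain Python. It is an illustration only: it does not use the IBM library, it covers points rather than the arbitrary geometries mentioned above, and the function name `geohash_encode` is a hypothetical helper introduced for this example.

```python
# Standard geohash alphabet: base 32 without the letters a, i, l, o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair (degrees) as a geohash string.

    Hypothetical helper for illustration; not part of any IBM API.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits alternate between axes, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1  # point is in the upper half
                lon_lo = mid
            else:
                value <<= 1               # point is in the lower half
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0

    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

Because a shorter geohash is always a prefix of a longer one for the same point, grouping records by a fixed-length prefix gives a simple grid aggregation, and the same prefix can serve as a key for partitioning objects in cloud storage. Points on opposite sides of a cell boundary can be close together yet share no prefix, so a prefix match is a convenience rather than a distance guarantee.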

Why geospatial support?

Location sensors have become first-class citizens in many devices, spanning phones, connected cars, and IoT sensor feeds. Fusing geographic feature data (e.g., zip code polygons, address features) with this location sensor data is key to enriching it with contextual information. Adding such location context to most enterprise problems provides valuable business insights.

Geospatial information, either by itself or in combination with traditional relational data, can help institutions and businesses do things like decide in which areas to provide services or determine the locations of possible markets. For example:

  • The manager of a county welfare district can verify which welfare applicants and recipients actually live within the area that the district services. This can be done by analyzing the geometry of the service area and the addresses of the applicants and recipients.
  • The owner of a restaurant chain wants to open new restaurants in nearby cities and needs answers to questions such as:
    • Where in these cities are concentrations of the types of people who typically frequent restaurants like mine? 
    • Where are the major highways? 
    • Where is the crime rate lowest? 
    • Where are competing restaurants located?
  • A business analyst at a health insurance company wants to determine whether there are primary care providers within a 15-mile driving distance of each customer. A list of health care providers can then be suggested proactively and customized to each customer's needs.
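
As a rough illustration of the health insurance example, the sketch below pre-filters providers by great-circle distance using the haversine formula in plain Python. Several things here are assumptions: the Earth is treated as a sphere with a mean radius of 3958.8 miles, the coordinates and provider names are made up, and the helper names are hypothetical rather than taken from the IBM library. Straight-line distance is only a lower bound on driving distance, so this step can rule providers out but cannot confirm a 15-mile drive; that would need road network data.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean radius; spherical approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def candidate_providers(customer, providers, max_miles=15.0):
    """Return names of providers whose straight-line distance is in range.

    `customer` is a (lat, lon) tuple; `providers` maps name -> (lat, lon).
    """
    lat, lon = customer
    return [
        name
        for name, (plat, plon) in providers.items()
        if great_circle_miles(lat, lon, plat, plon) <= max_miles
    ]


# Made-up sample data for illustration only.
customer = (40.0, -75.0)
providers = {"Clinic A": (40.1, -75.0), "Clinic B": (41.0, -75.0)}
print(candidate_providers(customer, providers))
```

In a Spark setting the same comparison would be expressed as a distance-based spatial join between a customer table and a provider table rather than a Python loop.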

What does it do?

Our geospatial functions include, but are not limited to, the following:

  • Topological functions, per the DE-9IM standard
  • Metric functions, such as distance and azimuth
  • SQL/MM extensions to Spark SQL
  • Spatial indexing
  • WKT and GeoJSON support
  • Spatial, temporal, and spatiotemporal joins
  • Distributed spatial functions
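
For readers unfamiliar with azimuth, the sketch below computes the initial bearing from one point to another, measured clockwise from north, in plain Python. It assumes a spherical Earth, which is a simplification, and the function name `initial_azimuth_deg` is a hypothetical helper for this example; it is not the library's API and may not match the Earth model the library uses.

```python
import math


def initial_azimuth_deg(lat1, lon1, lat2, lon2):
    """Initial bearing in degrees [0, 360) from point 1 to point 2.

    Inputs are in degrees. Spherical approximation; hypothetical helper.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    # Components of the direction vector at point 1 (east, north).
    east = math.sin(dlam) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    theta = math.atan2(east, north)
    # Shift from (-180, 180] to [0, 360).
    return (math.degrees(theta) + 360.0) % 360.0


print(initial_azimuth_deg(0.0, 0.0, 0.0, 10.0))   # heading due east along the equator
print(initial_azimuth_deg(10.0, 0.0, 0.0, 0.0))   # heading due south along a meridian
```

Along a great circle the bearing generally changes as you travel, which is why this value is called the initial azimuth.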

Get started

This functionality is available in the Spark Python and Spark Scala environments on Watson Studio and Analytics Engine clusters. You can get started by working through the sample notebooks for Spatial and Spatial Index.
