IBM Analytics Engine and Watson Studio now support a wide variety of geospatial functions as native components of these services.
The underlying library was developed by IBM Research and already provides geospatial and temporal functions for various other IBM products; it is now part of these two cloud services. With this feature, Watson Studio (in its Spark environments) and Analytics Engine support complex geospatial functions, extending data science work to location analytics.
Key features of spatiotemporal functions
- Geodetic function support—all functions are accurate for all geometries without the need for projections, including large geometries like entire countries or hemispheres and geometries that are near the poles or the anti-meridian.
- In Analytics Engine and Watson Studio Spark environments, all the geospatial functions are fully distributable and can take advantage of Spark's native distributed processing capabilities, which improves overall performance.
- Extensions of Spark distributed joins to perform spatial, temporal, and spatiotemporal joins.
- Native geohash support for arbitrary geometries that can be used for simple aggregations and object storage spatial indexing, improving cloud storage retrieval.
- SQL/MM extensions to Spark SQL, providing native SQL support for geospatial functions.
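To make the geohash feature above more concrete, here is a minimal sketch of the standard public geohash encoding for a single point, followed by a simple count-per-cell aggregation. It is plain Python and does not use the IBM library; the names `geohash_encode` and `count_by_cell` are hypothetical, and the library's own support for arbitrary geometries goes beyond this point-only illustration.

```python
from collections import Counter

# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits are interleaved, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def count_by_cell(points, precision=5):
    """Aggregate (lat, lon) points by the geohash cell that contains them."""
    return Counter(geohash_encode(lat, lon, precision) for lat, lon in points)


if __name__ == "__main__":
    sample = [(57.64911, 10.40744), (57.6492, 10.4075), (0.0, 0.0)]
    print(count_by_cell(sample))
```

Because nearby points share a geohash prefix, the same string can serve as a grouping key for aggregations or as a key prefix when laying out data in object storage.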
Why geospatial support?
Location sensors have become first-class citizens in a wide range of devices, including phones, connected cars, and IoT sensor feeds. Fusing geographic feature data (e.g., zip code polygons, address features) with this location sensor data is key to extracting contextual information and enriching the data with it. Adding such location context to most enterprise problems provides valuable business insights.
Geospatial information, either by itself or in combination with traditional relational data, can help institutions and businesses do things like decide in which areas to provide services or determine the locations of possible markets. For example:
- The manager of a county welfare district can verify which welfare applicants and recipients actually live within the area that the district services. This can be done by analyzing the geometry of the service area and the addresses of the applicants and recipients.
- The owner of a restaurant chain wants to open new restaurants in nearby cities and needs the answer to such questions as:
  - Where in these cities are concentrations of the types of people who typically frequent restaurants like mine?
  - Where are the major highways?
  - Where is the crime rate lowest?
  - Where are competing restaurants located?
- A business analyst at a health insurance company wants to determine whether there are primary care providers within a 15-mile driving distance of each customer. A list of health care providers, customized to each customer's needs, can then be suggested proactively.
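The welfare district example in the list above comes down to a point-in-polygon test: is an applicant's geocoded address inside the service area? The sketch below uses a simple planar ray-casting test in plain Python. It is only an illustration of the idea, not the IBM library's API, and unlike the geodetic functions described earlier it treats coordinates as flat x/y values, so it is reasonable only for small areas away from the poles and the anti-meridian. The function names, the service area, and the applicant data are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Planar ray-casting test: True if (x, y) lies inside `polygon`.

    `polygon` is a list of (x, y) vertices; the last vertex is implicitly
    connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def residents_in_area(applicants, service_area):
    """Return names of applicants whose (x, y) location is inside the area."""
    return [
        name
        for name, (x, y) in applicants.items()
        if point_in_polygon(x, y, service_area)
    ]


if __name__ == "__main__":
    # Hypothetical square service area and applicant locations.
    area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    people = {"applicant_a": (2.0, 2.0), "applicant_b": (5.0, 2.0)}
    print(residents_in_area(people, area))
```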
What does it do?
Our geospatial functions include, but are not limited to, the following:
- Topological functions, per the DE-9IM standard
- Metric functions such as distance and azimuth
- SQL/MM extensions to Spark SQL
- Spatial indexing
- WKT and GeoJSON support
- Spatial, temporal, and spatiotemporal joins
- Distributed spatial functions
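As an illustration of what the metric functions in the list above compute, the following plain-Python sketch gives a great-circle distance and an initial azimuth (bearing) between two points. It assumes a spherical Earth with a mean radius of 6371.0088 km, which is a simplification; the function names are hypothetical and this is not the IBM library's implementation or API.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_azimuth_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (
        math.cos(phi1) * math.sin(phi2)
        - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    )
    return math.degrees(math.atan2(y, x)) % 360.0


if __name__ == "__main__":
    # One degree of longitude along the equator, heading due east.
    print(haversine_km(0.0, 0.0, 0.0, 1.0))
    print(initial_azimuth_deg(0.0, 0.0, 0.0, 1.0))
```

A straight-line distance like this can serve as a cheap first-pass filter (for example, discarding providers that are clearly farther than 15 miles away), but it is not a driving distance.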
This functionality is available in the Spark Python and Spark Scala environments on Watson Studio and Analytics Engine clusters. You can get started by working through the sample notebooks for Spatial and Spatial Index.