Analytics Engine HDFS connection

Use the Analytics Engine HDFS connection to connect to IBM Analytics Engine with the WebHDFS API.

IBM Analytics Engine is a Hadoop and Spark service on IBM Cloud that provides an environment to develop and deploy advanced analytics applications. Data is stored in IBM Cloud Object Storage (COS). The Analytics Engine service starts clusters of compute nodes when needed. Analytics Engine HDFS was formerly known as "IBM BigInsights on Cloud."
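For orientation, WebHDFS is a REST API that addresses files with URLs of the form <base>/webhdfs/v1/<path>?op=<OPERATION>. The following is a minimal sketch of how such a URL is put together, not a description of what the connection does internally. The helper name webhdfs_url, the host name, and the gateway prefix are hypothetical; use the endpoint from your own service credentials.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base_url: str, hdfs_path: str, op: str, **params: str) -> str:
    """Build a WebHDFS REST URL: <base>/webhdfs/v1/<path>?op=<OP>[&key=value...]."""
    # Normalize the HDFS path so that it always starts with exactly one slash.
    path = "/" + hdfs_path.strip("/")
    # The operation name (for example LISTSTATUS or OPEN) goes in the query string.
    query = urlencode({"op": op, **params})
    # quote() percent-encodes characters such as spaces but keeps "/" intact.
    return f"{base_url.rstrip('/')}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical endpoint, shown only to illustrate the URL layout.
base = "https://example-cluster.example.com:8443/gateway/default"
print(webhdfs_url(base, "/user/alice/data", "LISTSTATUS"))
```

LISTSTATUS and OPEN are standard WebHDFS operations for listing a directory and reading a file. Authentication is deliberately left out of this sketch.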

Create a connection to IBM Analytics Engine

To create the connection asset, you need these connection details:

For Credentials and Certificates, you can use secrets if a vault is configured for the platform and the service supports vaults. For information, see Using secrets from vaults in connections.

Select Use Home As Root to use the user's home directory as the root directory when you browse.
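The effect of this option can be pictured as a change in how browse paths are resolved. The sketch below is illustrative only: it assumes the common HDFS convention that a user's home directory is /user/<username>, and the function name resolve_browse_path is hypothetical rather than part of the product.

```python
import posixpath


def resolve_browse_path(username: str, browse_path: str, use_home_as_root: bool) -> str:
    """Map a path shown in the browser to an absolute HDFS path."""
    # Assumption: the home directory follows the usual /user/<username> layout.
    root = f"/user/{username}" if use_home_as_root else "/"
    # Treat the browse path as relative to the chosen root, then tidy the result.
    return posixpath.normpath(posixpath.join(root, browse_path.lstrip("/")))


# With the option selected, browsing starts in the user's home directory.
print(resolve_browse_path("alice", "data/sales.csv", use_home_as_root=True))
# Without it, browsing starts at the file system root.
print(resolve_browse_path("alice", "data/sales.csv", use_home_as_root=False))
```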

Hive properties
The Hive properties apply only when you use the Analytics Engine HDFS connection as a target (to write data). If you specify Hive properties and write a file to the target HDFS, a Hive connection is established and a Hive table is created for that file. To browse the Hive tables in Analytics Engine, use the Apache Hive connection.
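To make the idea of a Hive table over a written file concrete, the sketch below generates the kind of HiveQL statement that defines an external table on top of delimited files in an HDFS directory. This is an assumption made for illustration: the statement that the connection actually issues is not documented here, and the function name, database, table, columns, and location are all hypothetical.

```python
from typing import Iterable, Tuple


def hive_external_table_ddl(
    database: str,
    table: str,
    columns: Iterable[Tuple[str, str]],
    location: str,
) -> str:
    """Return a HiveQL statement for an external table over comma-delimited text files."""
    # Each column is a (name, hive_type) pair, for example ("amount", "DOUBLE").
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


# Hypothetical table over a directory that holds the written CSV file.
ddl = hive_external_table_ddl(
    "default",
    "sales",
    [("id", "INT"), ("amount", "DOUBLE")],
    "/user/alice/sales",
)
print(ddl)
```

An external table leaves the data files in place in HDFS, which matches the idea of a table associated with a file that was already written.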

Choose the method for creating a connection based on where you are in the platform

In a project: Click Add to project > Connection. See Adding a connection to a project.

In a catalog: Click Add to catalog > Connection. See Adding a connection asset to a catalog.

In a deployment space: Click Add to space > Connection. See Adding data assets to a deployment space.

In the Platform assets catalog: Click New connection. See Adding platform connections.

Next step: Add data assets from the connection

Where you can use this connection

You can use Analytics Engine HDFS connections in the following workspaces and tools:

Analytics projects

Catalogs

Analytics Engine setup

Getting started tutorial

Supported file types

The Analytics Engine HDFS connection supports these file types: Avro, CSV, Delimited text, Excel, JSON, ORC, Parquet, SAS, SAV, SHP, and XML.
