Frequently asked questions

Get answers to the most commonly asked questions about this product.



What file types are supported?

This software supports structured and semi-structured data sets. The following formats are currently supported: XLS, XLSX, CSV, JSON, Avro, Parquet, TXT, and Tableau. PDF and video files are not supported at this time.

Is this a cloud solution?

No. IBM InfoSphere Advanced Data Preparation is available for on-premises deployment only. However, the solution can provide connectivity to your existing cloud platforms and private cloud environments.

How can I import a data set?

Start by creating a flow, which is a container for organizing and managing data sets. You can then add data sets to the flow by importing data from your local machine, a relational database, a file system, or an existing flow.

Can I join data sets together?

Yes. IBM InfoSphere Advanced Data Preparation supports joining disparate data sets within your flow.

Does this solution use sampling?

Yes. The solution supports six sample types: first rows, random, filter-based, anomaly-based, stratified, and cluster-based. Note that some sample types are available only if Hadoop is enabled. A sample can be drawn either from the entire data set or from a quick sample of the first 2 GB of data.
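
To make the difference between some of these sample types concrete, here is a minimal Python sketch of four of them (first rows, random, filter-based, and stratified). It is a generic illustration only, not the product's implementation; the function names, the in-memory list of rows, and the "region" field are all assumptions made for the example.

```python
import random
from collections import defaultdict


def first_rows(rows, n):
    """First-rows sample: take the first n rows in their original order."""
    return rows[:n]


def random_sample(rows, n, seed=0):
    """Random sample: draw up to n rows uniformly, without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def filter_sample(rows, predicate):
    """Filter-based sample: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def stratified_sample(rows, key, fraction, seed=0):
    """Stratified sample: draw the same fraction from every group of key(row)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    sample = []
    for members in groups.values():
        # Keep at least one row per group so small groups stay represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 20 rows, 5 in region "west" and 15 in region "east".
rows = [{"id": i, "region": "west" if i % 4 == 0 else "east"} for i in range(20)]

head = first_rows(rows, 3)
rand = random_sample(rows, 5)
west_only = filter_sample(rows, lambda r: r["region"] == "west")
strat = stratified_sample(rows, key=lambda r: r["region"], fraction=0.2)
```

In this sketch the stratified sample keeps the 1:3 west-to-east ratio of the full data set, which a first-rows or small random sample does not guarantee.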

Can I grant different access levels?

Yes. Admin and user accounts can be customized to fit your needs, which helps your organization provide the right data to the right data citizens.

Is there full integration with IBM InfoSphere Information Server or IBM Watson Knowledge Catalog?

Not at this time, but it is planned. Because IBM InfoSphere Advanced Data Preparation complements both products from a data pipeline perspective, IBM is actively working to incorporate this solution into an integrated platform.