Query many data sources as one: IBM Queryplex for data analytics

Queryplex runs advanced analytics (SQL, Python, R, PySpark, etc.) across many devices and data sources as though they were a single consolidated data repository. The technology can be used to eliminate the data silos created by multiple databases (e.g. Oracle, DB2, PostgreSQL, Netezza), or to compute analytics across tens of thousands of distributed Internet of Things devices where data may be stored in smaller repositories (text files, Excel spreadsheets, Informix, MySQL). Queryplex lets you query many data sources at once with a single statement, whether they are large repositories, small devices, or any combination of them.

Queryplex creates a computational mesh across all of your data sources, enabling them to self-organize and collaborate when processing a query. By connecting all of your data sources together, Queryplex can turn your many data devices into a virtual supercomputer. This means you’ll often accomplish many times more with Queryplex than you could with either standard edge computing or centralized big data systems.

IBM Queryplex

Queryplex cloud service makes it possible to query huge numbers of data sources as if they were one

Data silos are a thing of the past

With Queryplex you can easily run queries across many data repositories as if all of the data were centralized. It’s the ultimate virtual data lake!
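To make the idea concrete, here is a minimal Python sketch of querying several repositories as if they were one. It is not the Queryplex API, which this post does not show. In-memory SQLite databases stand in for the separate data sources, and the helper names (`make_source`, `query_all`) and the `readings` table are hypothetical.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device TEXT, temp REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


def query_all(sources, sql):
    """Run the same statement on every source and merge the result rows."""
    merged = []
    for conn in sources:
        merged.extend(conn.execute(sql).fetchall())
    return merged


# Three independent "silos" (illustrative data only).
sources = [
    make_source([("a1", 20.5), ("a2", 31.0)]),
    make_source([("b1", 28.0)]),
    make_source([("c1", 35.5), ("c2", 19.0)]),
]

# One statement, answered by all sources together.
hot = sorted(query_all(sources, "SELECT device, temp FROM readings WHERE temp > 25"))
print(hot)
```

The caller writes a single statement and never needs to know how many sources answered it. A real federation layer would also have to handle differing schemas, SQL dialects, and unreachable sources, which this sketch leaves out.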

Analytics for Internet of Things

Queryplex’s powerful distributed algorithms scale massively to hundreds of thousands of nodes, with very low requirements for compute and RAM. This is ideal for the Internet of Things. You can now run complex analytics across thousands of devices with ease.

Query over many data formats

Queryplex supports a huge range of source databases and data formats that you can query as a single source, including CSV (text) files, Excel, Oracle, MySQL, SQL Server, DB2, PostgreSQL, MongoDB (JSON), and Cloudant (JSON).

Query with multiple languages & tools

Queryplex supports the most popular languages and environments for data science and analytics, including SQL, R, Python, Scala, PySpark, and Jupyter Notebooks. You can use your favorite analytics tools too, such as RStudio, Tableau, Cognos, Watson Analytics, Spotfire, and MicroStrategy.

Queryplex provides industry-leading language compatibility. Products like Oracle, Netezza, PostgreSQL, and DB2 each support their own variation of the SQL standard. Queryplex supports a wide range of these SQL dialects, as well as stored procedure languages like PL/SQL and SQL PL, so you can run queries originally written for these platforms directly on Queryplex without significant changes.

Computational mesh – The power of many together

Queryplex creates a computational mesh across all of your data sources, allowing the newly formed constellation of devices and clusters to collaborate on computing your analytics. The mesh forms organically as data sources connect, using a dynamic algorithm that is location-, subnet-, and latency-aware. By combining the compute from many sources, you can aggregate huge compute capacity that would never be practical to configure in a single cluster. We believe this is a game changer for distributed analytic query processing. In many cases you will achieve acceleration of 10 to 10,000 times over what would be possible using either a data cluster or edge computing alone.
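As a rough illustration of how devices can collaborate on a single result, the Python sketch below computes a global average by having each node reduce its own data to a small partial result, then merging partials in groups until one remains. This is a generic tree-style aggregation, not Queryplex's actual mesh algorithm, which this post does not describe. The names `Partial`, `local_partial`, `combine`, and `mesh_mean`, and the `fan_in` parameter, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Small summary a node can ship instead of its raw data."""
    total: float
    count: int


def local_partial(values):
    """Work done on each node: reduce local rows to a (sum, count) pair."""
    return Partial(sum(values), len(values))


def combine(a, b):
    """Merge two partial results; order of merging does not matter."""
    return Partial(a.total + b.total, a.count + b.count)


def mesh_mean(node_data, fan_in=2):
    """Merge partials in groups of `fan_in` until a single result remains."""
    level = [local_partial(values) for values in node_data]
    if not level:
        raise ValueError("no nodes to aggregate")
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), fan_in):
            group = level[i:i + fan_in]
            merged = group[0]
            for partial in group[1:]:
                merged = combine(merged, partial)
            next_level.append(merged)
        level = next_level
    root = level[0]
    if root.count == 0:
        raise ValueError("no values to average")
    return root.total / root.count


# Five nodes holding different amounts of data (illustrative values only).
node_data = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0], [7.0, 8.0, 9.0, 10.0], [11.0]]
result = mesh_mean(node_data)
print(result)
```

Only the compact partial results move between nodes, so the raw data never has to be copied to a central cluster. A location- or latency-aware mesh would additionally choose which nodes merge with which; here the grouping is simply by list position.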

Queryplex computational mesh

The computational mesh capabilities in Queryplex multiply your effective computing power in a way never before possible. Achieve a speedup of 10 to 10,000 times over other edge-computing methods.

Queryplex for hybrid cloud computing

Queryplex is a cloud service that connects data sources together anywhere in the world, whether on-premises or in a public cloud. When your application connects to Queryplex, it looks and feels to the application as though it is connecting to a single repository such as a Spark cluster, or a relational database like Oracle or DB2. In fact, while the connection point for Queryplex is in the cloud, your data can reside in many data sources around the world or throughout your company. This means that you can use Queryplex to query data sources in the cloud, on mobile devices, or inside your company behind a firewall (with your permission, of course!) as if all that data were inside a single repository.

Why Queryplex?

Queryplex is the only technology available today that can create a computational mesh that distributes analytic workload across many devices and data sources.

  • Simplicity! Run analytics easily across many data sources at once. Avoid the complexity of creating and maintaining a centralized data repository, and eliminate data silos. It is a fully managed cloud service, which means that you never have to worry about configuration, tuning, backups or availability; we’ve got those covered for you.
  • Cost. Avoid the cost of creating and maintaining centralized data repositories (big data clusters or data warehouses).
  • Data currency. Queryplex computes data analytics directly on the data sources, not on a stale copy that has been streamed or replicated from the original sources.  When you run analytics, you will have confidence that the result was based on the most current data you have access to.
  • Performance. Multiply your processing potential as never before by leveraging many devices in concert. Queryplex’s unique computational mesh technology is a profound breakthrough that offers extreme computing capability very simply. In many cases your performance will be an order of magnitude faster (or more) than what you were previously able to experience.
