I have been working with the Gaian Database recently to demonstrate its scalability.
In the tests I grew a database cluster to over a thousand Gaian Database nodes, then measured the time it took to query across all of those nodes and to fetch over a million rows of data. I also tested the impact on speed of executing multiple queries at the same time.
I will include more detailed postings on each of these three cases, but the high level results are as follows:
Query Time – We are able to query all 1000 nodes in about 1/8 second. The results show that the query time grows logarithmically with the number of nodes. In other words, as you add more and more databases, each addition increases the query time by less, which gives excellent scaling. The way that a Gaian Network is grown from individual nodes automatically ensures this behaviour.
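To give a feel for why logarithmic growth scales so well, here is a minimal Python sketch. It is not the Gaian Database's actual discovery or routing code. It assumes an idealised network shaped like a tree in which every node forwards a query to a fixed number of neighbours. The function name `hops_to_reach` and the `fan_out` parameter are hypothetical and chosen only for illustration.

```python
def hops_to_reach(n_nodes: int, fan_out: int) -> int:
    """Return the smallest tree depth whose levels hold at least n_nodes.

    Assumption (not from the Gaian code): the network is an idealised
    tree where each node forwards the query to `fan_out` neighbours.
    """
    if n_nodes < 1 or fan_out < 2:
        raise ValueError("need n_nodes >= 1 and fan_out >= 2")
    reached = 1   # the node where the query is injected
    frontier = 1  # nodes reached at the current depth
    depth = 0
    while reached < n_nodes:
        frontier *= fan_out   # each frontier node forwards to fan_out more
        reached += frontier
        depth += 1
    return depth


if __name__ == "__main__":
    # Growing the network tenfold adds only a hop or two each time.
    for n in (10, 100, 1000):
        print(n, "nodes ->", hops_to_reach(n, fan_out=4), "hops")
```

With a hypothetical fan-out of 4, going from 100 to 1000 nodes adds a single hop to the longest path in this model, which is the kind of behaviour that a logarithmic query time implies.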
Fetch Time – We are able to fetch 1 million rows of data in under 5 seconds. The fetch time is proportional to the amount of data returned, so if you fetch twice the data it takes twice as long, regardless of which of the 1000 nodes the data resides in. The Gaian Database actively pre-fetches the data from all the nodes to achieve this scalability.
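The proportional relationship can be written as a one-line model. The Python sketch below is only an illustration. It takes the single figure quoted above (1 million rows in under 5 seconds) and treats it as a conservative floor of 200,000 rows per second. The function name `estimate_fetch_seconds` and the constant are hypothetical, and the real rate will depend on the hardware and on row size.

```python
# Conservative floor derived from "1 million rows in under 5 seconds".
# This is an assumption for illustration, not a measured constant.
ROWS_PER_SECOND_FLOOR = 1_000_000 / 5


def estimate_fetch_seconds(rows: int,
                           rows_per_second: float = ROWS_PER_SECOND_FLOOR) -> float:
    """Estimate fetch time with a purely linear model: time = rows / rate."""
    if rows < 0 or rows_per_second <= 0:
        raise ValueError("rows must be >= 0 and rows_per_second must be > 0")
    return rows / rows_per_second


if __name__ == "__main__":
    # Doubling the rows doubles the estimate in this linear model.
    for rows in (500_000, 1_000_000, 2_000_000):
        print(rows, "rows ->", estimate_fetch_seconds(rows), "seconds at most")
```

Because the model has no term for the number of nodes, it reflects the observation that the location of the data does not change the fetch time.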
Concurrent Queries – I injected queries from up to 40 nodes at the same time. The Gaian Database handled these queries robustly, with a modest increase in the query time caused by running out of available processor time on our test platform.
There have been a number of changes to the Gaian code to achieve these results, and a new release will be delivered to Alphaworks soon.
Check out the following link for a visualization of 1250 Gaian Database nodes in a network: http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/gaian-db-1250-nodes