High Performance and High Throughput Cluster
HPC (High Performance Cluster) and HTC (High Throughput Cluster) systems have been used for a long time in scientific research and commercial activities. From the beginning, these clusters have relied on heavy configuration on both the server and the nodes.
We can design new hybrid clusters that build on the existing models, in which nodes connected through fast interconnects on the local network, or over the internet, contribute their computing resources. The cluster will also have a dedicated mode in which the computers are connected to each other and dedicated solely to the cluster. The cluster will have one remote boot server based on LTSP (Linux Terminal Server Project). This remote boot server will be responsible for pushing the right operating system to each node according to its configuration, so that every node runs a tailored operating system optimized to deliver full performance. The remote boot server will also hold all the configuration required by the nodes.
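As a rough illustration of how the remote boot server might match an operating system image to a node, here is a minimal Python sketch. It is not taken from the project code and does not use any LTSP interface. The NodeSpec fields, the image names and the RAM threshold are all hypothetical, chosen only to show the idea of selecting an image from the node's reported configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hardware description a node reports at boot (hypothetical fields)."""
    mac: str
    arch: str
    ram_mb: int
    has_disk: bool


# Hypothetical image catalogue: (architecture, minimum RAM in MB, image name).
# Entries are ordered from most to least demanding, so the first match wins.
IMAGES = [
    ("x86_64", 4096, "x86_64-full"),
    ("x86_64", 0, "x86_64-minimal"),
    ("i386", 0, "i386-minimal"),
]


def select_image(node: NodeSpec) -> str:
    """Return the name of the first catalogue image the node can run."""
    for arch, min_ram_mb, name in IMAGES:
        if node.arch == arch and node.ram_mb >= min_ram_mb:
            return name
    raise LookupError(f"no image available for architecture {node.arch!r}")
```

With this catalogue, a 64-bit node with 8 GB of RAM would get "x86_64-full", while a 64-bit node with 1 GB would fall back to "x86_64-minimal". A real server would also have to hand the chosen image to the boot mechanism, which is outside this sketch.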
This cluster system will also have a dynamic distributed file system. Any machine with permanent storage that boots on the cluster will automatically become part of the distributed file system. A central file server will be responsible for managing data distribution and backup. This server will decide where each piece of data is stored and will also push a copy of it to the backup server. When a node boots in the cluster, its storage will be reinitialized, and data will be shared with it from other nodes or from the backup server, which will send it the data of a node that recently went down. The new node will then read and write data on the distributed file system and, when going down, notify the central server. The central server will distribute the data that was held on this node to the other nodes in the cluster, and the cluster will resume normal operations.
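The bookkeeping done by the central file server can be sketched as follows. This is only a simplified model under stated assumptions, not the project's implementation. The class name CentralFileServer, the "least loaded node" placement rule and the in-memory dictionary standing in for the backup server are all hypothetical, and the sketch leaves out refilling a newly joined node with existing data.

```python
class CentralFileServer:
    """Toy model of the central server's placement and recovery logic."""

    def __init__(self):
        self.nodes = {}   # node name -> set of block ids stored on that node
        self.backup = {}  # block id -> data (stands in for the backup server)

    def join(self, node):
        """A node boots: its storage is reinitialized, so it starts empty."""
        self.nodes[node] = set()

    def _least_loaded(self):
        """Pick the node holding the fewest blocks (ties broken by name)."""
        if not self.nodes:
            raise RuntimeError("no storage nodes available")
        return min(sorted(self.nodes), key=lambda n: len(self.nodes[n]))

    def store(self, block_id, data):
        """Place a block on a node and push a copy to the backup."""
        target = self._least_loaded()
        self.nodes[target].add(block_id)
        self.backup[block_id] = data
        return target

    def leave(self, node):
        """A node notifies the server that it is going down.

        Its blocks are reassigned to the remaining nodes. Returns a mapping
        of block id -> new node. If no nodes remain, the blocks survive
        only in the backup and the mapping is empty.
        """
        orphaned = self.nodes.pop(node)
        moved = {}
        if not self.nodes:
            return moved
        for block_id in sorted(orphaned):
            target = self._least_loaded()
            self.nodes[target].add(block_id)
            moved[block_id] = target
        return moved
```

For example, after two nodes join and three blocks are stored, calling leave on one node moves its blocks to the other, while the backup copy of every block stays untouched. In a real cluster the reassigned blocks would be copied from the departing node or restored from the backup server.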
Maintenance of this type of cluster will be very easy, as nodes can be added or removed on the go and there is no special configuration required for adding new nodes or expanding the storage of the cluster as a whole.
I am still working on the project and have completed the LTSP integration. I would like to have your support. It is an open source program released under the GPL v2.
The repository is at http://newerahpc.googlecode.com