TRN
08-09-2003, 07:49 AM
The Debian mailing list has a good thread going about clusters. Someone asked what they are and why anyone would need one.
Ron Johnson answered this way:
- The Original Cluster - The VAXcluster (now VMScluster, since there's the Alpha) is a Shared Disk Cluster. This means that there is a central "storage controller" that the disks plug into, and then the computers plug into the storage controller. There is a dedicated network link between all the cluster nodes, and a Distributed Lock Manager (DLM) runs on each node. The nodes tell each other which files, and which sections of each file, each node has open at any one moment. That way, processes on node A don't overwrite file modifications made by a process on node B. The DLM is deeply woven into the OS and libraries, so it's all transparent to the app. As of now, there can be up to 128 nodes in a cluster, I think.
Good for general computing applications.
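To make the locking idea concrete, here is a minimal sketch of the bookkeeping a lock manager does. It is a toy: one in-process table rather than a distributed service, and the names `ToyLockManager`, `RangeLock`, and the path `/data/ledger` are made up for illustration. It is not how VMS implements its DLM.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RangeLock:
    node: str   # name of the node holding the lock
    start: int  # first byte covered, inclusive
    end: int    # last byte covered, exclusive


@dataclass
class ToyLockManager:
    # file path -> list of range locks currently held on that file
    held: dict = field(default_factory=dict)

    def acquire(self, node, path, start, end):
        """Grant the lock unless a different node holds an overlapping range."""
        for lock in self.held.get(path, []):
            overlaps = start < lock.end and lock.start < end
            if overlaps and lock.node != node:
                return False
        self.held.setdefault(path, []).append(RangeLock(node, start, end))
        return True

    def release(self, node, path, start, end):
        """Drop a previously granted lock, if it is still held."""
        locks = self.held.get(path, [])
        wanted = RangeLock(node, start, end)
        if wanted in locks:
            locks.remove(wanted)


dlm = ToyLockManager()
a_first = dlm.acquire("A", "/data/ledger", 0, 100)       # A locks bytes 0..99
b_overlap = dlm.acquire("B", "/data/ledger", 50, 150)    # collides with A
b_disjoint = dlm.acquire("B", "/data/ledger", 100, 200)  # no collision
dlm.release("A", "/data/ledger", 0, 100)
b_retry = dlm.acquire("B", "/data/ledger", 50, 150)      # A is gone now
```

In a real cluster this table is spread across the nodes and kept consistent over the dedicated link, which is the hard part this sketch skips.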
- Failover - The nodes are plugged into shared disks, like above. The main node handles all processing, and emits a "heartbeat". If the secondary node stops hearing the heartbeat, it takes over.
This is what Microsoft Cluster Services is. Sucks.
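The heartbeat logic can be sketched roughly like this. It assumes the secondary simply takes over once it has heard nothing for longer than a timeout; `StandbyNode` and the 3-second timeout are hypothetical, and timestamps are passed in by hand so the example is deterministic. It does not model any particular product.

```python
from dataclasses import dataclass


@dataclass
class StandbyNode:
    timeout: float               # seconds of silence tolerated (made-up value)
    last_heartbeat: float = 0.0  # time the last heartbeat was heard
    active: bool = False         # True once this node has taken over

    def on_heartbeat(self, now):
        """Record that the main node was heard from at time `now`."""
        self.last_heartbeat = now

    def check(self, now):
        """Take over if the main node has been silent longer than the timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active


standby = StandbyNode(timeout=3.0)
for t in (0.0, 1.0, 2.0):             # main node beats once a second, then dies
    standby.on_heartbeat(t)
still_passive = standby.check(4.0)    # 2.0 s of silence: keep waiting
took_over = standby.check(5.5)        # 3.5 s of silence: secondary takes over
```

A real setup also has to make sure the old main node is truly dead before the secondary touches the shared disks, which this sketch ignores.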
- Unshared Disk - This is how IBM does it. Each node has its own local disks. An app (like, say, DB2) has to be specially programmed to do this. Basically, the app is its own DLM. (Open)Mosix is like this. Sistina's Global File System also does this, and attempts to make it transparent.
Good for general computing applications.
- Beowulf - A master node parcels out small chunks of vectorized data to each of a multitude of compute-nodes. Suitable for many, but not all, numerical problems. Thus, the high-performance computing (HPC) community is ga-ga over Beowulf. Apps must be specifically written for this kind of parallelism.
Only good for science.
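The master/compute-node split described for Beowulf looks roughly like the sketch below. Real Beowulf applications normally pass messages between machines with something like MPI or PVM; here a thread pool on one machine stands in for the compute nodes, and `split`, `compute_node`, and `master` are names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Cut data into at most n_chunks contiguous pieces of similar size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def compute_node(chunk):
    """Work done independently on one chunk: here, a sum of squares."""
    return sum(x * x for x in chunk)


def master(data, n_nodes):
    """Parcel out the chunks, collect the partial results, and combine them."""
    chunks = split(data, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)


total = master(list(range(10)), n_nodes=4)
print(total)  # sum of squares of 0..9
```

This only works because each chunk can be processed without looking at the others, which is why apps have to be written for this kind of parallelism.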