
The CPD Opteron cluster - hardware information

Description of the cluster

The configuration of the Opteron cluster is shown in Fig. 1. It consists of 114 compute nodes connected by InfiniBand and gigabit networking. All machines run Red Hat Enterprise Linux 4 and Platform OCS (a vendor-supported, "Rocks"-like distribution). All compute nodes are accessed from the front end. A RAID array is directly attached to the front end and contains the home directories and two globally accessible scratch spaces of 1 TB each. In addition, a set of three I/O nodes serves a 1.5 TB parallel file system that is accessible from the compute nodes and the front end via the InfiniBand network.

Fig. 1: Diagram of the CPD Opteron cluster.

Technical details

The following table lists the details of each machine in the cluster:

Table 1: Front end and compute node details.
# | Architecture | Node name | Processor | Memory | Local disk
1 | Dell PowerEdge 2970 | pzt.wm.edu | Opteron 2220 / 2.8 GHz | 8 GB | 380 GB
98 | Dell PowerEdge SC1435 | c1-[1-28],c2-[29-57],c4-[58-86],c5-[87-102] | Opteron 2218 / 2.6 GHz | 8 GB | 120 GB
4 | Dell PowerEdge SC1435 | c5-[103-106] | Opteron 2222 SE / 3.0 GHz | 32 GB | 420 GB
8 | Dell PowerEdge SC1435 | c5-[107-114] | Opteron 2222 SE / 3.0 GHz | 16 GB | 420 GB
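
The bracketed ranges in the node name column of Table 1 are a compact way of naming many hosts at once. The following is a minimal Python sketch of how such a specification could be expanded into individual names, for example to build a host list for a script. It assumes that each range is inclusive and that a host name is formed by appending the number to the prefix (so c5-[103-106] would yield c5-103 through c5-106); the helper name expand_nodes is hypothetical and not part of any cluster software.

```python
import re


def expand_nodes(spec):
    """Expand a spec such as 'c5-[103-106]' into individual host names.

    Assumption: ranges are inclusive and the number is appended to the
    prefix as-is. Comma-separated groups are expanded one after another.
    """
    names = []
    for part in spec.split(","):
        part = part.strip()
        match = re.fullmatch(r"(.+)\[(\d+)-(\d+)\]", part)
        if match is None:
            names.append(part)  # plain host name without a range
            continue
        prefix = match.group(1)
        first, last = int(match.group(2)), int(match.group(3))
        names.extend(f"{prefix}{i}" for i in range(first, last + 1))
    return names


print(expand_nodes("c5-[103-106]"))
# ['c5-103', 'c5-104', 'c5-105', 'c5-106']
```

A name without brackets, such as pzt.wm.edu, is returned unchanged, so the same function can be applied to every row of the table.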

 

Table 2 shows the filesystems that users may access within the cluster. The /home, /scr1 and /scr2 filesystems are all mounted via NFS over the gigabit Ethernet network. Currently, a 10 GB quota is set on each user's /home directory. /scr1 and /scr2 are global scratch spaces intended for temporarily storing large files. Neither scratch filesystem is backed up, and purging is not done automatically. Users with large amounts of data (>100 GB) will be required to reduce their usage when the scratch space starts to get too full (more than roughly 70%).

 

Table 2: Filesystems on the Opteron cluster
Name | Size | Notes
/home/$USER | 1 TB | Users get a 10 GB quota
/scr1/$USER | 1 TB | Global NFS scratch 1
/scr2/$USER | 1 TB | Global NFS scratch 2
/pvfs | 1.5 TB | Global parallel scratch
/lscr | 120 GB / 420 GB | Local scratch on compute node
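
Because the global scratch spaces are not purged automatically, users may want to check their own usage against the thresholds stated above (more than 100 GB of data per user, scratch space more than roughly 70% full). The following Python sketch shows one way to do that using only the standard library. It is an illustration rather than a site-provided tool: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and the per-user directories are assumed to be /scr1/$USER and /scr2/$USER as listed in Table 2.

```python
import os
import shutil

USER_LIMIT_BYTES = 100 * 10**9  # ">100 GB" from the text; decimal GB is an assumption
FULL_FRACTION = 0.70            # "~70%" full, from the text


def dir_size(path):
    """Sum the sizes of all non-directory entries below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # entry vanished or is unreadable; skip it
    return total


def needs_cleanup(used_by_user, fs_used, fs_total):
    """True if the user holds more than 100 GB and the filesystem is over 70% full."""
    return used_by_user > USER_LIMIT_BYTES and fs_used / fs_total > FULL_FRACTION


if __name__ == "__main__":
    user = os.environ.get("USER", "")
    for fs in ("/scr1", "/scr2"):
        mine = os.path.join(fs, user)
        if not os.path.isdir(mine):
            continue  # not on the cluster, or no scratch directory yet
        usage = shutil.disk_usage(fs)
        size = dir_size(mine)
        flag = needs_cleanup(size, usage.used, usage.total)
        print(f"{mine}: {size / 1e9:.1f} GB used, "
              f"filesystem {usage.used / usage.total:.0%} full, "
              f"cleanup needed: {flag}")
```

Walking a large directory tree over NFS can be slow, so a check like this is better run occasionally than from inside a job script.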

 

 

©2008 The College of William and Mary