The College of William and Mary

The CPD Opteron cluster

Description of the cluster

The configuration of the Opteron cluster is shown in Fig. 1. It consists of 114 compute nodes connected by InfiniBand and gigabit Ethernet networking. All machines run Red Hat Enterprise Linux 4 and Platform OCS (a vendor-supported, "Rocks"-like distribution). All compute nodes are accessed from the front end. A RAID array is directly attached to the front end and contains the home directories and two globally accessible scratch spaces of 1 TB each. In addition, a set of three I/O nodes serves a 1.5 TB parallel file system (via PVFS2, which is not yet public) that is accessible from the compute nodes and the front end over the InfiniBand network.

Fig. 1: Diagram of the CPD Opteron cluster

Technical details

The following table lists the details of each machine in the cluster:

Table 1: Front end and compute node details.
#   Architecture           Node name                                    Processor                  Memory  Local disk
1   Dell PowerEdge 2970    (front end)                                  Opteron 2220 / 2.8 GHz     8 GB    380 GB
98  Dell PowerEdge SC1435  c1-[1-28],c2-[29-57],c4-[58-86],c5-[87-102]  Opteron 2218 / 2.6 GHz     8 GB    120 GB
4   Dell PowerEdge SC1435  c5-[103-106]                                 Opteron 2222 SE / 3.0 GHz  32 GB   420 GB
8   Dell PowerEdge SC1435  c5-[107-114]                                 Opteron 2222 SE / 3.0 GHz  16 GB   420 GB
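
The bracketed node names in Table 1 are a compact range notation. The Python sketch below shows one way to expand them into individual names, for example to build a host list. It assumes that each range is inclusive and that a host name is simply the prefix followed by the number (c1-1, c1-2, ...); the page does not state either convention. The function name expand_nodes is made up for this illustration.

```python
import re

# Node-name specifications copied from Table 1.
NODE_SPECS = [
    "c1-[1-28],c2-[29-57],c4-[58-86],c5-[87-102]",
    "c5-[103-106]",
    "c5-[107-114]",
]

# Matches one item such as "c5-[103-106]": a prefix, then [first-last].
_RANGE = re.compile(r"^(?P<prefix>[^\[\],]+)\[(?P<first>\d+)-(?P<last>\d+)\]$")


def expand_nodes(spec):
    """Expand e.g. 'c1-[1-2],c2-[3-4]' into ['c1-1', 'c1-2', 'c2-3', 'c2-4'].

    Assumption: ranges are inclusive and names are prefix + number.
    """
    names = []
    for part in spec.split(","):
        match = _RANGE.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised node range: {part!r}")
        first, last = int(match["first"]), int(match["last"])
        names.extend(f"{match['prefix']}{n}" for n in range(first, last + 1))
    return names


all_nodes = [name for spec in NODE_SPECS for name in expand_nodes(spec)]
print(len(all_nodes), all_nodes[0], all_nodes[-1])
```

Under these assumptions the three rows expand to 114 names in total, which matches the number of compute nodes given in the description. Note that the first row expands to 102 names, not the 98 listed in its # column.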

 

 

Table 2 shows the filesystems that users may access within the cluster. The /home, /scr1 and /scr2 filesystems are all mounted via NFS over the gigabit Ethernet network. A 10 GB quota is currently set on each user's /home directory. /scr1 and /scr2 are global scratch space, intended for temporary storage of large files. Neither scratch filesystem is backed up, and neither is purged automatically. Users with large amounts of data (> 100 GB) will be required to reduce their usage when the scratch space becomes too full (more than about 70%).

Table 2: Filesystems on the Opteron cluster
Name         Size             Notes
/home/$USER  1 TB             Users get a 10 GB quota
/scr1/$USER  1 TB             Global NFS scratch 1
/scr2/$USER  1 TB             Global NFS scratch 2
/pvfs        1.5 TB           Global parallel scratch
/lscr        120 GB / 420 GB  Local scratch on compute node
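
Because the scratch spaces are not purged automatically, it can be useful to check how full they are before writing large files. The following Python sketch uses the standard-library function shutil.disk_usage to compare /scr1 and /scr2 against the approximate 70% level mentioned above. The helper names and the exact way the threshold is applied are assumptions made for this illustration, not site policy, and the fraction computed here may differ slightly from the percentage reported by other tools.

```python
import shutil

# Approximate "too full" level quoted in the text (about 70%).
FULL_FRACTION = 0.70


def is_too_full(used_bytes, total_bytes, threshold=FULL_FRACTION):
    """Return True when used/total exceeds the threshold fraction."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes > threshold


def report(paths=("/scr1", "/scr2")):
    """Print the fill level of each path; skip paths that are not mounted."""
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError as exc:
            print(f"{path}: unavailable ({exc})")
            continue
        percent = 100.0 * usage.used / usage.total
        note = " (above threshold)" if is_too_full(usage.used, usage.total) else ""
        print(f"{path}: {percent:.1f}% used{note}")


if __name__ == "__main__":
    report()
```

Run on the front end or a compute node, this prints one line per scratch space; on a machine where the paths do not exist it reports them as unavailable instead of failing.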

 

 

Backup policy - as of 4/20/08:

The system and home directories are mirrored to another system every night. The global scratch directories /scr1 and /scr2 are not backed up.

©2008 The College of William and Mary