BioHPC @ CBSU
This is the first BioHPC installation, and it includes all available
applications. Some of the applications are freely available to
everyone; others, because of their very high computational demand,
are available to registered users only. Currently (as of January
2010) the CBSU BioHPC installation is linked to 326 local compute nodes
with 962 CPU cores grouped in 7 clusters. In addition, 64 nodes from the
Athena cluster at Microsoft headquarters are linked to this interface via
an HPC Profile/JSDL connection. Because of its size and load, this
installation has a separate web server, a separate file server (6 TB
storage), a separate FTP server (6 TB storage) and a separate Microsoft
SQL Server.

BioHPC is the main CBSU venue for delivering high-performance
computing to biological groups. It was developed in collaboration with
Microsoft, and thanks to Microsoft's sponsorship part of the CBSU
resources is open to the general scientific community. BioHPC users are
divided into two categories: guests and registered users. Registered
users come from the Cornell community and have several privileges over
guests, including a higher number of allowed jobs and full access to
restricted, computationally expensive programs. BioHPC has been very
well received in the scientific community, and its usage, by both
registered users and guests, increases substantially every year.

Since its initial deployment in the summer of 2003, BioHPC has
processed 127,906 jobs (as of 12/31/09), an average of 19,678 jobs a
year, with 44,608 jobs in 2009 alone. The jobs were submitted by 11,471
unique users from 83 countries. The majority (57% by CPU time used) came
from the USA, and 52% of the CPU time used from the USA came from New
York. Among these users are 257 unique Cornell users, 2,580 users from
.edu domains representing 426 unique .edu institutions, and 4,813 users
from .com domains (including 4,191 users with Yahoo, Gmail and Hotmail
e-mail addresses).

Registered Cornell users accounted for 28% of CPU usage while making
up only 2.2% of all users, which means that Cornell users consume
substantial computational resources per job and per user. Registered
users have access to the most resource-intensive sequence analysis
programs, which are not available to guests, although the majority of
their usage is still focused on population genetics. They can also run
more jobs per user than guests, which results in higher resource
utilization for this category.
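
The per-user comparison behind these figures can be made explicit with a
short calculation. The sketch below uses only the numbers quoted on this
page; it assumes that the 257 unique Cornell users are the registered
Cornell users behind the 28% CPU share, and the function name is
illustrative only.

```python
# Figures quoted in the text (as of 12/31/09).
total_users = 11_471
cornell_users = 257        # assumption: these are the registered Cornell users
cornell_cpu_share = 0.28   # fraction of all CPU time used by that group


def relative_cpu_per_user(cpu_share, user_share):
    """CPU time per user inside a group divided by CPU time per user outside it."""
    inside = cpu_share / user_share
    outside = (1 - cpu_share) / (1 - user_share)
    return inside / outside


cornell_user_share = cornell_users / total_users
ratio = relative_cpu_per_user(cornell_cpu_share, cornell_user_share)

print(f"Cornell share of users: {cornell_user_share:.1%}")
print(f"CPU time per Cornell user vs. per other user: {ratio:.1f}x")
```

Under that assumption, an average Cornell user consumed roughly 17 times
the CPU time of an average non-Cornell user. The per-job figure cannot be
derived the same way, because the job counts per category are not given
here.
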

Programs developed by Cornell researchers and hosted by CBSU via
BioHPC attracted 1,891 job submissions in the last quarter of 2009 and
a total of 7,403 job submissions over the whole of 2009.


