BioHPC
Computational Biology Application Suite for High Performance Computing
 

BioHPC: Applications

BioHPC provides users with popular bioinformatics tools covering various aspects of computational biology:
  • Data mining / sequence analysis
        BLAST, HMMER, InterProScan, RepeatFinder

  • Protein structure prediction and modeling
        LOOPP, Modeller

  • Population genetics
        IM, IMa, InStruct, MDIV, Migrate, MKPRF, Parentage, PLINK, SFS_CODE, Structurama, Structure, TESS

  • Phylogenetics
        MrBayes, ClustalW, T-COFFEE

  • Association analysis / statistics
        PLINK, R


The system is flexible and can be easily customized to include other software; the number of applications available in BioHPC is growing quickly. The interface to each application is standardized: users can choose the cluster and the number of nodes, or let the interface determine them based on the best load balance and node availability. The system is also scalable. The installation on our servers currently processes approximately 10,000 job submissions per year, many of them requiring long-running, massively parallel computations. It integrates different cluster technologies (MS CCS, Cornell ched, JSDL).

Both parallel and serial applications are available through the interface. LOOPP and MrBayes are examples of genuinely parallel applications. P-BLAST, P-HMMER and P-IPRSCAN are parallelized by distributing the input sequences across nodes (trivial parallelization). MPI is used for communication.
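As a rough illustration of parallelization through input sequence distribution, the sketch below splits a FASTA input into per-node batches of roughly equal total sequence length. It is not the BioHPC implementation: the function names `parse_fasta` and `distribute` are hypothetical, the greedy balancing rule is an assumption, and the MPI communication step is left out so that the example runs with the Python standard library alone.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) records."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def distribute(records: List[Record], n_nodes: int) -> List[List[Record]]:
    """Assign records to nodes, longest first, each to the least-loaded node.

    This greedy rule is an illustrative assumption, not BioHPC's scheduler.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    buckets: List[List[Record]] = [[] for _ in range(n_nodes)]
    loads = [0] * n_nodes
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        target = loads.index(min(loads))  # first node with the smallest load
        buckets[target].append(rec)
        loads[target] += len(rec[1])
    return buckets
```

Each batch could then be written to its own file and searched independently on one node, with the per-node results concatenated at the end. Because the sequences do not depend on one another, no communication is needed during the search itself, which is why this scheme is called trivial parallelization.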

The applications accessible via BioHPC are third-party programs governed by their respective licenses. Only the part developed at CBSU is covered by the BioHPC license. It is the sole responsibility of the administrators/owners of a particular BioHPC server to ensure that use of these applications complies with their respective licenses.

The BioHPC @ CBSU implementation currently processes around 15,000 jobs a year. The most popular applications by number of job submissions from 6/13/2003 to 11/7/2007 are:

Application   Jobs     Category                          Typical resources
MDIV          16,033   population genetics               1 node for a few hours to two weeks (average: 2-5 days)
LOOPP         14,659   protein structure prediction      5-20 nodes for 3-10 hours
MrBayes        4,064   phylogenetics                     8-20 nodes for a few hours to two weeks (average: 5 days)
P-BLAST        3,152   sequence analysis / data mining   10-100 nodes for a few days to a week (average: 3 days)
IM             2,791   population genetics               1 node for 2-5 days
Structure      1,822   population genetics               1 node for 2-5 days
MKPRF          1,710   population genetics               5-10 nodes for 1-3 days
Migrate        1,124   population genetics               1 node for 2-3 days

All applications: 51,215 jobs (average 15,364 per year)


