BioHPC
Computational Biology Application Suite for High Performance Computing
 

BioHPC: Applications

BioHPC provides users with popular bioinformatics tools covering various aspects of computational biology:
  • Data mining / sequence analysis
    BLAST, HMMER, InterProScan, RepeatFinder, GIMSAN, SLIM

  • Next Generation Sequencing tools
    FastX

  • Protein structure prediction and modeling
    LOOPP, Modeller

  • Population genetics
    BayesScan, BEAST, BEST, Clumpp, Colony, IM, IMa, IMa2, InStruct, LAMARC, MCMCcoal, MDIV, Migrate, MKPRF, MSVAR, OmegaMap, Parentage, SFS_CODE, Structurama, Structure, TESS

  • Phylogenetics
    MrBayes, ClustalW, Stretcher, T-COFFEE

  • Association analysis / statistics
    PLINK, R

  • MSR Biomedical
    CreateEpitome, Epipred, FalseDiscoveryRate, HlaAssignment, HlaCompletion, PhyloD

The system is flexible and can easily be customized to include other software; the number of applications available in BioHPC is growing quickly. The interface to each application is standardized: users can choose the cluster and the number of nodes, or let the interface determine them based on load balance and node availability. The system is also scalable: the installation on our servers currently (April 2010) processes approximately 50,000 job submissions per year, many of them requiring long, massively parallel computations. BioHPC integrates different cluster technologies (MS CCS, MS HPC Server 2008, JSDL). Both parallel and serial applications are available through the interface. LOOPP and MrBayes are examples of genuinely parallel applications, whereas P-BLAST, P-HMMER and P-IPRSCAN are parallelized by distributing the input sequences across processes (trivial parallelization). MPI is used for communication.
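The input sequence distribution described above can be illustrated with a minimal Python sketch. It is not the BioHPC implementation: the function names parse_fasta and shard_for_rank are hypothetical, the round-robin assignment is an assumption, and the rank and size arguments only stand in for the values an MPI runtime would supply to each process.

```python
from typing import List, Tuple

# A FASTA record as (header without '>', concatenated sequence).
Record = Tuple[str, str]


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into a list of (header, sequence) records."""
    records: List[Record] = []
    header = None
    parts: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(parts)))
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def shard_for_rank(records: List[Record], rank: int, size: int) -> List[Record]:
    """Return the records assigned to one process (hypothetical helper).

    Round-robin assignment: process `rank` out of `size` takes every
    `size`-th record, starting at index `rank`. Each process can then run
    the serial tool on its own shard with no further coordination.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return records[rank::size]


if __name__ == "__main__":
    example = ">seq1\nACGT\nAC\n>seq2\nGG\n>seq3\nTTT\n"
    recs = parse_fasta(example)
    for r in range(2):
        print(r, [header for header, _ in shard_for_rank(recs, r, 2)])
```

Because each query sequence is searched independently, the per-process results only need to be concatenated at the end, which is why this kind of parallelization is called trivial. Round-robin is the simplest choice; a scheme that balances total sequence length per process may suit inputs with very uneven sequence sizes.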

The applications accessible via BioHPC are third-party programs governed by their respective licenses. Only the part developed at CBSU is covered by the BioHPC license. It is the sole responsibility of the administrators/owners of a particular BioHPC server to ensure that these applications are used in agreement with their respective licenses.

The BioHPC @ CBSU implementation is a good example of typical application usage. Below are the numbers of jobs run for a few popular programs between 6/13/2003 and 3/18/2010, with the typical resources per job. For up-to-date information about BioHPC @ CBSU, please see the BioHPC @ CBSU page.

Application        Jobs      Category                          Typical resources per job
MDIV               20,965    population genetics               1 core for a few hours to two weeks (average 2-5 days)
LOOPP              20,385    protein structure prediction      5-20 cores for 3-10 hours
MrBayes            18,799    phylogenetics                     8-20 cores for a few hours to two weeks (average 5 days)
P-BLAST             4,504    sequence analysis / data mining   10-100 cores for a few days to a week (average 3 days)
IM / IMa / IMa2    22,567    population genetics               1 core for 2-5 days
Structure          17,968    population genetics
All applications  140,141    (average 20,090 per year)
All applications   49,738    (since 3/18/2009)


