GPU and Coprocessor Technologies over .NET Bio

Feb 4, 2014 at 10:52 PM
Dear all,

I'm looking for new approaches for the .NET Bio project using NVIDIA GPU and Intel Xeon coprocessor technologies.

I'm interested in developing a new module to perform BLAST operations, specifically the BLASTn and BLASTp programs.

I know there are C# CUDA implementations in the GPU world and, as far as I know, coprocessor technologies use the C and Fortran programming languages only. However, I want to explore whether it is possible to use Visual C++ with Intel coprocessor technologies for bioinformatics applications.

FYI, Nigel posted something related 5 months ago. Wiki Link: [discussion:456450].

Lastly, I'm open to any help or advice from the community.

Thanks,

Leo
Developer
Feb 5, 2014 at 4:09 PM
Edited Feb 5, 2014 at 4:10 PM
Hi Leo,

Hope you're doing great. So, I would actually advise against attempting to implement a BLASTn or BLASTp in C# with GPUs. This is based on my personal experience implementing BLAST in C#. I did this early on in graduate school to identify regions in bacterial genomes, and though it worked great, it was clearly a bit slower than what NCBI provides and in retrospect definitely not worth the effort. The NCBI implementations of BLAST have been worked on and optimized pretty extensively, and they have had teams of people work on various GPU/coprocessor implementations.

One advantage of C# is that it makes interfacing with C code pretty dang simple though. I think for most bang for buck/time, it might be worth just writing a wrapper around the NCBI APIs that takes .NET Bio object types and runs BLAST searches with them. I did this for the genome aligner BWA-MEM earlier (https://github.com/evolvedmicrobe/BWA-Sharp/blob/master/Bio.BWA/Bio.BWA/BWA.cs). It might be worth having an "unmanaged" extensions project that contains code to interface .NET Bio with C code, and could include things like BWA-MEM, BLAST, etc. However, I think it would probably be better to keep it out of the main library. As soon as unmanaged code enters the mix, we will go from the <5 MB xcopy install to the nightmare that is dealing with gcc and trying to port code around different platforms, makefiles, etc. etc.
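To make the wrapper idea a bit more concrete, here is a rough sketch of what the C side of such an "unmanaged" extension could look like. Everything in it is hypothetical: the name `shim_count_seed_hits`, the `SHIM_API` macro, and the toy exact-word counting are placeholders I made up to show the shape of a flat, P/Invoke-friendly entry point. It is not NCBI code and not what BWA-Sharp does. The point is only that the boundary takes plain C strings and returns a primitive, so the C# side has nothing complicated to marshal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical export macro so the symbol is visible from a shared library. */
#if defined(_WIN32)
#  define SHIM_API __declspec(dllexport)
#else
#  define SHIM_API __attribute__((visibility("default")))
#endif

/*
 * Hypothetical shim entry point (a stand-in for a call into a real aligner).
 * Counts the positions in `subject` where the first `word_size` characters
 * of `query` occur exactly.
 * Returns -1 on invalid input (NULL pointer, non-positive word size, or a
 * query shorter than the word size), otherwise the number of hits.
 */
SHIM_API int shim_count_seed_hits(const char *query, const char *subject,
                                  int word_size)
{
    if (query == NULL || subject == NULL || word_size <= 0) {
        return -1;
    }

    size_t w = (size_t)word_size;
    size_t query_len = strlen(query);
    size_t subject_len = strlen(subject);

    if (query_len < w) {
        return -1;  /* not enough query to form a single word */
    }

    int hits = 0;
    /* Slide a window of length w over the subject and compare it to the word. */
    for (size_t i = 0; i + w <= subject_len; ++i) {
        if (memcmp(query, subject + i, w) == 0) {
            ++hits;
        }
    }
    return hits;
}
```

On the C# side, assuming this were built into a shared library called something like `blastshim` (again, a made-up name), the binding would be a one-line `[DllImport("blastshim")] static extern int shim_count_seed_hits(string query, string subject, int wordSize);`, with the .NET Bio sequence converted to a string before the call.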

Just my $0.02,
Nigel
Feb 13, 2014 at 3:33 PM
Hi Nigel,

Thanks for your answer. You're right about the NCBI implementations; however, it would be interesting to look at this in a research project context. Your suggestion about the wrapper is very interesting, and keeping the "unmanaged" extensions project outside the main library would make it easier to keep control and make the project scalable.

If you need more information:
GPU-BLAST: using graphics processors to accelerate protein sequence alignment http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018811/
Building BLAST for Coprocessor Accelerators Using Macah http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.152.6257&rep=rep1&type=pdf

Best,

Leo