Future Goals

Nov 22, 2011 at 5:14 PM

  What do you have planned for this project that has yet to be implemented?

Nov 22, 2011 at 10:16 PM

Hi Eric,

As an open source project driven by the community, this one goes in whatever direction its users decide. If you are a programmer interested in making a contribution, for example, you would be welcome to choose your own area of interest, or one of the current contributors or potential users could provide suggestions if you prefer.

The v1.0 release is very recent, and during its final stages we had to freeze the code to build installers and make other last-minute changes. As a result, we are now managing the integration of several code contributions that have been waiting for the past few months.

The first of these extends base functionality in support of the FaST-LMM algorithm - you can find that in a related CodePlex project, or linked as one of the associated projects on our homepage. FaST-LMM does genome-wide association studies (GWAS): it attempts to identify genomic features in a set of subjects that correlate more strongly with some phenotype, such as susceptibility to heart disease. The novelty in FaST-LMM is that it scales to at least hundreds of thousands of subjects, whereas existing GWAS approaches handle 10-15 thousand subjects at best.

The next check-in will be a genome browser utility that reads GenBank files and lets the user click on features and see them visualized in the standard linear manner of any browser - maybe not a breakthrough, but a useful utility. I know there are several more, but I don't have a list in front of me - you'll see them in the next month or two, once they have been reviewed and committed.
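
For anyone unfamiliar with GWAS, the basic idea mentioned above (finding genomic features that correlate with a phenotype) can be sketched in a few lines of Python. To be clear, this is only a toy illustration and not FaST-LMM: FaST-LMM is built on linear mixed models, while the sketch below just ranks features by plain Pearson correlation and ignores confounders such as population structure. The function names (`pearson`, `rank_features`), the feature names, and all the data are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no association signal
    return cov / sqrt(var_x * var_y)


def rank_features(genotypes, phenotype):
    """Rank features by the absolute correlation with the phenotype.

    genotypes: dict mapping a (hypothetical) feature name to a list of
               allele counts (0, 1 or 2), one entry per subject.
    phenotype: list with one value per subject (here 0 = unaffected, 1 = affected).
    """
    scores = [(name, pearson(column, phenotype))
              for name, column in genotypes.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Toy data: six subjects, three features (all values invented).
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [1, 1, 0, 2, 1, 1],
    "snp_c": [2, 2, 2, 2, 2, 2],
}
phenotype = [0, 0, 1, 0, 1, 1]

ranked = rank_features(genotypes, phenotype)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

A scan like this touches every feature for every subject, which gives a feel for why scaling to hundreds of thousands of subjects is the hard part.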

Looking further ahead, you will see in one of the other forum threads a lot of interest in a generic browser/visualization framework. Work has only just started, and we are working out the best way to share specs and collaborate. I also know that we have storage limitations of about 2 billion sequence objects, and a similar 2 billion limit on the size of any one sequence - we want to work on that, but it will require some pretty fundamental engineering, so don't expect it right away. We also know that being able to consume more datasets would be nice, and native support for 454 is a target, as well as more RNA data formats. I also heard of interest in biological network analysis at one of our recent training courses - hopefully something will come of that. Finally, one group is interested in the challenges of de novo and comparative assembly of plant genomes, including resolution of polyploidy; another is interested in remote execution of Linux apps; and a group in Colombia will be using .NET Bio in their biodiversity work, contributing apps and fixes along the way.

That about covers it for now - but, as you can probably tell, these directions reflect the interests of the current participants rather than any Grand Master Plan.

Comments/criticisms and suggestions - and of course participants - always welcome.

Nov 29, 2011 at 3:11 PM

Eric, one other somewhat related thing: we will be holding our next training course on .NET Bio in Redmond on Dec. 5th and 6th. The course runs from 9:00 AM to 5:00 PM each day. It will cover some basic .NET programming and then delve into .NET Bio-specific topics. I think it would be valuable for you to see this and, of course, to interact with other participants. These courses can be great opportunities to pair someone who has a programming background with someone who has a more biological bent. The course is free. You can register here: http://dotnetbio.eventbrite.com/. Hope to see you there.