.NET Bio 2.0 Released on NuGet!

Coordinator
Jun 27, 2014 at 11:19 PM
Edited Jun 29, 2014 at 3:13 PM
.NET Bio 2.0 is on NuGet in released form and all the source code is checked in. It's still in a fork, but we'll move it over once we are certain all the fixes have been moved into this version.

An installer will be pushed up with all the source code and the command line utilities, likely next week - we are holding off a little longer to get a couple more things into the core library.

There are a lot of changes in this release, including several that break the way you used the framework in 1.1.

Cross-Platform
.NET Bio is now cross-platform and supports .NET Framework 4.5, Mono 4.5, Windows 8.0 and 8.1 (WinRT), Windows Phone 8.1 (WinPRT), Xamarin.Android and Xamarin.iOS. This is done through three components:
  1. There is a Bio.dll which is a Portable Class Library and holds almost all the real code.
  2. There is a platform helper DLL for each platform which provides the abstractions needed to make it work - this includes things like creating regular expressions, creating temporary files, and locating extension assemblies.
  3. There is a platform-specific DLL which supplies extensions to get to the file system (abstractions which are different on each platform). In some cases, this library may also have functionality which is specific to the platform; for example the desktop/Mono version has classes to parse out command-line arguments - something that is not necessary for mobile versions which do not support spawning processes with arguments.
Because of this change, we are no longer going to ship the binaries in the installer for you to include in your project. You should instead use NuGet to get the proper set of assemblies for any project you create. This is easily done in Visual Studio or Xamarin Studio by right-clicking on the references and selecting Manage NuGet Packages. Then, search for .NET Bio Core and add it to your project. This will download the proper DLL set for your platform and you will be all set.

Parsers and Formatters
This is probably the biggest change in terms of usage. We have unified the parsers and formatters and made them more consistent. For example, there is now an IParser base interface that all parsers implement. Most of them then implement IParser<T>, which supports the Parse and ParseOne mechanics.

These parsers in the core DLL all deal with low-level System.IO.Stream input and output - the extension methods provided in the platform-specific DLLs then provide the normal Open/Parse/Close methods and also provide methods to pass a filename into Parse so it opens, parses and closes the file for you.

Warning: one thing that you might run into with this approach is a stream being disposed by its using block before the parsed data is read. For example:
IEnumerable<ISequence> sequences;
using (var stream = File.OpenRead("someFile.fasta"))
{
    sequences = new FastAParser().Parse(stream);
}
will compile, but will throw an exception at runtime when you enumerate sequences, because the Stream has already been disposed by the time the data is read. The proper code for this style is:
IList<ISequence> sequences;
using (var stream = File.OpenRead("someFile.fasta"))
{
    sequences = new FastAParser().Parse(stream).ToList();
}
Notice the call to ToList which forces evaluation of the data - this is so the stream isn't closed before we read the data. Another approach is to not close off the using block until you've processed the data. Alternatively, use the extension methods:
var parser = new FastAParser();
using (parser.Open("test.fasta"))
{
    var sequences = parser.Parse();
    // ... work with sequences here ...
} // stream is closed here
All of the text-based parsers now support a FormatString, also done through an extension method. Overall, the new architecture allows us to share most of the file management across all the parsers and formatters - so the real classes just have to deal with the format.
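
The disposed-stream pitfall described in the warning above can be reproduced without .NET Bio at all. The following is only a minimal sketch: LazyStreamDemo, ParseHeaders and OpenSample are hypothetical names made up for illustration and are not part of the .NET Bio API. ParseHeaders is an ordinary C# iterator standing in for a lazily evaluated parser, and the sample data is an in-memory stream rather than a real file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class LazyStreamDemo
{
    // Hypothetical stand-in for a parser (NOT the .NET Bio API): lazily yields
    // the header text of each FASTA-style record. Because this is an iterator,
    // nothing is read from the stream until the result is enumerated.
    public static IEnumerable<string> ParseHeaders(Stream stream)
    {
        var reader = new StreamReader(stream);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith(">", StringComparison.Ordinal))
            {
                yield return line.Substring(1);
            }
        }
    }

    // In-memory sample so the sketch needs no file on disk.
    public static Stream OpenSample()
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(">seq1\nACGT\n>seq2\nTTGA\n"));
    }

    // Broken pattern: the using block ends before enumeration starts.
    public static bool LazyEnumerationFails()
    {
        IEnumerable<string> headers;
        using (var stream = OpenSample())
        {
            headers = ParseHeaders(stream); // deferred: no read happens here
        }

        try
        {
            headers.ToList(); // first read happens here, on a disposed stream
            return false;
        }
        catch (Exception)
        {
            return true; // the exact exception type depends on the reader and stream
        }
    }

    // Working pattern: ToList() forces the read while the stream is still open.
    public static IList<string> EagerHeaders()
    {
        using (var stream = OpenSample())
        {
            return ParseHeaders(stream).ToList();
        }
    }

    public static void Main()
    {
        Console.WriteLine("Lazy enumeration failed: " + LazyEnumerationFails());
        Console.WriteLine("Eager headers: " + string.Join(", ", EagerHeaders()));
    }
}
```

The only difference between the two patterns is where enumeration happens relative to the end of the using block, which is why moving the ToList call inside the block (or keeping the block open while you work with the data) is enough to fix it.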

Web Services
The web services framework has moved to Bio.WebServices and is a separate NuGet package (search for Bio.Web). It will automatically pull in Bio.Core. Currently, it's a bit light and only has support for BLAST, but we'll be adding a few more services to it shortly.

Algorithms
PadeNA and PAMSAM are in a separate NuGet package as well (search for Bio.Algorithms). This will include both of them in a single download. They are also Portable Libraries and work across all platforms, although PAMSAM is intentionally set to use a single core when on a 32-bit platform for optimization.

Unit Tests
Many of the unit tests have been reorganized so they are primarily in the core testing set. Prior to this, we had three separate libraries. This is an ongoing effort as there is a lot of cleanup to be done here.
Developer
Jul 6, 2014 at 5:45 PM
Any idea when this will become the "main" branch?
Coordinator
Jul 7, 2014 at 2:50 PM
Hi Nigel,

I'll push it over this week; I was making sure we got everything. I think I'll create a new branch off the existing 1.1 code and then pull the fork over onto the main trunk so we have both. To be honest, part of my hesitation is to make sure I do it the right way ;)
Coordinator
Jul 21, 2014 at 8:31 PM
Nigel: it's the main branch now. There are a few missing pieces I still need to add in, but everything important is there. I moved the original content to the branch "version 1" so it's still there.