
.NET Bio Cookbook

May 9, 2014 at 2:02 PM
One of our goals is to start a "Cookbook" of .NET Bio code snippets and recipes. So, I need two things from the community:

1) What would be the most useful snippets to start with?
2) Where should we put it?

For #2, we could put it here, maybe in the documentation area - or we could host it on another site and just link to here. What do you guys think?

mark
May 9, 2014 at 4:38 PM
I think the codeplex documentation site could work well, but it makes it difficult to add pictures and formatting for code in the rich way one can with pure HTML. Perhaps I am mistaken about this though? Hav

I'll see if I can find some examples of approaches taken by other projects.

I had previously used the sphinx framework to document a project I worked on: (http://sphinx-doc.org/). I thought it wound up looking reasonably nice (http://www.people.fas.harvard.edu/~rojasechenique/claritydocs/index.html). Perhaps we could use that or something similar? Does anyone know of some content generators/managers for .NET Bio that would be useful?

Cheers,
Nigel
May 10, 2014 at 12:16 AM
The documentation page has wiki markup and HTML editing capabilities. I don't know how convenient the web-based editing experience is for large documents, but one could presumably create HTML content offline and paste it in. I suppose images would be attachments, referenced via their attachment URL.
May 11, 2014 at 5:22 AM
Edited May 11, 2014 at 5:23 AM
My understanding is that our current documentation consists of:
  • Sandcastle .chm file (and though we don't currently use this feature, I think sandcastle can also generate MSDN style html documentation).
  • Codeplex html/wiki
  • Word documents
I think going forward it would be nice to integrate these things as much as possible. One way would be to use a framework that takes some markup text (LaTeX, reStructuredText, Markdown) and outputs it in HTML or PDF/Word/XPS format, so we could have a centralized documentation generator with multiple possible outputs. Sphinx, Doxygen, etc. do this. However, they have a major drawback in that they introduce other dependencies (Sphinx runs on Python and its world of versioning issues, and Doxygen is C++, eck). For this reason, I don't think committing to any of these frameworks would be a great idea.

I don't really have a strong opinion or expertise in this area, but I think I might be in favor of keeping one language and location for the library (both code and documentation). The CLR is well designed, general purpose and simple, so hopefully our documentation can reflect that. Given the amount of material we have in Word, and the power of Sandcastle, one option might be to first figure out how to programmatically push documentation changes to CodePlex. This way we could have Sandcastle generate the HTML, and then have this be the "API reference" pages at CodePlex. Going forward, we could have wiki/Markdown files that are also pushed to CodePlex, with our cookbook/getting-started guide. I think as long as we can figure out how to programmatically push to CodePlex, and keep the documentation in one folder in the source tree, we should be in a good position.

As I said, I definitely don't have expertise in this area so am happy to hear any other proposals.

All the best,
N
May 11, 2014 at 9:53 AM
My limited experience with richer content here suggests that the exercise is a bit fraught. That said, most of the wiki markups are pretty basic in their support for rich content. To me there are two distinct styles here. The cookbook is much more of a document in which style and formatting and images matter. Is it a case of continuing with sandcastle for the main code docs and having a separate cms for longer stuff? The codeplex editor does let you drop in formatted content, but are there limits on this? The question is whether we need to maintain an html version of all of this stuff. Or will linking to a pdf or word doc be ok?

I'm agnostic on the tooling. Further thoughts?
May 11, 2014 at 9:33 PM
Just some quick cookbook tasks while I am thinking of them.
  • Take a tab separated value file with chrom/position headers, and filter out positions that don't overlap with a BED file.
  • Take a genome and use mummer to create a mask of repetitive regions.
  • Take a GenBank file and a list of positions/mutations and annotate them (e.g. overlapping genes, nonsynonymous, synonymous, etc.).
I wrote an issue on the CodePlex site asking about pushing updates without cutting and pasting (https://codeplex.codeplex.com/discussions/544957). I am also agnostic on tooling.
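
As a starting point for the first task above (filtering a tab-separated file against a BED file), here is a minimal, language-neutral sketch in Python. It does not use the .NET Bio API, and the function names are hypothetical. It assumes the TSV has header columns named `chrom` and `position`, that positions are 1-based, and that the BED file uses the standard 0-based, half-open intervals. A cookbook entry would do the same steps with the library's own parsers.

```python
import bisect
import csv
from collections import defaultdict


def load_bed_intervals(bed_lines):
    """Parse BED lines into merged, sorted (start, end) lists per chromosome.

    BED coordinates are 0-based and half-open: [start, end).
    """
    raw = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments and UCSC header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        raw[fields[0]].append((int(fields[1]), int(fields[2])))

    merged = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                # Overlapping or touching: extend the previous interval.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def position_in_bed(merged, chrom, pos_1based):
    """Return True if a 1-based position falls inside any BED interval."""
    intervals = merged.get(chrom)
    if not intervals:
        return False
    pos0 = pos_1based - 1  # assumption: TSV positions are 1-based
    # Index of the last interval whose start is <= pos0.
    i = bisect.bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def filter_tsv(tsv_lines, merged, chrom_col="chrom", pos_col="position"):
    """Keep only the TSV rows whose position overlaps the BED intervals."""
    reader = csv.DictReader(tsv_lines, delimiter="\t")
    return [
        row for row in reader
        if position_in_bed(merged, row[chrom_col], int(row[pos_col]))
    ]


# Small in-memory example; real use would pass open file handles instead.
bed = ["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"]
tsv = [
    "chrom\tposition\tvalue",
    "chr1\t101\ta",
    "chr1\t100\tb",
    "chr1\t300\tc",
    "chr1\t301\td",
    "chr2\t50\te",
    "chr3\t10\tf",
]
kept = filter_tsv(tsv, load_bed_intervals(bed))
print([row["value"] for row in kept])
```

Merging the intervals first means each lookup is a single binary search per position, which matters once the BED file has many overlapping records.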
May 11, 2014 at 10:05 PM
Nuget also has a decent looking documentation scheme: https://github.com/NuGet/NuGetDocs
May 22, 2014 at 2:37 PM
I've created the starting Cookbook page [here | .NET Bio Cookbook]. If you'd like to add to it, let me know and we'll arrange for editing permissions. Some of you can already edit it - feel free to start adding your snippets.

I'll be adding basic scenarios for changes we've done to .NET Bio 2.

Thanks!
mark
May 22, 2014 at 11:52 PM
We haven't really decided on a list, but I really don't think one is needed. Let's just add and review as things are posted. Use the first couple to fix the style evils as they emerge.
May 27, 2014 at 9:34 PM
I'd like to see topics on use of the new parser infrastructure, as described in your recent posts, Mark.
May 29, 2014 at 6:50 AM
Just discussing this with Thomas and others. Getting Thomas to produce some parser cookbook items as a way of rounding out his project seems a very logical thing to do, including his own contribution in respect of VCF. I also notice that you mentioned on the other thread a namespace in the NuGet package involving BioHelpers or BioHelper. I did a recent NuGet install as I had to re-image the machine, but I can't find it anywhere. Can you point us to the relevant assembly?
Jun 19, 2014 at 2:42 PM
Hi Jim --

The .NET Bio 2 distribution is all done via NuGet; we'll have .vsix templates eventually, but since we are now cross-platform (woohoo!) the DLLs you need to add are platform-specific (they change per target), so it makes sense to let NuGet automate that, particularly since Xamarin Studio and MonoDevelop now fully support NuGet on Mac and Linux.

You want the .NET Bio 2.PCL package; it will install everything appropriate for your target.
Jun 19, 2014 at 2:59 PM
Brilliant - we spent some time this afternoon discussing cookbook entries and will post some fresh ones tomorrow and also add a list for wider discussion. One important question for the cookbook: do we want to (later, perhaps) include useful stuff that relies on additional dlls? Am thinking pattern search especially.
Jun 19, 2014 at 3:48 PM
Jim, my 2 cents on the question of covering non-.NET Bio .dlls is that 'it depends' - I can see a good case for cookbook entries along the lines of 'how do I use .NET Bio to...' rather than how other tools can be used on their own, but since .NET Bio is intended as an integration layer, I see nothing wrong with entries that cover how it can be used to leverage third-party tools to access additional functionality.

After all, no .NET Bio cookbook would be complete without an example of how to access a web or cloud service programmatically, so I can see a good case for showing how to access a third-party .dll.

Simon