Tree Changes

Developer
Sep 20, 2013 at 9:30 PM
Hi All,

Am thinking of making a couple changes to Bio.Phylogenetics that I wanted to run by for votes yay or nay.

Issue: Phylogenetic trees come in a variety of shapes and sizes, sometimes with meta-data information.

1 - Change the class for nodes in a tree (Node.cs) so that it contains a <string,string> dictionary for metadata.

2 - Extract interfaces from all of the tree related classes, to create INode, ITree and IEdge, that can be passed around.

2 is because the Bio.Selectome dataspace I am thinking of contributing has a lot of information specific to those trees, but would be nice to have available in a statically typed class that did not inherit from the base type.

Sound okay?
-N
Coordinator
Sep 21, 2013 at 2:21 AM
Conceptually the changes sound fine. The metadata map sounds largely to be a no-brainer. However, the changes to the hierarchy, which make perfect sense. are potentially breaking. Is there anyone out there using the library who has an issue with this, I wonder?

Maybe this needs more of a news announcement? Let's see what reaction we get in the first instance. I would support it in principle though
Developer
Sep 28, 2013 at 1:53 AM
Edited Sep 28, 2013 at 2:08 AM
Okay, well no negative feedback so far from anyone (and I think the Bio.Phylogenetics space might be an under-utilized framework at present, so adding functionality might be ok and beneficial at this stage). I am going to go ahead and make three changes in stages to this namespace, to make rolling them back easier if this is a problem.
  • Implement an ITree interface and meta-data dictionary in the files
  • Implement extend newick format parsing in the newick parser
  • Change functions that accept Tree to accept ITree
I think only the third case could be breaking, because Tree implements ITree, it should not require a cast. There may however be a performance implication to using an interface instead of a direct class reference, but I think that should be very very small.
Developer
Sep 28, 2013 at 5:59 PM
Edited Sep 29, 2013 at 12:49 AM
Jim was right, totally a breaking change or a lot of unneccessary code to make it work smoothly, and I finally understand why contra/covariance are terms worth discussing.

I implemented the newick extended format parser and the meta data along with a unit test on both. I think just converting between tree types would be easiest for now rather than having all extend a base type or interface, will wait until there are several more tree types before worrying more about it.
Coordinator
Sep 30, 2013 at 2:18 AM
Pity. Definitely a logical approach, but as you say, not for now, necessarily.