Global alignment error - NeedlemanWunsch

Apr 6, 2013 at 3:17 PM
Edited Apr 7, 2013 at 2:44 PM
I implemented a system that aligns every pair of sequences in a multi-FASTA file. For some comparisons the error below is raised; for the others, the score is returned normally.
  var alinha = SequenceAligners.NeedlemanWunsch;
  // Align the pair of sequences.
  var resultado = alinha.AlignSimple(new[] { seq1, seq2 });
  // Read the score from the metadata of the first aligned sequence.
  int pontos = Convert.ToInt32(resultado[0].AlignedSequences[0].Metadata["Score"]);
An ArgumentOutOfRangeException is shown for the variable resultado, so I assume that no alignment is being generated for it.
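For illustration only, here is a minimal Python sketch (not .NET Bio code) of an all-against-all comparison over a multi-FASTA that records the failing pairs instead of stopping at the first exception. The names parse_fasta, score_all_pairs and the scorer callback are hypothetical; the scorer stands in for whatever aligner is actually called.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse multi-FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def score_all_pairs(records, scorer):
    """Score every unordered pair; collect failures instead of aborting."""
    scores, failures = {}, {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(records, 2):
        try:
            scores[(name_a, name_b)] = scorer(seq_a, seq_b)
        except Exception as exc:  # keep going and remember which pair failed
            failures[(name_a, name_b)] = repr(exc)
    return scores, failures
```

With this shape, the pairs that fail can be listed and reported together with their sequences, which is the information needed to reproduce the problem.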

Here are two sequences that produce the error when aligned.
gi|105630106|gb|DW005385.1|DW005385 TR0038 Trichophyton rubrum
ACGATCAACTATATCTCACGCAAGGACACCCAAATCAAGCTCAACGGCCAACGAATTGAGCTTGGTGAGATTGAATACCATTGCAAAGTGA
gi|291058796|gb|EL786974.1|EL786974 EST037205 Trichophyton rubrum
GGCCTGCATCTTCACCCTGTTTACATCTCGACTCCCTATAGATTGTGTATACATCGTTTTACCTTGTGCTTGCTATTGCTATATACTCTCTTTTGTTCTTACGTATTATCCCCACTTTACCTAACTAGTTACTCCTGCAGGCAGCGGCCAGCAAAACCCGCCGAACGCAATAATGACTCCTCAAGACCGCCTGAACCAGGTCTCCGCGCACCTCAACTACCCCCAAGGCATGCTTGCTGGTCAAGTAGCCATTATAACCGGGTCGGGACAGGGCATAGGGGCAGAGGCTGCTAGGCTCTTCGCCAACGAGGGCGCCAAAGTTGTCGTCGCTGACCTTGACAGCTCCAAGGCCGAGGCCGTTGCCAAAGCCATCAATGATGCATCTCCTGGCAGAGCAATTGCCGTTGCAGGAGACGTACAGGATGGAGCGTACCTGAAGAGACTCGTACAAAGGGCCGCTGAGTTCGGAAACGGAAAGATCCATATCATCGTGAACAATGCTGGATTCACCT
Thanks for the help
Matheus Franco
Coordinator
Apr 7, 2013 at 2:19 PM
Hi,

For some reason the existing algorithm isn't supplying any results. The new version of the S/W algorithm coming in 1.1 very shortly is supplying results - here's a sample run using the original gap cost (-8) with the diagonal similarity matrix (2,-2) (which is not optimal IMO by the way):

NeedlemanWunschAligner [Needleman-Wunsch]
SEQUENCE_1 (91): ACGATCAACTATATCTCACGCAAGGACACCCAAATCAAGCTCAACGGCCAACGAATTGAGCTTGGTGAGATTGAATACCATTGCAAAGTGA
SEQUENCE_2 (512): GGCCTGCATCTTCACCCTGTTTACATCTCGACTCCCTATAGATTGTGTATACATCGTTTTACCTTGTGCTTGCTATTGCTATATACTCTCTTTTGTTCTTACGTATTATCCCCACTTTACCTAACTAGTTACTCCTGCAGGCAGCGGCCAGCAAAACCCGCCGAACGCAATAATGACTCCTCAAGACCGCCTGAACCAGGTCTCCGCGCACCTCAACTACCCCCAAGGCATGCTTGCTGGTCAAGTAGCCATTATAACCGGGTCGGGACAGGGCATAGGGGCAGAGGCTGCTAGGCTCTTCGCCAACGAGGGCGCCAAAGTTGTCGTCGCTGACCTTGACAGCTCCAAGGCCGAGGCCGTTGCCAAAGCCATCAATGATGCATCTCCTGGCAGAGCAATTGCCGTTGCAGGAGACGTACAGGATGGAGCGTACCTGAAGAGACTCGTACAAAGGGCCGCTGAGTTCGGAAACGGAAAGATCCATATCATCGTGAACAATGCTGGATTCACCT

Gap Cost: -8, SimilarityMatrix: Diagonal: match value 2, non-match value -2

Alignment #1
--------------
Score: -3186, FirstOffset:0, SecondOffset: 0
Metadata:
Consensus: GGCCTGCATCTTCACCCTGTTTACATCTCGACTCCCTATAGATTGTGTATACATCGTTTTACCT... +[448]
EndOffsets: { 90, 511 }
FirstOffset: 0
IdenticalCount: 91
Insertions: { 421, 0 }
Score: -3186
SecondOffset: 0
SimilarityCount: 91
StartOffsets: { 0, 0 }
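
As a sanity check on the metadata above, the reported score is consistent with 91 matched columns at +2 each and 421 gap columns at -8 each. A small Python sketch of that arithmetic, assuming a linear gap cost (every gap column is charged the full gap cost) and no mismatched columns; the function name expected_score is made up for this illustration:

```python
MATCH_SCORE = 2   # diagonal similarity matrix: match value
GAP_COST = -8     # gap cost used in the run

identical_columns = 91   # IdenticalCount
gap_columns = 421        # Insertions: { 421, 0 }


def expected_score(matches, gaps, match_score=MATCH_SCORE, gap_cost=GAP_COST):
    """Score of an alignment made only of matched columns and gap columns."""
    return matches * match_score + gaps * gap_cost


# 91 * 2 + 421 * (-8) = 182 - 3368
print(expected_score(identical_columns, gap_columns))  # -3186
```

The two counts also add up to the alignment length: 91 + 421 = 512 columns.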

Alignments and markup [ |=match :=similar .=mismatch ]:
  1 -------A-C--------G---A--TC---A------A------------ 8
           | |        |   |  ||   |      |
  1 GGCCTGCATCTTCACCCTGTTTACATCTCGACTCCCTATAGATTGTGTAT 50
  9 -C-T------A--T----------AT--CT-----C-------------- 17
     | |      |  |          ||  ||     |
 51 ACATCGTTTTACCTTGTGCTTGCTATTGCTATATACTCTCTTTTGTTCTT 100
 18 ACG------C---A----A--------G--------G-A--CA-C--CCA 32
    |||      |   |    |        |        | |  || |  |||
101 ACGTATTATCCCCACTTTACCTAACTAGTTACTCCTGCAGGCAGCGGCCA 150
 33 --AA----------------T-----C-----AAG-C----T---C-A-- 43
      ||                |     |     ||| |    |   | |
151 GCAAAACCCGCCGAACGCAATAATGACTCCTCAAGACCGCCTGAACCAGG 200
 44 ---------AC---------------GGC---C---------AA----C- 52
             ||               |||   |         ||    |
201 TCTCCGCGCACCTCAACTACCCCCAAGGCATGCTTGCTGGTCAAGTAGCC 250
 53 ---------G-------A-A-----T------------TG--AG-CT-T- 63
             |       | |     |            ||  || || |
251 ATTATAACCGGGTCGGGACAGGGCATAGGGGCAGAGGCTGCTAGGCTCTT 300
 64 -G-----G------------T-G---------A----GA----T------ 71
     |     |            | |         |    ||    |
301 CGCCAACGAGGGCGCCAAAGTTGTCGTCGCTGACCTTGACAGCTCCAAGG 350
 72 ---------T-G--AA-----T-A------C--C-------A------TT 82
             | |  ||     | |      |  |       |      ||
351 CCGAGGCCGTTGCCAAAGCCATCAATGATGCATCTCCTGGCAGAGCAATT 400
 83 GC------A--A-A-GT---G-A--------------------------- 91
    ||      |  | | ||   | |
401 GCCGTTGCAGGAGACGTACAGGATGGAGCGTACCTGAAGAGACTCGTACA 450
 92 -------------------------------------------------- 91

451 AAGGGCCGCTGAGTTCGGAAACGGAAAGATCCATATCATCGTGAACAATG 500
 92 ------------ 91

501 CTGGATTCACCT 512


Consensus:
GGCCTGCATCTTCACCCTGTTTACATCTCGACTCCCTATAGATTGTGTATACATCGTTTTACCTTGTGCTTGCTATTGCTATATACTCTCTTTTGTTCTTACGTATTATCCCCACTTTACCTAACTAGTTACTCCTGCAGGCAGCGGCCAGCAAAACCCGCCGAACGCAATAATGACTCCTCAAGACCGCCTGAACCAGGTCTCCGCGCACCTCAACTACCCCCAAGGCATGCTTGCTGGTCAAGTAGCCATTATAACCGGGTCGGGACAGGGCATAGGGGCAGAGGCTGCTAGGCTCTTCGCCAACGAGGGCGCCAAAGTTGTCGTCGCTGACCTTGACAGCTCCAAGGCCGAGGCCGTTGCCAAAGCCATCAATGATGCATCTCCTGGCAGAGCAATTGCCGTTGCAGGAGACGTACAGGATGGAGCGTACCTGAAGAGACTCGTACAAAGGGCCGCTGAGTTCGGAAACGGAAAGATCCATATCATCGTGAACAATGCTGGATTCACCT
Took 00:00:00.0192488
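
For readers who want to check scores like the one above by hand, here is a minimal Python sketch of the textbook Needleman-Wunsch score recurrence with the same scoring scheme (match 2, non-match -2, linear gap cost -8). It is not the .NET Bio implementation; the function name nw_score and its parameters are made up for this illustration, and it returns only the score, not the aligned sequences.

```python
def nw_score(a, b, match=2, mismatch=-2, gap=-8):
    """Global alignment score of strings a and b with a linear gap cost."""
    # prev[j] is the best score for aligning the processed prefix of a
    # with b[:j]; the first row aligns the empty prefix against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(nw_score("ACGT", "ACGT"))    # 8   (four matches)
print(nw_score("ACGT", "AGT"))     # -2  (three matches, one gap)
print(nw_score("ACG", "TTACGTT"))  # -26 (three matches, four gaps)
```

The last case mirrors the run above on a small scale: when every base of the shorter sequence can be matched inside the longer one, the score is the number of matches times 2 plus the length difference times -8. Only two rows of the table are kept, so memory grows with the length of the second sequence rather than with the product of both lengths.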

Hopefully this is in line with your expectations. We are hoping to have this up and available very shortly - sorry you've had troubles!

mark


Mark Smith

Apr 7, 2013 at 2:43 PM
Edited Apr 7, 2013 at 2:44 PM
Hello Mark,

Thanks for helping me. Hopefully the new version of the algorithm will be available soon to solve this problem.

best regards,
Matheus Franco
Coordinator
Apr 8, 2013 at 6:59 PM
Hi mefmachine,

We are hoping to get the next release (which will be 1.1) out in a month or so.

Simon
Coordinator
Apr 8, 2013 at 7:04 PM
Hi Matheus,

I'm going to be in Sao Paulo for a conference from May 13-15. I'd be happy to meet if you are available, and learn more about your .NET Bio plans.

Just let me know,

Simon
Apr 8, 2013 at 7:54 PM
Hi Simon,

If possible, I will go to Sao Paulo to talk with you. Which conference will you be attending?

Best regards,
Coordinator
Apr 8, 2013 at 8:57 PM
I'll be at the 2013 LATAM eScience Workshop, run jointly by FAPESP and Microsoft Research:
http://www.fapesp.br/eventos/latam2013

I don't think the agenda has been released yet, but there will be a demo of some bioinformatics tools on the Microsoft platform from 17:00 - 18:00 on Tuesday May 14th, if you are interested - perhaps we could talk after that?

Simon