What is a simple command-line tool for doing Needleman-Wunsch pairwise alignment?
Score: 5
I have two DNA strings:
GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
and
AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
I want a tool that allows me to do something like this on the command line:
$ aligner GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
and receive an ASCII visualization of a pairwise alignment. Something like this would work:
GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC---------
---------AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
I would prefer a tool that is packaged on bioconda, ideally one that also handles protein and RNA sequences.
sequence-alignment
asked Nov 26 at 13:58 by winni2k
edited Nov 26 at 16:38 by terdon♦
Is this a toy example? If so, maybe update it to another one. These sequences are low complexity and the actual correct alignment output is unclear to me
– Chris_Rands
Nov 26 at 16:07
The sequences are not important. Edits are welcome.
– winni2k
Nov 26 at 16:11
To get the alignment in your example, you have to use zero (or very small) gap cost at the ends of sequences. That is not the standard NW.
– user172818♦
Nov 27 at 3:43
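The point about end gaps can be made concrete with a small pure-Python sketch. It is not taken from any of the tools discussed here: the function names, the linear gap cost, and the scores (match 1, mismatch -1, gap -2) are assumptions chosen only for illustration. With `free_end_gaps=False` it behaves like standard Needleman-Wunsch; with `free_end_gaps=True`, gaps before or after either sequence cost nothing, which is the kind of setting needed to obtain an overlap-style alignment like the one in the question.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2, free_end_gaps=False):
    """Align a and b end to end with a linear gap cost.

    Returns (score, aligned_a, aligned_b). With free_end_gaps=True, gaps
    before or after either sequence cost nothing (an overlap alignment).
    """
    n, m = len(a), len(b)
    edge = 0 if free_end_gaps else gap
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * edge
    for j in range(1, m + 1):
        score[0][j] = j * edge
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Pick the cell where the traceback starts.
    i, j = n, m
    if free_end_gaps:
        # Trailing gaps are free, so the alignment may stop anywhere
        # in the last row or the last column.
        cells = [(n, k) for k in range(m + 1)] + [(k, m) for k in range(n + 1)]
        i, j = max(cells, key=lambda cell: score[cell[0]][cell[1]])
    best = score[i][j]

    # Unaligned tails become trailing gap columns.
    tail_a = a[i:] + "-" * (m - j)
    tail_b = "-" * (n - i) + b[j:]

    # Walk back through the matrix, collecting columns in reverse order.
    rev_a, rev_b = [], []
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            rev_a.append(a[i - 1])
            rev_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            rev_a.append(a[i - 1])
            rev_b.append("-")
            i -= 1
        else:
            rev_a.append("-")
            rev_b.append(b[j - 1])
            j -= 1
    # Whatever is left of one sequence pairs with leading gaps.
    while i > 0:
        rev_a.append(a[i - 1])
        rev_b.append("-")
        i -= 1
    while j > 0:
        rev_a.append("-")
        rev_b.append(b[j - 1])
        j -= 1

    aligned_a = "".join(reversed(rev_a)) + tail_a
    aligned_b = "".join(reversed(rev_b)) + tail_b
    return best, aligned_a, aligned_b


def render(aligned_a, aligned_b):
    """Return a three-line view: '|' for a match, '.' for a mismatch, ' ' for a gap."""
    marks = "".join(
        " " if "-" in (x, y) else "|" if x == y else "."
        for x, y in zip(aligned_a, aligned_b)
    )
    return "\n".join((aligned_a, marks, aligned_b))


if __name__ == "__main__":
    score, top, bottom = needleman_wunsch("AAAGGG", "GGGTTT", free_end_gaps=True)
    print(score)
    print(render(top, bottom))
```

For the toy pair AAAGGG and GGGTTT, free end gaps give AAAGGG--- over ---GGGTTT with score 3, whereas the standard setting leaves both strings ungapped with score -6. Real tools typically use affine gap costs and substitution matrices, so their results on the sequences in the question may differ from what this sketch produces.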
4 Answers
Score: 10 (accepted)
You are looking for the needle program from the EMBOSS suite, which is available in bioconda:
http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/needle.html
To read sequences from the command line, you need to specify the format as asis. To print the output to the screen, you'll need -stdout, and to accept the default alignment parameters (gap penalty 10, extend penalty 0.5) without being prompted, you'll need -auto. Thus your query above would be:
$ needle -asequence asis:GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC -bsequence asis:AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG -auto -stdout
########################################
# Program: needle
# Rundate: Tue 27 Nov 2018 10:48:50
# Commandline: needle
# -asequence asis:GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
# -bsequence asis:AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
# -auto
# -stdout
# Align_format: srspair
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: asis
# 2: asis
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 56
# Identity: 37/56 (66.1%)
# Similarity: 37/56 (66.1%)
# Gaps: 18/56 (32.1%)
# Score: 181.0
#
#
#=======================================
asis 1 GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC--- 47
|||||||||||||||||||||||||||||||||||||.
asis 1 ---------AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAG 41
asis 47 ------ 47
asis 42 AGGAGG 47
#---------------------------------------
#---------------------------------------
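For scripting, the same call can be wrapped in Python. This is a minimal sketch that assumes EMBOSS is installed and needle is on the PATH; it uses only the options shown above, and the helper names (`needle_command`, `run_needle`, `alignment_rows`) are made up for this example.

```python
import shutil
import subprocess


def needle_command(seq_a, seq_b):
    """Build the needle argument list for two literal sequences."""
    return [
        "needle",
        "-asequence", f"asis:{seq_a}",
        "-bsequence", f"asis:{seq_b}",
        "-auto",
        "-stdout",
    ]


def run_needle(seq_a, seq_b):
    """Run needle and return its report as text (requires EMBOSS)."""
    if shutil.which("needle") is None:
        raise RuntimeError("needle not found; install EMBOSS, e.g. from bioconda")
    result = subprocess.run(
        needle_command(seq_a, seq_b),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def alignment_rows(report):
    """Keep only the alignment rows, dropping '#' lines and blank lines."""
    return [
        line
        for line in report.splitlines()
        if line.strip() and not line.startswith("#")
    ]
```

Calling `alignment_rows(run_needle(seq_a, seq_b))` would then leave just the sequence and match lines of the report, which is close to the plain ASCII view asked for in the question.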
answered Nov 26 at 20:13 by Ian Sudbery, edited Nov 27 at 10:51
Score: 9
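Install biopython with conda and use the following script:
#!/usr/bin/env python
# Usage: pass the two sequences as command-line arguments.
import sys
from Bio import Align
aligner = Align.PairwiseAligner()
# "local" requests a local (Smith-Waterman-style) alignment;
# use "global", the default, for Needleman-Wunsch.
aligner.mode = "local"
alignments = aligner.align(sys.argv[1], sys.argv[2])
# Print only the first of the equally scoring alignments.
print(alignments[0])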
The output is then:
.GGA-GGAGGGAG--AAGGAGGGAGGGAAGA-GGAGGGAG--AAGGAGGGAGGC
.|-|-||||||||--|||-|||-|||||-||-||||||||--|||-|||-|||.
AG-AAGGAGGGAGGGAAG-AGG-AGGGA-GAAGGAGGGAGGGAAG-AGG-AGG.
There are a LOT of alignments with that same score. You can switch between local and global alignment by changing aligner.mode; "global" corresponds to Needleman-Wunsch.
answered Nov 26 at 15:40 by Devon Ryan♦
yes, this should be pretty fast too since the alignment work is done in C (rather than python)
– Chris_Rands
Nov 26 at 15:53
Score: 3
I'd just use something like t_coffee and write a little bash function to run it on the strings you give:
aligner(){
    # Write each argument as a FASTA record named by its position (0, 1, ...)
    tmp=$(mktemp)
    args=("$@")
    for ((i=0; i<$#; i++)); do
        printf '>%s\n%s\n' "$i" "${args[i]}" >> "$tmp"
    done
    # Align, then drop the CLUSTAL header and blank lines
    t_coffee "$tmp" 2>/dev/null | /bin/grep -Pv '^CLUSTAL|^\s*$'
    rm "$tmp"
}
If you add that to your shell's initialization file (e.g. ~/.bashrc if using bash on Linux, ~/.profile if using bash on macOS), you can run aligner with as many sequences as you want to align:
$ aligner 'GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC' 'AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG'
0 GG--AGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
1 nAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGG-GAAGAGGAGG
********* ******* ********* * ** * *
$ aligner 'GAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC' 'GGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG' GAGGGAGGGAAGAGGAGGGAGAAGGATCCGAGGAAGAGGAGG AGAAGGAGGGAGGGAAGAGGCCCGAGATATATAAGGATCCGAGGAA
0 GA-GG----GAGAAGGAGGGAGGGAAGAGGAGGGAGAAGG--AGGGAGGC
1 GGAGG----GAGGGAAGAGGAGGGAGAAGGAGGGAG-GGA--AGAGGAGG
2 GA-GG----GAGGGAAGAGGAGGGAGAAGGATCCGA-GGA--AGAGGAGG
3 AGAAGGAGGGAGGGAAGAGGCCCGAGATATAT--AA-GGATCCGAG-GAA
* *** ** ** * * * *
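The same idea, writing the arguments to a temporary FASTA file and handing it to t_coffee, can be sketched in Python. This is an illustration under assumptions rather than a drop-in replacement: the function names are invented here, t_coffee is assumed to be on the PATH, and the output filtering simply mirrors the grep in the shell function.

```python
import os
import subprocess
import tempfile


def to_fasta(sequences):
    """Return FASTA text with records named 0, 1, 2, ... like the shell loop."""
    return "".join(
        f">{index}\n{sequence}\n" for index, sequence in enumerate(sequences)
    )


def write_fasta(sequences):
    """Write the records to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        handle.write(to_fasta(sequences))
        return handle.name


def align_with_t_coffee(sequences):
    """Run t_coffee on the sequences and return the non-header output lines."""
    path = write_fasta(sequences)
    try:
        result = subprocess.run(
            ["t_coffee", path],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
            check=True,
        )
    finally:
        # Remove the temporary input file whether or not t_coffee succeeded.
        os.remove(path)
    return [
        line
        for line in result.stdout.splitlines()
        if line.strip() and not line.startswith("CLUSTAL")
    ]
```

Unlike the shell version, this removes the temporary file even when t_coffee fails, at the cost of a few more lines.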
answered Nov 26 at 16:37 by terdon♦
Score: 2
Here's another option: https://github.com/noporpoise/seq-align. I've used it in the past as it is small and has no dependencies (just run make and you're set).
answered Nov 29 at 4:32 by Kirill G
This tool actually prompted my question! Do you know if anyone has packaged it?
– winni2k
Nov 29 at 8:23
Ah, no... I didn't notice the last part of your question.
– Kirill G
Nov 29 at 21:27
Could you edit and include an example of how the OP could use this tool to do what they ask for? That will help future readers of this question.
– terdon♦
Dec 4 at 13:25