designer_dna.oligos
Common utility functions to work with and analyze oligonucleotide sequences.
Functions
complement | Complement a nucleotide sequence.
complement_py | Return the complement of a nucleotide sequence.
manacher | Find the longest palindromic substring within a nucleotide sequence.
nrepeats | Calculate the maximum observed repeats of composite pattern size n characters.
nrepeats_py | Calculate the longest substring of n repeating characters.
palindrome | Find the longest palindromic substring within a nucleotide sequence.
palindrome_py | Find the longest substring palindrome within a nucleotide sequence.
reverse | Reverse a nucleotide sequence.
reverse_py | Reverse a nucleotide sequence.
reverse_complement | Reverse complement a nucleotide sequence.
reverse_complement_py | Reverse complement a nucleotide sequence.
stretch | Return the maximum length of a single letter (nucleotide) repeat in a string.
stretch_py | Calculate the maximum stretch of a single character in a string.
- designer_dna.oligos.complement(sequence, dna=True)
Complement a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – Sequence is DNA, else RNA.
- Returns:
(str) Complement of a nucleotide sequence string.
Examples
complement("ATGC", True) == "TACG"
complement("ATGC", False) == "UACG"
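The two examples above suggest a base-by-base lookup. A minimal pure-Python sketch of that behavior follows. The name complement_sketch and both lookup tables are hypothetical and are not part of designer_dna, and the handling of T in RNA mode (complemented to A) is an assumption inferred from the second example.

```python
# Hypothetical lookup tables inferred from the documented examples,
# not taken from the designer_dna source.
DNA_TABLE = str.maketrans("ATGC", "TACG")
RNA_TABLE = str.maketrans("AUTGC", "UAACG")  # assumes T is read like U in RNA mode


def complement_sketch(sequence: str, dna: bool = True) -> str:
    """Complement each base; characters outside the table pass through unchanged."""
    return sequence.translate(DNA_TABLE if dna else RNA_TABLE)


print(complement_sketch("ATGC", True))   # TACG
print(complement_sketch("ATGC", False))  # UACG
```

Characters that are missing from the table, such as the degenerate base N, are left untouched by str.translate. Whether the library treats them the same way is not stated here.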
- designer_dna.oligos.complement_py(sequence: str, dna: bool = True) -> str
Return the complement of a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – If True, treat the sequence as DNA; otherwise treat it as RNA.
- Returns:
Complement of input sequence.
- Return type:
(str)
Examples
complement_py("ATGC", True) == "TACG"
complement_py("ATGC", False) == "UACG"
- designer_dna.oligos.manacher(sequence, dna=True)
Find the longest palindromic substring within a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – Sequence is DNA, else RNA.
- Returns:
(str) Longest palindromic substring within a sequence.
Notes
This is a Cython/C++ implementation of Manacher’s algorithm, which runs in O(n) time.
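To illustrate the idea, here is a pure-Python sketch of Manacher's algorithm adapted to nucleotide palindromes. It assumes, based on the palindrome examples on this page (where "ATAT" is returned whole), that a palindrome means a substring equal to its own reverse complement. The function name manacher_sketch and the PAIRS table are hypothetical, the sketch covers DNA only, and it is not the library's Cython/C++ code.

```python
# Hypothetical pairing table; "N" is assumed to pair with itself.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N", "#": "#"}


def manacher_sketch(sequence: str) -> str:
    """Longest substring equal to its own reverse complement (DNA only)."""
    # Interleave a separator so even- and odd-length palindromes share one loop.
    padded = "#" + "#".join(sequence) + "#"
    size = len(padded)
    radius = [0] * size
    center = right = 0  # centre and right edge of the rightmost palindrome found
    for i in range(size):
        if PAIRS.get(padded[i]) != padded[i]:
            continue  # a base that is not its own complement cannot be a centre
        if i < right:
            # Reuse the radius of the mirrored position inside the known palindrome.
            radius[i] = min(right - i, radius[2 * center - i])
        lo, hi = i - radius[i] - 1, i + radius[i] + 1
        while lo >= 0 and hi < size and PAIRS.get(padded[lo]) == padded[hi]:
            radius[i] += 1
            lo -= 1
            hi += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    # max() returns the first maximum, which gives the leftmost longest palindrome.
    best = max(range(size), key=radius.__getitem__)
    start = (best - radius[best]) // 2
    return sequence[start:start + radius[best]]


print(manacher_sketch("GATATG"))  # ATAT
print(manacher_sketch("ANT"))     # ANT
```

The mirror shortcut still holds for complement matching because complementing a base twice returns the original base. In this sketch an input with no palindrome yields an empty string; the library's behavior in that case is not documented here.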
- designer_dna.oligos.nrepeats(sequence, n)
Calculate the maximum number of observed tandem repeats of a composite pattern n characters long.
- Parameters:
sequence (str) – Nucleotide sequence string.
n (int) – Size of k-mers (composite pattern) to observe.
- Returns:
(int) The largest number of consecutive (tandem) repeats of any pattern of length n. As the examples show, the first occurrence of the pattern is not counted.
- Raises:
ZeroDivisionError – If n is 0.
Examples
nrepeats("AAAA", 1) == 3 # True
nrepeats("AAAA", 2) == 1 # True
nrepeats("ACAACAACA", 3) == 2 # True
- designer_dna.oligos.nrepeats_py(sequence: str, n: int) -> int
Calculate the longest run of a repeating n-character pattern.
- Parameters:
sequence (str) – Nucleotide sequence string.
n (int) – Size of the k-mer (repeating pattern) to observe.
- Returns:
(int) The largest number of consecutive repeats of any n-character pattern. As the examples show, the first occurrence of the pattern is not counted.
- Raises:
ValueError – If n < 1.
Examples
nrepeats_py("AAAA", 1) == 3 # True
nrepeats_py("AAAA", 2) == 1 # True
nrepeats_py("ACAACAACA", 3) == 2 # True
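The counting convention in these examples can be reproduced with a single pass over the sequence. The sketch below is one possible approach, not the library's implementation. The name nrepeats_sketch is hypothetical, and the ValueError mirrors the documented behavior of nrepeats_py.

```python
def nrepeats_sketch(sequence: str, n: int) -> int:
    """Count tandem repeats of any n-character pattern beyond its first copy."""
    if n < 1:
        raise ValueError("n must be at least 1")
    best = run = 0
    for i in range(len(sequence) - n):
        # A match n positions ahead extends the current periodic stretch.
        run = run + 1 if sequence[i] == sequence[i + n] else 0
        best = max(best, run)
    # Every n consecutive matches add one more full copy of the pattern.
    return best // n


print(nrepeats_sketch("AAAA", 1))       # 3
print(nrepeats_sketch("AAAA", 2))       # 1
print(nrepeats_sketch("ACAACAACA", 3))  # 2
```

A stretch of r consecutive matches means r + n characters share a period of n, which is r // n + 1 complete copies of the pattern. Dropping the first copy leaves r // n.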
- designer_dna.oligos.palindrome(sequence, dna=True)
Find the longest palindromic substring within a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – Sequence is DNA, else RNA.
- Returns:
(str) Longest palindromic substring within the sequence.
Examples
palindrome("ATAT") == "ATAT"
palindrome("GATATG") == "ATAT"
palindrome("ANT") == "ANT" # Handles degenerate bases
Notes
If a sequence contains two or more palindromic substrings of equal size, the leftmost palindrome is prioritized.
- designer_dna.oligos.palindrome_py(sequence: str, dna: bool = True) -> str
Find the longest substring palindrome within a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – If True, treat the sequence as DNA; otherwise treat it as RNA.
- Returns:
Longest palindromic substring within the sequence.
- Return type:
(str)
Examples
palindrome_py("ATAT") == "ATAT"
palindrome_py("GATATG") == "ATAT"
Notes
Algorithmic time complexity is O(N).
If a sequence contains two or more palindromic substrings of equal size, the leftmost palindrome is prioritized.
- designer_dna.oligos.reverse(sequence)
Reverse a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
- Returns:
(str) Reversed sequence string.
Examples
reverse("ATATAT") == "TATATA"
reverse("AATATA") == "ATATAA"
- designer_dna.oligos.reverse_complement(sequence, dna=True)
Reverse complement a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – Sequence is DNA, else RNA.
- Returns:
(str) Reverse complement of sequence string.
Examples
reverse_complement("ATGC", True) == "GCAT"
reverse_complement("ATGC", False) == "GCAU"
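Both examples are consistent with complementing each base and then reversing the result. The sketch below shows that composition in pure Python. The name reverse_complement_sketch and the lookup tables are hypothetical, and the RNA table assumes, from the second example, that T is complemented to A.

```python
# Hypothetical lookup tables inferred from the documented examples.
DNA_TABLE = str.maketrans("ATGC", "TACG")
RNA_TABLE = str.maketrans("AUTGC", "UAACG")  # assumes T is read like U in RNA mode


def reverse_complement_sketch(sequence: str, dna: bool = True) -> str:
    """Complement every base, then reverse the string with a slice."""
    return sequence.translate(DNA_TABLE if dna else RNA_TABLE)[::-1]


print(reverse_complement_sketch("ATGC", True))   # GCAT
print(reverse_complement_sketch("ATGC", False))  # GCAU
```

In DNA mode, applying the sketch twice returns the original sequence, which is a quick sanity check for any reverse-complement routine.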
- designer_dna.oligos.reverse_complement_py(sequence: str, dna: bool = True) -> str
Reverse complement a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
dna (bool) – Sequence is DNA, else RNA.
- Returns:
(str) Reverse complement of sequence string.
Examples
reverse_complement_py("ATGC", True) == "GCAT"
reverse_complement_py("ATGC", False) == "GCAU"
- designer_dna.oligos.reverse_py(sequence: str) -> str
Reverse a nucleotide sequence.
- Parameters:
sequence (str) – Nucleotide sequence string.
- Returns:
(str) Reversed sequence string.
Examples
reverse_py("ATATAT") == "TATATA"
reverse_py("AATATA") == "ATATAA"
- designer_dna.oligos.stretch(sequence)
Return the maximum length of a single letter (nucleotide) repeat in a string.
- Parameters:
sequence (str) – Nucleotide sequence string.
- Returns:
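(int) Number of consecutive repeats in the longest run of a single letter. As the examples show, the first letter of a run is not counted, so a sequence with no adjacent identical letters returns 0.
Examples
stretch("ATATAT") == 0 # True
stretch("AATATA") == 1 # True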
- designer_dna.oligos.stretch_py(sequence: str) -> int
Calculate the maximum stretch of a single character in a string.
- Parameters:
sequence (str) – Nucleotide sequence string.
- Returns:
Maximum number of consecutive repeats of a single character observed within the sequence; the first character of a run is not counted.
- Return type:
(int)
Examples
stretch_py("AAAA") == 3
stretch_py("AATT") == 1
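The convention in these examples (a run of four identical letters gives 3) can be sketched with itertools.groupby from the standard library. The name stretch_sketch is hypothetical, and returning 0 for an empty string is an assumption, since the behavior for empty input is not documented here.

```python
from itertools import groupby


def stretch_sketch(sequence: str) -> int:
    """Length of the longest single-letter run, minus its first letter."""
    # groupby yields one group per run of identical adjacent characters.
    longest = max((len(list(group)) for _, group in groupby(sequence)), default=1)
    return longest - 1


print(stretch_sketch("AAAA"))    # 3
print(stretch_sketch("AATT"))    # 1
print(stretch_sketch("ATATAT"))  # 0
```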