Basic Information

Gene Symbol
-
Assembly
GCA_018153235.1
Location
JAECXV010000198.1:8785744-8798747[-]
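
The Location field packs contig, coordinates and strand into one string. A minimal sketch of how it could be split apart is shown here; the `Location` type and `parse_location` helper are hypothetical names, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for the parts of a Location string."""
    contig: str
    start: int
    end: int
    strand: str


# Expected layout: <contig>:<start>-<end>[<strand>]
_LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a Location string into contig, start, end and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["contig"], int(match["start"]), int(match["end"]), match["strand"]
    )


loc = parse_location("JAECXV010000198.1:8785744-8798747[-]")
# Assumption: coordinates are 1-based and inclusive.
span = loc.end - loc.start + 1
print(loc, span)
```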

Transcription Factor Domain

TF Family
THAP
Domain
THAP domain
PFAM
PF05485
TF Group
Zinc-Coordinating Group
Description
The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [1].
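
The consensus in the description can be written as a regular expression for a quick first-pass scan of a protein sequence. This is only a rough sketch: the pattern encodes the C2CH spacing and nothing else, so it will miss divergent domains and can match unrelated cysteine-rich stretches. It is not how the hits in this record were called (those come from an HMM search). The helper name `find_c2ch` and the test sequence are made up for illustration.

```python
import re

# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His
C2CH_RE = re.compile(r"C.{2,4}C.{35,50}C.{2}H")


def find_c2ch(protein: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in C2CH_RE.finditer(protein)]


# Synthetic example, not taken from the record: a 40-residue spacer fits
# the 35-50 window, a 10-residue spacer does not.
good = "MM" + "C" + "AAA" + "C" + "A" * 40 + "C" + "AA" + "H"
bad = "MM" + "C" + "AAA" + "C" + "A" * 10 + "C" + "AA" + "H"

print(find_c2ch(good))
print(find_c2ch(bad))
```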
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 29 2.8 6.5e+03 -2.2 3.3 38 62 333 360 324 376 0.59
2 29 2.4e-15 5.4e-12 46.1 4.1 1 86 569 641 569 642 0.85
3 29 8.8e-15 2e-11 44.3 5.0 1 87 669 738 669 738 0.83
4 29 8.1e-16 1.8e-12 47.6 0.2 1 87 760 832 760 832 0.85
5 29 4.4e-16 1e-12 48.5 5.8 1 87 931 1001 931 1001 0.81
6 29 5.1e-15 1.2e-11 45.1 3.5 1 86 1025 1096 1025 1097 0.81
7 29 8.3e-13 1.9e-09 38.0 1.2 1 87 1132 1200 1132 1200 0.81
8 29 5.8e-11 1.3e-07 32.1 1.7 1 86 1240 1309 1240 1310 0.75
9 29 2e-17 4.5e-14 52.8 0.3 1 86 1337 1406 1337 1407 0.82
10 29 7.3e-13 1.7e-09 38.2 1.5 1 85 1428 1496 1428 1498 0.79
11 29 1.2e-14 2.8e-11 43.9 1.0 1 86 1525 1596 1525 1597 0.85
12 29 4e-14 9.1e-11 42.2 2.0 1 86 1677 1746 1677 1747 0.82
13 29 5.1e-13 1.2e-09 38.7 0.1 1 86 1770 1838 1770 1839 0.82
14 29 1.8e-13 4.1e-10 40.1 1.2 1 87 1966 2035 1966 2035 0.81
15 29 5.6e-08 0.00013 22.5 0.0 1 86 2128 2193 2128 2194 0.77
16 29 3.2e-06 0.0074 16.9 0.0 1 58 2209 2256 2209 2272 0.81
17 29 1.9e-12 4.3e-09 36.8 0.2 1 87 2286 2358 2286 2358 0.80
18 29 3.2e-14 7.2e-11 42.5 0.5 1 87 2418 2488 2418 2488 0.81
19 29 2.1e-10 4.9e-07 30.3 0.0 1 86 2520 2591 2520 2592 0.80
20 29 4.9e-13 1.1e-09 38.7 0.0 1 87 2602 2674 2602 2674 0.79
21 29 1.2e-15 2.8e-12 47.0 0.2 1 85 2691 2761 2691 2763 0.82
22 29 3.2e-06 0.0072 16.9 0.1 1 58 2795 2842 2795 2871 0.84
23 29 1.4e-12 3.1e-09 37.3 0.1 1 87 2880 2952 2880 2952 0.82
24 29 3.4e-15 7.8e-12 45.6 0.2 1 86 3058 3130 3058 3131 0.80
25 29 1.1e-12 2.4e-09 37.6 3.2 1 86 3192 3262 3192 3263 0.82
26 29 1.1e-13 2.6e-10 40.7 3.0 1 86 3333 3403 3333 3404 0.85
27 29 7.9e-12 1.8e-08 34.9 0.1 1 87 3487 3557 3487 3557 0.84
28 29 3.7e-10 8.3e-07 29.5 1.8 1 58 3585 3633 3585 3641 0.85
29 29 2.1e-09 4.7e-06 27.1 1.5 18 86 3651 3708 3640 3709 0.73
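
Each row of the table above has thirteen whitespace-separated values, so it can be parsed and filtered with a few lines of code. The sketch below copies three rows from the table (domains 1, 2 and 16) and keeps those whose independent E-value is under a cutoff. The `DomainHit` type, the field names and the 0.01 cutoff are assumptions chosen for illustration, not values used by the database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One row of the domain table; field names are illustrative."""
    index: int
    of: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line: str) -> DomainHit:
    """Convert one whitespace-separated table row into a DomainHit."""
    p = line.split()
    if len(p) != 13:
        raise ValueError(f"expected 13 columns, got {len(p)}: {line!r}")
    return DomainHit(
        int(p[0]), int(p[1]),
        float(p[2]), float(p[3]), float(p[4]), float(p[5]),
        *(int(x) for x in p[6:12]),
        float(p[12]),
    )


# Three rows copied from the table above.
ROWS = [
    "1 29 2.8 6.5e+03 -2.2 3.3 38 62 333 360 324 376 0.59",
    "2 29 2.4e-15 5.4e-12 46.1 4.1 1 86 569 641 569 642 0.85",
    "16 29 3.2e-06 0.0074 16.9 0.0 1 58 2209 2256 2209 2272 0.81",
]

I_EVALUE_CUTOFF = 0.01  # assumed threshold, not taken from the record

hits = [parse_row(r) for r in ROWS]
confident = [h for h in hits if h.i_evalue < I_EVALUE_CUTOFF]

for h in confident:
    ali_len = h.ali_to - h.ali_from + 1  # aligned residues on the protein
    print(h.index, h.i_evalue, ali_len)
```

With this cutoff, domain 1 (i-Evalue 6.5e+03) is dropped, while the partial domain 16, which covers only HMM positions 1 to 58, is retained.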

Sequence Information

Coding Sequence
ATGTCACAACATAACCAACCCCACCAAGTTCCCCCGCAACCCCATCCGCACTATCCTTACCACCACGCCTCTTTGTCGCTGCCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTATATGGGCCACGAAGTAATGGCCGGCAGCAGCTATCCCTACATCAAAAGCGAACCCATGGAGGCATTCCAGCACCCGCCAAACCCCATGGCACCGCCGCCACCCCTGCCTCCGGCCCCGGAAATGATCATAAAATCGGAACCCATGGACGAACAGGCCTACAAGTCCAACTATATAGACGACAACACCCCGTTTGCGGACTTCAGCAAGTTTAACGAATACAGCGAGGATATGCTGAGTCCCAAAGTGGAGCTTACCGTTAAGGACGAGTCCTACGGCAAGAACCATAATAGTTTTCCTCGTCGCAAGCCACCAAATGATCGTCTCGCTGGTAATGAGAGCCTGCCGATCTGCCAGCGCTGCAAGGAGGTGTTCTTCAAGAAGCAGACCTACCTGCGCCACGTTGCCGAGAGCAGTTGCAGCATCCAGGAGTATGACTTCAAGTGCAACATCTGCCCCATGTCCTTCGTGAGCGCTGAGGAGCTGCAGCGCCACAAAAACCATCATCGGGCCGACCGATTCTTCTGCCACAAATACTGCGGCAAGCACTTTGAAACGATTGCCGAGTGTGAGGCGCATGAGTACATGCAGCACGAATACGACAGCTTTGTCTGCAACATGTGCTCTGCCACTTTTGCCACAAGGGATCAGCTTTACTCCCACCTGCCGCAGCACAAGTTTCAGCAGCGCTTCGACTGCCCCATTTGCCGCCTGTGGTACCAGACCGCTCTCCAGCTCCACGAGCACCGGCTGGCAGAACCCTACTACTGTGGGAAGTACTACGGGGCCGGGCTGAATACGGCAACACCTCAGCAGCAACATCACCACCAGAGCCAGACCAACTACAAGCTACAGGATTGCCACATGGCCACAATGGAGATGCCCAACACATCGCAGCACAAGCCAAACTCCTCCAACTCCACCTTGCCGGCCACGGCGGCTCTCAGTTCCTTGCTGCAACAGCGGCAAGCGAATGCTGATGGCGCTGCCATGTTCGCTGCCTCGGCGGCGGTCAAGGCGGAGATGAACGTGAAGCTGGAGCGAAGCTACAGCAACTCGACCAGTGAATCATCGTATGGTGTGCAGGATGGCGGCTACAACAACTCCTTCTCCGGAGAAACTTCCATGCACAGTGGTGCCATCGCTGGGCCACAGGCGAACTCCTCGACGCTGGACGACTCCGAGGACGCGCTGTGCTGTGTGCCATTGTGTGGAGTGCGCAAGAGCACCAGCCCTACGCTCCAGTTCTTCACGTTCCCCAAAGACGAAAAATACCTCAACCAGTGGCTGCACAACCTGAAGATGTTCCACGTGCCGGCCTCCAGCTACGCCAGCTTCCGCATCTGCAGCATGCACTTCCCCAAGCGCTGCATCAACCGATACTCGCTGTGCTATTGGGCAGTTCCGACATTCAACCTGGGCCACGACGACGTGGCCAATCTCTACCAGAACAGAGAGCTGACCAACACCTTCACCGTCGGTGAAGTGGC
CAGGTGCAGCATGCCGCACTGCACCAGCCAGCGGGGCGAGAGCAATCTCAAGTTCTACAACTTTCCCAAGGACATCAAGAGCCTGATCAAGTGGTGTCAGAACGCCCGTCTCCCTGTCCAGGCCAAGGAGCCGCGCCACTTCTGCAGCCGCCACTTCGAGGAGCGTTGTATTGGCAAGTTCCGCCTTAAGCCCTGGGCTGTGCCCACTCTCCATCTGGGCGCCCAGTACGGAAAGATCCACGACAATCCGAAGAACCTGTATGTGGAGGAGAAACGGTGTTGCCTCAACTTCTGTCGCAGGAGCAGGTCCTCCGACTTTAATATGTCCCTCTACCGCTTTCCCAGAGACGAAGTCCTTCTCCGCCGTTGGTGCTATAACCTTCGACTAGATCCCGGAGTATATCGCGGCAAGAATCACAAAATATGCAGTGCCCACTTTATCAAGGAGGCGTTGGGCTTGCGGAAGCTATCACCTGGGGCGGTGCCAACATTGCATTTGGGCCACAACGACACCTTCAACATCTACGAGAACGAGCTGTGGCCGCCGCCGACTCCCTCCACCAGCCACGGCAGTGGCCAGGTGCACATGCAGCATCAGCAACACATCCCGTCGCACCACTCGCTCCAGCACCAGCTGCATCTTGGACAAAGCAAGTCCTATCAACGGCACTCGGCCGCATCCACTTCGTCCTCGGCGAGCTCCACCTCGCACTACGTGGATCCGGAGGTGAGTGCTTCGTACCTGGCGATGGGCGGATCCGCGGCAAACGCTAGCGACAGTATGGATGTCTGCTGTGTGCCCAGCTGCGAGAGCAAACGGCACAACGCCGAGAACATCACCTTCCACACGATTCCCCGAAGACCCGAGCAGATGCGCAAGTGGTGCCACAACCTGAAGATACCCGAGGACAAAATGCACAAGGGCATGCGGATCTGCAGCCGGCACTTCGAGGCCTACTGCATCGGTGGGTGCATGCGTCCGTTCGCAGTGCCCACACTGCATCTGGGTCACGACGACGAGGACATCCACCGCAATCCGGACGTTATAAAGAAGCTAAACATCCGCGAGACCTGCTGCGTGGCTGTCTGCAAACGAAACCGGGACCGGGACCATGCCAACCTGCACCGCTTCCCCAGCAACGTGGCGTTGCTGACCAAGTGGTGTGCCAATCTCCAGCGCCCCGTCCCGGACGGCAGCAAGCTCTTCAACGACGCCATTTGCGAGGTGCACTTCGAGGACCGATGCCTGCGGAACAAACGCCTGGAAAAGTGGGCGGTGCCTACACTGACCCTGGGCCACGACGACATTGCCTATCCCCTGCCCACGCCGGAGCAGGTTGCCGAGTTCCACTCTCGGCCATCGGCCCCCAACAATGGCGAGGAGCAGGGCGAGTGCTGCGTGGAGACCTGTAAGCGAAACCCCAGCGTGGACGACATTAAACTGTACCGCCCTCCGGAGGAGGCCTCTGTGCTGGCCAAGTGGGCGCACAACCTACAGACGGAGGCGGCACAGCTGGTGAGCCAGCGAATCTGCAATCTGCACTTCGAGGCCCATTGCATCGGCAAGCGGATGCGGCCATGGGCCATACCCACTCTCAACCTGGCCGGCAACATTGAGAATCTCTACGAGAATCCGGAGCCTTCCATGCTCTACAAGCGGCGGATGCACTCCAAAGCGAAACTGTCGGTCTCTGCGAAACCCACCTGGGTGCCGCGTTGCTGCCTGCCGCATTGCCGCAAGGTGCGCGCCATCCACAATGTCCAGCTCTACCGCTTCCCCAAGCACAACCGCTCCACGCTGGCCAAGTGGGCGCATAACCTGCAGGTGCCCATGGTGGGCAGTGCCCAACGCCGGGTGTGCTCGGCCCACTTTGAGCCTCTTGTGCTGAGCAAGAAGTGCCCGGTGCCGCTGGCGGTGCCCACACTGGACCTGAACGCCCCTGCAGGGCACATGGTGTACCAGAATCCGGCCAAGCTCAGGGCCAGTAAGCTGTGCCTGC
AGCGCGTGTGCATCGTAGAGAGCTGTCGCAAGACTCGGGCGCAAGGAGTGCAACTCTTCCGGCTCCCGCACAATCCATCCCAGCTTCGAAAGTGGATGCACAATATCCGGACACGTCCGCGGGGATCCATGCGGTCTCAGTACCGGATCTGCTCCCGCCACTTTGAGACGCACTCGTTTAACGGGCGAAGGCTCAGTGCAGGAGCCATTCCCACGCTGGAGCTGGGCCACGACGACGACGACATCTACCCCAACGAGGCGCAGGCTTTTGTGGACGAACACTGCGCCGTGGAGGGATGCGGGGCATCCAAGGAACAGCCGGAAGTGCGACTGTTCCGCTTCCCCACTGACGACGATGACATGTTGTGGAAGTGGTGCAACAACCTCAAGATGAACCCCGCGGACTGCACGGGCGTGCGCATCTGCAACAAGCACTTCGAGGCGGACTGCATTGGACCCAAGCACCTATTCAAGTGGGCCATTCCCACCCAAGAGCTGGGCCACGACGATGCCCAGATAGAACTCATTCCAAACCCCAAGCCGGAGGATCGGTACGTGGACCCAGTGTTCAAGTGTGTGGTCCCCACCTGTGGCAAGACGCGGCGCTTTGACGAAGTCCAGATGAACAGTTTCCCCAAGGACCCGGAGCTCTTCCAGCGGTGGCGACACAACCTCCGCTTGGACCACTTGCACTTCCACGAGCGAGAGCGCTACAAGATCTGCAATGCCCACTTCGAGGACGTGTGTATTGGCAAGACCCGCTTGAACATCGGCTCGATACCCACACTAGAGCTGGGCCACGACGAGACCGAGGACCTGTTCCAAGTCAATCCCGCGGAGCTGCAGAGCAACTTGTTTGGTCGCCAACGGCGGCTGCTCGACGGATCGGAATCCGGGGAGGTGGTGGTCAAGCAGGAGCTTCCGGATGAGGAAACCGAGCCCGAGGACATCAAGCCGGACATTCGAGAACTATTAGTTTCCAGACCCAGACAGGTGAAGGTCAAAAAAGGGATGCTGGGGAATCTGAAGTGCTGTGTCCGGAGCTGCGGAAGGAGCCGGCTCCAACATGGTGCTCGTCTGTTTGCCTTTCCCACGGGCAAGCAGCAGCACCTCAAGTGGCGCCACAATCTGCGCCTGGAGCCAGAGGACGTGGACAGGTCTACGAGGGTGTGCAGCGCTCACTTCAATCGCCGTTGCATAGACGGCAAGCAGCTTCGGAGCTGGGCCATGCCCACCCTGCAGCTGGGCCATCGGGAGCAGCCCATCTACGAGAACCCCAAGAACATACCGGGCTTCTTCACGCCCACCTGTGCCCTGAGCCACTGCCGCCAGAGAAGGAGCATCGACAACGACCTTCGAACATACCGGTACCCGCGGACAGAAGACCTGCTGGAGAAGTGGCGGGCAAATCTCCGCCTGACTCCGGATCAGTGCCGAGGTCGTATCTGTGCGGATCACTTTGAACCTATGGTGCGTGGTAAGCTGAAGCTGAAAACGGGAGCGGTGCCCACCTTGAAGCTTGGCCATGACGAGGGACTGATCTACGATAACGAGGCGATCAAGGCTGGCATGGCGGAGGAGGAGGAGGTCACCTGCAAGCAGGAGATGGTCGAAGAGGAGGAAGAGGCCGAGGGAGAGGAGTCGCCCGAAGGAGTTCCCGCTCTCAACGAGGATGACGACGACAAAGACGACAGCTACTTCGATCCTTTGGAGTTGGTAGAGACGTTCGCAGAGCGCGCCAGTGATGAAGAAGTGGAAGACCACGAAGTCGAGGAGAAGAATGAGCCGGAGGAGGGGGATGAGGAGGAGGCAGAGGAGCTCCTGCCAGACCTGCCACCCACACCGCCACCTGTACCCCAGCGTCGCGAAAAACCCGCCAACAATGTGACCCCCATTTGCTGTCTGAAGCACTGTCGCAAGGAGCGCACGGCCTTCCATCTGCTGAGCACATTCGGCTTCCCGAAGGACCGCAAGCTCTTGCTGAAGTGGTGCGCCAATCTC
CACCTGCTTCCGCATGACGTTGTCGGGCGGGTCTGCATCGAACACTTCGAACCGGAGGTGCTCGGAACTCGGAAGCTGAAACAGAATGCAGTGCCCACCTTGAACGTGGGCCACGACGACCCCTTGCGGTATACCTGCCATGGTGTGGAGCAGGATCTGGACTTGGAGCATGGACAGCCGCAGCACTCGGTTTTTCGGCTTTGGAGCCTGAAACACTGTCGCAAGAGGAAGCTATCGGATCCGCCGGACATTCGCCCCAGCCACTGGAAGGAACTGAAGCTGCACATGCAGAAGCAGAGGCAGGTGGAAATGGTAGAGATGGAGACCGACATACTGATGAGCACTCCTCCTCAGACACCGGTGAAGATTAAACCCAAAAGATGCTGCGTCATCAGCTGCGGGAGCGAGGATCCTAGAAAACTGGTGGCGCTGCCGGATGAGCGCAGCCTTCTCCGCCGGTGGCAGCACAACCTGAAGCTGTCAGTGCTGACGGATCCAGGTCTTGGCTTGTGCCTGGACCATTTCGAAGAGTCTCTGGTCCAATATGGAAAGCCCATGGAGAGGGCAGTGCCCACCCTGAAGTTGGGTCACAAGGGCGGTAATCTCTACCGGAACAATGCCACTTGTCTGGTCCCCAGTTGTCCCAGTTCCGGCTCCGATAGCACTAGTTTTGTGGGTCTGCCCCTCAATCCGGTGATGAAAAGGGCCTGGCTCTCCTACCTCCAACTGCCATTCACTAGCGCCGGTCTTCTATGTGGCAACCACTTCGTGGAGCTATACGAGCAGGTGGACTTGCCTGAGGACTTGCCCGTCCAGGATTTGGAGGAGCTGGAACGAACTGTCGATGAGCTGCAGTGCGCTGTGCCCGGTTGTGCGTCAAAGAACGCCCGTGAGATTCCCGTCCAGCTGGTCCAGTTACCCCACAACGAGAAGGAACTGTCCAAGTGGCTGCACAACACAAAGATCACCTATGACTATTCCCGGCACGGCAGCTATCGGATTTGCCTGCTCCACTTCGACCCAATCTGCCTCGATGAGGACTTTCCCCAGAGTTGGGCAGTGCCTACTCTAAACCTGGGCCACGACGACCAAATCCACTTGAATCCCGTGCAGAATCAGGTTGCTGAGGCTCTTAACGGAACGTCCAATAGCCATCATAGCCATAGCCTGAGACCTCTGAGGATTAGGACAGAACTAGCATCCAGTCCGAGTGTGAGTGCCAGTCCCAGTCCGAGAGGAAACATCCGGATTTGTTGCATCCCCACTTGCAACCAGTTTGGGAACAGCCAGGTGCGACTCTATCGCTTTCCCAGCGAGGAGCAGTTCCTCCTCCAGTGGCTGGTCAACACGCAGCAGCAGCCCCGACTCGTGGATCCCATGGAGCTTTACGTGTGTCAGGCACACTTTGAAACCGACGCCACCTACAAGAAGCACCTTCGCAGCTGGGCCTTGCCGACCCTGAATCTTGGCCATGACGGGCATGTCTTTCAAAACGCCAGGCACAACGGAAACACTGCCGACGTCGAGGAGGCATTGAAGTTTATCCGGGAGCGCTACTGTTCGGTGCTGAGTTGCTTTCAACTCAGAGGAGAGGGAGTCCGCCTTTTCGAGTACCCCGAGGACATGGCTATGATCCGAAAGTGGGCAGTTGCCTGCAAACATCGTTCCATGCACGCCAGGAGCCATGGCCTCCAGGTGTGCCAGGCGCACTTTGCTCCCGACTGCTTTGATCCCGACACTGGAGACCTACTAGAGGGATCAATACCCACGCTGGAACTCAACCGCGAAGACATCGAGAGACACTGCTTGGTGCCAGGTTGTGAGCAGGACGAAGCGGGCCCCCGGCTGCGATTCTATAAGCTGCCCAAGATCGGTGAACAGCTCGAGGCGTGGAGCACCAATATAAAGATTCCGGTCTCAGAACTGAAGCGCGGAGACCAGCGCATCTGTGAGCGCCACTTCGAGACGTACTGCTTCGGACCTAGCCGGGGTCTGCGGCT
GGGAGCCTTACCCACTCTGTTCCTGGGTCACGAGGACCTGCTTCTTAATCCCGACAACTTGCGGGAGAACTGCTGCGTACCGGGGTGCGGGCGTATCCGGCAGACTGATGACATTCCCTTCTACGGCTTCCCGAAGCATTGGTCCTTGGCCAGGAAGTGGCTGCACAACATCCGCTTGGAAAAGACCAGCAAGGATCAGCTGAACAAACTGAGGGTATGCCCGGCGCACTTTGAGTCGGATGTGCGGGAAAACGACGGACTCCTGCCAGAAGCCATGCCCACCAAGCAGCTGGGGCATTCCTCCGAAGGGATTTTCCTCACGGACAAGGGCACGCAGGCTAGGAGTCTTCCGAATCTCAAAAGATCCTCTCCGGAGGTCACTTGCTGTTATCCGGACTGCACTGATTCGTCGAGATTCCAGTTATTGGATTTTCCCGACCAGGCAGAGCTCCGCGATGCATGGTTGGGTCACTTGAGACTCAAGGAGCTACATGATGAAGCCCCACAGCTCTGTCCCCTCCATTATGTGATTCTATATGAGCACAGTGCCAAGGAGTTTCCGGAGCACGTTCCAGACCAGTTGATGGAAGCAAACTATACTAACGCCCGCGCCAACCGGCGGGTCAAGATCGTCAGTTGCGCCATCAAAGGCTGCACAACGGTGAGGCCTAGGGATGGAGTGCCGCTGCACGGCATGCCCACGTACAAGGATATCCTGCAGATGTGGGTGGACAATGGGCAGGTGGACTTCTCCGAACCGCAACGGTACATGCTCAAGGTGTGTCACAGGCACTTCGAGCCACGTTGCTTCGTCGATGAACGGCGGCTCAGCTCCTGGAGTGTTCCTACCCTGCATCTTCCCGGTGAGACTGTCCACCAGAATCCCAGCAAAGAAGAGTGGGAGGCCATCAAGCGAGAGAACAAGGAAGAGCCAGAAATCAAGGAGGAACCTCTAGAGACGGAGCCAGAGATGGAGATGGAAACGGAAAATTCTCTACTGGAGCCCATTGTCAAGATGGAACACCTGGAATCCGAGGAGGAGGACTCAGAAATGCAGGCGTTGGAGGTGCTGCTGGAGGTTGGGCACGTGGAGCGGCTGGACAGCTATGAAAAGATCGACGAATCCCCCATTGCCTACAAGTCCAATCGTGGTCAGTACAACGCCAACAGCTGTGCCGTGGAAGGGTGTGACGTCACGGCCGAGGACGTGGGCGGAACTATCAAGCTGCACAAGTTTCCCGCCCCAGCGGAAGCCGCCCGCAAGTGGATGCACAACACACAGGTGGACATGGAGGAGAAGTTCTGGTGGCGCTATCGCATCTGCAGCTACCACTTTCACCAGGACTGCTTCCAGGGGTCTAGAATCCGAAAGGGAGCCATGCCCACGCTGCTCTTGGGACCTCGGAGACCGGATGAGGTCTACGACAATGAGTTCGCATCGCAGCCGGAGGTTAAGGACCCACCTCCGCCGGTCGAGATCGTCCCAGTGACCAGTGTGACTAAACGGATAGCTCCCGATGTTACCAATATCTGCCTTCCTCCGCCGGCTGCGCCCCGGAAATCCAGCAAGTTCTGCCAAATCGAAGGCTGCTCGAATCACCTGACCACCGACAACATAACCCTCCACAAGTTTCCGCACTCGGAGGACATGTGCATCCGATGGCAGCACAACTCTCAAGTTCCATTCGATCCGAACCATCGCTGGCGATACAGGATCTGCACCGCCCACTTCGAACCCGTGTGCTTGTCTAACTTGCGCCTGCTCCACGGAAGTGTGCCCACTTTGAAGCTAGGACCCAAAGCTCCCGCGGAGCTCTTCGACAACGATTTTGAGGCCATCAACCAGCGACTGGACAAGAGATCGGCGGCAGAGGTGAAACAGGAACGGGTGGATATGGAAGACGAGCTGCACGAGGACCAAATGGATGTGCCTAGCTTGATGCCTGTGAAGCAGGAGAAGGTATCCTTCAACCAGATCAAGTCTGGCTACGACAAGTGCT
CACTGGCCCACTGCCAGCGCCAAAGATCTCTGCACGGTGTCCACATCTACAAGTTCCCCAGGTCGCAGATCCAGCAGGAGCGATGGATGCACAACCTCCGCATCCGCTACGATGAGCGCCGTCCCTGGCGATTCATGATATGTAGCGTCCACTTCGAGCCCCACTGCATCAGCCTAAGGAAGCTGCGTCCCTGGGCAGTTCCTACGCTGGAGCTGGGCACGAATGTGCCGGAGATACTCTTCACCAACGAACAGTGCCTGGAACTGGAGGTGGAACAACCCAGCGATCGTAGCGAAGCGGAGAGCGAAGAGGAGGATGGCCTGGAAGAAGACGACGATGGTGAGGACGTCGAGGCGGAGGAAGAAGGACATGACTCCAATGTCCGCATCAAAAAGGAACGGCGTTCGAGACTGGATCCATATCCTGCTGGTCAGGTTCCGCCCTGGAAAGTGAAGCAGTGCTGCCTTCCCTACTGTCGTGCCTTCCGAGGAGATGGCATCAAGCTCTTCCGGCTCCCCAACAACCGAACCTCTATTCACAATTGGGAGTTGGCCACGGGCATGGTGTTCAAGGAGTCGCAGCGAAACACGCGACTCATTTGTAGTCGACATTTCGATCCGGAGCTTATCGGAGTGCGTCGCCTCATGCGCAACGCTATTCCAACTCTGCATCTGAATCCGGAAGCCGTTAAGGGCAAGGAGAAAAAGGTTTGGCAGAGCAAACCCAAGGAAGCTCCCACACCCATCCCAACCTGCTGCATGGCGGACTGCCATCACAATGGAAACGCCAAGCTGCATAAGTTCCCCAATGATTCCACACACCTGAGGCAGTGGTGCCAGGCCCTCAGACTCACGGATATACAACGTTATCGTGGCAAGTACATCTGCTCGGCCCACCTGCCGACCAACATGACCGTAAGCTGCGTCGTCTGCGGGGTAGATGACGTTCAGCTACCGATGCTGGACTTTCCAGAGAACCGCAACCAGCGGGCCAAATGGTGCTACAACCTAAAAATCGAGACCATACCCAAGTGGGATCGCTCCAAGCACATCTGTTGCCGGCACTTTGAGTCACACTGCTTTGTCCGGCCGGGTGAACTTCGTCCAGGAGCGACCCCAACAGTGGCATTGAACCACAGCGATACAAACATATTCCTCAGCGACTACGCCACCGATCCGACGACCTCCTATGCGGGTAATCAGATCAAGGACGAGCCCATGGACGGCGACGAGACGCTTCTGGTCTAG
Protein Sequence
MSQHNQPHQVPPQPHPHYPYHHASLSLPXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXYMGHEVMAGSSYPYIKSEPMEAFQHPPNPMAPPPPLPPAPEMIIKSEPMDEQAYKSNYIDDNTPFADFSKFNEYSEDMLSPKVELTVKDESYGKNHNSFPRRKPPNDRLAGNESLPICQRCKEVFFKKQTYLRHVAESSCSIQEYDFKCNICPMSFVSAEELQRHKNHHRADRFFCHKYCGKHFETIAECEAHEYMQHEYDSFVCNMCSATFATRDQLYSHLPQHKFQQRFDCPICRLWYQTALQLHEHRLAEPYYCGKYYGAGLNTATPQQQHHHQSQTNYKLQDCHMATMEMPNTSQHKPNSSNSTLPATAALSSLLQQRQANADGAAMFAASAAVKAEMNVKLERSYSNSTSESSYGVQDGGYNNSFSGETSMHSGAIAGPQANSSTLDDSEDALCCVPLCGVRKSTSPTLQFFTFPKDEKYLNQWLHNLKMFHVPASSYASFRICSMHFPKRCINRYSLCYWAVPTFNLGHDDVANLYQNRELTNTFTVGEVARCSMPHCTSQRGESNLKFYNFPKDIKSLIKWCQNARLPVQAKEPRHFCSRHFEERCIGKFRLKPWAVPTLHLGAQYGKIHDNPKNLYVEEKRCCLNFCRRSRSSDFNMSLYRFPRDEVLLRRWCYNLRLDPGVYRGKNHKICSAHFIKEALGLRKLSPGAVPTLHLGHNDTFNIYENELWPPPTPSTSHGSGQVHMQHQQHIPSHHSLQHQLHLGQSKSYQRHSAASTSSSASSTSHYVDPEVSASYLAMGGSAANASDSMDVCCVPSCESKRHNAENITFHTIPRRPEQMRKWCHNLKIPEDKMHKGMRICSRHFEAYCIGGCMRPFAVPTLHLGHDDEDIHRNPDVIKKLNIRETCCVAVCKRNRDRDHANLHRFPSNVALLTKWCANLQRPVPDGSKLFNDAICEVHFEDRCLRNKRLEKWAVPTLTLGHDDIAYPLPTPEQVAEFHSRPSAPNNGEEQGECCVETCKRNPSVDDIKLYRPPEEASVLAKWAHNLQTEAAQLVSQRICNLHFEAHCIGKRMRPWAIPTLNLAGNIENLYENPEPSMLYKRRMHSKAKLSVSAKPTWVPRCCLPHCRKVRAIHNVQLYRFPKHNRSTLAKWAHNLQVPMVGSAQRRVCSAHFEPLVLSKKCPVPLAVPTLDLNAPAGHMVYQNPAKLRASKLCLQRVCIVESCRKTRAQGVQLFRLPHNPSQLRKWMHNIRTRPRGSMRSQYRICSRHFETHSFNGRRLSAGAIPTLELGHDDDDIYPNEAQAFVDEHCAVEGCGASKEQPEVRLFRFPTDDDDMLWKWCNNLKMNPADCTGVRICNKHFEADCIGPKHLFKWAIPTQELGHDDAQIELIPNPKPEDRYVDPVFKCVVPTCGKTRRFDEVQMNSFPKDPELFQRWRHNLRLDHLHFHERERYKICNAHFEDVCIGKTRLNIGSIPTLELGHDETEDLFQVNPAELQSNLFGRQRRLLDGSESGEVVVKQELPDEETEPEDIKPDIRELLVSRPRQVKVKKGMLGNLKCCVRSCGRSRLQHGARLFAFPTGKQQHLKWRHNLRLEPEDVDRSTRVCSAHFNRRCIDGKQLRSWAMPTLQLGHREQPIYENPKNIPGFFTPTCALSHCRQRRSIDNDLRTYRYPRTEDLLEKWRANLRLTPDQCRGRICADHFEPMVRGKLKLKTGAVPTLKLGHDEGLIYDNEAIKAGMAEEEEVTCKQEMVEEEEEAEGEESPEGVPALNEDDDDKDDSYFDPLELVETFAERASDEEVEDHEVEEKNEPEEGDEEEAEELLPDLPPTPPPVPQRREKPANNVTPICCLKHCRKERTAFHLLSTFGFPKDRKLLLKWCANL
HLLPHDVVGRVCIEHFEPEVLGTRKLKQNAVPTLNVGHDDPLRYTCHGVEQDLDLEHGQPQHSVFRLWSLKHCRKRKLSDPPDIRPSHWKELKLHMQKQRQVEMVEMETDILMSTPPQTPVKIKPKRCCVISCGSEDPRKLVALPDERSLLRRWQHNLKLSVLTDPGLGLCLDHFEESLVQYGKPMERAVPTLKLGHKGGNLYRNNATCLVPSCPSSGSDSTSFVGLPLNPVMKRAWLSYLQLPFTSAGLLCGNHFVELYEQVDLPEDLPVQDLEELERTVDELQCAVPGCASKNAREIPVQLVQLPHNEKELSKWLHNTKITYDYSRHGSYRICLLHFDPICLDEDFPQSWAVPTLNLGHDDQIHLNPVQNQVAEALNGTSNSHHSHSLRPLRIRTELASSPSVSASPSPRGNIRICCIPTCNQFGNSQVRLYRFPSEEQFLLQWLVNTQQQPRLVDPMELYVCQAHFETDATYKKHLRSWALPTLNLGHDGHVFQNARHNGNTADVEEALKFIRERYCSVLSCFQLRGEGVRLFEYPEDMAMIRKWAVACKHRSMHARSHGLQVCQAHFAPDCFDPDTGDLLEGSIPTLELNREDIERHCLVPGCEQDEAGPRLRFYKLPKIGEQLEAWSTNIKIPVSELKRGDQRICERHFETYCFGPSRGLRLGALPTLFLGHEDLLLNPDNLRENCCVPGCGRIRQTDDIPFYGFPKHWSLARKWLHNIRLEKTSKDQLNKLRVCPAHFESDVRENDGLLPEAMPTKQLGHSSEGIFLTDKGTQARSLPNLKRSSPEVTCCYPDCTDSSRFQLLDFPDQAELRDAWLGHLRLKELHDEAPQLCPLHYVILYEHSAKEFPEHVPDQLMEANYTNARANRRVKIVSCAIKGCTTVRPRDGVPLHGMPTYKDILQMWVDNGQVDFSEPQRYMLKVCHRHFEPRCFVDERRLSSWSVPTLHLPGETVHQNPSKEEWEAIKRENKEEPEIKEEPLETEPEMEMETENSLLEPIVKMEHLESEEEDSEMQALEVLLEVGHVERLDSYEKIDESPIAYKSNRGQYNANSCAVEGCDVTAEDVGGTIKLHKFPAPAEAARKWMHNTQVDMEEKFWWRYRICSYHFHQDCFQGSRIRKGAMPTLLLGPRRPDEVYDNEFASQPEVKDPPPPVEIVPVTSVTKRIAPDVTNICLPPPAAPRKSSKFCQIEGCSNHLTTDNITLHKFPHSEDMCIRWQHNSQVPFDPNHRWRYRICTAHFEPVCLSNLRLLHGSVPTLKLGPKAPAELFDNDFEAINQRLDKRSAAEVKQERVDMEDELHEDQMDVPSLMPVKQEKVSFNQIKSGYDKCSLAHCQRQRSLHGVHIYKFPRSQIQQERWMHNLRIRYDERRPWRFMICSVHFEPHCISLRKLRPWAVPTLELGTNVPEILFTNEQCLELEVEQPSDRSEAESEEEDGLEEDDDGEDVEAEEEGHDSNVRIKKERRSRLDPYPAGQVPPWKVKQCCLPYCRAFRGDGIKLFRLPNNRTSIHNWELATGMVFKESQRNTRLICSRHFDPELIGVRRLMRNAIPTLHLNPEAVKGKEKKVWQSKPKEAPTPIPTCCMADCHHNGNAKLHKFPNDSTHLRQWCQALRLTDIQRYRGKYICSAHLPTNMTVSCVVCGVDDVQLPMLDFPENRNQRAKWCYNLKIETIPKWDRSKHICCRHFESHCFVRPGELRPGATPTVALNHSDTNIFLSDYATDPTTSYAGNQIKDEPMDGDETLLV

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2