Basic Information

Gene Symbol
-
Assembly
GCA_005876895.1
Location
VCKU01000088.1:163840-177045[+]

Transcription Factor Domain

TF Family
THAP
Domain
THAP domain
PFAM
PF05485
TF Group
Zinc-Coordinating Group
Description
The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [1].
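The consensus spacing quoted above (Cys, 2-4 residues, Cys, 35-50 residues, Cys, 2 residues, His) can be written as a regular expression. The following is a minimal sketch for spotting candidate C2CH motifs in a protein sequence. The function name `find_c2ch_candidates` is hypothetical, and a regex match is only a rough screen, not a substitute for the profile HMM search (PF05485) reported in this record.

```python
import re

# Consensus from the description:
# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His.
# The lookahead wrapper lets overlapping candidates be reported as well.
C2CH = re.compile(r"(?=(C.{2,4}C.{35,50}C.{2}H))")


def find_c2ch_candidates(protein):
    """Return (start, end, matched_text) tuples with 1-based inclusive coordinates.

    Hypothetical helper for illustration only.
    """
    hits = []
    for m in C2CH.finditer(protein):
        matched = m.group(1)
        start = m.start(1) + 1           # convert to 1-based
        end = m.start(1) + len(matched)  # inclusive end
        hits.append((start, end, matched))
    return hits


if __name__ == "__main__":
    # Fragment copied from the protein sequence in this record.
    fragment = "CCVPLCGVRKSTSPTLQFFTFPKDDKYLNQWLHNLKMFHIPAASYATFRICSMHFPKRC"
    for start, end, matched in find_c2ch_candidates(fragment):
        print(start, end, matched)
```

Because each quantifier is greedy, one start position yields at most one candidate. Overlapping start positions (for example two adjacent cysteines) are reported separately.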
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 29 9.5 1.2e+04 -3.0 3.2 44 62 255 276 240 291 0.55
2 29 8.2e-15 1.1e-11 45.3 3.5 1 86 486 558 486 559 0.85
3 29 1.8e-14 2.3e-11 44.2 4.8 1 87 586 655 586 655 0.82
4 29 1.5e-15 2e-12 47.6 0.2 1 87 677 749 677 749 0.85
5 29 9.1e-16 1.2e-12 48.3 5.5 1 87 857 927 857 927 0.82
6 29 6.2e-15 8e-12 45.7 3.2 1 86 951 1022 951 1023 0.82
7 29 5.2e-13 6.6e-10 39.5 0.8 1 87 1058 1126 1058 1126 0.81
8 29 2.9e-11 3.7e-08 33.9 2.0 1 86 1165 1234 1165 1235 0.77
9 29 6.5e-17 8.3e-14 52.0 0.4 1 86 1262 1331 1262 1332 0.82
10 29 6e-13 7.6e-10 39.3 0.9 1 86 1353 1422 1353 1423 0.79
11 29 8.1e-14 1e-10 42.1 1.2 1 86 1450 1521 1450 1522 0.85
12 29 2.9e-12 3.7e-09 37.1 2.7 1 85 1593 1661 1593 1663 0.82
13 29 3.4e-12 4.3e-09 36.9 0.1 1 86 1686 1754 1686 1755 0.82
14 29 4.4e-14 5.6e-11 42.9 1.3 1 87 1928 1997 1928 1997 0.79
15 29 2.6e-10 3.4e-07 30.8 0.0 1 86 2083 2156 2083 2157 0.79
16 29 0.0025 3.2 8.5 0.0 1 58 2176 2220 2176 2234 0.80
17 29 9.6e-13 1.2e-09 38.6 0.2 1 86 2256 2325 2256 2326 0.81
18 29 2.1e-13 2.7e-10 40.8 0.1 1 86 2405 2473 2405 2474 0.82
19 29 1.2e-10 1.6e-07 31.9 0.3 1 85 2509 2579 2509 2581 0.79
20 29 2.2e-11 2.8e-08 34.3 0.5 1 87 2595 2665 2595 2665 0.80
21 29 2.8e-15 3.6e-12 46.8 1.0 1 86 2690 2762 2690 2763 0.79
22 29 3.1e-05 0.04 14.6 0.0 1 58 2790 2845 2790 2864 0.81
23 29 1e-11 1.3e-08 35.4 0.4 1 87 2883 2955 2883 2955 0.79
24 29 1.7e-12 2.2e-09 37.8 0.0 1 86 3087 3157 3087 3158 0.78
25 29 8.3e-13 1.1e-09 38.8 3.2 1 86 3213 3283 3213 3284 0.81
26 29 2.5e-14 3.2e-11 43.7 5.5 1 86 3409 3479 3409 3480 0.84
27 29 4e-12 5.1e-09 36.7 0.2 1 86 3565 3634 3565 3635 0.84
28 29 3.6e-09 4.6e-06 27.2 0.7 1 58 3655 3704 3655 3719 0.86
29 29 1.3e-09 1.7e-06 28.6 0.5 18 87 3723 3781 3708 3781 0.76
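The domain table above can be read programmatically. The sketch below parses whitespace-separated rows with the thirteen columns in the order given by the header and keeps hits below an independent E-value cutoff. The names `DomainHit`, `parse_domain_rows` and `significant`, and the default cutoff of 1e-3, are assumptions made for illustration and are not taken from this record.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One row of the per-domain hmmscan table (hypothetical container)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_rows(text: str) -> List[DomainHit]:
    """Parse data rows; header or malformed lines are skipped."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has exactly 13 fields and starts with the domain number.
        if len(fields) != 13 or not fields[0].isdigit():
            continue
        hits.append(DomainHit(
            int(fields[0]), int(fields[1]),
            float(fields[2]), float(fields[3]),
            float(fields[4]), float(fields[5]),
            *(int(x) for x in fields[6:12]),
            float(fields[12]),
        ))
    return hits


def significant(hits: List[DomainHit], max_i_evalue: float = 1e-3) -> List[DomainHit]:
    """Keep hits whose independent E-value is at or below the (assumed) cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


if __name__ == "__main__":
    # Three rows copied from the table above (domains 1, 2 and 16).
    sample = """\
1 29 9.5 1.2e+04 -3.0 3.2 44 62 255 276 240 291 0.55
2 29 8.2e-15 1.1e-11 45.3 3.5 1 86 486 558 486 559 0.85
16 29 0.0025 3.2 8.5 0.0 1 58 2176 2220 2176 2234 0.80
"""
    for hit in significant(parse_domain_rows(sample)):
        print(hit.index, hit.ali_from, hit.ali_to, hit.i_evalue)
```

Filtering on the independent E-value rather than the conditional one is a design choice for this sketch. A different cutoff or column may suit other analyses.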

Sequence Information

Coding Sequence
ATGTCACAGCATAATCCACATGCCCATCCGCACTACCATCACCACCCACTGCACCACCATCAGACtcaacaccaccaccaccagcagcagcagttgcagcagcagcagcagcaacagcaacaacaggcGCAAATGCCACAACACAGCAATTGGTACTCACATGTTGCTTCCTACCCGCCACCGCCACCCCACCACCATGCGACGAGTACCTCCGCATTTGCCGCCACCTCCACACCTTGCAAGGGCACTGACTACGGAGCAGGCAGCACGCATGGATattatgctgctgctgccgccgccgctggcGGTGGGCTCAATGTCAATGCTCCACCCAATCCGATGGCCCCACCCCCAGCGCCAGATATGATAATAAAATCGGAACCCATGGATGAGCATCCCTACAAGTCCAACTATATTGATGACAATACGCCCTTTGCTGATTTCAATAAGTTCAACGAATTCAGCGGCGATATGCTAAGCCCCAAAGTCGAGCTAACCGTCAAAGATGAGACCTACGGAAAGACTTCTAGCAGCAGCTTTGCACGCCGCAAAgcccagcaacaacagcaacagcagacaACAGATCATTCGGCGGAGAGTTTGCCCATTTGCCAGCGCTGCAAGGAGGTCTTCTTCAAGAAGCAGTCCTATCTGCGGCATGTGGCCGAGAGCAGTTGTGGCATCCAGGAGTATGACTTCAAGTGCAACATATGCCCCATGTCTTTCATGAGCACCGAAGAGCTGCAGCGGCACAAGCATCTGCATCGTGCGGACAAGTTCTTTTGCCACAAATACTGCGGCAAGCATTTCGACACGATAGCCGAGTGCGAATCGCACGAGTACATGCAGCACGAGTATGAGAGCTTTGTTTGTAATATGTGCTCTGGAACCTTTGCCACGCGGGAGCAGCTGTATGCCCACTTGCCGCAGCACAAGTTCCAGCAGCGTTACGACTGTCCCATCTGCCGTCTGTGGTATCAAACTGCCGGCGAATTGCACGAGCATCGACTGGCGGCACCATACTTTTGCGGCAAGTACTATACcaatcaacagcagcagcagcttgcGACGAATCAGGGCAACTACAAGCTGCAGGACTGCCATATGGCCACCATGGAAATGCCCACAGCCCCACTCCATAAGGCTACGCCTTCCAATGCCTCGGCCCTGCCAGCCACAGCTGCTTTGAGCTCTCTGTTGCAACAGCGCCAGGCAAATGCCGATGGGGCAGCGGCCATGTTCGCTGCTGCTTCCTCCACTTCCGCCTCGCTGAAGAGTGAGGTGAGCGTGAAGCTGGAGCGCAGCTACAGCAACTCCACCAGCGAGTCCTCGTACAGCCATCAAGAGAACAGCAGCTACAACAATGCCTATGGCAGCGACAGCTCCATCCACGGCGGAGCACTGGCAGGACCACAGGCGCACTCCTCAACGCTGGACGACTCCGAGGACGCCCTGTGCTGTGTGCCTCTGTGCGGCGTCAGAAAGAGCACCAGTCCAACGCTGCAGTTCTTCACGTTCCCGAAGGATGACAAATATCTGAATCAGTGGCTGCACAACCTCAAGATGTTCCACATACCAGCCGCCAGCTATGCGACATTTCGCATCTGCAGCATGCACTTCCCGAAGCGTTGCATAAATCGGTATTCGCTGTGCTATTGGGCGGTGCCCACCTTCAATCTGGGGCACGACGATGTGGCTAATCTGTACCAGAACCGGGAGCTAACCAACACCTTTACCACTGGAGAGGTGGCACGCTGCAGTATGCCGCACTGCACTAGCCAGCGGGGGGAGAGCAATCTGAAGTTCTACAATTTCCCCAAGGACATCAAGAGCCTGATCAAGTGGTGCCAGAACGCCCGCCTGCCAGTGCAGGCCAAGGAGCCGCGTCACTTTTGCAGCCGCCACTTTGAGGATCGTTGCATTGGCAAGTTCCGACTGAAGCCCTGGGCCGTGCCCACCCTCCATCTGGGGGCGCAGTACGGTAAAATCCATGACAATCC
CAAGAACCTGTATGTGGAGGAGAAGCGCTGCTGCCTGAACTTTTGTCGTCGCAGCCGTTCCTCGGACTTTAACATGTCACTGTATCGATTCCCCCGCGATGAGGTGCTCCTGCGACGCTGGTGCTATAATTTGAGGCTGGACCCGGGCGTGTATCGCGGCAAGAATCATAAAATATGCAGTGCGCATTTCATCAAGGAAGCACTGGGTCTGAGAAAGCTATCGCCAGGGGCCGTTCCCACATTGCATTTGGGTCACAATGACACCTTTAATATCTACGAGAACGAGCTGTGGCCACCGCCATCTCCCACTGGACAACATGGCGGCAATCACCAGctcctccagcagcagcagacgtcGCAGCAGCTGTCACATCACCACTCGtcgctgcagcagcagcagcatcagccaATGCATAGCAAATCCTATCAACGCCATTCGGCGGCATCCACTTCCTCCTCAGCCAGCTCGGCCTCTCATTACGTGGACCCCGAGATGAGTGCGTCGTATCTGAGCCTGTCTGCGGGTGGCTCCTCTGGCGGGATGAATGCCAGCGACTGCATGGATGTGTGCTGTGTGCCCAGCTGCGAGAGCAAGCGGCACAACAGCGAGAACATCACATTCCACACGATACCGCGAAGGCCGGAGCAGATGCGCAAGTGGTGCCACAATCTGAAGATACCCGAGGACAAGATGCACAAGGGCATGAGGATTTGCAGCCTGCACTTCGAGCCCTACTGCATTGGTGGCTGCATGCGACCGTTTGCCGTGCCCACACTCCATTTGGGGCACGAGGATGAGGACATTCACCGCAATCCGGATGTGATCAAGAAGCTGAACATCCGCGAGACCTGCTGTGTGGCCGTGTGCAAGCGGAATCGCGACAGAGACCATGCCAATCTCCATCGCTTCCCCAGCAATGTGGCGCTGCTCACGAAGTGGTGTGCGAATCTGCAGCGGACAGTGCCCGATGGCAGCAAACTCTTCAACGATGCCATCTGTGAGGTGCACTTTGAGGATCGTTGTCTGCGCAACAAGAGGCTGGAGAAGTGGGCTGTGCCCACACTGATCCTCGGCCACGAGGACATTGCCTATCCGCTGCCAACGCCCGAGCAGGTGGCCGAGTTCTATGCTCGGCCCACGGCCCCCAACAATGGCGAGGAGCAGGGCGAGTGCTGTGTGGAAACGTGCAAACGGAACCCGAGCGTGGACGACATCAAATTGTATCGTCCGCCGGAGGATGCTTCGGTGCTGGCCAAATGGGCGCACAATCTGCAAACGGAGGCCGCTGTCCTAACGAACATGCGGATATGCAATCTGCACTTTGAGGCCCACTGCATTGGCAAGCGCATGCGTCCGTGGGCCATACCCACGCTCAATCTGGCAGGAAACATTGAGAATCTGTTCGAGAATCCCGAGCATTCGATGCTGTACAAGCGAAGGACGCACCTCAAACAGAAGGTGCCAGTGACGAAGCCCACGTGGGTGCCACGCTGCTGTCTGCCGCACTGCCGCAAGGTGCGTGCCCTGCACAATGTCCAGCTGTATCGCTTCCCCAAGCTGAACCGCTCGACGCTGGCCAAGTGGGCGCACAATCTGCAGGTGCCGCAGGTGGGCAGTGCCCAGCGGCGGGTCTGTTCCGCACACTTTGAGCCGCATGTTCTGAGCAAGAAGTGCCCGGTGCCGCTGGCGGTGCCCACACTAGACTTGAACTCACCAGCTGGCCACAAGATCTACCAGAATCCGGCCAAGCTGAAGGCCAACAAGCTGTGCCTGCAGCGCGTCTGCATTGTGGAGAGCTGCAGAAAGACCAGAGCCCAGGGCGTGCAGCTCTTCCGACTGCCCCACAGCCCCACGCAGCTAAGGAAATGGATGCACAATATACGGACACGCCCGAGGGCGGCCATGAGGAGCCAGTATCGTGTCTGCTCGCGACACTTCGAGACGCACTCCTTCAACGGCCGAAGGCTAAGCGCCGGGGCCATTCCCACCTTGGAGTTGG
GCCATGACGATGATGATATCTTCCCGAATGAGGCGCAGGCCTTTGCGGATGAGCACTGCGCTGTGGAGGGCTGTGAATCGTCGAAGGAACAGCCGGAAGTGCGGCTGTTCCGCTTCCCCACGGACGACGACGACATGCTGTGGAAGTGGTGCAACAATCTCAAGATGAATCCGGTCGACTGCATCGGTGTGCGGATCTGCAACAAGCATTTCGATGCCGATTGCATTGGACCCAAGCATCTGTATAAATGGGCCATACCTACGATGCTGCTCGGCCATGATGACTCGCAAATCGAGCTCATACTCAACCCCAAGCCGGAGGAACGCTATGTAGATCCCGTGTTCAAGTGCATTGTCCCCACCTGCGGCAAGACGCGCCGCTTCGATGAGGTGCAAATGAACAGCTTCCCCAAGGATGCGGAACTCTTCCAGCGCTGGCGCCACAACCTCCGCCTGGAGCACCTGTGCTTCAAGGAGCGCGAGAAGTACAAGATCTGCAATGCCCATTTCGAGGACATTTGCATTGGGAAGACGCGTCTCAACATTGGCTCGATACCCACTCTGGAGTTGGGCCACGAGGAAACGGAGGATCTGTTCAAAGTGAATCCGGAAGATCTGCAGAGCAATCTGTTCGGACGTCCCCGGCGGCTGCTAAGAGGATACAACAACGTGACCATCAAGCAGGAGGTGCCAGAGACGGAGGGGCAGGACATCAAGCCCGATATAGGAGCCAATTTTACACAGGTAAAGGTTAAGAAATCTCTGGGGGATATCAAGTGCTGTGTGCACACGTGTGGACGCAGTCGCTTGGAGCATGGGGCACGTCTCTTTCCCTTTCCCACGGGCAAGCAACAGCACCTCAAGTGGCGCCACAATCTGCGCCTCGAGCCCGACGAAGTGGACAAAACCACGCGCGTCTGCAGCGCTCACTTCAACAGGCGCTGCATCGATGGCAAGCATCTTAGGGGCTGGGCCATGCCCACACAGCAGTTGGGCCACCAAGAGCAGCCCATTTACGAGAATCCCAAGAATATACCCGGCTTCTTTACGCCCACCTGTGCGCTGGGGCACTGCCGCAAGCGTCGGAGCATCGACAATGACTTGCGCACATATCGGTATCCGCGGAGCGAGGATCTCCTCGAGAAGTGGCGTGCGAATCTCAGACTATCGTTGGATCAGTGCCGCGGGAGGATCTGTGCGGATCACTTTGAGCCGCAGGTGAGGGGGAAACTGAAGCTGAAGACTGGGGCAGTACCCACGCTCAAACTGGGCCATGAGGAGGCTTTGATGTACGACAATGAGGCTATAAAGGCTGGAGTGGCCGAAGAAGAGGCTGGCAGTCCGGCGCCATCGCCTCTGGTGATACCCAAGACGGAAGTGCTGGACGAAGAGGAGCGCGAGgaagatgaggaggaggaggagaaccCCGAAGAAGAGCAGCAGGAAACCCACGATGAGGAGAAGGATGAACATGAAGATGACACGCCCGAAGGAGCAGAGCAGCTGGGAGATGAGGATGACGACGAAGATCCAGGCAACTATTTTGATCCCTTGGAACTGGTGGAGACTTATGCGGAGCATCCCAGCGACGATGACAACAGTCACGAGCCAGCAGACGATGCCAGAgaagaggatgaggatgaCGAGGAGGAGCCAGAAACGCTCCTGCCTGATACGCCACcccaaccagcagcagccgttCTTCGCGTGCCCAAACCGTGGGAAAGACCTGTCGCCGTAGTGCCTCGCCGAGAGAAGCGTCCGAATAACGTGGATCCCATCTGCTGCCTCAAGCACTGCCGCAAGGAGCGCTCCGCCATGTATCTGCTGAGCACATTTGGCTTCCCCAAGGaccagcagctgctgctgaagTGGTGCGCCAACCTCCAAATGAATCCCTCGAGCTGCATTGGCCGCGTCTGCGTCGAGCACTTCCAGTCGGAGGTTCTGGGCACGCGAAAACTCAAGCAGAATGCGGTGCCCACCCTCAATGTGGGGCAC
GATGTGCCACTGCGCTACACCTGCAACGGCCAGGAAATGCctcaggcagcagcagcggtgaCGGCGGTCACCACCAGCAGCTTCCCCGACGAAATGCCACAGCATTCGGTTTTTCGGCTTTGGAGCCTGAAACACTGCCGCAAGAGGAAGCTGTCGGAGagtccagctccagctccagcagcgatcaaggaggaggagcagatGCAGACTCAGATtcagatggagatggagactAAGCCAAAGATGTGCTGCCTCCCCAGTTGTGGCAATGTGGAGGGTTACGGACCAGGCGGGCACTTTCAGCCGCTGCCCCACGACCAAAGAGTGCTGAAAAAGTGGCAGCACAACCTGAGGCTATCATCCATTAATCCTGACTCGGATCTTCGAGGCTTGCGTCTGTGCATGGAGCACTTTGAGCCGCATCAAATCGAGAACGGAGCACCAGTGCGGATGGCAGTGCCGACCCTCAAGCTAGGCCACTCCAGTCCCAACATCTTTAAGAACAGCGAGAGCACGCTGCCGGGATGCCTGTGGCCCTCGTGTCCGCCCAATCGCAAGATCTGCTACGATCTGCCTGACAATGAAGCCGTGCGAGCGGCTTGGCTGTCCTATGTGCGGCTGCCGCTGGACAGCCCGGGACGCCTTTGTGGCCTGCACTTTCTGCAGCTGTACGAGGAGGTGGATCTACCAGGAGATGTACCCGAAACGGTGCTCGAGTGTCTGCAGGATACCTACGATCAGGCCTCCATCTCGCTGAAGTTTCAGTGCTCGGTGCAGGGCTGCGGCTCCAAGTACAAGCAGGACACGCATTTGGCGAAGCTTCCACGCGACCCGGAACTGCTCGCCAAGTGGCTGCACAACACCAAGATCTCCTACGATCGCTCCTTGCATTTCAGCTACCGCATTTGTCTCCTGCACTTCGAGGCGTTCTGCTTGAATGGCGTCCGCCCACAGACCTGGGCCATACCAACACTCCAGCTGAATCACGACGAGGAGATCTACCAGAATACCGTCAAGCAGGAGATCCACGAGAGTCCCTTGAAGCAGGAGAAACCCCACTGTAGCAGCCTCCCCAGTCTGAATCTCTCTATCCCCCTGCACATCAAGACCGAACAGGGTCCTGTCCAGCGATCCCGAGGCACTTGGGGCACATCTTCTCAGAGCAGTCCCTGCCTGAGCGCCAGCTCCAGTCCACGCACGAAGAACAGGATTTGCTGCATTCCCGATTGCGGGGAGAATGCCAGAACCCAGCGGCTCTTCCGCTTTCCCACCGCCGAACCGGCGCTGCTCAAGTGGCTGGTGAATACCCAGCAAAAACCGGGACTGGTGGACATCCAGAGCCTGTTTGTGTGCCAGCTACACTTCGAGGCGGACGCCATCAACCAGACGCAGCTCAGCAGCTGGGCCGTGCCCACACTAAGGTTGGGCCACGACGGCCATGTCATACCGAATGCCAAGCACAATGGGAACATAGCCAACAGCCAGGAGACGGAGCAGGCCATGGAGTTCATTCGGGCCAACTACTGCTCCGTGCTGAGCTGCTTCCAGCCGAAGGCCGACGGTGTGCGCTTCTACAAGTATCCCAGCGACATTGCCATGGTGCGCAAGTGGGCCACGAATCTCAAGCATCGCTCCATGCAGGCCAGCAGCCATGGCTTCCTGGTGTGCCAGTCCCACTTCGCCGCCGACTGTTTTGATCCGGAGACGGGAGACCTGCGCGAGGACGCCGTGCCCGTATGCACAGTCGCGGGGAGCGTGAAAACAGAGGTCGTGCTGCTGCGTTGTCTGGTAAAGGGTTGCTCTACGGATAACTCTGGAAAAGGACTGCTGTTCAAGGTGCCAAAGAAGAATCGTGTGCGGGACGAGTGGGCCCACAATCTCTGGATGCATCCGATAGAGCTGATGGGAGAGCACTACATCTGTGATCGACACTTCAAGGCGCACTGCGTGAACGAACACAAACTGCTGCACGCGGGCTCAGTGCCAACCCTTCACCTGGG
GCATAATGAACCGCTGGAACTGCTGCCCAATCCCCAGACCTTCCAGGAGTGCCCCGAGAAGTGCGAGTGCTGTGTGCCCGGCTGTGGACGCACCAACCGGAAGGAGGAGGATCTGCAGTTTAGCAAATTTCCCAAGTGGCGAGTGCTGTATGAAAAGTGGCTGCACAACTTCCGCCTCGAAGTGCCCAAGGAGCACCGCATTGGGGCGCTGCGAGTGTGCCACATGCACTTTGAGGAGAACTGTTTTGATGGCCAGAGCGTGCGCAGGGGAGCCCTGCCCACCCTGGAGCTGGGACACTCGCATCCAGACATTTATCGCACGGACAAGGGATCGCTATGGAAAAAAGTTCACAAGCGATTCAGTGACTGCTGCTATCCCGATTGCTACGAGGAGTGCCAAAAGGCCAACACCAATCGCATGGTCTACGATCTGCCCAGCGATGGGCCATTGCGAGAGTCCTGGCAGCAGCACATGGGCATTCCTGGCACCGGTGAGGATAGCTCCTCGGCACTAAAGCTCTGTGCCCTGCACTACATCATGCTCTACGAGCACAGCGAACAGAGCTTCCCCGAGCACGGACCCAATCTTCTGCTCGACAGGAACTACGAGCACGCCCGCCAGTTGGCGTATCTCCGACGCTTCATGTGTGCCGTGCAAGGGTGTCGCCATCTGCAGCAGCGGGATGGGGGTCTGATGCACGGCATACCCAGGCGGAGGGAGATCCTTCGGATGTGGGTGGAGAACGCCCAGCTGCGGCTGAACGAGCACGAAATTTACATGACCAAACTGTGCAGCAAACACTTTGAGGCCCACTGCCTGTTCGAGGGAAAAAAATGCTATCCCTGGAGTGTGCCAACGCTCCATCTGCCAGAGCTGCAGCCTGGCCAGGTGCTCCATCAGAATCCCACCAAGGAGGAGTGGCAGCAAATGAAACAGAGAATGACAATGGACGAGCAGACGCTGAAGGCGGAGCAGCAGGTAGATGGATTGCTAGTGGAACCTTACATCAAGATGGAACCCCACGACGACGAGTCACAAACGGAATCGGAATTGCTGATAAATGAGAGCACGCTGGATTCTCAAGAAGAGGACTCTCCAGCCCATGAGCCCATGGAAATGCCCGCCCTGGAGGTGCTCCTGGAGGTGGGCCATGTTGAGCGGCTGGATAGCTACGAGAAGAAGGAAGGCTCTTCGGATACTCCGGCCATCACGTATGCTCCCCCCAAACGTTTCCGCCATCAGTACACTGCGCACAAGTGCAGTGTCGAGGGCTGTGGCGTGTCGCTGGAGGACCTTGACGGGAATCTGAAGCTACATAAGCTACCCAGCTCCACGGAGGCGACCAGGAAGTGGCTGTACAACATTCAGGTGGATATAGAGGATAAATGGCGGATACGCGTCTGCAGCCATCACTTCGACAGGCAATGCCTCAATGGTTCGAGGCTCAGGAGGGGATCGATGCCCACTCTGCTGCTGGGGCCGCGTGTTCCAGAGATTATCCATCAGAATGAGTTTGCGCAGCTGCAATTGGACGATGCGCCAGCACAGAATGGCCTTCCATCGGAGCGAACCATTGGAAAGGTTGTGCAGCTATGCGTGCCACGTCCCTCGCCGCCGCGCAAGTCCAGCAAATTCTGCCAGATCGAGGGATGTCCGAATCATTTGACCAGCGAGAATATGACACTCCACAAGTTCCCGCACTCGTCGTTGATCTGCACCAAGTGGCAGCACAACACACAGGTGCCGTTCGATCCGGAGTACCGCTGGCGCTATCGCATCTGCAGCGCCCACTTCCATCCCGTGTGCATGGCCAATATGCGGCTGCTGCATGGCAGTGTGCCCACCCTGAAGCTGGGTCCACGGGCGCCCGCCGAGCTCTTTGACAGCGACTTTGAGGCCATAAACATAAAGATTGAAAAAATGGAGAAGATGGAGAGGAAATCTGAGGCTCATCGGAGCACCACTGGAGGAGATAGGTATCCCACCATGCAGGACATGG
GGGAGAGGAAGTTCAAGACTGAGGAGAAGATGGAGGATGGAATGGACGAGGAGGATGACATGCTCTACCTGGAGCCAGAGATGCAGCTATACGAGGATCAGGAAGAACAGCAACAGAAGCCAAAGGTCAATCTGGGAGTCTCCAATGGCGGCTGGAAAACGGAACTCCGTTTGCCATCGAAGGGTAGGGTGGCCTTCAATCCGGTGAGATCTGGCTACGACAAGTGCTCGCTGATGCATTGTCAGCGCCAGAGGTCGAAGCACGGCGTCCACATCTACAAGTTCCCCCGATCGCAGGAGCACCAGCAGCGATGGATGCACAACTTGCGCATCCGCTACGACGAGAAGCGGCCCTGGAAGTTTATGGTCTGCAGCGTGCACTTTGAACCGCATTGCATACGGCTGCGGAAGCTGCGGCCCTGGGCAGTGCCCACGCTGGAGTTGGGAGACAATGTCCCCGAGGACATCTATACGAACGAGCAGTGCCAGATGTTTGCCAGTGGACAGGGAGGGGAGATCAATGGCATCGAAAGCGATGAGGCGGAGAGCGATGGGAATGACGAGGAGGATGGCCTGCAGGAGGACGAGGATGAGGAGACAGACGATCAGGAGCCCACTGCTAAGAAGCGTCGTCGTTCGCGTTTGGATGCAGTCTGGCCTCCCGGCCAGGTGCCACCGTGGAAGGTGAAGCAATGCTGTCTTCCCTACTGCCGCAGTCCTCGCGGCGAGGGCATCAAGCTGTTTCGACTGCCCAACAAAGTCAACTCCATCCGCAACTGGGAGCTGGCCACGGGTATGAAGTTCAAGGAGTCGCAGCGCAACACGAGACTCATCTGCAGCCGCCACTTTGAGCCGGAGCTGATCGGAGTGCGTCGTCTCATGCGCAATGCCATTCCCACCAGGCATCTGGGTCCCACAGGCGATGTTAAGCCACTGGTGGCTCCACCGACAGCTGGTCCCAAATGCTGCATGGCAGATTGTGCCTATGATGTGGCAGATGTAAAGCTGCACAAGTTTCCCAGCAATCCCAAACTATTAAGGGAGTGGTGCCAGGCATTGAGGGTCACGGATATGCAGAGGTATCGCGGCAAGCACATTTGCTCCGCCCATCTGCCCGTGCACAAGGCCGTGCAATGCATTGTTTGTGGCGCGGACAAGGCCCCCCTGCTGCCGATGCTTAATTTTCCCGCTAACCGGAATCAGCGCGCCAAATGGTGCTACAATCTGAAGATAGAAACAATACCCAAATGGGACATATCCAAGCACATTTGCTGTAAACACTTTGAGCCATATTGCTTTGGAGAGGCGGGTCTCCTAAAGCCGGAGGCGGCGCCCACACTGCATTTGAATCACAATGATACCAACATATTCCTTAACGATTGTGCCATAAATCCTGCCTTCAGTGGAGGAGCAATGCAGGTGAAGGATGAGCCCATGGACAATCAGGTCCTGTCGTTGATGTAG
Protein Sequence
MSQHNPHAHPHYHHHPLHHHQTQHHHHQQQQLQQQQQQQQQQAQMPQHSNWYSHVASYPPPPPHHHATSTSAFAATSTPCKGTDYGAGSTHGYYAAAAAAAGGGLNVNAPPNPMAPPPAPDMIIKSEPMDEHPYKSNYIDDNTPFADFNKFNEFSGDMLSPKVELTVKDETYGKTSSSSFARRKAQQQQQQQTTDHSAESLPICQRCKEVFFKKQSYLRHVAESSCGIQEYDFKCNICPMSFMSTEELQRHKHLHRADKFFCHKYCGKHFDTIAECESHEYMQHEYESFVCNMCSGTFATREQLYAHLPQHKFQQRYDCPICRLWYQTAGELHEHRLAAPYFCGKYYTNQQQQQLATNQGNYKLQDCHMATMEMPTAPLHKATPSNASALPATAALSSLLQQRQANADGAAAMFAAASSTSASLKSEVSVKLERSYSNSTSESSYSHQENSSYNNAYGSDSSIHGGALAGPQAHSSTLDDSEDALCCVPLCGVRKSTSPTLQFFTFPKDDKYLNQWLHNLKMFHIPAASYATFRICSMHFPKRCINRYSLCYWAVPTFNLGHDDVANLYQNRELTNTFTTGEVARCSMPHCTSQRGESNLKFYNFPKDIKSLIKWCQNARLPVQAKEPRHFCSRHFEDRCIGKFRLKPWAVPTLHLGAQYGKIHDNPKNLYVEEKRCCLNFCRRSRSSDFNMSLYRFPRDEVLLRRWCYNLRLDPGVYRGKNHKICSAHFIKEALGLRKLSPGAVPTLHLGHNDTFNIYENELWPPPSPTGQHGGNHQLLQQQQTSQQLSHHHSSLQQQQHQPMHSKSYQRHSAASTSSSASSASHYVDPEMSASYLSLSAGGSSGGMNASDCMDVCCVPSCESKRHNSENITFHTIPRRPEQMRKWCHNLKIPEDKMHKGMRICSLHFEPYCIGGCMRPFAVPTLHLGHEDEDIHRNPDVIKKLNIRETCCVAVCKRNRDRDHANLHRFPSNVALLTKWCANLQRTVPDGSKLFNDAICEVHFEDRCLRNKRLEKWAVPTLILGHEDIAYPLPTPEQVAEFYARPTAPNNGEEQGECCVETCKRNPSVDDIKLYRPPEDASVLAKWAHNLQTEAAVLTNMRICNLHFEAHCIGKRMRPWAIPTLNLAGNIENLFENPEHSMLYKRRTHLKQKVPVTKPTWVPRCCLPHCRKVRALHNVQLYRFPKLNRSTLAKWAHNLQVPQVGSAQRRVCSAHFEPHVLSKKCPVPLAVPTLDLNSPAGHKIYQNPAKLKANKLCLQRVCIVESCRKTRAQGVQLFRLPHSPTQLRKWMHNIRTRPRAAMRSQYRVCSRHFETHSFNGRRLSAGAIPTLELGHDDDDIFPNEAQAFADEHCAVEGCESSKEQPEVRLFRFPTDDDDMLWKWCNNLKMNPVDCIGVRICNKHFDADCIGPKHLYKWAIPTMLLGHDDSQIELILNPKPEERYVDPVFKCIVPTCGKTRRFDEVQMNSFPKDAELFQRWRHNLRLEHLCFKEREKYKICNAHFEDICIGKTRLNIGSIPTLELGHEETEDLFKVNPEDLQSNLFGRPRRLLRGYNNVTIKQEVPETEGQDIKPDIGANFTQVKVKKSLGDIKCCVHTCGRSRLEHGARLFPFPTGKQQHLKWRHNLRLEPDEVDKTTRVCSAHFNRRCIDGKHLRGWAMPTQQLGHQEQPIYENPKNIPGFFTPTCALGHCRKRRSIDNDLRTYRYPRSEDLLEKWRANLRLSLDQCRGRICADHFEPQVRGKLKLKTGAVPTLKLGHEEALMYDNEAIKAGVAEEEAGSPAPSPLVIPKTEVLDEEEREEDEEEEENPEEEQQETHDEEKDEHEDDTPEGAEQLGDEDDDEDPGNYFDPLELVETYAEHPSDDDNSHEPADDAREEDEDDEEEPETLLPDTPPQPAAAVLRVPKPWERPVAVVPRREKRPNNVDPICCLKHCRKERSAMYLLSTFGFPKDQQLLLKWCANLQMNPSSCIGRVCVEHFQSEVLGTRKLKQNAVPTLNVGH
DVPLRYTCNGQEMPQAAAAVTAVTTSSFPDEMPQHSVFRLWSLKHCRKRKLSESPAPAPAAIKEEEQMQTQIQMEMETKPKMCCLPSCGNVEGYGPGGHFQPLPHDQRVLKKWQHNLRLSSINPDSDLRGLRLCMEHFEPHQIENGAPVRMAVPTLKLGHSSPNIFKNSESTLPGCLWPSCPPNRKICYDLPDNEAVRAAWLSYVRLPLDSPGRLCGLHFLQLYEEVDLPGDVPETVLECLQDTYDQASISLKFQCSVQGCGSKYKQDTHLAKLPRDPELLAKWLHNTKISYDRSLHFSYRICLLHFEAFCLNGVRPQTWAIPTLQLNHDEEIYQNTVKQEIHESPLKQEKPHCSSLPSLNLSIPLHIKTEQGPVQRSRGTWGTSSQSSPCLSASSSPRTKNRICCIPDCGENARTQRLFRFPTAEPALLKWLVNTQQKPGLVDIQSLFVCQLHFEADAINQTQLSSWAVPTLRLGHDGHVIPNAKHNGNIANSQETEQAMEFIRANYCSVLSCFQPKADGVRFYKYPSDIAMVRKWATNLKHRSMQASSHGFLVCQSHFAADCFDPETGDLREDAVPVCTVAGSVKTEVVLLRCLVKGCSTDNSGKGLLFKVPKKNRVRDEWAHNLWMHPIELMGEHYICDRHFKAHCVNEHKLLHAGSVPTLHLGHNEPLELLPNPQTFQECPEKCECCVPGCGRTNRKEEDLQFSKFPKWRVLYEKWLHNFRLEVPKEHRIGALRVCHMHFEENCFDGQSVRRGALPTLELGHSHPDIYRTDKGSLWKKVHKRFSDCCYPDCYEECQKANTNRMVYDLPSDGPLRESWQQHMGIPGTGEDSSSALKLCALHYIMLYEHSEQSFPEHGPNLLLDRNYEHARQLAYLRRFMCAVQGCRHLQQRDGGLMHGIPRRREILRMWVENAQLRLNEHEIYMTKLCSKHFEAHCLFEGKKCYPWSVPTLHLPELQPGQVLHQNPTKEEWQQMKQRMTMDEQTLKAEQQVDGLLVEPYIKMEPHDDESQTESELLINESTLDSQEEDSPAHEPMEMPALEVLLEVGHVERLDSYEKKEGSSDTPAITYAPPKRFRHQYTAHKCSVEGCGVSLEDLDGNLKLHKLPSSTEATRKWLYNIQVDIEDKWRIRVCSHHFDRQCLNGSRLRRGSMPTLLLGPRVPEIIHQNEFAQLQLDDAPAQNGLPSERTIGKVVQLCVPRPSPPRKSSKFCQIEGCPNHLTSENMTLHKFPHSSLICTKWQHNTQVPFDPEYRWRYRICSAHFHPVCMANMRLLHGSVPTLKLGPRAPAELFDSDFEAINIKIEKMEKMERKSEAHRSTTGGDRYPTMQDMGERKFKTEEKMEDGMDEEDDMLYLEPEMQLYEDQEEQQQKPKVNLGVSNGGWKTELRLPSKGRVAFNPVRSGYDKCSLMHCQRQRSKHGVHIYKFPRSQEHQQRWMHNLRIRYDEKRPWKFMVCSVHFEPHCIRLRKLRPWAVPTLELGDNVPEDIYTNEQCQMFASGQGGEINGIESDEAESDGNDEEDGLQEDEDEETDDQEPTAKKRRRSRLDAVWPPGQVPPWKVKQCCLPYCRSPRGEGIKLFRLPNKVNSIRNWELATGMKFKESQRNTRLICSRHFEPELIGVRRLMRNAIPTRHLGPTGDVKPLVAPPTAGPKCCMADCAYDVADVKLHKFPSNPKLLREWCQALRVTDMQRYRGKHICSAHLPVHKAVQCIVCGADKAPLLPMLNFPANRNQRAKWCYNLKIETIPKWDISKHICCKHFEPYCFGEAGLLKPEAAPTLHLNHNDTNIFLNDCAINPAFSGGAMQVKDEPMDNQVLSLM

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2