Basic Information

Gene Symbol
-
Assembly
GCA_036172665.1
Location
CM069876.1:45051499-45070524[-]
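The location string above can be split into its parts programmatically. The following is a minimal sketch, not part of the record itself: the function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which the record does not state.

```python
import re

def parse_location(loc):
    """Split 'SEQID:START-END[STRAND]' into its parts (hypothetical helper)."""
    m = re.fullmatch(r"(.+):(\d+)-(\d+)\[([+-])\]", loc)
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "seqid": m.group(1),
        "start": start,
        "end": end,
        "strand": m.group(4),
        # Assumption: 1-based inclusive coordinates, so the span is end - start + 1.
        "length": end - start + 1,
    }

loc = parse_location("CM069876.1:45051499-45070524[-]")
print(loc)
```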

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
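As an illustration of the 5'-GAGAG-3' consensus described above, the sketch below scans a DNA string for exact matches on both strands. It is a simplified example under stated assumptions: the function names are hypothetical, only exact matches are reported, and the demo sequences are toy inputs rather than data from this record.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def find_gaga_sites(seq, motif="GAGAG"):
    """Return (0-based position, strand) for each exact motif match."""
    seq = seq.upper()
    rc_motif = reverse_complement(motif)  # CTCTC for the default motif
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif:
            hits.append((i, "-"))
    return hits

# Toy sequences, not taken from this record.
forward_hits = find_gaga_sites("TTGAGAGAA")
reverse_hits = find_gaga_sites("AACTCTCTT")
print(forward_hits, reverse_hits)
```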
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 45 0.0023 1.4 10.1 0.0 17 44 34 58 28 65 0.82
2 45 1.6 9.3e+02 1.0 0.1 22 52 64 94 61 96 0.83
3 45 2.3 1.4e+03 0.5 0.1 17 44 87 114 75 123 0.72
4 45 2.4 1.4e+03 0.4 0.2 21 48 147 173 138 177 0.89
5 45 0.052 31 5.7 0.0 21 44 203 223 194 228 0.83
6 45 0.025 15 6.7 0.0 21 52 228 259 225 261 0.88
7 45 5.3 3.2e+03 -0.7 0.9 18 44 308 334 303 338 0.75
8 45 0.024 14 6.8 0.1 21 44 339 362 335 367 0.90
9 45 0.056 33 5.7 0.0 21 46 367 391 363 395 0.90
10 45 0.51 3e+02 2.6 0.2 21 52 395 425 392 427 0.90
11 45 0.85 5.1e+02 1.9 0.0 15 51 443 479 439 482 0.79
12 45 0.11 68 4.7 0.1 21 49 477 504 475 510 0.88
13 45 0.0081 4.8 8.3 0.0 21 51 533 560 529 562 0.80
14 45 0.37 2.2e+02 3.0 0.2 21 46 558 583 556 589 0.85
15 45 0.33 1.9e+02 3.2 0.4 20 45 585 610 578 617 0.78
16 45 0.16 94 4.2 0.0 21 44 614 637 611 645 0.86
17 45 1.3 7.6e+02 1.3 0.1 21 45 642 666 639 675 0.81
18 45 3.7 2.2e+03 -0.2 0.6 20 48 703 730 690 734 0.83
19 45 0.55 3.3e+02 2.5 0.4 18 52 729 762 726 765 0.84
20 45 0.002 1.2 10.3 0.1 21 44 787 810 771 818 0.90
21 45 7.7 4.5e+03 -1.2 0.0 21 44 815 838 812 845 0.84
22 45 1.1 6.6e+02 1.5 0.1 26 46 858 878 849 884 0.84
23 45 0.61 3.6e+02 2.3 0.2 21 43 881 903 878 909 0.84
24 45 0.076 45 5.2 0.0 21 44 909 929 905 938 0.79
25 45 2.7 1.6e+03 0.3 0.1 21 43 934 956 931 961 0.84
26 45 0.028 17 6.6 0.0 21 44 962 985 959 994 0.87
27 45 9.3 5.5e+03 -1.5 0.0 18 44 1042 1068 1031 1073 0.80
28 45 0.0003 0.18 12.9 0.1 21 47 1073 1099 1070 1106 0.86
29 45 3.5 2.1e+03 -0.1 0.1 21 46 1101 1126 1099 1132 0.85
30 45 1.7 1e+03 0.9 0.0 21 34 1185 1198 1181 1213 0.74
31 45 4 2.4e+03 -0.3 0.1 20 43 1209 1232 1202 1238 0.84
32 45 0.22 1.3e+02 3.8 0.1 21 44 1238 1261 1230 1265 0.87
33 45 1.5 9e+02 1.1 0.7 18 44 1290 1316 1285 1321 0.84
34 45 0.0003 0.18 12.9 0.1 21 44 1321 1344 1317 1346 0.93
35 45 1.7 1e+03 0.9 0.0 20 47 1354 1381 1346 1386 0.75
36 45 1.3 8e+02 1.2 0.0 21 48 1383 1409 1381 1413 0.87
37 45 0.18 1.1e+02 4.0 0.0 18 44 1436 1459 1426 1467 0.81
38 45 4.7 2.8e+03 -0.5 0.0 21 44 1464 1487 1461 1491 0.90
39 45 0.098 58 4.9 0.1 21 47 1492 1518 1485 1526 0.86
40 45 5.1 3.1e+03 -0.6 0.4 17 44 1544 1571 1532 1579 0.76
41 45 0.62 3.7e+02 2.3 0.1 21 44 1630 1653 1626 1662 0.84
42 45 4.7 2.8e+03 -0.5 0.1 22 44 1659 1681 1654 1685 0.88
43 45 1.1 6.4e+02 1.5 0.9 18 44 1710 1736 1700 1740 0.84
44 45 0.04 24 6.1 0.1 21 47 1741 1767 1737 1775 0.86
45 45 0.1 62 4.8 1.1 21 49 1797 1824 1793 1828 0.87
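The domain rows above are whitespace-separated with 13 fields each, so they can be loaded into typed records for filtering. This is a minimal sketch: the `DomainHit` type and `parse_domain_row` function are hypothetical names, and the i-Evalue cutoff of 10 is an arbitrary value chosen for illustration, not a threshold used by the database.

```python
from typing import NamedTuple

class DomainHit(NamedTuple):
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float

def parse_domain_row(line):
    """Parse one whitespace-separated domain row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )

# Three rows copied from the table above.
rows = [
    "1 45 0.0023 1.4 10.1 0.0 17 44 34 58 28 65 0.82",
    "2 45 1.6 9.3e+02 1.0 0.1 22 52 64 94 61 96 0.83",
    "28 45 0.0003 0.18 12.9 0.1 21 47 1073 1099 1070 1106 0.86",
]
hits = [parse_domain_row(r) for r in rows]

# Assumption: an illustrative cutoff; choose one appropriate to your analysis.
I_EVALUE_CUTOFF = 10.0
kept = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]
for h in kept:
    print(h.index, h.i_evalue, h.ali_from, h.ali_to)
```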

Sequence Information

Coding Sequence
ATGAATCTGCACAAATTACGACACGCCAATGAAAAGCGATTCCGCTGTACCCTTTGCGAATACAAATGCTTCGGAGCAGCCCAATTAAAACGGCATGTCTCAACGCACGACAGCGAAAAGCCGTTCACCTGCGGAATTTGCAACAAGAAAGTCAGAAACTTGAGACCCCAcatgttaatacacaccggcgaaaagACGTTCAGTTGTaatctctgcgattacaagtgtcgGCACGGCTCAAAGTTGAAACGTCACATGTcaatacacaccggcgaaaagccATTcacttgcgatctttgcgactACAAGTGCGGAGAGACCTCGACGTTAAAACACCACATgttaacacacaccggcgaaaagccgttcggttgcgatctttgcgatttcaGGTGTCGGGGAAACGCAACGCTGAAACAGCACAACTTAATACACACCGGTGAAAAGCCACTGAGTTGCGATCTTTGCAATTTCAAATGCCGACACGTCCAAAGTTTGAAACTGCACAAGTTAAGGCACGCCGACGAGAAGCGGTTCCGTTGTACTCTTTGCGAATACAAATGCCTCAAAGCATCAGAATTAGAACGGCACGTGTTAACGCACGACGACGAAAAGCCGTTCACCTGCGGAATTTGCGATAAGAAAGTCAGAAACTTGAGACCCCAcatgttaatacacaccggcgaaaagccgTTTAGCTGcggtctttgcgattacaagtgtcgAGGCAAGTCGAATTTGAAACGGCACATGTCAGTACACACCGGCGgaaagccgttcagttgcgatctttgTAATTACGAGTGTCGAGGCATGTCAGTCTTGAAACAGCACATGTCGACACACACCGGAAACTCGTTGAGTTGCTACgcttgcgattacaagtgcgatCGCTCCGACGCTATGAAGGCGCACAAATTAAGACATGCCGACGAGAAGCGATTCAGTTGTGCTCTTTGCGAATACAGATGCGTCCGATCGTCACATTTAAAACGGCACATATTAACACACAACAACGAAAAGCCGTTCACCTGCGGGGTTTGCGATAGGAAATTCAGAGAGCTAGCGCACTTGAGACGTCACACGTTAATACACACCGGGGAGAAGCCATTcacttgcgatctttgcgatttcaGTTGTCGAGAGACCTCGTCGTTGAAGAAACACTTGTtgaaacacaccggcgagaagccgttaagttgcgatctttgcgattataaatgccgacaccACCAAAGTTTGAAACTGCACATGTCAAGGCACACCGGAAAACCAACGACGGACTGTACCGGTCACACCGAAGCaattaatttgtgCAAACCGACTCCTAATTCTAAACTTGAAGACAAAGAGAAAGTCACGTGCCGCATATGTGATTCTAGGTTTACAACTAAAGCATATTTGAAGAAGCACTTGCTgatacataccggcgagaagcccTACATTTGCGacatttgcgattataaatgtcgagaaTCCTCCGGTCTAAAATCGCACAAATTAAggcacaccaacgagaagcccTTCCGTTGTACTCTTTGCGAATACAAAAGCTTCGCAGCGTCACAACTAAAACGGCACGTGTTAACGCACAACAAGGAAAAGCCGTTCACCTGCGGAATTTGCAACAAGAAAATCAGAAACTTGAGGCCCCACATGTTGCTGCACACCGACGAAAAGCCGTTCACTTgtaatctttgcgattacaagtgccgagACCGCACGATATTGAGACAACACACGTCAGtacacaccggcgaaaagccgATCAGTTGCGGtctctgcgattacaagtgcctcAGGGCATCGCAATTAAAACGGCATATGTTAACGCACAACAACGAAAAGCCGTTTAACTGTGAGATTTGCAATAAGAAATTCAGTCAACTCGGGAACTTAAGACTTCACGCGTTagtgcacaccggcgagaagccgtttagTTGCGGCCTTTGTGATTACAAGTGTCGAAGAAAATCGTGGTTGAAACGGCACacgttaat
acacaccggcgagaagccgttcgcttgCGATTTTTGCGATTACATGTGTCGGGACAAGTCAATGTTGAAAAACCACGCGTTAACGCACGGCGAAAAGTCGTTACGTACCGGCGAGAAGCGTTTCAGTTGTGacgtttgcgattacaagtgcagGCGTTCCGACTGTATGAGGCTGCACAAATTAAGACATACCAACGAGAAGCGCTTCAGTTGCACTCTTTGCGAATACAAGAGTCTCCAGTCAACGCATTTAAAACGGCATATGTTGACAcacaacgagaagccgttcggttgcgaTCTCTGCGACTACAGGAGTCGAGAGACCTCGACGTTGAAACACCACATGTTGAAACACActggcgagaagccgttcggttgcgaAGTCTGCGATTACGTGTGCCGAGGGACCAGATCGTTGAAACGTCACATGTTAGGACACACAGGCGAAAAGCCtttcagttgcgatctttgcggTTTCAAGTGCCGAGAGGAATCCATgctgaaacaacacgtgttaattcatttGTACAAGCGGACTCTTGAATCTAAACTCGAAGGGAAAGGACATGTCGTGTGCCGCATATGTGATTCCACATTCAGAACTAAAGCGTACTTGAAAAGACACGTGCTgatacataccggcgagaaacccttcaactgtgatctctgcgattacaaatgcatCCGAGCATCGCAGCTAAAACGCCACGCGTTAACACACAACGTCGAAAAGCCGTACACCTGCGGAATCTGCGACAAGAAAGTCAGAGACTTGAAGCGCCACatgttaacacacaccgacgaaaagCCATTcacttgcgatctttgcgattacaagtgtcgAGAAAACCCGATGCTGAAACAGCACaagttaatacacaccggcgaaaagccgTTTGGCTGCGATTTCTGCGACTACAAGTGCCGAGGCAGCTCAAATTTGAAACGGCACGTgttaacacacaccggcgaaaagccgTTCGGTTGCGACCTATGCGACTACAAGTGTCGAGACAAGCCGATGTTGAAACAGCACATGCTAATACACACCGGAAACTCGTTGAGTTGTGACgtctgcgattacaagtgcgaACGTCCCGACGCCATGAAAACGCACAAGTTAAGACACGCCAACGAGAAGCGGTTCAGCTGCGCACTTTGCGAATACAAGAGTCTCGAATCGACGGCATTAAAACGTCACATGTTAACCCACAACGACGAAAAGCCGTTCACCTGTGGGATCTGCGATTATAAGTTCAGGCAACTCGAACACCTGAGACGCCACGTgttgatacacaccggcgagaagccgtacagttgcgacctttgcgatttcAGGTGTCGAGAGAACGCGACGctgaagcaacacgtgttaatacacaccggcgacaagccgttcagttgtgacctctgcgatttCAAGTGCCGGCACGTTCAAAGCTTGAAACTGCACAGGTTGAGGCACACCAACGAAAAGCGATTGCGCTGTGCTCTTTGCGAATACAGGTGCGTCAGGGCGTCGCAGTTAAAACAGCACGTGTTGAAGCACAGCGACGAAAATCCGTTCACCTGCGGAATTTGCGATAAGAAAGTCAAATACTTGAGGCCCCACATGTTGACACACAAcagcgagaaaccgttcggttgcgatctttgcgattacaagtgtcgAAACAACGCGCTTTTGAAGCAGCATAGgctaacgcacaccggcgaTAAACCGTTCAGCTGCGATCTTTGTGATTACGTGTGTCGAGACAGGTCAGTGTTGAAGCGGCACGTGTTAATACACACAGGAGAGTTGTCGAGTTGTGacgtttgcgattacaagtgcgaCAGTTCCGAAGCTATGAAGATGCACAAATTACGACATGCCGACGAGAAGCGATTCAGTTGTAACCTTTGCGAATACAGATGTCTCCAGTCATCGCAATTAAAACGGCACATGTTGACGCACAACAACGAAAAGCCGTTCACCTGTGGGGTTTGCTACAAGAAAT
TCAGAGAGCTCGCGCACTTGCGACGGCAcatgtgCAAAATCTCTCCTGAATCTCAGCTTGAGGAGAAGGGGAAAGTCACGTGCCGCATATGCGATTCGAAATTCACAACCAAGGCACACTTGAAGAAACACTTGCTgatacataccggcgagaaaccgttcaacTGTAACCTCTGCGATTACAAGACTCGAGATTCCTCCACCCTGAAGACGCACAAAATAAggcacaccaacgagaagcgATTCCGTTGCGCCCTCTGCGAATACAAATGCTTCAGGCCGTCCCTATTAAAACGGCACATGATAAGgcacaccggcgaaaagccgTTCGCCTGCGGCATCTGCAACAAGAAAATCCGAAACCTGAAAGCCCACATGCTAATACACACCGACGAAaggccgttcagttgtgatctctgcgatcaCAAGTGCCGAACCGGCTCAAAATTGAAACAGCACATGACgacgcacaccggcgaaaagccgttcacttgcgatctttgcgattcgAGGTGCCGAGATAACGCAACGCTGAAACAACACATgttaatacacaccgacgaaaaGCCGTTCGGTTGCGACCTGTGCGATTTCAAAAGCCGACACCTCCAAAGTTTGAAACTGCACAAGTTAAGGCACACCAGCGAAAAGCGGTTCCGCTGTACTATCTGCGAATACAGATGTCTCAGAGCGACAGAATTAAAACAGCACGTGTTAAAGCACGATGACGGAAAACCGTTCAAAACATGCGCAATTTGCGATAAGAAAGTCAAACACCTGAGACCTCACATGTTAATACACACCGAAGAAAAGCCGTTCggctgcgatctttgcgattacaagtgtcgCAGCGGCTCGAAGTTGAAGCAGCACGTgctaacgcacaccggcgaaaagccgTTCACCTGCGATCTTTGTGATTACGAGTGTCGAGGCAGGTCAATGTTGAAACGGCACGCGCTAACGCACACCGGCCAAAAGCCCGTCGGTTGCGGACTTTGTGATTACAAATGTCGAGACAGGTCGAAGCTGAACCGGCACATGTTGACACACACCGGAAATTCGTTGAGTTGTGACgcttgcgattacaagtgcgtGCGTCCTGATGCCATGAGGACGCACAAATTAAGACATACCAACGAGAAGCGATTCAGTTGTACTCTTTGCGAATACCGATGTTTGCAATCGTCGCAATTAAAACGGCACATGTTGACACACAACGACGAAAAGCCGTTCACCTGCGCGGTTTGCGATAAGAAGTTCAAGGAACCGGCGCCCTTGAGACGGCACGTGTTAATACATACCGACGAGAAGCCATTCggctgcgatctttgcgatttcaGGAGCCGAGATAACTCGTCGCTGAAACAGCACgtgttaatacacaccggcgagaagccgttaaGTTGCAGTATTTGCGATTATAGATGCCGACACCGCCGAAGTTTGAAGATGCACATGTCAAGGCACGCCGGAAAATCAACGACGCGCTCTGCCGCCCACGAAAGCAATTAA
Protein Sequence
MNLHKLRHANEKRFRCTLCEYKCFGAAQLKRHVSTHDSEKPFTCGICNKKVRNLRPHMLIHTGEKTFSCNLCDYKCRHGSKLKRHMSIHTGEKPFTCDLCDYKCGETSTLKHHMLTHTGEKPFGCDLCDFRCRGNATLKQHNLIHTGEKPLSCDLCNFKCRHVQSLKLHKLRHADEKRFRCTLCEYKCLKASELERHVLTHDDEKPFTCGICDKKVRNLRPHMLIHTGEKPFSCGLCDYKCRGKSNLKRHMSVHTGGKPFSCDLCNYECRGMSVLKQHMSTHTGNSLSCYACDYKCDRSDAMKAHKLRHADEKRFSCALCEYRCVRSSHLKRHILTHNNEKPFTCGVCDRKFRELAHLRRHTLIHTGEKPFTCDLCDFSCRETSSLKKHLLKHTGEKPLSCDLCDYKCRHHQSLKLHMSRHTGKPTTDCTGHTEAINLCKPTPNSKLEDKEKVTCRICDSRFTTKAYLKKHLLIHTGEKPYICDICDYKCRESSGLKSHKLRHTNEKPFRCTLCEYKSFAASQLKRHVLTHNKEKPFTCGICNKKIRNLRPHMLLHTDEKPFTCNLCDYKCRDRTILRQHTSVHTGEKPISCGLCDYKCLRASQLKRHMLTHNNEKPFNCEICNKKFSQLGNLRLHALVHTGEKPFSCGLCDYKCRRKSWLKRHTLIHTGEKPFACDFCDYMCRDKSMLKNHALTHGEKSLRTGEKRFSCDVCDYKCRRSDCMRLHKLRHTNEKRFSCTLCEYKSLQSTHLKRHMLTHNEKPFGCDLCDYRSRETSTLKHHMLKHTGEKPFGCEVCDYVCRGTRSLKRHMLGHTGEKPFSCDLCGFKCREESMLKQHVLIHLYKRTLESKLEGKGHVVCRICDSTFRTKAYLKRHVLIHTGEKPFNCDLCDYKCIRASQLKRHALTHNVEKPYTCGICDKKVRDLKRHMLTHTDEKPFTCDLCDYKCRENPMLKQHKLIHTGEKPFGCDFCDYKCRGSSNLKRHVLTHTGEKPFGCDLCDYKCRDKPMLKQHMLIHTGNSLSCDVCDYKCERPDAMKTHKLRHANEKRFSCALCEYKSLESTALKRHMLTHNDEKPFTCGICDYKFRQLEHLRRHVLIHTGEKPYSCDLCDFRCRENATLKQHVLIHTGDKPFSCDLCDFKCRHVQSLKLHRLRHTNEKRLRCALCEYRCVRASQLKQHVLKHSDENPFTCGICDKKVKYLRPHMLTHNSEKPFGCDLCDYKCRNNALLKQHRLTHTGDKPFSCDLCDYVCRDRSVLKRHVLIHTGELSSCDVCDYKCDSSEAMKMHKLRHADEKRFSCNLCEYRCLQSSQLKRHMLTHNNEKPFTCGVCYKKFRELAHLRRHMCKISPESQLEEKGKVTCRICDSKFTTKAHLKKHLLIHTGEKPFNCNLCDYKTRDSSTLKTHKIRHTNEKRFRCALCEYKCFRPSLLKRHMIRHTGEKPFACGICNKKIRNLKAHMLIHTDERPFSCDLCDHKCRTGSKLKQHMTTHTGEKPFTCDLCDSRCRDNATLKQHMLIHTDEKPFGCDLCDFKSRHLQSLKLHKLRHTSEKRFRCTICEYRCLRATELKQHVLKHDDGKPFKTCAICDKKVKHLRPHMLIHTEEKPFGCDLCDYKCRSGSKLKQHVLTHTGEKPFTCDLCDYECRGRSMLKRHALTHTGQKPVGCGLCDYKCRDRSKLNRHMLTHTGNSLSCDACDYKCVRPDAMRTHKLRHTNEKRFSCTLCEYRCLQSSQLKRHMLTHNDEKPFTCAVCDKKFKEPAPLRRHVLIHTDEKPFGCDLCDFRSRDNSSLKQHVLIHTGEKPLSCSICDYRCRHRRSLKMHMSRHAGKSTTRSAAHESN
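One way to check that the coding and protein sequences agree is to translate the coding sequence and compare. The sketch below assumes the standard genetic code (the record does not say which translation table was used) and, for brevity, checks only the first 45 nucleotides against the first 15 residues. The `translate` function is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()  # the record mixes upper- and lower-case bases
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for unrecognised codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Prefixes copied from the Coding Sequence and Protein Sequence fields above.
cds_prefix = "ATGAATCTGCACAAATTACGACACGCCAATGAAAAGCGATTCCGC"
protein_prefix = "MNLHKLRHANEKRFR"
translated = translate(cds_prefix)
print(translated == protein_prefix)
```

The same function can be applied to the full coding sequence once its wrapped lines are joined into a single string.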

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_01258269;
90% Identity
iTF_01258269;
80% Identity
iTF_01258269;