Basic Information

Gene Symbol
-
Assembly
GCA_018703685.1
Location
JAGWEN010001092.1:3082-40153[+]
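The Location value packs contig, coordinates and strand into one string. A minimal sketch of splitting it apart follows, assuming the layout is "contig:start-end[strand]" with 1-based inclusive coordinates. Both the layout and the coordinate convention are inferred from the single value shown here, and the helper name parse_location is hypothetical.

```python
import re

# Assumed layout: contig:start-end[strand], inferred from the Location field.
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text):
    """Split a location string into contig, start, end, strand and span."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start = int(match["start"])
    end = int(match["end"])
    return {
        "contig": match["contig"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        # Span assumes 1-based inclusive coordinates (an assumption).
        "span": end - start + 1,
    }


loc = parse_location("JAGWEN010001092.1:3082-40153[+]")
print(loc)
```

Under the inclusive-coordinate assumption the locus spans 37072 bp of the contig, which is far longer than the coding sequence and so would include introns.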

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind the DNA consensus site 5'-GAGAG-3'. They contain a Cys2-His2 zinc finger core and an N-terminal extension with two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three bases of the consensus (GAG), in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that lies in the major groove and recognises the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A at the fourth position of the consensus sequence [1].
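To make the 5'-GAGAG-3' consensus concrete, the sketch below scans a DNA string for exact matches on both strands. It is a crude illustration only: exact string matching is a stand-in for real binding-site prediction, and the function name find_sites and the toy input sequence are invented for this example.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_sites(seq, motif="GAGAG"):
    """Return sorted (0-based start, strand) pairs for exact motif matches."""
    seq = seq.upper()
    # A site on the minus strand appears as the reverse complement on the plus strand.
    rev_comp = motif.translate(COMPLEMENT)[::-1]
    hits = []
    for strand, pattern in (("+", motif), ("-", rev_comp)):
        # The lookahead lets overlapping occurrences be reported.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


toy = "TTGAGAGAGCCCTCTCAA"  # invented sequence, not taken from this record
sites = find_sites(toy)
print(sites)
```

The toy sequence contains two overlapping plus-strand matches and one minus-strand match (CTCTC).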
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 55 0.32 1.2e+02 2.9 0.0 21 47 82 108 75 113 0.85
2 55 0.24 94 3.3 0.0 21 44 110 133 105 141 0.86
3 55 0.86 3.4e+02 1.5 0.0 21 48 138 165 135 169 0.86
4 55 0.28 1.1e+02 3.1 0.0 21 46 166 191 162 198 0.85
5 55 3.4 1.3e+03 -0.4 0.0 21 43 194 216 190 221 0.86
6 55 0.044 17 5.6 0.0 20 46 221 247 209 253 0.81
7 55 0.017 6.6 7.0 0.0 21 52 250 281 248 282 0.88
8 55 0.35 1.4e+02 2.7 0.0 21 43 278 300 274 306 0.85
9 55 0.055 22 5.3 0.0 18 44 303 329 300 338 0.85
10 55 0.0063 2.5 8.3 0.0 21 44 360 383 349 393 0.86
11 55 1.5 6e+02 0.7 0.0 21 35 388 402 383 406 0.85
12 55 0.41 1.6e+02 2.5 0.0 21 47 404 430 402 435 0.86
13 55 0.38 1.5e+02 2.6 0.0 21 46 432 457 428 463 0.86
14 55 0.25 99 3.2 0.0 19 43 458 482 450 487 0.84
15 55 0.03 12 6.1 0.0 18 44 485 511 480 520 0.85
16 55 7.1 2.8e+03 -1.5 0.0 21 32 542 553 534 557 0.83
17 55 2.2 8.8e+02 0.1 0.0 26 46 575 595 572 603 0.84
18 55 3.1 1.2e+03 -0.3 0.0 21 36 598 613 587 616 0.80
19 55 1.7 6.5e+02 0.6 0.0 27 47 632 652 629 657 0.85
20 55 0.95 3.8e+02 1.3 0.0 21 44 654 677 651 686 0.87
21 55 8 3.1e+03 -1.6 0.0 21 32 682 693 676 696 0.83
22 55 0.62 2.4e+02 1.9 0.1 20 34 743 757 735 770 0.82
23 55 0.0074 2.9 8.1 0.1 21 45 772 796 765 804 0.88
24 55 3.6 1.4e+03 -0.5 0.0 26 43 805 822 801 826 0.90
25 55 0.34 1.3e+02 2.8 0.0 21 45 828 852 820 859 0.83
26 55 0.69 2.7e+02 1.8 0.0 21 43 856 878 850 882 0.91
27 55 0.0097 3.8 7.7 0.0 21 43 884 906 877 911 0.91
28 55 0.0053 2.1 8.6 0.0 21 43 912 934 907 938 0.91
29 55 0.062 24 5.1 0.0 21 43 940 962 935 967 0.90
30 55 0.014 5.6 7.2 0.0 21 44 968 991 963 995 0.90
31 55 0.0061 2.4 8.4 0.0 21 47 996 1022 991 1029 0.86
32 55 0.31 1.2e+02 2.9 0.0 21 44 1043 1066 1039 1070 0.89
33 55 0.17 67 3.7 0.0 21 43 1071 1093 1067 1098 0.90
34 55 0.01 4.1 7.6 0.1 22 52 1100 1130 1094 1131 0.83
35 55 0.042 17 5.7 0.0 23 46 1129 1152 1126 1158 0.85
36 55 0.0014 0.54 10.4 0.1 22 43 1156 1177 1151 1183 0.88
37 55 0.017 6.7 6.9 0.0 21 43 1183 1205 1179 1212 0.91
38 55 0.99 3.9e+02 1.3 0.0 22 43 1212 1233 1208 1238 0.87
39 55 0.24 95 3.2 0.0 22 43 1240 1261 1234 1267 0.88
40 55 2.6 1e+03 -0.1 0.1 22 35 1268 1281 1265 1285 0.86
41 55 0.05 20 5.4 0.0 21 47 1283 1309 1280 1316 0.86
42 55 9.9 3.9e+03 -1.9 0.0 21 33 1311 1323 1309 1325 0.85
43 55 0.062 24 5.1 0.0 20 45 1366 1391 1355 1398 0.81
44 55 0.013 5 7.3 0.0 21 44 1395 1418 1390 1426 0.90
45 55 0.2 80 3.5 0.1 4 35 1415 1446 1414 1460 0.83
46 55 0.69 2.7e+02 1.8 0.0 22 49 1461 1488 1455 1493 0.83
47 55 0.0038 1.5 9.0 0.0 21 47 1516 1542 1506 1548 0.84
48 55 1.1 4.2e+02 1.2 0.1 21 42 1544 1565 1540 1575 0.76
49 55 4.9 1.9e+03 -0.9 0.0 21 33 1572 1584 1564 1587 0.82
50 55 0.083 33 4.7 0.0 18 43 1625 1650 1618 1654 0.88
51 55 0.02 7.9 6.7 0.0 21 43 1656 1678 1651 1687 0.86
52 55 1.6 6.4e+02 0.6 0.1 21 35 1684 1698 1679 1701 0.85
53 55 0.0063 2.5 8.3 0.0 21 46 1700 1725 1698 1731 0.86
54 55 1.1 4.3e+02 1.2 0.1 21 42 1728 1749 1724 1759 0.76
55 55 3.6 1.4e+03 -0.5 0.1 21 33 1756 1768 1748 1771 0.83
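The rows above are whitespace-separated with 13 fields each, so they can be loaded into typed records for filtering. The following is a minimal sketch: the class name DomainHit, its field names, and the i-Evalue cutoff of 3 are illustrative assumptions, and only four rows copied from the table are embedded.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line):
    """Parse one whitespace-separated domain row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


# Rows 1, 10, 28 and 36 copied from the table above.
ROWS = """\
1 55 0.32 1.2e+02 2.9 0.0 21 47 82 108 75 113 0.85
10 55 0.0063 2.5 8.3 0.0 21 44 360 383 349 393 0.86
28 55 0.0053 2.1 8.6 0.0 21 43 912 934 907 938 0.91
36 55 0.0014 0.54 10.4 0.1 22 43 1156 1177 1151 1183 0.88
"""

hits = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = max(hits, key=lambda h: h.score)
# The cutoff of 3 is arbitrary and chosen only for illustration.
kept = [h.index for h in hits if h.i_evalue < 3]
print(best.index, kept)
```

In the full table the lowest i-Evalue is 0.54 (row 36), so any cutoff applied here is a judgment call rather than a standard threshold.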

Sequence Information

Coding Sequence
ATGCCCGATTTGCCCGAAAACTACGAAGCGATCAATATTGTCCCGACAGCGAAATATGACGTCGAAAAACGCAAGCAAGAACTTATAGACAAAATTTCATCATTTATCGACTCATTTGAAAGCAACGAGCAGATCGCAGCTTTCGAAAACATCTTCAACAACATCATCGACCATCAAAATGTAGATTCGACCAATCAAATCGAAGATTTGAGCAGTTTAAACCGTCATAAAAGGATCCATACAGGAGAGAAACCATATAAATGTGATTTCTGCGAAGCTGCATTCACCGCGTCTGGCGATTTGAAAAAGCATACTAGGATCCACACAGGAGAGAAGCCATACAAATGTGACATCTGCGATGCCATGTTTATGTATACAAAAGGTTTGAAAACACACATTTTGACTCATACTGGAGAGAAGCCATATAAATGTGATTTCTGCGAAGCTGCATTCACCGCATCAAATTATTTGAAAATACATATTAGGATCCACACAGGAGAGAAGCCATACAAATGTGACATCTGCGATGCCATGTTTTTGTATACAAGTGGTTTGAAAACACACCTTTCGACTCATACTGGAGAGAAGCCATATAAATGTGATTTCTGCGATGCTACATTCACCGGGTCAGGCAGTTTGAAAACGCATAAAAGGGTCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCCATGTTTATGTATTCAAGTGCTTTAAAAAGCCACATTTGGATTCATACTGGGGAGAAGCCATATAAATGTGATATCTGCAGAGCTACATTCACCGCGTCAGGCAGTTTGAAAAAGCATATAAGGGTCCACAGAGGAGAGAAGCCATATAAATGTGATATATGCGATGCTAAATTTACCAAGTCAAGCGTTTTGAAAATGCATAATAGGAGACACACAGGAGAGAAGCCATATAAATGTGACATCTGTGATGCCATGTTTATGTATTCATATAGTTTGAAAGGCCACATTTTGACTCATACTGGAGAGAAGCCAAGTGATTTCTGCGAAGCTGCATTCACCATGGCAGGCGATTTAAAAAGGCATAATAAGACCCACACAAAAGAGAAGCCATATAAATGTGACATCTGCCATGCTACATTCACCCTGAGTAGCAGTTTAAAAAGGCATAATAGGACCCACACAGGAGAGAAGCCTTTTAAATGTGACATCTGCGATGCAACGATCCATACAGGAGAGAAACCATATAAATGTGATTTCTGCGAAGCTGCATTCACCGCGTCAGGCGATTTGAAAAAGCATACTAGGATCCACACAGGAGAGAAGCCATACAAATGTGATATCTGCGGTGCCATGTTTCTGTATACAAGTGGTTTGAAAACACACCTTTCGACTCATACTGGAGAGAAGCCATATAAATGTGATATATGCGATGCTAAATTCACCAAGTCAAGCGTTTTGAAAATGCATAATAGGAGACACACAGGAGAGAAGCCATATAAATGTGACATCTGTGATGCCATGTTTATGTATTCAAAAAGTTTGAATGGCCACATTTTGACTCATACTGGAGAGAAGCCAAGTGATTTCTGCGAAGCTGCATTCACCATGGCAGGCGATTTAAAAAGGCATAATAAGACCCACACAAAAGAGAAGCCATATAAATGTGACATCTGCGATGCAACGTTCACGCACTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCACATAAATGTGATTTTTGCGAAGCTGCATTCACCTGGTCAGGCGATTTAAAAAGGCATATTAGGACCCACACAGGAGAGAAGCCTTTTAAATGTGACATCTGCGATGCAACGTTCGCACGGTCATGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCATATAAATGTGACATCTGCGATGCCATGTTTGTGTATTCAAATGTATTGAAAAGCCACCTTTTAATTCATACTGGAGAGAAGCCATATAAATGTGATTTCTGCGAAGCTGCATT
CACCATGGAAAACGATTTAAAAAGGCATAATAGGACCCACACAGGAGAGAAGCCTTTTAAATGTGACATCTGCGATGCAACGTTCTCANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCGCATAAATGTGATTTTTGCGAAGCTGCATTCACCGGGTCAGCCATTTTCACCAAGTCAAGCAGTTTGAAAAAGCATAATGTGACCCACAAAGGAGAGAAGCCATATAAATGTGACATCTGCCATGCCATATTTCCGTATTTAAGCATTTTCATAAGGCATAATAGGACCCACACCGGAGAGAAGCCATATAAATGTGATACTTGCGATGCTACATTCTCCCAATCAAACAGTTTGAAAAGGCATAATAGGATCCACACGGGAGTGAAGACATATAAATGTGATATATGCGATGCCACATTCACCGAATCAAGCAGTTTGAAACAGCATAATATGACCCACTCAGGAGAAAAGCCATATAAATGTGATATCTGCCAAGCTGCGTTCTCTGCGTCAGGCTATTTGAAAATTCACAATAGGATCCACACAGGAGAGAAGCCATATAAATGTGAACTTTGCGAAGCTGCATTCTCCATGTCATGCGATTTGAAAAAGCATAATATGACCCACACTGGAGAGAAGCCATATAAATGTGATATCTGTAAAGCTGCATTCACTTGGTCAACCAATTTGAAAAAGCATAATAGGACCCACACTGGAGAGAAGCCATATAAATGTGATATCTGTAAAGCTGCATTCACTTCGTCAACCAATTTGAAAAAGCATAATAGGACCCACACTGGAGAGAAGCCATTTAAATGTGACTTCTGCGAAGCTGCATTCACTTCGTCAACCAATTTGAAAAAGCATAATATGACCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCTACATTCGCCAATTCAAGCAATTTGAAAACGCATAATAGGATCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCCACTTTCACCGAGTCAAGCAGTTTGAAAAGGCATAATAGGATCCACACAGGAGAGAAGCCTTATAAATGTGACATCTGCGAAGCCACTTTCACCGTGACCCACACAGGAGAGAAGCCATATAAATGTGATATCTGCGACGCTACATTTACCATGTTAAACAGTTTGAAAAGGCATAATAGGATCCACACAGGAGAGAAGCCATATAAATGTGATATATGCGATGCTACATTCGCCAAGTCAAGCCATTTGAAAACGCATAATAGGACCCATAGAGGAGAGAAGCGTTATAAATGTAATATCTGTGATGCCACGTTCGACCTCTCAAACAATTTGAGAAAGCATAATAGGATCCACAGAGGAGTGAAGCCATATAAATGTGACATCTGCGATGCTACATTCGCCAAGTCAAGCAATTTGAAAACGCATAATAGGATCCACACAGGAGATAAGCCATATAAATGTGATATCTGCGAAGCTGCATTCTCCACGTCAGGAAATTTGAGAAGGCATAATATGAACCACACAGCAGAGAAGCCATATAAATGTGATATCTGCGAAGCTGCATTCTCCATGTCATGCGATTTGAAAAGGCATAATATGACCCACACAGGTGAGAAGCAATATAAATGTGATATATGCGATGCTACATTCGCCCAGAAAGGCCATTTGACAAGGCATAATAGGACCCACACCAGAAAGAAGCCATATAAATGTGATATCTGCGATGCCACTTTCACCGACTCAAGCAGTTTGAAAAAGCATAATATGATCCACACAGGAGAGGAGCCATATAAATGTGACATCTGCGAGGCTACGATCCACACAGGAGAGAAGCCATATATATGTTATATCTGCGAAGCCACATTCACCGACTCAAGCAGTTTGAAAAAGCATAATAGGATCCATACAGGAGAGAAGCCATATAAATGTGACATCTGCGGTGCAATGTTCACGNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAGCTGCATTCACCTGGTCAGGCGATTTAAAAAGGCATACTAGGACCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCCACTTTCTCTGTGTCAAGCAGTTTGAAAACGCATATTAGGACCCACACAGGTGAGAAGCCATTTAAATGTGATATCTGCCAAGCTGCGTTCACCGAGTCAGGCCATTTGAAAAGGCATAAAAGGACCCACACAGGAGAGAAGACATATAAGATCCACAGAGGAGAGAAGCCATATAAATGTGACATATGCAATGCCATGTTTCCATATCCAAGCAGTTTGAAACTTCATAATATTACCCACACAGGAGAGAAGCAATATAAATGTGATATCTGCGATGCTACATTCACCACGCCAGGCAATTTGAAAAAGCATAAGAAGATACACACAGGAGAGAAGCCATATAAATGTGATACTTGCTCTGCTAGATTCACCGTGAGATGCAGTTTGAAAAAACATAATAGGATCCACACAGGAGAGAAGCCATATAAATGTGATATCTGCGATGCTAAATTCACCAACTCAAGCAATTTGAAAAAGCACAATAGGATCCACACAGGAGAGAAGCCTTATAAATGTGAAATCTGCCATGCTACATTCACCCTAAGATACAGTTTAAAAATGCATAATAGGACCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCCATGTTTGTAAATCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGCCACTTTCACCGTGTCACGTGATTTGACAAGACATAATAGGAGACACACAGGAGAGAAGCCATTTAAATGTAATATATGCGATGCCACTTTCACCGAGTCAGGCAGTTTGAAATCGCATAATAGGACCCACACAGGAGAGAAACCATATAAATGTGATATCTGCGATGCTACATTCAACCAGTCAAGCACTTTGAAAATCCATAATAGGACCCACACAGGGGAGAAGCCATATAAATGTGACATCTGCGATGCAACGATCCACACAGGAGAGAAGCCATATAAATGTGATATCTGCGATGCTAAATTCACCAACTCAAGCAATTTGAAAAAGCACAATAGGATCCACACAGGAGAGAAGCCTTATAAATGTGAAATCTGCCATGCTACATTCACCCTGAGATACAGTTTAAAAATGCATAATAGGACCCACACAGGAGAGAAGCCATATAAATGTGACATCTGCGATGCCATTGCAATACACTATTGTATCATCAGCAAAAAGGTGGACTTTTCCATTCCTGATGATATTCGAGAAGTCATTGATATAAAGTATGAATAA
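The relationship between the coding sequence and the protein sequence can be checked by translating codons. The sketch below assumes the standard genetic code and renders any codon containing N as X, which is consistent with the X runs in the protein but is an assumption about how this record was produced. The function name translate is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        # Codons not in the table (for example those containing N) become X.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 21 and last 12 nucleotides of the coding sequence above.
print(translate("ATGCCCGATTTGCCCGAAAAC"))
print(translate("AAGTATGAATAA"))
```

The two fragments translate to the first seven and last three residues of the protein sequence listed in this record.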
Protein Sequence
MPDLPENYEAINIVPTAKYDVEKRKQELIDKISSFIDSFESNEQIAAFENIFNNIIDHQNVDSTNQIEDLSSLNRHKRIHTGEKPYKCDFCEAAFTASGDLKKHTRIHTGEKPYKCDICDAMFMYTKGLKTHILTHTGEKPYKCDFCEAAFTASNYLKIHIRIHTGEKPYKCDICDAMFLYTSGLKTHLSTHTGEKPYKCDFCDATFTGSGSLKTHKRVHTGEKPYKCDICDAMFMYSSALKSHIWIHTGEKPYKCDICRATFTASGSLKKHIRVHRGEKPYKCDICDAKFTKSSVLKMHNRRHTGEKPYKCDICDAMFMYSYSLKGHILTHTGEKPSDFCEAAFTMAGDLKRHNKTHTKEKPYKCDICHATFTLSSSLKRHNRTHTGEKPFKCDICDATIHTGEKPYKCDFCEAAFTASGDLKKHTRIHTGEKPYKCDICGAMFLYTSGLKTHLSTHTGEKPYKCDICDAKFTKSSVLKMHNRRHTGEKPYKCDICDAMFMYSKSLNGHILTHTGEKPSDFCEAAFTMAGDLKRHNKTHTKEKPYKCDICDATFTHXXXXXXXXXXXXXXXSHKCDFCEAAFTWSGDLKRHIRTHTGEKPFKCDICDATFARSXXXXXXXXXXXXXXXYKCDICDAMFVYSNVLKSHLLIHTGEKPYKCDFCEAAFTMENDLKRHNRTHTGEKPFKCDICDATFSXXXXXXXXXXXXXXXXXHKCDFCEAAFTGSAIFTKSSSLKKHNVTHKGEKPYKCDICHAIFPYLSIFIRHNRTHTGEKPYKCDTCDATFSQSNSLKRHNRIHTGVKTYKCDICDATFTESSSLKQHNMTHSGEKPYKCDICQAAFSASGYLKIHNRIHTGEKPYKCELCEAAFSMSCDLKKHNMTHTGEKPYKCDICKAAFTWSTNLKKHNRTHTGEKPYKCDICKAAFTSSTNLKKHNRTHTGEKPFKCDFCEAAFTSSTNLKKHNMTHTGEKPYKCDICDATFANSSNLKTHNRIHTGEKPYKCDICDATFTESSSLKRHNRIHTGEKPYKCDICEATFTVTHTGEKPYKCDICDATFTMLNSLKRHNRIHTGEKPYKCDICDATFAKSSHLKTHNRTHRGEKRYKCNICDATFDLSNNLRKHNRIHRGVKPYKCDICDATFAKSSNLKTHNRIHTGDKPYKCDICEAAFSTSGNLRRHNMNHTAEKPYKCDICEAAFSMSCDLKRHNMTHTGEKQYKCDICDATFAQKGHLTRHNRTHTRKKPYKCDICDATFTDSSSLKKHNMIHTGEEPYKCDICEATIHTGEKPYICYICEATFTDSSSLKKHNRIHTGEKPYKCDICGAMFTXXXXXXXXXXXXXXXXXXXXXXXXAAFTWSGDLKRHTRTHTGEKPYKCDICDATFSVSSSLKTHIRTHTGEKPFKCDICQAAFTESGHLKRHKRTHTGEKTYKIHRGEKPYKCDICNAMFPYPSSLKLHNITHTGEKQYKCDICDATFTTPGNLKKHKKIHTGEKPYKCDTCSARFTVRCSLKKHNRIHTGEKPYKCDICDAKFTNSSNLKKHNRIHTGEKPYKCEICHATFTLRYSLKMHNRTHTGEKPYKCDICDAMFVNXXXXXXXXXXXXXXXXXXXXXXXATFTVSRDLTRHNRRHTGEKPFKCNICDATFTESGSLKSHNRTHTGEKPYKCDICDATFNQSSTLKIHNRTHTGEKPYKCDICDATIHTGEKPYKCDICDAKFTNSSNLKKHNRIHTGEKPYKCEICHATFTLRYSLKMHNRTHTGEKPYKCDICDAIAIHYCIISKKVDFSIPDDIREVIDIKYE
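The protein is highly repetitive, with many Cys2-His2-like units joined by HTGEKP-type linkers. A rough way to count such units is a regular expression for the simplified pattern C-x(2,4)-C-x(12)-H-x(3,5)-H. That pattern is a common approximation chosen for this sketch, not the profile used to produce the hmmscan hits above, and the short fragment is copied from the protein sequence.

```python
import re

# Simplified C2H2 zinc finger pattern (an approximation, not the Pfam HMM).
C2H2_RE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_fingers(protein):
    """Return (0-based start, matched text) for non-overlapping C2H2-like motifs."""
    return [(m.start(), m.group()) for m in C2H2_RE.finditer(protein)]


# Fragment copied from the protein sequence above.
fragment = "YKCDFCEAAFTASGDLKKHTRIHTGEKPYKCDICDAMFMYTKGLKTHILTHTGEKP"
fingers = find_fingers(fragment)
for start, text in fingers:
    print(start, text)
```

Running the same function on the full protein would give an approximate finger count. Units interrupted by X runs will be missed, because X residues can break the cysteine and histidine spacing.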

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_01108657;
90% Identity
iTF_01108657;
80% Identity
iTF_01108657;