Basic Information

Gene Symbol
-
Assembly
GCA_949710015.1
Location
OX453291.1:4203406-4209926[-]
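
The Location field packs the sequence accession, start, end, and strand into a single string. A minimal sketch of splitting it into its parts follows. The `Location` type and `parse_location` helper are hypothetical names used for illustration, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for a 'seqid:start-end[strand]' string."""
    seqid: str
    start: int
    end: int
    strand: str

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into accession, coordinates, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("OX453291.1:4203406-4209926[-]")
print(loc.seqid, loc.strand, loc.length)  # OX453291.1 - 6521
```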

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
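
As an illustration of the 5'-GAGAG-3' consensus described above, the sketch below lists candidate sites on both strands of a DNA string by exact matching. It is only a string search, not a binding prediction. The `find_sites` helper and the example sequence are hypothetical, and positions are reported 0-based on the forward strand.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_sites(seq: str, motif: str = "GAGAG") -> list[tuple[int, str]]:
    """Return (0-based position, strand) for every exact motif occurrence.

    Overlapping occurrences are reported. A '-' hit means the reverse
    complement of the motif was found on the forward strand.
    """
    seq = seq.upper()
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement)):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical example sequence, not taken from this record.
print(find_sites("ttGAGAGAGccCTCTCaa"))
# [(2, '+'), (4, '+'), (11, '-')]
```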
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 54 1.4 8.2e+03 -3.0 0.0 25 49 55 79 52 84 0.73
2 54 0.16 9.4e+02 0.0 0.5 26 47 109 130 99 134 0.84
3 54 0.93 5.5e+03 -2.4 0.0 26 48 235 257 230 263 0.80
4 54 0.02 1.2e+02 2.9 0.2 25 48 306 330 300 333 0.79
5 54 0.72 4.2e+03 -2.1 0.0 26 48 334 356 330 362 0.81
6 54 0.98 5.8e+03 -2.5 0.2 26 48 382 404 371 410 0.76
7 54 0.2 1.2e+03 -0.3 0.2 25 48 405 428 392 432 0.87
8 54 0.79 4.6e+03 -2.2 0.2 26 48 432 454 427 460 0.75
9 54 0.39 2.3e+03 -1.2 0.1 25 48 455 478 443 486 0.85
10 54 0.018 1e+02 3.1 0.2 25 48 503 527 491 530 0.80
11 54 0.78 4.6e+03 -2.2 0.0 26 48 531 553 528 559 0.81
12 54 0.39 2.3e+03 -1.2 0.1 25 48 602 625 589 630 0.84
13 54 0.018 1e+02 3.1 0.2 25 48 650 674 637 677 0.80
14 54 0.0034 20 5.4 0.1 25 48 701 724 696 727 0.92
15 54 0.4 2.4e+03 -1.3 0.0 27 49 728 750 725 752 0.87
16 54 1.6 9.4e+03 -3.2 0.0 26 44 776 794 774 798 0.78
17 54 0.0018 11 6.2 0.1 26 48 827 849 823 852 0.90
18 54 0.072 4.2e+02 1.1 0.2 26 48 852 874 850 877 0.90
19 54 0.002 11 6.1 0.1 26 48 877 899 873 902 0.90
20 54 0.1 6e+02 0.6 0.1 26 49 902 925 900 929 0.91
21 54 0.083 4.9e+02 0.9 0.1 26 49 978 1001 973 1005 0.90
22 54 0.0018 11 6.2 0.1 26 48 1054 1076 1050 1079 0.90
23 54 0.072 4.2e+02 1.1 0.2 26 48 1079 1101 1077 1104 0.90
24 54 0.002 11 6.1 0.1 26 48 1104 1126 1100 1129 0.90
25 54 0.1 6e+02 0.6 0.1 26 49 1129 1152 1127 1156 0.91
26 54 0.083 4.8e+02 0.9 0.1 26 49 1205 1228 1200 1232 0.90
27 54 0.59 3.4e+03 -1.8 0.0 26 48 1230 1252 1225 1253 0.83
28 54 0.0072 42 4.3 0.1 26 48 1281 1303 1277 1306 0.87
29 54 0.058 3.4e+02 1.4 0.3 26 48 1306 1328 1302 1330 0.91
30 54 0.011 66 3.7 0.1 25 49 1330 1354 1325 1358 0.89
31 54 0.77 4.5e+03 -2.2 0.0 27 48 1357 1378 1355 1379 0.83
32 54 0.0018 11 6.2 0.1 26 48 1407 1429 1403 1432 0.90
33 54 0.1 6e+02 0.6 0.1 26 49 1432 1455 1430 1459 0.91
34 54 0.084 4.9e+02 0.9 0.1 26 49 1508 1531 1503 1535 0.90
35 54 0.0018 11 6.2 0.1 26 48 1584 1606 1580 1609 0.90
36 54 0.032 1.9e+02 2.3 0.4 26 49 1609 1632 1607 1634 0.92
37 54 0.013 78 3.5 0.2 25 49 1633 1657 1628 1661 0.89
38 54 1.4 8.2e+03 -3.0 0.0 26 45 1659 1678 1657 1681 0.80
39 54 0.0018 11 6.2 0.1 26 48 1710 1732 1706 1735 0.90
40 54 0.067 3.9e+02 1.2 0.3 26 48 1735 1757 1731 1760 0.90
41 54 0.0024 14 5.8 0.0 26 48 1760 1782 1755 1785 0.90
42 54 0.0029 17 5.6 0.0 26 48 1785 1807 1783 1810 0.90
43 54 0.094 5.5e+02 0.8 0.2 26 48 1810 1832 1806 1835 0.90
44 54 0.0022 13 6.0 0.1 26 48 1835 1857 1829 1861 0.89
45 54 0.049 2.9e+02 1.7 0.3 26 48 1860 1882 1858 1884 0.92
46 54 0.013 74 3.5 0.2 25 49 1884 1908 1879 1913 0.89
47 54 0.0018 10 6.3 0.2 26 48 1961 1983 1956 1987 0.90
48 54 0.069 4e+02 1.2 0.3 26 48 1986 2008 1982 2011 0.90
49 54 0.0026 15 5.8 0.1 26 48 2011 2033 2007 2036 0.90
50 54 0.0012 6.7 6.9 0.2 26 48 2036 2058 2032 2062 0.90
51 54 0.1 6e+02 0.6 0.1 26 48 2061 2083 2059 2086 0.90
52 54 0.0023 14 5.9 0.0 26 48 2086 2108 2080 2111 0.89
53 54 0.002 12 6.1 0.1 26 48 2111 2133 2108 2136 0.91
54 54 0.0087 51 4.1 0.0 24 44 2134 2154 2130 2155 0.80
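
Each row of the table above holds 13 whitespace-separated values: domain number, total domain count, conditional and independent E-values, bit score, bias, the start and end of the HMM, alignment, and envelope coordinates, and the alignment accuracy. A minimal sketch of reading such rows and ranking them follows. `DomainHit` and `parse_row` are hypothetical names, the four sample rows are copied from the table, and the E-value cutoff of 10 is an arbitrary value chosen for illustration, not a threshold used by this database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One row of the per-domain table (hypothetical field names)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line: str) -> DomainHit:
    """Convert one whitespace-separated table row into a DomainHit."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}: {line!r}")
    return DomainHit(
        int(fields[0]),
        int(fields[1]),
        *map(float, fields[2:6]),
        *map(int, fields[6:12]),
        float(fields[12]),
    )


# Sample rows copied from the table above.
rows = """\
1 54 1.4 8.2e+03 -3.0 0.0 25 49 55 79 52 84 0.73
14 54 0.0034 20 5.4 0.1 25 48 701 724 696 727 0.92
17 54 0.0018 11 6.2 0.1 26 48 827 849 823 852 0.90
50 54 0.0012 6.7 6.9 0.2 26 48 2036 2058 2032 2062 0.90
"""

hits = [parse_row(line) for line in rows.splitlines()]
best = max(hits, key=lambda h: h.score)
below_cutoff = [h.index for h in hits if h.i_evalue <= 10]  # illustrative cutoff
print(best.index, best.score, below_cutoff)  # 50 6.9 [50]
```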

Sequence Information

Coding Sequence
ATGGTTTATCATGAcacaatttgtacaaaatgtgaTATCGAGTTTACCAATGATGAAGCCTTGGCGCAACATACACGAGATaagcataataaaatatgtttgcattgtaataaaatttttgtcgataggGATGCCTTGGCTAATCACGCTAAGGATAATCTTGCTCTTTGTCCGAAATGTGATGATATCTACTCCAATAAGGTTGACTTAGAAAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTGCGCGCATCTTCGTTCATAAAGGTGCCTTCGATAATCACGTGAGGGATAAACATACAGACATAATTTGTTCGAATTGTCCAATTTGTGAACGCATTTTTGTTCGTAAAGATGCATTGAATAATCACTTGAGAGATGAACATAAATGCACAATTTGTTCGAAATGTGATGCTGCCTTCTTCAATACGGATGAATTAGATAACCACTTCAAGAATACACATCCGACTTGTCCAAATTGTAAGCGCACCTTCGTTCTTAAAGATGCTTTAGATAATAACTTGAGAGATAAACATAAAGACATAATTTGTTCGAAGTGTGATGCTGCCTTCTTCAATACAGAGGACTTAGAAAATCACGTCAAGGATACACATCCGACTTGTCTATGTTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAGAGATAAACATAAAGACACAATTTGTTCGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATACACATCCGATTTGTCCATGTTGTAATGTCAGCTTCTTTGATAAAGATGCTTTAACTAGTCACCTGAAGGCGGAACATAAAAATTGTACGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATAAACATCCGACTTGCCCAATTTGtaaatgcattaaatttattgatggAATCGCTTTGACTTATCACATGGAGGCAAGACATAAAGACACAATTTGTTCGAAGTGTGATGCTGTCTTTTTCAATACGAAGGACTTAGAAAATCACTTCAAGGATACACATCCGATTTGTCCATGTTGTAATGTCAGCTTCTTTGATAAAGATGCTTTAACTAGTCACCTGAAGGCGGAACATAAAAATTGTACGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATACACATCCGACATGTCCATGTTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAGAGATAAACATAAAGACACAATTTGTTCGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATACACATCCAACTTGTCCCTGTTGTAATGTCAGCTTCTTTGATAAAGATGCTTTAACTAGTCACCTGAAGGCGGAACATAAAAATTGTACGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATAAACATCCGACTTGCCCAATTTGtaaatgcattaaatttattgatggAATCGCTTTGACTTATCACATGGAGGCAAGACATAAAGACACAATTTGTTCGAAGTGTGATGCTGTCTTTTTCAATACGAAGGACTTAGAAAATCACTTCAAGGATACACATCCGATTTGTCCATGTTGTAATGTCAGCTTCTTTGATAAAGATGCTTTAACTAGTCACCTGAAGGCGGAACATAAAAATTGTACGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATACACATCCGACTTGTCCATGTTGTAATGTCAGCTTCTTTGATAAAGATGCTTTAACTAGTCACCTGAAGGCGGAACATAAAAATTGTACGAAGTGTGATGCTGTCTTCTCCAATACGGAGGACTTAGAAAATCACTTCAAGGATAAACATCCGACTTGCCCAATTTGtaaatgcattaaatttattgatggAATCGCTTTGAC
TTATCACATGGAGGCAAGACATAAAGACACAATTTGTTCGAAGTGTGATGCTGTCTTTTTCAATACGAAGGACTTAGAAAATCACTTCAAGGATAAATATCCGACTTGTCCAATTTGTAAATGCATCTTTATTGATGGGGTCGCTTTTACTTATCACATGGAGGCAAGACATAAAGAAATTTGCTTAATTTGTAATGCTGTATTCTTCAGTTGGGATCAATTGGCTGATCATATGAAAGCTAAGCACCTTGAAATGTGTCCATATTGTAATGTATTCCTCCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGAAATCTGTCCAGATTGTAATGCGATCTTGAACAGCATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGACATAAAGGATAAACGTAATTGTGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAAATTGTTGTGCTGTATTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCATGAAATCTGTCCAGATTGTAATGCGATCTTTAACAGTATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGATATAAAGGATAAACGTAATTGTGAAATGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCACGAAATCTGTCCAGATTGTAATGCGATCTTGAACAGCATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGACATAAAGGATAAACGTAATTGTGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAAATTGTTGTGCTGTATTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCATGAAATCTGTCCAGATTGTAATGCGATCTTTAACAGTATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGATATAAAGGATAAACGTAATTGTGAAATGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCACGAAATCTGTCCAGATTGTAATGCGATCTTGAACAGCATGGATGCTCTAAAGAGTCACAGAAAAGATAAACATAATGATATAAGGACTCCTTGTAATGCAGCCTTTAGTGATAAGGATGCCTTAGGTAATGACATAAAGGATATACGTAATTGTGAAATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACGTAGTCAATCATATAAAAGATAAGCACCATGAAATGTGTCCAGATTGTTGTGCCGTATTCCACCATAAGGATGACCTAGTTAATCACATAAAAGCTAAGCACCATGAAACGTGTCCAG
ATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCACGAAATCTGTCCAGATTGTAATGCGATCTTGAACAGCATGGATGCTCTAAAGAGTCACAGAAAAGATAAACATAATGATATAAGGACTCCTTGTAATGCAGCCTTTAGTGATAAGGATGGCTTAGGTAATGACATAAAGGATATACGTAATTGTGAAATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCATGAAATCTGTCCAGATTGTAATGCGATCTTTAACAGTATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGACATAAAGGATAAACGTAATTGTGAAATGTGTCCAGATTGTAATGCCGTATTCCCCCACAAGGCTGACTTGGTTAATCACATGAAAGCTAAACACCACGAAATCTGTCCAGATTGTAATGCGATCTTGAACAGCATGGATGCTCTAAATAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGACATAAAGGATATACGTAATTGTGAAATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAAATTGTTGTGCTGTATTCCACCACAAGGAAGACCTAGTTAATCACATAAAAGCTAAGCACCATGAAACGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATAAAAGCTAAACACCATGAAATCTGTCCAGATTGTAATGCGATCTTTAACAGCATGGATGCTCTAAAGAGTCACAGAAAAGATAAACATAATGATATAAGGACTCTTTGTATTGCAGCCTTTAGTAATAAGGATGCCTTAGGTAATGACATAAAGGATATACGTAATTGTGAAATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAAATTGTTGTGCTGTATTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGATAAGCACCATGACATGTGTCCAGATTGTTGTGCTGTATTCCACCACAAGGATGACCTAGTTAATCACATAAAAGATAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGATAAGCACCATGACATGTGTCCAGATTGTTGTGCTGTATTCCACCACAAGGAAGACCTAGTTAATCACATAAAAGCTAAGCACCATGAAACGTGTCCAGATTGTAATGCCGTATTCCTCCACAAGGCTGACTTGGTTAATCACATAAAAGCTAAACACCATGAAATCTGTCCAGATTGTAATGCGATCTTTAACAGCATGGATGCTCTAAAGAGTCACAGAAAAGATAAACATAATGATATAAGGACTCCTTGTAATGCAGCCTTTAGTGATAAGGATGCCTTAGGTAATGACATAAAGGATAAACGTAATTGTGAAATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGAAGACCTAGTCAATCATATAAAAGACAAGCACCATGAAATGTGTCCAAATTGTTGTGCTGTATTCCACCACAAGGATGACCTA
GTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTGCACCACAAGGAAGACCTAGTTAATCACATAAAAGATAAGCACCATGACATGTGTCCAGATTGTTGTGCTGTATTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAAGTAAGCACCATGACATGTGTCCAGATTGTTATGCCGTGTTCCACCACAAGGATGACCTAGTTAATCACATAAAAGGATGA
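
The coding sequence above mixes upper- and lower-case bases and is wrapped across several lines. A minimal sketch of translating such a sequence with the standard genetic code follows. It assumes the standard code applies to this organism, which the record does not state, and `translate` is a hypothetical helper. Whitespace is removed and case is ignored before translation, and translation stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing anything other than A, C, G, T become 'X'.
    """
    cds = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 36 bases of the coding sequence above.
print(translate("ATGGTTTATCATGAcacaatttgtacaaaatgtgaT"))  # MVYHDTICTKCD
```

The output agrees with the first twelve residues of the protein sequence listed in this record.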
Protein Sequence
MVYHDTICTKCDIEFTNDEALAQHTRDKHNKICLHCNKIFVDRDALANHAKDNLALCPKCDDIYSNKVDLENHFKDKHPTCPNCARIFVHKGAFDNHVRDKHTDIICSNCPICERIFVRKDALNNHLRDEHKCTICSKCDAAFFNTDELDNHFKNTHPTCPNCKRTFVLKDALDNNLRDKHKDIICSKCDAAFFNTEDLENHVKDTHPTCLCCKRIFVHKDALDNHLRDKHKDTICSKCDAVFSNTEDLENHFKDTHPICPCCNVSFFDKDALTSHLKAEHKNCTKCDAVFSNTEDLENHFKDKHPTCPICKCIKFIDGIALTYHMEARHKDTICSKCDAVFFNTKDLENHFKDTHPICPCCNVSFFDKDALTSHLKAEHKNCTKCDAVFSNTEDLENHFKDTHPTCPCCKRIFVHKDALDNHLRDKHKDTICSKCDAVFSNTEDLENHFKDTHPTCPCCNVSFFDKDALTSHLKAEHKNCTKCDAVFSNTEDLENHFKDKHPTCPICKCIKFIDGIALTYHMEARHKDTICSKCDAVFFNTKDLENHFKDTHPICPCCNVSFFDKDALTSHLKAEHKNCTKCDAVFSNTEDLENHFKDTHPTCPCCNVSFFDKDALTSHLKAEHKNCTKCDAVFSNTEDLENHFKDKHPTCPICKCIKFIDGIALTYHMEARHKDTICSKCDAVFFNTKDLENHFKDKYPTCPICKCIFIDGVAFTYHMEARHKEICLICNAVFFSWDQLADHMKAKHLEMCPYCNVFLHKDDLVNHIKGKHHEICPDCNAILNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDKRNCDMCPDCYAVFHHKEDLVNHIKDKHHEMCPNCCAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKEDLVNHIKDKHHEMCPDCNAVFLHKADLVNHMKAKHHEICPDCNAIFNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDKRNCEMCPDCNAVFLHKADLVNHMKAKHHEICPDCNAILNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDKRNCDMCPDCYAVFHHKEDLVNHIKDKHHEMCPNCCAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKEDLVNHIKDKHHEMCPDCNAVFLHKADLVNHMKAKHHEICPDCNAIFNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDKRNCEMCPDCNAVFLHKADLVNHMKAKHHEICPDCNAILNSMDALKSHRKDKHNDIRTPCNAAFSDKDALGNDIKDIRNCEMCPDCYAVFHHKEDVVNHIKDKHHEMCPDCCAVFHHKDDLVNHIKAKHHETCPDCNAVFLHKADLVNHMKAKHHEICPDCNAILNSMDALKSHRKDKHNDIRTPCNAAFSDKDGLGNDIKDIRNCEMCPDCYAVFHHKEDLVNHIKDKHHEMCPDCNAVFLHKADLVNHMKAKHHEICPDCNAIFNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDKRNCEMCPDCNAVFPHKADLVNHMKAKHHEICPDCNAILNSMDALNSHRKDKHNDIRTLCIAAFSNKDALGNDIKDIRNCEMCPDCYAVFHHKEDLVNHIKDKHHEMCPNCCAVFHHKEDLVNHIKAKHHETCPDCNAVFLHKADLVNHIKAKHHEICPDCNAIFNSMDALKSHRKDKHNDIRTLCIAAFSNKDALGNDIKDIRNCEMCPDCYAVFHHKEDLVNHIKDKHHEMCPNCCAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKDDLVNHIKDKHHDMCPDCCAVFHHKDDLVNHIKDKHHDMCPDCYAVFHHKDDLVNHIKDKHHDMCPDCCAVFHHKEDLVNHIKAKHHETCPDCNAVFLHKADLVNHIKAKHHEICPDCNAIFNSMDALKSHRKDKHNDIRTPCNAAFSDKDALGNDIKDKRNCEMCPDCYAVFHHKEDLVNHIKDKHHEMCPNCCAVFHHKDDL
VNHIKGKHHDMCPDCYAVFHHKDDLVNHIKGKHHDMCPDCYAVLHHKEDLVNHIKDKHHDMCPDCCAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKDDLVNHIKGKHHDMCPDCYAVFHHKDDLVNHIKSKHHDMCPDCYAVFHHKDDLVNHIKG
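
The family description mentions a Cys2-His2 zinc finger core. As a rough illustration, the sketch below scans a protein string with a simplified C2H2 spacing pattern (C, 2 to 4 residues, C, 12 residues, H, 3 to 5 residues, H). This regular expression is an assumption chosen for illustration. It is not the Pfam PF09237 model used for the hmmscan results, and a match does not by itself indicate a functional zinc finger. `find_c2h2` is a hypothetical helper.

```python
import re

# Simplified C2H2 spacing pattern (assumption, not the Pfam HMM).
C2H2_RE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping matches."""
    protein = "".join(protein.split()).upper()
    return [(m.start(), m.group()) for m in C2H2_RE.finditer(protein)]


# First 38 residues of the protein sequence above.
print(find_c2h2("MVYHDTICTKCDIEFTNDEALAQHTRDKHNKICLHCNK"))
# [(7, 'CTKCDIEFTNDEALAQHTRDKH')]
```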

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-