Basic Information

Gene Symbol
-
Assembly
GCA_947049265.1
Location
CAMRIQ010000200.1:175080-187831[-]
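
The Location field packs contig, start, end and strand into one string. A minimal sketch of splitting it apart is shown here. The names `Location` and `parse_location` are hypothetical, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "contig:start-end[strand]".
_LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into its four components."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("CAMRIQ010000200.1:175080-187831[-]")
# Assumption: coordinates are 1-based and inclusive.
span = loc.end - loc.start + 1
print(loc, span)
```

Under that coordinate assumption, the locus covers more bases than the coding sequence alone, as expected if the genomic interval includes non-coding sequence.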

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
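
As a rough illustration of the consensus site described above, the sketch below scans a DNA string for 5'-GAGAG-3' on both strands. The function names and the example sequence are hypothetical, and exact string matching is a simplification: it ignores degenerate positions and binding affinity.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str = "GAGAG") -> list[tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    Minus-strand hits are reported at the position where the reverse
    complement of the motif occurs in the given (plus-strand) sequence.
    Overlapping matches are included.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Made-up sequence for illustration only; it is not taken from this record.
example = "TTGAGAGAGCCCTCTCAA"
print(find_motif(example))
```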
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 19 0.87 8.7e+03 -1.9 0.0 22 48 245 270 240 275 0.81
2 19 0.0079 79 4.6 0.2 21 44 300 323 292 326 0.90
3 19 0.0057 57 5.1 0.1 20 44 327 351 323 355 0.90
4 19 0.0084 84 4.5 0.1 22 44 373 395 369 399 0.91
5 19 0.0061 61 5.0 0.1 20 44 399 423 396 427 0.90
6 19 0.0059 59 5.0 0.1 20 44 427 451 424 456 0.90
7 19 0.0036 37 5.7 0.0 21 45 472 496 467 504 0.87
8 19 0.011 1.1e+02 4.1 0.1 22 44 530 552 527 556 0.92
9 19 0.0057 57 5.1 0.1 20 44 556 580 552 584 0.90
10 19 0.0041 41 5.5 0.0 21 44 601 624 596 632 0.88
11 19 0.01 1e+02 4.3 0.0 22 44 659 681 655 686 0.91
12 19 0.0041 41 5.5 0.0 21 44 702 725 697 733 0.88
13 19 0.01 1e+02 4.3 0.0 22 44 760 782 756 787 0.91
14 19 0.0054 54 5.1 0.0 21 44 803 826 799 830 0.90
15 19 0.007 70 4.8 0.0 20 44 830 854 827 859 0.89
16 19 0.044 4.4e+02 2.2 0.3 23 44 884 905 875 909 0.87
17 19 0.0054 54 5.1 0.0 21 44 926 949 922 953 0.90
18 19 0.0059 59 5.0 0.1 20 44 953 977 950 982 0.90
19 19 0.0053 53 5.1 0.0 21 44 998 1021 994 1025 0.90
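
The domain table above has 13 whitespace-separated values per row. A minimal sketch of reading such rows into typed records and picking the hit with the lowest independent E-value follows. `DomainHit` and `parse_hit` are hypothetical names, and the three rows are copied from the table only as sample input.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int       # domain number
    total: int       # total number of domains reported
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-separated row of the domain table."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


rows = [
    "1 19 0.87 8.7e+03 -1.9 0.0 22 48 245 270 240 275 0.81",
    "7 19 0.0036 37 5.7 0.0 21 45 472 496 467 504 0.87",
    "16 19 0.044 4.4e+02 2.2 0.3 23 44 884 905 875 909 0.87",
]
hits = [parse_hit(row) for row in rows]

# Lower i-Evalue means a stronger individual hit.
best = min(hits, key=lambda h: h.i_evalue)
print(best.index, best.ali_from, best.ali_to, best.i_evalue)
```

Sorting or filtering on `i_evalue` in this way makes it easy to see that domain 7 has the lowest independent E-value in the table (37).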

Sequence Information

Coding Sequence
ATGGAATCGCAAGTTTGCGTCCATTGTCTGAACAAGAAGATCATAAACTCGACAACCAACCGGTTGGTGACTGAAAGATGTGGTCATGTGAAGTGCATGGATTGTTTGTTGCACGAGAAGGCAGGTTGCGTCGCTTGTGAGGAGAAGGCCACGTGTAAGGAGCCCGAAGACACCCTGGAGCCGAGCGAGGGGCCACCGTCGGATGGCCACGTTGAGCCGGAGCCGGTGCACACAGACTTCCCTCTCAAACACTTCCTCCTGGAAGAGTGCCAAGGGAAGGATGCACAGACTGAGTACTATTGTGATAAGAATGATAAGAAGAAGAAGCTAGAGACATCTCACATTAGGATAGAAACAGGTAACATCACCTGTGTGTCTAATATCTCCTCTTGTAATGGGCATTCGCGCGAGCACGTGGCGCAGGGGGGCGGCGCCGGCCGTCCCCTTCCGTTCTGCTCCTGTACTGAGGCGCCGACGGGGCTCCCGCAGACCGATGTGGAGGCGGAGGTCTCTGTTAGGCGAGCAATATGTATTTACATATTCATTATGCCAAGAGACGGGAGTCGGCGGTATTACTTCTGCACAGCATGCAAGAGGAAGTTCCAAGCACGCAGCCAGGTCTCGTACCATGCTTACTGCAACGGACAGCAAAAGCCCTACCAGTGCCCCGAGTGCAATAAGAGTTTCGCTTCGCACTCCCACTTCAAGTACCACATGCGCGTGCACCGCGACGAGCGCACGTACGCCTGCGAGCTGTGCGGGGACAGCTTCTTCCAGATGTCCAAGCTGCAGCGGCACAAGCTCAAGCACACGAAGGAGAAGAAATACCCGTGTCCAGAGTGCAGCAAAGCGTTCAACAACCTGACGTCTCTCCGCAAACACGCGCTCACGCACTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCATACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACCTGACGTCTCCTACACACACGCGCTCACGCATTTAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAACAACTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCATACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCCACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACCTGACGTCCTACACACACGCGCTCACGCATTTAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCCACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACCTGACGTCTCCTACACACGCGCTCACGCATTTAGACGAGCGGCCGTACGGCTGCGAGACTTG
CGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCCACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACCTGACGTCTCCTACACACGCGCTCACGCATTTAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCCACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCTTCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACCTGACGTCCTACACACACGCGCTCACGCATTTAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCTACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCCGCCGCTTCAGGGACTCCAGCAACTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGACAACATGACATCTCCTACACACACGCGTTCACGCATTCAGACGAGCGGCCGTACGGCTGCGAGACTTGCGGCGGCCGCTTCAGGGACTCCAGCAATTACCGCAAGCACGTGGCCAAACACAACGGTGAGCGATAA
Protein Sequence
MESQVCVHCLNKKIINSTTNRLVTERCGHVKCMDCLLHEKAGCVACEEKATCKEPEDTLEPSEGPPSDGHVEPEPVHTDFPLKHFLLEECQGKDAQTEYYCDKNDKKKKLETSHIRIETGNITCVSNISSCNGHSREHVAQGGGAGRPLPFCSCTEAPTGLPQTDVEAEVSVRRAICIYIFIMPRDGSRRYYFCTACKRKFQARSQVSYHAYCNGQQKPYQCPECNKSFASHSHFKYHMRVHRDERTYACELCGDSFFQMSKLQRHKLKHTKEKKYPCPECSKAFNNLTSLRKHALTHSDERPYGCETCGRRFRDSSNYRKHVAKHNDERPYGCETCGRRFRDSSNYRKHVAKHNGERQPDVSYTHALTHLDERPYGCETCGRRFRDSNNYRKHVAKHNDERPYGCETCGRRFRDSSNYRKHVAKHNDERPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISHTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNGTPAITASTWPNTTVSDNLTSYTHALTHLDERPYGCETCGRRFRDSSNYRKHVAKHNDERPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISHTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNGTPATTASTWPNTTVSDNLTSPTHALTHLDERPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISHTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNGTPATTASTWPNTTVSDNLTSPTHALTHLDERPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISHTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNDERPYGCETCGFRFRDSSNYRKHVAKHNGERQPDVLHTRAHAFRRAAVRLRDLRPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISYTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNDERPYGCETCGRRFRDSSNYRKHVAKHNGERQHDISYTHAFTHSDERPYGCETCGGRFRDSSNYRKHVAKHNGER
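
The relationship between the two sequences above can be checked by translating the coding sequence codon by codon. The sketch below assumes the standard genetic code, which this record does not state explicitly. `translate` is a hypothetical helper, and only the first and last few codons of the coding sequence are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Assumption: the record uses the standard code (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_at_stop: bool = True) -> str:
    """Translate a coding sequence in frame 0.

    Codons not in the table (for example, ones containing N) become 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


# First six codons and last four codons of the coding sequence above.
print(translate("ATGGAATCGCAAGTTTGC"))
print(translate("GGTGAGCGATAA"))
```

The two fragments translate to the first six and last three residues of the protein sequence, with the final TAA read as the stop codon.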

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_00656344;
90% Identity
iTF_00656344;
80% Identity
iTF_00656344;