Basic Information

Gene Symbol
-
Assembly
GCA_000696795.2
Location
NW:678-11088[+]
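
The Location value packs a sequence identifier, a coordinate range, and a strand into one string. A minimal sketch for splitting it apart, assuming the layout seqid:start-end[strand] seen above and 1-based inclusive coordinates (the record does not state its coordinate convention). The names `Location`, `parse_location`, and `span_length` are hypothetical helpers, not part of any database API.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str   # sequence/scaffold identifier exactly as printed in the record
    start: int   # first coordinate of the range
    end: int     # last coordinate of the range
    strand: str  # "+" or "-"


# Assumed layout: seqid:start-end[strand]
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into its four components."""
    m = _LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(m["seqid"], int(m["start"]), int(m["end"]), m["strand"])


def span_length(loc: Location) -> int:
    """Genomic span, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("NW:678-11088[+]")
print(loc, span_length(loc))
```

Under that assumption the locus spans 10411 bp on the forward strand.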

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
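
To make the consensus concrete, the sketch below scans a DNA string for exact 5'-GAGAG-3' matches on both strands. It is an illustration only: exact string matching is a simplification chosen here, not the method used to annotate this record, and real binding sites may tolerate mismatches. The function names and the example sequence are made up for the demonstration.

```python
CONSENSUS = "GAGAG"  # consensus site quoted in the description above


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def find_consensus_sites(dna: str, motif: str = CONSENSUS) -> list[tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    A '-' hit means the motif occurs on the reverse strand, reported at the
    forward-strand coordinate where its reverse complement starts.
    Overlapping matches are reported.
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        pos = dna.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = dna.find(pattern, pos + 1)  # step by 1 to allow overlaps
    return sorted(hits)


# Made-up example sequence, not taken from the record.
print(find_consensus_sites("TTGAGAGAGCCTCTCAA"))
```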
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 18 0.046 44 3.8 0.2 21 44 131 154 128 161 0.79
2 18 0.86 8.2e+02 -0.3 0.1 27 48 165 186 161 190 0.87
3 18 0.08 76 3.0 0.1 26 48 222 244 216 249 0.87
4 18 0.58 5.5e+02 0.3 0.0 30 45 284 299 281 304 0.82
5 18 0.22 2.1e+02 1.6 0.1 21 45 304 328 299 338 0.84
6 18 0.00059 0.56 9.9 1.0 11 52 350 392 340 394 0.81
7 18 0.084 80 3.0 0.1 22 48 390 416 388 421 0.83
8 18 0.036 34 4.1 0.1 21 45 418 442 413 447 0.81
9 18 0.0039 3.7 7.2 0.3 21 44 447 470 443 476 0.89
10 18 0.069 66 3.2 0.2 20 43 475 498 471 504 0.75
11 18 0.88 8.4e+02 -0.3 0.1 22 45 506 529 501 535 0.78
12 18 0.21 2e+02 1.7 0.1 19 32 561 574 556 590 0.74
13 18 0.01 9.8 5.9 0.1 20 44 591 615 579 620 0.86
14 18 0.031 29 4.4 0.1 21 48 621 648 616 652 0.87
15 18 0.024 23 4.7 0.1 21 48 650 677 647 679 0.85
16 18 0.00077 0.73 9.5 0.5 20 44 678 702 674 707 0.87
17 18 0.045 43 3.8 0.1 20 48 707 735 703 741 0.81
18 18 0.0033 3.2 7.4 0.1 26 48 742 764 733 771 0.79
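
The rows above are whitespace-separated with 13 fields each, so they are straightforward to load and rank. The sketch below parses a handful of rows copied from the table, picks the highest-scoring domain, and lists domains under an i-Evalue cutoff. The cutoff of 1.0 is an arbitrary value chosen for illustration, not a threshold stated by this record, and `DomainHit` and `parse_domain_row` are hypothetical names.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int       # "#": domain number
    of: int          # "of": total domains reported
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float     # bit score
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(line: str) -> DomainHit:
    """Parse one whitespace-separated row of the domain table."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


# A subset of rows copied verbatim from the table above.
ROWS = """\
2 18 0.86 8.2e+02 -0.3 0.1 27 48 165 186 161 190 0.87
6 18 0.00059 0.56 9.9 1.0 11 52 350 392 340 394 0.81
9 18 0.0039 3.7 7.2 0.3 21 44 447 470 443 476 0.89
16 18 0.00077 0.73 9.5 0.5 20 44 678 702 674 707 0.87
18 18 0.0033 3.2 7.4 0.1 26 48 742 764 733 771 0.79
"""

hits = [parse_domain_row(line) for line in ROWS.splitlines() if line.strip()]
best = max(hits, key=lambda h: h.score)

I_EVALUE_CUTOFF = 1.0  # illustrative assumption, not from the record
passing = [h.index for h in hits if h.i_evalue < I_EVALUE_CUTOFF]

print(best.index, best.score, passing)
```

Among the full set of 18 rows, only domains 6 and 16 have an i-Evalue below 1, which is worth bearing in mind when judging how strongly any individual domain matches the zf-GAGA model.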

Sequence Information

Coding Sequence
ATGGATCAAATTGTTTCAAATAACCCAGGCATTCCtgttaaagaagaaattattgatgAGACTGAAACCTTTTGTTTAAGCAACTCTAGTATTTCAATTAAAGAAGAAACTACTGATGAGACCGAAACCTTTTGTATGAACAACTCTGGTATTTCAGTTAAAGAAGAAATTACTGATGCGACTGAAACCACCTTTCTTACCTTGGCTAGTATTAAAGAAGAGGAAACTCTAGAGATCTATTATgATATAAATAACTCTGGTGTTTcagttaaagaagaaataagtgATGAAATTGATCCCTCTGgaACCATAGATTCTCTTTCGCAGACAAAGAAGATGATACAGTATGGTTTGGAATCTGCAGTGAAGAATTGTGATTTGTCTATGATAGCTGAGAAACCTCTTGAATGTCCTTATTGTGAATATAAGGTTGTAGAAAGAAGTCTAATGATAAGACATATAATGAATCATCATACAAGTGTGGAGCACAGGTGTCCTCATTGTAAATATACAGCAACAGTATCTACtaatttaaaacttcatattattaccAATCATACAGATGAAAGTCCTTATCAGTGCCATCATTGTGGATataaagcagtaaaaaaaagcattattaaGCAACATATAATGGCGCTTCATACTGGTGATAGGCATCAtaagtgtcctcattgtgaatacgTGGCAACAGtgtctaataatttaaaacttcatattattaccAGTCATACAGGTGAAAATCCTTATCAGTGCCATCATTGTGGATataaagcagtaaaaaaaagcattattaaGCAACATATAATGGCACTTCATACTGGTGGTGGGCATCAAAAGTATTCTTATTGTGATTATATAGCAAAACAaagtggtaatttaaaaaatcatataatgtcCCTAcatactggtgagaagcctTTTAAGTGTCCTCATTGCAATCACAAAGCAGCACATAGTGGTACTTTGAAAACACATATAATGTCCCTTCATACTGATGTTAGACCTCATATGTgccctcattgtgattataaagcaGCACATAGAGGAACTTTGAAAACACATATAATGTCTCATACTGGTGAGAGGCCTTATAAGTGTCCTTGTTGTGATTATAAAGCAACACAAAgtagttctttaaaaaaacatattatgttCCTTCATACTGATAGACCTCATAtgtgtcctcattgtgattataaagcaGTACATAGTGGTAATTTGAAGGCACATATTATGTCCCATCACAACAGTGATGGACTTCATATGTGTCCTCATTGTGACTTTAAAACAACACAAAGTTGcaatttgaaaaatcatataatggccCTTCATACTGGTGAGAGACCTTATAAGTGTCCTCATTGTAATTATAAAGCAACACAAAGTGGTACTTTGAAAAGACACATCATTTCCCTTCATACTAGTGAGAAGCCACATAAGTGCCCTTATTGTGATTATAGAGGAACACGAAGTGGTCATTTGAAAAGTCATATTATATCTCTTCATACTGGTGATAGGCCTTATAAGTGTCCTTATTGCGACTACAAAGCAACACTAAATGGTACTTTGAAAAGACATATAATGTCCCTTCATATTGGTGATAGACCTCATATGTGTTCTTATTGTGACTTTAAAGCAACACTAATTAGTACTTTGAAAAGTCATATAATGTCCCTTCATACTAGTGAGAAGCCGTAtaagtgtcctcattgtgattataaagcaACAAAGAATGGTACTTTAAAAGCACACATAATATCTGTTCATACTGATGAAAGGCCGTATAAGTGTCCTCATTGTAATTATAAAGCAAAACAAAGTGGTACTTTGAAGACACATATAATGTTCCATAATAATAGTGATAGAGTTCATAAGTGTCCTCATTGTAATTATAAAGCAAAACAAAGTGGTACTTTGAAGACACATATAATGTTCCATCATAATAGTGATAGAGTTCATATGTGTCCTCATTGTGACTTTAAAGCAACACAAAGtgg
taatttgaaaaatcatataatgtcCCATCATACTGGTGAGAGACCTTATAAGTGTcctcattgttattataaagcAACACGAAGTGGTGCTTTGAAAAGACATATCATTTCCCTTCATACTAGTGAGAAGCCGTATAAGTGCCCTCATTGTAACTATGAAGCAAAACTAAATAGTactttgaaaaaacatataatgtccCTTCATACTAATAAGAGGGCTTTTAAGTGTCCCAATTGTGATTATAAAGCAACTCAAAGTGGTAATTTGAAAAGACATATTATTTCCATTCATACATATAAGAAACCTCATAAGTGTCCTTATTGTGATTATGAAGGAACACGAAGTGGTCTTTTGAAAAGTCATATTATATCTCTTCATACTGATCTGAAGTCTTATAATGTAAGAGTCCCCATTGGGACTATGAAGCAACACAAAGTGGAGTTTTGA
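
A quick consistency check between the coding sequence and the protein sequence is to translate the former. The sketch below assumes the standard genetic code (the record does not say which table applies), strips any line breaks, and ignores letter case, since the sequence above mixes upper and lower case. `translate` is a hypothetical helper written for this example.

```python
from itertools import product

# Standard genetic code (an assumption), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_at_terminator: bool = True) -> str:
    """Translate a coding sequence in frame 0.

    Whitespace is removed and case is ignored. Codons containing characters
    other than A/C/G/T become 'X'. A trailing partial codon is dropped.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and stop_at_terminator:
            break
        protein.append(aa)
    return "".join(protein)


# First 36 nucleotides of the coding sequence above.
print(translate("ATGGATCAAATTGTTTCAAATAACCCAGGCATTCCt"))
```

Passing the full coding sequence to `translate` and comparing the result with the Protein Sequence field is the natural next step.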
Protein Sequence
MDQIVSNNPGIPVKEEIIDETETFCLSNSSISIKEETTDETETFCMNNSGISVKEEITDATETTFLTLASIKEEETLEIYYDINNSGVSVKEEISDEIDPSGTIDSLSQTKKMIQYGLESAVKNCDLSMIAEKPLECPYCEYKVVERSLMIRHIMNHHTSVEHRCPHCKYTATVSTNLKLHIITNHTDESPYQCHHCGYKAVKKSIIKQHIMALHTGDRHHKCPHCEYVATVSNNLKLHIITSHTGENPYQCHHCGYKAVKKSIIKQHIMALHTGGGHQKYSYCDYIAKQSGNLKNHIMSLHTGEKPFKCPHCNHKAAHSGTLKTHIMSLHTDVRPHMCPHCDYKAAHRGTLKTHIMSHTGERPYKCPCCDYKATQSSSLKKHIMFLHTDRPHMCPHCDYKAVHSGNLKAHIMSHHNSDGLHMCPHCDFKTTQSCNLKNHIMALHTGERPYKCPHCNYKATQSGTLKRHIISLHTSEKPHKCPYCDYRGTRSGHLKSHIISLHTGDRPYKCPYCDYKATLNGTLKRHIMSLHIGDRPHMCSYCDFKATLISTLKSHIMSLHTSEKPYKCPHCDYKATKNGTLKAHIISVHTDERPYKCPHCNYKAKQSGTLKTHIMFHNNSDRVHKCPHCNYKAKQSGTLKTHIMFHHNSDRVHMCPHCDFKATQSGNLKNHIMSHHTGERPYKCPHCYYKATRSGALKRHIISLHTSEKPYKCPHCNYEAKLNSTLKKHIMSLHTNKRAFKCPNCDYKATQSGNLKRHIISIHTYKKPHKCPYCDYEGTRSGLLKSHIISLHTDLKSYNVRVPIGTMKQHKVEF
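
Because the family description mentions a Cys2-His2 zinc finger core, it can be useful to see where such spacing patterns occur in the protein. The sketch below uses a simplified regular expression, C-x(2,4)-C-x(12)-H-x(3,5)-H, which is a common approximation of the classical C2H2 spacing and an assumption of this example. It is not the HMM-based method behind the Hmmscan table and will miss atypical fingers.

```python
import re

# Simplified C2H2 spacing pattern (an approximation chosen for illustration).
C2H2_RE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping pattern matches."""
    seq = "".join(protein.split()).upper()
    return [(m.start(), m.group()) for m in C2H2_RE.finditer(seq)]


# Fragment copied from the protein sequence above.
fragment = "EKPLECPYCEYKVVERSLMIRHIMNHHTSVEHRCPHCKYTATVSTNLKLHIITNHTDE"
for start, text in find_c2h2_fingers(fragment):
    print(start, text)
```

Running the same function on the full protein sequence gives a rough count of candidate fingers to compare against the 18 domain rows reported above.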

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_00764004;
90% Identity
iTF_00764004;
80% Identity
iTF_00764004;