Basic Information

Gene Symbol
-
Assembly
GCA_903994105.1
Location
NW:5446-9389[+]

Transcription Factor Domain

TF Family
zf-GAGA
Domain
zf-GAGA domain
PFAM
PF09237
TF Group
Zinc-Coordinating Group
Description
Members of this family bind to a 5'-GAGAG-3' DNA consensus site and contain a Cys2-His2 zinc finger core as well as an N-terminal extension with two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three bases (GAG) of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts with the major groove and recognises the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
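As an illustration of the 5'-GAGAG-3' consensus described above, the following minimal sketch scans a DNA string for exact matches on both strands. The function names (`reverse_complement`, `find_sites`) and the example sequence are hypothetical and not part of this record; real binding-site prediction would normally use a position weight matrix rather than exact string matching.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq: str, motif: str = "GAGAG") -> dict:
    """Find exact, possibly overlapping motif matches on both strands.

    Positions are 0-based start offsets in the coordinates of `seq`.
    A "-" hit is where the reverse complement of the motif occurs in `seq`.
    """
    seq = seq.upper()
    hits = {"+": [], "-": []}
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits[strand].append(start)
            # Advance by one base so that overlapping matches are reported.
            start = seq.find(pattern, start + 1)
    return hits


# Made-up example sequence with one site on each strand.
example = "TTGAGAGCCCTCTCAA"
sites = find_sites(example)
print(sites)
```

Exact matching is used here only because the description gives a single five-base consensus; it says nothing about tolerated mismatches, so none are assumed.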
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 14 0.11 1.3e+02 2.4 0.1 11 44 181 215 175 220 0.82
2 14 0.0012 1.4 8.8 0.1 21 44 220 243 216 247 0.91
3 14 0.0037 4.2 7.2 0.1 21 44 248 271 244 275 0.90
4 14 0.0007 0.79 9.5 0.1 21 46 276 301 271 307 0.86
5 14 0.05 56 3.6 0.2 21 44 304 327 300 331 0.87
6 14 0.0035 4 7.3 0.5 21 44 332 355 327 360 0.90
7 14 0.0016 1.8 8.3 0.4 21 46 360 385 355 391 0.86
8 14 0.031 35 4.2 0.1 21 44 388 411 384 415 0.88
9 14 0.011 13 5.6 0.1 21 45 416 440 411 444 0.89
10 14 0.025 28 4.5 0.3 21 48 444 471 440 477 0.84
11 14 2.1 2.4e+03 -1.6 0.1 20 33 471 484 467 503 0.70
12 14 0.48 5.4e+02 0.4 0.1 21 44 500 523 489 527 0.84
13 14 0.0064 7.2 6.4 0.0 11 48 517 555 514 561 0.79
14 14 0.0033 3.7 7.3 0.2 21 45 584 608 574 615 0.85
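The rows above are whitespace-delimited with 13 fields each, so they can be read into typed records. This is a minimal sketch, assuming that column order; the names `DomainHit` and `parse_row` and the i-Evalue cutoff of 10 are hypothetical choices for illustration, not thresholds used by this database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int       # "#": ordinal of this domain hit
    total: int       # "of": total number of domain hits
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float     # bit score
    bias: float      # bias correction
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float       # mean posterior probability of aligned residues


def parse_row(line: str) -> DomainHit:
    """Parse one whitespace-delimited domain row into a DomainHit."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    return DomainHit(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]), float(fields[4]), float(fields[5]),
        *(int(x) for x in fields[6:12]),
        float(fields[12]),
    )


# Three rows copied from the table above.
ROWS = """\
1 14 0.11 1.3e+02 2.4 0.1 11 44 181 215 175 220 0.82
2 14 0.0012 1.4 8.8 0.1 21 44 220 243 216 247 0.91
4 14 0.0007 0.79 9.5 0.1 21 46 276 301 271 307 0.86
"""

hits = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = min(hits, key=lambda h: h.i_evalue)
# Illustrative cutoff only; pick a threshold suited to the analysis.
below_cutoff = [h.index for h in hits if h.i_evalue < 10]
print(best.index, best.score, below_cutoff)
```

Sorting or filtering on `i_evalue` rather than `c_evalue` is a design choice here; either column can be substituted in the `key` function.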

Sequence Information

Coding Sequence
ATGGATCAATCCGAATTCAAATGCACATCCGAAGGAGGACTTAGCGCTCCTGAAGACAGGATCCCAGAGATTTCTATAGGCAGAATAAGTGTGAAGGATCCACAGGACATTATCCCGGAGGTTGGTAACACTATATTCATAAAGTGTGAGTTAACTGAACCGGAAGCAGAATCTGCATTTGACACGGAATTTGCCTCTCATACACCTTGGCATGTGAAAACTGAAATTGGAACTTCtccaagtgaagaaaaaaagtctgtTGAACACTTCAATAGCGCCTCGTGTTCCTCCAATTCTCTCATCGGCCATTCCATCAGGATAAAATCGGAAGGAgacaatgaaaatgaaaattggatCGACCAGTTACATACCAGTATTGAGATTAAAAACGAAGACCCTCTTAAGACTGACAGCTATGAATGCACTTACGTAAAGAAAGATAAAgaacaagattttgaaaatgagattCAAAATCATATGCCAACCACAACATGCAAACCGTTCAGTTGCAGTCGTTGCTCGGCCCGTTTTGCTATAAAAGGTACTTTAAAAACACACATGCGAacacataccggtgagaaaccatttgCTTGCAGAAATTGCTCGGCTTGTTTCgctataaaaggaaatttaatacaacacatgcgaacacataccggtgagaaaccatttacTTGCAGCGAGTGCTCGGCCCCTTTTGCTCAGAAACAACATTTAAAACGCCACATGCGAacacataccggtgagaaaccatttacTTGCAGCCAGTGCTCGGCTTGTTTCGCTCAAAAAGGAGATTTAAAACAACACATGCGAacacataccggtgagaaaccatttacTTGCAGCGAGTGCTCGGCCCCTTTTGCTCAGAAACAACATTTAAAACGCCACATGCGAacacataccggtgagaaaccatttacTTGCAGCCAGTGCTCGGCTTGTTTtgctctaaaagaaaatttaatacacCACATACGAACACAtaccggtgaaaaaccatttACTTGCAGCCATTGCTCGGCTTGTTTCCCTCTAAAAAGAACTTTACAACGGCACGTGCGAACccataccggtgagaaaccattcactTGCAGCCAGTGCTCGGCTTGTTTCCCTCTAAAAAGAACTTTACAACGGCACATGCGAacacataccggtgagaaaccattcactTGCAGCCATTGCTCGGCTTGTTTCGCTAGAAAAGGAGATTTAAAACAACACATGCGAacacatactggtgagaaaccattcactTGCAGCCATTGCTCGGCTTGTTTCGCTAGAAAAGGAGATTTAAAACAACACTTGCGAacacatactggtgagaaaccattcactTGCAGCCATTGCTCGGACTGCTTCGCTCTAAACGCAAATTTAATACGGCACATGCGAACACATACcagtgagaaaccattcagatGCAGCCAATGCTCGGTCTGTTTCCCTCGAAAAGGAACGTTATTAAATCACATGCGAacacataccggtgagaaaccattcagatGCAGCCATTGCTCGGTCTGCTTCACTGAAAAAGGAACTTTAAAACGGCACATGCGAACACATACCGGTGAGAAGCCATTTACTTGCAGCCATTGCTCGGCTTCTTTCGctctaaaaggaaatttaataaaacaCATGCGAACACATACTGGTGGCAAACCATTTATTTGCAGCCATTGCTCGGCTTGTTTcgctctaaaagaaaatttaataccacacatgcgaacacataccggtgagaaaccatttacTTGCAGCCATTGCTCGGTCTGCTTCACTCGAAAAGGAAATTTAATACGACACCTGCGAACACATACTGTTGAGGAACCATTAATTACTTGTAGCTATCGCCCTGCACGTTTTGCTGAGTGA
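A quick consistency check between the coding sequence and the protein sequence in this record is to translate the CDS with the standard genetic code. The sketch below builds the codon table from its usual compact form and translates the first 24 and last 15 nucleotides of the sequence above; the function name `translate` is hypothetical, and the use of the standard code (rather than an alternative table) is an assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame CDS, stopping at the first stop codon.

    Lower-case (for example soft-masked) bases are upper-cased first.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First 24 and last 15 nucleotides of the coding sequence above.
cds_start = "ATGGATCAATCCGAATTCAAATGC"
cds_end = "CGTTTTGCTGAGTGA"
print(translate(cds_start), translate(cds_end))
```

Passing the full coding sequence to `translate` and comparing the result with the listed protein would extend the same check to the whole record.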
Protein Sequence
MDQSEFKCTSEGGLSAPEDRIPEISIGRISVKDPQDIIPEVGNTIFIKCELTEPEAESAFDTEFASHTPWHVKTEIGTSPSEEKKSVEHFNSASCSSNSLIGHSIRIKSEGDNENENWIDQLHTSIEIKNEDPLKTDSYECTYVKKDKEQDFENEIQNHMPTTTCKPFSCSRCSARFAIKGTLKTHMRTHTGEKPFACRNCSACFAIKGNLIQHMRTHTGEKPFTCSECSAPFAQKQHLKRHMRTHTGEKPFTCSQCSACFAQKGDLKQHMRTHTGEKPFTCSECSAPFAQKQHLKRHMRTHTGEKPFTCSQCSACFALKENLIHHIRTHTGEKPFTCSHCSACFPLKRTLQRHVRTHTGEKPFTCSQCSACFPLKRTLQRHMRTHTGEKPFTCSHCSACFARKGDLKQHMRTHTGEKPFTCSHCSACFARKGDLKQHLRTHTGEKPFTCSHCSDCFALNANLIRHMRTHTSEKPFRCSQCSVCFPRKGTLLNHMRTHTGEKPFRCSHCSVCFTEKGTLKRHMRTHTGEKPFTCSHCSASFALKGNLIKHMRTHTGGKPFICSHCSACFALKENLIPHMRTHTGEKPFTCSHCSVCFTRKGNLIRHLRTHTVEEPLITCSYRPARFAE
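The repeated Cys/His spacing visible in the protein sequence can be located with a regular expression. This is a minimal sketch, assuming the classical C2H2 spacing C-x2-C-x12-H-x3-H; that pattern, the name `find_fingers`, and the choice of fragment are illustrative assumptions and do not replace the HMM-based domain assignment reported in this record.

```python
import re

# Classical C2H2 spacing: Cys, 2 residues, Cys, 12 residues, His, 3 residues, His.
C2H2_PATTERN = re.compile(r"C.{2}C.{12}H.{3}H")


def find_fingers(protein: str) -> list:
    """Return (start, end, matched_text) for non-overlapping pattern matches.

    Coordinates are 1-based and inclusive, relative to the input string.
    """
    return [
        (match.start() + 1, match.end(), match.group())
        for match in C2H2_PATTERN.finditer(protein.upper())
    ]


# A 59-residue fragment copied from the protein sequence above.
fragment = "KPFSCSRCSARFAIKGTLKTHMRTHTGEKPFACRNCSACFAIKGNLIQHMRTHTGEKPF"
fingers = find_fingers(fragment)
for start, end, text in fingers:
    print(start, end, text)
```

Because `finditer` reports non-overlapping matches from left to right, repeats with a different spacing are skipped silently; loosening the quantifiers (for example `.{2,4}`) would be one way to catch them.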

Similar Transcription Factors

Clustering by sequence similarity using MMseqs2

100% Identity
iTF_00201881;
90% Identity
iTF_00201881;
80% Identity
iTF_00201881;