Aara002620.1
Basic Information
- Insect
- Anopheles arabiensis
- Gene Symbol
- -
- Assembly
- GCA_016920715.1
- Location
- NC:108863991-108868897[+]
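The location string above can be split into its parts with a short parser. This is a minimal sketch that assumes the `seqid:start-end[strand]` layout seen in this record and 1-based inclusive coordinates. The record does not state its coordinate convention, and the helper name `parse_location` is hypothetical.

```python
import re

# Assumed layout: <seqid>:<start>-<end>[<strand>], as it appears in this record.
LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> dict:
    """Split a location string into seqid, start, end, strand and span length."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "seqid": match["seqid"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        # Assumption: coordinates are 1-based and inclusive.
        "length": end - start + 1,
    }


if __name__ == "__main__":
    print(parse_location("NC:108863991-108868897[+]"))
```

Under the inclusive-coordinate assumption, the locus spans 4907 bp on the plus strand.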
Transcription Factor Domain
- TF Family
- zf-GAGA
- Domain
- zf-GAGA domain
- PFAM
- PF09237
- TF Group
- Zinc-Coordinating Group
- Description
- Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
1   13         3   9.3e+03   -3.7   0.5         9      26        78      95        77      97  0.79
2   13       2.2   6.9e+03   -3.3   0.4        15      35       105     124        99     134  0.53
3   13       1.4   4.4e+03   -2.6   0.1        27      46       212     231       209     237  0.84
4   13      0.36   1.1e+03   -0.7   0.0        26      32       371     377       365     388  0.74
5   13      0.32   9.7e+02   -0.5   0.0        26      48       437     459       432     464  0.82
6   13    0.0033        10    5.8   0.0        21      46       460     485       455     491  0.85
7   13   0.00015      0.45   10.2   0.0        21      45       488     512       484     519  0.87
8   13     0.004        12    5.6   0.1        21      45       516     540       512     545  0.89
9   13      0.12   3.8e+02    0.8   0.0        21      48       568     595       564     600  0.83
10  13   0.00016      0.48   10.1   0.1        21      52       597     628       587     629  0.82
11  13     0.026        79    3.0   0.0        22      45       654     677       651     684  0.84
12  13    0.0015       4.6    6.9   0.2        21      48       681     708       677     712  0.86
13  13   0.00025      0.77    9.4   0.0        21      45       709     733       706     739  0.91
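The per-domain hmmscan rows above can be loaded into records and filtered by independent E-value. This is a minimal sketch: the field names, the helpers `parse_rows` and `filter_hits`, and the cutoff of 1.0 are illustrative assumptions, not thresholds used by the database.

```python
# Rows copied from the hmmscan output of this record (13 values per row).
DOMAIN_ROWS = """\
1 13 3 9.3e+03 -3.7 0.5 9 26 78 95 77 97 0.79
2 13 2.2 6.9e+03 -3.3 0.4 15 35 105 124 99 134 0.53
3 13 1.4 4.4e+03 -2.6 0.1 27 46 212 231 209 237 0.84
4 13 0.36 1.1e+03 -0.7 0.0 26 32 371 377 365 388 0.74
5 13 0.32 9.7e+02 -0.5 0.0 26 48 437 459 432 464 0.82
6 13 0.0033 10 5.8 0.0 21 46 460 485 455 491 0.85
7 13 0.00015 0.45 10.2 0.0 21 45 488 512 484 519 0.87
8 13 0.004 12 5.6 0.1 21 45 516 540 512 545 0.89
9 13 0.12 3.8e+02 0.8 0.0 21 48 568 595 564 600 0.83
10 13 0.00016 0.48 10.1 0.1 21 52 597 628 587 629 0.82
11 13 0.026 79 3.0 0.0 22 45 654 677 651 684 0.84
12 13 0.0015 4.6 6.9 0.2 21 48 681 708 677 712 0.86
13 13 0.00025 0.77 9.4 0.0 21 45 709 733 706 739 0.91
"""

# Hypothetical field names mirroring the column headers of the table.
FIELDS = (
    "index", "total", "c_evalue", "i_evalue", "score", "bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to", "acc",
)
INT_FIELDS = {
    "index", "total", "hmm_from", "hmm_to",
    "ali_from", "ali_to", "env_from", "env_to",
}


def parse_rows(text: str) -> list[dict]:
    """Turn whitespace-separated domain rows into one dict per domain hit."""
    hits = []
    for line in text.strip().splitlines():
        values = line.split()
        if len(values) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
        hits.append({
            name: int(value) if name in INT_FIELDS else float(value)
            for name, value in zip(FIELDS, values)
        })
    return hits


def filter_hits(hits: list[dict], max_i_evalue: float = 1.0) -> list[dict]:
    """Keep hits whose independent E-value is at or below the cutoff."""
    return [hit for hit in hits if hit["i_evalue"] <= max_i_evalue]


if __name__ == "__main__":
    for hit in filter_hits(parse_rows(DOMAIN_ROWS)):
        print(hit["index"], hit["ali_from"], hit["ali_to"], hit["score"])
```

With this illustrative cutoff, only domains 7, 10 and 13 are retained; a different threshold would select a different subset.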
Sequence Information
- Coding Sequence
- ATGCTGGGCAAGTATGGCGAGGCGAGTTCCGGCTCGATGAAAAACCTGGAAAATGGACTGCGCGACCGGGTGCTTTACCATCAGAATATGCTGAGCATGAACCTGACAACCGAGTTTTACCACACTCCATTCCTTACGGCCGGCAGTGGTAGCAACGTTGAAACAGCTGCAATCGTAGTAGGCGCTCCCGCGCTGGCACAGTCGCAAAAGATGCTCAACAGTCTCGCATCCCCCTCTGGATtagcctcctcctcctcttccagCACCTCGCAACAGCACGCCACTAATGGGGCAGCGCCGTCACAGACAGTCCCAAACAGTGTGTCCGCAGCGCAGCAGTATCAGTGTCAACTCTGCCAGAAGTGTTTCATCTCGAGCGCGTTACTGGTGCAGCATATGAAAACGCACGACAATAATGGGTTTATTGCGGGGCGTGAGACCGCCGGCAGCTAcggccaacagcagcagcagcagcagcagcaatcggtATCGTCGGCCTCTTCCCCCGTGACGGCAGTGGCATCGAGCAGCAAAACTAGTCCGCCAGTCGCAACGCAGCACATCATCAAGAGCGAGTACATCGGTGGAACCGTAACGAACCAGTATGCGTACGGTATGGCCAAGCAGTTCGAGTGTCACATCTGCCACAAGTCCTTCATGACGATGGTGAACCTGAACCTGCACATGAAGATACACGAAAGTGCAATAAAGCCGCTCGCAGCGACTCACATGTACGCGGGCCAGGCAGGCCTTGCCGGCAACTTGATTGCTGGATCGACCGCGGGCTACCATCACGGCGGGCAGCATCACAGTCATCTGTCGCAGCATCTCGCGGTTCAAAGCGCCAGCAGCAGTGCGGACGGGTTGTGCCAGATATGCCACAAAACGTTCAGCACGGCCGACCAGTTTGCGGCGCACATGAAGATCCACGAGAATGAGTTTAAAAATCGAGCGCTATATCATTCCGGTAGTGCGAACGATGGTGGCGAGGGAGCGCCAGCACCCACCAGTAATGGGGCTGGTGTCGGAGACCATTTCTACACCGCGTCACAGCCGCCACACCACCATCTGGGTCATGCACCGCCGCATCTCGATGCTAGCAAGGGCCATCGCTGCCCAATCTGTCACAAAATGTCCAACAACATCATCGAGCACATTAAACAGCACGAGGGTCAGCTGCAGGGCACGTCCGATGGGTCGGGTGCGGGGTATCATCCTATGAACGATGATTCCCAATCGTCGCTGGAAGAAGACAgcgacggtggcggtggtggtggcgctggCGGGGACGAGAGCCTGCGAAAGCACGAGTGCTTGATATGCCACAAAAAGTTCTCCAGCTCGGGCAACCTGGCGATACACATACGGGTGCATTCGGGCGAAAAGCCGTTCCGGTGCAGCGTGTGCGGCAAGGGCTTCATCCAGTCGAACAATCTGGCGACGCACATGAAGACGCACACGGGTGAAAAGCCGTACGCGTGCACCATCTGCGGCAAGAACTTTAGCCAGTCGAACAACCTGAAGACGCACATCCGGACGCACACGGGCGAAAAGCCGTACGCGTGTACAATCTGCGGCAAGCGCTTCAACCAGAAGAACAACCTGACCACGCACATGCGCACACACCAGCTGGTGTGCATGGTGTGCGGGGTGCAGTTCATGCATCCGATCGATCTGGCGACCCACATGAAGTTCCACAACGACGAAAAACCCTACATCTGCTCGGTGTGCAACAAGGTCTACCTGAACCTGGACGAGCTGACCGAGCACATGAAGAAGACGCACAACCAGGTAAAGCCGTATCGGTGCCACATCTGCGACAAAACCTTTACCCAATCCAACAACCTGAAGACCCACATCAAGACGCACATCTTTCAGGATCCGTACAAGTGTCAGGTGTGTTCGCGCTCCTTCCAGAAGGAGGATGACTTCTCGCAGCACATGCTGGTCCACACGGCCGACAAGCCGTACGAGTGTACGTACTGCGGCAAGCGGTTC
ATCCAGTCGAACAACCTGAAAACGCACGTGCGCACGCACACGGGCGAAAAGCCGTACCGGTGCACGATCTGCGCGAAGCACTTTAACCAGAAGAACAATCTCAACACGCACATGCGCATCCACACGGGGGAGAAACCGTTCGAGTGTACCATCTGCGACAAGCGCTTTAATCAGTCGAACAATTTGaacaaacacattaaaacacaCGGCCAGGAGaaggatcagcagcagcagcagtcccagcagcagcagcagcagcagaaccaacaacaacaacagcaagtccagcaacaacagcagcagcagcagcaacatgcgAGTTGA
- Protein Sequence
- MLGKYGEASSGSMKNLENGLRDRVLYHQNMLSMNLTTEFYHTPFLTAGSGSNVETAAIVVGAPALAQSQKMLNSLASPSGLASSSSSSTSQQHATNGAAPSQTVPNSVSAAQQYQCQLCQKCFISSALLVQHMKTHDNNGFIAGRETAGSYGQQQQQQQQQSVSSASSPVTAVASSSKTSPPVATQHIIKSEYIGGTVTNQYAYGMAKQFECHICHKSFMTMVNLNLHMKIHESAIKPLAATHMYAGQAGLAGNLIAGSTAGYHHGGQHHSHLSQHLAVQSASSSADGLCQICHKTFSTADQFAAHMKIHENEFKNRALYHSGSANDGGEGAPAPTSNGAGVGDHFYTASQPPHHHLGHAPPHLDASKGHRCPICHKMSNNIIEHIKQHEGQLQGTSDGSGAGYHPMNDDSQSSLEEDSDGGGGGGAGGDESLRKHECLICHKKFSSSGNLAIHIRVHSGEKPFRCSVCGKGFIQSNNLATHMKTHTGEKPYACTICGKNFSQSNNLKTHIRTHTGEKPYACTICGKRFNQKNNLTTHMRTHQLVCMVCGVQFMHPIDLATHMKFHNDEKPYICSVCNKVYLNLDELTEHMKKTHNQVKPYRCHICDKTFTQSNNLKTHIKTHIFQDPYKCQVCSRSFQKEDDFSQHMLVHTADKPYECTYCGKRFIQSNNLKTHVRTHTGEKPYRCTICAKHFNQKNNLNTHMRIHTGEKPFECTICDKRFNQSNNLNKHIKTHGQEKDQQQQQSQQQQQQQNQQQQQQVQQQQQQQQQHAS
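One way to check that the coding and protein sequences above agree is to translate the coding sequence and compare the result. This is a minimal sketch that assumes the standard genetic code and a coding sequence that starts in frame. The helper `translate` is hypothetical, and lowercase bases are treated as ordinary bases.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    # Drop line breaks and other whitespace, and ignore case.
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        # Raises KeyError on ambiguous bases such as N.
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First six codons and last four codons of the coding sequence in this record.
    print(translate("ATGCTGGGCAAGTATGGC"))
    print(translate("catgcgAGTTGA"))
```

Applied to the full coding sequence, with its line break removed, the output can be compared directly against the protein sequence listed in this record.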
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00101361;
- 90% Identity
- iTF_00104825;
- 80% Identity
- iTF_00093596;