Adir004540.1
Basic Information
- Insect
- Anopheles dirus
- Gene Symbol
- -
- Assembly
- GCA_000349145.1
- Location
- KB672490.1:7640450-7657549[+]
Transcription Factor Domain
- TF Family
- zf-GAGA
- Domain
- zf-GAGA domain
- PFAM
- PF09237
- TF Group
- Zinc-Coordinating Group
- Description
- Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence [1].
- Hmmscan Out
-
   #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
   1  14       2.6   1.1e+04   -3.9   0.0        13      30        65      82        63      83  0.77
   2  14   0.00093       3.8    7.2   0.0        21      44       101     124        93     133  0.86
   3  14     0.002         8    6.1   0.1        23      47       131     155       128     161  0.83
   4  14    0.0053        21    4.8   0.1        23      48       159     184       155     189  0.85
   5  14    0.0058        24    4.6   0.1        23      48       187     212       183     217  0.85
   6  14    0.0029        12    5.6   0.1        23      52       215     244       211     246  0.87
   7  14     0.014        56    3.4   0.0        23      49       271     297       266     302  0.85
   8  14      0.45   1.8e+03   -1.4   0.0        24      48       300     324       297     330  0.82
   9  14      0.47   1.9e+03   -1.5   0.0        23      49       327     353       324     358  0.80
  10  14     0.068   2.8e+02    1.2   0.0        23      52       355     384       351     386  0.86
  11  14     0.019        77    3.0   0.0        23      48       383     408       379     412  0.84
  12  14     0.019        77    3.0   0.0        21      45       409     432       404     437  0.88
  13  14   8.8e-05      0.36   10.5   0.1        23      44       439     460       437     466  0.91
  14  14       2.7   1.1e+04   -3.9   0.2        33      45       686     698       685     703  0.83
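The hmmscan domain hits above can be read programmatically, for example to keep only the better-supported zf-GAGA matches. The following is a minimal sketch, not part of the database record: it assumes each hit is a row of 13 whitespace-separated fields in the order shown in the header, and the `DomainHit` type, the `parse_hits` and `filter_hits` helpers, and the i-Evalue cutoff of 10 are all hypothetical choices made for illustration.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One per-domain hmmscan row (hypothetical container, subset of columns)."""
    index: int
    c_evalue: float
    i_evalue: float
    score: float
    ali_from: int
    ali_to: int


# The 14 rows copied from the Hmmscan Out field of this record.
ROWS = """\
1 14 2.6 1.1e+04 -3.9 0.0 13 30 65 82 63 83 0.77
2 14 0.00093 3.8 7.2 0.0 21 44 101 124 93 133 0.86
3 14 0.002 8 6.1 0.1 23 47 131 155 128 161 0.83
4 14 0.0053 21 4.8 0.1 23 48 159 184 155 189 0.85
5 14 0.0058 24 4.6 0.1 23 48 187 212 183 217 0.85
6 14 0.0029 12 5.6 0.1 23 52 215 244 211 246 0.87
7 14 0.014 56 3.4 0.0 23 49 271 297 266 302 0.85
8 14 0.45 1.8e+03 -1.4 0.0 24 48 300 324 297 330 0.82
9 14 0.47 1.9e+03 -1.5 0.0 23 49 327 353 324 358 0.80
10 14 0.068 2.8e+02 1.2 0.0 23 52 355 384 351 386 0.86
11 14 0.019 77 3.0 0.0 23 48 383 408 379 412 0.84
12 14 0.019 77 3.0 0.0 21 45 409 432 404 437 0.88
13 14 8.8e-05 0.36 10.5 0.1 23 44 439 460 437 466 0.91
14 14 2.7 1.1e+04 -3.9 0.2 33 45 686 698 685 703 0.83
"""


def parse_hits(text: str) -> List[DomainHit]:
    """Parse rows of 13 whitespace-separated fields; skip anything else."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 13:
            continue
        hits.append(DomainHit(int(f[0]), float(f[2]), float(f[3]),
                              float(f[4]), int(f[8]), int(f[9])))
    return hits


def filter_hits(hits: List[DomainHit], max_i_evalue: float = 10.0) -> List[DomainHit]:
    """Keep hits whose independent E-value is at or below the (assumed) cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


hits = parse_hits(ROWS)
kept = filter_hits(hits)
for h in kept:
    print(h.index, h.i_evalue, h.score, f"{h.ali_from}-{h.ali_to}")
```

The cutoff is only a placeholder; a stricter or looser threshold may be more appropriate depending on how the hits are used downstream.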
Sequence Information
- Coding Sequence
- ATGTTCTACCCGCTGCAAACTCAGTTCGAGGATGGCGTAATTGAGCTGACACGTTCGGAAATGCTCCAGCAGGATCCATTGGCGGACAATAAGCACTTCGTCGTATCCCTTTCGCTAGGAAATACGCTCATCAATCTCAACAAAATCAAGTGTCCTCAATGTCGGAAGCGCTTTGATACGATGGAGGAAATGGAACAGCACCGGACCAAGCATTTGACAGAGAACAAGTTTAAGTGCGAAATTTGCAGCAAGGAGTTTCCCAGCCATAGTTCCATGTGGAAGCACACCAAGGCGCACACCGGCGAACGTCCTTTCGTGTGCCAGATATGCAACAAAGGCTTCACCCAACTGGCCAACCTCCAACGACATGATCTCGTCCACAACGGACTTAAACCCTTCAAATGTCCAATCTGTGAGAAATGTTTCACTCAACAAGCGAACATGCTGAAGCATCAACTACTGCATACTGGACTTAAACCGTACAAATGTCCCGTGTGCGAGAAAGCGTTCTCGCAACATGCAAACATGGTCAAACATCAAATGCTTCACACAGGTTTGAAGCCCTACAAGTGTCCCGTGTGCGAAAAGGCATTTACGCAGCACGCCAACATGATCAAGCATCAAATGCTCCATACTGGTCTAAAGCCATATAAATGTCCTGTTTGTGAGAAGGCTTTCACCCAGCAGGCCAACATGGTGAAACATCAAATGTTGCACACCGGCGTGAAACCGTACAAATGTTCCACCTGTGGAAAGGCTTTTGCCCAGCAGGCCAACATGGTCAAACACGAGATGCTTCATACCGGTATCAAACCGTACAAGTGTCCCACCTGTGACAAAGCATTTGCCCAGCAAGCAAACATGATGAAACATCAAATGTTGCATACGGGATTGAAGCCGTACAAATGCGGTACATGCGACAAAGCGTTTGCCCAGCAGGCCAATATGGTCAAACATCAGATGCTTCATACGGGTTTAAAACCGTACAAATGCAATACCTGTGGCAAGGCATTCGCACAGCAGGCCAACATGGTCAAACACGAGATGCTTCATACCGGAATCAAACCCTACAAATGTTCCGTTTGCGATAAAGCCTTTGCCCAGCAGGCCAACATGGTCAAACATCAGATGCTCCACAGCGGAATCAAACCGTACAAGTGCCCCACTTGCGATAAAGCATTTGCCCAGCAGGCAAACATGGTTAAGCATCAGATGCTCCATACGGGAGAAAAACCGTTCAAATGCAAAAGCTGTGATAAGGCTTTCTCGCAAAATGCCAACCTGAAGAAGCACGAAATGGTACACCTCGGCATCCGGCCGCACACCTGTCCGCTGTGTACGAAGTCCTACTCGCAGTATTCGAACCTGAAGAAGCATTTGCTCAGCCACCAGAAGCAAGCGATcaagcaggagcagcagaacGGGCAGGTGATGGCTATTCTCTACAGCTGCCAGTCGTGCAAGATGCAGTTTGAGGATATCATCGAGTTCGAACGCCACACCAGACAGTGCGGCATCAACAgcgtgcaacagcagcagcagcagcaacaccagcagcaccagcaacagcagcagcagcaacagcaacagcacagcGTGAAGCTGGAGAACATCAAGAGCGAAATCGACATCGACGGAAGTTCCAGTTCGGGCATGCAGCAGCAcatcgtcaccaccaccaccaacggcacCATCACGAACGGTGGCGGCGTCAGCCAGTCGCAAACGCCGACCCCGATGCACATCCCGTCGGCCATCCTCACCTCCGTCATCTCGTCGTCGGTCGGGGCGAACGTCGTAACGCCGCACAACCTGGTCGTCCCGtccgcacactcacaccacgGGCACATTACGTCGAACGGGGGCACCATCCTGGTCGGAGGCATCCCGACCGGACACCCCCActcccaacagcagcacacccCACCGGGCGTTGGGTtgtcacagcagcagcattctcACGcccaacagcaccaacagcaggctcaccagcagcag
cagcagcagcaacagcaacagctgtCCCCGCTCGgcacacaccaacagcatcagctggtgctgcagcagcagcacaacctGCCGAtccatctgcagcagcagctgtcgcACCACCTGATCAGCTCCCACCTGCCCCACCCGCAGGATCACGGTCCGGGCGggacgggcggcggcggcggcgccgaACTGCACCATCAGGTGAACTTCCACCATCCGCACATCTCGCACCTGCCGAACATCTCGCACAAGATCCTGTCGCCGCTGTTCCACATTCCGCCcttcaacaacaaccaccacagcACATAA
- Protein Sequence
- MFYPLQTQFEDGVIELTRSEMLQQDPLADNKHFVVSLSLGNTLINLNKIKCPQCRKRFDTMEEMEQHRTKHLTENKFKCEICSKEFPSHSSMWKHTKAHTGERPFVCQICNKGFTQLANLQRHDLVHNGLKPFKCPICEKCFTQQANMLKHQLLHTGLKPYKCPVCEKAFSQHANMVKHQMLHTGLKPYKCPVCEKAFTQHANMIKHQMLHTGLKPYKCPVCEKAFTQQANMVKHQMLHTGVKPYKCSTCGKAFAQQANMVKHEMLHTGIKPYKCPTCDKAFAQQANMMKHQMLHTGLKPYKCGTCDKAFAQQANMVKHQMLHTGLKPYKCNTCGKAFAQQANMVKHEMLHTGIKPYKCSVCDKAFAQQANMVKHQMLHSGIKPYKCPTCDKAFAQQANMVKHQMLHTGEKPFKCKSCDKAFSQNANLKKHEMVHLGIRPHTCPLCTKSYSQYSNLKKHLLSHQKQAIKQEQQNGQVMAILYSCQSCKMQFEDIIEFERHTRQCGINSVQQQQQQQHQQHQQQQQQQQQQHSVKLENIKSEIDIDGSSSSGMQQHIVTTTTNGTITNGGGVSQSQTPTPMHIPSAILTSVISSSVGANVVTPHNLVVPSAHSHHGHITSNGGTILVGGIPTGHPHSQQQHTPPGVGLSQQQHSHAQQHQQQAHQQQQQQQQQQLSPLGTHQQHQLVLQQQHNLPIHLQQQLSHHLISSHLPHPQDHGPGGTGGGGGAELHHQVNFHHPHISHLPNISHKILSPLFHIPPFNNNHHST
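One way to sanity-check that the coding sequence and the protein sequence above belong together is to translate the former and compare. The sketch below assumes the standard genetic code (NCBI translation table 1), treats lowercase bases the same as uppercase, and ignores any line breaks inside the sequence; the `translate` helper is hypothetical, and only short fragments copied from the start and end of the record are used here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon.

    Whitespace is removed and case is ignored; unknown codons become 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 12 codons of the Coding Sequence field and the final 5 codons
# (including the stop codon), copied from this record.
cds_start = "ATGTTCTACCCGCTGCAAACTCAGTTCGAGGATGGC"
cds_end = "caccacagcACATAA"

print(translate(cds_start))
print(translate(cds_end))
```

Passing the full coding sequence to `translate` and comparing the result with the Protein Sequence field would extend the same check to the whole record.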
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00102068
- 90% Identity
- iTF_00099138
- 80% Identity
- iTF_00099138