Wsmi002508.1
Basic Information
- Insect
- Wyeomyia smithii
- Gene Symbol
- -
- Assembly
- GCA_029784165.1
- Location
- CM056644.1:27773371-27779250[+]
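The Location string above packs contig, coordinates, and strand into one token. A minimal sketch of splitting it apart follows. The names `Locus`, `parse_location`, and `span_length` are hypothetical helpers, not part of any database API, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical container for one 'contig:start-end[strand]' location."""
    contig: str
    start: int
    end: int
    strand: str


def parse_location(text: str) -> Locus:
    """Split a location such as 'CM056644.1:27773371-27779250[+]' into fields."""
    match = re.fullmatch(r"(\S+):(\d+)-(\d+)\[([+-])\]", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    contig, start, end, strand = match.groups()
    return Locus(contig, int(start), int(end), strand)


def span_length(locus: Locus) -> int:
    """Genomic span in bp, ASSUMING 1-based inclusive coordinates."""
    return locus.end - locus.start + 1


locus = parse_location("CM056644.1:27773371-27779250[+]")
print(locus.contig, locus.start, locus.end, locus.strand, span_length(locus))
```

The span covers the whole locus on the contig, so it need not equal the length of the coding sequence listed further down.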
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The finger follows the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in brackets indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
1   40   0.00035     0.038   16.1   4.7         1      23        52      74        52      74  0.97
2   40   0.00073     0.079   15.1   0.2         1      23        80     102        80     102  0.98
3   40   0.00021     0.023   16.8   1.4         1      23       112     134       112     134  0.97
4   40     0.013       1.4   11.2   5.3         3      23       144     164       142     164  0.98
5   40   9.9e-07   0.00011   24.2   3.1         1      23       170     192       170     193  0.95
6   40      0.43        47    6.4   2.2         1      23       201     223       201     223  0.98
7   40   4.7e-06   0.00052   22.0   0.6         1      23       240     262       240     262  0.99
8   40   0.00069     0.076   15.2   1.7         1      23       268     290       268     290  0.97
9   40     4e-06   0.00044   22.2   0.7         1      23       296     318       296     318  0.98
10  40     0.079       8.6    8.7   1.0         1      21       328     348       328     352  0.88
11  40   6.8e-05    0.0074   18.4   1.2         3      23       360     380       358     380  0.98
12  40   0.00067     0.073   15.3   2.5         1      23       386     409       386     409  0.96
13  40    0.0034      0.37   13.0   2.3         1      23       417     439       417     439  0.96
14  40   0.00039     0.042   16.0   0.7         1      23       456     478       456     478  0.98
15  40    0.0019      0.21   13.8   3.5         1      23       484     506       484     506  0.97
16  40   0.00085     0.093   14.9   0.5         1      23       512     534       512     534  0.98
17  40   0.00012     0.013   17.6   3.0         1      23       544     566       544     566  0.97
18  40       5.6   6.1e+02    2.9   5.1         3      23       571     591       569     591  0.98
19  40   2.3e-06   0.00025   23.0   2.0         1      23       597     620       597     620  0.97
20  40    0.0019      0.21   13.8   2.4         1      23       628     650       628     650  0.98
21  40   0.00011     0.012   17.8   1.3         1      23       667     689       667     689  0.98
22  40    0.0058      0.64   12.3   1.3         1      23       695     717       695     717  0.97
23  40   0.00085     0.093   14.9   0.5         1      23       723     745       723     745  0.98
24  40   0.00012     0.013   17.6   3.0         1      23       755     777       755     777  0.97
25  40       5.2   5.7e+02    3.0   5.5         3      23       782     802       780     802  0.98
26  40   1.4e-05    0.0015   20.5   2.7         1      23       808     831       808     831  0.95
27  40   0.00064      0.07   15.3   0.7         2      23       858     879       857     879  0.96
28  40    0.0055       0.6   12.4   2.8         1      23       885     907       885     907  0.96
29  40     1e-05    0.0011   20.9   2.8         1      23       913     935       913     935  0.99
30  40    0.0026      0.29   13.4   3.3         1      23       941     963       941     963  0.97
31  40   2.7e-06    0.0003   22.8   0.9         1      23       969     991       969     991  0.96
32  40   8.8e-06   0.00096   21.2   2.1         1      23       999    1021       999    1021  0.98
33  40     8e-05    0.0087   18.2   2.6         1      23      1027    1049      1027    1049  0.98
34  40   7.3e-05     0.008   18.3   1.7         1      20      1057    1076      1057    1079  0.92
35  40    0.0061      0.66   12.2   0.9         1      23      1085    1108      1085    1108  0.95
36  40     0.029       3.1   10.1   0.9         1      23      1115    1137      1115    1137  0.97
37  40   0.00014     0.015   17.4   2.1         1      23      1143    1165      1143    1165  0.97
38  40   1.1e-05    0.0012   20.9   6.5         1      23      1171    1193      1171    1193  0.99
39  40   0.00014     0.015   17.4   0.7         1      23      1201    1223      1201    1223  0.99
40  40   0.00015     0.016   17.3   0.9         1      23      1231    1253      1231    1253  0.99
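The per-domain hmmscan values above have 13 fields per hit, so they can be read back into records and filtered. The sketch below does this for three of the hits (numbers 4, 5, and 6, copied from this record). The `DomainHit` type, the `parse_hits` helper, and the 0.01 i-Evalue cutoff are illustrative assumptions, not settings taken from the database.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One per-domain hit; field names are hypothetical labels for the 13 columns."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hits(flat: str) -> List[DomainHit]:
    """Group whitespace-separated values into 13-field hits (header row excluded)."""
    tokens = flat.split()
    if len(tokens) % 13 != 0:
        raise ValueError("expected a multiple of 13 values")
    hits = []
    for i in range(0, len(tokens), 13):
        t = tokens[i:i + 13]
        hits.append(DomainHit(
            int(t[0]), int(t[1]),
            float(t[2]), float(t[3]), float(t[4]), float(t[5]),
            int(t[6]), int(t[7]), int(t[8]), int(t[9]), int(t[10]), int(t[11]),
            float(t[12]),
        ))
    return hits


# Hits 4, 5 and 6 as listed in this record.
sample = (
    "4 40 0.013 1.4 11.2 5.3 3 23 144 164 142 164 0.98 "
    "5 40 9.9e-07 0.00011 24.2 3.1 1 23 170 192 170 193 0.95 "
    "6 40 0.43 47 6.4 2.2 1 23 201 223 201 223 0.98"
)
hits = parse_hits(sample)

# ASSUMED cutoff for illustration only; choose a threshold suited to the analysis.
I_EVALUE_CUTOFF = 0.01
confident = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]
for h in confident:
    print(h.index, h.ali_from, h.ali_to, h.i_evalue)
```

With this cutoff only hit 5 (residues 170-192) is retained from the three sample rows; hits 4 and 6 have i-Evalues of 1.4 and 47.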
Sequence Information
- Coding Sequence
- ATGAGCGAAGGTACAATTGCGACATCTGTGGCAAGAGATTCTATTGGAGACGGTCCTGGTCGTACCACAAGAAAATGCACCTCGGAGAATGGCCTCACAAATGTGATATATGTGTTGAGCGTACATCTGAAGACACATAGCACAGAACGTACGCACGAGTGTATGGTTTGCGGAACGCGCTTCAAGTATCGCTCCAAGCTAGCCCAGCACATGCACACTCACAGTGAAGTGCGATGTTTCACGTGCGACGTTTGTGGGGCGTCGTTCAAGCTACCACAAGGTCTAagtaatcataaaaaaattcacaCCACCGTTGTGGAAGAATCAAATGCGTACAAGTGCACTATATGCAGGAGACGTTTTCCCACCGAGGAAACACTGAAGCAACATAAAGCATCTCATCCACATATGGAACCACAGAATTTCGGTTGTGAAACCTGCGGCTTAGTGTGTTCTACCAAGTCTAATCTCAAGTGTCATCAGAAAACTCACACCGGCGAGAAACCGTTCATATGTGAAACGTGTGGTAAAGCGTTCTGCAGGAGCGACAATCTTAACAGACATATAACCGTACACCACAAGGAGCCATTGGAGAAGAACTTTCGGTGTGAGCTGTGTTGTGAAGGCTTCCGCTTGAGGATTCAATTGTTGCGCCATCTGAAGTTACATCAGAAGGAACGCCCGCGCAAAAAAGGCCCGCATACAGCGGAGAGAAATTTTAAATGTGAAATGTGTGGCAAAAATTTTCCTTTGAACTCACAGTTGAGTGTACATCTGAAGACACATAGCACAGAACGTGCACACGAGTGTCTGGTTTGTGGAAAACGCTTCAGTTGGCGCTCTAAGCTAGCACAGCATATGTACACTCATAATGAAGGGCGATGTTTTACGTGCGACGTTTGTGGGGCGTCGTTCAAGTCATCAGAAAATTtaagtaaacataaaaaaattcacatcACCGTTTTGGAGGAGTCTAATATTCATAAATGCATTATATGCAGGAGACGTTTTTCCACGGAGGAAACACTGAAGCAACATGAAGCATCTAATCCGCATACGGAACCGCAGCAGTTCGGTTGCGAAATCTGCGGCTTAGTGTGTACTACAAAATATAATCTCGAGCAACATCGGAAAATTCACACCGGCGAGAAACCGTTCGAATGTGGAACGTGTGGCGAAGCGTTCAGGAGAGGCCAACATCTTAACAAACATTTAACCGTATGCCACGAGAAACCATTGGAGAAGAACTTTCGGTGTGAGCTGTGTGGTAAAGCCTTTCTTTTGAGGAAGCATTTGTTGCGCCATCTGAAGTTACATCAGGAGGAACGTCCGCACAAGAACGGCCCGCATACAACGGAGAAAAAATTTGAATGTGAAATATGTGGTGAAAATTTTCCCTTATGTACACAGTTGAGTGAACATCTGAAGACACATAGCACAGAACGTGGCCACGAGTGTCTGGTTTGTGGAAAACGCTTCAGTTGGCGCTCTAAGCTAGCACACCACATGTACACTCATAGTAAAGCGCGATGTTTTACGTGCGACGTTTGTGGGTCGTCTTTCAAGTTACCACAAGGTTTATATAAACATAGAAAAATTCACACCACCGTTTTGGGAGAGTCAAATGCGCACAAGTGCACGATATGTGGAAAACGTTTTTCCACCGAGGAAACAATGAAGCAACACAAAGCATCTCATCCGCAATTCGGTTGCGAAACCTGCGGTGTATTGTGTACCACCAAGTTTAATCTCGAGTGTCATCGGAAAACTCATACCGGCGAGAAACCGTTCAAATGTGAAACGTGTGGTAAAGCGTTCAGCAGGAGACATTATCTTAACACACATATAATCGTACAACACGAGAAGCCATTGGAGAAGAACTTTCGGTGTGAGCTGTGTGGTAAAGACTTTCACTTGAAAACTTATTTGTTAAGCCATCTGAAGGTACATCAGAAGGAACGTCCGCACAAGAACGGCCCGCATACAACGGAGAAAAAA
TTTGAATGTGAAATATGTGGTAAAAATTTTCCTTTATGTACACAGTTGAGTGAACATCTGAAGACACATGGCACAGAACCTGCGCACGAGTGTCTGGTTTGTGGAACGCGCTTCATATGGCGTTCTAAGCTAGAGCAGCATATGTACACTCATAGTAAAGAGCGATGTTTTACGTGCGACGTTTGTGGGTCGTCATTCAAGTTACCACAAGGTTTATATAAACATAGAAAAATTCACACCACCGTTTTGGGAGAGTCAAATGCGCACAAGTGCACGATATGTGGAAAACGTTTTTCCACCGAGGAAACAATGAAGCAACACAAAGCATCTCATCCGCAATTCGGTTGCGAAACCTGCGGTTTATTGTGTACCACCAAATTTAATCTCGAGTGTCATCGGAAAACTCACACCGGCGAAAAACCGTTCAAATGTGAAACGTGTGGTAAAGCGTTCAGCAGGAGACATTATCTTAACACACATGTAACCGTACAACACGAGAAGCCATTGGAGAAGAACTTTCGCAACACGACTAAATTGGAAACGAACCCTTCGGTTGACAGCCAGCAGAAACTCATTTGCGACATTTGCGGCAAGGAGTACAAGACGAAGCAAGTTCTCTATCAACACAAGAAAACACATAACGGCCCACGCTGGTTCAAGTGTGGTGAATGCGACGAAACGTTTCCCACATCCCTCAAACTTAGGTACCATAGAACACATCACACAGACGAAAAGAATTACAGCTGCGATATATGTAAGAAAGCCTTCTGCCTAAAAGTACAGCTAAGAGTACACATGAGGCGGCACATTACGGATCGCCCCCACAAATGCCTTATTTGCGGGCAAGGGTTCAGGGTTAAATTTTTTCTAACGAGACACACGAAACTTCACAGTGGCGAGAGATTGTTCGACTGCGACATCTGTGGTAAATCCTACAAATCAGCAAGTAATCTAAATGATCATAAGAAGTATCACACATCCGGTGAAGATGAACTGCACAAATGTGACCTTTGTGGAAAAGGTTTCGTACATCGCTACGCTCTTGTAGATCACATGAACACTCACAGTAACGAGCGGCGATTTAAGTGTGACATCTGCGGTTTGTCATGCAAAACACGTAAAAATTTAAACGGTCACCGGAGGGGTCACACATCTCCCAAAGCAAAATCATTTAAATGTGACATGTGCGACAAAAAGTACGTCTCGAGTTCTTCTCTCAGGAGACATAAACCGACTCACGTCAGCAGGCGGCCTTTCAAATGTGACATGTGTAGCAACGCTTATCAAACCAGCAGCAGTCTCGAGaaacataaaatcataaaacacaCAACATTGGAAAAGAAGTTCAAATGTGAAATATGTAGCAAAGATTTCCATTTAGATGAGCAATTAGTGCAACACCTGGGTACGCACAGCACGAACCGATCTCACGAGTGTATGATTTGCGGCAGAGGGTTCCGACTTAGTTCGCTGCTCGCAAAGCACATGCACAGTCACAGCAATGAGCGGCTATTCACGTGCGACATCTGTGGTAAAACCTTCAAGTCGCCCAGTTGCTTATGTACGCACAAAAAAACCCACACATTCGTAGATGAGCGTCCCTACCAGTGTGACATCTGTGGCTCCTCGTACACGTCGCGTAGCGGTATGCAGTTGCACCGGAAGATGCACTCCTTTGTCGAGCGCAAAGCGTACAAATGTGATCTGTGCAATGCCGGATTCACCCAGAAGGGTACCCTCGCTAATCATAAGAAAAAACACGCCCAAAGCACGCCGGATGGTAGTGAATTAAGTCCAATGGGACGGTCATTGGTCGTTTGA
- Protein Sequence
- MSEGTIATSVARDSIGDGPGRTTRKCTSENGLTNVIYVLSVHLKTHSTERTHECMVCGTRFKYRSKLAQHMHTHSEVRCFTCDVCGASFKLPQGLSNHKKIHTTVVEESNAYKCTICRRRFPTEETLKQHKASHPHMEPQNFGCETCGLVCSTKSNLKCHQKTHTGEKPFICETCGKAFCRSDNLNRHITVHHKEPLEKNFRCELCCEGFRLRIQLLRHLKLHQKERPRKKGPHTAERNFKCEMCGKNFPLNSQLSVHLKTHSTERAHECLVCGKRFSWRSKLAQHMYTHNEGRCFTCDVCGASFKSSENLSKHKKIHITVLEESNIHKCIICRRRFSTEETLKQHEASNPHTEPQQFGCEICGLVCTTKYNLEQHRKIHTGEKPFECGTCGEAFRRGQHLNKHLTVCHEKPLEKNFRCELCGKAFLLRKHLLRHLKLHQEERPHKNGPHTTEKKFECEICGENFPLCTQLSEHLKTHSTERGHECLVCGKRFSWRSKLAHHMYTHSKARCFTCDVCGSSFKLPQGLYKHRKIHTTVLGESNAHKCTICGKRFSTEETMKQHKASHPQFGCETCGVLCTTKFNLECHRKTHTGEKPFKCETCGKAFSRRHYLNTHIIVQHEKPLEKNFRCELCGKDFHLKTYLLSHLKVHQKERPHKNGPHTTEKKFECEICGKNFPLCTQLSEHLKTHGTEPAHECLVCGTRFIWRSKLEQHMYTHSKERCFTCDVCGSSFKLPQGLYKHRKIHTTVLGESNAHKCTICGKRFSTEETMKQHKASHPQFGCETCGLLCTTKFNLECHRKTHTGEKPFKCETCGKAFSRRHYLNTHVTVQHEKPLEKNFRNTTKLETNPSVDSQQKLICDICGKEYKTKQVLYQHKKTHNGPRWFKCGECDETFPTSLKLRYHRTHHTDEKNYSCDICKKAFCLKVQLRVHMRRHITDRPHKCLICGQGFRVKFFLTRHTKLHSGERLFDCDICGKSYKSASNLNDHKKYHTSGEDELHKCDLCGKGFVHRYALVDHMNTHSNERRFKCDICGLSCKTRKNLNGHRRGHTSPKAKSFKCDMCDKKYVSSSSLRRHKPTHVSRRPFKCDMCSNAYQTSSSLEKHKIIKHTTLEKKFKCEICSKDFHLDEQLVQHLGTHSTNRSHECMICGRGFRLSSLLAKHMHSHSNERLFTCDICGKTFKSPSCLCTHKKTHTFVDERPYQCDICGSSYTSRSGMQLHRKMHSFVERKAYKCDLCNAGFTQKGTLANHKKKHAQSTPDGSELSPMGRSLVV
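The pattern quoted in the Description, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be written as a regular expression and run over the protein sequence. The sketch below assumes that the # positions accept any residue, because the record does not say which residues are allowed there. This makes it a looser screen than the profile HMM behind the hmmscan results, and its match boundaries can differ from the hmmscan coordinates. `find_fingers` is a hypothetical helper, and the fragment is residues 50-77 of the protein sequence above.

```python
import re

# Each piece mirrors one element of the pattern in the Description.
C2H2_REGEX = re.compile(
    r"."        # '#'  fold-stabilising position (ASSUMED: any residue)
    r"."        # X
    r"C"        # first zinc-coordinating cysteine
    r".{1,5}"   # X(1-5)
    r"C"        # second zinc-coordinating cysteine
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first zinc-coordinating histidine
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final position, His or Cys
)


def find_fingers(protein: str, offset: int = 1):
    """Return (start, end, sequence) for non-overlapping matches.

    Coordinates are 1-based and inclusive; `offset` is the position of the
    first residue of `protein` within the full-length sequence.
    """
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in C2H2_REGEX.finditer(protein)
    ]


# Residues 50-77 of the protein sequence in this record.
fragment = "RTHECMVCGTRFKYRSKLAQHMHTHSEV"
matches = find_fingers(fragment, offset=50)
for start, end, seq in matches:
    print(start, end, seq)
```

On this fragment the expression reports a single match at residues 52-74, the same interval as the first hmmscan hit. On the full sequence, expect some matches to start a few residues earlier or later than the hmmscan alignments, because the expression accepts any residue at the # positions and `finditer` reports only the leftmost non-overlapping matches.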
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -