Cgly024220.1
Basic Information
- Insect
- Coenonympha glycerion
- Gene Symbol
- -
- Assembly
- GCA_963855885.1
- Location
- OY979654.1:1702932-1716264[+]
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers give the number of residues (a range where shown in parentheses). The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm-from  hmm-to  ali-from  ali-to  env-from  env-to  acc
1   34  0.082     5         8.4    4.7   1         23      313       335     313       335     0.98
2   34  0.0023    0.14      13.3   2.0   2         23      343       365     342       365     0.96
3   34  0.72      44        5.4    1.4   1         23      369       391     369       391     0.94
4   34  0.00025   0.015     16.3   2.6   1         23      396       419     396       419     0.98
5   34  0.0011    0.07      14.2   0.2   2         23      424       446     423       446     0.95
6   34  0.0011    0.07      14.2   0.2   2         23      468       490     467       490     0.95
7   34  0.0011    0.07      14.2   0.2   2         23      512       534     511       534     0.95
8   34  0.0066    0.4       11.9   0.3   2         23      556       578     556       578     0.94
9   34  0.0066    0.4       11.9   0.3   2         23      600       622     600       622     0.94
10  34  0.0015    0.091     13.9   0.3   2         23      644       666     644       666     0.95
11  34  0.0066    0.4       11.9   0.3   2         23      688       710     688       710     0.94
12  34  0.00097   0.059     14.5   0.1   2         23      732       754     731       754     0.95
13  34  1.5       92        4.4    0.1   6         21      770       785     767       790     0.90
14  34  3.6       2.2e+02   3.2    0.2   1         13      859       871     859       876     0.82
15  34  3.6       2.2e+02   3.2    0.2   1         13      935       947     935       952     0.82
16  34  3.6       2.2e+02   3.2    0.2   1         13      1011      1023    1011      1028    0.82
17  34  3.6       2.2e+02   3.2    0.2   1         13      1087      1099    1087      1104    0.82
18  34  3.6       2.2e+02   3.2    0.2   1         13      1163      1175    1163      1180    0.82
19  34  0.0014    0.087     13.9   3.0   1         23      1239      1262    1239      1262    0.93
20  34  7e-06     0.00043   21.2   1.3   1         23      1268      1290    1268      1290    0.99
21  34  0.00013   0.0077    17.3   0.3   1         20      1297      1316    1297      1316    0.95
22  34  0.00071   0.043     14.9   0.5   6         23      1334      1351    1333      1351    0.99
23  34  0.00013   0.0077    17.3   0.3   1         20      1358      1377    1358      1377    0.95
24  34  0.00071   0.043     14.9   0.5   6         23      1395      1412    1394      1412    0.99
25  34  0.00013   0.0077    17.3   0.3   1         20      1419      1438    1419      1438    0.95
26  34  0.00071   0.043     14.9   0.5   6         23      1456      1473    1455      1473    0.99
27  34  0.00013   0.0077    17.3   0.3   1         20      1480      1499    1480      1499    0.95
28  34  0.00071   0.043     14.9   0.5   6         23      1517      1534    1516      1534    0.99
29  34  0.00013   0.0077    17.3   0.3   1         20      1541      1560    1541      1560    0.95
30  34  0.00071   0.043     14.9   0.5   6         23      1578      1595    1577      1595    0.99
31  34  0.00013   0.0077    17.3   0.3   1         20      1602      1621    1602      1621    0.95
32  34  0.00076   0.046     14.8   0.4   6         23      1639      1656    1638      1656    0.99
33  34  7.3       4.5e+02   2.3    0.1   1         9       1663      1671    1663      1678    0.88
34  34  0.012     0.75      11.0   2.8   1         23      1681      1704    1681      1704    0.94
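The per-domain rows in the Hmmscan Out table can be read programmatically. The following is a minimal Python sketch, assuming each row carries the 13 whitespace-separated fields named in the table header. The names `DomainHit`, `parse_domain_rows` and `filter_by_i_evalue`, and the 0.01 cutoff, are illustrative assumptions and not part of this record or of HMMER itself. The sample rows are copied from the table above.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One row of the per-domain table (field names are illustrative)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


# One converter per column, in the order given by the table header.
CASTS = (int, int, float, float, float, float,
         int, int, int, int, int, int, float)
FIELDS_PER_ROW = len(CASTS)


def parse_domain_rows(text):
    """Split on whitespace and group tokens into 13-field rows.

    Working on tokens rather than lines means the same function handles
    one-row-per-line text and rows that were run together on one line.
    The header line must not be included in `text`.
    """
    tokens = text.split()
    if len(tokens) % FIELDS_PER_ROW:
        raise ValueError("token count is not a multiple of 13")
    hits = []
    for i in range(0, len(tokens), FIELDS_PER_ROW):
        row = tokens[i:i + FIELDS_PER_ROW]
        hits.append(DomainHit(*(cast(v) for cast, v in zip(CASTS, row))))
    return hits


def filter_by_i_evalue(hits, max_i_evalue=0.01):
    """Keep hits whose independent E-value is at or below the cutoff.

    The default cutoff is an arbitrary example value, not a recommendation.
    """
    return [h for h in hits if h.i_evalue <= max_i_evalue]


# Rows 1, 4, 20, 21 and 33 of the table, copied verbatim.
SAMPLE_ROWS = """
1 34 0.082 5 8.4 4.7 1 23 313 335 313 335 0.98
4 34 0.00025 0.015 16.3 2.6 1 23 396 419 396 419 0.98
20 34 7e-06 0.00043 21.2 1.3 1 23 1268 1290 1268 1290 0.99
21 34 0.00013 0.0077 17.3 0.3 1 20 1297 1316 1297 1316 0.95
33 34 7.3 4.5e+02 2.3 0.1 1 9 1663 1671 1663 1678 0.88
"""

if __name__ == "__main__":
    for hit in filter_by_i_evalue(parse_domain_rows(SAMPLE_ROWS)):
        print(hit.index, hit.ali_from, hit.ali_to, hit.i_evalue)
```

With the example cutoff, only domains 20 and 21 of the five sample rows are kept. A looser or stricter cutoff may suit other uses.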
Sequence Information
- Coding Sequence
- ATGTACAAACGCAATACCGGCGTCGACGAAAACGTTGCACAGCTTGCCGACCTTGACGAGGATGAAGATGTGCCTTTATTGTTTTGCATGATATGTCTAGACACAGACTGTAAGCTGTATCTGATGAACAAACACAATTTGGAAACAAAATTTGAACATTTGACAGGAATTTCTCTTCAAATTGAAGGGAACTTTGCACCACAGCTTTGCAGTGAGTGTGCTCAGAGGCTGAGCACCAGCAGTGAGCTCCGGGATAAGGCCCTGAGAAGCTATCACCTGTTAGCTGAGTTAGTTAAGAACCATCCCGAGTTAGCGACACAAGatgtaaaaacaataaaccgtaTGGCTCATCAACTTACATCAAACATAACCAAGAGAACATTTCAGCCAGACCACTGTGACTTATTCTTGATACACATAGAAGAGGAGGTCAAAGTCGAAGTAAAATGTGTGGAGAACAAAGCAAATGCAAGCGAAAATCAAAAAACTGAAGTAAAAATTGACTTTGAACCTAAAAAAGAAGCTTTTAATGACATCAAGAGTGACAGTGACGATGATTTCCTAATTGATGACACATTTAATGACAGTGTTAAATCGGATGGAGAAGAAAATGAGGAACTTCCAAATGAAGAACATGGCAAAGAGCTGTCAAATGAAACAGACAGTGATTGGGACTTTTTAAGAGACAGCGATGTTAGTAATAGTGATGAAGAAGTTGTGAATAGTAAAAATATTAAGaaaaaaaatatcagaTATAGTGAAGAAAATGAGAACAAAGCTACAAAACGTGGCAGACTTAAACAAGATTCCAAAAAACCAGGCCGACCAGAAAACGAGGAGGTCTTGAAGCTATTCAAAATAACCATATTGAGCCATGAAGAACAATTAGCTGAAATTTTGAAGAGAAAGGACATGGAGAATTTCAAAAACTCTCCATACAAATGTATGAAGTGTTACAaagttttttgtagtgtgaCCACTTATGATTCGCATATGGAGAGGCATACAGATAAATTCGGTCCGGTCGAATGCACAGTTTGCGGCATGCGGGTCAAAAGCCAGACTCAACTGCGACACCACACGAATAAAAACCATAGGACGCAATACACTTGCACTGAGTGTCCGTACGTCACTCACCATAAGGAGTCAGCCGTGAGTCATGCACAATGGCACAAAGGTACTAAATTCAAATGTCCTCACTGCGACGAAGAGTATAGCAAGAAAACGTCGTATTTCTCCCATCTGCGCATATGGCATCCAACCGACGTGGTGTGCACGttgtgcgggttctccttcatcaaCGAGAGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGGTGAGCACCACACTCCTCGTGCGGGTCTCACAGCGCACTGACGTGGTGTGCACGTTGTGCGGGTTCTCTTTCATCAACGAGAGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGGTGAGCACCACACTCCTCGTGCGGGTCTCACAGCGCACTGACGTGGTGTGCACGTTGTGCGGGTTCTCTTTCATCAACGAGAGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGGTGAGCACTACACCCCTCCTGCGGGTCTCACAGCGCACTGACGCGGTGTGCATGttgtgcgggttctccttcatcaaCGAGAGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGGTGAGCACCACACTCCTCGTGCGGGTCTCACAGCGCACTGACGCGGTGTGCATGttgtgcgggttctccttcatcaaCGAGAGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGGTGAGCACTACACTCCTCCTGCGGGTCTCACAGCGCACTGACCCGGTGTGCACGTTGTGCGGGTTCTCTTTCATCAACGAGAGGGGTCTCAACATGCACATGAACCTCAAACAC
CGCTTCGACGACGCGCAGGTGAGCACTACACTCCTCCTGCGGGTCTCACAGCGCACTGACGCGGTGTGCATGttgtgcgggttctccttcatcaaCGAGAGGGGTCTCAACATGCACATGAACCTGAAACACCGCTTCGACGACGCGCAGGTGAGCACTACACTCCTCCTGCGGGTCTCACAGCGCACTGACGTGGTGTGCACGttgtgcgggttctccttcatcaaCGAGGGGGGTCTCAACATGCACATGAACCTCAAACACCGCTTCGACGACGCGCAGAGCGCGGCGGGTCCGCTGTGCGCGCCGTGCGGCATCCGCTTCGCGTCGCAGACCGCCTACACGCAGCACCTGGAGGTGTCGCCCAAACATGCCAACTATGAAGGCAACAAGCGCAACACGCCTCGCTACAGACACATCGTGAAACCGCTCCACAAGCCCAAGGCGGAGCCGCGGATGATAGAGTGCGAACAGtgcgGGATGCAACTACAGGGTTACAAGATGTACCGGTCACATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGTGAgcccgagccgcaggcgggacTATGTTACAACATGTACCGGTCACATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGTGAgcccgagccgcaggcgggacTATGTTACAAGATGTACCGGTCTCATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGTGAgcccgagccgcaggcgggacTATGTTACAACATGTACCGGTCACATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGTGAgcccgagccgcaggcgggacTATGTTACAACATGTACCGGTCACATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGTGAgcccgagccgcaggcgggacTATGTTACAAGATGTACCGGTCTCATTTCTCGAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGGCTCCACCCGGACGAGAGTCGGACTCAGTACCCGCCGAGCTCGCAGCGGCGGTTCCTGTGCGAGCAGTGCGGGAGAGGCTTTAAGcATCGTTGCCTCCTAAGAGACCATATGCTGTTAGCGCATTCTAACAAGAAGGAGTTCCAATGCGATGCTTGTAAGAAGACATTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACGCACAGCGAGTCGCAGCCGACGCACGCGTGCCCGATATGCGGCAAACAGTTCACCAACAAGGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTTTCTTCTCCCTCTCTCACAGAGCGACGC
TTGTAAGAAGACATTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACACACGCATGCCCGATATGCGGCAAACAGTTTACCAACAAGGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTTTCTTCTCCCTCTCTCACAGAGCGACGCTTGCAAGAAGACGTTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACACACGCGTGCCCGATATGCGGCAAACAGTTTACCAACAAAGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTTTCTTCTCCCTCTCTCACAGAGCGGCGCTTGCAAGAAGACGTTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACACACGCGTGCCCGATATGCGGCAAACAGTTTACCAACAAGGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTTTCTTCTCCCTCTCTCACAGAGCGACGCTTGCAAGAAGACGTTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACACACGCGTGCCCGATATGCGGCAAACAGTTTACCAACAAGGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTTTCTTCTCCCTCTCTCACAGAGCGACGCTTGCAAGAAGACGTTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACACACGCGTGCCCGATATGCGGCAAACAGTTTACCAACAAGGCCAACACCAACAGGCATATATGGGTGAGTGAACAGACCGTCTATCTTCTCCCTCTCTCACAGAGCGACGCTTGTCAGAAGACGTTCTCCCTCAAGGGGAACCTCATAACGCACATGAAGACACACAGCGAGTCGCAGCCGACGCACGCGTGCCCGATATGCGGCAAACAAACAGGCATATATGGGGATCTAAAGCCGTACAAATGCCACGCGTGCGAAAAGACCTTCGTGAACGCGTCGTCGCGACGGAGCCACGTGCTGCACGCGCATCTCAAGCAGCCGTGGCCTAAGAAGACCCGTGGCCCGCGCCAGAGAGCCAGCCGCGCGCGCCACACCAAGGAGGCCGTGTGTGGCACCATATGGCCCAAGGTGAGAGTTGAGGAGCTTTTGaacgcgtcatcacgacggagCCACGTGCTGCACGCGCATCTCAAGCAGCCGTGGCCCAAGAAGAGCCGTGGCCCGCGCCAGGAAGCCAGCCGCGCGCGCCACACCAAGGAGGCCATGTGTGGTACTATATGGCCCAAGGCTGAACTTTCAACAGCGCCCATCGAGTTCCACAGTTGA
- Protein Sequence
- MYKRNTGVDENVAQLADLDEDEDVPLLFCMICLDTDCKLYLMNKHNLETKFEHLTGISLQIEGNFAPQLCSECAQRLSTSSELRDKALRSYHLLAELVKNHPELATQDVKTINRMAHQLTSNITKRTFQPDHCDLFLIHIEEEVKVEVKCVENKANASENQKTEVKIDFEPKKEAFNDIKSDSDDDFLIDDTFNDSVKSDGEENEELPNEEHGKELSNETDSDWDFLRDSDVSNSDEEVVNSKNIKKKNIRYSEENENKATKRGRLKQDSKKPGRPENEEVLKLFKITILSHEEQLAEILKRKDMENFKNSPYKCMKCYKVFCSVTTYDSHMERHTDKFGPVECTVCGMRVKSQTQLRHHTNKNHRTQYTCTECPYVTHHKESAVSHAQWHKGTKFKCPHCDEEYSKKTSYFSHLRIWHPTDVVCTLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLVRVSQRTDVVCTLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLVRVSQRTDVVCTLCGFSFINERGLNMHMNLKHRFDDAQVSTTPLLRVSQRTDAVCMLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLVRVSQRTDAVCMLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLLRVSQRTDPVCTLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLLRVSQRTDAVCMLCGFSFINERGLNMHMNLKHRFDDAQVSTTLLLRVSQRTDVVCTLCGFSFINEGGLNMHMNLKHRFDDAQSAAGPLCAPCGIRFASQTAYTQHLEVSPKHANYEGNKRNTPRYRHIVKPLHKPKAEPRMIECEQCGMQLQGYKMYRSHFSRLHPDESRTQYPPSSQRRFLCEQCGRGFKVSPSRRRDYVTTCTGHISRGSTRTRVGLSTRRARSGGSCASSAGEALRLHPDESRTQYPPSSQRRFLCEQCGRGFKVSPSRRRDYVTRCTGLISRGSTRTRVGLSTRRARSGGSCASSAGEALRLHPDESRTQYPPSSQRRFLCEQCGRGFKVSPSRRRDYVTTCTGHISRGSTRTRVGLSTRRARSGGSCASSAGEALRLHPDESRTQYPPSSQRRFLCEQCGRGFKVSPSRRRDYVTTCTGHISRGSTRTRVGLSTRRARSGGSCASSAGEALRLHPDESRTQYPPSSQRRFLCEQCGRGFKVSPSRRRDYVTRCTGLISRGSTRTRVGLSTRRARSGGSCASSAGEALRLHPDESRTQYPPSSQRRFLCEQCGRGFKHRCLLRDHMLLAHSNKKEFQCDACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVFLLPLSQSDACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVFLLPLSQSDACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVFLLPLSQSGACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVFLLPLSQSDACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVFLLPLSQSDACKKTFSLKGNLITHMKTHSESQPTHACPICGKQFTNKANTNRHIWVSEQTVYLLPLSQSDACQKTFSLKGNLITHMKTHSESQPTHACPICGKQTGIYGDLKPYKCHACEKTFVNASSRRSHVLHAHLKQPWPKKTRGPRQRASRARHTKEAVCGTIWPKVRVEELLNASSRRSHVLHAHLKQPWPKKSRGPRQEASRARHTKEAMCGTIWPKAELSTAPIEFHS
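The C2H2 pattern quoted in the Description, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be tried against a protein sequence such as the one above by rewriting it as a regular expression. The Python sketch below is a rough illustration only. It assumes that the # positions may be any residue, which is looser than the Description implies, and the name `find_c2h2_candidates` is hypothetical. It is not a substitute for the profile HMM search reported under Hmmscan Out.

```python
import re

# Both "#" and "X" are written as "." (any residue). Leaving the
# fold-stabilising "#" positions unrestricted is a simplifying assumption.
C2H2_PATTERN = re.compile(
    r"(?=("          # lookahead, so overlapping candidates are all reported
    r"..C.{1,5}C"    # #-X-C-X(1-5)-C
    r".{3}."         # X3-#
    r".{5}."         # X5-#
    r".{2}H"         # X2-H
    r".{3,6}[HC]"    # X(3-6)-[H/C]
    r"))"
)


def find_c2h2_candidates(protein):
    """Return (1-based start, matched text) for each candidate finger."""
    return [(m.start() + 1, m.group(1))
            for m in C2H2_PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    # Short fragment copied from the Protein Sequence above.
    fragment = "NSPYKCMKCYKVFCSVTTYDSHMERHTDKF"
    print(find_c2h2_candidates(fragment))
    # -> [(4, 'YKCMKCYKVFCSVTTYDSHMERH')]
```

Because the expression is permissive, it is best read as a quick screen for candidate fingers. It may report stretches that the HMM does not score as domains, and it may miss degenerate fingers that the HMM does pick up.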
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -