Hsem014887.1
Basic Information
- Insect
- Hipparchia semele
- Gene Symbol
- -
- Assembly
- GCA_933228835.1
- Location
- CAKOGE010000178.1:4137590-4167150[-]
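The Location value packs a contig accession, a coordinate range, and a strand into one string. A minimal sketch of splitting it into fields follows; the `GenomicLocation` type and `parse_location` helper are hypothetical names introduced here for illustration, and the `contig:start-end[strand]` layout is inferred from this record only.

```python
import re
from typing import NamedTuple


class GenomicLocation(NamedTuple):
    """Hypothetical container for the fields of a Location string."""
    contig: str
    start: int
    end: int
    strand: str


# Assumed layout, inferred from this record: contig:start-end[strand]
_LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> GenomicLocation:
    """Split a Location string into contig, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    return GenomicLocation(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("CAKOGE010000178.1:4137590-4167150[-]")
print(loc.contig, loc.start, loc.end, loc.strand)
```

The record does not state whether the coordinates are 0-based or 1-based, so the sketch keeps them as given rather than deriving a length.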
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger follows the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in brackets give the number of residues. The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
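-
| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|----|----------|----------|-------|------|----------------|--------------|----------------|--------------|----------------|--------------|-----|
| 1 | 36 | 0.018 | 0.97 | 10.4 | 0.9 | 1 | 23 | 175 | 197 | 175 | 197 | 0.96 |
| 2 | 36 | 7.7e-05 | 0.0041 | 17.8 | 2.7 | 1 | 23 | 218 | 240 | 218 | 241 | 0.95 |
| 3 | 36 | 9e-05 | 0.0048 | 17.6 | 0.5 | 3 | 23 | 245 | 265 | 244 | 265 | 0.97 |
| 4 | 36 | 0.01 | 0.55 | 11.1 | 2.2 | 1 | 23 | 271 | 293 | 271 | 293 | 0.98 |
| 5 | 36 | 6.4e-06 | 0.00034 | 21.2 | 0.6 | 1 | 23 | 300 | 322 | 300 | 322 | 0.98 |
| 6 | 36 | 0.001 | 0.053 | 14.3 | 6.6 | 1 | 23 | 351 | 373 | 351 | 373 | 0.98 |
| 7 | 36 | 0.00077 | 0.041 | 14.7 | 4.9 | 1 | 23 | 378 | 400 | 378 | 400 | 0.98 |
| 8 | 36 | 0.0006 | 0.032 | 15.0 | 3.4 | 1 | 23 | 406 | 428 | 406 | 428 | 0.97 |
| 9 | 36 | 8.2e-06 | 0.00043 | 20.9 | 2.9 | 1 | 23 | 439 | 461 | 439 | 461 | 0.98 |
| 10 | 36 | 3.3e-05 | 0.0017 | 19.0 | 0.9 | 1 | 23 | 467 | 489 | 467 | 489 | 0.96 |
| 11 | 36 | 1.1e-05 | 0.00059 | 20.5 | 2.3 | 1 | 23 | 495 | 517 | 495 | 517 | 0.99 |
| 12 | 36 | 0.00014 | 0.0075 | 17.0 | 1.5 | 1 | 23 | 523 | 545 | 523 | 545 | 0.98 |
| 13 | 36 | 0.00094 | 0.049 | 14.4 | 5.4 | 1 | 23 | 612 | 634 | 612 | 634 | 0.98 |
| 14 | 36 | 3.3e-05 | 0.0017 | 19.0 | 3.5 | 1 | 23 | 640 | 662 | 640 | 662 | 0.97 |
| 15 | 36 | 1.6e-05 | 0.00087 | 20.0 | 0.7 | 1 | 23 | 668 | 690 | 668 | 690 | 0.98 |
| 16 | 36 | 4.2e-06 | 0.00022 | 21.8 | 0.8 | 1 | 23 | 696 | 718 | 696 | 718 | 0.98 |
| 17 | 36 | 0.082 | 4.4 | 8.3 | 2.9 | 1 | 23 | 724 | 746 | 724 | 746 | 0.97 |
| 18 | 36 | 1.9e-06 | 9.9e-05 | 22.9 | 1.1 | 1 | 23 | 752 | 774 | 752 | 774 | 0.98 |
| 19 | 36 | 0.00097 | 0.051 | 14.4 | 12.1 | 1 | 23 | 833 | 855 | 833 | 855 | 0.98 |
| 20 | 36 | 4.9e-05 | 0.0026 | 18.5 | 3.7 | 1 | 23 | 861 | 883 | 861 | 883 | 0.97 |
| 21 | 36 | 1.6e-05 | 0.00087 | 20.0 | 0.7 | 1 | 23 | 889 | 911 | 889 | 911 | 0.98 |
| 22 | 36 | 2.7e-06 | 0.00014 | 22.4 | 0.8 | 1 | 23 | 917 | 939 | 917 | 939 | 0.98 |
| 23 | 36 | 0.39 | 20 | 6.2 | 4.7 | 1 | 23 | 945 | 968 | 945 | 968 | 0.97 |
| 24 | 36 | 4.8e-07 | 2.5e-05 | 24.8 | 1.5 | 2 | 23 | 974 | 995 | 973 | 995 | 0.97 |
| 25 | 36 | 0.00097 | 0.051 | 14.4 | 12.1 | 1 | 23 | 1060 | 1082 | 1060 | 1082 | 0.98 |
| 26 | 36 | 4.9e-05 | 0.0026 | 18.5 | 3.7 | 1 | 23 | 1088 | 1110 | 1088 | 1110 | 0.97 |
| 27 | 36 | 1.6e-05 | 0.00087 | 20.0 | 0.7 | 1 | 23 | 1116 | 1138 | 1116 | 1138 | 0.98 |
| 28 | 36 | 2.7e-06 | 0.00014 | 22.4 | 0.8 | 1 | 23 | 1144 | 1166 | 1144 | 1166 | 0.98 |
| 29 | 36 | 0.39 | 20 | 6.2 | 4.7 | 1 | 23 | 1172 | 1195 | 1172 | 1195 | 0.97 |
| 30 | 36 | 4.8e-07 | 2.5e-05 | 24.8 | 1.5 | 2 | 23 | 1201 | 1222 | 1200 | 1222 | 0.97 |
| 31 | 36 | 0.00097 | 0.051 | 14.4 | 12.1 | 1 | 23 | 1287 | 1309 | 1287 | 1309 | 0.98 |
| 32 | 36 | 3.9e-05 | 0.0021 | 18.8 | 3.7 | 1 | 23 | 1315 | 1337 | 1315 | 1337 | 0.97 |
| 33 | 36 | 1.6e-05 | 0.00087 | 20.0 | 0.7 | 1 | 23 | 1343 | 1365 | 1343 | 1365 | 0.98 |
| 34 | 36 | 2.7e-06 | 0.00014 | 22.4 | 0.8 | 1 | 23 | 1371 | 1393 | 1371 | 1393 | 0.98 |
| 35 | 36 | 0.39 | 20 | 6.2 | 4.7 | 1 | 23 | 1399 | 1422 | 1399 | 1422 | 0.97 |
| 36 | 36 | 4.8e-07 | 2.5e-05 | 24.8 | 1.5 | 2 | 23 | 1428 | 1449 | 1427 | 1449 | 0.97 |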
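The residue pattern given in the Description can be transcribed into a regular expression to scan a protein sequence for candidate fingers. The sketch below is only an illustration: it treats every `#` position as "any residue" (an assumption, since the description does not say which residues are allowed there), the helper name `find_c2h2_candidates` is hypothetical, and a plain regex scan is far cruder than the profile HMM search that produced the hmmscan hits listed above.

```python
import re

# Transcription of  #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' positions are matched as any residue ('.').
C2H2_PATTERN = re.compile(
    r"."        # '#' (fold-stabilizing position, unconstrained here)
    r"."        # X
    r"C"        # first conserved cysteine
    r".{1,5}"   # X(1-5)
    r"C"        # second conserved cysteine
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first conserved histidine
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final position: His or Cys
)


def find_c2h2_candidates(protein: str):
    """Return (start, end, text) for non-overlapping matches, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein)
    ]


# Short fragment copied from this entry's protein sequence.
fragment = "KKYFACEQCDRSYRRKFKLIAHMTSHHPAA"
hits = find_c2h2_candidates(fragment)
for start, end, text in hits:
    print(start, end, text)
```

Because the quantifiers are greedy and matches are non-overlapping, adjacent fingers in a long tandem array may be reported with slightly different boundaries than the hmmscan alignment coordinates.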
Sequence Information
- Coding Sequence
- ATGGAGGAAGAAACCAAGCAGTCACTGCTGGCTGCAGGCCTCATCGTCCCCAAAGAGGAAGTCATAGAAGACATAGAAATAACCTTCGAGCGTGACGACAACGATGAGTTCATTGTGCCCAAGGAAGAGTTACCAGACACCAGAATGGATGAAGAAGACAATCAAGAGTTAACTGTGCCCAAAGAAGAACCGTCCAACTCCAGCTTTACGTTTGAAGAAGACAACGAAAACTCTGTTGTGCGCAAAGAAGAGTTGTCCAACCCTAGTGGCCAAAGTGAAGAGAATGAAGAGAGTGCTATGCCTATCAAGACAGAATTGACCTTGAGTAAAGAGATGCGTAAAATACGGAGTGAGAAAGAGAGTGGACAAGGAGAGTCTGCAGGATATGTAGAAATTAAGGTGGAGCCGGTGGACGAGACAAGTCCCGCCGCCGACTGCGAGCCGTCATGCAGCGACATGTTGAACTGCAAGATCAAAGGTCGCGGTAGAGTGGCCGTGAAGCCGCGCACAAGGAAACAGTTCCATTCCTGCCCACCGTGCCTGAAGGTGTTCACCAGCGACAGAGCCTACTCCGCCCACATGAAAACCCATTGCACCAGAACTCGCCGACCCTCACCCGCGCAGAAAGAACCAGACTCCGGCAAGAAGTACTTCGCTTGCGAACAGTGCGACAGGAGTTACCGGCGAAAGTTCAAGCTTATCGCCCATATGACGTCCCACCACCCCGCCGCTTGCAACATATGCCAGATCAAGTTCGCCAAAAAGAGCCATCTACTCGCTCATATGGCGATACATCCAGGAGTCAAAGCCTACTCGTGCGACCTGTGCGAAATGAAGTTCAAGAACGTGAATCGGCTGAAGGAACACAAAAGGCGTCATACTGCAACTGGCAAGACGTTTCCGTGCGAGATATGTCACGAGAAGTTCGGACAGAAGAAGGATCTGGTGGCTCACATACGAACCCACGCTGATAAGAAGCCTTGCACTTGCGACTTCTGTCAAGTTAAATTGGGTAGAAAGAGAACACCGACAGTTGTTGGCGGGCACCAGTTCACCTGCAATCAATGCCAGGTGCAGTTCACGCATCAGAACCATTTGAACCACCACATGGCCATTCACGGGGAGAAACCTTTCCACTGCAAGCTGTGCACCAGGAAGTACACCAAGCGTCGCGATCTAGCCGACCACATGCGTACCCACGGCGAGGAGAAGGCGTTCGCTTGCAGCATCTGTCAAGTCAGATACACTCTGAAACGGCATTTAACGACACACATGAAGACGCACGCGAAAAATCGCGAGGAACGGAAGGCGTCGTTCAACTGTGACATGTGCCAGAAGAAGTTTACTTACAAAAGTAGTTTAATCGAGCACAGAAGAACTCACACCGACGAGAGACCTCATGCTTGCGAGTGGTGCGACATGAAGTTCAAGTCTTCCAGTGCTCTGACTGGTCACTTGAGGACTCACACGGGCGAGAGGCCCTACACTTGCGGCCTGTGCGAGAAGGGCTTCAGTAAGAATTGCAACTTGACTGCCCACTTGAGAACTCACACCGGTGAGAGACCTTACTCGTGTGATCTATGCCAGGAGAAGTTCACTTACAAGCGCAGCGTAAGGGCTCACATGAGGCTACACGTTAGTCAGAAATCTTGTTCAGGCGCTCTTTTGCCGAAGAAAAGTTCGAAGAACCAGCGCGTAGCAGATGGAACACCGTGCTCTGATCCCGGCCATATTTGCGACGAAACGTGTGAAAAACGAAACGCACTGCTTGTACCCATCCGGACCAAACAAAAAGATCATAGTAACATGAGCTCCAACATGGGCGTTAAACTGTTCTCCTGTCACCTTTGTGAGGCCACGTTTAAGAGAAAATGTAATTTGTCGGAACATGTGAGATCTCATTCTGGCGACAAACATTTCTTCTGTCAGCTCTGTCAGGCGCGATTCGCACAAAAAAGTCATTTAACTGTTCACATGAGAAAGCATACTGGTGGCAAA
CCGTTCTCGTGTGACCTGTGCGAGCTTAAATTTGCACGAAAATCTGGTTTAACGAATCATATGAGCACTCATACCAGCGAAAAACCTTTCTCATGCGATGTTTGTGAGGAGAAGTTTGCCCAAAATTCACATTTACTGGATCACATGAGAGTACATACTGGCGAGAAATCTTTCTGTTGTGATCTGTGCGAGGTTACATTTGCATGCGAATCTGATTTAACTAATCACATGTCATCTCACACTGGCGACAAACCTTTCTCCTGTCAAGTGTGCCCCGTGAAATTTGCTCAGAAAAGTCATTTAAATGTTCATATGAGAATTCATTCTGGCGGTCTTTTGCCGAAGAAAAGTTCGAAGAACCAGCGCGTAGCAGATGGAACGCCGTGCTCTGATCCCGGCCATATTTGCGATGAAACATGTGATAACGAAAACCCACTGCTTGTACCCATGCGATCTATGCAATATAATCTTTTAACTAGTGACATGGAAGTTGAACTTTTCTCCTGTCATCATTGTGAGGCCACGTTTAAAAGAAAATGTCATTTGTCGAAACATATGAGATCTCATTCTGGCGACAAACATTTTTTCTGTCAGCTCTGTCAGGCGCGATTCATACAAAAAAGTCATTTAACTGTTCACATGAGAAAGCATACTGGTGGCAAACCGTTCTCGTGTGACCTGTGCGAGCTTAAATTTGCACGAAAATCTGGTTTAACGAATCATATGAGCACTCATACTAAGGAAAAACCTTTCTCATGCGATATATGTGAGGTGAAGTTTACACAAAAGTCATACTTATTGAATCACATGAGAGTACATACTGGCGAGAAATCTTTCTGTTGTGATCTATGCGAGGCTAAATTTGTATGCAAATCTGACTTGTCGTATCACATGAGATCTCAGCATGGCGACAAACCTGTATCATGCCATGTGTGCCCCCAGAAATTTACTCAGAAAAGTCAGTTAACTGCTCATATGAGAATTCATACGGGCGAGAAACCTTACTCTTGCGGTCTTTTGCCGAAGAAAAGTTCGAAGAACCAGCGCGTAGCAGATGGAACGCCGTGCTCTGATCCCGGCCATATTTGCGATGAAACATGTGATAACGAAAACCCACTGCTTGTACCCATGCGATCTATGCAATATAATCTTTTAACTAGTGACATGGAAGTTGAACTTTTCTCCTGTCATCATTGTGAGGCCACGTTTAAAAGAAAATGTCATTTGTCGAAACATATGAGATCTCATTCTGGCGACAAACATTTTTTCTGTCAGCTCTGTCAGGCGCGATTCATACAAAAAAGTCATTTAACTGTTCACATGAGAAAGCATACTGGTGGCAAACCGTTCTCGTGTGACCTGTGCGAGCTTAAATTTGCACGAAAATCTGGTTTAACGAATCATATGAGCACTCATACTAAGGAAAAACCTTTCTCATGCGATATATGTGAGGTGAAGTTTACACAAAAGTCATACTTATTGAATCACATGAGAGTACATACTGGCGAGAAATCTTTCTGTTGTGATCTATGCGAGGCTAAATTTGTATGCAAATCTGACTTGTCGTATCACATGAGATCTCAGCATGGCGACAAACCTGTATCATGCCATGTGTGCCCCCAGAAATTTACTCAGAAAAGTCAGTTAACTGCTCATATGAGAATTCATACGGGCGAGAAACCTTACTCTTGCGGTCTTTCGCTGAAGAAAAGTACGAAGAACCAGCGCGTAGCAGATGGAACACCGTGCTCTGATCCCGGCCATATTTGCGATAGAACATGTGATAACGAAACCCCACTGCTTGTACCCATGCGATCTATGCAATATAATCTTTTAACTAGTGACATGGACGTTGAACTTTTCTCCTGTCATCATTGTGAGGCCACGTTTAAAAGAAAATGTCATTTGTCGAAACATATGAGATCTCATTCTGGCGACAAACATTTTTTCTGTCAGCTCTGTCAGGCGCGATTTGTACAAAAAAGTCATTTAACTGTTCA
CATGAGAAAGCATACTGGTGGCAAACCGTTCTCGTGTGACCTGTGCGAGCTTAAATTTGCACGAAAATCTGGTTTAACGAATCATATGAGCACTCATACTAAGGAAAAACCTTTCTCATGCGATATATGTGAGGTGAAGTTTACACAAAAGTCATACTTATTGAATCACATGAGAGTACATACTGGCGAGAAATCTTTCTGTTGTGATCTATGCGAGGCTAAATTTGTATGCAAATCTGACTTGTCGTATCACATGAGATCTCAGCATGGCGACAAACCTGTATCATGCCATGTGTGCCCCCAGAAATTTACTCAGAAAAGTCAGTTAACTGCTCATATGAGAATTCATACGGGCGAGAAACCTTACTCTTGTGAGATATGTGAGGAGAGGCGCTCTTTCGCTGAAGAAAAGTACGAAGAACCAGCGCGTAGCAGATGGAACACCGTGCTCTGA
- Protein Sequence
- MEEETKQSLLAAGLIVPKEEVIEDIEITFERDDNDEFIVPKEELPDTRMDEEDNQELTVPKEEPSNSSFTFEEDNENSVVRKEELSNPSGQSEENEESAMPIKTELTLSKEMRKIRSEKESGQGESAGYVEIKVEPVDETSPAADCEPSCSDMLNCKIKGRGRVAVKPRTRKQFHSCPPCLKVFTSDRAYSAHMKTHCTRTRRPSPAQKEPDSGKKYFACEQCDRSYRRKFKLIAHMTSHHPAACNICQIKFAKKSHLLAHMAIHPGVKAYSCDLCEMKFKNVNRLKEHKRRHTATGKTFPCEICHEKFGQKKDLVAHIRTHADKKPCTCDFCQVKLGRKRTPTVVGGHQFTCNQCQVQFTHQNHLNHHMAIHGEKPFHCKLCTRKYTKRRDLADHMRTHGEEKAFACSICQVRYTLKRHLTTHMKTHAKNREERKASFNCDMCQKKFTYKSSLIEHRRTHTDERPHACEWCDMKFKSSSALTGHLRTHTGERPYTCGLCEKGFSKNCNLTAHLRTHTGERPYSCDLCQEKFTYKRSVRAHMRLHVSQKSCSGALLPKKSSKNQRVADGTPCSDPGHICDETCEKRNALLVPIRTKQKDHSNMSSNMGVKLFSCHLCEATFKRKCNLSEHVRSHSGDKHFFCQLCQARFAQKSHLTVHMRKHTGGKPFSCDLCELKFARKSGLTNHMSTHTSEKPFSCDVCEEKFAQNSHLLDHMRVHTGEKSFCCDLCEVTFACESDLTNHMSSHTGDKPFSCQVCPVKFAQKSHLNVHMRIHSGGLLPKKSSKNQRVADGTPCSDPGHICDETCDNENPLLVPMRSMQYNLLTSDMEVELFSCHHCEATFKRKCHLSKHMRSHSGDKHFFCQLCQARFIQKSHLTVHMRKHTGGKPFSCDLCELKFARKSGLTNHMSTHTKEKPFSCDICEVKFTQKSYLLNHMRVHTGEKSFCCDLCEAKFVCKSDLSYHMRSQHGDKPVSCHVCPQKFTQKSQLTAHMRIHTGEKPYSCGLLPKKSSKNQRVADGTPCSDPGHICDETCDNENPLLVPMRSMQYNLLTSDMEVELFSCHHCEATFKRKCHLSKHMRSHSGDKHFFCQLCQARFIQKSHLTVHMRKHTGGKPFSCDLCELKFARKSGLTNHMSTHTKEKPFSCDICEVKFTQKSYLLNHMRVHTGEKSFCCDLCEAKFVCKSDLSYHMRSQHGDKPVSCHVCPQKFTQKSQLTAHMRIHTGEKPYSCGLSLKKSTKNQRVADGTPCSDPGHICDRTCDNETPLLVPMRSMQYNLLTSDMDVELFSCHHCEATFKRKCHLSKHMRSHSGDKHFFCQLCQARFVQKSHLTVHMRKHTGGKPFSCDLCELKFARKSGLTNHMSTHTKEKPFSCDICEVKFTQKSYLLNHMRVHTGEKSFCCDLCEAKFVCKSDLSYHMRSQHGDKPVSCHVCPQKFTQKSQLTAHMRIHTGEKPYSCEICEERRSFAEEKYEEPARSRWNTVL
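A quick consistency check between the two sequences is to translate the coding sequence and compare the result with the listed protein. The sketch below assumes the standard genetic code and an in-frame sequence made up only of A, C, G, and T; `translate` is a hypothetical helper, not part of any tool referenced in this entry.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, dropping one terminal stop."""
    # Remove line breaks and spaces so a wrapped sequence can be pasted as-is.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    # Raises KeyError on ambiguous bases such as N (not handled in this sketch).
    residues = [CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3)]
    if residues and residues[-1] == "*":
        residues.pop()
    return "".join(residues)


# First and last codons of the coding sequence shown above.
print(translate("ATGGAGGAAGAAACCAAGCAG"))
print(translate("AACACCGTGCTCTGA"))
```

Passing the full coding sequence and comparing the result with the protein string would confirm that the two fields agree.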
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -