Bdim018112.1
Basic Information
- Insect
- Balanococcus diminutus
- Gene Symbol
- PEG3
- Assembly
- GCA_959613365.1
- Location
- OY390718.1:39586416-39598388[+]
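The Location value packs four fields into one token. A minimal sketch of how it could be unpacked, assuming the layout is `seqid:start-end[strand]` with 1-based inclusive coordinates. The record does not state the coordinate convention, so the span length computed here rests on that assumption. `Location`, `parse_location`, and `span_length` are illustrative names, not part of any database tooling.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Assumed layout: <seqid>:<start>-<end>[<strand>], strand being '+' or '-'.
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into sequence ID, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


def span_length(loc: Location) -> int:
    """Genomic span in bases, assuming 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("OY390718.1:39586416-39598388[+]")
print(loc.seqid, loc.start, loc.end, loc.strand, span_length(loc))
```

Under the stated assumption, the gene spans 11973 bases on the forward strand of OY390718.1.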
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Its two conserved cysteines and two conserved histidines coordinate a zinc ion. The domain follows the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in brackets indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this definition does not cover other known sites, such as the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
 #  of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
 1  40   0.00042     0.033   14.8   2.0         2      23       115     137       114     137  0.95
 2  40   0.00023     0.018   15.6   1.9         1      21       406     426       406     430  0.90
 3  40       0.1       7.9    7.3   5.5         2      23       441     463       440     463  0.96
 4  40     0.039         3    8.6   1.8         2      23       468     490       467     490  0.96
 5  40    0.0019      0.15   12.8   3.6         1      23       503     525       503     525  0.99
 6  40   0.00058     0.045   14.4   0.8         1      19       529     547       529     553  0.94
 7  40      0.18        14    6.6   1.3         3      23       564     585       562     585  0.93
 8  40      0.06       4.6    8.1   3.2         3      23       591     612       590     612  0.97
 9  40    0.0038       0.3   11.8   0.8         1      23       618     640       618     640  0.97
10  40   4.5e-05    0.0035   17.9   1.8         1      23       644     666       644     666  0.98
11  40    0.0014      0.11   13.2   0.6         2      23       673     695       672     695  0.96
12  40    0.0036      0.28   11.9   0.9         2      23       702     724       701     724  0.92
13  40       0.1       7.8    7.4   0.1         3      23       742     763       740     763  0.90
14  40   0.00012     0.009   16.6   3.2         3      23       770     791       768     791  0.95
15  40   0.00017     0.013   16.0   2.1         1      23       796     818       796     818  0.99
16  40   0.00013      0.01   16.4   1.5         1      23       824     846       824     846  0.98
17  40    0.0004     0.031   14.9   1.8         1      23       852     875       852     875  0.96
18  40     0.018       1.4    9.7   3.0         1      17       879     895       879     896  0.92
19  40   0.00035     0.027   15.1   1.1         2      23       914     935       913     935  0.96
20  40    0.0013       0.1   13.3   0.9         2      23       940     961       939     961  0.96
21  40   1.3e-06    0.0001   22.7   0.3         2      23       968     989       967     989  0.97
22  40    0.0088      0.68   10.7   5.8         2      23       998    1019       997    1019  0.93
23  40   9.3e-06   0.00072   20.0   1.7         2      23      1026    1047      1025    1047  0.97
24  40   4.2e-06   0.00033   21.1   3.3         2      23      1056    1077      1055    1077  0.97
25  40   5.7e-06   0.00045   20.7   0.5         3      23      1085    1105      1084    1105  0.98
26  40   1.5e-06   0.00011   22.6   7.7         1      23      1109    1131      1109    1131  0.99
27  40     0.012      0.95   10.2   6.6         3      23      1139    1159      1137    1159  0.96
28  40   1.5e-05    0.0012   19.4   1.2         3      23      1169    1189      1167    1189  0.97
29  40       0.2        15    6.4   0.1         2      23      1196    1217      1195    1217  0.96
30  40   0.00011    0.0082   16.7   0.2         2      23      1257    1278      1256    1278  0.96
31  40   0.00029     0.023   15.3   1.5         1      23      1291    1314      1291    1314  0.95
32  40   0.00089     0.069   13.8   1.9         1      19      1402    1420      1402    1425  0.92
33  40      0.32        25    5.8   0.2         3      23      1434    1455      1432    1455  0.95
34  40    0.0011     0.089   13.5   3.0         1      23      1460    1482      1460    1482  0.96
35  40   1.9e-06   0.00015   22.2   4.3         1      23      1486    1508      1486    1508  0.98
36  40    0.0001     0.008   16.8   2.8         1      23      1514    1536      1514    1536  0.98
37  40       0.2        16    6.4   1.3         1      23      1542    1565      1542    1565  0.93
38  40    0.0008     0.062   14.0   2.9         3      23      1573    1594      1571    1594  0.95
39  40   1.2e-05   0.00096   19.6   2.5         1      23      1600    1622      1600    1622  0.98
40  40   1.6e-07   1.3e-05   25.6   0.7         1      23      1628    1650      1628    1651  0.96
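The pattern given in the Description can be cross-checked against a single hmmscan hit. The sketch below writes the pattern as a regular expression and applies it to residues 114-137 of this record's Protein Sequence, which is the envelope of hit 1. One assumption is made: the positions marked # are matched as any residue, because the Description does not say which amino acids are allowed there. `C2H2_RE` and `find_fingers` are illustrative names.

```python
import re

# Pattern from the Description:
#   #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' is matched as any residue ('.'), like X.
C2H2_RE = re.compile(
    r"."        # '#' fold-stabilising position
    r"."        # X
    r"(C)"      # first zinc ligand
    r".{1,5}"   # X(1-5)
    r"(C)"      # second zinc ligand
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"(H)"      # third zinc ligand
    r".{3,6}"   # X(3-6)
    r"([HC])"   # fourth zinc ligand, His or Cys
)

# Residues 114-137 of the Protein Sequence in this record
# (envelope coordinates of hmmscan hit 1).
EXCERPT = "LTCEICDQSFTSCSQLLGHKSTTH"
EXCERPT_START = 114  # 1-based position of the first excerpt residue


def find_fingers(seq, offset=1):
    """Return (start, end, ligands) for each non-overlapping pattern match.

    Coordinates are 1-based and inclusive; `offset` is the protein
    position of seq[0]. `ligands` lists the four zinc-coordinating
    residues as (amino acid, position) pairs.
    """
    hits = []
    for m in C2H2_RE.finditer(seq):
        ligands = [(seq[m.start(g)], m.start(g) + offset) for g in range(1, 5)]
        hits.append((m.start() + offset, m.end() + offset - 1, ligands))
    return hits


for start, end, ligands in find_fingers(EXCERPT, EXCERPT_START):
    print(start, end, ligands)
```

For this excerpt the match covers positions 114-137, the same span as the envelope of hit 1, with Cys116, Cys119, His132, and His137 as the candidate zinc ligands. A regular expression this permissive will also match non-finger sequence, so it complements the profile HMM scores rather than replacing them.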
Sequence Information
- Coding Sequence
- ATGGACAGGCAACTAGAGATTTTCTTGGCAGTCATGAGCACGGGCCGAGCTTTGTTCGGGCTACCCGGGTCGAGCTCGTCAAATATCGGGCCCGAGTCGGGCCGGGCCTCGAGCCTGAGCCCGATTGACGATAAGAACATCGAATCAGCATCGATAGTTGAAAATCGCTCCAGTCCAAAGTCGCCACATTCGGTTATTCAATACGACACCTTAAACGACGACCAACCCAGTCCAGATgcagttttaaatatcatcACATCAGATGCAATCGACGAACCAGACTCGATAGATCATTCAACGAAGGCCGAATCAATCcaatcatctgaaaaatcgCCACTCCTCACGTGCGAAATTTGCGACCAATCTTTCACAAGCTGCAGTCAACTTCTCGGTCATAAAAGCACAACTCACACTATAGAGAATAACATATCTCTTGtcgtgaaaaatgatgatatcGATTCCGTTGACATAGCTGATATCGAAATCAATACCATCGAAGATCTAGATGAAGTGTCGAACGACTGCAAAACTGACTCGTTCAATCATTTGATAAACAAAGAAATAAGCTCTCTCGATCAATGTGCAGTTCCTAGCGAGATTGAAGGTAAAATATCTACTACTTCTAGTCATCTTCTAGATCAACCTGATACAGATCAGGATCAGACTAGTCTTTGTTTCGTGGACTGTGATCATCAATCTAACGATGCTGAAGTTGATATGATAAATGTAGGCAATAGTGTTCAAAACATTACAGTAATATTTgccaatgaaaataaatctgaATCGACGAGTGAATCGTTTAAAGATTGTGTTGCCGTCGAAAGTCTTGATCAATGTGAAATGGAAGTCGAAGTTTCTTCTCCATTCAGTGATATCGATTCAGGCTACGAAAATATACATATTCCAACAGTCGACGAATCCAGCTCAGACAGCGAAGTGGATGGTCCACAAGATGTTGAAGAAATCTTGGACGATATAGAATCAGATACGATAAATGATTTGGTAAATGAAGCAGGATGTGAACAGCCTGTTGCTGACAGTCGAACTGCTGAAAACAGTCGTATCGTGAATGGCAATGGTATAAATTCTGAAGATCATGATCAAACGAACATCGCTGAAgacaaatttgaagaaatctccATAGATATCGACTTGGACACGACAATCAACCATCAGACACATCGATTTGACCAGAACAGCACCTCCAACAAGTCattcaattgtaaaaaatgcgACAAAACATTCACCAGCGAAAAAGGTCTGCGTATTCACGAAAAGCAACAATCTCACGGACCTGGAGGTCGTCCCAATCCCAGGCTAAAATGCCAACAATGCGATTTTCACGCTAGATACAAACATAATCTCAAGTATCATATTCTAACTCGACATGGCAGTTCGGTGACTTGTCCAACATGTGGTAAACTATGCGACAACAGGATTAGCTACCTCAATCATCGTAGAAATTATCACACGAATAAACCTCCAAGCCAGAAGCATACGAAAAAGTACACTTGTGCCCTCTGCCAGAAAGATTTCTGGCACAAATCAACTCTAAAATCGCATCTCCGCCGTCATAACAAAAGATTTAGATGCAAACAATGCGGCAAAACATTCGCCAAAGAAAGAGGTTTACGCGTTCATGGATATCTTCGATCGCATAAGATTGATCAATCGACACTGACGCTGGATTGTGACAAATGCGACTACCACGCTAAGAACAGACCGAATCTCAATTACCATAGCGCAACCAAACACACCGGTCCAGTCCTTTGTCCAACATGTGGAGTATCATGCAGCAATAGAATCAGCTTCCTATATCACCACAGACGTGTACACTTGAACAAGAAAGAATTCATTTGCGCTCATTGTGAGAAATCTTTCGGTGACAAAAGAATCCTACTAGCGCATCTCCGCAATCATAGTTCTGCTCATAAATGCGAAGAGTGCGATGAAAGGTTCCCTACGAAGATGGGATTACGTACACATAAACGAATTCAT
ACATCTGTGACCGTGATAAAGTGCGACGTATGCGATTTTACTTTCAACTCTCGTAGAATGGTCAATGATCATATGTTGAGCCAGCATGGTATCGAAGGATCTATACAGTGCAATTTGTGCGACGAGTTGTTCAAAAGTCAGACGTCTCTTTCGAGACATAATACCAAGGTCCATAAGCAGAGTGCACGCATTCGATATGTACGTGATACATCCGAACCTATAATTTGTAACGATTGCGGAATTTCCTTTTGGGCACGAACCCAACTTATACAACATATCGAGTTTAACCACATGGTCGAAGAATCTATTTGCCAAGTTTGCAATAAAACACTCAGAAGCAAAGCCAGTCTGAAAAAACACATGAGACACAATCACAACTTGAACAAATACACGTGCGAAATATGCGATTATAAATGCCCAACACCGAGTCGTCTACAGGATCACTTCAGGACACACCTGGGAGAAAACCCTTACTCCTGCGATGTTTGCGATAGCTCCTTCACAAATAGAAAAAGTCTCATTCAGCACAAAACCACTCACGAAGAAGAAAGTCAATTCGTCTGCGATCATTGCCAAGAAAAATTCAGGTTGAAAATAAGCCTGAGAGAGCACATCAGAGAAACACACACGTTCCCGTTCAAATGCAACCAGTGCCATAAACGGTTCTCCACAGAAGGGGAACTTTGCTCGCAGGAGAACGCCCATACTTCCGATGGTAAACTAGTCCCTCAAAAGAAGTGCATTTGCGAGATTTGTGGCAAAGCCTGCGCCACTGAATTCTATCTAGCACTGCATCAACGTGTTCACAACGATGCTCTGACTTGTAGCACTTGTCAGGAATCGTTCAACTCGATAAAACTCCTGTCACGCCATCTTCTCAGTCATGCCCCCGCTGAAAAGATCCCTTGCGAAACATGTGGCAAAACTTTCATCAATACGAGTGCTCTGAAAACGCACCAAAGAGTTCATCAACGTGATCCCAATGATGAGCTGACTTGTATCCATTGCCAGAAATCTTTCGACACAAAAAAAGACATGACACATCATCTTCGCAGTCACGAATCCCTTGAAGGGGTTCCTTGTGAAACATGTGGCAAAACTTTCACCAACACGTGTGCTTTGAGAACGCACCAAAGAGTACATCAACGTGATCCCAACGATGAGCTGACTTGTAACCATTGTCAGGAAGCTTTCACCTCGAAAAAAGACCTGACACGTCATCTTCGCAGTCACGAAGCTCTGGAAGGGGTCGCTTGCGAAACATGTGGCAAAACTTTCACCAGCACGGGTGGTTTGAAAACGCATCAAAGAATCCATCAGCGTGAGTTCACCTGTAACCATTGTCACAAATCTTTCAACAGGAGAAGTAACATGGCACGTCATCTTCGCACTCacgaacctcccaaaaaaatgtgttgcGAAACATGTGGCAAAACTTTCACCTGTAGGTCTGGTTTCAAATCGCACCTAAGAATGCATCAACGTGTTCACAGTGATAAGCTGGCTTGCGACTATTGTCAGAGACCTTTCAACAAGAGAAAAGACTTAACACGTCATCTTCGCAGTCACGATATTCTCGAACAGATGTCTTGTGAAACATGCGGCCAGAATTTCCTCGGAACTGCTGCTTTGATATCGCACCAAAAAATGCATCAACGTGTTAACAACGATGAGCTCACAATGCCGACTTCTAACGATTGTCAGACATCTTTCGACTTGGGAGAACCCTCGACAGATCAGCTTGGCAGTCATGTCCCTCTTTTAAAGATGCCTTGCGAAATATGCGGCCTAACTTTCATTAATACGGCTACTTTGAGATCACACCAGAAACAgcaccgaaaaaaattcacAACGCATACTGGAGAAACGCCTTACTTGTGTGATATATGCGACCTCAGATTTTCGACGAAAACAGACTTACACTGGCATAATCGCATCTATCATTCTCCCAATCAGATTAAGAAAGCATTCGTTTTACCAAGGGACAGCGAATCCAGACC
GGTTGAAAATTGCAACAGTCCTTACTTCGAAACCATTTACCTTCTAACTAGTCCTTATTCAAGAGACGACGAAACCGACTCGCACGACGTACTCGACCCAGATACGGTggcaaacattgaaaaaatctcgattaaaATAGAACCAGACTCAGTGAAACAGTtaccaaagaagaaaaaattcagctcttATTACAGGCCGCAACATTTCCCTTGTGGAATTTGCCACAAAACATTCACTCGTGCTTGCGATCTATTAAATCACGACAATATAGATCACGCGAACATTACTCCTGTCGTGAACTGCGACGTTTGCGGTAAACTAGTCATATCGGAAGATCGACTGACGCttcataaacaaaaattccacACTGAGAAAATCTACCCTTGTAACATTTGTCAGAAGAAATTCGTCTCTTGTAAATCGCTCGACAAACATTACGATCTACATTTCCAAAAGTTCACTTGCGATCACTGCCAGAAAAGTTTCCGGTCCAAGAGCACTTTGACGACGCATATTGCCAGACATATGGTGACGCAGACGTTCAATTGTAACAaatgtggtaaaaatttcttcagtaaGGCTGGATTACATATTCACGAAAGGACTCATTCTGCTGTCAAACTATTCAAATGTGATAAATGTGGATTCGCTGCCAAACACCAGAGTGGAATTTATATACATACGTTGAATCGACATACTGTGGAAGAATCGATCATTTGTGAACTATGCGGTAgatcgtttaaaaataagatgCGTTTGAATGAACACCATAAACGAAAACACGAGACTCCGAGAAGATACGCTTGCGATAAATGCGATTACAAGTTCAAACAGATGTATCAACTTAGGGTTCATTACAGAACACATACTGGCGAAAAACCGTTCGTTTGTGAAATCTGCGGCAAGGCGTTTACTCGAAGTGATGGGTTGAAAGAGCACACACGTATTCATCACGATAACAAATCACTGACGTGA
- Protein Sequence
- MDRQLEIFLAVMSTGRALFGLPGSSSSNIGPESGRASSLSPIDDKNIESASIVENRSSPKSPHSVIQYDTLNDDQPSPDAVLNIITSDAIDEPDSIDHSTKAESIQSSEKSPLLTCEICDQSFTSCSQLLGHKSTTHTIENNISLVVKNDDIDSVDIADIEINTIEDLDEVSNDCKTDSFNHLINKEISSLDQCAVPSEIEGKISTTSSHLLDQPDTDQDQTSLCFVDCDHQSNDAEVDMINVGNSVQNITVIFANENKSESTSESFKDCVAVESLDQCEMEVEVSSPFSDIDSGYENIHIPTVDESSSDSEVDGPQDVEEILDDIESDTINDLVNEAGCEQPVADSRTAENSRIVNGNGINSEDHDQTNIAEDKFEEISIDIDLDTTINHQTHRFDQNSTSNKSFNCKKCDKTFTSEKGLRIHEKQQSHGPGGRPNPRLKCQQCDFHARYKHNLKYHILTRHGSSVTCPTCGKLCDNRISYLNHRRNYHTNKPPSQKHTKKYTCALCQKDFWHKSTLKSHLRRHNKRFRCKQCGKTFAKERGLRVHGYLRSHKIDQSTLTLDCDKCDYHAKNRPNLNYHSATKHTGPVLCPTCGVSCSNRISFLYHHRRVHLNKKEFICAHCEKSFGDKRILLAHLRNHSSAHKCEECDERFPTKMGLRTHKRIHTSVTVIKCDVCDFTFNSRRMVNDHMLSQHGIEGSIQCNLCDELFKSQTSLSRHNTKVHKQSARIRYVRDTSEPIICNDCGISFWARTQLIQHIEFNHMVEESICQVCNKTLRSKASLKKHMRHNHNLNKYTCEICDYKCPTPSRLQDHFRTHLGENPYSCDVCDSSFTNRKSLIQHKTTHEEESQFVCDHCQEKFRLKISLREHIRETHTFPFKCNQCHKRFSTEGELCSQENAHTSDGKLVPQKKCICEICGKACATEFYLALHQRVHNDALTCSTCQESFNSIKLLSRHLLSHAPAEKIPCETCGKTFINTSALKTHQRVHQRDPNDELTCIHCQKSFDTKKDMTHHLRSHESLEGVPCETCGKTFTNTCALRTHQRVHQRDPNDELTCNHCQEAFTSKKDLTRHLRSHEALEGVACETCGKTFTSTGGLKTHQRIHQREFTCNHCHKSFNRRSNMARHLRTHEPPKKMCCETCGKTFTCRSGFKSHLRMHQRVHSDKLACDYCQRPFNKRKDLTRHLRSHDILEQMSCETCGQNFLGTAALISHQKMHQRVNNDELTMPTSNDCQTSFDLGEPSTDQLGSHVPLLKMPCEICGLTFINTATLRSHQKQHRKKFTTHTGETPYLCDICDLRFSTKTDLHWHNRIYHSPNQIKKAFVLPRDSESRPVENCNSPYFETIYLLTSPYSRDDETDSHDVLDPDTVANIEKISIKIEPDSVKQLPKKKKFSSYYRPQHFPCGICHKTFTRACDLLNHDNIDHANITPVVNCDVCGKLVISEDRLTLHKQKFHTEKIYPCNICQKKFVSCKSLDKHYDLHFQKFTCDHCQKSFRSKSTLTTHIARHMVTQTFNCNKCGKNFFSKAGLHIHERTHSAVKLFKCDKCGFAAKHQSGIYIHTLNRHTVEESIICELCGRSFKNKMRLNEHHKRKHETPRRYACDKCDYKFKQMYQLRVHYRTHTGEKPFVCEICGKAFTRSDGLKEHTRIHHDNKSLT
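The Coding Sequence and Protein Sequence entries can be checked against each other by translating the former. A minimal sketch, assuming the standard genetic code (the record does not name a translation table) and using only the first and last codons of the Coding Sequence as input. `translate` is an illustrative helper. Lowercase stretches in the sequence are upper-cased before lookup.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: the record does not state which translation table applies.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper()  # the record mixes upper- and lowercase bases
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First six codons and last five codons of the Coding Sequence above.
print(translate("ATGGACAGGCAACTAGAG"))
print(translate("AAATCACTGACGTGA"))
```

The two excerpts translate to MDRQLE and KSLT, which match the start and end of the Protein Sequence entry.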
Similar Transcription Factors
Clustering by sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -