Bdim012043.1
Basic Information
- Insect
- Balanococcus diminutus
- Gene Symbol
- -
- Assembly
- GCA_959613365.1
- Location
- OY390716.1:60137693-60142309[+]
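The location string above packs the sequence accession, start, end and strand into a single field. A minimal Python sketch of how it could be split apart follows. The names `Locus` and `parse_locus` are hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical container for one 'seqid:start-end[strand]' location."""
    seqid: str
    start: int
    end: int
    strand: str

    @property
    def span(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


# Matches strings such as 'OY390716.1:60137693-60142309[+]'.
_LOCUS_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_locus(text: str) -> Locus:
    """Split a location string into accession, coordinates and strand."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


locus = parse_locus("OY390716.1:60137693-60142309[+]")
print(locus.seqid, locus.strand, locus.span)  # OY390716.1 + 4617
```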
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. This domain folds into a globular structure with three α-helices connected by two short loops that harbour a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contact with the DNA. The first helix of this motif helps to stabilise the structure, while the second helix binds DNA through hydrogen bonds and hydrophobic interactions between specific side chains and the exposed bases and thymine methyl groups within the major groove. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
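| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|----|----------|----------|-------|------|----------------|--------------|----------------|--------------|----------------|--------------|-----|
| 1 | 15 | 0.43 | 99 | 3.0 | 0.2 | 3 | 49 | 7 | 57 | 5 | 61 | 0.84 |
| 2 | 15 | 0.06 | 14 | 5.8 | 0.1 | 22 | 48 | 128 | 154 | 114 | 159 | 0.84 |
| 3 | 15 | 0.011 | 2.5 | 8.2 | 0.0 | 4 | 45 | 165 | 206 | 163 | 207 | 0.89 |
| 4 | 15 | 0.0006 | 0.14 | 12.2 | 0.6 | 8 | 55 | 268 | 316 | 261 | 318 | 0.90 |
| 5 | 15 | 5.7e-12 | 1.3e-09 | 37.9 | 0.2 | 7 | 53 | 332 | 378 | 329 | 382 | 0.96 |
| 6 | 15 | 0.025 | 5.6 | 7.0 | 0.0 | 24 | 47 | 505 | 528 | 488 | 529 | 0.90 |
| 7 | 15 | 0.00022 | 0.05 | 13.6 | 0.0 | 11 | 50 | 604 | 643 | 597 | 647 | 0.93 |
| 8 | 15 | 1.4e-12 | 3.1e-10 | 39.9 | 0.1 | 7 | 53 | 666 | 712 | 664 | 714 | 0.96 |
| 9 | 15 | 0.001 | 0.24 | 11.4 | 0.1 | 22 | 50 | 757 | 786 | 739 | 794 | 0.75 |
| 10 | 15 | 5.8e-09 | 1.3e-06 | 28.3 | 0.1 | 6 | 52 | 805 | 851 | 804 | 856 | 0.94 |
| 11 | 15 | 3.5e-07 | 8e-05 | 22.6 | 0.3 | 11 | 54 | 902 | 945 | 897 | 946 | 0.91 |
| 12 | 15 | 7.1e-14 | 1.6e-11 | 44.0 | 0.1 | 5 | 53 | 961 | 1009 | 957 | 1011 | 0.95 |
| 13 | 15 | 1.8e-07 | 4e-05 | 23.5 | 0.1 | 9 | 52 | 1163 | 1206 | 1159 | 1207 | 0.92 |
| 14 | 15 | 1.8e-11 | 4.1e-09 | 36.3 | 0.3 | 3 | 57 | 1421 | 1475 | 1419 | 1475 | 0.95 |
| 15 | 15 | 0.0021 | 0.48 | 10.4 | 0.1 | 13 | 50 | 1494 | 1531 | 1486 | 1535 | 0.88 |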
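The hmmscan output above lists 15 candidate Homeobox domain hits of very different strength. A minimal Python sketch of how the stronger ones could be separated from the weak ones follows. The values are copied from the table above (hit number, c-Evalue, i-Evalue, score, alignment start and end). The names `DomainHit`, `parse_hits` and `significant` are hypothetical, and the cutoff of i-Evalue ≤ 1e-3 is an illustrative assumption, not a threshold used by this database.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one per-domain hmmscan hit."""
    index: int
    c_evalue: float
    i_evalue: float
    score: float
    ali_from: int
    ali_to: int


# Columns: hit number, c-Evalue, i-Evalue, score, ali coord from, ali coord to.
ROWS = """
1 0.43 99 3.0 7 57
2 0.06 14 5.8 128 154
3 0.011 2.5 8.2 165 206
4 0.0006 0.14 12.2 268 316
5 5.7e-12 1.3e-09 37.9 332 378
6 0.025 5.6 7.0 505 528
7 0.00022 0.05 13.6 604 643
8 1.4e-12 3.1e-10 39.9 666 712
9 0.001 0.24 11.4 757 786
10 5.8e-09 1.3e-06 28.3 805 851
11 3.5e-07 8e-05 22.6 902 945
12 7.1e-14 1.6e-11 44.0 961 1009
13 1.8e-07 4e-05 23.5 1163 1206
14 1.8e-11 4.1e-09 36.3 1421 1475
15 0.0021 0.48 10.4 1494 1531
"""


def parse_hits(text: str) -> List[DomainHit]:
    """Turn whitespace-separated rows into DomainHit records."""
    hits = []
    for line in text.strip().splitlines():
        idx, c_ev, i_ev, score, start, end = line.split()
        hits.append(DomainHit(int(idx), float(c_ev), float(i_ev),
                              float(score), int(start), int(end)))
    return hits


def significant(hits: List[DomainHit], max_i_evalue: float = 1e-3) -> List[DomainHit]:
    """Keep hits at or below the (assumed) independent E-value cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


HITS = parse_hits(ROWS)
STRONG = significant(HITS)
BEST = max(HITS, key=lambda h: h.score)

for hit in STRONG:
    print(f"hit {hit.index}: residues {hit.ali_from}-{hit.ali_to}, score {hit.score}")
```

Under this assumed cutoff, hits 5, 8, 10, 11, 12, 13 and 14 remain, and hit 12 (residues 961-1009) has the highest bit score. A different cutoff would change which hits are kept.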
Sequence Information
- Coding Sequence
- atgaatttttgttcagcAGCTCGTATTACAATTACCAGAGAGCAGAGCATCAAGTTACAGAAACTGTTGCACGAGAAACCAGCCGGCCATCGCAAGTTCACCGATTTCGAAGTACGCAGAAATGCCTCCCGATTGAAAATCCCAGCTGTAAAAATTCGCTATTGGCTAAAACAACGTAACGCTTACGTCGACGAAGGAGCTGCTGAAACACGTTCATTATCCATCGTGGTACCCAAATCTCGCGTATTATCCGTGGTCGTGCCTAAATCACAGGACGACGAAGTATCCGTCATAGGAGTTAAATCTCACTCTCGCCAATCCAAGTTACTCAACGAGATATCGTTCGAAAAAAAGGTacagcttttgaaaaaaaccatgACCACGATGAACGTGAGCACTCGCTACACGCGAGAATTATCGCGAGAATTGCAAATAGACGAGAATCGAATTTACAACTGGCTCAAGCATAAGGTTCATCGTAActtgaatatgaaaaaatctctaaccGCCGACGAGCTTTCAGCTTTACATAGGAAATTCGCCGAATACGATTACCTGGACGACGACACGGCTCTAGTGTTGTCCGATCGTTTCAACGTTCCTATTGCGGTGATCAAGAAAGCGCTCCAGCAGCATAAACCGTCGACCACGCCGAAGAACGTCCCTAAGCCCGTTGTCCACATCAGAAAATTGCCGATAGTGGCTTTAACGTCGGTGACGCCATCGAAGGGAATCAGATCGTCACCGAAGATGAAACCAGTCGGCGACGGCGGCTCTTCGAAGAAGAATCCGAGTTTTAAAACCGTCGATCAGAAGAACTTGCTATTCGAGGCGTTCCGCCAGTCTCCGACGCTGACCAAAAACGATACCGTGACCGAGCTCGTGAACGCGACCGGACTGCAGCCTCAGCAGATTACCAAATGGCTGAGCGAGTTTCGAACGAAATGGGCCCACAATAACGAAAGCGGATTGAAGAAAGCGCTGACTCGCGGATTAACCCAAAGCCAACTGGTAGAGATGGAGAAAGCGTACCGAGGGGATCGATTTTTGACCCATGCTCAGCTCGCCGAATTAGCCAAGACCGTTAATTTGACAGTGAAAGCCTTACAAGCTTGGTTTTCTAATCGTCGAGTCTACGAGATTCGTTCGTGCGATAAAGATCTCGTGAATATCGGACCGAAAACCGCCGCTGCTAGAAAAGAAGCCAGTCAGCCGTTACAGCAGCAGCCACAGCCTCAGCAGGTAATCGCCCAACCGGCCAATTCGGGCGATGCAGACAACGCAGAGTTCTTCGAATCTTTGAACAAGGAGCAAAAAGCGCTTTTGAAAGCCGGCTGCAAGAGTTACAACGTTTCGTACTCCAGATTAGCTCAAACTATTGGCGTACCGGCCGAAAAGATCAAGCAATACATCGTAGGCTATCGCGTACGTCATTCGATCTTCAGACAGAATAATCGTTCGATCCCGGAACGTGTTCACAAATCGTTGCTAAATCATATGCTGAAATATGGTAAAATATCCAGCAAGACTAGCGTAGCTCTAGccaaacgtttgaaaatacgCCCCGAGCAGGTTAGAAATTGGAGCCGAACTTACGCCAGTCGTATCCTCGAGCATAACAGACCAGCTCAGCGACCGGCCATCGCAGCAGCTCCTCAACCTGCCCCAGTCGCACCCGCGGTCACCACTCAGCAACCGGTCGTAGAACTTTACGAAGATACTAAACCAGCAGCGTCGTCGGCGGCAGCTGGGCCTAAATATTCGGCGAAAAAAAGGTTAGTAGGTTCGAAATACGTAGTTCCTTCTTCTGCCAAAAACGTGCTGCTGGAAGAATTCTTAAAATCGCCTCAAGCGGCCGCGAACAAGAGTAAAGAACTCGCCAAAATGGTGGGAATAACTCCGATGCAGGCGAACAATTGGTTTTACAATTTCGGTAAAAGACTCAACGAGCAAAATAAATCCGAAGTGGTCGCTTGCGTGATAAATAATCCGGCAATC
AGCGACCAACAACGAGCTAGGCTAGAGGCCGAGTATAGAAATTGTCGTTATTTGGAGGTTCCCGAAATGGAAGCGTTGACTGCGGAAATCGGCTTGAAAAGGAAACAAAtagaaaattggtttattaACGCTCGTTACTACGAAGTGTTGACCGGACAGTGTCCCGGAGGATCTGGCGGTACCAAGACGGCTTCGAAATCGGCCACGACGCCCAAATCGTTTTACGAAAGTCTGAACTCGCAGCAACGGGCATGTTTGAAGGACGAACTCAACTCGTACCCTTTGGACGATCAAAGACTCTACGACCTGGCTAACGAATTACAAGTGTCTTCGAACGAGTTGAAGAATTGGTTCGAAAAAGCTTCTCGGAGTAAGACACGAAGTCGTAGTTTACCTAATCCCGCGCTACCTCTCAATTTAACCGCTAAAGCGTTGGAAACGTTGACCAAGGAATTCGAAATCGATCCTATTCTGCCGGAAGCTCGCGCCGGTGCGTTAGCTAAACGCACCAAAGTCACCAAAGACCGTATCAAAGCTTGGTTCGCGAATCGCCAAGAAGAAATGCATAAATTCGAAGAGAACGAGCAGGCCGACGAACGTAAGCTAGTCAAACATAAAATACCTCCGATTAGAATCACCATACCAAAGTACGTGGCTGCGAACGGCGACGCGGATTTCGACTCGTCGTCCAAGAAAAGCTACTTCCAACAGAACAGACTATTCGAAGAGTTCAAGATGGATCCGACCTTGACCAACGAAAGATTGATCAAAATAAGCAGAGATACCAACTTGGACGGTAAACAGATTTCAGCCTGGTTCGATTGGATAAAAGCGAAACTAGCTTCGATACCGAGAGACAGTCTGCTCGAGGAGAATCGCAACAGTAATCTAACTGCTCGACAAATCGCCATACTAGGAGAAGAGTACGCCAAGAACAGATACGTCGACAGAACGGTTCGCGAAGCCCTCTCAAGATCGTTGGGCGTATCGAAAAACGTCGTAAAAACCTGGTTCGCAAACAAACGTTATTCGGAAATTCTCTGCCAAGACGGATCCGTCGATAGCGATCCGCTTTCTCCTCAGGATGGAGatttcgacgacgacgacgaagacgaagCACGTCAGACCGAAACGCCCATCGACTACGGATGGGACCAAGAAGAAACGGTCTTCGTCGACGACGACGCCGAAGTTGAACATAAATTAGACGTTAAACGCCCGGTAGAAGTCGATCCCTTGCTCGATTACAGTTTCGAAGACGAAACCAACAACGAAGACGTCAAAACCAACATTTCGTATAATTTCCAACTTTCGCCGAACAGCGCGTTCACCGACAATTTGACCAGTTTACCGTATCAACCTCGAAATCGTCTCTTCACTAGGCTCGGTAAAGATTTAGAACCATTGGAAGGAGACGTCGAAAAGCTATTCTGGTTCGCGAGTAAAACGCCGAATAAAACAGACGCCAGATTCGATCAAGATATCGACGACGAGAAGAATCGCGTCTTGGAAAgcgaatttggtaaaaatcctTGGCCAGAGTTGGAAAGAATATCGCAACTGTCGACTCAGCTGTTGATCTCGGAACCGAAAATACACTGGTGGTTCATAAAGAAACGCTGTTTCTTGACGAAAACCATTCTCAATATACCATCGCCCAAGCCCAAGCCTAAACCGAAATCTGCACCTAAAAACGTTTTAATCGATTTGACTGTTGACGAATCCGACTCACAATCGCCTAAATGCGgagaatttgaatttgttaCGTTAGACGAGACGACTCCAgttaaagaagaagaagaagaaccgCTAACCGAAGAAGACGACGATCTCGAGATTGGCGATCCTTTTGAAAACGACGTTAGGGAAGAGCAATCGTTGATTAACGAAGAACCTTCTCAAGATCCGATTGTTACTAACGCTATGGAAGTTCAAACGCCATTCGTTGAACCGTCTGTAGATCCGATTGATACTAACACTGTAGAAGAACAATCTCTCGA
TCCGATTGATACGAACGCTATGGAAGTAGAGTCGCCATTTAACGAACAGTCTGTAGATCCGATCGATACTAACACCGTAGAAGAGCAATCGTCGGTCAGCGAACAATCATCTCAAGATACGATCAATACTAACACTACAGTAGAACAATCGTCTATCAATGATCATACACAAGATCCTCTTATTACCGACGATACGGTTACGAACGCGACCACCATCTCTCCTACTAATAGCTCGGCCTCTGCGACTATAAAGAAAAAACCCAAGAAGATTCCGTTGACTACTCATCAACGTGCTGTGCTAAAGCAAGAATTCAAACTCAATAAAATTATCCCAAATTCGCAAGCTCGTCTATTAGCTCAAGATTTAGGTCTGACCGTTGGTCGAATAGAAGCTTGGTTCGAAAACATGAGAATAAAACGAAAGAAACCCAATCCTTCTGAGACTTCTACTTCGGTGAAAAACTTGAGCGCAGAAATCGAAGCTGGTTTAGAGGAGGAGTACTTGAAAAGCGTCAATTTGAGCAATAAAAGAGCTAAAATGATTGCCATCAAATTGAACACCAGGAAGAGAATTGTCGTGAAGTGGTTCGTCGAACGAGCTAAAAGAATAGACCAATAG
- Protein Sequence
- MNFCSAARITITREQSIKLQKLLHEKPAGHRKFTDFEVRRNASRLKIPAVKIRYWLKQRNAYVDEGAAETRSLSIVVPKSRVLSVVVPKSQDDEVSVIGVKSHSRQSKLLNEISFEKKVQLLKKTMTTMNVSTRYTRELSRELQIDENRIYNWLKHKVHRNLNMKKSLTADELSALHRKFAEYDYLDDDTALVLSDRFNVPIAVIKKALQQHKPSTTPKNVPKPVVHIRKLPIVALTSVTPSKGIRSSPKMKPVGDGGSSKKNPSFKTVDQKNLLFEAFRQSPTLTKNDTVTELVNATGLQPQQITKWLSEFRTKWAHNNESGLKKALTRGLTQSQLVEMEKAYRGDRFLTHAQLAELAKTVNLTVKALQAWFSNRRVYEIRSCDKDLVNIGPKTAAARKEASQPLQQQPQPQQVIAQPANSGDADNAEFFESLNKEQKALLKAGCKSYNVSYSRLAQTIGVPAEKIKQYIVGYRVRHSIFRQNNRSIPERVHKSLLNHMLKYGKISSKTSVALAKRLKIRPEQVRNWSRTYASRILEHNRPAQRPAIAAAPQPAPVAPAVTTQQPVVELYEDTKPAASSAAAGPKYSAKKRLVGSKYVVPSSAKNVLLEEFLKSPQAAANKSKELAKMVGITPMQANNWFYNFGKRLNEQNKSEVVACVINNPAISDQQRARLEAEYRNCRYLEVPEMEALTAEIGLKRKQIENWFINARYYEVLTGQCPGGSGGTKTASKSATTPKSFYESLNSQQRACLKDELNSYPLDDQRLYDLANELQVSSNELKNWFEKASRSKTRSRSLPNPALPLNLTAKALETLTKEFEIDPILPEARAGALAKRTKVTKDRIKAWFANRQEEMHKFEENEQADERKLVKHKIPPIRITIPKYVAANGDADFDSSSKKSYFQQNRLFEEFKMDPTLTNERLIKISRDTNLDGKQISAWFDWIKAKLASIPRDSLLEENRNSNLTARQIAILGEEYAKNRYVDRTVREALSRSLGVSKNVVKTWFANKRYSEILCQDGSVDSDPLSPQDGDFDDDDEDEARQTETPIDYGWDQEETVFVDDDAEVEHKLDVKRPVEVDPLLDYSFEDETNNEDVKTNISYNFQLSPNSAFTDNLTSLPYQPRNRLFTRLGKDLEPLEGDVEKLFWFASKTPNKTDARFDQDIDDEKNRVLESEFGKNPWPELERISQLSTQLLISEPKIHWWFIKKRCFLTKTILNIPSPKPKPKPKSAPKNVLIDLTVDESDSQSPKCGEFEFVTLDETTPVKEEEEEPLTEEDDDLEIGDPFENDVREEQSLINEEPSQDPIVTNAMEVQTPFVEPSVDPIDTNTVEEQSLDPIDTNAMEVESPFNEQSVDPIDTNTVEEQSSVSEQSSQDTINTNTTVEQSSINDHTQDPLITDDTVTNATTISPTNSSASATIKKKPKKIPLTTHQRAVLKQEFKLNKIIPNSQARLLAQDLGLTVGRIEAWFENMRIKRKKPNPSETSTSVKNLSAEIEAGLEEEYLKSVNLSNKRAKMIAIKLNTRKRIVVKWFVERAKRIDQ
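The coding and protein sequences above should correspond codon by codon. A minimal Python sketch of that check follows, using only the first 30 and last 21 nucleotides of the coding sequence. It assumes the standard nuclear genetic code applies and treats the lowercase letters in the record as ordinary bases. The function name `translate` is hypothetical.

```python
# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str, to_stop: bool = True) -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    seq = cds.upper()  # assumption: lowercase bases are treated like uppercase
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*" and to_stop:
            break
        protein.append(residue)
    return "".join(protein)


# First 30 and last 21 nucleotides of the coding sequence in this record.
CDS_PREFIX = "atgaatttttgttcagcAGCTCGTATTACA"
CDS_TAIL = "GCTAAAAGAATAGACCAATAG"

print(translate(CDS_PREFIX))  # MNFCSAARIT
print(translate(CDS_TAIL))    # AKRIDQ
```

Both fragments match the start (`MNFCSAARIT`) and end (`AKRIDQ`) of the protein sequence listed above. The same function could be applied to the full coding sequence once its three lines are joined into one string.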
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -