Cros030198.1
Basic Information
- Insect
- Choristoneura rosaceana
- Gene Symbol
- LOC113510063
- Assembly
- GCA_037349165.1
- Location
- JAZBNJ010000048.1:3859074-3888992[-]
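The Location value packs the scaffold accession, a coordinate range, and the strand into one string. A minimal sketch of splitting it into fields follows. The `Location` type and `parse_location` helper are hypothetical names, and the code assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for the fields of a Location string."""
    scaffold: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "SCAFFOLD:START-END[STRAND]".
_LOCATION_RE = re.compile(
    r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a Location string into scaffold, start, end and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["scaffold"], int(match["start"]), int(match["end"]), match["strand"]
    )


loc = parse_location("JAZBNJ010000048.1:3859074-3888992[-]")
# Genomic span, ASSUMING 1-based inclusive coordinates.
span = loc.end - loc.start + 1
print(loc, span)
```

Under that assumption the locus spans 29,919 bp on the minus strand of scaffold JAZBNJ010000048.1.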
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as in plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. The domain folds into a globular structure with three α-helices connected by two short loops that harbour a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contacts with the DNA: the first helix of this motif helps to stabilise the structure, while the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  33     0.015       4.2    8.2   0.1         4      33        34      63        32      68  0.92
 2  33      0.04        11    6.8   0.0         6      35        54      83        50      88  0.91
 3  33    0.0096       2.7    8.8   0.1         6      35        72     101        68     108  0.91
 4  33     0.032         9    7.1   0.0         6      33        90     117        86     122  0.92
 5  33     0.046        13    6.6   0.0         6      35       108     137       105     140  0.91
 6  33     0.036        10    7.0   0.0         6      33       126     153       122     156  0.92
 7  33     0.011         3    8.7   0.1         6      35       162     191       158     195  0.91
 8  33     0.041        12    6.8   0.0         6      35       198     227       194     233  0.91
 9  33     0.054        16    6.4   0.0         6      33       234     261       230     266  0.92
10  33     0.034       9.7    7.0   0.1         6      33       270     297       266     302  0.92
11  33     0.011         3    8.7   0.1         6      35       306     335       302     339  0.91
12  33     0.041        12    6.8   0.0         6      35       342     371       338     377  0.91
13  33     0.054        15    6.4   0.0         6      33       378     405       374     410  0.92
14  33      0.13        37    5.2   0.0         6      33       396     423       392     429  0.92
15  33      0.16        45    4.9   0.1         6      35       414     443       411     448  0.88
16  33     0.006       1.7    9.5   0.1         6      34       432     460       428     464  0.92
17  33    0.0088       2.5    8.9   0.1         6      34       450     478       445     485  0.90
18  33     0.026       7.3    7.4   0.1         6      35       468     497       464     502  0.91
19  33     0.031         9    7.1   0.0         6      33       486     513       482     518  0.92
20  33      0.13        37    5.2   0.0         6      33       504     531       500     536  0.92
21  33      0.01       2.9    8.7   0.1         6      35       540     569       536     574  0.91
22  33    0.0038       1.1   10.1   0.1         6      34       576     604       572     608  0.92
23  33    0.0013      0.38   11.6   0.3         6      34       612     640       608     643  0.92
24  33     0.025       7.2    7.4   0.1         6      35       648     677       644     682  0.91
25  33     0.015       4.2    8.2   0.0         6      35       684     713       680     716  0.91
26  33    0.0098       2.8    8.8   0.1         6      35       702     731       697     737  0.91
27  33     0.012       3.4    8.5   0.1         6      34       720     748       716     752  0.92
28  33    0.0029      0.83   10.5   0.1         6      35       738     767       734     774  0.91
29  33     0.012       3.4    8.5   0.1         6      34       756     784       752     788  0.91
30  33    0.0012      0.35   11.7   0.0         6      40       774     808       770     811  0.89
31  33     0.015       4.2    8.2   0.0         6      35       792     821       788     828  0.90
32  33       1.5   4.4e+02    1.7   0.1         6      28       828     850       824     854  0.87
33  33       4.3   1.2e+03    0.3   0.3         7      41      1097    1132      1095    1134  0.76
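Each per-domain hit in the Hmmscan output above has 13 whitespace-separated fields, so the values can be read back into records by grouping tokens in runs of 13. The following is a minimal sketch of that idea, using three rows copied from this entry as sample input. The `DomainHit` type, the `parse_hits` helper, and the i-Evalue cutoff of 1.0 are illustrative assumptions, not part of the database or of HMMER.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one per-domain hit (13 fields)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hits(flat: str) -> List[DomainHit]:
    """Group whitespace-separated tokens into 13-field domain hits."""
    tokens = flat.split()
    if len(tokens) % 13 != 0:
        raise ValueError("token count is not a multiple of 13")
    hits = []
    for i in range(0, len(tokens), 13):
        t = tokens[i:i + 13]
        hits.append(DomainHit(
            int(t[0]), int(t[1]),
            float(t[2]), float(t[3]), float(t[4]), float(t[5]),
            *(int(x) for x in t[6:12]),
            float(t[12]),
        ))
    return hits


# Rows 23, 30 and 32 of this entry, flattened as in the raw page text.
sample = (
    "23 33 0.0013 0.38 11.6 0.3 6 34 612 640 608 643 0.92 "
    "30 33 0.0012 0.35 11.7 0.0 6 40 774 808 770 811 0.89 "
    "32 33 1.5 4.4e+02 1.7 0.1 6 28 828 850 824 854 0.87"
)
hits = parse_hits(sample)

# Illustrative cutoff only; choose a threshold suited to the analysis.
kept = [h.index for h in hits if h.i_evalue < 1.0]
print(kept)
```

With these three sample rows, only hits 23 and 30 pass the illustrative cutoff.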
Sequence Information
- Coding Sequence
- ATGAGGCACTCGTATGTTCTTCTGCTGCTCTGCCTGTCGTGGACGCGGGGCCAGACATTGGATGACGAAATCAAATGGCTGACGGATCTGTTCGGGACCAAATCGATCCCTACAAAGGAAGAACGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGATCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGATCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGGAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAATTGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTTGGGACCAAACCAACCCCGACCGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCAAACGAGGATGAAAGGAAACGGCTGATAGATCTGTTCGGAACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAACTGAGGATGAAAGGAAACGGCTGACGGATCTGTTTGGGACCAAACCAACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGAGACCAAACCGACCCCGACCGAGGATGAAAGGGAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAATTGCTGACGGATCTGTTCGGGACCAAACCGACC
CCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTTGGGACCAAACCAACCCCGACCGAGGATGAAAGGAAATGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCAACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTTGGGACCAAACCAACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGCGACCAAACCGACCCCGACCGAGGATGAAAGGAAACGGCTGACGGATCTGTTCGGGATCAAACCGACCCCGAATGAGGATGAAAGGAAACGGCTGACGGATCTATTCGGGACCAAACTTACCTGGACCGAAGATGAAAGGAAACGGCTGACGGATTTGTTCGAGACCAAACCGACCCCGACTAATACGGCAACTAGTACGTTTTGTGTGGTGTGGTTGATTATGATAAAACACTATACGGCTATTGAATACTGGAATAACGCCAAATACGCCGTGCGCAAGAAGGCGTGCGCGCGACGGCAGCCAGCAGCTGGCCAGCCCGCGCTATGCCCGCGCCCCGGAAGCCCCGCGCTGACGGAGCAAGAGGAAAAGATCTTATCAGCGAATAGTGCCCGCAGCTCCCGAGTGGTGACGCATTTTGAAAAGCGGTGGCTGTTGGTAGAGACAGCAATGATCGCTGCCACCCGCGAAATGCTCGTAGCCCGGATGTTAGCCAAAGTCAGCGTGGCACTGCAGGAGGTGGCGGAGGTGGCGCGGGCATTGGACCGAAGGGCGGGTTTAAGAAAGAACcaaaaacgcCAACGTATAGTGGAGGCGAACTTACTGAAAAGGAGAGAAAACGCGTCGAAGAGGTGTTCGGAAAGACGAATCCAACAGACGGAGACACAGCTGGACACTAGCCCGAACAAAGAGGCCAGCCGTTACGCGATCGCGGGCTTGGTGGCGTGGGGTCTGGGGTGTGGGGACGCCGTGCCCGGCGCGTACGTCAACACGCCGCACTTCAACGACTGGCTCAAGCGGCGAATGGACGAGGAGGGCTTCGGCACTAGCTCTTACACTTACACACCTGAATTAATAGACTATTGGAATTtaatGCACTCGTGTGTTCTTCTGTTGCTGTGCCTATCGTGGACGTGGGGCCAGACAGTGGATGAAGATGAAATGAAACGGCTGATGGAATTGTTCCCGACGAAACCGACCCCGACCGAAGATAACTTCAACCTGACACTCGcTACCCGCGCATCCTTAAAGCAGCAGCAGGAATGCGGGTGGACCGGTGCTGACAATACAGGCTTCCGCATCAACGGCGAAGCCGACTCGGCCGAGTTTGGAGAGTTTCCCTGGATGGTCGCTATCTTGAAGCGTTCGAAGAGTACGACGTGGTCCCAGGGCGACTACCTCGGGGGCGGCTCCATCATCCACCCCTCTGTGGTGCTGACCGCCGCTCACAAGGTTGACGGGAAACTACCCTCAGAGGTGAAGTGTCGCGCCGGCGAGTACGACACACAGACGGAACTGGAGGCTTCGCACCAAGAACGAGACGTAAAGAAGATCATACAGCACGAGGACTTCTACAGGCCGTCAGTCCTCAACAGTATCGCGCTGCTGATTCTGGAGTCCCCGTACGACTTCACTGCCGCGCCCCACATAGGGGTCGCGTGCATGGCGCCGGCACCCCCGCCCCCGGGGACGCGGTGCTACAGCATGGGCTGGGGCAAGGAGTTCGCCGACAAAGAGAAGTACGCCGTCATACTGAAGAAGGTCCCGCTGCCGATCGTAGCGCGCGACAAATGCCGCAACGAGCTGCGCAAGACCCGGCTCGGCGTGCACTTCGAGCTGCATCCGTCCCTCACGTG
CgcgggcggcgagcgcggcCGGGACACCTGCACCGGCGACGGCGGCTCGCCGCTCGTCTGTCCCATTGGGGTCACTAAACCGAACAAAGAGGCCAGCCGTTACGAGGTCGTGGGCATGGTGGCGTGGGGTGTGGGGTGCGGTGAGGCCTTGCCCGGCGCGTACGTCAACACGCCGCAGTTCAACGACTGGCTCAAGCGGCGCATGGACGAGGAGGGCTTCGGCACTAGCTCTTACACTTACACAtctaattaa
- Protein Sequence
- MRHSYVLLLLCLSWTRGQTLDDEIKWLTDLFGTKSIPTKEERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPNEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPNEDERKRLTDLFGTKPIPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPNEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPNEDERKRLTDLFGTKPIPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPNEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERERLTDLFATKPTPTEDERKLLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLIDLFGTKPTPNEDERKRLIDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFATKPTPTEDERKRLTDLFATKPTPTEDERKRLTDLFETKPTPTEDERERLTDLFATKPTPTEDERKLLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKWLTDLFATKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFATKPTPTEDERKRLTDLFGTKPTPTEDERKRLTDLFATKPTPTEDERKRLTDLFGIKPTPNEDERKRLTDLFGTKLTWTEDERKRLTDLFETKPTPTNTATSTFCVVWLIMIKHYTAIEYWNNAKYAVRKKACARRQPAAGQPALCPRPGSPALTEQEEKILSANSARSSRVVTHFEKRWLLVETAMIAATREMLVARMLAKVSVALQEVAEVARALDRRAGLRKNQKRQRIVEANLLKRRENASKRCSERRIQQTETQLDTSPNKEASRYAIAGLVAWGLGCGDAVPGAYVNTPHFNDWLKRRMDEEGFGTSSYTYTPELIDYWNLMHSCVLLLLCLSWTWGQTVDEDEMKRLMELFPTKPTPTEDNFNLTLATRASLKQQQECGWTGADNTGFRINGEADSAEFGEFPWMVAILKRSKSTTWSQGDYLGGGSIIHPSVVLTAAHKVDGKLPSEVKCRAGEYDTQTELEASHQERDVKKIIQHEDFYRPSVLNSIALLILESPYDFTAAPHIGVACMAPAPPPPGTRCYSMGWGKEFADKEKYAVILKKVPLPIVARDKCRNELRKTRLGVHFELHPSLTCAGGERGRDTCTGDGGSPLVCPIGVTKPNKEASRYEVVGMVAWGVGCGEALPGAYVNTPQFNDWLKRRMDEEGFGTSSYTYTSN
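The Coding Sequence and Protein Sequence fields can be checked against each other by translating the nucleotide sequence. The sketch below assumes the standard genetic code (the record does not state which translation table was used) and uses a hypothetical helper named `translate`. It upper-cases the input because the coding sequence contains lower-case bases.

```python
from itertools import product

# Standard genetic code (ASSUMED), with codons ordered over the bases T, C, A, G.
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str, to_stop: bool = True) -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    seq = "".join(cds.split()).upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


# First 12 codons and last 4 codons of this entry's coding sequence.
head = translate("ATGAGGCACTCGTATGTTCTTCTGCTGCTCTGCCTG")
tail = translate("ACAtctaattaa")
print(head, tail)
```

The two fragments translate to `MRHSYVLLLLCL` and `TSN`, which match the start and end of the listed protein sequence.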
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -