Mjur021518.1
Basic Information
- Insect
- Maniola jurtina
- Gene Symbol
- Otp
- Assembly
- GCA_905333055.1
- Location
- HG995212.1:230994-243739[-]
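The Location value above packs a sequence accession, a coordinate range and a strand into one string. A minimal sketch of how such a string could be split into fields is shown here. The `Locus` type and the `parse_location` and `span_length` helpers are hypothetical names introduced for illustration, and the record does not state its coordinate convention, so the 1-based inclusive convention is an assumption.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical container for a parsed location string."""
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "SEQID:START-END[STRAND]".
_LOCUS_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Locus:
    """Split a location string into accession, start, end and strand."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


def span_length(locus: Locus, inclusive: bool = True) -> int:
    """Length of the genomic span.

    ASSUMPTION: coordinates are 1-based and inclusive unless the caller
    says otherwise; the record itself does not state the convention.
    """
    return locus.end - locus.start + (1 if inclusive else 0)


if __name__ == "__main__":
    locus = parse_location("HG995212.1:230994-243739[-]")
    print(locus)
    print("span (bp, assuming inclusive coordinates):", span_length(locus))
```

The span covers the whole gene model on the minus strand, so it is expected to be longer than the coding sequence listed under Sequence Information.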
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as in plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. The domain folds into a globular structure with three α-helices connected by two short loops that harbour a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contacts with the DNA: the first helix of this motif helps to stabilise the structure, while the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
1   31   1.6e-16   3.8e-14   52.6   0.1         1      51        81     131        81     133  0.96
2   31   1.7e-13     4e-11   42.9   0.0         6      51       136     181       134     183  0.95
3   31   1.7e-13     4e-11   42.9   0.0         6      51       186     231       184     233  0.95
4   31   2.9e-11   6.8e-09   35.8   0.0         6      46       236     276       234     282  0.90
5   31   1.6e-13   3.8e-11   43.0   0.0         6      51       286     331       283     333  0.95
6   31   1.7e-13     4e-11   42.9   0.0         6      51       336     381       334     383  0.95
7   31   3.2e-13   7.4e-11   42.1   0.1         6      50       386     430       384     431  0.93
8   31   3.2e-13   7.4e-11   42.1   0.1         6      50       436     480       434     481  0.93
9   31     3e-13     7e-11   42.1   0.0         6      50       486     530       484     532  0.93
10  31   4.7e-11   1.1e-08   35.1   0.0         6      50       536     580       534     581  0.92
11  31   1.7e-13     4e-11   42.9   0.0         6      51       586     631       584     633  0.95
12  31     3e-13     7e-11   42.1   0.0         6      50       636     680       634     682  0.93
13  31   4.7e-11   1.1e-08   35.1   0.0         6      50       686     730       684     731  0.92
14  31   1.9e-12   4.5e-10   39.6   0.0         6      51       736     781       734     783  0.95
15  31   6.2e-13   1.5e-10   41.1   0.0         6      50       786     830       784     832  0.93
16  31   4.7e-11   1.1e-08   35.1   0.0         6      50       836     880       834     881  0.92
17  31     3e-13     7e-11   42.1   0.0         6      50       886     930       884     932  0.93
18  31   4.7e-11   1.1e-08   35.1   0.0         6      50       936     980       934     981  0.92
19  31   1.7e-13     4e-11   42.9   0.0         6      51       986    1031       984    1033  0.95
20  31   1.7e-13     4e-11   42.9   0.0         6      51      1036    1081      1034    1083  0.95
21  31   1.7e-13     4e-11   42.9   0.0         6      51      1086    1131      1084    1133  0.95
22  31   1.7e-13     4e-11   42.9   0.0         6      51      1136    1181      1134    1183  0.95
23  31   1.8e-13   4.2e-11   42.9   0.1         6      51      1186    1231      1184    1232  0.95
24  31   3.2e-13   7.5e-11   42.0   0.0         6      50      1236    1280      1234    1281  0.93
25  31   3.2e-13   7.4e-11   42.1   0.1         6      50      1286    1330      1284    1331  0.93
26  31   3.2e-13   7.4e-11   42.1   0.1         6      50      1336    1380      1334    1381  0.93
27  31     2e-12   4.8e-10   39.5   0.0         6      50      1386    1430      1384    1431  0.93
28  31   3.2e-13   7.4e-11   42.1   0.1         6      50      1436    1480      1434    1481  0.93
29  31   3.2e-13   7.4e-11   42.1   0.1         6      50      1486    1530      1484    1531  0.93
30  31   3.2e-13   7.4e-11   42.1   0.1         6      50      1536    1580      1534    1581  0.93
31  31   2.3e-12   5.3e-10   39.3   0.0         6      45      1586    1625      1584    1627  0.97
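The Hmmscan Out rows above each carry thirteen whitespace-separated values per domain hit. As a rough sketch, assuming the rows are available as plain text with one hit per line, they could be read into typed records as follows. The `DomainHit` type and the `parse_hits` and `ali_spacing` helpers are hypothetical names, not part of HMMER, and the sample text reuses only the first four rows of this record.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one row of the per-domain hit table."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hits(text: str) -> List[DomainHit]:
    """Parse rows of 13 whitespace-separated values; skip anything else."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        # Header and blank lines do not start with a digit or lack 13 fields.
        if len(f) != 13 or not f[0].isdigit():
            continue
        hits.append(DomainHit(
            int(f[0]), int(f[1]),
            float(f[2]), float(f[3]), float(f[4]), float(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            int(f[10]), int(f[11]), float(f[12]),
        ))
    return hits


def ali_spacing(hits: List[DomainHit]) -> List[int]:
    """Distance between the alignment starts of consecutive hits."""
    return [b.ali_from - a.ali_from for a, b in zip(hits, hits[1:])]


# First four rows of this record, copied from the table above.
SAMPLE = """\
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
1 31 1.6e-16 3.8e-14 52.6 0.1 1 51 81 131 81 133 0.96
2 31 1.7e-13 4e-11 42.9 0.0 6 51 136 181 134 183 0.95
3 31 1.7e-13 4e-11 42.9 0.0 6 51 186 231 184 233 0.95
4 31 2.9e-11 6.8e-09 35.8 0.0 6 46 236 276 234 282 0.90
"""

if __name__ == "__main__":
    hits = parse_hits(SAMPLE)
    best = max(hits, key=lambda h: h.score)
    print("hits parsed:", len(hits))
    print("best-scoring hit:", best.index, best.score)
    print("spacing between alignment starts:", ali_spacing(hits))
```

In the full table, every hit after the first starts 50 residues after the previous one, which is consistent with the repeated 50-residue unit visible in the protein sequence.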
Sequence Information
- Coding Sequence
- ATGCTTAATAACCTCACGGCGACAGCTTTAGGAGGTCACAAAGAGTGCGGCGACAGCAAGCTCGAGCTAGAGCTCAAGCCGAATGTGGAACGGCACCATGCGCCCGGAGCGCGCCTTCTCGCGCACAGCATCCGAGCCGACCATCTGCAAGGCGGCATCCACCAGCAGCACGGCGGGCACATGTCGGTGCCGGTCAGCATGTCTTCCTCCGCCTCCGACAGCGACAAGCCCGCCAAGCAGAAGAGGCATCGGACGAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGGGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCATCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGGGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGC
ATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCATCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGGGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAAGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGTGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGTGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCATCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGGGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCATCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGGGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCGTGCTTGTGTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTG
TTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGGGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAAATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGGGTGCAGGTCAGTGCTCAAAACCATGCGTGTTTGCGCAGGTTCACCCCGGCGCAACTGAACGAGCTGGAGCGGTGCTTCACCAAGACGCACTACCCCGACATCTTCATGAGGGAGGAGATCGCCATGCGCATCGGACTCACGGAGAGCAGAGTGCAGGAATACCCGGCTGAGTTTGTTGATGGGCTCTTCTCAGACTTGGGCGCGCTTGGAGCATTCATAGCTTTAGTTTTAATTTAA
- Protein Sequence
- MLNNLTATALGGHKECGDSKLELELKPNVERHHAPGARLLAHSIRADHLQGGIHQQHGGHMSVPVSMSSSASDSDKPAKQKRHRTRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRGQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFIPAQLNELERCFTKTHYPDIFMREGIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFIPAQLNELERCFTKTHYPDIFMREGIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELEWCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPVQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFIPAQLNELERCFTKTHYPDIFMREGIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFIPAQLNELERCFTKTHYPDIFMREGIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNRACVRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREGIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQVSAQNHACLRRFTPAQLNELERCFTKTHYPDIFMREEIAMRIGLTESRVQEYPAEFVDGLFSDLGALGAFIALVLI
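The relationship between the Coding Sequence and the Protein Sequence above can be checked with a small translation routine. This is a minimal sketch that assumes the standard nuclear genetic code applies to this gene (the record does not say which table was used). `CODON_TABLE` and `translate` are names introduced here for illustration. Whitespace is removed before translating because the coding sequence in this record is wrapped over several lines.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G base order.
# ASSUMPTION: the standard nuclear code applies to this coding sequence.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; stop codons become '*'."""
    # Drop line breaks and spaces so a wrapped sequence can be pasted in.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    residues = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"unexpected codon {codon!r} at position {i}")
        residues.append(CODON_TABLE[codon])
    return "".join(residues)


if __name__ == "__main__":
    # First nine codons of the coding sequence in this record.
    prefix = "ATGCTTAATAACCTCACGGCGACAGCT"
    print(translate(prefix))
```

Run on the first 27 bases of the coding sequence, the routine yields the same nine residues that open the protein sequence (MLNNLTATA).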
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -