Tman016385.1
Basic Information
- Insect
- Trabutina mannipara
- Gene Symbol
- -
- Assembly
- GCA_900080175.1
- Location
- FKYK01019372.1:907-5132[+]
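The Location field packs the scaffold accession, start, end, and strand into one string. A minimal sketch of splitting it into parts is shown here. The `Location` type and `parse_location` helper are hypothetical names introduced for illustration, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for one Location field."""
    scaffold: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "<scaffold>:<start>-<end>[<strand>]".
_LOCATION_RE = re.compile(
    r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a Location string into scaffold, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        scaffold=match["scaffold"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("FKYK01019372.1:907-5132[+]")
# Assumption: coordinates are 1-based and inclusive.
span = loc.end - loc.start + 1
print(loc, span)
```

Under that coordinate assumption, the locus spans 4226 bases on the plus strand of scaffold FKYK01019372.1.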
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. This domain folds into a globular structure with three α-helices connected by two short loops that harbour a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contacts with the DNA: while the first helix of this motif helps to stabilise the structure, the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
-
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  15     0.052        13    5.7   0.6         3      50        14      65        13      68  0.87
 2  15     0.042        10    6.0   0.2        22      48       110     136       100     143  0.86
 3  15   1.2e-05    0.0029   17.4   0.0         3      54       146     197       144     198  0.95
 4  15   0.00026     0.065   13.1   1.1         8      55       243     291       236     293  0.88
 5  15   7.1e-11   1.8e-08   34.1   0.2         8      53       308     353       304     357  0.95
 6  15     0.033       8.1    6.4   0.1        31      54       426     449       422     450  0.92
 7  15     0.086        21    5.0   0.0        26      47       479     500       467     501  0.87
 8  15   1.8e-06   0.00045   20.0   0.1        13      50       570     607       560     611  0.90
 9  15   5.2e-11   1.3e-08   34.5   0.3         7      53       629     675       626     677  0.94
10  15     9e-07   0.00022   21.0   0.8         6      53       744     791       743     794  0.94
11  15   1.1e-06   0.00028   20.7   0.2        10      54       842     886       837     887  0.90
12  15   7.3e-15   1.8e-12   46.9   0.5         6      53       903     950       899     952  0.95
13  15   1.6e-07   3.9e-05   23.4   0.1        11      52      1101    1142      1095    1144  0.92
14  15   3.5e-13   8.8e-11   41.5   2.4         2      57      1261    1316      1260    1316  0.96
15  15   8.1e-07    0.0002   21.1   0.0         8      51      1328    1371      1325    1375  0.88
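The hmmscan output above lists 15 Homeobox domain hits of very different strength. As a rough sketch, they can be separated by independent E-value as follows. The `DomainHit` type, the `significant_hits` helper, and the 0.01 cutoff are illustrative assumptions, not thresholds stated by this record; the values are copied from the table (domain number, i-Evalue, score, envelope start, envelope end).

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical container for one row of the per-domain hmmscan table."""
    index: int
    i_evalue: float
    score: float
    env_from: int
    env_to: int


# Values copied from the Hmmscan Out table of this entry.
HITS = [
    DomainHit(1, 13, 5.7, 13, 68),
    DomainHit(2, 10, 6.0, 100, 143),
    DomainHit(3, 0.0029, 17.4, 144, 198),
    DomainHit(4, 0.065, 13.1, 236, 293),
    DomainHit(5, 1.8e-08, 34.1, 304, 357),
    DomainHit(6, 8.1, 6.4, 422, 450),
    DomainHit(7, 21, 5.0, 467, 501),
    DomainHit(8, 0.00045, 20.0, 560, 611),
    DomainHit(9, 1.3e-08, 34.5, 626, 677),
    DomainHit(10, 0.00022, 21.0, 743, 794),
    DomainHit(11, 0.00028, 20.7, 837, 887),
    DomainHit(12, 1.8e-12, 46.9, 899, 952),
    DomainHit(13, 3.9e-05, 23.4, 1095, 1144),
    DomainHit(14, 8.8e-11, 41.5, 1260, 1316),
    DomainHit(15, 0.0002, 21.1, 1325, 1375),
]


def significant_hits(hits: List[DomainHit], max_i_evalue: float = 0.01) -> List[DomainHit]:
    """Keep hits whose independent E-value is at or below the cutoff.

    The default cutoff of 0.01 is an assumption chosen for illustration.
    """
    return [hit for hit in hits if hit.i_evalue <= max_i_evalue]


confident = significant_hits(HITS)
best = max(HITS, key=lambda hit: hit.score)
print(len(confident), "of", len(HITS), "domains pass the cutoff")
print("best-scoring domain:", best.index, "envelope", best.env_from, "-", best.env_to)
```

With this assumed cutoff, 10 of the 15 domains pass, and domain 12 (envelope 899 to 952) has the highest bit score. A stricter or looser cutoff would change the count, so treat the number as dependent on the threshold chosen.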
Sequence Information
- Coding Sequence
- ATGGTGGCTTTATTAAAGGGTGTGAATGGAGATAGCGTAACCCGTATCACATTTACCAGAGAGCAGCATAAAAAGTTACACAAGCTGTTACGCGAAAAACCAATCGGCAAGCGTAAATTCAATGATCAAGAAATCCTTAAGAATGCCTCCCGCCTGAAAATCGCAGCTGTAAAGATACGTTACTGGCTGAAACAACGTAATGCTGTTGACGAAGCACCCAGACAATGCGTAATGCTGACGACTAACGAGATAAATCGAGTACGTAGTAGGTTACTGAACGAGCTATCGTTCGACAAGAAGGTGTTACTGTTGAAGAAAACTCAGTCGGCGACCCATGTCACCGTTCGCCTAATCCAAGAGTTGTCTCGTATACTGCAAGTAGACGAGAATCGAATTCAAAGCTGGCTCAAGCACAAGGTGCACCGTAACTtgaggatgaaaaaatcactgaacGCCGAAGAGATCTCGACTTTGAACGCAAAATTCGCCGAGTACGACTATCTGGACGACGATACGGCTCTATTGTTGTCTGATCGTTTCAACGTTTCCAGTTCGGTGATCAAGAAAGCATTCCAGCAACAACGTACTAAACCGCCGTTGAAAAGCGCCACCAAGCCGGTCGTCATGATAAAGAAGATCGCGCTAGCTTCGGTGCCAGCGCTGCCGAAAGGCATTCGGATGGCGTCGAAGACGACAGCCGACGTTAAGAAGAAACCCAGCTATAAAACGGTCGATCAGAAGAACTTACTATTCAAGGCATTTCGTCAGTCTCCCACCTTGACCAGAAACGAATCGTTGGCCGAACTGGTCAAGAAGACCGGACTGACTTCGCAGCAGATTGTCAAATGGTTGAGCGTGTTCCGCACTAAATGGTCGCAGAGTAACGAAAGCGGATTGAAGAGAGCGTTAACGAGAGGATTGAACCAGGATCAATTGCTAATGCTGGAGAAGGCGTACCGAGAAGAACGATTTTTGACTAATACGCAGCTCATCGAGTTGCCCAAGGCTGTCGGCATGAGCATGAAAACGGTCCAGGCTTGGTTCTCTAATCGTCGCGTCTACGAGATTCGGTCCTGCGATAAAGATCTAGTGAACATCGGACCGAAAACCGTCGCGGCCAGGAAAGAATCCAGTCAGCTGTTCAAGTCGCAGCAGCCGCCAGTCGCTCCGGTCGTCGGTCAAGGTTACCTAACCGACGCCGAATTTTACGAAACGTTGAGCAAAGAACAAAAGGCTCATTTGAGAACCGCCTGCAAAAGCTACAACGTTTCGTATTCCAGATTGGCTCAAACGTTGGGCGTGCAATACGAGCAAATCAAACGGTACGTGCAAAATTACCGCATACGTAATTCGATTTTCAGGCAGAGTAATCGTTCGTTACCGGAGCGTATCCATAAAGCGTTGGCCAACCATACGCTCAAATATGGTCAAATATCTTCGAAGACTGGCATCGTTTTAGCTAAACGTCTGAAAATACGACCTGAGCAAGTTAGAAATTGGAGCAGAACTTATTCGTCGCGTTATCTCGAAATTAGTAGGGTATCTCAAAGACCAACCAATAGTGCATCTCAGAACGCTCAAACGGTCCACGTCGAACCTGTTACCACCGAGCAACCGGTAGTAAAAGAAGAATATTACGAAGATACCGAACCAGCACCCAAACTCGAATCGCCTGGTACAAAGTACACCGTTCCTCCTTCGGCCAAAATTACGCTGCTCTCCGAATTCATAAAATCGCCTCAAACAGCCTCGTCCAGAACCAAACAACTCGCGCAAACGGTGGGTCTGACTCAGAAGCAGGTGCGCAAgtggttttataattttggtaaaaaactCAGCGAGCAAACCAAGTCGAAAGTCGTAGCTTGTCTGAATAATGCGTCCATCAGCGCCGAAGTAAGAGCTAAACTGGAAAAAGAGTACAGAAATCGTCGTTACTTGGACTTACCTGAAATGGAAAGGTTGGCAGCGGAATTCGGGTTAGCCAGGAGGCAGTTG
GAGAATTGGTTCATTAACGCTCGTTACTACGAGGTGCTGTTCGGTCGCTGTCCTAGCTCTGATGATTACGCGAGGACGCTAGCTTCGAATTCATTCGACTCGTACTATTTGGCGGAAGAAAAACTCGACGAATCGGAAGCACCCTTGGAAGAGACGAAGAAATGGTTGGATAACGCTGCTCGTAGTAGAACGAGACGTCGTAGTTCACTGCCCAGTCCCAACGtacctttcaattttacacCTAGAGTGCAGCAAAGATTGGTGAACGAATTCGAAGCTAATCCAATTTTGAACGAAGCGCGTGCCATCGCTCTAGCCAAAGGTATGAAAGTCGCCAAAGATCGTATTAAAGCTTGGTTCGCGAGTCGTCAACAAGAACAGCAATTCGTACAGATCGAAAATTCCGATGAACAGAATCTAAGCAAGTATAAGGTACCGCCCATTAAGATCACCTTAGCCCAGTACAGACACCCGAACGACGTAGACGTCAACACCAACGACGTGAGCTCATCTAGAAAGAGTTACGTCCAACAGAACAGACTATTCGAAGAGTTCAAGATGGACTCGACTTTAACGAGCGAAAGATTACTGCGGATAAGCGAGGATACCAGCTTGGACGGTAAACAGATCACAGCCTGGTTCGATTGGATAAAAGCGAAACTATCTTCGATATCGAGGGAGGGCCTTCTGGCAGAAAATCGTAACGCCAATTTAACGCAGCAGCAAATCGCCGCTCTAGAAGAAGAATACGCAAAGAATCGATACGTCGACAGATTAGTTCGCGAATCTCTTTCGCGATCGTTGGGCGTGCCTAGAAACGTGGTAAAAACGTGGTTCGCCAACAAACGTTATTCGGAGATTTTATGCCCGGACGAGCTCGTCGACACCGACCCACTCTCTTCgcaggacgacgacgacgactacgacgccGAAAAAGACACCGGAACGGACGAAACGGTCATCGATTACGAATGGGACGAAGAATCGAACCTTTTCGACGATCCTGAACAGAAATTTGACGTTAAAAGACAACTGGAAATCGATCCTCTGCTCGACTATTGCCTCGAAGAGGAATCTGACCAAGATTCTAAAGCCGATATATTTCACgatttacaacttttgtctaaaaatttgtttaccgAAGACCTGACCAGTGCTCCGTATAAACCTCGCAATCGTCTTTTTACCCGACTCGGCATAGACTTAGAACCGTTGGAAGCTgacgtcgaaaaaatattctggtTTGCGAACAAGACGCCCAATAAAATAGACGAGAGGTTCGATCGGGACATAGACGAAGACCGCAATCGTACGCTAGAAAGCGAATTCGGTAAAAATCCTTGGCCCGAGTTAGAGAGAATATCGCAATTATCGGCTCAGTTGATGGTCTCCGAACCGAAAATACACTGGTGGTTTATTAAGAAACGATGCTTCATGACGAAAACCATATTGAATCTACCGGCGCCCAAACCGAAACCGAAACCTCCGCTTAAAAACGTTGTAATCGATTTGACCGAAGACGACGGCTTGAACGACGAATCTCAACCAGCTAAATGCGGAGAGTTCGAATTCGTTACGTTAGACGAGACGCCTCAAATTAAAGAAGAACAATTAACAGAGGAAGTTCCTACTCCGGAAAATGGCGATCCTTTCGAAAACGAGCCCATGGACGAGCAGCCGTCGTTCAACGAACACCTTACTCAAGCTTCGACCACTAGCGTTAACGATACCGCTACCGTCTCTCCAACTATCGACTCGGCCACgccaattaaaaagaaaaaaaccaagaagATTCCGTTGACCATTCATCAACGTACGATTCTACGACGAGAATTCAAGCGCAACAAAATCATTTCGAGTTCGCAAGCACGCTTGTTAGCTCAAGATTTAGGACTCACCGTCGCTCGAGTAGAAGCTTGGTTCGAAAATATGaggcaaaaaaagaagaaaaccagTTTCGCCGAGACGTATGTGAAAAGCTTGAGTCCAGAAATTGAAAG
TGCTCTGGAAACGGAATATTCAACGAGCGTTATTTTGAGCGACGAAAGAAGTGAAGTGATTGCCGCTCAGTTAAACATCAAGAAAaatgttgtcaaaaaatgGTTCATCGagcgaactaaaaaaatggacgGACAAACGACCTAA
- Protein Sequence
- MVALLKGVNGDSVTRITFTREQHKKLHKLLREKPIGKRKFNDQEILKNASRLKIAAVKIRYWLKQRNAVDEAPRQCVMLTTNEINRVRSRLLNELSFDKKVLLLKKTQSATHVTVRLIQELSRILQVDENRIQSWLKHKVHRNLRMKKSLNAEEISTLNAKFAEYDYLDDDTALLLSDRFNVSSSVIKKAFQQQRTKPPLKSATKPVVMIKKIALASVPALPKGIRMASKTTADVKKKPSYKTVDQKNLLFKAFRQSPTLTRNESLAELVKKTGLTSQQIVKWLSVFRTKWSQSNESGLKRALTRGLNQDQLLMLEKAYREERFLTNTQLIELPKAVGMSMKTVQAWFSNRRVYEIRSCDKDLVNIGPKTVAARKESSQLFKSQQPPVAPVVGQGYLTDAEFYETLSKEQKAHLRTACKSYNVSYSRLAQTLGVQYEQIKRYVQNYRIRNSIFRQSNRSLPERIHKALANHTLKYGQISSKTGIVLAKRLKIRPEQVRNWSRTYSSRYLEISRVSQRPTNSASQNAQTVHVEPVTTEQPVVKEEYYEDTEPAPKLESPGTKYTVPPSAKITLLSEFIKSPQTASSRTKQLAQTVGLTQKQVRKWFYNFGKKLSEQTKSKVVACLNNASISAEVRAKLEKEYRNRRYLDLPEMERLAAEFGLARRQLENWFINARYYEVLFGRCPSSDDYARTLASNSFDSYYLAEEKLDESEAPLEETKKWLDNAARSRTRRRSSLPSPNVPFNFTPRVQQRLVNEFEANPILNEARAIALAKGMKVAKDRIKAWFASRQQEQQFVQIENSDEQNLSKYKVPPIKITLAQYRHPNDVDVNTNDVSSSRKSYVQQNRLFEEFKMDSTLTSERLLRISEDTSLDGKQITAWFDWIKAKLSSISREGLLAENRNANLTQQQIAALEEEYAKNRYVDRLVRESLSRSLGVPRNVVKTWFANKRYSEILCPDELVDTDPLSSQDDDDDYDAEKDTGTDETVIDYEWDEESNLFDDPEQKFDVKRQLEIDPLLDYCLEEESDQDSKADIFHDLQLLSKNLFTEDLTSAPYKPRNRLFTRLGIDLEPLEADVEKIFWFANKTPNKIDERFDRDIDEDRNRTLESEFGKNPWPELERISQLSAQLMVSEPKIHWWFIKKRCFMTKTILNLPAPKPKPKPPLKNVVIDLTEDDGLNDESQPAKCGEFEFVTLDETPQIKEEQLTEEVPTPENGDPFENEPMDEQPSFNEHLTQASTTSVNDTATVSPTIDSATPIKKKKTKKIPLTIHQRTILRREFKRNKIISSSQARLLAQDLGLTVARVEAWFENMRQKKKKTSFAETYVKSLSPEIESALETEYSTSVILSDERSEVIAAQLNIKKNVVKKWFIERTKKMDGQTT
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -