Fatl013167.1
Basic Information
- Insect
- Flexamia atlantica
- Gene Symbol
- not2
- Assembly
- GCA_035578135.1
- Location
- JAQJVK010000001.1:180185631-180197616[+]
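The Location field packs the scaffold accession, coordinates and strand into a single string. A minimal sketch of how it could be split into parts is shown here. The `Location` type and `parse_location` helper are hypothetical names introduced for illustration, and the span calculation assumes 1-based inclusive coordinates, which the record itself does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for a 'seqid:start-end[strand]' string."""
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings such as 'JAQJVK010000001.1:180185631-180197616[+]'.
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into scaffold, start, end and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["seqid"], int(match["start"]), int(match["end"]), match["strand"]
    )


loc = parse_location("JAQJVK010000001.1:180185631-180197616[+]")
# Assumption: coordinates are 1-based and inclusive at both ends.
span = loc.end - loc.start + 1
print(loc.seqid, loc.strand, span)
```

Under that coordinate assumption the gene spans 11,986 bp on the forward strand of the scaffold.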
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as in plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. The domain folds into a globular structure of three α-helices connected by two short loops, packed around a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contacts with the DNA. The first helix of this motif helps to stabilise the structure, while the second helix binds DNA through a number of hydrogen bonds and hydrophobic interactions between specific side chains and the exposed bases and thymine methyl groups within the major groove. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
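| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 33 | 2.3e-12 | 2.4e-09 | 39.2 | 0.6 | 1 | 43 | 30 | 72 | 30 | 72 | 0.98 |
| 2 | 33 | 0.043 | 44 | 6.3 | 0.2 | 27 | 45 | 70 | 88 | 69 | 89 | 0.91 |
| 3 | 33 | 0.01 | 10 | 8.3 | 0.0 | 23 | 45 | 93 | 115 | 90 | 116 | 0.91 |
| 4 | 33 | 0.25 | 2.6e+02 | 3.8 | 0.0 | 23 | 43 | 120 | 140 | 117 | 142 | 0.91 |
| 5 | 33 | 0.072 | 74 | 5.6 | 0.0 | 28 | 45 | 176 | 193 | 173 | 194 | 0.91 |
| 6 | 33 | 0.036 | 37 | 6.5 | 0.0 | 23 | 45 | 198 | 220 | 195 | 221 | 0.91 |
| 7 | 33 | 0.01 | 10 | 8.3 | 0.0 | 23 | 45 | 225 | 247 | 222 | 248 | 0.91 |
| 8 | 33 | 0.01 | 10 | 8.3 | 0.0 | 23 | 45 | 252 | 274 | 249 | 275 | 0.91 |
| 9 | 33 | 0.0037 | 3.8 | 9.7 | 0.1 | 23 | 45 | 279 | 301 | 276 | 302 | 0.92 |
| 10 | 33 | 0.0054 | 5.5 | 9.2 | 0.1 | 23 | 45 | 306 | 328 | 303 | 329 | 0.92 |
| 11 | 33 | 0.0054 | 5.5 | 9.2 | 0.1 | 23 | 45 | 333 | 355 | 330 | 356 | 0.92 |
| 12 | 33 | 0.0091 | 9.3 | 8.4 | 0.0 | 23 | 45 | 360 | 382 | 357 | 383 | 0.91 |
| 13 | 33 | 0.0097 | 9.9 | 8.4 | 0.0 | 23 | 45 | 387 | 409 | 384 | 410 | 0.92 |
| 14 | 33 | 0.01 | 10 | 8.3 | 0.0 | 23 | 45 | 414 | 436 | 411 | 437 | 0.91 |
| 15 | 33 | 0.01 | 10 | 8.3 | 0.0 | 23 | 45 | 441 | 463 | 438 | 464 | 0.91 |
| 16 | 33 | 0.0049 | 5 | 9.3 | 0.1 | 23 | 45 | 468 | 490 | 465 | 491 | 0.92 |
| 17 | 33 | 0.0089 | 9.2 | 8.5 | 0.0 | 23 | 45 | 495 | 517 | 492 | 518 | 0.91 |
| 18 | 33 | 0.0048 | 4.9 | 9.3 | 0.1 | 23 | 45 | 522 | 544 | 519 | 545 | 0.92 |
| 19 | 33 | 0.0049 | 5 | 9.3 | 0.1 | 23 | 45 | 549 | 571 | 546 | 572 | 0.92 |
| 20 | 33 | 0.0091 | 9.3 | 8.4 | 0.0 | 23 | 45 | 576 | 598 | 573 | 599 | 0.91 |
| 21 | 33 | 0.0049 | 5 | 9.3 | 0.1 | 23 | 45 | 603 | 625 | 600 | 626 | 0.92 |
| 22 | 33 | 0.021 | 22 | 7.3 | 0.0 | 23 | 45 | 630 | 652 | 627 | 653 | 0.89 |
| 23 | 33 | 0.023 | 24 | 7.1 | 0.0 | 23 | 45 | 657 | 679 | 654 | 680 | 0.89 |
| 24 | 33 | 0.0054 | 5.5 | 9.2 | 0.1 | 23 | 45 | 684 | 706 | 681 | 707 | 0.92 |
| 25 | 33 | 0.0054 | 5.5 | 9.2 | 0.1 | 23 | 45 | 711 | 733 | 708 | 734 | 0.92 |
| 26 | 33 | 0.0091 | 9.3 | 8.4 | 0.0 | 23 | 45 | 738 | 760 | 735 | 761 | 0.91 |
| 27 | 33 | 0.0054 | 5.5 | 9.2 | 0.1 | 23 | 45 | 765 | 787 | 762 | 788 | 0.92 |
| 28 | 33 | 0.072 | 74 | 5.6 | 0.0 | 23 | 45 | 792 | 814 | 789 | 815 | 0.91 |
| 29 | 33 | 0.0049 | 5 | 9.3 | 0.1 | 23 | 45 | 819 | 841 | 816 | 842 | 0.92 |
| 30 | 33 | 0.0091 | 9.3 | 8.4 | 0.0 | 23 | 45 | 846 | 868 | 843 | 869 | 0.91 |
| 31 | 33 | 0.016 | 16 | 7.7 | 0.0 | 24 | 45 | 874 | 895 | 870 | 896 | 0.89 |
| 32 | 33 | 0.023 | 24 | 7.1 | 0.0 | 23 | 45 | 900 | 922 | 897 | 923 | 0.89 |
| 33 | 33 | 6.4e-09 | 6.6e-06 | 28.1 | 0.1 | 23 | 57 | 927 | 967 | 924 | 967 | 0.80 |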
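Each row of the hmmscan output above holds 13 whitespace-separated values in the order given by the header. The following is a minimal sketch of reading such rows and keeping only hits below an independent E-value cutoff. The `DomainHit` class, the `parse_hit` helper and the `1e-3` cutoff are illustrative assumptions, not part of the record or of HMMER itself; only four rows are copied from the table to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """Hypothetical record for one per-domain row of the table."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-separated row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *map(int, f[6:12]),
        float(f[12]),
    )


# Four rows copied from the table (domains 1, 2, 4 and 33).
ROWS = """
1 33 2.3e-12 2.4e-09 39.2 0.6 1 43 30 72 30 72 0.98
2 33 0.043 44 6.3 0.2 27 45 70 88 69 89 0.91
4 33 0.25 2.6e+02 3.8 0.0 23 43 120 140 117 142 0.91
33 33 6.4e-09 6.6e-06 28.1 0.1 23 57 927 967 924 967 0.80
"""

hits = [parse_hit(line) for line in ROWS.strip().splitlines()]

# Assumed, illustrative cutoff on the independent E-value.
I_EVALUE_CUTOFF = 1e-3
strong = [h.index for h in hits if h.i_evalue < I_EVALUE_CUTOFF]
print(strong)
```

With this cutoff only the first and last domains pass. That fits the table, where domain 1 aligns to HMM positions 1-43 and domain 33 to positions 23-57, while the intermediate hits are short, weak matches to positions 23-45.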
Sequence Information
- Coding Sequence
- ATGTGCAGCGGGGAAGAAATGGGGCGGAAGGCAAGGGAAGATAAGGACCGCGACTGCGGAGGCGGCGGACGCAAGGGCGGCAAGTCCAAGCGAGTGAGGACCATCTTCACCCCCGAGCAGCTGGAGCGCCTGGAGGCGGAGTTCGAGCGGCAGCAATACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGCGGCAGCAGTACAGTAACGGTGGGGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGACAGCAGTACAGTAACGGTGGGGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTACATGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTACCTTGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGAGGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTCAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGACAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATAGTGGGGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATAGTGGAGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTATCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTATCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCTGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCTGGAGCGGCTGTAC
CTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTGCAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCGGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTATCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGACTGTATCTGGCACACACGCTGCAGCTCACAGTGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGAGCCGGAGCGACTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCCGGAGCGGCTGTACCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTAGAGTACATGGTGGGGCCGGAGCGACTGTATCTGGCACACACGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGGCTGGAGCGGCTGTACCTGGCACACGCGCTGCAGCTCACAGAGGCTCAGGTGAGAGCGGCAGCAGTACAGTACATGGTGGGTCTGGAGCGACTGTACCTGGCACACGCGCTGCAGCTCACAGATCATGGATATACATCGATGGATGAGGTAAAGGTCTGGTTCCAAAATAGAAGAATAAAGTGGAGGAAACAACACCTAGAGCTTCAGCAGCAAAGGTTGGCAGCGCTGAAGCAGCAGCAGCAGAGGCTTCAGCAGGAGCAGGAAATGGAGAGCGACCCGGAGACTTCCAGGGGGGACGACCCCCAGCACTTCCTAGCCGCGGGTGTACCTTCCGTCAGCCACGAGTAG
- Protein Sequence
- MCSGEEMGRKAREDKDRDCGGGGRKGGKSKRVRTIFTPEQLERLEAEFERQQYMVGPERLYLAHALQLTEAQRLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQRQQYSNGGAGATVPGTRAAAHRGSGESDSSTVTVGPERLYLAHALQLTEAQVRAAAVQYMVGPERLYMAHALQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQVRAAAVQYMVRPERLYLAHTLQLTEAQVRAAAVQYMVEPERLYLAHALQLTEAQVRAAAVQYMVEPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHTLQLTEAQVRATAVQYMVGPERLYLAHTLQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHALQLTEAQVRAAAVQYMVEPERLYLAHTLQLTEAQVRAAAVQYIVGPERLYLAHTLQLTEAQVRAAAVQYIVEPERLYLAHTLQLTEAQVRAAAVQYMVEPERLYLAHTLQLTEAQVRAAAVQYMVGPERLYLAHTLQLTEAQVRAAAVQYMVEPERLYLAHTLQLTEAQVRAAAVQYMVGLERLYLAHTLQLTEAQVRAAAVQYMVGLERLYLAHALQLTEAQVRAAAVQYMVEPERLYLAHALQLTEAQVRAAAVQYMVEPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHTLQLTEAQVRAAAVQYMVEPERLYLAHALQLTEAQVRAAAVQYMVGPERLYLAHTLQLTVAQVRAAAVQYMVEPERLYLAHTLQLTEAQVRAAAVQYMVGPERLYLAHTLQLTEAQVRAAAVEYMVGPERLYLAHTLQLTEAQVRAAAVQYMVGLERLYLAHALQLTEAQVRAAAVQYMVGLERLYLAHALQLTDHGYTSMDEVKVWFQNRRIKWRKQHLELQQQRLAALKQQQQRLQQEQEMESDPETSRGDDPQHFLAAGVPSVSHE
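The protein sequence corresponds to a codon-by-codon translation of the coding sequence. As a minimal sketch, the snippet below translates the first and last few codons of the coding sequence and compares them with the ends of the protein. It assumes the standard genetic code (NCBI translation table 1), which the record does not state explicitly; `translate` is a hypothetical helper, not a function of any particular library.

```python
from itertools import product

# Standard genetic code (assumed), written in TCAG order for each codon position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon.

    Codons containing characters outside ACGT are rendered as 'X';
    a trailing partial codon is ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 36 and last 15 nucleotides of the coding sequence in this record.
cds_start = "ATGTGCAGCGGGGAAGAAATGGGGCGGAAGGCAAGG"
cds_end = "GTCAGCCACGAGTAG"

print(translate(cds_start))
print(translate(cds_end))
```

The two fragments translate to `MCSGEEMGRKAR` and `VSHE`, which match the first twelve and last four residues of the protein sequence listed above.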
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -