Fatl055714.1
Basic Information
- Insect
- Flexamia atlantica
- Gene Symbol
- B-H1
- Assembly
- GCA_035578135.1
- Location
- JAQJVK010000011.1:20001056-20013682[+]
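The Location value packs contig, coordinates and strand into one string. The following is a minimal sketch of how it could be split into fields. The function name `parse_location` and the assumption that the coordinates are 1-based and inclusive are illustrative and are not stated by this record.

```python
import re

# Pattern for strings of the form "<contig>:<start>-<end>[<strand>]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> dict:
    """Split a location string into contig, start, end and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return {
        "contig": match.group("contig"),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "strand": match.group("strand"),
    }


loc = parse_location("JAQJVK010000011.1:20001056-20013682[+]")
# Genomic span, ASSUMING 1-based inclusive coordinates (not confirmed here).
span = loc["end"] - loc["start"] + 1
print(loc, span)
```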
Transcription Factor Domain
- TF Family
- Homeobox
- Domain
- Homeobox
- PFAM
- PF00046
- TF Group
- Helix-turn-helix
- Description
- This entry represents the homeodomain (HD), a protein domain of approximately 60 residues that usually binds DNA. It is encoded by the homeobox sequence [7, 6, 8], which was first identified in a number of Drosophila homeotic and segmentation proteins but is now known to be well conserved in many other animals, including vertebrates [1, 2], as well as in plants [4], fungi [5] and some species of lower eukaryotes. Many members of this group are transcriptional regulators, some of which operate differential genetic programs along the anterior-posterior axis of animal bodies [3]. The domain folds into a globular structure with three α-helices connected by two short loops that harbour a hydrophobic core. The second and third helices form a helix-turn-helix (HTH) motif, which makes intimate contacts with the DNA: the first helix of this motif helps to stabilise the structure, while the second binds DNA through a number of hydrogen bonds and hydrophobic interactions between specific side chains and the exposed bases and thymine methyl groups within the major groove. One particularity of the HTH motif in some of these proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, whereas for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
- Hmmscan Out
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  40   1.6e-23   1.6e-20   74.9   3.0         1      54       175     228       175     229  0.98
 2  40   4.1e-13   4.2e-10   41.6   0.6        22      54       227     259       226     260  0.96
 3  40   5.9e-13     6e-10   41.1   0.4        22      54       258     290       258     291  0.97
 4  40   5.9e-13   6.1e-10   41.1   0.4        22      54       289     321       289     322  0.97
 5  40   5.9e-13     6e-10   41.1   0.4        22      54       320     352       320     353  0.97
 6  40   5.9e-13   6.1e-10   41.1   0.4        22      54       351     383       351     384  0.97
 7  40   5.9e-13   6.1e-10   41.1   0.4        22      54       382     414       382     415  0.97
 8  40   5.9e-13     6e-10   41.1   0.4        22      54       413     445       413     446  0.97
 9  40   5.9e-13     6e-10   41.1   0.4        22      54       444     476       444     477  0.97
10  40   5.9e-13   6.1e-10   41.1   0.4        22      54       475     507       475     508  0.97
11  40   5.9e-13     6e-10   41.1   0.4        22      54       506     538       506     539  0.97
12  40     4e-13   4.2e-10   41.6   0.6        22      54       537     569       536     570  0.96
13  40   5.9e-13   6.1e-10   41.1   0.4        22      54       568     600       568     601  0.97
14  40   5.9e-13   6.1e-10   41.1   0.4        22      54       599     631       599     632  0.97
15  40   5.9e-13     6e-10   41.1   0.4        22      54       630     662       630     663  0.97
16  40   5.9e-13   6.1e-10   41.1   0.4        22      54       661     693       661     694  0.97
17  40     6e-13   6.1e-10   41.1   0.4        22      54       692     724       692     725  0.97
18  40    0.0033       3.4    9.8   0.1        22      43       723     744       723     746  0.94
19  40   2.2e-13   2.2e-10   42.5   0.4        21      54       754     787       753     788  0.96
20  40   5.9e-13   6.1e-10   41.1   0.4        22      54       786     818       786     819  0.97
21  40    0.0033       3.4    9.9   0.1        22      43       817     838       817     840  0.94
22  40   2.9e-13     3e-10   42.1   0.3        22      54       887     919       885     920  0.96
23  40   5.9e-13   6.1e-10   41.1   0.4        22      54       918     950       918     951  0.97
24  40   4.1e-13   4.2e-10   41.6   0.7        22      54       949     981       948     982  0.96
25  40   5.9e-13   6.1e-10   41.1   0.4        22      54       980    1012       980    1013  0.97
26  40   5.9e-13   6.1e-10   41.1   0.4        22      54      1011    1043      1011    1044  0.97
27  40   5.9e-13   6.1e-10   41.1   0.4        22      54      1042    1074      1042    1075  0.97
28  40   5.9e-13     6e-10   41.1   0.4        22      54      1073    1105      1073    1106  0.97
29  40    0.0011       1.2   11.4   0.2        22      44      1104    1126      1103    1128  0.91
30  40    0.0022       2.3   10.4   0.1        22      43      1167    1188      1165    1188  0.93
31  40   2.6e-13   2.7e-10   42.2   0.3        21      54      1187    1220      1186    1221  0.95
32  40   1.2e-12   1.3e-09   40.0   0.4        22      54      1219    1251      1219    1252  0.97
33  40   5.9e-13     6e-10   41.1   0.4        22      54      1250    1282      1250    1283  0.97
34  40   5.8e-13     6e-10   41.1   0.4        22      54      1281    1313      1281    1314  0.97
35  40   3.9e-13     4e-10   41.6   0.6        22      54      1312    1344      1311    1345  0.96
36  40    0.0033       3.4    9.9   0.1        22      43      1343    1364      1343    1366  0.94
37  40     0.042        44    6.3   1.4        45      54      1395    1404      1395    1405  0.91
38  40     0.013        13    8.0   0.7        21      41      1402    1422      1401    1423  0.92
39  40    0.0022       2.2   10.4   0.1        22      43      1474    1495      1472    1496  0.94
40  40   4.4e-05     0.046   15.8   0.1        21      45      1494    1518      1493    1519  0.94
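The per-domain rows above each carry 13 whitespace-separated fields. As a rough sketch of how such rows might be loaded and filtered, the snippet below parses four rows copied from this record and keeps those below an independent E-value cutoff. The `DomainHit` class, the `parse_row` helper and the `1e-5` cutoff are hypothetical choices made for illustration. They are not part of HMMER or of this database.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int       # "#": domain number
    total: int       # "of": total domains reported
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line: str) -> DomainHit:
    """Parse one 13-field domain row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


# Four rows copied from the table in this record.
ROWS = """
1 40 1.6e-23 1.6e-20 74.9 3.0 1 54 175 228 175 229 0.98
2 40 4.1e-13 4.2e-10 41.6 0.6 22 54 227 259 226 260 0.96
18 40 0.0033 3.4 9.8 0.1 22 43 723 744 723 746 0.94
37 40 0.042 44 6.3 1.4 45 54 1395 1404 1395 1405 0.91
"""

hits = [parse_row(line) for line in ROWS.strip().splitlines()]

# Illustrative cutoff only; pick a threshold suited to the analysis.
I_EVALUE_CUTOFF = 1e-5
confident = [h for h in hits if h.i_evalue < I_EVALUE_CUTOFF]

for h in confident:
    print(h.index, h.ali_from, h.ali_to, h.i_evalue)
```

With these four rows, only domains 1 and 2 pass the cutoff. Domains 18 and 37 cover only part of the HMM (positions 22-43 and 45-54) and have independent E-values above 1.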
Sequence Information
- Coding Sequence
- ATGTACGCACTCCTCGCCGAGTTTATCAACCCTGGCCGCCACATTCGGGGCTCCAGCCCTCAGTCAAGCGGGGTATCAGCTCGTGATCGCGGCGCGGGGGTTGGCGGCGCTGCGTGcgtgggagggggagggggctCAAAGCGGACAAAAGCTGGGGCAATTAAATCCAACAAGCATAAGCTAATTGTCGTCGTTCGAAATTACTCCGAACCCGGCGTAATCGAGTCAGCGCCCACCCCGAGCGCCACCGTCGCCAGGTTGCGGCTGTCTCGAACAATTTGGCGAGGCTGGCGCCCCAAATGGAGCCTTTCCCCCGGGACCGCCGATAGTAACGCGGTTAAAGCGCTCAAACGCCGCTCAAAAGCCGAGAGCCGAACCGCGCTCTGTGATCTCGGCTCCTCGCTGCTCACAGCGCCTGACAGATCCTTGCTAAACACCGATCATCGCAGGGCCCTGGGGACTCGCCACAAGGAGGACGGCGCGGAGTCGCGCAGCAGCCACTCGGGGCTGAGCAAGAAGCAGCGGAAGGCGCGCACGGCGTTCACCGACCACCAGCTGCAGACGCTGGAGAAGAGCTTCGAGCGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGATAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGATAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTG
CAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGCAAGACCTGGTACCAGAACAGGAGGTCAACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGGCAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGCAAGACCTGGTACCAGAACAGGAGAAGTACCTGAGTGTGCAGGACAGGATGGAACTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGGTCAGTACACCACCGAGTCTTGTATGTAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAACTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACATGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGGCCTGCTACTAGAACAGGAGGTCAGTACAACACCGAGTCTTGTATGTAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAATTTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCA
GGTCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGCAAGACCTGGTACCAGAACAGGAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGCAAGACCTGGTACCAGAACAGGAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAGCTGGCCGCCAAGCTGAGCCTCAGCGACATACCCAGGTCAAGACCTGGTACCAGAACAGGAGAAGTACCTGAGTGTGCAGGACAGGATGGAACTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGGCCTGCTACTAGAACAGGAGGTCAGTACACCACCGAGTCTTGTATGTAGACAGAAGTACCTGAGTGTGCAGGACAGGATGGAACTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGAAGTACCTGAGTGTGCAGGACAGGATGGAACTGGCCGCCAAGCTGAGCCTCAGCGACACCCAGGTCAAGGCCTGCTACTAG
- Protein Sequence
- MYALLAEFINPGRHIRGSSPQSSGVSARDRGAGVGGAACVGGGGGSKRTKAGAIKSNKHKLIVVVRNYSEPGVIESAPTPSATVARLRLSRTIWRGWRPKWSLSPGTADSNAVKALKRRSKAESRTALCDLGSSLLTAPDRSLLNTDHRRALGTRHKEDGAESRSSHSGLSKKQRKARTAFTDHQLQTLEKSFERQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQARPGTRTGGQQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQARPGTRTGEVPECAGQDGTGRQAEPQRHPGQDLVPEQEVSTPPSLVCRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQTEVPECAGQDGAGRQAEPQRHPGQGLLLEQEVSTTPSLVCRQKYLSVQDRMELAAKLSLSDTQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAANLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQVKTWYQNRRQKYLSVQDRMELAAKLSLSDTQARPGTRTGEVPECAGQDGAGRQAEPQRHPGKTWYQNRRQKYLSVQDRMELAAKLSLSDIPRSRPGTRTGEVPECAGQDGTGRQAEPQRHPGQGLLLEQEVSTPPSLVCRQKYLSVQDRMELAAKLSLSDTQKYLSVQDRMELAAKLSLSDTQVKACY
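The relationship between the coding and protein sequences above can be checked by translating codons. The snippet below is a minimal sketch that assumes the standard genetic code and is not necessarily the procedure used to produce this record. It upper-cases the input because the coding sequence contains lower-case bases, and it stops at the first stop codon. The `translate` helper is a hypothetical name.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    seq = cds.upper()  # the record mixes upper- and lower-case bases
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        # Codons with ambiguous bases (e.g. N) are reported as X.
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 and last 18 bases of the coding sequence in this record.
print(translate("ATGTACGCACTCCTCGCCGAGTTTATCAAC"))  # MYALLAEFIN
print(translate("GTCAAGGCCTGCTACTAG"))              # VKACY
```

Both fragments agree with the start (MYALLAEFIN) and end (VKACY) of the protein sequence listed above.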
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -