Nswa012384.1
Basic Information
- Insect
- Nematopogon swammerdamellus
- Gene Symbol
- Ets97D
- Assembly
- GCA_946902865.1
- Location
- CAMPPV010000389.1:124393-140422[+]
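The Location field packs the scaffold accession, coordinates, and strand into one string. A minimal sketch of how it could be split into parts is shown below; the function name `parse_location` and the assumption that coordinates are 1-based and inclusive are illustrative and are not stated by the record itself.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "<seqid>:<start>-<end>[<strand>]".
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into scaffold, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    return Location(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("CAMPPV010000389.1:124393-140422[+]")
# Genomic span, ASSUMING 1-based inclusive coordinates.
span = loc.end - loc.start + 1
print(loc, span)
```

The span covers the whole gene locus on the scaffold, so it is expected to be longer than the coding sequence alone if introns are present.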
Transcription Factor Domain
- TF Family
- ETS
- Domain
- Ets domain
- PFAM
- PF00178
- TF Group
- Helix-turn-helix
- Description
- Transcription factors are protein molecules that bind to specific DNA sequences in the genome, resulting in the induction or inhibition of gene transcription [3]. The product of the ets oncogene is one such factor, possessing a region of 85-90 amino acids known as the ETS (erythroblast transformation specific) domain [3, 4, 5]. This domain is rich in positively charged and aromatic residues, and binds to purine-rich segments of DNA. The ETS domain has been identified in other transcription factors such as PU.1, human erg, human elf-1, human elk-1, GA binding protein, and a number of others [2, 3, 5]. It is generally located at the C terminus of the protein, with the exception of ELF-1, ELK-1, ELK-3, ELK-4 and ERF, where it is found at the N terminus. NMR analysis of the Ets domain revealed that it contains three α-helices (α1-α3) and a four-stranded β-sheet (β1-β4) arranged in the order α1-β1-β2-α2-α3-β3-β4, forming a winged helix-turn-helix (wHTH) topology [1]. The third α-helix is responsible for contacting the major groove of the DNA. Different members of the Ets family display distinct DNA-binding specificities. Both the Ets domain and its flanking amino acid sequences influence binding affinity, and altering a single amino acid in the Ets domain can change its DNA-binding specificity.
- Hmmscan Out
#  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to  acc
1  8   7.6e-06   0.032     15.4   0.1   1         25      607       631     607       643     0.87
2  8   1.2e-05   0.053     14.7   0.2   1         24      654       677     654       682     0.92
3  8   7.5e-06   0.032     15.4   0.1   1         25      701       725     701       737     0.87
4  8   5.2e-06   0.022     15.9   0.1   1         25      748       772     748       786     0.81
5  8   7.6e-06   0.032     15.4   0.1   1         25      795       819     795       831     0.87
6  8   8.5e-05   0.36      12.1   0.0   1         25      842       866     842       878     0.87
7  8   8.4e-06   0.036     15.3   0.1   1         25      889       913     889       924     0.89
8  8   2e-34     8.4e-31   107.0  0.1   1         81      936       1015    936       1015    0.98
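The eight per-domain hits listed above differ widely in significance: seven are short partial matches (HMM positions 1 to 24 or 25) with independent E-values of 0.022 or higher, while the eighth covers HMM positions 1 to 81. A rough sketch of how such rows could be parsed and filtered follows. The `DomainHit` type, the helper names, and the cutoff of 1e-3 are assumptions chosen for illustration, not thresholds used by the database.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    index: int
    c_evalue: float
    i_evalue: float
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


# Rows copied from the Hmmscan Out table of this record (13 columns each).
ROWS = """\
1 8 7.6e-06 0.032 15.4 0.1 1 25 607 631 607 643 0.87
2 8 1.2e-05 0.053 14.7 0.2 1 24 654 677 654 682 0.92
3 8 7.5e-06 0.032 15.4 0.1 1 25 701 725 701 737 0.87
4 8 5.2e-06 0.022 15.9 0.1 1 25 748 772 748 786 0.81
5 8 7.6e-06 0.032 15.4 0.1 1 25 795 819 795 831 0.87
6 8 8.5e-05 0.36 12.1 0.0 1 25 842 866 842 878 0.87
7 8 8.4e-06 0.036 15.3 0.1 1 25 889 913 889 924 0.89
8 8 2e-34 8.4e-31 107.0 0.1 1 81 936 1015 936 1015 0.98
"""


def parse_hits(text: str) -> List[DomainHit]:
    """Parse whitespace-separated per-domain rows into DomainHit records."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 13:
            continue  # skip blank or malformed lines
        hits.append(DomainHit(
            index=int(fields[0]),
            c_evalue=float(fields[2]),
            i_evalue=float(fields[3]),
            score=float(fields[4]),
            hmm_from=int(fields[6]),
            hmm_to=int(fields[7]),
            ali_from=int(fields[8]),
            ali_to=int(fields[9]),
        ))
    return hits


def significant(hits: List[DomainHit], max_i_evalue: float = 1e-3) -> List[DomainHit]:
    """Keep hits at or below an i-Evalue cutoff (1e-3 is an illustrative choice)."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


hits = parse_hits(ROWS)
for hit in significant(hits):
    print(hit.index, hit.ali_from, hit.ali_to, hit.score)
```

With this cutoff only the eighth hit remains, placing the well-supported Ets domain match at protein positions 936 to 1015.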
Sequence Information
- Coding Sequence
- ATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTGTTCACAGAGGACGTGATCGTGCAGCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGTCATGCTGACGCGCCGGCTCGCCACCGACCTCACACTACACCTTCTGTTGTTCACAGAGGACGTGATCGTGCAGCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTGTTCACAGAGGACGTGATCGTGCAGCTGATGGATATCCGGACGCGGCTGTCGACGCTGCACGGCATGCTGGAGCGCCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTGTTCACAGAGGACGTGATCGTGCAGCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGGCGGCTCGCCACCGACCTCACACACTACACCTTCTGTTGTTCGCAGAGGACGTGATCGTGCAGCTGATGGATATCCGGACGCGGCTGTCGACGCTGCGCGGCATGCTGGAGCGCCGGCTCGCCACCGACCTCACACACTACACCTTCTGGCTGCAGGATGCTAAGGAGCTGGAGTCGCACAAGACGCTGGTGGACCAGTGCATCAAGGGGGAGGGCGTGGTGCAGGTGAACCTGCAGCTGAAGCCGCTCGAGCGCCGCATCAACATCCTCGACGTACTCAAGCCGGACGAGGAGCTCATCGACCTCGCGCCCCCGCCAGACGGTGCGGAGACGCCGGAGGAGGTGGCAGACGGGGAGGCGGTCGGCGCGGACGGGGAGGCGGGGGTGGAGGTGGAGGCGGAGGTGAGCGCGGCGGACACGGTGTCGGCCGACCTGCCGCCTGCCGCGGCCTCCCCGCACGCGCCGCAGGACAACAGCAAGCAGCCGCTGCACTCCACCATGGTCAAGTGGATCGTGGACGAACAGTTCCAGTCGGACCGCTCGCGCCTCAAGATGCCCGACGAGCCCGCCGACTGGTCGGTGGCGCACGTGCGGCTCTGGATACAGTGGGCGGTGCGGCAGTTCAACCTGACCGGCATCAAGCTGCCCGACTGGAGCATCAGCGGGGAGCAACTCTGCGCCATGTCCCTGCACGAGTTCAGAGAGAAGTGCCCGTCGGACCCCGGCGATATCTTCTGGACACACTTCGAGTTGCTTCGCAAGTGCAAATTTATTGCGGTGATACAGAAGGAGGATGGGGGCGGGGGGGCGCCGCGCGCCGCCGCCGGCAAGGACACCGAGCGACAGTACGCCGTGAAGAGGAAGAAACCGCAGCCGCTGCAGCAGATGATAACAGAGCCTGGGGACCTGACGGAGTACGGGGGCGGCGCCGGCGCGGGCACGCTACGTGGCGCCAGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCTACTCACCACCGCGGACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCCGTTCCTTCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCTCCTCACCACCGCG
GACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCCGTTCCTGCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCTACTCACCACCGCGGACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCCTTTCCTTCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCTACTCACCACCGCGGACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCCGTTCCTTCCTCCTGCTACATCCCGCTACATACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCTACTCACCACCGCGGACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCCGTTCCTTCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTGGAGCCCCTCACCACCGCGGACCATCTCCCCATCATACAGTGGCACGGTCAGTCACTCTCTCGCCGTTCCTTCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACCTGTGGCAGTTCCTGCTTGAGCTACTCACCACCGCGGACCATCTCCCCGTCATACAGTGGCACGGTCAGTCACTCTCTCGCTGTTCCTTCCTCCTGCTACCTCCCGCTACACACTCGTGCGTGTGCACCGGCCAGATACACTTGTGGCAGTTCCTGCTTGAGCTACTCACCACCGCGGACCATTTTCCCGTCATACAGTGGCACGGCACGGAGGGTGAGTTCAAGCTGGTGGAGCCGGAGCGCGTGGCGCGGCTGTGGGGCTCGCGCAAGCACAAGCCCGCCATGAACTACGAGAAGCTCTCCCGCGCGCTCCGCTACTACTACGATGGCGACATGATCGCCAAGGTCAACGGCAAGAGGTTCGTGTACAAGTTCGTGTGCGACCTGCGCCAGCTGGTGGGCTACTCTGCCGGCGAGTTGGCGCGCCTCGTGTCCGACACCTACGAGCAGCACAGTGCGGCTCCAGCGCGCTCAACAACACCGACTAACACCGACCGCGTCACGTGCGACCTGCGCCAGCTGGTGGGCTACTCTGCCGGCGAGCTAGCGCGCCTCGTGTCCGACACCTACGAGCAGCACTGGTGGGCTACTCTGCCGGCGAGTTGGCGCGCCTCGTGTCCGACACCTACGAGCAGCAGTGCGGCTCCAGCGCACTCAACAACACCGACTAACAGCCCGCGTCACGTGCGACCTGCGCCAGCTGGTGGGCTACTCTGCCGGCGAGCTAGCGCGCCTCGTGTCCGACACCTACGAGCAGCAGTGCGGCTCCAGCACGCTCAACAACACCGACTAAAACCCCGCGTCACGTGCGACCTGCGCCAGCTGGTGGGCCACTCTGCCGGCGAGCTGGCGCGCCTCGTGTCCGACACCTACGAGCAGCAGTGCGGCTCCAGCGCGCTCAACAACACCGACTAA
- Protein Sequence
- MDIRTRLSTLRGMLERRLATDLTHYTFCSDGYPDAAVDAARHAGAAARHRPHTLHLLLFTEDVIVQLMDIRTRLSTLRVMLTRRLATDLTLHLLLFTEDVIVQLMDIRTRLSTLRGMLERRLATDLTHYTFCSDGYPDAAVDAARHAGAAARHRPHTLHLLLFTEDVIVQLMDIRTRLSTLHGMLERRLATDLTHYTFCSDGYPDAAVDAARHAGAAARHRPHTLHLLLFTEDVIVQLMDIRTRLSTLRGMLERRLATDLTHYTFCSDGYPDAAVDAARHAGAAARHRPHTLHLLLFAEDVIVQLMDIRTRLSTLRGMLERRLATDLTHYTFWLQDAKELESHKTLVDQCIKGEGVVQVNLQLKPLERRINILDVLKPDEELIDLAPPPDGAETPEEVADGEAVGADGEAGVEVEAEVSAADTVSADLPPAAASPHAPQDNSKQPLHSTMVKWIVDEQFQSDRSRLKMPDEPADWSVAHVRLWIQWAVRQFNLTGIKLPDWSISGEQLCAMSLHEFREKCPSDPGDIFWTHFELLRKCKFIAVIQKEDGGGGAPRAAAGKDTERQYAVKRKKPQPLQQMITEPGDLTEYGGGAGAGTLRGASTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRRSFLLLPPATHSCVCTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRRSCLLLPPATHSCVCTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRLSFLLLPPATHSCVCTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRRSFLLLHPATYSCVCTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRRSFLLLPPATHSCVCTGQIHLWQFLLEPLTTADHLPIIQWHGQSLSRRSFLLLPPATHSCVCTGQIHLWQFLLELLTTADHLPVIQWHGQSLSRCSFLLLPPATHSCVCTGQIHLWQFLLELLTTADHFPVIQWHGTEGEFKLVEPERVARLWGSRKHKPAMNYEKLSRALRYYYDGDMIAKVNGKRFVYKFVCDLRQLVGYSAGELARLVSDTYEQHSAAPARSTTPTNTDRVTCDLRQLVGYSAGELARLVSDTYEQHWWATLPASWRASCPTPTSSSAAPAHSTTPTNSPRHVRPAPAGGLLCRRASAPRVRHLRAAVRLQHAQQHRLKPRVTCDLRQLVGHSAGELARLVSDTYEQQCGSSALNNTD
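The relationship between the Coding Sequence and Protein Sequence fields can be spot-checked by translating codons with the standard genetic code. The sketch below is a minimal illustration, assuming the standard nuclear code applies and using a hypothetical helper named `translate`; it is not the pipeline that produced this record. Only the first 30 nucleotides of the coding sequence are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T become 'X'.
    A trailing partial codon is ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of the Coding Sequence field.
cds_prefix = "ATGGATATCCGGACGCGGCTGTCGACGCTG"
print(translate(cds_prefix))  # prints MDIRTRLSTL
```

The output matches the first ten residues of the Protein Sequence field. The same function could be applied to the full coding sequence to compare it against the listed protein.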
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -