Basic Information

Gene Symbol
Dsp1_1
Assembly
GCA_963855985.1
Location
OY979741.1:4973127-4984284[+]
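The location string above appears to follow a `contig:start-end[strand]` pattern. A minimal sketch of splitting it into its parts is shown here; the names `Location`, `parse_location` and `span_length` are hypothetical, and the span calculation assumes 1-based, inclusive coordinates, which the record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Matches strings shaped like "contig:start-end[strand]".
_LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into contig, start, end and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["contig"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(loc: Location) -> int:
    """Length of the locus, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("OY979741.1:4973127-4984284[+]")
print(loc.contig, loc.start, loc.end, loc.strand)
print(span_length(loc))
```

Under the inclusive-coordinate assumption, the locus spans 11158 bases on the forward strand of the contig.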

Transcription Factor Domain

TF Family
ETS
Domain
Ets domain
PFAM
PF00178
TF Group
Helix-turn-helix
Description
Transcription factors are proteins that bind specific DNA sequences in the genome and thereby induce or inhibit gene transcription [3]. The product of the ets oncogene is one such factor; it contains a region of 85-90 amino acids known as the ETS (erythroblast transformation specific) domain [3, 5, 4]. This domain is rich in positively charged and aromatic residues and binds purine-rich segments of DNA. The ETS domain has also been identified in other transcription factors, including PU.1, human ERG, human ELF-1, human ELK-1, GA-binding protein, and a number of others [3, 5, 2]. It is generally located at the C terminus of the protein, except in ELF-1, ELK-1, ELK-3, ELK-4 and ERF, where it is found at the N terminus. NMR analysis of the Ets domain revealed three α-helices (α1-α3) and a four-stranded β-sheet (β1-β4) arranged in the order α1-β1-β2-α2-α3-β3-β4, forming a winged helix-turn-helix (wHTH) topology [1]. The third α-helix is responsible for contacts with the major groove of the DNA. Different members of the Ets protein family display distinct DNA-binding specificities. Both the Ets domain and its flanking amino acid sequences influence binding affinity, and altering a single amino acid in the Ets domain can change its DNA-binding specificity.
Hmmscan Out
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to  acc
1   14  0.15      4.3e+02   2.3    0.2   34        69      405       440     393       450     0.77
2   14  0.025     74        4.7    0.0   34        58      493       517     480       529     0.81
3   14  0.11      3.1e+02   2.7    0.0   35        58      535       558     524       566     0.85
4   14  0.057     1.7e+02   3.6    0.1   35        58      565       588     559       602     0.86
5   14  0.024     72        4.7    0.0   34        58      610       634     597       643     0.81
6   14  0.049     1.5e+02   3.8    0.1   35        59      641       665     635       678     0.84
7   14  0.014     41        5.5    0.0   25        58      694       728     676       737     0.74
8   14  0.057     1.7e+02   3.6    0.1   35        58      735       758     730       774     0.86
9   14  0.03      89        4.5    0.0   34        58      780       804     767       815     0.83
10  14  0.016     46        5.4    0.0   25        58      885       919     867       927     0.74
11  14  0.053     1.6e+02   3.7    0.0   35        58      926       949     921       958     0.83
12  14  0.059     1.7e+02   3.5    0.1   35        58      956       979     950       987     0.85
13  14  0.059     1.7e+02   3.5    0.1   35        58      986       1009    981       1023    0.86
14  14  0.35      1e+03     1.1    0.0   34        55      1031      1052    1021      1052    0.86
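Each row of the hmmscan output carries thirteen whitespace-separated values: hit number, total hits, c-Evalue, i-Evalue, score, bias, three from/to coordinate pairs (hmm, ali, env) and acc. A minimal sketch of reading such rows into typed records follows. The names `DomainHit`, `parse_hit` and `filter_hits` are hypothetical, only three of the fourteen rows are copied in for brevity, and the E-value cutoff is an illustrative parameter rather than a threshold taken from this record.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DomainHit:
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-separated row of the domain table."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


def filter_hits(hits: List[DomainHit], max_i_evalue: float) -> List[DomainHit]:
    """Keep hits whose independent E-value is at or below the cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


# Rows 1, 7 and 14 copied from the table above.
ROWS = [
    "1 14 0.15 4.3e+02 2.3 0.2 34 69 405 440 393 450 0.77",
    "7 14 0.014 41 5.5 0.0 25 58 694 728 676 737 0.74",
    "14 14 0.35 1e+03 1.1 0.0 34 55 1031 1052 1021 1052 0.86",
]

hits = [parse_hit(row) for row in ROWS]
best = min(hits, key=lambda h: h.i_evalue)
print(best.index, best.i_evalue, best.ali_to - best.ali_from + 1)
```

Python's `float` accepts the scientific notation used in the i-Evalue column (for example `4.3e+02`), so no special handling is needed for those fields.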

Sequence Information

Coding Sequence
ATGGGGGACAGAGGCGCGACGGGGGGCGCTTGGGGTGCGCGCGACGAAGCCTCGTGGTGGCCGGGTGGCGCAGGTGAACTCCAGCACCAACAACAAATTAATGAAGAAATCGCTAGGAGTACCGCCGCAGCCACCCATCAATTATACACATACAAAATGACCGGCGGATTTTCCAATAACAGCGCAGATAATTCTACAGCTAGTTACGATTACCGATTAATTTCAAATACGAATACCAGGGAGCAATCACCTCAGCAGCCATGGTGGTATGCATCAGGCTCAGTTGAGTCTCAACAAACATCTTCTCCAACGCCTCAGAACCAGTCGAGCCCCGACCCTGATCAGGGCAGTCAGCAGTCTAACGGCATTCAGCAGAACCATCAGGTTCtccaacaacaacaacagcagcaacagcaacagcagcagcagcagcagcaacaacagcagcagcagcaacaaCAACAGCAACAACAACAGCAGCAACAACAACAGAATCAGCATACTCAACAACACACGCAGACTCTTCAGCAGACTCTGCAGCAGCAACAGCTGCAGAGCCAGCAAACTTTACAGCAGATGCTGCAACAGCATCAACAGCaacaacagcagcagcagcagcaacaacagcagcagcaacaacaacaacagcagCAACAACAGCAGCAGCAACAACAAGCCTTGCAGCAGAGTTTGCAGCAAACTCTGCAAGTCAGCCAAGCGCAGGCGCAGGCGATAGTGCAAGCGCAAGCGGTTCTGCAGCAGCAAGTCGCGCAGACGCTGCAGCAACAGCAACAGTCGCTGCATGAGCATATGCAAGCTGTCCAACAGCAGCAGATACAAGCTGCTCTACAGCGACAGTCGGCCACTTTGCAGGAGCTCCAGCAACAAGCCCAACAGCAAGCTCTTGCCCAAGGCCCAGTAACAAAGACCAGAATGCCGCGAGTGAGGCCTTACAACAAGCCTCGCGGGCGTATGACGGCGTACGCCTTCTTCGTGCAAACTTGCCGCGAGGAGCACAAGAAGAAACACCCCGATGAGAACGTCGTCTTCGCTGCCTTCTCCAAGAAGTGCGCTGAGAGCTGCTATAGGATGTGTGTGAGGTCATACAACAAGCCTCGCGGGCGTATGACGGCGTACGCCTTCTTCGTGCAAACTTGCCGCGAGGAGCACAAGAAGAAACACCCCGATGAGAACGTCGTCTTCGCTGCCTTCTCCAAGAAGTGCGCTGAGAGGTGGAATACGATGTCCGAGAAAGAGAAAGAGCGTTTCCACGAGATGGCCGACCAAGACAAGATGCGATACGACCGCGAGATGCAGAACTACGTGCCGCCTAAGGACGTTAAAGTACGCGGGCGCAAGCGGCAAGTGAAGGACCCGAACGCGCCCAAGCGATCCTTATCAGCGTTCTTCTGGTTCTGCAACGACGAGCGTCCCAAAGTGAAGGCCAACAACCCGATGTTCACGATGGGCGACATCGCAAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCCGAGACCAAGTCCAAATATGAAGCGCTCTCTGAACAGGACAAGGCGCGGTATGATCGGTACACTGCCTTCACCTTGGGATACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGTTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAGGACAAGGCGCGGTATGATCGGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAGGACAAGGCGAGGTATGATCGGGTAAGTGATCTATCTTCAACTGCCTTCACCTTGGGAGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCGCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGAGTTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGC
GCGGTATGATAGGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGTAAGTGATCTATCTTCAACTGCCTTCACCTTGGGCGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCGCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGAGTTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGTAAGTGATCTAGCTTCAACTGCCTTCACCTTGGGAGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGATAAGGCGCGGTATGATCGGGTAAGTGATCTATCGTCAACATCAGATTGTAGCACACTGCTTTCACCTTGGGCGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTGAACAAGACAAGGCGCGGTATGATCGGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGTAAGTGATCTATCTTCAACTGCCTTCACCTTGGGCGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCGCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGAGTTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGAGTTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTCTGAACAAGACAAGGCGCGGTATGATCGGGTAAGTGATCTAGCTTCAACTGCCTTCACCTTGGGAGACATCGCCAAGGAGCTGGGTCGGCTGTGGGCCGCTGCTGAGCCTGAGACCAAGTCCAAATATGAAGCTCTCTGA
Protein Sequence
MGDRGATGGAWGARDEASWWPGGAGELQHQQQINEEIARSTAAATHQLYTYKMTGGFSNNSADNSTASYDYRLISNTNTREQSPQQPWWYASGSVESQQTSSPTPQNQSSPDPDQGSQQSNGIQQNHQVLQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQNQHTQQHTQTLQQTLQQQQLQSQQTLQQMLQQHQQQQQQQQQQQQQQQQQQQQQQQQQQQQALQQSLQQTLQVSQAQAQAIVQAQAVLQQQVAQTLQQQQQSLHEHMQAVQQQQIQAALQRQSATLQELQQQAQQQALAQGPVTKTRMPRVRPYNKPRGRMTAYAFFVQTCREEHKKKHPDENVVFAAFSKKCAESCYRMCVRSYNKPRGRMTAYAFFVQTCREEHKKKHPDENVVFAAFSKKCAERWNTMSEKEKERFHEMADQDKMRYDREMQNYVPPKDVKVRGRKRQVKDPNAPKRSLSAFFWFCNDERPKVKANNPMFTMGDIAKELGRLWAAAEPETKSKYEALSEQDKARYDRYTAFTLGYIAKELGRLWAVAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRVSDLSSTAFTLGDIAKELGRLWAAAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRTKSKYEALSEQDKARYDRVSDLSSTAFTLGDIAKELGRLWAAAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRVSDLASTAFTLGDIAKELGRLWAAAEPETKSKYEALSEQDKARYDRVSDLSSTSDCSTLLSPWATSPRSWVGCGPLLSLRPSPNMKLSEQDKARYDRTKSKYEALSEQDKARYDRVSDLSSTAFTLGDIAKELGRLWAAAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRELGRLWAAAEPETKSKYEALSEQDKARYDRVSDLASTAFTLGDIAKELGRLWAAAEPETKSKYEAL
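The protein sequence can be related to the coding sequence by codon-wise translation. The sketch below assumes the standard genetic code (the record does not say which translation table was used), and the function name `translate` is hypothetical. Because the coding sequence contains lower-case stretches, the input is upper-cased before lookup; embedded whitespace is also removed.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (an assumption here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" marks an unrecognised codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the coding sequence above.
prefix = "ATGGGGGACAGAGGCGCGACGGGGGGCGCT"
print(translate(prefix))
```

The translated prefix can be compared against the first ten residues of the protein sequence as a quick consistency check between the two fields.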

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_01125445;
90% Identity
-
80% Identity
-