Basic Information

Gene Symbol
Smg1
Assembly
GCA_905404315.1
Location
FR990154.1:15424403-15442325[-]
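
The location string packs a sequence accession, a coordinate range, and a strand into one field. The sketch below splits it into those parts. The names `Locus`, `parse_locus`, and `span_length` are hypothetical helpers and not part of any database API. The length calculation assumes 1-based, inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Matches strings shaped like "seqid:start-end[strand]".
_LOCUS_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_locus(text: str) -> Locus:
    """Split a location string into accession, start, end, and strand."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(
        match["seqid"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(locus: Locus) -> int:
    """Genomic span in bases, ASSUMING 1-based inclusive coordinates."""
    return locus.end - locus.start + 1


locus = parse_locus("FR990154.1:15424403-15442325[-]")
print(locus.seqid, locus.strand, span_length(locus))
```

The span covers the whole locus on the assembly, so it is not expected to equal the length of the coding sequence.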

Transcription Factor Domain

TF Family
MH1
Domain
MH1 domain
PFAM
PF03165
TF Group
Unclassified Structure
Description
The MH1 (MAD homology 1) domain is found at the amino terminus of MAD-related proteins such as Smads. It is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11-residue beta hairpin binds the DNA consensus sequence GNCN in the major groove, an interaction shown to be vital for the transcriptional activation of target genes. Not all MH1 domains can bind DNA, however: Smad2 has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 carrying the nuclear localisation signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx [2, 1].
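
To make the GNCN consensus concrete, here is a minimal sketch that lists every occurrence of a degenerate motif in a DNA string, reading N as any base. The function name `motif_positions` and the example sequence are illustrative assumptions. Only the given strand is scanned, and a match says nothing about whether a protein actually binds there.

```python
import re

# Regex character classes for the IUPAC codes used here (a partial table).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_positions(dna: str, motif: str = "GNCN") -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every motif occurrence.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(dna.upper())]


# Hypothetical example sequence, not taken from this record.
for start, site in motif_positions("AGTCTAGACGGCAT"):
    print(start, site)
```

Because the motif is only four bases long with two free positions, matches are very frequent in any sequence, so a scan like this is useful only as a first filter.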
Hmmscan Out
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to  acc
1   20  0.55      3.7e+03   -0.7   0.0   20        71      251       299     248       309     0.80
2   20  0.0013    9.1       7.7    0.4   37        95      713       774     684       780     0.76
3   20  0.0055    37        5.7    0.0   59        95      781       820     774       827     0.83
4   20  0.0015    10        7.5    0.3   36        95      806       866     805       873     0.77
5   20  0.0054    36        5.8    0.0   59        95      873       912     866       920     0.83
6   20  0.0057    39        5.7    0.0   59        95      919       958     912       964     0.83
7   20  0.0057    38        5.7    0.0   59        95      965       1004    958       1010    0.83
8   20  0.0061    41        5.6    0.0   59        95      1011      1050    1004      1053    0.82
9   20  0.001     6.7       8.1    0.3   35        95      1035      1096    1034      1102    0.79
10  20  0.0011    7.3       8.0    0.3   35        95      1081      1142    1080      1145    0.79
11  20  0.0055    37        5.7    0.0   59        95      1149      1188    1142      1195    0.83
12  20  0.0054    36        5.8    0.0   59        95      1195      1234    1188      1242    0.83
13  20  0.0063    42        5.5    0.0   60        95      1242      1280    1235      1286    0.83
14  20  0.00092   6.2       8.2    0.2   35        95      1265      1326    1264      1334    0.79
15  20  0.27      1.8e+03   0.3    0.1   59        95      1333      1367    1326      1373    0.79
16  20  0.0023    16        6.9    0.4   45        95      1360      1413    1336      1419    0.78
17  20  0.0085    57        5.1    0.3   73        95      1391      1413    1375      1460    0.53
18  20  0.0029    20        6.6    0.3   51        95      1453      1500    1439      1506    0.78
19  20  0.0064    43        5.5    0.0   59        94      1507      1545    1500      1548    0.82
20  20  0.001     6.7       8.1    0.3   35        95      1531      1592    1530      1598    0.79
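
Each row above is one per-domain hit with thirteen whitespace-separated fields. The sketch below parses such rows and ranks them by independent E-value. The `DomainHit` class, the `parse_row` helper, and the 0.01 cutoff are assumptions made for illustration. The three sample rows are copied from the table (rows 1, 14, and 17).

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line: str) -> DomainHit:
    """Parse one whitespace-separated per-domain row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *map(int, f[6:12]),
        float(f[12]),
    )


ROWS = """
1 20 0.55 3.7e+03 -0.7 0.0 20 71 251 299 248 309 0.80
14 20 0.00092 6.2 8.2 0.2 35 95 1265 1326 1264 1334 0.79
17 20 0.0085 57 5.1 0.3 73 95 1391 1413 1375 1460 0.53
"""

hits = [parse_row(line) for line in ROWS.strip().splitlines()]

# Lowest independent E-value among the sample rows.
best = min(hits, key=lambda h: h.i_evalue)

# Hypothetical cutoff; choose a threshold that suits your own analysis.
I_EVALUE_CUTOFF = 0.01
passing = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]

print(best.index, best.i_evalue, len(passing))
```

With this cutoff none of the sample rows passes. The lowest independent E-value in the full table is 6.2 (row 14), so the individual MH1 matches in this record are weak.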

Sequence Information

Coding Sequence
ATGCGACGCGGGCGCGAGACGCTGCTCACGCTGCTGGAGGCGTTTGTGTACGACCCGCTGGTGGAGTGGGGCGGGGGCCGGCGGCGGCGGGGGGAGAGGCACGTGCGAGCCGCGCGCGCCATGTTGGCCGTGCGAGTGCACGAGCTGAAGTACAGCGTCAACCACCTCGTCGAACAACTTTTGACATTATTACCCGAAGTAAAGAAGTGCGCAGACAAGTGGCTCGAAGAGAATGAAGAACTGAATGCTATTCAATCCAAGTTACAGCTGTGTCACCAACAAATGGTTCTGATCAAAGAAATCGAGGCATACGGAAGTAACCTGAGTAACCATCCATTACACGCTATATCTCAGAAGTATGCCTCCTACAAGCAAGCGAAAAACGCGGTCGAAGACTCCATGAAAGCTTTAGTTAAAATCCTGAACGACTTTGACACCCAAATAGAAAACTTTGCCTCAACCAATGAAGTGCTTAACGGTCCTCAATTAATGGCATGGGTACAAGAGTATAGTGGCCCCAACGAAGACGAACAATTGCCTATTTTTGAACATATAAAAGAATTTCTAACCAACGCGGGCCAAGGAACTATGTTGACACAATGCGAACAAGCCGAAGCTGAATTAAATCAGTGCATGCAGCAAACGAATGTACTCCTTCGTTCTTGTATTGAGCTACTGTCGCAGTATGTAGCCGTATCACAATATTATCCTCGGAGCCAAACAGAATACCATCGCATAGTGTTGTTCCGTGAATACCTCGCGAAGGCATTAGAAAGTAAATCACCCGAGGTCTGCCGCGAGGTTGCTAATCAAGTCACGGCGCTCGTGAACGCTGAGAGTAGCGCGGATCCACAACAAGTCATCGCCTACAATTACCGCCTTCAGCAATTGAACGGTGATTCCAACACCCTGGTCAATAAGTGCTTGGACCGGCTGCAACTAGAAGGTGGGCCTGATGCTATTACAAAAGCCCAAGAATCTTACAAAGACGTGAAAACTAATATAAGCAATTGGGTCCGGGCCGAAGAAGGTGCTGCAGCTGCGCTGGAAAGTGTCTCAATTGGAATGCTGTGCAATTTGAACAGGCGGTATTTAATGTTGGAGAATGGAGCTCAGAGTGCGGGTGATTGTCTTGTGGATCTGACATCCAGAGAGGGGGGATGGTTCCTGGACGACATGAGCGCTTTGTCCATGCAGACTGTCGAGTTACTCTCACTCTTGCCGCTGCAATCGGCGGCGGTTGAAGACGCTTCGATGCCCGTTGCTGTTGAATGTGTGAGGAACGCGAACTTGTTGCTGGCTGATCTGGTGCAATTGAATTACAATTTCAGCACCATTATACTACCTGAAGCCCTGAAGAAGGTGCACTCGGAAGATCCCTCCACGCTGCAAATCATCAATGAGTTGAATGCTGTCATTTTGAACTCTCCGGCACCATTGAACGAGATACTCGCACAGTTGGAGGTGCATTTCAGATACTTGGTAATGGAAATGGAGTCCCCTGCCAGCGGTGCTCCCCTGTGGGCGGCAGCTCTGCGCGCCAGATACGAAGCTCTCCTTTCCCCACCGAACGAGGGCGAGGCCCAATCAGGTGGCAGGATGCTGCTGATGGGCTTTAACGGACTGTTTGCGGCCGTTGAGCTGAGGGCTCGCGAGCTAGCGGACCATCTCAACTCGCCCATACCGGCCGCCTGGAGGAAGATCGATCACGTTAATGATGCGCTACATATGTCCGCAGCGATGCAAAGTCCCGCTCTTCGCGGCGTACTAGAAGACATATTCCTGGTCCGTCGTATTCAGACTGTGGGGGAAGTGTTTGCAATGTGCGCCCAGCTCAGTTGTGCTTTCCGCGGCACCGGGCCCACTGTGCTGTACGACGATGCTGCACTCTGCAAGCCGGTTCGAAGGTTCATAGCGGAGTACGTGTCGCGCTGCCTGCTGGGCGTGCACTCCAAGGCTCTGGCATCCGTGCTGTGTCTGCTGCTGCGGCGCGCACGCCT
CGACCTTCACGCGGAGGTGGAACAAAAGGAGATCGGTGCGTCGTGGAGCGTGTCGCTCGAGTCGCTGTGCGAGAAGGTGTGCGCGGGCGCGAGCGGGGAGCGGGCCGTGGAGCGGGCGGCGTCGCTGGCGGGCAGCCTGTGTGCGAGCCGCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCG
CCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTGTGTGCGAGCCGCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTGTGTGCGAGCCGCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGGTACGCGGTCTATCTAGGTGTGTGCGAGCCGCGCGCGCCGACCGCGCCGCCGTGTGCGTGCGGGCGCTGGAGCGGGCGCGGGCCGCCGCGCGCATGCACCGACTGAGGGCACTGGCGCACGCACACCTGCACGCTGAGATACTGAACAGTAACCAAGAGTGGAGCGGGCAGTTGGCGCGTCGAGCTCGCGAGCTGAGCGCGGCCCTCGAGCGGCTGCAGGCCGGTCGCACCAAGGCGCATGGACTGCTGCAGGCCGCGCGACAGAGGTGGGGCGCGATACCGACTCAATCAGCTCTCAGGAgccacaataatataatatacaaataa
Protein Sequence
MRRGRETLLTLLEAFVYDPLVEWGGGRRRRGERHVRAARAMLAVRVHELKYSVNHLVEQLLTLLPEVKKCADKWLEENEELNAIQSKLQLCHQQMVLIKEIEAYGSNLSNHPLHAISQKYASYKQAKNAVEDSMKALVKILNDFDTQIENFASTNEVLNGPQLMAWVQEYSGPNEDEQLPIFEHIKEFLTNAGQGTMLTQCEQAEAELNQCMQQTNVLLRSCIELLSQYVAVSQYYPRSQTEYHRIVLFREYLAKALESKSPEVCREVANQVTALVNAESSADPQQVIAYNYRLQQLNGDSNTLVNKCLDRLQLEGGPDAITKAQESYKDVKTNISNWVRAEEGAAAALESVSIGMLCNLNRRYLMLENGAQSAGDCLVDLTSREGGWFLDDMSALSMQTVELLSLLPLQSAAVEDASMPVAVECVRNANLLLADLVQLNYNFSTIILPEALKKVHSEDPSTLQIINELNAVILNSPAPLNEILAQLEVHFRYLVMEMESPASGAPLWAAALRARYEALLSPPNEGEAQSGGRMLLMGFNGLFAAVELRARELADHLNSPIPAAWRKIDHVNDALHMSAAMQSPALRGVLEDIFLVRRIQTVGEVFAMCAQLSCAFRGTGPTVLYDDAALCKPVRRFIAEYVSRCLLGVHSKALASVLCLLLRRARLDLHAEVEQKEIGASWSVSLESLCEKVCAGASGERAVERAASLAGSLCASRARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVCASRARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVCASRARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEVRGLSRCVRAARADRAAVCVRALERARAAARMHRLRALAHAHLHAEILNSNQEWSGQLARRARELSAALERLQAGRTKAHGLLQAARQRWGAIPTQSALRSHNNIIYK*
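
The protein sequence can be cross-checked against the coding sequence by translating it codon by codon. The sketch below assumes the standard genetic code (NCBI translation table 1). The record does not say which table was used, and `translate` is a hypothetical helper. The two fragments are the first 18 and the last 30 nucleotides of the coding sequence above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# the record does not state which translation table applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0; '*' marks a stop codon.

    Whitespace and line breaks are removed, case is ignored, codons with
    unexpected characters become 'X', and a trailing partial codon is dropped.
    """
    seq = "".join(cds.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


print(translate("ATGCGACGCGGGCGCGAG"))              # start of the CDS
print(translate("AGGAgccacaataatataatatacaaataa"))  # end of the CDS
```

The two fragments translate to `MRRGRE` and `RSHNNIIYK*`, which match the start and end of the protein sequence shown above.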

Similar Transcription Factors

Clusters of similar sequences, computed with MMseqs2 at the identity thresholds below

100% Identity
-
90% Identity
-
80% Identity
-