Basic Information

Gene Symbol
Hsf
Assembly
GCA_935412865.1
Location
CAKXYQ010000387.1:667307-674568[+]
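
The Location field above packs contig, coordinates, and strand into one string. A minimal sketch of splitting it into parts follows; the function name `parse_location` and the pattern are illustrative, not part of the database, and the span calculation assumes 1-based inclusive coordinates, which the record does not state.

```python
import re

# Hypothetical pattern for strings shaped like "contig:start-end[strand]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text):
    """Split a location string into contig, start, end, and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return {
        "contig": match.group("contig"),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "strand": match.group("strand"),
    }


loc = parse_location("CAKXYQ010000387.1:667307-674568[+]")
# Assumption: coordinates are 1-based and inclusive at both ends.
span = loc["end"] - loc["start"] + 1
```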

Transcription Factor Domain

TF Family
HSF
Domain
HSF_DNA-bind domain
PFAM
PF00447
TF Group
Helix-turn-helix
Description
Heat shock factor (HSF) is a transcriptional activator of heat shock genes [1, 4]: it binds specifically to heat shock promoter elements, which are palindromic sequences rich with repetitive purine and pyrimidine motifs [1]. Under normal conditions, HSF is a homo-trimeric cytoplasmic protein, but heat shock activation results in relocalisation to the nucleus [2]. Each HSF monomer contains one C-terminal and three N-terminal leucine zipper repeats [3]. Point mutations in these regions result in disruption of cellular localisation, rendering the protein constitutively nuclear [2]. Two sequences flanking the N-terminal zippers fit the consensus of a bipartite nuclear localisation signal (NLS). Interaction between the N- and C-terminal zippers may result in a structure that masks the NLS sequences: following activation of HSF, these may then be unmasked, resulting in relocalisation of the protein to the nucleus [3]. The DNA-binding component of HSF lies to the N terminus of the first NLS region, and is referred to as the HSF domain.
Hmmscan Out
#  of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to  acc
1  1   7.7e-37   4.8e-33   114.1  0.1   1         100     14        113     14        113     0.97
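
The domain row above can be read into named fields for downstream filtering. This is a minimal sketch, assuming the thirteen whitespace-separated values appear in the order given by the header; the field names and the helper `parse_domain_row` are hypothetical labels chosen for this example, not HMMER identifiers.

```python
# Field names are illustrative labels for the thirteen columns of the row.
FIELDS = [
    "domain_index", "domain_count", "c_evalue", "i_evalue", "score", "bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to", "acc",
]
INT_FIELDS = {
    "domain_index", "domain_count",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to",
}


def parse_domain_row(line):
    """Map one whitespace-separated domain row onto the names in FIELDS."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return {
        name: int(value) if name in INT_FIELDS else float(value)
        for name, value in zip(FIELDS, values)
    }


row = parse_domain_row("1 1 7.7e-37 4.8e-33 114.1 0.1 1 100 14 113 14 113 0.97")
# Number of protein residues covered by the alignment (inclusive coordinates).
ali_length = row["ali_to"] - row["ali_from"] + 1
# Number of HMM positions covered by the alignment.
hmm_length = row["hmm_to"] - row["hmm_from"] + 1
```

Here the alignment covers protein residues 14 to 113, that is 100 residues matched against HMM positions 1 to 100.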

Sequence Information

Coding Sequence
ATGCGTTCAGTAGTTGAAATTGGGGCCAGTGTGCCTGCTTTCCTCGGAAAATTATGGAAACTGGTCAATGATACAGAGACGAACCATCTGATTTCCTGGAGTCCTGGTGGGAAAACATTTGTTATTAAAAATCAGGCAGACTTTGCAAGAGAGCTCTTGCCTCTATATTATAAACACAACAATATGGCAAGCTTCATTCGGCAATTAAATATGTATGGATTTCATAAGATTACTTCAGTGGAGAATGGTGGATTGCGCTATGAAAAAGATGAAATTGAATTTTCTCACCCCTGTTTTATGAAAGGTCATGCATATCTTTTGGAACATATTAAAAGAAAAATTGCCAACCCAAAGTCAATAGTTACCAGTAATGAAAGTGGTGAGAAAGTTCTTTTGAAGCCAGAGCTGATGAATAAAGTACTGGCAGACGTAAAACAAATGAAAGGCAAACAGGAGAGCCTTGATGCTAAGTTTAGTGCCATGAAACAGGAAAACGAAGCTTTGTGGAGAGAAGTGGCCATACTCAGACAGAAGCACATAAAACAACAGCAAATTGTTAACAATCTTATCCAATTCCTGATGTCACTAGTCCAACCAGCCAGGCCACCAAATGCTAGTGGAAATAATGTTGGTGTAAAGAGGCCATACCAGTTGATGATAAACAGTGCAGCTCACAACTCTGGCACAGATGGCTCTTATCCTGGCAGGCTGAAGAATATGAAATTAGACAAAGAAACCTTATTGGATGAGTTAAGTGAGGATAATCTGGAGGATGGACCTACTATACATGAGCTGGCACATGATGATATCTTGCACAATGAGGGCACACAAGACTCCTTGACCGCAACTGACTTTGTTACAGTTGAAATACCAACCATGACAAATGCTACTACTAGCCACAGTCCTACAAATCCTGCCAGTTCTATTTACCATGTCACTATGGAAGACGGAGATGATGCTGAATCAGGAGGTAACAGACTAGTATATCCCATAACTCCCAGCAATGGTATGCAAAGAACAAATGACAACATGCCCATTATCACATCAGCTTCCCCAACACCTGTAAATCCAACAATGGATCAGCCTGCTATAGACCAAGTGGTTAACAATATGATGGGATCACCTATCATTGTAAAAGTTGAAGGACAGATGGCATCTAAATCAAAAAGCACTAATAATTCTGCTAACAGAAGCACAGCAACATCCTCAGGCAAAAGTCTTCTGTCATCTGGAAATTATAACACATTAAACCCATCAGCTGATTTGAAACTACCTGCCGAAATATTTGCCAGTGATGATTCAGTAAGTGATGCTGGGGTTGCTGCTGCTGATGATATAAACATGCAAAATGCATTGCAGGATCTAGTTGATGAATCACTTCTAACAGCTTCTAAAGACAAAATGCTTGGTGGTATCAATATTAAAATAGAAAGACCTGTAGAAGCTCAAAAAGCCAACAGAAAATCAAAGAAAGGCAACAAAAGTACTGAGCACACTTTGAACTTGGCTGATATAAAAACAGAACTCCAGGATGATTTTGATTGGAACAATATGACTCTAGCTACTGTAAACAACAATACTAATGTGAATAGGTTCCAGACAAGGAATAGGCGAGATAACATCAGCAAAAACCGGGAGGAATATTCTTCCCTCTTTGGAACAAATTCAAACAAGAATGATATTGATGATCATTTGGATGCAATGCAAACAGACCTTGAATCATTAAGAGAACTGCTCCGAAGTGATACCTATGCACTTGACACCAATACATTACTGGGGACTGCTGATGCAAGAATATTACCTTCGATGTCTCTCGCTATTCCTACATGTCAGCCTACAACAAATAATTATTTATTTGGGTCAGATGATCCCTTCTACGGCCTCCCCTATAACTCCATGGATGACAGGGCGAAATCTGCTAATAATGCAAAGAAGCAGAAAGGTGAGATATCAAATGTAAGCAGTGACGACACTCGCGTCCAGCCTCTGTTTACCGAGGA
TAATGTTGAAGGAAATCAGCTAATCTCGTATACAGGGAATATTCCAGACTTTGAGGATATAAACATGCCAGATCTGGAGTGCGATAACTCTCAGGAGCCGTGCGTGTCGCCCGCTCCCAGCAGCTCCACACTGCATACGCCGCAGATGCAGGTGCGCTCGCCTTCCTTTACGTTGAAACCATGA
Protein Sequence
MRSVVEIGASVPAFLGKLWKLVNDTETNHLISWSPGGKTFVIKNQADFARELLPLYYKHNNMASFIRQLNMYGFHKITSVENGGLRYEKDEIEFSHPCFMKGHAYLLEHIKRKIANPKSIVTSNESGEKVLLKPELMNKVLADVKQMKGKQESLDAKFSAMKQENEALWREVAILRQKHIKQQQIVNNLIQFLMSLVQPARPPNASGNNVGVKRPYQLMINSAAHNSGTDGSYPGRLKNMKLDKETLLDELSEDNLEDGPTIHELAHDDILHNEGTQDSLTATDFVTVEIPTMTNATTSHSPTNPASSIYHVTMEDGDDAESGGNRLVYPITPSNGMQRTNDNMPIITSASPTPVNPTMDQPAIDQVVNNMMGSPIIVKVEGQMASKSKSTNNSANRSTATSSGKSLLSSGNYNTLNPSADLKLPAEIFASDDSVSDAGVAAADDINMQNALQDLVDESLLTASKDKMLGGINIKIERPVEAQKANRKSKKGNKSTEHTLNLADIKTELQDDFDWNNMTLATVNNNTNVNRFQTRNRRDNISKNREEYSSLFGTNSNKNDIDDHLDAMQTDLESLRELLRSDTYALDTNTLLGTADARILPSMSLAIPTCQPTTNNYLFGSDDPFYGLPYNSMDDRAKSANNAKKQKGEISNVSSDDTRVQPLFTEDNVEGNQLISYTGNIPDFEDINMPDLECDNSQEPCVSPAPSSSTLHTPQMQVRSPSFTLKP
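
The coding and protein sequences above can be cross-checked by translating the coding sequence. The sketch below assumes the standard genetic code and a sequence that starts in frame; the function `translate` is written for this example, and only the first and last few codons of the record are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks unknown codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons and last five codons of the coding sequence in this record.
cds_start = translate("ATGCGTTCAGTAGTTGAAATTGGGGCCAGT")
cds_end = translate("ACGTTGAAACCATGA")
```

Under these assumptions the two fragments translate to `MRSVVEIGAS` and `TLKP`, which match the start and end of the protein sequence listed above.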

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_00072101; iTF_01260024; iTF_00171675; iTF_01424961; iTF_00726210; iTF_01338499; iTF_01339940; iTF_00122277; iTF_00121317; iTF_00123270; iTF_01031052; iTF_00124160; iTF_00851686; iTF_00850540; iTF_00300091; iTF_00869542; iTF_01151990; iTF_00000233; iTF_00785921; iTF_01221414; iTF_00001098; iTF_00327004; iTF_01487636; iTF_00282185; iTF_00281048; iTF_01491799; iTF_00017258; iTF_01377386; iTF_00018093; iTF_01440906; iTF_00831088; iTF_00445029; iTF_00446024; iTF_00447035; iTF_00448935; iTF_00375093; iTF_01029136; iTF_01076306; iTF_00907795; iTF_00363941; iTF_00906030; iTF_00906968; iTF_00036539; iTF_00237380; iTF_01526935; iTF_00383540; iTF_00450036; iTF_01342133; iTF_00049819; iTF_00888148; iTF_01028181; iTF_01030117; iTF_00784313; iTF_00785124; iTF_00783345; iTF_01285461; iTF_01264314; iTF_01025947; iTF_01525851; iTF_00951710; iTF_00771793; iTF_00038616; iTF_00037693; iTF_00952607; iTF_01092988; iTF_01094817; iTF_00120402; iTF_00667260; iTF_01093951; iTF_00924512; iTF_01085054; iTF_00071303; iTF_01084167; iTF_01246740; iTF_00809025; iTF_00274362; iTF_00809979; iTF_01062722; iTF_01063683; iTF_00177029; iTF_00373831; iTF_00745544; iTF_00758054; iTF_01061802; iTF_01064554; iTF_00043403; iTF_00040708; iTF_00042542; iTF_00039706; iTF_00041696; iTF_01230444; iTF_00147213; iTF_00425294; iTF_00973679; iTF_01534711; iTF_00928578; iTF_01531883; iTF_01532941; iTF_01533843; iTF_00711744; iTF_00273529; iTF_01192539; iTF_00302016; iTF_00301088; iTF_00685344; iTF_00622720; iTF_01439810; iTF_00172645;
90% Identity
iTF_01342133;
80% Identity
-