Basic Information

Gene Symbol
-
Assembly
GCA_002938485.2
Location
NW:588557-603963[-]
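A minimal sketch of how the Location value could be split into its parts. It assumes the layout is `sequence:start-end[strand]` and that the coordinates are 1-based and inclusive; the record itself does not state either convention. The function name `parse_location` is hypothetical.

```python
import re

def parse_location(loc):
    """Split a 'SEQ:START-END[STRAND]' string into its components.

    Assumption: coordinates are 1-based and inclusive, so the span
    length is end - start + 1.
    """
    m = re.fullmatch(r"([^:]+):(\d+)-(\d+)\[([+-])\]", loc)
    if m is None:
        raise ValueError(f"unrecognized location: {loc!r}")
    seq_id, start, end, strand = m.groups()
    start, end = int(start), int(end)
    return {
        "seq_id": seq_id,
        "start": start,
        "end": end,
        "strand": strand,
        "length": end - start + 1,
    }

if __name__ == "__main__":
    print(parse_location("NW:588557-603963[-]"))
```

Under those assumptions the locus spans 15407 bp on the minus strand.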

Transcription Factor Domain

TF Family
POU
Domain
Homeobox|Pou
PFAM
PF00157
TF Group
Helix-turn-helix
Description
The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain (this entry), while the C-terminal subunit is a homeobox domain (IPR001356). Both subdomains contain the 'helix-turn-helix' structural motif, which directly associates with the two components of bipartite DNA binding sites, and both are required for high-affinity, sequence-specific DNA binding. 3D structures of complexes including both POU subdomains bound to DNA are available. The domain may also be involved in protein-protein interactions [6]. The subdomains are connected by a flexible linker [7, 5, 8]. Despite the lack of sequence homology, the three-dimensional structure of POUs is similar to that of the bacteriophage lambda repressor and other members of the HTH_3 family [7, 5]. POU proteins are eukaryotic transcription factors containing a bipartite DNA binding domain referred to as the POU domain. The acronym POU (pronounced 'pow') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1 and the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU domain genes have been identified in diverse organisms including nematodes, flies, amphibians, fish and mammals, but have not yet been identified in plants or fungi. The various members of the POU family have a wide variety of functions, all of which are related to the function of the neuroendocrine system [4] and the development of an organism [1]. Some other genes are also regulated by POU proteins, including those for immunoglobulin light and heavy chains (Oct-2) [3, 2] and trophic hormone genes, such as those for prolactin and growth hormone (Pit-1).
Hmmscan Out
#  of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to  acc
1  6   2         5.8e+03   -3.0   0.0   14        32      79        97      70        112     0.77
2  6   0.0078    22        4.7    0.0   11        29      168       186     159       206     0.82
3  6   0.064     1.8e+02   1.8    0.0   7         29      256       278     251       297     0.85
4  6   0.16      4.7e+02   0.5    0.0   14        29      355       370     345       387     0.87
5  6   0.0065    19        5.0    0.0   11        30      444       463     435       482     0.79
6  6   9.1e-05   0.26      10.9   0.0   14        58      539       586     529       591     0.76
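The domain rows above can be read programmatically. The following is a sketch only: the field names in `FIELDS` are labels chosen here for the thirteen whitespace-separated columns, and `parse_rows` and `best_hit` are hypothetical helper names, not part of HMMER.

```python
# Rows copied from the Hmmscan Out table in this record.
ROWS = """\
1 6 2 5.8e+03 -3.0 0.0 14 32 79 97 70 112 0.77
2 6 0.0078 22 4.7 0.0 11 29 168 186 159 206 0.82
3 6 0.064 1.8e+02 1.8 0.0 7 29 256 278 251 297 0.85
4 6 0.16 4.7e+02 0.5 0.0 14 29 355 370 345 387 0.87
5 6 0.0065 19 5.0 0.0 11 30 444 463 435 482 0.79
6 6 9.1e-05 0.26 10.9 0.0 14 58 539 586 529 591 0.76
"""

# Column labels (assumed names for the 13 columns, in order).
FIELDS = [
    "idx", "of", "c_evalue", "i_evalue", "score", "bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to",
    "env_from", "env_to", "acc",
]
FLOAT_FIELDS = {"c_evalue", "i_evalue", "score", "bias", "acc"}

def parse_rows(text):
    """Turn whitespace-separated rows into a list of dicts."""
    hits = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip blank or malformed lines
        hit = {}
        for name, raw in zip(FIELDS, parts):
            hit[name] = float(raw) if name in FLOAT_FIELDS else int(raw)
        hits.append(hit)
    return hits

def best_hit(hits):
    """Return the domain hit with the lowest independent E-value."""
    return min(hits, key=lambda h: h["i_evalue"])

if __name__ == "__main__":
    hits = parse_rows(ROWS)
    top = best_hit(hits)
    ali_len = top["ali_to"] - top["ali_from"] + 1
    print(top["idx"], top["i_evalue"], top["score"], ali_len)
```

In this table the lowest i-Evalue belongs to domain 6 (0.26, score 10.9), whose alignment covers residues 539-586 of the protein. Any significance cutoff applied on top of this would be a choice made by the reader, not something stated in the record.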

Sequence Information

Coding Sequence
ATGGGTGATTTGTTATTCGTATGTTATAGATTAGAGCACAGCAGCGACAACGAATGTAGTTGTACATGTAGCCAAATAGTATGCAATTTTGAAGATAATTTGTGTGAGATAGCACTTGCAAAAGAAAATACTGCTAATCCAATTATAAAATGTCAACACGGAACTACTTTTTCAAAAGACCCTTTCAAAGATGGATTTGAAAATATCGATCTAAAttcaacCGTTCCTAATTCTAAACAGGAAAGAACTAGGTGTGGCTTAAACGAGGAAGATGTTgcaaaccaaattaaaattgaaagatGTAAGAATGATAATTCATTGGCTCAAAGCACTTTGGTCCATCCAGAATTTGTTTGTGTCGATGGAAGAAATGTTTGCCTTCAGGAAAACCCTTCGTATTATGACCCTAATGAAGTAACTACTGAACCAAGTTATGAATACGTATCCAGAAGTAAAAGAGGTTCCCTTGAACCATCTCATGAAAAATCTGAACAGACGACTGCTGATAATAGACAGAAAAGAATTAAGTGTGGCTTAAGCGAGGAAGATGTTGCAAACGAAATAATTATTGAAGGGTTTAAGAATGATAATTCATTGGCTCAAAGCACTGTGGTCTATCCAGAATCTGATTGTGTGGATGTAAGGAATCTTTGCATTCACGAAAATCGTTCGTATTATGGCCCTAATGAGGGGAATACTGAACCAAGTGATGAACACGTTTCCAGAAGTAAAAGTGGATCCCTTGAACCATCTATTGAAAGATCTGAACAGATGACTGCTGATAATAGACAGAAAAGAATTAAGTGTAGCTTAAGCGAGGAAGATGTTGCAAACGAAATAATTATTGAAGGGTTTAAGAATGATAATTCATTGGCTCAAAGCACTGTGGTCTATCCAGAATCTGATTGTGTGGATGTAAGGAATCTATGCATTCACGAAAATCGTTCGTATTATGGCCCTAATGAGGGGAATACTGAACCAAGTGATGAACACGTTTCCAGAAGTAAAAGTGGATCCCTTGAACCATCTATTGAAAGATCTGAACAGACGACTGCTGATAATAGACAGAAAAGAATTAAGTGTAGCTTAAGCGAGGAAGATGTAGCAAACGAAATAATTATTGAGGGATTTAAGAATGATAATTCATTGGCTCAAAGCACTTTGGTCCATCCAAAATCTGTTTGTGTCGATGGAAGAAATGTTTGCCTTCAGGAAAACCCTGCGTATTATGACCCTAATAAAGGAACTACTGAACCAAGTGATGAATACGTATCCAGAAGTAAAAGAGGTTCCCTTGAACCATCTCATGAAAAATCTGAACAGACGACTGCTGATAATAGACAGAAAAGAATTAAGTGTGGCTTAAGCGAAGAAGATGTTGCAAACGAAATAATTATTGAAGTATTTAAGAATGATAATTCATTGGTTCAAAGCACTGTGGTCTATCCAGAATCTGATTGTGTGGATGTAAGGAATCTTTGCATTCACGAAAATTGTTCGTATTATGGCCCTAATGAGGGGATTACTGAACCAAGTGATGAACACGTTTCCAGAAGTAAAAGTGGATCCCTTGAACCATCTATTGAAAGATCTGAACAGACGACAACTGGTTCTAGACAGAAAAGAATTAAGTGTGGCTTAAGCCAGGAAGATGTTGCAaacaaaacaataattgaaagatTTAAGAATAAAAATTCATTGGCTCAAAGCACTGTGGTCCATCCAGAAGCTGTTTGTATGGATGTAAGAAATATTTGCATTCAGGAGGACCTTTCGTATTATGACCCTAATGAGGAAACTACTGAATCAAGTGATGAACACGTTTTCAGAAGTAAAAGAGGATCCCTTGGACCTTATAAAAAATCTGGACagAGGAAAATTTGTAGAACTACTGCTTCTAGAACTTCGGATACACCAAGCACCAAGCGGCTTTTGAATGAACAAAACATTCAACTAAGAAATACTGAATTATTTGAGTTGCATTCCTCTACAAACAACGA
TATTATTTGCGACACTATTAGAAATAGTATGACGGGAGATTCTGAGAAAACCATTACGTCCAATCTTCGCAGTGCTCCAGAACATGTAATTGATAATGATACTATTACCAGTACAGTTGAATCAGTCCATAGTGTTGACAAAAAGTCGGCAAAAACACTGTCTTTTCAAGATTACAAAAGAAAATATGGACAGTTTCGCCGTTCCAGTACAGATCACAAGAAGCGAGAATTTTCACTCGATggtaaaattaatagtttgaACATTGACATAGGAACTCACAAAACCAATAATTTTGATACCACTATAAATACAGATGCTGATACAATATTATACAATATGTACAAAGGATCGCTTGTAAATAACTATACAGTAACATGTACTGGAGAGAAAAGACAGGGaTCAGAAAATGCTATTCAATATTACATAGACCTTGAAAATTCATATATTCCTGATACTCAATTAATCAAATGCATTGTGTGTTTAGAAAATATCGAAAGTATTGACACACTAGCAGTACATTTTTGGGTAAATCATAGAGATCAACCGATAGACAAAATTGTATCCGTGTTCTGTGAAGTATGCGGTGAGAAAATTAACACAGCATATTCTAACAATCACTTTCATCTTTACAAATTTGAATGTGTCTGTTGTAATGAGGCATTCGAAAATACCGACCGCCTAATGTCGCATTATATTGTCTCTCATCCGAAGGACAACAAAGAGGAATTAGAATTCCGATCTATTGGTACCCACCACACAACTTATTTCTGTAACTTTtgtggagctcaatttttcttCATTGCTAGTTTGTTTAGACATAAAAATGGATATCACAGTAATATATTGAATATTCTTAAAATGAAGGAATGCGCTGATGCGCATGTAGATTCAAATGAGGAGCCGTCCAGGGACATTGAGGAGTCTATAAACTACGCAAACGAAGAATTCGAGCAGACAATATATGACTTTCAATGTGATTTAACGGATTGTACGTCACCGATACAAATGAACTCTGAAGATGGTTTGTCTTGGGGAGAGAAACAACAGCATAAACCAAGCAGTTCGATTACCAAGAAAGCTACCCATCGAAACTATAATGAAAACCCATCCACTACTCAAGATGTAGAAGATCCAAATAAATGTATACGATACAACCGAGATCCAAGACGTCAAAGATTCCGATATTAA
Protein Sequence
MGDLLFVCYRLEHSSDNECSCTCSQIVCNFEDNLCEIALAKENTANPIIKCQHGTTFSKDPFKDGFENIDLNSTVPNSKQERTRCGLNEEDVANQIKIERCKNDNSLAQSTLVHPEFVCVDGRNVCLQENPSYYDPNEVTTEPSYEYVSRSKRGSLEPSHEKSEQTTADNRQKRIKCGLSEEDVANEIIIEGFKNDNSLAQSTVVYPESDCVDVRNLCIHENRSYYGPNEGNTEPSDEHVSRSKSGSLEPSIERSEQMTADNRQKRIKCSLSEEDVANEIIIEGFKNDNSLAQSTVVYPESDCVDVRNLCIHENRSYYGPNEGNTEPSDEHVSRSKSGSLEPSIERSEQTTADNRQKRIKCSLSEEDVANEIIIEGFKNDNSLAQSTLVHPKSVCVDGRNVCLQENPAYYDPNKGTTEPSDEYVSRSKRGSLEPSHEKSEQTTADNRQKRIKCGLSEEDVANEIIIEVFKNDNSLVQSTVVYPESDCVDVRNLCIHENCSYYGPNEGITEPSDEHVSRSKSGSLEPSIERSEQTTTGSRQKRIKCGLSQEDVANKTIIERFKNKNSLAQSTVVHPEAVCMDVRNICIQEDLSYYDPNEETTESSDEHVFRSKRGSLGPYKKSGQRKICRTTASRTSDTPSTKRLLNEQNIQLRNTELFELHSSTNNDIICDTIRNSMTGDSEKTITSNLRSAPEHVIDNDTITSTVESVHSVDKKSAKTLSFQDYKRKYGQFRRSSTDHKKREFSLDGKINSLNIDIGTHKTNNFDTTINTDADTILYNMYKGSLVNNYTVTCTGEKRQGSENAIQYYIDLENSYIPDTQLIKCIVCLENIESIDTLAVHFWVNHRDQPIDKIVSVFCEVCGEKINTAYSNNHFHLYKFECVCCNEAFENTDRLMSHYIVSHPKDNKEELEFRSIGTHHTTYFCNFCGAQFFFIASLFRHKNGYHSNILNILKMKECADAHVDSNEEPSRDIEESINYANEEFEQTIYDFQCDLTDCTSPIQMNSEDGLSWGEKQQHKPSSSITKKATHRNYNENPSTTQDVEDPNKCIRYNRDPRRQRFRY
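The relationship between the coding sequence and the protein sequence can be checked with a short translation routine. This sketch assumes the standard genetic code (NCBI translation table 1), which the record does not state explicitly, and upper-cases the input so that lower-case bases in the coding sequence are handled the same way as upper-case ones. The function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code (assumed), codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a coding sequence codon by codon.

    Stop codons are rendered as '*'; codons containing characters
    other than A, C, G, T are rendered as 'X'. Trailing bases that
    do not fill a whole codon are ignored.
    """
    seq = "".join(cds.split()).upper()  # drop line breaks, normalize case
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)

if __name__ == "__main__":
    # First ten codons of the coding sequence in this record.
    print(translate("ATGGGTGATTTGTTATTCGTATGTTATAGA"))
```

Applied to the first ten codons of the coding sequence, this yields `MGDLLFVCYR`, which matches the start of the protein sequence above. The final codon of the coding sequence, `TAA`, is a stop codon and has no counterpart in the protein sequence.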

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-