Basic Information

Gene Symbol
pros
Assembly
GCA_035045925.1
Location
JAWNOP010000022.1:5365674-5376727[-]
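The location string above packs the contig accession, start, end, and strand into one field. As a minimal sketch, it could be split apart as follows. The names `Locus`, `parse_locus`, and `span_length` are hypothetical, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "contig:start-end[strand]".
_LOCUS_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_locus(text: str) -> Locus:
    """Split a location string into contig, start, end, and strand."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


def span_length(locus: Locus) -> int:
    """Genomic span in bp, ASSUMING 1-based inclusive coordinates."""
    return locus.end - locus.start + 1


locus = parse_locus("JAWNOP010000022.1:5365674-5376727[-]")
print(locus.contig, locus.strand, span_length(locus))
```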

Transcription Factor Domain

TF Family
HPD
Domain
HPD domain
PFAM
PF05044
TF Group
Helix-turn-helix
Description
Prospero is a large Drosophila transcription factor that is expressed in all neural lineages of Drosophila embryos. It is required for the correct expression of several neural proteins and for determining the cell fates of neural stem cells. Homologues of prospero are found in a wide range of animals, including humans, with the highest similarity in the C-terminal 160 amino acids. This region was originally identified as containing an atypical homeobox domain followed by a prospero domain. However, the structure shows that these two regions form a single stable structural domain, as defined here [1]. This homeo-prospero domain binds DNA.
Hmmscan Out
#  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to  acc
1  1   9.6e-87   1.7e-82   274.8  3.2   1         153     1751      1903    1751      1904    0.99
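The domain row above has thirteen whitespace-separated values. A minimal sketch of reading it into named fields is shown below. The field names and the `DomainHit` and `parse_domain_row` helpers are hypothetical labels chosen for this example, not part of any HMMER API, and the length calculations assume inclusive coordinates.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int        # "#": rank of this domain within the hit
    total: int        # "of": number of domains reported
    c_evalue: float   # conditional E-value
    i_evalue: float   # independent E-value
    score: float      # bit score
    bias: float       # bias correction
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float        # alignment accuracy


def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated domain row into a DomainHit."""
    fields = row.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    return DomainHit(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]),
        float(fields[4]), float(fields[5]),
        *(int(value) for value in fields[6:12]),
        float(fields[12]),
    )


hit = parse_domain_row(
    "1 1 9.6e-87 1.7e-82 274.8 3.2 1 153 1751 1903 1751 1904 0.99"
)
# Lengths of the matched model region and the aligned protein region.
hmm_span = hit.hmm_to - hit.hmm_from + 1
ali_span = hit.ali_to - hit.ali_from + 1
print(hmm_span, ali_span, hit.i_evalue)
```

Here the aligned region on the protein (residues 1751 to 1903) falls near the C-terminus, in line with the description of the homeo-prospero domain.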

Sequence Information

Coding Sequence
ATGTCATCAGAGGAGTACGTGGCGGACTGTTTTGGTTTGTACAGCGATGAGAACAACGTGCTGCTGAAGGCGACTGGAGAACAACCGGAGATAACAGCCCCAAGCACCAAGCAGCAACAGTCGCcgccgcaacagcaacagcaacagcaactgcaacaaccaCCGCTAAACGGACACACGACCCCAACAACACCCATACAGCTAACAAACGGCAGCGACATTTTTACGGCGAATTCCGGCGACACCAATACTAATACTAATACTAATACCAataccaacagcaacagcaacaacaacacggaAGAGCAGGAGAGCAAGagagagcagcagcagtcaaAGCAGCAGGCAAGGCAGCAGGCCCACGACAAGGAAAAGGAGAAAGTCATCGACGACGAGGGCGAGTCTGACGATTCCGACGACGACGTCGTGGTCGTGCTCGAAGgttgcaacaacagcaacagcaacagcaacagcaacagcagcgcaagcagcagcaacaacaaccataaGGGAGCAACGACGGCGACAGCAACGACAACGGCAAACAACTGTAAtcgtggcagtggcagcggcagcggcagcaacatcagcggAGGTCTCAGTCGTAGTCATCGAAGCGGCCGGAGTAGTCGGCAAATCAGACAGTCAACGGCCGTCGGCAAGACAACAACGTGCGCGGCTAAAAAATCCGGAGTGACTTCAGTTCCGACTCCAGCCAAGAACACGAACGTCGTAAATTCGATCGGAAGTAACGGAAATAGTAAtattaatagtaatagtaacgggaacgggaacggaaacggaaacgtgAGTGCTAAAGTGAATCGTCGTTCGCGCCACACATCCTTGAGCAaagacaacaacagcagcagcaacagcaacagcaactgcaatagcaacagcaattGCAATAGCAACACTAGCCAAAttagcaacaccagcaacactGCTGGCTTCATGAGTAGCGCTGCCGCGGCTGCTGCGGGTGCCgctggtggaggaggaggagcactGTTCCAACCCCAATCGGCCAGCACGGCCAACACGAGTCCTTCGGGTCTGGGACTGAGCCAAGCGACAATTCCCAGCCACTCGCCCACAAGCAACTCACCGGTTTCGGGTGCCAGTTCCGCGTCGTCTTTGTTGACGGCAGCCTTTGGCAACCTGTTCGGGGGATCCTCGGCCAAGATGCTGAACGAACTGTTCGGCCGCCAAATGAAGCAGGCCCAAGACGCCACCAGTGGCCTGCCCCAGAGCCTGGACAACGCCATGCTGGCCGCAGCCATGGAGACTGCCACCAGTGCCGAGCTACTGATCGGCAGCCTGAGTTCCACGTCCAAAttgttgcagcaacaacagcagcagctcaaCAATAATTCCGCCACAACAGCAACCACAACGCCGCTGAGCAACGGAACCAATGCCTCGATCTCGCCTGGATCGGCCCACAGCTCCAGCCACTCGCATCACGGTGTCTCCCCGAAGGGAAGTCGTCGGGTTTCGGCCTGCAGTGATCGCTCCCTGGACGCCTCGGCAGCCGATGTAGCTGGCGGATCTCCTCCGAGAGCGGCATCCGTGAGCTCTTTGAACGGAGGAGCTAGCAGTGgagagcagcaacagcaacagcagctgcaacaCGATCTCGTGGCCCATCATATGCTGCGCAATATCCTGCAGGGAAAGAAGGAGCTCATGCAACTCGACCAGGAACTCCGCACGGccatgcaacagcagcaacagcaacagctccaGGACAAGGACCAGCTCCACTCCAagctcaacaacaacaacaacaataatatatCGGCAGCtgccaacaataacaacaacaacacgatGGAGAGCATCAACCTCATCGAGGACTCCGACATGGCGGACATCAAAATCAAGAGCGAGCCTCAGACGGCGCCAGCACCGCAGCAATCGCCGCacggcagcagccacagcagtcACAGCGGGAGCGCCAGTGGCAGCCACAGCAGCCTGGCCAGCGATGGCAG
CCTGCGACGCAAGTCCTCGGACTCGCTGGACAGCCACGGCGGCCATGACGAGGCGGATCAGGAGGAGCAAGACCCCAATCGGCAAAGATCGCAGAGCCGTGCTCCGGAGGACCCGCAGCTGCCTACCAAAAAGGAGGCTGTGGACGATATGCTGGACGAGGTGGAGCTGATGGGTCTGCACTCGCGGGGATCCGATCTGGAGAGCCTGGCCTCGCCCAGCCATTCGGACATGATGCTGCTGGACAACAGCAAGGACGATGTTCTcgatgacgacgatgacgatgattgCGTGGAACAGAAACGCGACCCATCTACATGCCTCAAGAAACCGGGCATGGACCTCAAGCGGGCACGCGTAGAGAACATTGTGTCCGGCATGCGATGCAGTCCCTCATCAGGACTCGCCCAGGCCGGACAGCTCCAGGTAAACGGCTGCAAGAAGCGCAAACTCTACCAGCCCCAGCAACACGCCATGGAGCGTTATGTGGCAGCAGCTGCCGGCCTGAATTTCGGCCTCAATCTGCAGAGCATGATGCTCGACCAGGACGACAGTGAGTCCAACGAGCTGGAGTCGCCGCAGATCCAGCAGAAGCGCGTGGAGAAGAACGCCCTCAAGTCCCAGTTGCGATCCATGCAGGAGCAGCTCGCCGAGATGCAGCAGAAGTACGTGCAGCTGTGCACGCGCATGGAGCAGGAGAGTGAGTGCCAGGAGCTCGATCAGGACCACGAGCAGGATCAAGAGCCGGAGCCAGAGCAAGAACAGGAGCCGGACAACGGCAGCAGCGACCACATCGAGCTGTCGCCATCGCCCACTCTGACCGGCGACGACGATGTAAGTCCTGCCCACAAGGTGGAAGTTCCGGGCTCCAGTTCTCCCTCGCCATCGCCCCTCAAGCCAAAGACGGCGCTGGGCGAGAGCGGAGACTCCGGTGCCAATATGCTGTCCCAGATGATGAGCAAGATGATGTCCGGCAAGCTGCACAATCCTCTCGTCGGCGTGGGCCATCCTGCCCTGCCGCAAGGATTCCCACCGCTGCTGCAGCACATGGGCGACATGTCCCATGCCGCAGCCATGTACCAACAGTTCTTCTTCGAGCAGGAGGCACGCATGGCCAAGGAGGCCGccgagcagcaacagcaacagcaacaacagcaacaccagcagcagcaacagcaacagcagcagcaggaacagCAGCGCCGCTtcgagcaggagcagcaggaacaGCAGCGGCGAAAGGacgaacagcagcagctgcagcgccagcaacagcacttgCAACatctgcaacaacagcagttggagcagcagcagcatgccGCCAATGTCGCCCCGCGACAGCAGCAGATGCACCACCCGGCCCCCGCCCGACTGCCCACTCGAATGGGCGGGGCTGCCGCCCACAGCGCCCTCAAGTCGGAGCTGTCGGAGAAGTTCCAAATGCtgcgcagcaacaacaatagctcCATGATGCGCATGTCCGGCTCGGATCTCGAGGGACTCGCCGATGTCCTGAAGTCCGAAATCACCACTTCGCTGTCCGCCCTAGTGGACACCATTGTGACCCGCTTCGTCCACCAACGGCGACTCTTCAGCAAGCAGGCGGACTCCGTGACCGCAGCGGCCGAGCAGCTGAACAAGGACCTGCTGCTGGCCTCACAGATCCTGGACAGGAAATCGCCGCGCACCAAGGTGGCAGACCGGCCCCAAAACGGACCCACACCCGCATCCCAATCAGGTAATAATGGTTCACTACTCCTAGCTAATAGTCAGATGCCCGCCCAGCCGTCCGCGAatgcgcagcagcagctgcagcaggctcagggatcccagcagcaacagcagcagccaggatcgcaacagccacagccaagctcgcagcagcagcagcagcaacagaatgTGGTGCAGCAACAGCCGCATCCCTTGATGCCGCCCAACTGCCAGCAGCTAATAGCCGCTCCCCGCTTGAACGGGAGCCAGTTGTCCTTCGCCTCGC
CAGCGGCTGCTGCAGCGGCTGCCATGGGCCTGCAGATGCAccatgccgccgccgccgcggCCATGtccgcccagcagcagcagcaacagcaacagcagcagcagcagcaacagcaacaacagccggGCGATCCTGGCATGAACACTAATCCGGCTAGCGGGCCAACAAACCCCACTAACAATAGCTTAAGTACCCTTAATATTCCACCTCCTCACATTCGTCCTTCGCCCACAGCGGCTGCCATGTTCCAGGCGCCAAAGACGCCGCAGGGCATGAATCCGGTGGCCGCCGCCGCGCTCTACAACTCGATGACCGGTCCCTTCTGCCTGACGCCCGACCAgcaggcgcagcagcagcagcagtccgcccagcagcagcaatccGTTCAGAATCCCCAGCAGAGTGCGCAGCAGACGCAGCAACAGCTGGAGCAGAACGAGGCCCTCAGCCTAGTAGTGACTCCGAAGAAGAAGCGCCATAAGGTCACAGACACGCGCATCACACCACGTACCGTCAGCAGGATTCTGGCCCAGGACGGCGTAGTCCCACCCAACGGCGGATCCTCAAACActccccagcagcagcagcaacagcagcagcaaagtgtccagcagcagcagcagcaacaacagcagcagcaacaggcggCGAACGGCAACAGCAGCGCCACACCCGCCCAGAGTCCGACGCGAGGCAGTGGGGGAGTAGCTCCCTACCATCCAcagccaccgccaccaccaccacccatgATGCCGGTGTCCTTGCCCACTTCGGTGGCTATTCCTAATCCCTCGCTGCACGAGTCCAAGGTCTTCTCGCCGTACTCGCCGTTCTTCAATCCGCACGCAGCCGCCGGCCAGCCCACGGCCGCCCAGTTGcatcagcaccaccagcagcaccacccgCACCACCAGTCCATGCAGCTCTCCTCCAGCCCGCCCGGCAGCTTGGGTGCGCTTATGGATTCGCGCGACTCGCCGCCGCTGCCACACCCGCCGTCGATGCTGCACCCCGCCCTCCTGGCGGCCGCCCACCACGGTGGCTCGCCCGACTACAAGACCTGCCTGCGGGCCGTCATGGACGCCCAGGACCGCCAGTCCGAGTGCAACTCGGCGGACATGCAGTTCGATGGCATGCAGCCTACTATATCCTTttacaaacaacaacaacaacataaaaTCGAATCAGAGCAGTCTGCTCTACTGATGATGGTCAAAAATTGCGAATCCTTGACTCCTTTGCACTCTTCTACATTGACACCGATGCACCTGCGCAAGGCCAAGCTGATGTTCTTCTGGGTGCGCTATCCCAGCTCCGCGGTGCTCAAGATGTACTTCCCGGACATCAAGTTCAACAAGAACAACACAGCACAATTGGTGAAATGGTTCTCAAACTTCCGAGAATTCTACTACATACAAATGGAGAAATATGCACGACAAGCTGTCACCGAAGGCATCAAGACACCCGATGATCTGTTGATTGCTGGAGATAGTGAATTGTATCGCGTGCTCAATTTGCACTACAATCGCAATAACCACATTGAGGTCCCCCAGAACTTCCGCTTCGTGGTCGAACAGACTCTGCGGGAGTTCTTCCGTGCAATACAAAGCGGCAAAGACACCGAGCAGTCGTGGAAGAAGTCGATTTACAAGATCATCTCGCGCATGGACGATCCTGTGCCCGAGTACTTCAAGTCGCCGAATTTTTTAGAGCAACTGGAATAA
Protein Sequence
MSSEEYVADCFGLYSDENNVLLKATGEQPEITAPSTKQQQSPPQQQQQQQLQQPPLNGHTTPTTPIQLTNGSDIFTANSGDTNTNTNTNTNTNSNSNNNTEEQESKREQQQSKQQARQQAHDKEKEKVIDDEGESDDSDDDVVVVLEGCNNSNSNSNSNSSASSSNNNHKGATTATATTTANNCNRGSGSGSGSNISGGLSRSHRSGRSSRQIRQSTAVGKTTTCAAKKSGVTSVPTPAKNTNVVNSIGSNGNSNINSNSNGNGNGNGNVSAKVNRRSRHTSLSKDNNSSSNSNSNCNSNSNCNSNTSQISNTSNTAGFMSSAAAAAAGAAGGGGGALFQPQSASTANTSPSGLGLSQATIPSHSPTSNSPVSGASSASSLLTAAFGNLFGGSSAKMLNELFGRQMKQAQDATSGLPQSLDNAMLAAAMETATSAELLIGSLSSTSKLLQQQQQQLNNNSATTATTTPLSNGTNASISPGSAHSSSHSHHGVSPKGSRRVSACSDRSLDASAADVAGGSPPRAASVSSLNGGASSGEQQQQQQLQHDLVAHHMLRNILQGKKELMQLDQELRTAMQQQQQQQLQDKDQLHSKLNNNNNNNISAAANNNNNNTMESINLIEDSDMADIKIKSEPQTAPAPQQSPHGSSHSSHSGSASGSHSSLASDGSLRRKSSDSLDSHGGHDEADQEEQDPNRQRSQSRAPEDPQLPTKKEAVDDMLDEVELMGLHSRGSDLESLASPSHSDMMLLDNSKDDVLDDDDDDDCVEQKRDPSTCLKKPGMDLKRARVENIVSGMRCSPSSGLAQAGQLQVNGCKKRKLYQPQQHAMERYVAAAAGLNFGLNLQSMMLDQDDSESNELESPQIQQKRVEKNALKSQLRSMQEQLAEMQQKYVQLCTRMEQESECQELDQDHEQDQEPEPEQEQEPDNGSSDHIELSPSPTLTGDDDVSPAHKVEVPGSSSPSPSPLKPKTALGESGDSGANMLSQMMSKMMSGKLHNPLVGVGHPALPQGFPPLLQHMGDMSHAAAMYQQFFFEQEARMAKEAAEQQQQQQQQQHQQQQQQQQQQEQQRRFEQEQQEQQRRKDEQQQLQRQQQHLQHLQQQQLEQQQHAANVAPRQQQMHHPAPARLPTRMGGAAAHSALKSELSEKFQMLRSNNNSSMMRMSGSDLEGLADVLKSEITTSLSALVDTIVTRFVHQRRLFSKQADSVTAAAEQLNKDLLLASQILDRKSPRTKVADRPQNGPTPASQSGNNGSLLLANSQMPAQPSANAQQQLQQAQGSQQQQQQPGSQQPQPSSQQQQQQQNVVQQQPHPLMPPNCQQLIAAPRLNGSQLSFASPAAAAAAAMGLQMHHAAAAAAMSAQQQQQQQQQQQQQQQQQPGDPGMNTNPASGPTNPTNNSLSTLNIPPPHIRPSPTAAAMFQAPKTPQGMNPVAAAALYNSMTGPFCLTPDQQAQQQQQSAQQQQSVQNPQQSAQQTQQQLEQNEALSLVVTPKKKRHKVTDTRITPRTVSRILAQDGVVPPNGGSSNTPQQQQQQQQQSVQQQQQQQQQQQQAANGNSSATPAQSPTRGSGGVAPYHPQPPPPPPPMMPVSLPTSVAIPNPSLHESKVFSPYSPFFNPHAAAGQPTAAQLHQHHQQHHPHHQSMQLSSSPPGSLGALMDSRDSPPLPHPPSMLHPALLAAAHHGGSPDYKTCLRAVMDAQDRQSECNSADMQFDGMQPTISFYKQQQQHKIESEQSALLMMVKNCESLTPLHSSTLTPMHLRKAKLMFFWVRYPSSAVLKMYFPDIKFNKNNTAQLVKWFSNFREFYYIQMEKYARQAVTEGIKTPDDLLIAGDSELYRVLNLHYNRNNHIEVPQNFRFVVEQTLREFFRAIQSGKDTEQSWKKSIYKIISRMDDPVPEYFKSPNFLEQLE
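The relationship between the coding sequence and the protein sequence above can be checked by translating codons. The sketch below assumes the standard genetic code (the record does not state which translation table was used) and treats lowercase bases the same as uppercase ones. The `translate` helper is a hypothetical name for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# ASSUMPTION: the record does not say which translation table applies.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, to_stop: bool = True) -> str:
    """Translate a coding sequence codon by codon.

    Case is ignored, codons containing unexpected characters become 'X',
    and any trailing partial codon is dropped.
    """
    sequence = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(sequence) - len(sequence) % 3, 3):
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


# First seven codons of the coding sequence listed in this record.
print(translate("ATGTCATCAGAGGAGTACGTG"))
```

Applied to the full coding sequence, with its lines joined into one string, the result can be compared directly against the protein sequence listed in the record.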

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_00602499;
90% Identity
iTF_00550172;
80% Identity
-