Ucar024242.1
Basic Information
- Insect
- Urophora cardui
- Gene Symbol
- pros
- Assembly
- GCA_960531455.1
- Location
- OY482672.1:142769143-142779486[+]
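The Location value packs the sequence accession, coordinates, and strand into one string. A minimal sketch of splitting it apart follows. The function name `parse_location` and the returned keys are hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

# Pattern for strings shaped like "OY482672.1:142769143-142779486[+]".
LOCATION_RE = re.compile(
    r"(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]"
)

def parse_location(location):
    """Split a location string into accession, coordinates, and strand."""
    match = LOCATION_RE.fullmatch(location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    start = int(match["start"])
    end = int(match["end"])
    return {
        "seqid": match["seqid"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        # Assumption: 1-based inclusive coordinates.
        "span": end - start + 1,
    }

loc = parse_location("OY482672.1:142769143-142779486[+]")
print(loc["seqid"], loc["strand"], loc["span"])
```

Under the inclusive-coordinate assumption, the gene spans 10344 bases on the plus strand.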
Transcription Factor Domain
- TF Family
- HPD
- Domain
- HPD domain
- PFAM
- PF05044
- TF Group
- Helix-turn-helix
- Description
- Prospero is a large Drosophila transcription factor that is expressed in all neural lineages of Drosophila embryos. It is required for the correct expression of several neural proteins and for determining the cell fates of neural stem cells. Homologues of Prospero are found in a wide range of animals, including humans, with the highest similarity in the C-terminal 160 amino acids. This region was originally identified as containing an atypical homeobox domain followed by a Prospero domain. However, the structure shows that these two regions form a single stable structural domain, as defined here [1]. This homeo-prospero domain binds DNA.
- Hmmscan Out
-
  #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to  acc
  1  1   1.9e-28   3e-24     87.0   2.5   1         50      2270      2319    2270      2319    0.99
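The domain row above can be read programmatically by pairing each value with its column. The following is a rough sketch only: the field names are hypothetical labels chosen to mirror the column headers, and the row is assumed to hold exactly the 13 whitespace-separated values shown in this record.

```python
# Hypothetical field names, in the same order as the column headers above.
FIELDS = [
    "domain_index", "domain_count", "c_evalue", "i_evalue", "score", "bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to", "acc",
]
INT_FIELDS = {
    "domain_index", "domain_count",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to",
}

def parse_domain_row(row):
    """Convert one whitespace-separated domain row into a typed dict."""
    values = row.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return {
        name: int(value) if name in INT_FIELDS else float(value)
        for name, value in zip(FIELDS, values)
    }

row = "1 1 1.9e-28 3e-24 87.0 2.5 1 50 2270 2319 2270 2319 0.99"
hit = parse_domain_row(row)

# Length of the aligned region on the protein (inclusive coordinates).
domain_length = hit["ali_to"] - hit["ali_from"] + 1
print(hit["ali_from"], hit["ali_to"], domain_length)
```

For this record the aligned region covers protein residues 2270 to 2319, that is, 50 residues.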
Sequence Information
- Coding Sequence
- ATGATGTCATCAGAGGAGGACAACGATTGTTTTGGTTTGTATagcgatgatgatgatgataagtTACTAGTGAAGGAGGTTGTCAAGCAAACATTAACAGCACTggctacagcaacaacaacaatttctgACACTGCATGCACGCCGCCATTATTGGATTTGGTTAGTGATGATGTTGTTGTAAAGCAAGAAATAGAAACAGAGCTAGAAACAGAGCAAGAAACAGAACTAGAAACAAGCCAGTTACTGCTGCCACACGTTAATAGTAGCACAACTTTACAAAAACAACCACAGCAAACGTTGAGAAATACCAggcacaacagcaacaataataaGAATAACATTAACAAACGGCCTGCAGCTGTGGAGGAGTGTAACGACAAAGAGCGTCGACGAAGCGCTGCCAACAGCAACAGAAATagaagcaaaagcaaaagcaacagCCAAAACGACGACAAAGTTGTAGAAAACGGCAGAGCACACAACAGTATTAACGAAAATCAACACAATCAACGCAAGAATACCAACAACAGCACCATTAACGACGACAACGGCGACGACGTACACGACATCGCTAACGATAACGATAACAAATCTAACGACGACGAGCACGAGGACGACGACGACGTTGTAGTCGTACTCGACGGCATTATTGCTAGCAGCGAAGTCTCCACAAAAACAGCACAAACTCAGAGACAACAGcaaagacaacaacaacaacaacaacaacagtcacGCATACGTAGTCGTAATCGTAATCGCAATCGTAATCGCCATACAAGTTGTAGTCGTAGTCGTAGCTGCAGTCGCGGCAGTAGTCTCAGCAcaagcagcagcagcggcagcgCCAAAGACAGCACACACGGTGCTATTAGCAGTGGTGGTGGTGGCAAGCAGCAGCAAGTCGTTGAACAGCAGCAAGTCAAGCCAGCCACAGAGCCAAGTAACGAAAGTAGTACGTGTGGTGCCGTCAATAAGATAGCAACGTGCACCGGTAACTTACAACAAACGACGCGTTTGTCTACTAATAAAACGACTTTAGTGTTAAATGCAATAGCAAAGTCATCAACAACGTCTTCAACGAAATTgtgcaaaacaaaaaagaatGTATCCTTAACAGgaatagcaataacaacaacagcgaGTGAAATAAATCATGTTGTGAATGCGAAAAACAATCATACAACAACAAATGTAAAtgcaaatttgcaaaaaaatataaacgTTACGAATTCTTCGCTTAATGGCAATTGTGCCACGAATAACGTAAATTCAAATACAAATAATGAGAATGTGAATGTGCTTGTTAATAATAACAGTAATAATAACGTTAGTAGTGTGCTAGAACAAAGTGCAAAAGTGCCAACAGCCTTTACGCTTGCTGGTCAACAACGTCAAGAATTAAATCAACACAATACAAGTTCGATTGCTTTGGCATTAACGCCATCGCCGCCACTGCCGCCCGACGCAAACAACGCTGTACGCACGTCGTTGTCGTCATCGTCATTGTCGTTTTCGTTGTCATCatcaaaagcaacaacaacaccaacagcaacaacaacaacaacaacagcagcagcaacatcaGCAACACCGCCATTAGTGCAAACGCCGTCAACAGCGACTGCCTCCTACATGAGTAGTGCTGCTGCGGCTATAGCGGGCGCAGCTGGTGGAGCAGGAGCTCAACTTTTTGGTAATTTATTAAATTCAAGCAACGTTCAAAACAGcagcaccaacaacaacaacaagaacaacaacacgaACACTGCCAACGGCAATAATCtaaatacaacaacaaacactGCAACCGCATCCACATCCGCATCCGCATCGCCAACATCCGCATCACCAACAAACACAGCCGCCGCAGCCTCACTATTGACGGGCAATTTTACAGCGGCACTGGGCAGCCTCTTTTCGGCGACAAACGGTTTCGATTCAGCAAAAATGTTGAATGAGCTATTTGGACGACAAATGAAGCAGGCGCAAGATGCT
ACCAGCGGCTTACCCGCTAATTTAGACAACGCGATGTTAGCTGCGGCTATGGAATCCGCTACCAGCGCTGAGCTGTTGAGCGCAGCTGGTTTGGTGAATAGTTTGGGCTTGAATGCCAGCAATAAATTGCTCaataataatacaataaatAGTTTAAGTAATAATGGCAGTGCCGGCGGTGCAGCTACGACGGCGAATAACAattcaaataataataatattaatggTGGTGCGGCTGCTGGTGACAATCAGCAAACGAATGCAACTGTCTCAGCAAACGGCTCCGCTGCCGGTTCGGGCTGTGGCTCACCGAAAAATCGGCGCGTTAGCAATTGTAGTGATAGATCAATGGATGAAGTGCAATCGCGTGGCGGTGATAATGCCAGCCCACCGCGTGCCGCCTCTGTCAGCAGTTCGAGCTCGGCTGTAATGGAGAGCAGCTCAACCAGCgtacagcaacagcagcagcaacaacagcaaaatgaGCTGGCTCACCATATGCTACGCAATATATTGCAAGGTAAGAAAGAGTTAATGCAATTGGATCAAGAGCTGCGCTCGGTTATGaaccaacaacagcagcagcaacagcagcagcagctggGCGATAATCAAACTACACTTAAacacaacaataataatacatcAGCCGCAatcaacaataataacaataacaacaacaataatgcTGAAAAGGAGCCAATTAGCGTCATCAATTTGCTAGAGGACACAATGTCcgatataaaaattaaatgcgAACCCGCAACAAACCTAGCAAATTGCGCCGATTCGCTAACGGAAGCGCGTCGCAAATCCATTGATTCGGAAAACGGCGATTCACAGCAAGAAGATGAACAACAATCCGTAGATGAGCGCATGGATCATGCAACGCATATGCGAGATGAATCGGATGAGCAACAACAGCGCTCCTGCTCCTCCTCCCAACTGCCCGTGATTGGCGTTACGAAAAAGGAGGCCGAAGAAATACTCGAAGATGTCGAGCTGATGGGCTTAAATTCACGCTCCGACTTGGATTCGCTCGCCTCACCCAGCCAATCGGAAATGCTGCTTTTGGAGCATAGCAAAGATGAGCTGGAGGAGGAGGATGATATGGAAAAGGATGTGGTGGTTGCCGCCACTGCGTCGATGACAAAGAAGCGCGCACGCGTCGAGAATATTGTAACGACAATGCGTGGCAGTCCATCAACGCAtcaccaccaccaccatcaTCAGTCTCACGCCGCCTTGCTGCAAGTGAACGGCTGCAAGAAGCGCAAGCTCTATCAACCACAACAACATGCCATGGAGCGTTATGTTGCCGCTGCAGCCGGCTTAAATTTCGGCTTGAATTTGCAGAGCATGATGTTGGATGAGGAGGCTTCCAGTGAAATGGAATCACCACAAATACAACAGAAGCGCGTCGAGAAGAATGCGCTCAAGTCGCAATTGCGTTCGATGCAAGAGCAATTGGCTCAGATGCAGCAGAAGTATGTGCAATTGTGCTCGCGCATGGAGCAAGAGTCCGAGTGTCAGGATTTGGATGATACGGCCAGCGATAGCATGGAGCAGGAGGATGATAACGATGATAATGCCGTTGAACTATCATCATCGCCAACGCCATCCGCATCGGGCGCATCGCTAACTGGCGGGCGCATCGGCGTCGGTGGCTGTGGTGTTGGTGCTGATGCGAGTTTAAATGACAAACATCATCATCAATCGGTTGATGGCGAGCGACTGAGCAGCAGCTGCACTTCACCGCCGCCGTCTCTATCCACCACAAACTGCCAATTGCCTTTGAAGGCAACGAAATCACATTTAAGCTCACCGCccagcgccgccgccgccatgaCGCTCGACAGCGCGCCGAATGTGCTCTCCCAAATGATGAGCAAAATGATGTCATCGCGTTCGTTGGTGGCTGGCGCACATCCACATATGCCGCAAACGTTCAATGGTCCCTTGCCGCTGCTGCCGCATATGCCACAATTGCCGGGCGATGC
GAATCTGACGCATCCAGCGGCGATTAGCAATGCGGCCGCCATGTATTTGGGACAACAGTTTTTCTTTGAGCAGGAGGCGCGCATGGCCAAAGAGGCTGCCGAACAGCAAGAGCGTCAACAGCAACAAGCGCAACAAGCTCAGCAGCAAATGCAGCAAgcgcaacagcaacaacaacaacaacaggcGGCGCAACAGCAGGAGCATGAACAACAACAACGCCGCTACGAGCAGGAGCGCCGTAAAGAGGAAAAGCAACAACAGGCAGCTGCGGCAGCAGCTGCAGCAGCGCAACAATTACAGCGACAACAACAGCAATTGCAACAGCACatccagcaacaacaacagcaacagcattTGGAGCAAACAGCAACGCCGCCTGGTGCGGCGCTTTCGTCGCTGCCTCACCAGCCGATACGCTCGCAATTGCATCACAATCGTTTGCATCCACGTCACAGCGCCACACATTCCTCCTCCGCCTCCTCCGCGCTCAAATCGGAGCTTTCGGAAAAATTCAATATGTTACGCTCCAGCTCGAATTCGATTATGCGCATGTCCGGTGCGGATTTGGAAGGTTTAGCCGATGTGCTCAAGTCTGAAATTACCACTTCGCTATCGGCGCTGGTGGATACGATCGTTACGCGCTTCGTACATCAACGTCGCCTGTTTAGCAAGCAGTCCGACTCGGTGGCCGCAGCCGCCGAGCAACTCAATAAGGACTTGTTGATGGCCTCGCAGATACTCGATCGCAAATCGCCGCGCACAAAACTCGCCGCCGAAcgccagcaacaacaacaacaacaacaagcgaGTGGCGGCAATACGAGCGGcagcaataacaataataacagtTCGGCATCAGGCGGTAGCGCGGCTGCACAAGCTGCAGCAGTTCAATCAGGTAATAATGGTTCTCTCCTTCTagctaataataataatagtaacaataacaataccTCCTTAACTAACCCATTAAATAACATTCACAATACAAATCATACTAGTCATTTGTTGGGTCACCACCGCCATCACTTGCAGTCAAATGTAAGTAGCGTAAGTAGTAGTAATGGCCCACCTCCACCCCAAATGCTTGTCGGCTGTAATGCCTCCGCCTTGGCTGTTGGCGGTGTCGTTAATGTACCTCATCCACATACGCTTAATCAAGCAAGTCAAGCGTGCGTTAATGGTGGTCTGCCACCACAATCGCAACAGCCAGCAAATGCTAATGtagtacaacaacaacaaatgacACAACAACAAATGTCAGCGTTGAATGCGGCAGcgcaaaattgccaaaatttaATTGCTGCGCCGCGTTTGAATGGCAATCAGCTGTCGTTTCCATCGCCGGCTGCGGCCGCAGCTGCGGCAATGGGCCTGCAAATGCATCATGCCGCCGCTGCAGCTGCAATGTCCGCAGCCGCTGCTGCTAGCCAACAACAGcatcagcaacagcagcagcagcaacaagcGCAAGCGCAAgcgcaacaacagcaacagcaaatgACAAACGACAATTCTAGCCAACAACAACTTTCAATGACTAATGCGAGTTTAGCTATGAACTCGAATTTGAATactaataatactaataataacaATAGTTTAAGTACAATTAATATACCACCACCTCATATACGTCCTTCGCCCACAGCGGCAGCACTATTTCAGGCACCGAAGACACCGCAGGGTATCAATCCCGTCGCCGCTGCTGCGCTCTACAATTCAATGACCGGCGGTGGGCCGAATCAGCTGAATCCCTTCTGCTTGCCGGATGCGCGTGAGGCGGCAgcccagcaacaacaacagcagcagcagcaggcGCAGCAGcaagcacaacaacaacaacaacaagcgcaACAACAGGcgctgcaacagcaacaacaactggAACAAAACGAGGCGCTCAGCTTGGTTGTAACGCCAAAGAAGAAGCGTCATAAGGTAACCGATACACGCATTACGCCACGCACCGTTAGTCGCATTTTGGCGCAGGATGGCG
TGGTGCCACCAAATgcaaatcaacaacaacagcaatcgCAAACAAATGCGACTACAACGCAATCAAATGCAGGCAATAGTGGCGCCGGCGGCACACCGCTAACCACACCAGCGCAAAGCCCCTCACCATCGCGCCTACAAGCGGTGGGGCcaagcgccgccgccgccgccgctgctGCCTATcatcagcaacagcaacaggCACAATCGCAACCGCCGCATGCATCACCACATGCGCCAtcacagcaacagcaacaaccaccaccaccaccacccaTGTTGCCCGTTTCGTTGCCCACATCTGTGGCCATACCGAATCCATCGTTGCACGAATCAAAAGTCTTCTCACCTTACAGTCCATTCTTCAATCCACatccgccgccgccgccgcatgcgccaccaccgccaccaccaccGCATCCACACGCACACCCACAGGCGCATGGTGGTCCGCATGGTCCGCATGGCGGCGCTGGCGTTGGCGGTGGCCACAATACGCCAACCGCCGCCCAAATGCATCACATGAAAATGTCGATGAGTCCAACCGGTTTGGGTGGTTTGCTGGATTCACGCGAATCACCACCGCTGCCACATCCGCCCACAATGTTGCATCCCGCTTTGTTGGCGGCTGCCCATCATGGCGGTTCGCCCGATTATAGTGCGCATTTGCGTGCGGCCATGGATGCGCAAGATCGTAATTCGGATTGTAATTCGGCGGATATGCAATTTGATGGTATGCAACCGACAATATCGTTCCTCAAGCAGCAaatgatCAAAAATAGTGATTCGTTATCGCCATTACACTCATCAACGCTAACACCGATGCACCTGCGCAAAGCGAAATTGATGTTCTTTTGGGTGCGTTATCCCAGTTCGGCGGTATTAAAAATGTACTTCCCAGACATTAAATTCAATAAGAATAATACAGCACAATTGGTGAAATGGTTCTCGAATTTCCGGTaa
- Protein Sequence
- MMSSEEDNDCFGLYSDDDDDKLLVKEVVKQTLTALATATTTISDTACTPPLLDLVSDDVVVKQEIETELETEQETELETSQLLLPHVNSSTTLQKQPQQTLRNTRHNSNNNKNNINKRPAAVEECNDKERRRSAANSNRNRSKSKSNSQNDDKVVENGRAHNSINENQHNQRKNTNNSTINDDNGDDVHDIANDNDNKSNDDEHEDDDDVVVVLDGIIASSEVSTKTAQTQRQQQRQQQQQQQQSRIRSRNRNRNRNRHTSCSRSRSCSRGSSLSTSSSSGSAKDSTHGAISSGGGGKQQQVVEQQQVKPATEPSNESSTCGAVNKIATCTGNLQQTTRLSTNKTTLVLNAIAKSSTTSSTKLCKTKKNVSLTGIAITTTASEINHVVNAKNNHTTTNVNANLQKNINVTNSSLNGNCATNNVNSNTNNENVNVLVNNNSNNNVSSVLEQSAKVPTAFTLAGQQRQELNQHNTSSIALALTPSPPLPPDANNAVRTSLSSSSLSFSLSSSKATTTPTATTTTTTAAATSATPPLVQTPSTATASYMSSAAAAIAGAAGGAGAQLFGNLLNSSNVQNSSTNNNNKNNNTNTANGNNLNTTTNTATASTSASASPTSASPTNTAAAASLLTGNFTAALGSLFSATNGFDSAKMLNELFGRQMKQAQDATSGLPANLDNAMLAAAMESATSAELLSAAGLVNSLGLNASNKLLNNNTINSLSNNGSAGGAATTANNNSNNNNINGGAAAGDNQQTNATVSANGSAAGSGCGSPKNRRVSNCSDRSMDEVQSRGGDNASPPRAASVSSSSSAVMESSSTSVQQQQQQQQQNELAHHMLRNILQGKKELMQLDQELRSVMNQQQQQQQQQQLGDNQTTLKHNNNNTSAAINNNNNNNNNNAEKEPISVINLLEDTMSDIKIKCEPATNLANCADSLTEARRKSIDSENGDSQQEDEQQSVDERMDHATHMRDESDEQQQRSCSSSQLPVIGVTKKEAEEILEDVELMGLNSRSDLDSLASPSQSEMLLLEHSKDELEEEDDMEKDVVVAATASMTKKRARVENIVTTMRGSPSTHHHHHHHQSHAALLQVNGCKKRKLYQPQQHAMERYVAAAAGLNFGLNLQSMMLDEEASSEMESPQIQQKRVEKNALKSQLRSMQEQLAQMQQKYVQLCSRMEQESECQDLDDTASDSMEQEDDNDDNAVELSSSPTPSASGASLTGGRIGVGGCGVGADASLNDKHHHQSVDGERLSSSCTSPPPSLSTTNCQLPLKATKSHLSSPPSAAAAMTLDSAPNVLSQMMSKMMSSRSLVAGAHPHMPQTFNGPLPLLPHMPQLPGDANLTHPAAISNAAAMYLGQQFFFEQEARMAKEAAEQQERQQQQAQQAQQQMQQAQQQQQQQQAAQQQEHEQQQRRYEQERRKEEKQQQAAAAAAAAAQQLQRQQQQLQQHIQQQQQQQHLEQTATPPGAALSSLPHQPIRSQLHHNRLHPRHSATHSSSASSALKSELSEKFNMLRSSSNSIMRMSGADLEGLADVLKSEITTSLSALVDTIVTRFVHQRRLFSKQSDSVAAAAEQLNKDLLMASQILDRKSPRTKLAAERQQQQQQQQASGGNTSGSNNNNNSSASGGSAAAQAAAVQSGNNGSLLLANNNNSNNNNTSLTNPLNNIHNTNHTSHLLGHHRHHLQSNVSSVSSSNGPPPPQMLVGCNASALAVGGVVNVPHPHTLNQASQACVNGGLPPQSQQPANANVVQQQQMTQQQMSALNAAAQNCQNLIAAPRLNGNQLSFPSPAAAAAAAMGLQMHHAAAAAAMSAAAAASQQQHQQQQQQQQAQAQAQQQQQQMTNDNSSQQQLSMTNASLAMNSNLNTNNTNNNNSLSTINIPPPHIRPSPTAAALFQAPKTPQGINPVAAAALYNSMTGGGPNQLNPFCLPDAREAAAQQQQQQQQQAQQQAQQQQQQAQQQALQQQQQLEQNEALSLVVTPKKKRHKVTDTRITPRTVSRILAQD
GVVPPNANQQQQQSQTNATTTQSNAGNSGAGGTPLTTPAQSPSPSRLQAVGPSAAAAAAAAYHQQQQQAQSQPPHASPHAPSQQQQQPPPPPPMLPVSLPTSVAIPNPSLHESKVFSPYSPFFNPHPPPPPHAPPPPPPPHPHAHPQAHGGPHGPHGGAGVGGGHNTPTAAQMHHMKMSMSPTGLGGLLDSRESPPLPHPPTMLHPALLAAAHHGGSPDYSAHLRAAMDAQDRNSDCNSADMQFDGMQPTISFLKQQMIKNSDSLSPLHSSTLTPMHLRKAKLMFFWVRYPSSAVLKMYFPDIKFNKNNTAQLVKWFSNFR
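The relationship between the coding sequence and the protein sequence can be checked by translating codons with the standard genetic code. The sketch below is illustrative, not the database's own pipeline: it assumes the standard code applies, upper-cases the input because the coding sequence mixes upper- and lower-case letters, and uses a hypothetical `translate` helper. It is demonstrated on the first and last few codons of the coding sequence rather than the full sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a coding sequence; '*' marks a stop, 'X' an unknown codon."""
    seq = cds.upper()  # the record mixes upper- and lower-case bases
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )

# First seven codons of the coding sequence.
print(translate("ATGATGTCATCAGAGGAGGAC"))  # MMSSEED
# Last six codons, including the terminal stop codon.
print(translate("TTCTCGAATTTCCGGTaa"))     # FSNFR*
```

Both fragments agree with the start (MMSSEED...) and end (...FSNFR) of the protein sequence listed above.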
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -