Mdol023243.1
Basic Information
- Insect
- Megamerina dolium
- Gene Symbol
- -
- Assembly
- GCA_963854835.1
- Location
- OY978529.1:33213026-33223496[-]
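The Location field packs the contig accession, the coordinate range, and the strand into one string. A minimal sketch of splitting it into parts is shown below. The names `Location` and `parse_location` are hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "contig:start-end[strand]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into contig, start, end, and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("OY978529.1:33213026-33223496[-]")
# Assumption: 1-based inclusive coordinates, so the span is end - start + 1.
span = loc.end - loc.start + 1
print(loc, span)
```

The span is the genomic extent of the locus on the contig, not the length of the coding sequence.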
Transcription Factor Domain
- TF Family
- THAP
- Domain
- THAP domain
- PFAM
- PF05485
- TF Group
- Zinc-Coordinating Group
- Description
- The THAP domain is a putative DNA-binding domain (DBD) that probably also binds a zinc ion. It has a conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features are its position at the N-terminus of proteins, its size of about 90 residues, a C-terminal AVPTIF box, and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably in worms and flies, but not in other eukaryotes or in any prokaryotes [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to  acc
1   15  2.4e-13   3.1e-10   41.6   7.6   1         86      472       544     472       545     0.85
2   15  2.5e-15   3.2e-12   48.0   1.8   1         86      572       640     572       641     0.80
3   15  7.7e-17   9.8e-14   52.8   0.3   1         87      663       735     663       735     0.84
4   15  1.9e-12   2.4e-09   38.7   4.0   1         87      815       884     815       884     0.81
5   15  3.3e-16   4.2e-13   50.8   4.2   1         87      908       980     908       980     0.80
6   15  1.3e-13   1.7e-10   42.4   2.4   1         87      1017      1087    1017      1087    0.80
7   15  9.7e-11   1.2e-07   33.2   5.4   1         86      1125      1194    1125      1196    0.74
8   15  4.5e-15   5.7e-12   47.1   2.2   1         87      1222      1292    1222      1292    0.80
9   15  3.2e-12   4e-09     38.0   3.2   1         86      1313      1386    1313      1387    0.80
10  15  2.5e-14   3.2e-11   44.7   2.9   1         87      1415      1487    1415      1487    0.86
11  15  0.66      8.4e+02   1.7    0.0   1         58      1550      1600    1550      1615    0.68
12  15  6e-14     7.7e-11   43.5   1.3   1         86      1643      1716    1643      1717    0.83
13  15  1.6e-14   2e-11     45.4   1.4   1         85      1740      1809    1740      1811    0.80
14  15  9.8e-12   1.2e-08   36.4   5.8   1         85      1839      1907    1839      1909    0.80
15  15  1.4e-16   1.7e-13   52.0   1.2   1         87      1931      2002    1931      2002    0.82
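Each hmmscan row describes one THAP domain hit as 13 whitespace-separated values, in the column order given by the header. The sketch below reads such rows and keeps only hits whose independent E-value (i-Evalue) falls under a cutoff. The `DomainHit` class, the function names, and the `1e-3` cutoff are illustrative assumptions, not settings taken from this database. Only rows 10 to 12 are embedded, to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int
    i_evalue: float
    score: float
    ali_from: int
    ali_to: int


def parse_domain_rows(text: str) -> list[DomainHit]:
    """Parse rows with 13 whitespace-separated fields; skip anything else."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 13 or not fields[0].isdigit():
            continue  # header or malformed line
        hits.append(
            DomainHit(
                index=int(fields[0]),
                i_evalue=float(fields[3]),
                score=float(fields[4]),
                ali_from=int(fields[8]),
                ali_to=int(fields[9]),
            )
        )
    return hits


def significant(hits: list[DomainHit], max_i_evalue: float = 1e-3) -> list[DomainHit]:
    """Keep hits at or below the (assumed) i-Evalue cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


# Rows 10-12 copied from the table above.
ROWS = """
10 15 2.5e-14 3.2e-11 44.7 2.9 1 87 1415 1487 1415 1487 0.86
11 15 0.66 8.4e+02 1.7 0.0 1 58 1550 1600 1550 1615 0.68
12 15 6e-14 7.7e-11 43.5 1.3 1 86 1643 1716 1643 1717 0.83
"""

hits = parse_domain_rows(ROWS)
kept = significant(hits)
print([h.index for h in kept])
```

Under this cutoff, hit 11 (i-Evalue 8.4e+02, score 1.7) is filtered out while its neighbours are kept.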
Sequence Information
- Coding Sequence
- ATGAATACACCGGCATTCGGGCATATGAATACAATGTTTGGCGGCAATTGCTCGGGCTCCTATAATATTGCAGAAACTTATAATGGCTCCATAAGTCGAACCGCTGATTGTAGTAGTTATGGTACCGGTGGCGGTTACGACCTTGATATGCGTAATTGTGAGCAGTCACCAGCATATAGTGGATACTATCCAAATCATAATCAGCTGACACCATCAGCGACACCTCATATGATAGCGTCTGGAGCACAAAACATAAAAGTGGAACCACCTGATCATATTCAAACGCCGACCATACAAATGGAAGAATTGATTATAAAATCTGAACCACCTGACgaaacatataaatacaattccGTAAATGATTATTGCACGCCATATAGTAATAATACCAGCAGCATGCCAGCAATGATGTCAACTCCACCTCTTGTTCACAATATTAAACAAGAAAATTCATTAGAACCAACAGAAATATATGCACAACAACAAAGGCCTCTAAATTTTCCACGACGTAAAGTACAAACTGAACGTTCAGATACCTTACCAATATGTCAGCGTTGTAAACAGGCGTTCCTAAAGAAGCATTGCTATGTGAAACATGTTGCCCAGAGTGTATGCAATATATTGGAATACGattttaaatgttgtatttGTCCAATGTCCTTTATGTCACACGAAGAATTGCAAATGCATGAACGATTACATgcagcaaataaatttttttgtcacAAATATTGTGGGAAACATTTTGATAATGTCGAATTGTGCGAATCGCACGAGTATGGTCAACATGAATATCCACATTTTCCATGTAAGATTTGTCCGGCAAGCTTTCAACATCGTGAACATTTCTTTGCGCATCTATACCAACATAAACATCAAATACGTTATGATTGTTGTGTATGTAGGCTTTGGTTTCACACACCATACGAATTACACCAGCATCGATCAGGAGCACCATACTTTTGCGGGCGTTTTTATACGTATAAAGAACCGATGAATTATCATCGTTCATTATCTCTAAATAATACACCAAATATGTCTCAACCTATGCGAAGTTATAGCCTACAAGATTGCCATATGGGTGTAATTGAACCTCCGTCTTCCAGCAACTTAGCAGGTTATTCTTCAAACAATTGCGCTGTTCGCAGCAATTACGCGAGTGATAATAACTTTCCAAGTCACAATTTCATTAAAACTGAATTTAAAGCGGAAAAAGAAGATGACTATTATACACCAACCGAATATCCATCACGCGATTACAAAGATTGCAACAATGATGCCTTTAACAATACTCACATTAATTCGCTACAAACTTTCTCCATTGACTATCTCACCGATAATAAGAATTGCTATAAACCATCGGGTGATGTGAATACTCAAATTTCCAAAATCAGTAACGATGAAGATGATGCGTGTTGCGTTCCAAAGTGTGGTGTGTCGAAGAAGACATGCCCAACGttgcaattttttccatttccaaCAGATGATCATTACTTAAATCAATGGTTGCATAACCTTAAAATGCCACGTGATCCAAATTGCGATTATACGAGTAATCGTATTTGTAATTTACATTTTCCTAAACGATGTATGAATCGTTATTCATTATGCTATTGGGCGGTGCCCACATTTAATCTAGGCCATGACGATGTGGCAACTCTGTATCAGAATCGTGAAATTACAAGTGCGTTCACTTGCATGGAAAATGCTCAATGTAGTATGCCGGGTTGTACTAGCCAACGAGGTCGTAGCAATTTGAAATTCTACAATTTTCCCAAAGAcacaaaaatgttgataaaatgGTGTCAAAATGCTCGTTTATCGGTGCAAGGTAAAGAACTACGTCACTTTTGTAGTCTCCATTTTGAAGATCGCTGCTTTGGGAAATTCCGCTTAAAACCATGGGCAGTACCAACATTGGAACTGGGTACATCGTATGGTAAAATACATGATAATCCAAGTGTCATGTATTTGGAGgagaaaaaatgttgtttgcCA
CATTGTCGTCGAACACGTTCGTCACAATATAACCTCTCATTATATCGATTTCCACGTGACGAACGTCTATTGCGGCGTTGGTGCTATAATTTACGTTTGGATCCGTCAATATATCGTggcaaaaatcataaaatttgtagtGCTCATTTTATTAGAGAGGCACTTGGTTTGCGCAAATTGTCACCAGGTGCTGTGCCAACATTAAATTTGGGACACAATGATAGTtttgatatttatgaaaatgaattgaatacaCCACCATCACCTCCACCACCGCCTATGCGATCCCAATTGTTATCAATGGCCCAATCCGTCTTCAAGATGAAAAAGTATAAATTTCATGCATCAAATGCATCAATCAGCAGTCGTACTACTAGCACTACAACATCGCCGACCACACCCGCTTCAACAATGGTTACAAATAATATGGACTTTGCCGATGTTTGCTTCCTGTCGAGTTGTAAATGTACCCGCCAATCGGATAATGTTACATTGCATACTATTCCTCGTCGTCCCGAGCAGCGGAAGAAATGGTGCCATAATCTTAAACTGGATCCAGCGATGCTACACAAAGGTGTACGCATTTGCAGTCTGCACTTTGAGAAGGAATGCATTGGCGGCTGCATGCGACCGTTTGCTGTGCCCACATTAAATTTGGGTCACGATGATCCGAATATCTATAAAAATCCAGATGTTATCAAGAAGCTGAATATACGAGAGACCTGCTGTATACCACAATGCAAAAGAAATCGTGATCGGGATCATGCTAGCCTACATCGTTTTCCACATAATCCCGAGTTATTTGAAAAATGGTTTGCCAATCTGCGCAAACCATTGCCGGATGGCACCAAGCTATTCAATGATGCGATTTGTGAAATACATTTTGAAGATCGCTGTTTACGCAACAAACGTCTAGAGAAATGGGCAATACCAACTATTAATTTGGGCCATACAGAAGAAGTTTTGCATCAATTGCCAAGCGAAGCGGAAATCGCTGAGTTCTGGACAAAGCAAACGTCAACATTGAATACAGGAGACGAAGTTGGTGAATGCTGTGTGGAGACTTGCAAACGTGATCCACGTGTCGATGATATCCGATTGTACCGCTGGCCAGAGGAGTCGGAGCTGCTGGCTAAATGGTTGCACAATATTCAATGTGATCTACCTGGCGATGCATCGAGTTTACGTATATGTAGTTTGCATTTTGAAGCGCACTGTTTTAAGAAACTTCGTCTTCATCATTGGGCTCTGCCTACTCTCAATCTGGCAAAGAATGTCGAGCACCTTTTCAGGAATCCAGAACGTGCATTAAGCGTTAAGAAGGAGAAAGGTGTAAAAGAATATAAGGACAAGGCGAAGAAGTGGTCGCTACGTTGTTGCCTTTCGCATTGTCGTAAAATGCGTTCCGCTGATAATGTACAGCTCTTTCGATTTCCACATCGTAATCGGGGTATGCTCGCAAAGTGGTGTCACAACTTGCAGCTTCCGATGACGGACACTTGCAAGCGTCGCATATGTTCAATGCACTTTGAGGCGCACGCACTCACGAAAAGATGCCCGATGCAGCAGTCAGTGCCGACGCTAGATCTGAATACTCCACCCGGTTATAAAATCTACCAGAATCCAGCACGACTGAAGATCGCCAAACAGCGTCTTGAACGCATTTGCGCAGTGCCCACGTGTCGAAAGACTCGCTCGGACGGCGTAAACCTCTATCGTTTCCCTCAAAATCGTCGACTCTTGCGCAAATGGTGTCACAATACGCTACAAAAGATTAACGAAGCCGTACGAGCACAATATCGTGTGTGCTCCGAACATTTTGAGCACCATTCATTTGGTGCTAGACGTTTGTGTCCCGGCTCCATACCCACGTTGAATTTGAGTCCCGATGTTGTCGAAGTCTATCCAAATGAAGCGCAAAAATGGGACGATAAAGTGTGCGTAGTGAATGGCTGCAATATGATGAGCTCCGTAGTGCGTGCATCAGAAGTGCGTTTCTA
TAAATTTCCATGCGACAATGAAGATCAATTATGGAAGTGGTGCAATAATCTTAAAATGAATCCCATCGACTGTCATGGGGTTCATATATGTAGTCGACATTTCGAGGCGATTTGCTTGAGCTCCAAATTGTTTTTGTACAAGTGGGCGATACCGACATTGCAATTGGGACACGACGATGCCGATATTGAGCTTGTGACAAATCCATCACCAGAAGCACGCTTTACTAGTGATGTTTTTGCTAAGTGCTGCGTGCCGACATGTGGTAAATCGCGTAAATATGACGACGCGCAAATGAACAGCTTTCCGAAAAATGCGAAAATGTTTGAACGCTGGAAACATAACCTCAAACTGGAGTACCTAAACTTTAGggaacgtgaaaaatacaaaatttgcaatgaTCACTTTGAGCAGGTGTGCCTGGGAAAGATGCGGCTCAACTGTGGTGCTATACCAACGATAAATTTGGGCCATGACGATGTTGGCGATCTTTTCAAAGTCAATCCCAAAAAGTTGTTCTATACATTGTTTGGCAAGAATATACCTGTTGCACGAAGAGACTCCGATTCGGACAATGATGACACAATGAGCGAGACTTCACAACGAATTGCTTCCAGCGAGCCGATCGCAGATGATCATGCTGTAGTAAAATGTTCCTATCCTAATTGTACTGCCACAAAGGTTATGTTGCGAGAATCTGTCATTTTACCACTGGATGATAAGTTGCTGGATTTGTGCTCTCAACAAATGTGTGTGGAGAAAGCCAAACTCATGGCGAATCCCAGATTATGTGGTCTAcattatatgtgcatatatgagaTTACAATAGCGCGTACTGACAATGTCATCGATCTGGAAACTTCAGTACATTCTGCGCTCACACAATCCTATCGACGCTGCTGTAATTTGGTGCCTTTGCGTAGTTCAAAGTGCTGCATTAGCGGCTGTGTAACGACGACTCAGCAAACTACACATAAACTTTTTGAATTTCCCAACGATCGCGTGACTTTTCAAAAATGGTTATTCAATACCAGTGTAGAGCTTGTGGTGCCTGAGCGCCGTCGGCATATTTATAAAGTTTGTTCCTTACATTTCGAATCGAATGCTTTAACAAAGGCACATCGTTTACGCCCATGGGCATTGCCAACGTTATTATTGAATCATGTGAAACCGGATGATATATATCGAAATTCTGAAGGAGAATCAGTAAGCGATCGAGAATGCTGTGTGCCAAATTGCTTGCGCCGTAACAAGACTGATGAGGAGGCGGTTAGACTGTTTGATTTTCCCACAGATGAAACGCTTCTAACAAAATGGTTggataaaatgaatttattgcgAGATAGAGCTGGATCCTATAAAATATGTGAACGTCATTTCACAGCGGACTGCTTTGAGAACTCACAACTATGCTCTTGGGCCATACCGACACAATATTTAGAGGAGTTGGCTGACGAAGAGGATATGTTGCCCAATGTGCTACAGGTGAAGGTGAAGAAATCACTCGATCTGATTAAATGTTGTGTGCCGGCATGCCGCAAAAGTCGCTTAAAACATGGCGTGCGTCTATTTGCATTTCCAAACAGTCAACTAATGCTGAAGAAATGGTGTCACAATCTCCAGCTACCCATTAGTATTGCTCAAGATCATCCACGTATATGTAATATGCATTTTCACAAGCGCTGCATTGAGGGTAAATCGCTACAGCCATGGGCAAAACCTACAAAACGTTTGGGTCATCACAAGGCTATCTATGACAATCCGAAAAGTACATCCGGTATTTTCTTGCCCAGTTGTTGTCTATCACATTGTCGTAAGCAACGGACACTGGAAAATAATTTACGCACTTTTGGATTTCCTCGTGATGcagaattgtttaaaaaatggtGTGATAATTTGAAATTGCCGAATCCCAATGAATATGAGAAAGCTCGTATTTGTATTGAGCACTTTGATTCGGATGTGACGGGTAATCGGAAGCTGAAAGCAGGTGCTGTGCCGA
CACTTAATCTGGGTCATGACGAAATACCTCTTCATAATAACGTTAGTCTATTGTCAGCTGTAAATAATATACAGGCAATTGTTGCCAATGACAATACACCATCACCCATGACTGTGGAGAGTCGTGATCCTTTGGGTTTCTCGGATAATGATGATCGCGGTgatcatcataatgatgatgagaacgatgacgatgatgacgacgatcgtgatgagaatgatgacgatgatcgtgatgatgatgaaaatgatgacgatgatcgtgatgaagatgaaaatgatgacgatgatcgtgatgatgagaatgatgacgatgatcgtgatgatgatgagaatgattACGATGatcgtgatgatgatgagaatgatgGCGATGAGCGTGATGATGAGAATGATGGCGATGAGCGTGATGATGATGACCGCGATGTTGATGGGAGTGGTGACGATCGTGATGATGTTGATCATGATAATAccgatgatgatcatgatgatgcaGATGATCACGATGATACTGACGTTGATGATACCGATGATGATCGCAACGATGACAGTGCTTACATGTTATGA
- Protein Sequence
- MNTPAFGHMNTMFGGNCSGSYNIAETYNGSISRTADCSSYGTGGGYDLDMRNCEQSPAYSGYYPNHNQLTPSATPHMIASGAQNIKVEPPDHIQTPTIQMEELIIKSEPPDETYKYNSVNDYCTPYSNNTSSMPAMMSTPPLVHNIKQENSLEPTEIYAQQQRPLNFPRRKVQTERSDTLPICQRCKQAFLKKHCYVKHVAQSVCNILEYDFKCCICPMSFMSHEELQMHERLHAANKFFCHKYCGKHFDNVELCESHEYGQHEYPHFPCKICPASFQHREHFFAHLYQHKHQIRYDCCVCRLWFHTPYELHQHRSGAPYFCGRFYTYKEPMNYHRSLSLNNTPNMSQPMRSYSLQDCHMGVIEPPSSSNLAGYSSNNCAVRSNYASDNNFPSHNFIKTEFKAEKEDDYYTPTEYPSRDYKDCNNDAFNNTHINSLQTFSIDYLTDNKNCYKPSGDVNTQISKISNDEDDACCVPKCGVSKKTCPTLQFFPFPTDDHYLNQWLHNLKMPRDPNCDYTSNRICNLHFPKRCMNRYSLCYWAVPTFNLGHDDVATLYQNREITSAFTCMENAQCSMPGCTSQRGRSNLKFYNFPKDTKMLIKWCQNARLSVQGKELRHFCSLHFEDRCFGKFRLKPWAVPTLELGTSYGKIHDNPSVMYLEEKKCCLPHCRRTRSSQYNLSLYRFPRDERLLRRWCYNLRLDPSIYRGKNHKICSAHFIREALGLRKLSPGAVPTLNLGHNDSFDIYENELNTPPSPPPPPMRSQLLSMAQSVFKMKKYKFHASNASISSRTTSTTTSPTTPASTMVTNNMDFADVCFLSSCKCTRQSDNVTLHTIPRRPEQRKKWCHNLKLDPAMLHKGVRICSLHFEKECIGGCMRPFAVPTLNLGHDDPNIYKNPDVIKKLNIRETCCIPQCKRNRDRDHASLHRFPHNPELFEKWFANLRKPLPDGTKLFNDAICEIHFEDRCLRNKRLEKWAIPTINLGHTEEVLHQLPSEAEIAEFWTKQTSTLNTGDEVGECCVETCKRDPRVDDIRLYRWPEESELLAKWLHNIQCDLPGDASSLRICSLHFEAHCFKKLRLHHWALPTLNLAKNVEHLFRNPERALSVKKEKGVKEYKDKAKKWSLRCCLSHCRKMRSADNVQLFRFPHRNRGMLAKWCHNLQLPMTDTCKRRICSMHFEAHALTKRCPMQQSVPTLDLNTPPGYKIYQNPARLKIAKQRLERICAVPTCRKTRSDGVNLYRFPQNRRLLRKWCHNTLQKINEAVRAQYRVCSEHFEHHSFGARRLCPGSIPTLNLSPDVVEVYPNEAQKWDDKVCVVNGCNMMSSVVRASEVRFYKFPCDNEDQLWKWCNNLKMNPIDCHGVHICSRHFEAICLSSKLFLYKWAIPTLQLGHDDADIELVTNPSPEARFTSDVFAKCCVPTCGKSRKYDDAQMNSFPKNAKMFERWKHNLKLEYLNFREREKYKICNDHFEQVCLGKMRLNCGAIPTINLGHDDVGDLFKVNPKKLFYTLFGKNIPVARRDSDSDNDDTMSETSQRIASSEPIADDHAVVKCSYPNCTATKVMLRESVILPLDDKLLDLCSQQMCVEKAKLMANPRLCGLHYMCIYEITIARTDNVIDLETSVHSALTQSYRRCCNLVPLRSSKCCISGCVTTTQQTTHKLFEFPNDRVTFQKWLFNTSVELVVPERRRHIYKVCSLHFESNALTKAHRLRPWALPTLLLNHVKPDDIYRNSEGESVSDRECCVPNCLRRNKTDEEAVRLFDFPTDETLLTKWLDKMNLLRDRAGSYKICERHFTADCFENSQLCSWAIPTQYLEELADEEDMLPNVLQVKVKKSLDLIKCCVPACRKSRLKHGVRLFAFPNSQLMLKKWCHNLQLPISIAQDHPRICNMHFHKRCIEGKSLQPWAKPTKRLGHHKAIYDNPKSTSGIFLPSCCLSHCRKQRTLENNLRTFGFPRDAELFKKWCDNLKLPNPNEYEKARICIEHFDSDVTGNRKLKAGAV
PTLNLGHDEIPLHNNVSLLSAVNNIQAIVANDNTPSPMTVESRDPLGFSDNDDRGDHHNDDENDDDDDDDRDENDDDDRDDDENDDDDRDEDENDDDDRDDENDDDDRDDDENDYDDRDDDENDGDERDDENDGDERDDDDRDVDGSGDDRDDVDHDNTDDDHDDADDHDDTDVDDTDDDRNDDSAYML
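The C2CH consensus quoted in the domain description (Cys, 2-4 residues, Cys, 35-50 residues, Cys, 2 residues, His) can be written as a regular expression and scanned against a protein sequence. The sketch below is a rough heuristic under that assumption. It is not how the hits in this record were produced (those come from the Pfam HMM via hmmscan), and a spacing-only pattern can both miss real domains and match unrelated regions. The function name `find_c2ch` is hypothetical, and the test string is a short excerpt copied from the protein sequence above.

```python
import re

# Spacing-only rendering of the C2CH consensus from the description.
C2CH_RE = re.compile(r"C.{2,4}C.{35,50}C.{2}H")


def find_c2ch(protein: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in C2CH_RE.finditer(protein.upper())]


# Excerpt copied from the protein sequence in this record.
EXCERPT = "CCVPKCGVSKKTCPTLQFFPFPTDDHYLNQWLHNLKMPRDPNCDYTSNRICNLHFPKRC"

spans = find_c2ch(EXCERPT)
print(spans)
```

Because `finditer` reports non-overlapping matches only, closely spaced or overlapping candidates would need a lookahead-based scan instead.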
Similar Transcription Factors
Clusters of similar sequences, computed with MMseqs2 at each sequence identity threshold
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -