Dasa003695.1
Basic Information
- Insect
- Drosophila asahinai
- Gene Symbol
- -
- Assembly
- GCA_008042795.1
- Location
- VNJZ01006338.1:38338-45144[-]
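The Location field packs the scaffold accession, the coordinate range, and the strand into one string. The following is a minimal Python sketch for splitting it into parts. The names `Location` and `parse_location` are illustrative, and the 1-based, inclusive reading of the coordinates is an assumption that the record itself does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Parsed form of a 'scaffold:start-end[strand]' string (hypothetical helper type)."""
    scaffold: str
    start: int
    end: int
    strand: str

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


# Matches strings shaped like "VNJZ01006338.1:38338-45144[-]".
_LOCATION_RE = re.compile(
    r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into scaffold, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        scaffold=match["scaffold"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("VNJZ01006338.1:38338-45144[-]")
print(loc.scaffold, loc.start, loc.end, loc.strand, loc.length)
```

Under the inclusive-coordinate assumption, the locus spans 6807 bp on the minus strand of scaffold VNJZ01006338.1.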
Transcription Factor Domain
- TF Family
- THAP
- Domain
- THAP domain
- PFAM
- PF05485
- TF Group
- Zinc-Coordinating Group
- Description
- The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
1   16   1.9e-12     4e-09   37.3   0.1         1      86        28      96        28      97  0.83
2   16   1.1e-13   2.2e-10   41.4   2.4         1      87       243     312       243     312  0.80
3   16   1.5e-06    0.0031   18.5   0.3        23      86       389     440       375     445  0.72
4   16   2.7e-13   5.6e-10   40.1   0.9         1      87       493     563       493     563  0.83
5   16   1.1e-10   2.3e-07   31.7   0.1         1      86       598     672       598     673  0.79
6   16   4.7e-12   9.9e-09   36.1   0.0         1      86       683     756       683     757  0.79
7   16     0.018        37    5.4   0.6        50      62       802     817       781     839  0.71
8   16   6.8e-06     0.014   16.3   0.1         1      58       846     896       846     920  0.81
9   16     7e-12   1.5e-08   35.5   0.3         1      87       936    1008       936    1008  0.81
10  16       6.5   1.4e+04   -2.8   0.1        20      45      1014    1035      1012    1043  0.68
11  16   1.2e-15   2.4e-12   47.6   0.6         1      86      1120    1192      1120    1193  0.81
12  16   8.8e-13   1.8e-09   38.4   3.4         1      86      1258    1328      1258    1329  0.80
13  16   1.3e-13   2.6e-10   41.1   3.1         1      86      1421    1491      1421    1492  0.84
14  16   1.6e-11   3.4e-08   34.3   0.1         1      86      1570    1639      1570    1640  0.83
15  16   1.1e-10   2.4e-07   31.6   0.8         1      58      1665    1713      1665    1721  0.84
16  16   2.9e-10   6.1e-07   30.3   0.7        18      87      1731    1789      1721    1789  0.74
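The C2CH consensus quoted in the Description (Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His) can be written as a regular expression for a quick sanity check. The sketch below is only a crude pattern scan, not a substitute for the PF05485 profile search reported under Hmmscan Out. The names `C2CH_RE`, `FRAGMENT`, and `find_c2ch` are illustrative, and `FRAGMENT` is the first 100 residues of this record's protein sequence.

```python
import re

# Consensus from the description:
# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His.
C2CH_RE = re.compile(r"C.{2,4}C.{35,50}C.{2}H")

# First 100 residues of the protein sequence in this record.
FRAGMENT = (
    "MPTQQLGHHDQPIYENPKNIPGFFTPTCALGHCRKRRSIDNDLRTYRYPR"
    "SEDLLEKWRVNLQLAPDQCRGRICANHFEPQVRGKLKLKTGAVPTLELGH"
)


def find_c2ch(protein: str) -> list[tuple[int, int]]:
    """Return (start, end) pairs, 1-based and inclusive, for
    non-overlapping matches of the C2CH consensus pattern."""
    seq = protein.upper()
    # m.start() is 0-based; m.end() is exclusive, so it already equals
    # the 1-based inclusive end position.
    return [(m.start() + 1, m.end()) for m in C2CH_RE.finditer(seq)]


print(find_c2ch(FRAGMENT))
```

On this fragment the pattern matches residues 28-77 (Cys28, Cys33, Cys74, His77), which agrees with the first hmmscan alignment starting at residue 28. Because `finditer` reports non-overlapping matches only, a lookahead-based pattern would be needed to enumerate overlapping candidates.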
Sequence Information
- Coding Sequence
- ATGCCCACCCAACAATTGGGTCACCACGACCAGCCGATATACGAGAACCCAAAGAATATACCTGGGTTTTTCACACCAACCTGTGCCCTGGGACACTGTCGCAAGAGGAGGAGTATTGACAACGATCTGCGTACCTATCGATATCCTAGAAGCGAAGATCTGCTGGAAAAATGGCGAGTTAATTTACAGCTGGCTCCGGATCAGTGCCGTGGTCGGATTTGTGCAAACCATTTCGAGCCGCAGGTGCGGGGAAAGCTAAAGTTAAAGACGGGAGCCGTTCCCACACTAGAACTTGGACATGATGATGGAGTAATCTATGACAATGAGGCTATAAAGACGGGTATGATGGAGGAAGAGGAAGGCCTCACCACAGAATTCCAGCGcctgaaaccaaaaaaagagaTGTTTGAAGTGGTTGAAGAGGACGTTGGAGAGAATGTTGGTGAAGAGCAGCACCCAGATTACCAGGACGAAAATGCAGATGAGGATGACAAGGATGACCAGTATTTTGATCCTCTTGAACTGGTAGAGACCTTTGAACATCGCAGCAATGATGACGCCcaagatgatgatgaagaaGAGGGCCGAGTTCTGGACTCCCCCTCTGGTTACATGGTCAAGAAGGAGATAGAACAGCTTCCAAGCAGCCCACCTTCACCTTCACCTTTACCCCTACGGCACCAAACTCTGCGTCAAGGCAAGCCTGCCAACAATGTAACGCCCATTTGTTGTCTAAAACACTGCAGGAAGGAGCGTACTGCCTTCCACCTGCTGAGCACTTTTGGTTTCCCAAAGGATCGTCAATTGCTGCTGAAGTGGTGTGCCAATCTGCATCTAAACCCGGATGACTGCGCCGGCCGGGTTTGCATCGAACACTTCCAGCCGGAGGTACTCGGTACCCGTAAGCTAAAGCAGAACGCGGTGCCCACTGTTAACGTTGGACATGATGAGCCGCTTAGGTACTCGTGTCATGGCGTGGATCAGAACCTCGAGGACCAAGACCCACAGCCACAGCATTCGGTTTTTCGGCTTTGGAGCCTAAAACACTGCCGCAAAAGGAAGCTAACGGAGCCACCGGATATTGCCCTGGCCAAGAGGAAACCGCTGGGAATGCCGATAATGAAGCGGGAATGGGAGATGGAGAAGTCAAAGAAGATGACTCAAGCGGACAATAACAAAGAGGTGCTGTCCAAGTGGCTGCACAACACCAAGATCCCCTACGATCCTTCTAGGCACCGAGGCTATCGCATCTGCGGGCTGCACTTTGAATCAGAGTACTTAGAAGCGGATTGGCCGCTACAATGGGTTATACCGACGCTCCATCTAAACCAAGAAGATGAGATCTACTTAAATAGTAAGCCCTTGCAAGAGGACCAGCTCTTAATGTTGGCTCCACTGCGGATAAAGACGGATCTAGCCTTGCTGGGTAGTCCGAGTGCAAGTGCAAGCCCCAGTCCTCGCGGCAGGATCCGGATATGTTGCATTCCCACATGTGGTCAGTTTGGAAGCAATCAAGTAAGGCTCTATCGCTTTCCGACCGAGGAGCAGGCGTTGCTTAGGTGGTTGGTGAACACGCAACAGCAGCCAAGACTCGTAGACCCCAAGGACTTATATGTGTGCCAGTCGCATTTTGAGCCTGAGGCCATTTGTAAGAAACAGCTTCGCAACTGGGCAGAGCCCACATTGAACTTGGGGCACGACGGTCACGTGATTCCAAATGCCAAACACAATGGCAACATTTCTGATAGCCAGGATACCGAGCAGGCAATGAAGTTCATTCGCGAACGATTCTGCTCCGTCATTTCTTGTTTTCAGGCAGGAGGACCGGAGGAGGGAGAAGTGAGGCTATATGATTATCCCGAAGATATGGCTACTACTCGAAAGTGGGCAGCCGCTTGCAGACATCGCTCCATGCAGGCCAGGAGCCATGGGTTCAAGGTATGCCAGTTTCACTTCGCTACGGAATGCTTTGACCCAGCTACTGGAAAATTGACTGAG
GGTTCGGTCCCCACACTGGAGTTGACCAGAGATGATATGGAAAGGCAGTGTCTTGTAGCTGGATGTGTAAAGATTGATCCTAATGGGGCCCGTCTTCGCTACTTCAAGATACCAAAGACTGCTGCTCAATTGGAAGCGTGGAGCAACAACCTTAAGATCCATCCCACGGTTCTAATGCAAGGGGAACAGCAGTACATCTGCGAGAAACACTTTGAAGCGTTCTGCTTTGGAGCTAACAAAGGACTGCGTTCTGGTGCCCTTCCAACTCTCTTGCTGGGTCATGATGACGTAGATTTGCTTCCAAATCCGGAAAGTCTCTTCTGCCAGAGCAAGACGGACAGGTGCTGCGTTCCAGGTTGCGGACGTATCTGGCAGGCTGGCGATCCGACCATGGATGAACTGGGAAAGCTTAAGGTCTGCAGTGCTCACTTTGAGAGTTCTGCGATACCCACCGTGGAATTGGGTCATTCTTCTCCGGATATTTACCAAGCGGATTTACCAAGCTTAAAGTCCCAAAAGCGGTCCGTAATGGTCTATTGCTGTTATCCCAAGTGCGAGGAAATCAGTCTTTCCAAGAATCTGTATTATGGGCTTCCACAAGAGAAGCATCTGCGAAGTGCCTGGTTAAGGCACATGAACATTGAAGATCCGAAAGATGGAGCAGTCGCAGAGCTTTGCCCGCTGCACTATGTCATTCTCTACCAGCACAGTGCCAAAAACTATCCTGAGTATCACGATTCAAACCGATTGTTTCTTGATGATAACTACAAGGATGCGCGGAACAACCGGCGCATAAGGATTGTGAGCTGTGTGATCAAGGGCTGCGACATGGTTAAGCCACGGGATGGGATACCATTGCACGGGATGCCGCAGAACCAGGACATCCTGCAGATGTGGATAGATAATGGTCAGTTTGAGTTCTTAGAGCAGCAGCGGTACATGCTTAAGGTGTGTCACAACCACTTTGAGCCATGCTGTTTCTTCGACGATAGACGCTTGCTCTCATGGAGCGTGCCGACTCTGCACCTACCTGGCGAAGCATTTCACCAAAATCCTACCGCCGAACAGTGGCAAAACATGGTCAAAAAACAATCAGCAGCCAAAACAAATGCAGTGGAGAAAGAGGAGTCTGAGCTATATAGGGATGAGGATAGGACGGAGcccattttaaaaatggagCATATTGAATCAGAATATGAAGATAAAAACTCGGAGATGCAGGCCCTAGAAGTCCTCCTGGAAGTTGGGCATGTGGAACGAATGGAGAGCTATGAGAAAGTGGATAAATCACCGGTTATCTATACCGAAAATGCACCCTTCCGATCGTCACCAATACGTGGCCAATACAATGCTAATCAGTGTGCCGTAGAGGGATGCCAGGTGACCGTCGAGGATGTGGACGGCACTATTAAACTGCACAAGTTCCCCGCGTCGCAGGAAGCCGCACAAAAGTGGAAGCACAACACGCAAGTTGACATGGACGAAAAGTTCTGGTGGCGCTACCGTATATGCAGTTACCACTTCGATCAGGAGTGCTTTCAGAGCGCTAGGATTCGAAAGGGCGCGATGCCCACGCTTTTGTTGGGACCTCGGCGACCGGATAAGGTCTACGATAATGAATTTGCACAAccagaggcagaggcggaaGAGTCTTTTTTAGAGCCACCGGGAATTCAGCTGGAGGAAGGATGTACGTCTGTGTCTAGAGTTCGAAAGGAGGTTTCTAGTTTATGCCTGCCGCCACGGGCGCCGCCTCGAAAGTCGAGCAAGTTTTGCCAGATTTATTCTTGTACGAATCACCTGACAACTGAGAACATGACACTCCACAAGTTTCCCCACTCAGAGGACATGTGCCTTAAATGGCAGCACAACACGCAAGTGCCATTCGATCCCTACTACCGCTGGCGTTACCGCATCTGCAGTGCCCATTTCCATCCAGTGTGTTTGGTCAACATGCGTCTAGTCCACGGAAGCGTTCCCACTTTAAAGCTAGGACCCAA
GGCTCCATCCGAACTGTTTGACAACGACTTTAATGCCATCAACCTAAGGCTGGACAAAAGGTTGACGGAATCCAATGCCAATGTGTATATCAAGCATGAAAAGAGGGAAGAGGGTGAAGGCTCGCCAATTCTGCTGGAGCCCGAGCTCCAGTTTCAAGAGGATCAAGACGATAGGATATCAGCATGGAACAGCAAACTGCAATTGCGACCTATAAAGCTGGAGAAAATAAGTTATAGCCAGAAGAAGTTTGGCTCTGATAAGTGTTTGCTGGCTCACTGCCAACGCCAAAGGTTCCAACATGGCGTCCACATTTATAAGTTTCCAAGAGCGAGGCGGCAACAGGAGCGTTGGATGCACAACCTCCGCATCCGCTATGATGAGCGTACACCGTGGAAATTCATGATCTGCAGCGTTCATTTCGAACCTCGCTGCATCAGCTTAAGGAAGCTGCAACCTTGGGCGGTCCCCACACTGGATCTGGGCGACAATGTGCCAGAGAATATCTTTTCGAACGAACAGTGCGAGGAGGATTTGGTGATCGATCGCAGCGAGCTAGAGAGCGACGCCGAGGATGAAGAAGGCTTACAGGAGGACGACGATGATCAAGACGAGGACGATCTGAAGCCGGATGTTGGAATAAAAAGGCGAAGACGTTTCAATAGAGATTCCTCCTGCCCTACCCAGACACCACCCTGGAAAGTCAAACAATGCTGCCTCCCCTATTGCCGTGCCTTCCGAGGCGATGGCATCAAGCTTTTTCGGCTACCGAGCAACCAAAACTCCATTAGCAACTGGGAACTGGCCACAGGAATGGTTTTCAAAGAATCACAACGGAATACGCGATTGATTTGCAGCCGTCACTTTGAGACAGAGCTGATTGGAGTGAGGCGTCTAATGCGTAACGCCATTCCCACAAAGCACTTAAATCCGCATAGCATTGACCAGGTCCGTACTAAAAAGGAGAAGAGTCCTCAAGCCTCTATTATTCCCATCTGCTGCATGGCGGACTGCCACTACAATGGAAATGTGAAGCTGCACAAATTTCCCAGTgATCCCACACTGCTTAAACAGTGGTGCCAGGCTCTTCGGCTCACCGACACGCAGCGGTATTTTGGCAAGCACATTTGTTCGATGCACCTGCCGATGAACGACACACTGAGCTGTGTTATCTGCGGTGGAGATAACATAGAGTTGCCGTTGCTTGGGTTTCCGGAGAACCGCAACCAGCGCGCCAAATGGTGTTATAATCTCAAAATTGAGACAATACCAAAGTGGGACCACTCAAAGCACATATGCTGCCGGCACTTTGAGTCCTATTGCTTTGATAGGCCGGGTGAGCTACGTCCAGGAGCAGCTCCCACGCTCCATCTGAATCACAATGACACAAACATATTCTTCAGCGACTATGCCACTGGTCTTTCGTCCTCGCCAACACGCAATCAAATTAAAGACGAGCCGCTGGAATCGGAGTCTGACGAGATGCTGCTGGTGTAG
- Protein Sequence
- MPTQQLGHHDQPIYENPKNIPGFFTPTCALGHCRKRRSIDNDLRTYRYPRSEDLLEKWRVNLQLAPDQCRGRICANHFEPQVRGKLKLKTGAVPTLELGHDDGVIYDNEAIKTGMMEEEEGLTTEFQRLKPKKEMFEVVEEDVGENVGEEQHPDYQDENADEDDKDDQYFDPLELVETFEHRSNDDAQDDDEEEGRVLDSPSGYMVKKEIEQLPSSPPSPSPLPLRHQTLRQGKPANNVTPICCLKHCRKERTAFHLLSTFGFPKDRQLLLKWCANLHLNPDDCAGRVCIEHFQPEVLGTRKLKQNAVPTVNVGHDEPLRYSCHGVDQNLEDQDPQPQHSVFRLWSLKHCRKRKLTEPPDIALAKRKPLGMPIMKREWEMEKSKKMTQADNNKEVLSKWLHNTKIPYDPSRHRGYRICGLHFESEYLEADWPLQWVIPTLHLNQEDEIYLNSKPLQEDQLLMLAPLRIKTDLALLGSPSASASPSPRGRIRICCIPTCGQFGSNQVRLYRFPTEEQALLRWLVNTQQQPRLVDPKDLYVCQSHFEPEAICKKQLRNWAEPTLNLGHDGHVIPNAKHNGNISDSQDTEQAMKFIRERFCSVISCFQAGGPEEGEVRLYDYPEDMATTRKWAAACRHRSMQARSHGFKVCQFHFATECFDPATGKLTEGSVPTLELTRDDMERQCLVAGCVKIDPNGARLRYFKIPKTAAQLEAWSNNLKIHPTVLMQGEQQYICEKHFEAFCFGANKGLRSGALPTLLLGHDDVDLLPNPESLFCQSKTDRCCVPGCGRIWQAGDPTMDELGKLKVCSAHFESSAIPTVELGHSSPDIYQADLPSLKSQKRSVMVYCCYPKCEEISLSKNLYYGLPQEKHLRSAWLRHMNIEDPKDGAVAELCPLHYVILYQHSAKNYPEYHDSNRLFLDDNYKDARNNRRIRIVSCVIKGCDMVKPRDGIPLHGMPQNQDILQMWIDNGQFEFLEQQRYMLKVCHNHFEPCCFFDDRRLLSWSVPTLHLPGEAFHQNPTAEQWQNMVKKQSAAKTNAVEKEESELYRDEDRTEPILKMEHIESEYEDKNSEMQALEVLLEVGHVERMESYEKVDKSPVIYTENAPFRSSPIRGQYNANQCAVEGCQVTVEDVDGTIKLHKFPASQEAAQKWKHNTQVDMDEKFWWRYRICSYHFDQECFQSARIRKGAMPTLLLGPRRPDKVYDNEFAQPEAEAEESFLEPPGIQLEEGCTSVSRVRKEVSSLCLPPRAPPRKSSKFCQIYSCTNHLTTENMTLHKFPHSEDMCLKWQHNTQVPFDPYYRWRYRICSAHFHPVCLVNMRLVHGSVPTLKLGPKAPSELFDNDFNAINLRLDKRLTESNANVYIKHEKREEGEGSPILLEPELQFQEDQDDRISAWNSKLQLRPIKLEKISYSQKKFGSDKCLLAHCQRQRFQHGVHIYKFPRARRQQERWMHNLRIRYDERTPWKFMICSVHFEPRCISLRKLQPWAVPTLDLGDNVPENIFSNEQCEEDLVIDRSELESDAEDEEGLQEDDDDQDEDDLKPDVGIKRRRRFNRDSSCPTQTPPWKVKQCCLPYCRAFRGDGIKLFRLPSNQNSISNWELATGMVFKESQRNTRLICSRHFETELIGVRRLMRNAIPTKHLNPHSIDQVRTKKEKSPQASIIPICCMADCHYNGNVKLHKFPSDPTLLKQWCQALRLTDTQRYFGKHICSMHLPMNDTLSCVICGGDNIELPLLGFPENRNQRAKWCYNLKIETIPKWDHSKHICCRHFESYCFDRPGELRPGAAPTLHLNHNDTNIFFSDYATGLSSSPTRNQIKDEPLESESDEMLLV
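The Coding Sequence and Protein Sequence fields can be cross-checked by translating one into the other. The following is a minimal sketch that assumes the standard genetic code (NCBI translation table 1) applies to this nuclear gene; the record does not state which table was used. The names `CODON_TABLE` and `translate` are illustrative. Whitespace is removed and lowercase (soft-masked) bases are uppercased before translation, so a line-wrapped sequence can be passed in directly.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon.

    Raises ValueError if the length is not a multiple of three and
    KeyError on codons containing non-ACGT characters such as N.
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the coding sequence in this record.
print(translate("ATGCCCACCCAACAATTGGGTCACCACGAC"))
```

The first 30 bases translate to MPTQQLGHHD, matching the start of the listed protein. Comparing `translate(full_cds)` against the full protein string would extend the same check to the whole entry.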
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00531690;
- 90% Identity
- iTF_00531690;
- 80% Identity
- -