Dmel015173.1
Basic Information
- Insect
- Drosophila melanogaster
- Gene Symbol
- -
- Assembly
- GCA_000001215.4
- Location
- JANZWZ010000289.1:312270-318905[+]
Transcription Factor Domain
- TF Family
- THAP
- Domain
- THAP domain
- PFAM
- PF05485
- TF Group
- Zinc-Coordinating Group
- Description
- The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [1].
- Hmmscan Out
-
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  19   6.4e-08   0.00016   22.7   0.4        25      86        25      75        18      76  0.83
 2  19   8.8e-05      0.22   12.6   0.5         1      87       103     154       103     154  0.70
 3  19   4.9e-16   1.2e-12   48.7   0.2         1      87       176     248       176     248  0.85
 4  19   2.3e-16   5.6e-13   49.8   5.5         1      87       329     399       329     399  0.82
 5  19   3.1e-08   7.7e-05   23.7   0.1        28      86       430     477       411     478  0.80
 6  19   0.00019      0.48   11.5   1.0        36      87       517     561       495     561  0.70
 7  19   7.2e-11   1.8e-07   32.1   0.1         1      62       601     657       601     673  0.79
 8  19         9   2.2e+04   -3.9   0.3         1      11       699     709       699     715  0.77
 9  19       1.6   4.1e+03   -1.1   0.0         1      12       772     783       772     809  0.62
10  19   1.8e-10   4.3e-07   30.9   0.1        18      86       822     879       813     880  0.83
11  19   2.1e-12   5.2e-09   37.0   2.7         1      85       954    1022       954    1024  0.82
12  19   2.7e-07   0.00066   20.7   0.1         1      87      1047    1116      1047    1116  0.70
13  19   1.8e-08   4.4e-05   24.5   0.1        17      81      1213    1262      1205    1275  0.71
14  19   1.8e-11   4.5e-08   34.1   0.4         1      86      1348    1414      1348    1415  0.78
15  19   1.4e-07   0.00035   21.6   0.0         1      59      1430    1478      1430    1494  0.78
16  19   2.6e-13   6.3e-10   40.0   0.6         1      87      1507    1577      1507    1577  0.83
17  19   2.3e-12   5.8e-09   36.9   0.9         1      87      1633    1703      1633    1703  0.83
18  19   1.2e-11   2.9e-08   34.6   0.2         1      86      1738    1809      1738    1810  0.81
19  19   5.9e-05      0.15   13.2   0.0         1      40      1820    1856      1820    1865  0.85
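The C2CH consensus quoted in the Description (Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His) can be written as a regular expression. The following is a minimal Python sketch of such a scan, not the method used to produce this record (the domains listed here come from an HMM search against PF05485). The function name `find_c2ch` and the toy sequence are hypothetical, introduced only for illustration, and a plain regex is a much cruder screen than a profile HMM.

```python
import re

# Consensus from the Description:
# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His.
# The lookahead lets overlapping candidates be reported as well.
C2CH = re.compile(r"(?=(C.{2,4}C.{35,50}C.{2}H))")


def find_c2ch(protein):
    """Return 1-based inclusive (start, end) spans of candidate C2CH motifs."""
    seq = "".join(protein.split()).upper()  # tolerate line breaks and lower case
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in C2CH.finditer(seq)]


# Hypothetical toy sequence: one motif with spacers of 2, 40 and 2 residues.
toy = "MK" + "C" + "AA" + "C" + "G" * 40 + "C" + "AA" + "H" + "KK"
print(find_c2ch(toy))
```

The same function could be applied to the protein sequence of this record to compare candidate spans with the ali coordinates in the table above. Some disagreement should be expected, since several of the listed hits cover only part of the HMM.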
Sequence Information
- Coding Sequence
- ATGCACGCAGGAGCGATCGCCGGGCCCCAGGCGCACTCCTCAACGCTGGACGACTCCGAGGACGCCCTGTGCTGCGACGAGAAGTACCTCAATCAGTGGCTGCACAACCTCAAGATGTTTCACATACCCGCCGCCAGCTACGCAAATTTCCGCATCTGTAGCATGCATTTTCCGAAGCGCTGCATTAACCGCTACTCTCTGTGCTACTGGGCCGTTCCCACGTTCAACCTGGGCCATGACGACGTGGCCAATTTATACCAAAACAGGGAACTCACTAACACCTTCACCACCGGCGAGGTAGCGCGCTGCAGCATGCCTCACTGTACAAGTCAGCGGGGTGAGAGCAACTTGAAGTTCTACAACTTTCCCAAGGACATCAAAAGCTTGATCAAGTGTCGCCACTTCGAGGAGCGCTGTATTGGCAAGTTCCGTCTGAAGCCGTGGGCGGTGCCTACTTTACACCTAGGTGCCCAATATGGCAAGATCCACGATAACCCAAAGAATTTGTACGTTGAAGAAAAACGCTGCTGCCTCAACTTCTGCCGCCGGAGCCGATCCTCTGACTTCAATATGTCGCTATATCGATTTCCCAGAGATGAAGTTCTCCTGCGACGCTGGTGCTACAATCTCCGCTTGGATCCCGGAGTGTATCGTGGGAAAAATCACAAAATATGCAGCGCCCACTTTATCAAAGAGGCGTTGGGTCTGCGCAAACTATCACCAGGGGCCGTTCCGACGCTTCATCTGGGTCACACTGATACCTTCAACATCTACGAGAACGAATTGTGGCCACCGCCAACGGCACCCAACAGTCACAGCAGTGGCCTCCAGCACCAGACGCAACAACATTCCTCGCAACACTCACTGCAGCAGCAATTGCATAGCAAATCGTACCACCGGCAATCGGCGGCCTCCACGTCCTCCTCCGCCAGTTCGGGTGCCAGTGGATCTTCTGCAATGAATGCCAGCGACAGCATGGACGTATGTTGTGTGCCCAGTTGTGAGAGCAAGAGGCACAATAATGAGAACATTACATTCCACACCATACCACGTCGACCGGAGCAGATGCGCAAGTGGTGCCATAATCTGAAAATACCCGAAGAAAAGATGCACAAGGGTATGAGGATCTGCAGCCTGCACTTCGAGCCCTATTGCATCGGCGGTTGCATGCGTCCATTTGCGGTGCCTACGCTTAACTTGGGTCACGATGACGACGATATTCATAGAAATCCGGATGTGATCAAGAAGTTAAACATCCGGGAAACGTGCTGCGTCGCCGTCAATGTGTCCTTATTGACCAAGTGGTGTGGCAATCTGCAGCGTCCTGTTCCGGATGGAAGTAAACTTTTCAACGACGCCATTTGTGAAGTGCACTTTGAGGAACGATGTCTGCGCAACAAAAGACTAGAGAAGTGGGCAGTGCCGACACTATCGTTAGGCCACGAAAACATCCCATACCCGCTGCCAACGCCGGAACAGGTTACGGAGTTTTACTCTCGACCCACTGCGCCCAATAATGGCGAGGAACAGGGAGAGTGCTGTGTGGAGACGTGCAAGAGAAATCCAAGTGTGGACGACATCAAGCTTTATCGGCCGCCGGAGAAGCTTCCGATCTGTAATCTTCACTTTGAGGCACACTGCATCGGCAAGCGGATGAGACCTTGGGCTATTCCAACACTGAATCTGGCAGGCACAATAGAGAATCTCTACGAGAATCCGGAGCATTCGATGCTGTACAAGCGGCGGACCCACATGAAAGCCAAGCAATCGGCTTCCGTGAAGCCCACTTGGGTGCCCAGGTGCTGTCTTCCGCATTGCCGCAAAGTTCGGGCTCTCCACAACGTTCAGCTTTATCGCTTCCCAAAGCTCAATCGCTCCACTCTGGCTAAGTGGGCGCACAATCTGCAGGTTCCTATGGTTGGCAGTGCCCAGCGTCGTCTATGCTCGGCTCATTTCGAGCCGCACGTGCTCAGCAAAGAAGTGCCCGGTGCCGCTGGC
GGTGCCCACATTGGACTTGAATGCGCCGCCCGGCTTGAAGATTTACCAGAATCCAGCCAAGCTCTCAAGGCTAGCAAGCTGTGTCTGCAGCGCGTGTGCATTGTCGAGAGTTGTCGCAAGACGCGGGCGCAGGGCGTCAGCTCTTCCGACTGCCACATAGTCCAACGCAGCTGCGCAGTGGATGCACAACATCAAAACGCGTCCCAGAGCGGCGATGAGGGCCCAATACCCGCTGGCGCCATTCCCACCCTGGAACTGGGTCATGACGACGAGGACATCTATCCCAACGAAGCGCAGGCCTTTGCGGACGAGCACTGCGTGGTGGAGGGCTGCGAGGCATCAAGGAACAGCCTGACGCAGATTGCATCGGGCCTCAAGCACTTATACAAGTGGGCTATTCCCACCGAGGAACTGGGTCACGACGACGCTGACATCGAGCTAGTGCTAAATCCCAAGCCGGAGGACAGTCAGATGAACAGTTTTCCCAAGGACGCGAATCTCTTCGAGCGATGGAAACACAACTTGCGGTTGGAACACCTCAGCTTCCACGAACGCGATCGGTACAAGATATGCAACTCTCACTTTGAGGATATATGTATTGGGAAGACGCGGCTTAACATAGGTTCGATTCCGACTCTAGAATTGGGTCACGACGAGACAGACGATCTGTTCCAGGTAAATCCGGCGGAGCTGCAGAGCAACACTTTTCGGACGACGGCGGCGAGATTACACGACGAGTCGGGCGGTATAATCATCAAGCAGGAGTTTTCCGAGTCGGAGGACGTCAAAACGGACGTGTCTGATGCCAAAGATTTCAATACGAGACAGGTTAAGCTCAGAAAGACTATGTCCGATCTGAAGTGTTGTGTGCGCAGTTGTGGGCGCAGTCGACTGGAGCACGGAGCGCGCCTCTTTCCATTTCCACCGGCAAGCAGCAGCACCTGGAAGTGGCGCCATAACCTGCGCCTGGAACCCGACGAGGTGGACCGATCGACACGGATTTGCAGTGCGCACTTCAATCGGCGCTGCATTGATGGCAAGCAGCTGAGAAGCTGGGCAATGCCCACGCAACAACTGGGCCACCAGGAGCAGCCGATCTACGAGAATCCGAAAAACATACCAGGATTCTTTACGCCCACCTGTGCTCTGAGTCATTGCCGCAAGCGTAGGAGCATTGACAACGATCTCCGCACCTATCGATATCCGAGGGTGAGGATCTTCTGGAGAAATGGCGGGCGAATCTGCGTCTGGCGCCGGATCAGTGTCGCGGCTGGGATATGTGCTGACCATTTTGAGTCACAGGTGCGTGGAAAGTTGAAGCTGAAAACGGGAGCGGTGCCTACTCTAAATCTGGGCCATGATGAGGGCTTAATATACGACAATGAGGCTATAAAGCTCGGAGATGACACGACTGAAACCCAAAGAGAGCTGATTGATGAAGAGGAAGAAGAACTAGAGGCTGAGGAGGAGCCCCATGAGCACGATATGTACGATGAAGACGAGAAGGACGGCCACTATTTCGATCCTCTCGAACTGGAGGATGAGCGCGACGAAGATGAGGACTTGGACGAGGCGGAGCACTTTCATCCGGACAACCCACCCACTCCCCAACCATCCCTCTGCGTCGCGAAAAGCCCGCGAATAATCACCTTTGGCTTTCCCAAGGATCGCCAACTGCTGCTCAAGTGGTGCGCCAATCTACACCTGAATCCGGATGACTGCATCGGCCGCGTTTGCATAGAGCACTTTCAGCCGGAGGTATTGGGAACCCGAAAGCTGAAGCAAAATGCGGAGCAATTGCAGCCACAGCACTCGGTTTTTCGGCTTTGGAGCCTAAAACACTGCCGCAAAGAGGAAACTGACGGAGCCGCCGGACATCCGCCAAAACAAGTGGAGTGCGGAAGTGCGGAAGATGCAGAGATTGAGGATGGAAATGAAGATGAAGATAGGGAGGGAGATCAAGCTGGAGGTGCAGACGGAGAGGAAATGAAGTCCAAGGA
AAAGACTCCAACGACGAGTCTCGGAAAGATTAAGTTGGAAATATGTTGCATCAACTCCTGTGCGAATGACGACGTTAACCAACTACTTCAGCTGCCTGAGGATCAAAATCTCTTAAGAAAGTGGCAGCATAACCTAAAGTTATCCGTGGACACGGACTTCAAGGAAATCCAAGTGTGTCTAAAGCACTTTGAGGAGCAAGTGGTGCAAAACGGAAAGCCCTTGGAGCAGGCTGTACCCACCTTACAGCTAGATCAAAACAGTTGGAACATCTACAGAAACAGCGGGAATTGCCTGTTTCCAGAGTGCAGTAATTCTTCATCGGACCAGTTAAGCTTTGTTGATTTACCTGGAAATGCGGTCATAAGAGATGCCTGGATGAGTCACCTCAATTTGCCACCCAGCAGTGAGGGTCTTCTTTGTAGTGAGCACTTTATGCAACTCTTTGAACAGGTGGAGTACCCCAAGGTATTGGCTGCACAAGATTTGGAAGACTTGCAGTGGATTGTTGACGAACTTAGATGCGCTGTTCCCAGTTGTTCATCCAAATCTGATGGGGATCTTCAGCTTATCCCTTTTCCGAAAAAGGATGCTACCCTTTTGAAGTGGCTGCAAAACACAAAGATATCTTACGATCATTTAAAGCACAAAAGCTATCGCATATGTGTTCTTCATTTCGAGCCGACTTGCCTAGAGGCAAATTTTCCGAAAGCTTGGGCTATACCCACCTTGCATTTAAACCACGATGACGAGCTTCATTTGAATCTCAGGCCTGAATCTCGCAGTGGTACACCAAACAGCAACTCCAGACTAACTCCATTGAGAATTAAAACAGATCTGACCTCCTTGGGAAGTCCATGCTCGAGTGCAAGTCCTAGTCCTCGAGGCAGGATCAGGATATGTTGTATTTCCACATGTGGACAGATTGGAAGTAGTCAAGTTCGACTCTACCGCTTTCCCACCGAAGAGCAGGCCCTACTCCGGTGGCTGGTGAACACGCAGCAGCAACCTCGCATTGTGGACCCTGCAGAGCTATATGTGTGCCAATCTCACTTTGAACCAGACGCCATTTGCAAAAAACAACTTCGTTGCTGGGCAGAACCCACCTTAAACCTGGGCCACGACGGGTTTGTTATCCCCAATGCCAAGCACAATGGAAACATTGCTGGGGGCCAGGATACTGAGGAGGCGATGAGGCTTATCCGGGAGCGCTATTGCTCCGTACTGACTTGTTTCCAGGCTGAAGCTAGCGGTGTAAGACTTTATGAGTATCCCAAGGATATGCCAACTATACGAAAGTGGGCAGCCGCGTGTAGACATCGCTCCATGCAGGCCAGCAGCCATGGATTCAAGGTATGCCAGTCTCACTTCGCACCGGAATGCTTCGAGCCGGACACATTAAATTTGATTGAGGGATCCGTTCCCACTCTGGAGTTAAGTAGAGGCGACATCGAAAGACACTGCCTAGTGTCTGGATGTGAAAAGGATGCATCTGAAGGACGTCTGCGCTATTACAAGGTGCCAAAAACCACTGCTCAACTGAATGCTTGGAGCAACAACCTGAAGATCAGTTGCCAGGACCTCGGATTGGGGAGCAGCTCATCTGTGAGCGTCACTTTGAGCCCTTTTTGCTTCGGTGCCCACAAAGGGATTACGCCCTGGCGCACTGCCGACTCTCATGCTAGGCCACGACGAAGAGGTGGAGATGTTACCGAACCCAGAAATTCTCTGGCAGAAAAAAGCCGAGGTTTGCTGTGCCACTGCATGTGGTCGAATATGGCAGCCTGGAGACCCTAA
- Protein Sequence
- MHAGAIAGPQAHSSTLDDSEDALCCDEKYLNQWLHNLKMFHIPAASYANFRICSMHFPKRCINRYSLCYWAVPTFNLGHDDVANLYQNRELTNTFTTGEVARCSMPHCTSQRGESNLKFYNFPKDIKSLIKCRHFEERCIGKFRLKPWAVPTLHLGAQYGKIHDNPKNLYVEEKRCCLNFCRRSRSSDFNMSLYRFPRDEVLLRRWCYNLRLDPGVYRGKNHKICSAHFIKEALGLRKLSPGAVPTLHLGHTDTFNIYENELWPPPTAPNSHSSGLQHQTQQHSSQHSLQQQLHSKSYHRQSAASTSSSASSGASGSSAMNASDSMDVCCVPSCESKRHNNENITFHTIPRRPEQMRKWCHNLKIPEEKMHKGMRICSLHFEPYCIGGCMRPFAVPTLNLGHDDDDIHRNPDVIKKLNIRETCCVAVNVSLLTKWCGNLQRPVPDGSKLFNDAICEVHFEERCLRNKRLEKWAVPTLSLGHENIPYPLPTPEQVTEFYSRPTAPNNGEEQGECCVETCKRNPSVDDIKLYRPPEKLPICNLHFEAHCIGKRMRPWAIPTLNLAGTIENLYENPEHSMLYKRRTHMKAKQSASVKPTWVPRCCLPHCRKVRALHNVQLYRFPKLNRSTLAKWAHNLQVPMVGSAQRRLCSAHFEPHVLSKEVPGAAGGAHIGLECAARLEDLPESSQALKASKLCLQRVCIVESCRKTRAQGVSSSDCHIVQRSCAVDAQHQNASQSGDEGPIPAGAIPTLELGHDDEDIYPNEAQAFADEHCVVEGCEASRNSLTQIASGLKHLYKWAIPTEELGHDDADIELVLNPKPEDSQMNSFPKDANLFERWKHNLRLEHLSFHERDRYKICNSHFEDICIGKTRLNIGSIPTLELGHDETDDLFQVNPAELQSNTFRTTAARLHDESGGIIIKQEFSESEDVKTDVSDAKDFNTRQVKLRKTMSDLKCCVRSCGRSRLEHGARLFPFPPASSSTWKWRHNLRLEPDEVDRSTRICSAHFNRRCIDGKQLRSWAMPTQQLGHQEQPIYENPKNIPGFFTPTCALSHCRKRRSIDNDLRTYRYPRVRIFWRNGGRICVWRRISVAAGICADHFESQVRGKLKLKTGAVPTLNLGHDEGLIYDNEAIKLGDDTTETQRELIDEEEEELEAEEEPHEHDMYDEDEKDGHYFDPLELEDERDEDEDLDEAEHFHPDNPPTPQPSLCVAKSPRIITFGFPKDRQLLLKWCANLHLNPDDCIGRVCIEHFQPEVLGTRKLKQNAEQLQPQHSVFRLWSLKHCRKEETDGAAGHPPKQVECGSAEDAEIEDGNEDEDREGDQAGGADGEEMKSKEKTPTTSLGKIKLEICCINSCANDDVNQLLQLPEDQNLLRKWQHNLKLSVDTDFKEIQVCLKHFEEQVVQNGKPLEQAVPTLQLDQNSWNIYRNSGNCLFPECSNSSSDQLSFVDLPGNAVIRDAWMSHLNLPPSSEGLLCSEHFMQLFEQVEYPKVLAAQDLEDLQWIVDELRCAVPSCSSKSDGDLQLIPFPKKDATLLKWLQNTKISYDHLKHKSYRICVLHFEPTCLEANFPKAWAIPTLHLNHDDELHLNLRPESRSGTPNSNSRLTPLRIKTDLTSLGSPCSSASPSPRGRIRICCISTCGQIGSSQVRLYRFPTEEQALLRWLVNTQQQPRIVDPAELYVCQSHFEPDAICKKQLRCWAEPTLNLGHDGFVIPNAKHNGNIAGGQDTEEAMRLIRERYCSVLTCFQAEASGVRLYEYPKDMPTIRKWAAACRHRSMQASSHGFKVCQSHFAPECFEPDTLNLIEGSVPTLELSRGDIERHCLVSGCEKDASEGRLRYYKVPKTTAQLNAWSNNLKISCQDLGLGSSSSVSVTLSPFCFGAHKGITPWRTADSHARPRRRGGDVTEPRNSLAEKSRGLLCHCMWSNMAAWRP
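The two entries above are related by translation with the standard genetic code. The sketch below shows one way to check this in Python, assuming the standard code (NCBI translation table 1) applies to this nuclear gene. The helper name `translate` is hypothetical. Whitespace is stripped first because the coding sequence may be wrapped across several lines.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First seven codons of the coding sequence above.
print(translate("ATGCACGCAGGAGCGATCGCC"))  # MHAGAIA
```

The printed residues match the start of the protein sequence listed above. Translating the full coding sequence and comparing it with the listed protein would confirm that the two entries agree along their whole length.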
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -