Ccos015554.1
Basic Information
- Insect
- Chymomyza costata
- Gene Symbol
- -
- Assembly
- GCA_018150985.1
- Location
- JAECWU010000499.1:675603-685544[-]
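The Location field packs a contig accession, a coordinate range, and a strand into one string. The following is a minimal sketch of how that string could be split into its parts. The `Location` type and `parse_location` function are hypothetical names introduced here, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for the parts of a Location string."""
    contig: str
    start: int
    end: int
    strand: str


# contig ":" start "-" end "[" strand "]"
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a 'contig:start-end[strand]' string into typed fields."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("JAECWU010000499.1:675603-685544[-]")

# Assumption: coordinates are 1-based and inclusive on both ends.
span_bp = loc.end - loc.start + 1
print(loc, span_bp)
```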
Transcription Factor Domain
- TF Family
- THAP
- Domain
- THAP domain
- PFAM
- PF05485
- TF Group
- Zinc-Coordinating Group
- Description
- The THAP domain is a putative DNA-binding domain (DBD) that probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably in worms and flies, but not in other eukaryotes or in any prokaryotes [1].
- Hmmscan Out
-
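| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 16 | 1.1e-15 | 1.1e-12 | 48.6 | 5.3 | 1 | 86 | 583 | 655 | 583 | 656 | 0.85 |
| 2 | 16 | 3.1e-15 | 3e-12 | 47.2 | 4.8 | 1 | 87 | 683 | 752 | 683 | 752 | 0.83 |
| 3 | 16 | 1.5e-15 | 1.5e-12 | 48.2 | 0.2 | 1 | 87 | 774 | 846 | 774 | 846 | 0.85 |
| 4 | 16 | 1e-16 | 1e-13 | 51.9 | 4.6 | 1 | 86 | 926 | 995 | 926 | 996 | 0.80 |
| 5 | 16 | 1.2e-14 | 1.1e-11 | 45.3 | 4.5 | 1 | 86 | 1020 | 1091 | 1020 | 1092 | 0.81 |
| 6 | 16 | 1.6e-13 | 1.6e-10 | 41.7 | 0.7 | 1 | 87 | 1127 | 1195 | 1127 | 1195 | 0.80 |
| 7 | 16 | 9.4e-11 | 9.2e-08 | 32.8 | 1.6 | 1 | 86 | 1236 | 1305 | 1236 | 1306 | 0.75 |
| 8 | 16 | 1.1e-17 | 1e-14 | 55.1 | 0.5 | 1 | 86 | 1333 | 1402 | 1333 | 1403 | 0.82 |
| 9 | 16 | 6e-13 | 5.9e-10 | 39.8 | 1.4 | 1 | 86 | 1424 | 1493 | 1424 | 1494 | 0.81 |
| 10 | 16 | 2.2e-15 | 2.2e-12 | 47.6 | 1.5 | 1 | 86 | 1521 | 1592 | 1521 | 1593 | 0.85 |
| 11 | 16 | 9.1e-15 | 8.9e-12 | 45.7 | 3.9 | 1 | 87 | 1656 | 1726 | 1656 | 1726 | 0.83 |
| 12 | 16 | 2e-12 | 2e-09 | 38.1 | 0.1 | 1 | 86 | 1748 | 1816 | 1748 | 1817 | 0.81 |
| 13 | 16 | 8.6e-14 | 8.4e-11 | 42.5 | 2.2 | 1 | 87 | 1922 | 1991 | 1922 | 1991 | 0.81 |
| 14 | 16 | 3.4e-10 | 3.4e-07 | 31.0 | 1.7 | 1 | 85 | 2052 | 2115 | 2052 | 2117 | 0.81 |
| 15 | 16 | 0.00025 | 0.25 | 12.2 | 0.9 | 1 | 59 | 2147 | 2199 | 2147 | 2222 | 0.75 |
| 16 | 16 | 5.4e-13 | 5.3e-10 | 40.0 | 1.0 | 1 | 86 | 2237 | 2306 | 2237 | 2307 | 0.84 |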
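The 16 domain rows in the Hmmscan Out field each carry 13 whitespace-separated values, so they can be loaded into typed records and filtered. The sketch below copies the rows from this record. The `DomainHit` type, the `parse_row` helper, and the i-Evalue cutoff of 0.01 are assumptions made for illustration, not settings documented by this database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one per-domain hmmscan row."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


# Rows copied from the Hmmscan Out field of this record.
ROWS = """\
1 16 1.1e-15 1.1e-12 48.6 5.3 1 86 583 655 583 656 0.85
2 16 3.1e-15 3e-12 47.2 4.8 1 87 683 752 683 752 0.83
3 16 1.5e-15 1.5e-12 48.2 0.2 1 87 774 846 774 846 0.85
4 16 1e-16 1e-13 51.9 4.6 1 86 926 995 926 996 0.80
5 16 1.2e-14 1.1e-11 45.3 4.5 1 86 1020 1091 1020 1092 0.81
6 16 1.6e-13 1.6e-10 41.7 0.7 1 87 1127 1195 1127 1195 0.80
7 16 9.4e-11 9.2e-08 32.8 1.6 1 86 1236 1305 1236 1306 0.75
8 16 1.1e-17 1e-14 55.1 0.5 1 86 1333 1402 1333 1403 0.82
9 16 6e-13 5.9e-10 39.8 1.4 1 86 1424 1493 1424 1494 0.81
10 16 2.2e-15 2.2e-12 47.6 1.5 1 86 1521 1592 1521 1593 0.85
11 16 9.1e-15 8.9e-12 45.7 3.9 1 87 1656 1726 1656 1726 0.83
12 16 2e-12 2e-09 38.1 0.1 1 86 1748 1816 1748 1817 0.81
13 16 8.6e-14 8.4e-11 42.5 2.2 1 87 1922 1991 1922 1991 0.81
14 16 3.4e-10 3.4e-07 31.0 1.7 1 85 2052 2115 2052 2117 0.81
15 16 0.00025 0.25 12.2 0.9 1 59 2147 2199 2147 2222 0.75
16 16 5.4e-13 5.3e-10 40.0 1.0 1 86 2237 2306 2237 2307 0.84
"""


def parse_row(line: str) -> DomainHit:
    """Convert one 13-column row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 columns, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *map(int, f[6:12]),
        float(f[12]),
    )


hits = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Assumption: 0.01 is an illustrative i-Evalue cutoff, not the database's.
I_EVALUE_CUTOFF = 0.01
confident = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]
best = max(hits, key=lambda h: h.score)

print(len(hits), len(confident), best.index, best.score)
```

Under that assumed cutoff, only domain 15 (i-Evalue 0.25, with an alignment covering HMM positions 1 to 59) is filtered out.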
Sequence Information
- Coding Sequence
- ATGTCACAACAACACCACCCGCACAATCACTATCATCAACAACAACAAGCACAACAACAACAACACCATCATCAACATCTTGCGCATCAGCAAAATCAACACAAACTACAACATAAACAAATACAGCACAGTTGGTACTCACATGTTGCTTCCTACCCACCGCACGGAACAGCCTTTTCGGCATCCCCGACCTGTAAGAGCAATGTTAATATGAACGCATATGGAGCTGCGTCCAACACGCATGCATATTACGGTGGGAATAGTATAGGGGGCAGTGGCATCCCTGGCGGGGTCGGTAGCATGGGCGGCGTCAATGTCAATGCTGACGGGCATTCCATGGCCTATAACCCTGATGCCCCGCAGGTAAATACTGTTGCCTATGCANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGACCAACTTATCCTCCTTACATTAAAAGCGAGCCCATGGAACTACCCACTGAAAGACTAAGGCATCAGCAGCATTTTCAAACGCCCATACCGATGGCACCGCCGCCAGCGCCCGCCACTCGCTTGGATGCCAGCGGCGGCGGCGTCGGAAATGATATGATAATAAAATCGGAACCAATGGATGAACATGCTTTTAAATCAAGCTATATCGATGACAACACACCATTTGCAGATTTTAGCAAGTATCCAGAGTTCAATCAGGATATGTTGAACCCCAAAGTAGAGCTAACCGTTAAAGACGATGTTTTTAGCAATTCAAGTCAAAAGCACGCACTAAATTTTCCACGCCGTAAAATGGAAACAGAACGCTCGGAAAGCCTTCCACCCATTTGCCAGCGCTGCAAGGAAGTTTTCTTCAAAAAACAAATCTATCTGCGTCACGTAGCTGAGAGTAGCTGCGCCATAAACGAATACGATTTCAAGTGTGGCATTTGTCCAATGTCCTTTATGAGCGGTGAAGAATTAAAAAAACACAAGCAACTACATAGATTCAACAAATTCTTTTGTCACAAGTACTGCGGAAAGAATTTTGACACAATTGAGGAGTGCGAATCACATGAGTATATGCAGCATGAGTACGAAACCTTTGTGTGTCACATGTGCTCGGGTAACTTTCCAACGCGTGATCAACTGTACGCACATTTGCCTCAGCATAAATTTCAGTCACGCTACGATTGTCCTGTATGCCGTTTGTGGTATCAAACACCGCTGGAACTGCACGAACATCGTCTGGCTGCACCTTATTTTTGTGGCAAGTACTATACCAGCGTTGGCACAAATCACACTATGGCACAGTCAAATCTTCCTGCTCAGCACGCGCAACAACAACAGCAACAACAAACAAACTACAAGCTACAGGATTGCCATATGGGCACCATTGAAATGCCTTCACCACATCACAAATCTTCTACAGCGGCGGCTAGTACGCTCCCGGCAACGGCTGCACTTAGTTCCTTGCTACAACAACGTCAAGCCAATGCGGACAGTACAGCTCTGTTTGGACCGCTCAAAACTGATGTCAAGTTAGAGCGCAGCTATAGCAATTCGACAAGCGAGTCATCCTATAATAGCATGCAGGATAGCAATTATAATAATGCTTTTGGCAGCGATGCATCGCTGCTAGGTGGCGGTCAGTCAGCACACTCGTCCACTCTAGATGACTCGGATGATGCTCTGTGTTGCGTGCCAAAGTGCGGAGTGCGTAAAAGCACTAGTCATACACTGCAATTCTTTACTTTTCCGAAGGATGAAAAGTATTTGCATCAGTGGTTGCACAATCTCAAAATGTTTCACATTCCGGCCACTACATATATGAGCTACCGCATTTGTAGCATGCATTTTCCCAAACGCTGCATTAATCGCTATTCATTGTGTTATTGGGCCGTGCCCACGTTTAATCTGGGCCATGACGATGTAGCCAACTTGTAT
CAGAATCGCGAGCTCACCAATACCTTTACAGCTGGCGAGGTGGCACGCTGCAGCATGCCCAACTGCAATAGCCAAAGGGGTGAAAGCAATTTAAAATTTTATAATTTTCCCAAGGATATCAAGAGTCTGATCAAATGGTGTCAGAATGCTCGATTGCCTGTGCAGGCAAAGGAGCCGCGCCACTTTTGCAGCCGTCACTTTGAAGAGCGGTGCATTGGAAAATTTCGACTGAAACCATGGGCGGTGCCCACTCTTCATTTGGGAACACAATACGGCAAAATTCATGATAATCCCAAGAATCTTTATGTGGAGGAGAAGCGCTGCTGTTTAGCCTTCTGTCGACGCAGTCGTTCATCAGATTTTAATATGTCTTTGTATCGGTTTCCCCGAGATGAAGTGCTTTTAAGACGGTGGTGCTATAATTTGCGTTTGGATCCAGGTGTCTATCGCGGCAAGAACCACAAAATTTGCAGCGCACATTTTATTAAGGAGGCGCTTGGTTTACGAAAACTTTCGCCGGGTGCTGTGCCAACTTTGCATTTGGGACACAATGACAACTTTAATATTTACGAGAACGAATTGTGGCCCCCACCAACACCAACTGGGACATCTAATTCTCACAATCAACAGCCTCGGAGCTATCAACGCCACTCTGTGGCTTCCACTTCATCGTCAACTAGCTCTTCATCACTGTATATCGAACAGGAAATGAATGCATCCTATCATGGCATGTCAACTTCTTCATCCTCACTCAACGTGACCGAATGCATGGATGTCTGCTGTGTGCCCGGTTGCGAAAGTAAACGTCACAATAATGAAAACATCACTTTCCATACAATACCACGCCGCCCGGAACAGATGAGCAAGTGGTGCCATAATCTAAAAATACCCGAAGAGAAAATGCACAAGGGCATGCGCATTTGCAGTCGTCACTTCGAATCCTACTGCATTGGCGGCTGTATGCGTCCCTTTGCCGTGCCCACACTTTATCTGGGCCACGATGACGATGATATATACCGAAATCCTGATAAGATCAAAAAGCTTAACATACGTGAGACATGCTGTGTTCAGGTTTGCAAAAGGAATCGAGATCGCGATCATGCCAATCTGCATCGCTTTCCTTCAAATCCAACGCTACTCGCTAAGTGGTGCGCCAATCTACATAAACCTGTACCAGATGGCAGCAAATTGTTTAATGATGCCATTTGTGAGGTACACTTTGAGGATCGCTGTCTGCGCAACAAACGACTTGAAAAGTGGGCAGTACCTACGCTTGTTTTGGGCCATGATATCGTACCACATATGCTGCCTAGTGAGGCAGAAGTTGCCGAGTTCTACGCACGTCCTAGTGCGCCAAACAATGGTGAAGAGGAGGGCGAATGTTGTGTGGAGACTTGCAAACGCGATCCCAGCGTTGACGATATTAAGCTCTATCGTCCGCCTGAAGATCAAGAAGTGCTCGCCAAATGGGCGCACAATCTGCAGCTGGATGTTGAGCAGCTGTCCAGTCTGAGAATATGTAATCTACACTTTGAATCGCATTGCATTGGCAAACGCATGCGCCCATGGGCCATACCCACGCTTAATCTAGCCAACAACATTGAGAATCTCTATGAAAATCCCGAGAATAATATGCTTTATGTGCGCAAACAGCGCCGTTTTCTTTCCTCGGAAACGGGCATGACAAAGCCCACTTGGGTGCCGCGCTGCTGCTTGCCACACTGCCGCAAAGTACGTGCCATACACAATGTTCAACTCTATCGTTTCCCCAAAATTAACCGTTCCACATTGGCTAAGTGGGCGCACAATCTGCAAGTGCCGTTAATGGGTAGTGCTCAGCGTCGTGTGTGCTCCGCTCATTTTGAGCCGCATGTTTTGAGCAAGAAATGTCCCGTTCCATTGGCCGTGCCCACACTTGATTTGAACATGCCTCCTGGCCTTAAAATCTATCAAAATCCTGCCAAGCTTAAGACCAGCAAGCTGTGCTTACAGCGAGTTTG
CATTGTGGAAGGCTGTCGTCGGCAACGTGTACATGGCGTGCAACTATTCCGATTTCCACACAATACGGCACAGTTACGTAAATGGTTGCATAATATTAAGCAGCGGCCCAAAGGCGGCATGCGCAATCAATTTCGCATCTGTTCTAAACATTTTGAGACGCATTCGTTTAATGGCAAACGACTGAGCGCAGGTGCAATACCCACGCTTGAACTGGGACACGACGATGACGATCTATATCCAAATGAGGTTCAGTCATTTGTAGAAGAGCACTGCGCTGTGGAAGGTTGTGATGCCTCAAAAGAGCAGCCCGAAGTAAGGCTTTTCAAATTTCCCACCGATGATGAGGATTTGCTATGGAAATGGTGCAACAATTTGAAAATGAATCCTGTTGATTGTATTGGCGTACGTATATGCAATAAGCACTTCGATCCGGACTGTATTGGGCCAAAGCATCTTTTTAAATGGGCCATACCGACCTTGGAATTGGGACACGATGATGCTCAAATTGAGTTGATTCTAAATCCAAAGCCAGAAGAACGCTATTTGGATCCCATATTTAAATGCTGTGTGCCCACCTGCGGCAAAACACGCAAATTTGATGAGGTGCAAATGAACAGCTTTCCCAAAGATCCCACCCTTTTCGAGCGCTGGCGCCATAATCTCAAGCTGGAGCATTTGAGCTTTAAGGAGCGAGAACGGTATAAGATCTGCAATTCCCATTTTGAAGATATTTGCATCGGCAAAACACGACTTAATATAGGCGCCATACCAACGCTAGAGCTGGGCCATGACGAGACTGATGATCTCTATCAAGTAAATCCTGAGGAGCTGCAAAGCAATTTATTTGGGCGCCCGCGTCGCGTTCAAGAAACAAACACGTTACCCAAGAGTGAAGATACCATATCAGAGGCTACTGATTTGAATACTAGCCAAGTTAAGATTAAAAAAAGTTTGACCGAGTTCAAATGTTGTGTAGAGAGCTGCGGAAAAACTCGTTTAGAGCATGGAGTACGTCTCTATGCCTTTCCCTCTACCAAACAACAGCAGAATAAATGGCGCCATAATCTGCACCTCAGTCCAGAGGAGTTGGATAAAAATACGCGCGTTTGCAGCGCGCATTTCAATAAGCGTTGCTTCGACGGCAAACAGCTGCGCAGTTGGGCAATGCCTACGCTGCATTTGGGTCACCAGCAGCCCATCTATGAGAATCCAAAGAATGTACCAGGCTTTTTTACACCTACCTGCGCCTTAGAGCATTGTCGAAAGCGGCGCACTATTGATAATGATTTGCGAACCTATCGTTATCCCAGAAACGAGGAGCTGCTAGAGAAATGGCGAATTAATTTAAGATTGGAGCCTTCGCAATGTCGTGGCCGAATTTGTGCTGACCACTTTGAACCGTTGGTGAGAGGCAAGCTAAAACTTAAGACGGGTGCTGTGCCCACGCTCAAGCTAGGTCATGATGAAGATATCATCTATGACAACGAGGCCATTAAAGCAAGCTTGGAATTGGATGAGGATATAAGCCTAGAGTCCAGTGAGCATATGGGTATTCAACCGAAGAGTGTGCCCACGTACGAGGAAGACCTTGATGATGAAGAGCAGTATCAGAATTCAGCGTACTTCGATCCAATGGAATTGGTGCAAACCTTTGCAGAACAGCACAACAGTGAAGAGGAGCAGCTCGCTGAATCTGAACCACGAGACCTGCCACCCGAAGTAACAATTAAGCGCGAAAAACCTGCCAATAATGTTACGCCAATTTGTTGCCTGAAGCATTGCCGCAAGGAAAGAACTGCTACTTATCATTTGAGTACATTTGGTTTTCCCAAGGATCAAAAAGTACTTCTTAAATGGTGCGCCAATTTACATTTGCAGCCATCTGACTGTATTGGACGCGTATGTATTGAACACTTTGATCCCGAGGTGCTTGGCAGTCGTAAGCTTAAGCAAAATGCCGTGCCCACCATTAATGTGGGACACGATGATCCGCTGCCTT
ATGCACACAACGGCGTAGAACTGCACTATGACGAACAACCTCAGCATTCGGTTTTTCGGCTTTCCAGCCTGAAACACTGCCGCAAACGGAAGGACTCCGAACCACTAGATCAAGAAACTAGCTCTAGTGACTTCAGCCAAGACTTCAAACAGTGCTGTTCAGTATTAAATTGCAGCCGCTATGACGTGCGACTTGTGCGTTTGCCCAAGACCCGCATACTGCAACGGAAATGGCTGCACAATTTGCAGCTTATAGATTCCGTACAGACACCTAAAATTTGCTTAGATCACTTTGAGATGCATTGCTTTCAAAATGAATGCTCTCTCAAACCTGACGCTCTGCCCACCAGAAAACTTGGGCACAAGGAACCGAACATCTACCGAAACAGGGTTGGAAAGCCAAAGTTAGTATTGGCCAATAGGAATAATCTCGTGAGAAACTGCTTGTTTTCCAACTGTCGGTATGCACGTGCATACAATTGTCAACATTATGCGCTGCCGTTGGACGACACTCTTCGTAAATGTTGGCTTGAACACTTAAAACTAAATATATCCGGAAAATTAAAAATCAGTGTTGGTCTTTGTTCCATACACTACTTGCAGTGCTATGAGCAGACAACAATTCCCAGTAATTTATCAGAGTCTGAGCGACAAGAGTTATGGAAAAATTATACTGGTATTGTAAATTCGCCGACGGCTCAAATGCTGCGCTGCGCTGTACCAGGCTGTCTTACCGTCGTTACGGACAATTTACGGCTTATCAGCCTGCCACAGTCTAGCGATCAGTGTGAGAGATGGGTAGAAAATACCAAAATGGAATATGTTGCTTCCTGTCACAACTACTACCGCATTTGTCAACTACACTTTGAGCGGCATTGCTTGGGGCTACGGCGTATCAAGAATTGGGCAGTCCCCACACTGAAGCTTAACCATAATGATGGAAATCCATGA
- Protein Sequence
- MSQQHHPHNHYHQQQQAQQQQHHHQHLAHQQNQHKLQHKQIQHSWYSHVASYPPHGTAFSASPTCKSNVNMNAYGAASNTHAYYGGNSIGGSGIPGGVGSMGGVNVNADGHSMAYNPDAPQVNTVAYAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXPTYPPYIKSEPMELPTERLRHQQHFQTPIPMAPPPAPATRLDASGGGVGNDMIIKSEPMDEHAFKSSYIDDNTPFADFSKYPEFNQDMLNPKVELTVKDDVFSNSSQKHALNFPRRKMETERSESLPPICQRCKEVFFKKQIYLRHVAESSCAINEYDFKCGICPMSFMSGEELKKHKQLHRFNKFFCHKYCGKNFDTIEECESHEYMQHEYETFVCHMCSGNFPTRDQLYAHLPQHKFQSRYDCPVCRLWYQTPLELHEHRLAAPYFCGKYYTSVGTNHTMAQSNLPAQHAQQQQQQQTNYKLQDCHMGTIEMPSPHHKSSTAAASTLPATAALSSLLQQRQANADSTALFGPLKTDVKLERSYSNSTSESSYNSMQDSNYNNAFGSDASLLGGGQSAHSSTLDDSDDALCCVPKCGVRKSTSHTLQFFTFPKDEKYLHQWLHNLKMFHIPATTYMSYRICSMHFPKRCINRYSLCYWAVPTFNLGHDDVANLYQNRELTNTFTAGEVARCSMPNCNSQRGESNLKFYNFPKDIKSLIKWCQNARLPVQAKEPRHFCSRHFEERCIGKFRLKPWAVPTLHLGTQYGKIHDNPKNLYVEEKRCCLAFCRRSRSSDFNMSLYRFPRDEVLLRRWCYNLRLDPGVYRGKNHKICSAHFIKEALGLRKLSPGAVPTLHLGHNDNFNIYENELWPPPTPTGTSNSHNQQPRSYQRHSVASTSSSTSSSSLYIEQEMNASYHGMSTSSSSLNVTECMDVCCVPGCESKRHNNENITFHTIPRRPEQMSKWCHNLKIPEEKMHKGMRICSRHFESYCIGGCMRPFAVPTLYLGHDDDDIYRNPDKIKKLNIRETCCVQVCKRNRDRDHANLHRFPSNPTLLAKWCANLHKPVPDGSKLFNDAICEVHFEDRCLRNKRLEKWAVPTLVLGHDIVPHMLPSEAEVAEFYARPSAPNNGEEEGECCVETCKRDPSVDDIKLYRPPEDQEVLAKWAHNLQLDVEQLSSLRICNLHFESHCIGKRMRPWAIPTLNLANNIENLYENPENNMLYVRKQRRFLSSETGMTKPTWVPRCCLPHCRKVRAIHNVQLYRFPKINRSTLAKWAHNLQVPLMGSAQRRVCSAHFEPHVLSKKCPVPLAVPTLDLNMPPGLKIYQNPAKLKTSKLCLQRVCIVEGCRRQRVHGVQLFRFPHNTAQLRKWLHNIKQRPKGGMRNQFRICSKHFETHSFNGKRLSAGAIPTLELGHDDDDLYPNEVQSFVEEHCAVEGCDASKEQPEVRLFKFPTDDEDLLWKWCNNLKMNPVDCIGVRICNKHFDPDCIGPKHLFKWAIPTLELGHDDAQIELILNPKPEERYLDPIFKCCVPTCGKTRKFDEVQMNSFPKDPTLFERWRHNLKLEHLSFKERERYKICNSHFEDICIGKTRLNIGAIPTLELGHDETDDLYQVNPEELQSNLFGRPRRVQETNTLPKSEDTISEATDLNTSQVKIKKSLTEFKCCVESCGKTRLEHGVRLYAFPSTKQQQNKWRHNLHLSPEELDKNTRVCSAHFNKRCFDGKQLRSWAMPTLHLGHQQPIYENPKNVPGFFTPTCALEHCRKRRTIDNDLRTYRYPRNEELLEKWRINLRLEPSQCRGRICADHFEPLVRGKLKLKTGAVPTLKLGHDEDIIYDNEAIKASLELDEDISLESSEHMGIQPKSVPTYEEDLDDEEQYQNSAYFDPMELVQTFAEQHNSEEEQLAESEPRDLPPEVTIKREKPANNVTPICCLKHCRKERTATYHLSTFGFPKDQKVLLKWCANLHLQPSDCIGRVCIEHFDPEVLGSRKLKQNAVPTINVGHDDPL
PYAHNGVELHYDEQPQHSVFRLSSLKHCRKRKDSEPLDQETSSSDFSQDFKQCCSVLNCSRYDVRLVRLPKTRILQRKWLHNLQLIDSVQTPKICLDHFEMHCFQNECSLKPDALPTRKLGHKEPNIYRNRVGKPKLVLANRNNLVRNCLFSNCRYARAYNCQHYALPLDDTLRKCWLEHLKLNISGKLKISVGLCSIHYLQCYEQTTIPSNLSESERQELWKNYTGIVNSPTAQMLRCAVPGCLTVVTDNLRLISLPQSSDQCERWVENTKMEYVASCHNYYRICQLHFERHCLGLRRIKNWAVPTLKLNHNDGNP
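The C2CH consensus quoted in the Description field (Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His) can be written as a regular expression and scanned against a protein sequence. The sketch below is a rough illustration only: it is not a substitute for the profile HMM search reported above, the name `C2CH_RE` is introduced here, and the fragment is a short stretch copied from the protein sequence of this record.

```python
import re

# Cys, 2-4 residues, Cys, 35-50 residues, Cys, 2 residues, His.
# Assumption: any residue is allowed in the spacer positions.
C2CH_RE = re.compile(r"C.{2,4}C.{35,50}C.{2}H")

# Fragment copied from the protein sequence of this record.
fragment = (
    "CCVPKCGVRKSTSHTLQFFTFPKDEKYLHQWLHNLKMFHIPATTYMSYRICSMHFPKRCINRYSLCYWAVPTF"
)

# finditer reports non-overlapping matches, scanning left to right.
matches = [(m.start(), m.end()) for m in C2CH_RE.finditer(fragment)]
print(matches)
```

The same pattern could be run over the full protein sequence, but `finditer` skips overlapping candidates, so its count need not agree with the 16 domains reported by hmmscan.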
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -