Sper014373.1
Basic Information
- Insect
- Sarcophaga peregrina
- Gene Symbol
- -
- Assembly
- GCA_014635995.1
- Location
- CM025791.1:101888715-101897499[+]
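The Location field packs the contig accession, start, end and strand into one string. As a minimal sketch, it could be split as below. The helper name `parse_location` is hypothetical, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re

# Assumed layout: "contig:start-end[strand]", as in the Location field above.
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)

def parse_location(text):
    """Split a location string into its parts (hypothetical helper)."""
    m = LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(m["start"]), int(m["end"])
    return {
        "contig": m["contig"],
        "start": start,
        "end": end,
        "strand": m["strand"],
        # Assumption: coordinates are 1-based and inclusive.
        "span": end - start + 1,
    }

loc = parse_location("CM025791.1:101888715-101897499[+]")
print(loc["contig"], loc["strand"], loc["span"])
```

Under that assumption the locus spans 8785 bp on the plus strand of CM025791.1.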
Transcription Factor Domain
- TF Family
- THAP
- Domain
- THAP domain
- PFAM
- PF05485
- TF Group
- Zinc-Coordinating Group
- Description
- The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [1].
- Hmmscan Out
-
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  19   6.8e-16   8.7e-13   49.5   2.7         1      87         9      80         9      80  0.81
 2  19   3.6e-13   4.6e-10   40.7   9.6         1      87       125     198       125     198  0.86
 3  19   3.3e-13   4.3e-10   40.9   0.7         1      86       220     287       220     288  0.80
 4  19   3.5e-14   4.5e-11   44.0   0.2         1      87       559     628       559     628  0.81
 5  19   6.1e-11   7.8e-08   33.6   5.0         1      86       686     780       686     781  0.72
 6  19   7.9e-13     1e-09   39.7   2.6         1      87       844     913       844     913  0.82
 7  19   8.8e-14   1.1e-10   42.7   2.6         1      87       933    1003       933    1003  0.80
 8  19     3e-14   3.9e-11   44.2   1.7         1      86      1026    1096      1026    1097  0.78
 9  19   1.2e-06    0.0015   19.9   0.2         1      59      1113    1160      1113    1186  0.78
10  19   5.6e-13   7.2e-10   40.1   5.0         1      86      1201    1270      1201    1271  0.81
11  19   8.4e-10   1.1e-06   29.9   2.9         1      86      1296    1381      1296    1382  0.69
12  19   0.00011      0.14   13.6   3.0         1      86      1402    1459      1402    1460  0.76
13  19   2.1e-06    0.0027   19.0   1.1         1      86      1479    1548      1479    1549  0.73
14  19   9.6e-13   1.2e-09   39.4   2.7         1      86      1572    1648      1572    1649  0.78
15  19    0.0075       9.6    7.7   4.1         1      59      1677    1733      1677    1754  0.66
16  19   9.9e-13   1.3e-09   39.3   2.5         1      87      1774    1849      1774    1849  0.84
17  19   2.7e-14   3.5e-11   44.3   1.3         1      87      1877    1951      1877    1951  0.85
18  19   1.3e-13   1.7e-10   42.2   4.1         1      87      2082    2155      2082    2155  0.80
19  19   6.6e-13   8.5e-10   39.9   0.4         1      86      2180    2249      2180    2250  0.82
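Each domain row in the Hmmscan output carries 13 whitespace-separated values. As a rough sketch, the rows could be read into records and filtered by independent E-value as below. The field names, the `parse_domain_rows` helper and the 0.01 cutoff are illustrative assumptions, not settings taken from this record. Only four of the 19 rows are embedded as sample data.

```python
# Column names are illustrative labels for the 13 values in each domain row.
FIELDS = [
    "index", "total", "c_evalue", "i_evalue", "score", "bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to", "acc",
]
INT_FIELDS = {
    "index", "total", "hmm_from", "hmm_to",
    "ali_from", "ali_to", "env_from", "env_to",
}

def parse_domain_rows(text):
    """Parse whitespace-separated domain rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip headers or malformed lines
        row = {}
        for name, value in zip(FIELDS, parts):
            row[name] = int(value) if name in INT_FIELDS else float(value)
        rows.append(row)
    return rows

# Four rows copied from the Hmmscan output above (domains 1, 9, 12 and 15).
SAMPLE = """
1 19 6.8e-16 8.7e-13 49.5 2.7 1 87 9 80 9 80 0.81
9 19 1.2e-06 0.0015 19.9 0.2 1 59 1113 1160 1113 1186 0.78
12 19 0.00011 0.14 13.6 3.0 1 86 1402 1459 1402 1460 0.76
15 19 0.0075 9.6 7.7 4.1 1 59 1677 1733 1677 1754 0.66
"""

rows = parse_domain_rows(SAMPLE)
I_EVALUE_CUTOFF = 0.01  # assumed threshold, chosen only for illustration
confident = [r["index"] for r in rows if r["i_evalue"] < I_EVALUE_CUTOFF]
print(confident)
```

With this cutoff, domains 1 and 9 from the sample are kept, while domains 12 and 15 (i-Evalue 0.14 and 9.6) are dropped. A different threshold would change that split.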
Sequence Information
- Coding Sequence
- ATGTTGGGACCACAAATACCAAAATGTGCAGTTGCCAACTGTAATTCGGATAAAACGAGTGACGATGAAtccataaaattgtttaactttCCCACTGATGAAAATCTACTCAAGAAGTGGTGcgacaatttaaaaatgtctCATCGTTTGACGCCTTTGCAGAAGATCTGCTCTTTACATTTTGAAAAGTCATGCTTGGGCAGCTGTCGTATACGTTCTTGGGCTATACCCACCTTGAATTTGGGCCACGACGAAGCACCAGAACATCCGAATAAAAAAACCACAAGCCGAGAGGTTTATGATGGATCAGAAGATACAACCGAAATACAACTGAAGCAAGTTAAAATTAAACGGCCCCTGGATTCAACCAAATGCTGTATAGCCAGTTGTCGCAAGAGTCGTTTAAAACACGGCGTAAGACTATATTGTTTGCCTAGCAATTCAAAAATGAAACGCAAGTGGAtgcacaatttaaaaataaatcatttaaaaaacaatcccAAATTGCACAGCATCAAGGTTTGTAATCATCATTTTCATAAAAGATGTTGGGatggcaaaaatttaaaaccatgGGCCGTCCCCACCTTGCATTTGGGACATTCGGAGGCTATTTTTGATAATCCTCGCCGTATACAAGCAGTGCCTATAGTACGCTGCGCCCTTAGTAGCTGTAAGAACCACAAGGCAATCAAAGATGTGCAAGCATTTGTATTTCCCAAATCCCCTGAATTATTGGAGAAATggtcaaagaatttaaaattagattTGGAGCAGTGCACGGGTAAAATATGTTACGAACACTTTGAAAAAGAGGTTTTAGcggagaaaaaattaaaggctTATGCAGTGCCCACACTAAAGTTGGGTCACGACGATATTATTTTCGATAATACGGAATTAATCGATAAATTACAGAGAAAGCAAACAGAACAGTTAAATGCTAAGAAACGACGTTACGAAGACGACGATGATTATATGGATGTGGAGTATGAAGATCCAGTAGAAGAAGGGGAGGATGATGACATGTGGGAATATGAGGAGTATAATGAAGATTTGGAAGATGAGGAGGAAGATGAAGAAGAGGAATATGAAGAGTGTTATTATGACGACGgcgaagatgatgatgatgataatgacgagGTAGAAGACGAGGACGAcgaagatgaaaatgaatatgaaGACAACGATGATGAATACAGCGTATCAAACTCAATAACTGATTGGAGTGGCATTAAATTCAAAGAACTGCGCGTCAGCCTCACTCCTTTAACACCCGAAGATCTTATGGATTTATGTTCACGCTCCTCATATGAAAGAGAATTTGGCTCTATAACATCTGCTAGCAGCTTAAGGGGGGGACGTAGATCGATAACACCAGCGGCAAGTTTAAAAGATTTATGTTCAGAGACCCCAGAACAAAACTCAAGATCCGAAACTCCCAATCAAAAGCAATTCAATTGTTTCAGAGAACCTCTTACCATTGCTGCTGCTGATCAAATTAAATCCGACAACTTCCGTGAACCGAGTGCTCTAACGCCCGAACAAAAAACTGATTCAAGAGATTCCCTAACTGAGTACTCAATAGATGCTATCATGTACAATGAAACTCCgGAGCATAACGCCAGCACAAATCTTCGAACCGATAAACCACTTAATCCCATATCCCCACGTTGCTGTTTAAAACATTGCGGCAAAGAAAAAACACCCGAACAGCATTTAACGACGTATGGTTTCCCTAAAGACCCACAGCTTTTAAGGAAATGGTGTGACAATCTAGGTTTACAACCAGAAGAATGTATAGGGCGTGTATGCATAGATCACTTTGAATTACGTGTGGTGGGCGTAAGACGTCTTAAGCTGGGTGCTGTACCAACCTTAAATCTAGGACCAAATTGTACAGCCAAACATACTAACTCAGAGGAAACACCACAAAAGAAAGCCATAATCAAGGAATTTACCGAAACGGGGAACATGCAGGAAACAGACACCAGTTCT
AAGCCACTACCGCCGTATAAGACAACGAAACCTGGTAAGCAATCGGTTTTTCGGCTATGTTGCCTCAAACATTGTCGTCGCAAGAAATTTATAAAGCCGAACAAGAAGACGAAACGACCGGATTTGAATCAGACTTCAATGGAATGCCAGAAGAACGCAGCAGTAGCCCCGAATACATTATTTAAGTTCCCCTCGgatacaaaaattctaaaaaaatggTGCAAAAACTTAAGGTTACCGGAAAAACTTTGCCTACCATCCGACTTGGAAATATGTGCGAGACATTTCGAAGCCAAAGCCATACAAGATGGCAAATTACATCCCAAAGCCATACCCACATTGGAGCTAAGTTATGCTAACAGGGCGcccatttataaaaataacccCAAAGATTTTGACCAGCAACCGCAAAAAaaacatttaagaGTTGGCGGACTACCTACCCTCAACTTGGGCCATAATGGGGCTATCGTTAGAAATTGCCGTAAATTGCGTTTAAAGAAAACGAATGGTGGCGCAATAAAAGAGAAGTGTTGCGTACAACAGTGTCAGGAAACAaatcttaaattgttttcatttccgAGAAGTTCGGATTTGCGTAAGATTTGGTGTAACAATCTGCAATTGGATTTGCGTCAAGTACTGATTAATCATTTGAAAGTATGTGCACGACATTTTAACATTGAATGCTTTACAGTGGGCACCGACAATCTGAAACTGAATGCAGTGCCTATGCTGCACTTGGGTTTACAAAATGAATCGCACATGGTGCTGGAAATCATGCCAAGCGAAAGGAAATGTATGGTTGAAAATTGCCAAAAGACACCCAGTGTGGATCgagttaaattgtttaattttccaCAAAAGAAGGACATACTCAAGAAGTggcttttcaatttaaacttatCGCCCGATACATACAATCCGAATGCCTTTATATGCAGTAGACATTTCGATAAGAGCTGCATTAAGAATGGCGTGCTACATGAAAATTCAATACCCACGCACTTTTTACAAATCACACCCAAAGGCTGGTTCTACAAAAACAACGAGGAATTGTATGAAATGCCGAAAAAATGTTGTGTCCTTAACTGTGGTCAAAGCTCGGAAGAAGCCAAACATTTGTATAGATTTCCCAAACATAAGGAGGATTTGGAAAAATGGTTATACAATCTCAAACTGCAGGTGGATGAGAATGATGTTAAGGACTTAAGGGTATGCGACAGACATTTCGAGCAGAATTGCAAGATTTCCAATAAGGATTTGATTACCCAGTCTTTGCCTACATTAAATTTGGGCCATACAGACACCGATATTTATGGCAATAATTTCATCAAATGTTGCCTGAACACATGCAACATAGAGGGTTTCTATTTCCATAAGTTGCCCGAAGATTTGATGCTGCAAAGCTATTGGTTTCAGGAACTCGAAATGGAAAGCACCTATAATAGTTCTCTGTATATATGCTCAGTACATTTTGTTGCGTTTTTCGAACGGATATTGGAAAAGTACAGTGCTTTCCTTAAAGAATCCAAAGAGTATGTAAAACTAGCCTTAACCTATAACGAACTGAAGGCTTTGCCAGCCTTGCAATGCTACAAATGTTTCATACCCAAATGCAATTCAGGTTTTAAGCTTATCTGGAAATTGTTTAAGTTTCCCAAAGACGAAACGTTATTTAATAAATGGCTTCACAACACGGGCTTGCAAATTGAACACAGTCAAAGACCTTGCTATCGCATATGTGCCCAGCATTTTGAAGAGAGATGTTTAAGTGAAAAGAAATTACATCGTTGGTCTTTGCCAACCTTAAAATTGCCTTTCAATAATAGTTTGTATGTCAATCCGCCTGAAGCTTTACCCTCCAATCATGAAAATCTCAAACACTGTTGTGTCTCAAATTGCCTTAACGAGAAAGGACCGTTTTTTAAGTTTCCTGTTAAGCAATTGGAGGTTAAAAAATGGATCCACAATTTGGATTTGGGTCCCCAACAATGTAC
ACTCAATCTAAGGAAATGGTTGCACAATCTAAATTTAACGGCTCAACAGTACAAGGAAACCATGAGAATATGCGGCGTACACTTCGAAATGGATTGTTTCTATAAGGACTTTAAGCTAATGCGCAAACACTCAGTGCCCACTTTGGCTTTGGCAACAAACGTCAAAGAGTTATATCGCAATCCAGTGCGAAGGCCATATCTAAAGTGTTGTGTTAAGCTATGCAAGGGCCCTTGGGAGAATCTCATTAACTTTCCAAAACACAAAACATTATTAAGGAAATGTGGCATTTTAAAGCCTGTTAAAGTCTGTATAGAACACTTTGAATCGCATTGTCTTAAGGATAACAATCGTTTGCTGTTTGGTGCAGTACCTACACTAAAGCTGGGCGTTAAACTAGACTCTAAGGagattttaaaaacctttagttATTCAAGATGCCGAATCGAAAGTTGTCAAAGATCAATTTATTATGATAAAATCAATCGCATACCATTTCCCAAAGGTTTAATGAAAACCAAATGGTGTTGcctattgaatttaaatgatgACGAAATCTCTAACAAAGATTGGATTTGTCATCGGCATTTCGAGAAGGGGGCCTTAATTGATTGTCGTAAATTGAAACCGGGCACACAACCTACTTTACTATTGGATACCAAGGCCAAGGCTAAGAATGCAAATACAAATGATGTCTTACCGACAAAATCAAAGAAATGTTGTGTACGTTCCTgtaacagcagcagcaccagcagtCTTGAGCATAAACTTTTCCCTTTACCTATTGCGAATGAGGACATAAGCAAAAAATGGTTGCATAACTTAAATTTATTGGAGAATTGCACGGCCAGTGAACGTAAACAGTATTATGTGTGCGAGCAACATTTTGAAGGTCAATGTTTCCATAGGATAAGCGGCCGTTTGAAATGGGCTGCTTTGCCCACATTGAGGTTAGACCGGAAAAAGAATTTATACCTGCTTAGCGATAAGGAAGTGCGTATTACACCCACCTTAAAAGATACGAAATTTAAATGTTGCTTTTCAAATTGccacaacaacaattcattACAATTATATGACTGGCCTGTTAAAGACATTTGCAAGGATATACTtctacaacaacagaaacaaaatgatGACGTTTTCGGTGTTAACATAATGAAGGAAACTGATTGCGTACGTTTATGTGatgaacatttttataaattgtataaacCTAACCGCCAAGCAATAGATGATTGCCATTTGGATGTCAAAGTAAAGAATTCCCTCAATTTAATATATGAAGAGTTaagcaaacaaatgaaattctatACGCGTAAATGTATAGTGCCTGAATGCCCGACCGACTATTGCTTAAAAGAAGACTATAAAACTTTAAAGCTTTTTAGTTTTCCTAAAACCGATGTGGCCAAGAAATGGTGCCATAACATAGGCATAGACTTTAACTCTTTGAAAACTAAGCCCAACCAAAAAGTCTGTGAATTACACTTTGAGCCATATTGTTTAGCCCGAAGAATGTTGTTTAATTGGTCAATACCGACATTAAATTTACCTGCCACACAAACAATCGAAACAAATCATAAGATTATACAAAATGATGCCGAGGAGATATTCGCCCATTCGGGACAATGTTGCATAAAAGGCTGCGTTAATGAAATGGGTTTggattgtaaaacaaaaacccGCTTTTATAGATTTCCTACACAAACAGACATTCTAGAGCAGTGGTTACAACTAAGCAAATGTAAAGATTTCAATATAAATGCCACACGCATTTGTGGTTTACACTTCCATGCAACAGATTTcctaaataagaaaacaactcTTAAAGCAGATGCTATGCCCAGCATAAATTTATCAACAGAATTAAACAATTCACCTGCCAATTCAATCATAGATGAAAATATACAAGTTAAACAAGAACTAGACAATTCAGAGGAATGGTGCAACGAAGAGGAGGACATTAACTTTAGTTATGGAGGAAATATGGCCTTAAAAT
CGTTTGCTTTGGATAACAGttttgaaactaaatttaaagacGATCTTAAAACCCAAGATTATAATGCTTTATTAGATATAAAGGAAGAGATCATTGAAATTGAGGAAGACAACAATATTATGTacgaaatcaataaatttgagGAGGCCAATGAATTTGAatacgaaacaaatttcaataatatggTGTATCCCAATGAGGAAGCAGGCTTTGTGATAAGCGATGTCAagtctcaaatttatttatgctgCGTCCAGAAATGCTCTAACAATTCGGAAAcaccaaatattaaaatttacactGAATTCCCTACAGATTCGGAGATTTTCATTAAATGgtgtttcaatttgaaaattgatCCGCGCAATTACAAGGAAAATCAATATGCCATTTGCCAAAAGCACTTTGAAAGCATATGTTTTACCGATGCCCAAACGCTTTATCCCTGGGCAGTGCCTacgttaaatttgaatttaaatgaaaactcttttatacacaaaaacgATGTACCCGACTATCTAAAGCCTTGCAATGAACAGTGCATTGTTTATGGTTGCATAAACCCGTTAAAGCCGCTCTATAAATTTCCTCAAGAGGTTGAGGTAACACAGAAATGGTTTACAAACCTGAAACTGGATTATACGGACTTTAGAGCACAAAATTATCGCATATGCAGAAGGCATTTCCCGCAACAATGCTTTGTGGAGGCTACGACGGATAAACTTAAAACTGAAGCTTTACCTACTTTGTATTTGGGTCACTCGGATAAAatcgtatttttaaataatagagAAGAACAACAGCAATTGGATCATGATGATATAGGGGCAGCAGCTGGTTTGGCTGCAGGTGGTATAAATCATGGTGTTGGTGGCGGTGTTGCGGGTGGTTTAGTTGTTGCTAACAATCAGGACAATAGTCGCGGTAGTAGTCAGGGCTCTTTAGCTAGAATAATATCACCGCATGATCTAGAGGATCATGACAGCAGTTATTATGAGGATTTTGAAGAGTATTATGGTCAGGATgattaa
- Protein Sequence
- MLGPQIPKCAVANCNSDKTSDDESIKLFNFPTDENLLKKWCDNLKMSHRLTPLQKICSLHFEKSCLGSCRIRSWAIPTLNLGHDEAPEHPNKKTTSREVYDGSEDTTEIQLKQVKIKRPLDSTKCCIASCRKSRLKHGVRLYCLPSNSKMKRKWMHNLKINHLKNNPKLHSIKVCNHHFHKRCWDGKNLKPWAVPTLHLGHSEAIFDNPRRIQAVPIVRCALSSCKNHKAIKDVQAFVFPKSPELLEKWSKNLKLDLEQCTGKICYEHFEKEVLAEKKLKAYAVPTLKLGHDDIIFDNTELIDKLQRKQTEQLNAKKRRYEDDDDYMDVEYEDPVEEGEDDDMWEYEEYNEDLEDEEEDEEEEYEECYYDDGEDDDDDNDEVEDEDDEDENEYEDNDDEYSVSNSITDWSGIKFKELRVSLTPLTPEDLMDLCSRSSYEREFGSITSASSLRGGRRSITPAASLKDLCSETPEQNSRSETPNQKQFNCFREPLTIAAADQIKSDNFREPSALTPEQKTDSRDSLTEYSIDAIMYNETPEHNASTNLRTDKPLNPISPRCCLKHCGKEKTPEQHLTTYGFPKDPQLLRKWCDNLGLQPEECIGRVCIDHFELRVVGVRRLKLGAVPTLNLGPNCTAKHTNSEETPQKKAIIKEFTETGNMQETDTSSKPLPPYKTTKPGKQSVFRLCCLKHCRRKKFIKPNKKTKRPDLNQTSMECQKNAAVAPNTLFKFPSDTKILKKWCKNLRLPEKLCLPSDLEICARHFEAKAIQDGKLHPKAIPTLELSYANRAPIYKNNPKDFDQQPQKKHLRVGGLPTLNLGHNGAIVRNCRKLRLKKTNGGAIKEKCCVQQCQETNLKLFSFPRSSDLRKIWCNNLQLDLRQVLINHLKVCARHFNIECFTVGTDNLKLNAVPMLHLGLQNESHMVLEIMPSERKCMVENCQKTPSVDRVKLFNFPQKKDILKKWLFNLNLSPDTYNPNAFICSRHFDKSCIKNGVLHENSIPTHFLQITPKGWFYKNNEELYEMPKKCCVLNCGQSSEEAKHLYRFPKHKEDLEKWLYNLKLQVDENDVKDLRVCDRHFEQNCKISNKDLITQSLPTLNLGHTDTDIYGNNFIKCCLNTCNIEGFYFHKLPEDLMLQSYWFQELEMESTYNSSLYICSVHFVAFFERILEKYSAFLKESKEYVKLALTYNELKALPALQCYKCFIPKCNSGFKLIWKLFKFPKDETLFNKWLHNTGLQIEHSQRPCYRICAQHFEERCLSEKKLHRWSLPTLKLPFNNSLYVNPPEALPSNHENLKHCCVSNCLNEKGPFFKFPVKQLEVKKWIHNLDLGPQQCTLNLRKWLHNLNLTAQQYKETMRICGVHFEMDCFYKDFKLMRKHSVPTLALATNVKELYRNPVRRPYLKCCVKLCKGPWENLINFPKHKTLLRKCGILKPVKVCIEHFESHCLKDNNRLLFGAVPTLKLGVKLDSKEILKTFSYSRCRIESCQRSIYYDKINRIPFPKGLMKTKWCCLLNLNDDEISNKDWICHRHFEKGALIDCRKLKPGTQPTLLLDTKAKAKNANTNDVLPTKSKKCCVRSCNSSSTSSLEHKLFPLPIANEDISKKWLHNLNLLENCTASERKQYYVCEQHFEGQCFHRISGRLKWAALPTLRLDRKKNLYLLSDKEVRITPTLKDTKFKCCFSNCHNNNSLQLYDWPVKDICKDILLQQQKQNDDVFGVNIMKETDCVRLCDEHFYKLYKPNRQAIDDCHLDVKVKNSLNLIYEELSKQMKFYTRKCIVPECPTDYCLKEDYKTLKLFSFPKTDVAKKWCHNIGIDFNSLKTKPNQKVCELHFEPYCLARRMLFNWSIPTLNLPATQTIETNHKIIQNDAEEIFAHSGQCCIKGCVNEMGLDCKTKTRFYRFPTQTDILEQWLQLSKCKDFNINATRICGLHFHATDFLNKKTTLKADAMPSINLSTELNNSPANSIIDENIQVKQELDNSEEWCNEEEDINFSYGGNMAL
KSFALDNSFETKFKDDLKTQDYNALLDIKEEIIEIEEDNNIMYEINKFEEANEFEYETNFNNMVYPNEEAGFVISDVKSQIYLCCVQKCSNNSETPNIKIYTEFPTDSEIFIKWCFNLKIDPRNYKENQYAICQKHFESICFTDAQTLYPWAVPTLNLNLNENSFIHKNDVPDYLKPCNEQCIVYGCINPLKPLYKFPQEVEVTQKWFTNLKLDYTDFRAQNYRICRRHFPQQCFVEATTDKLKTEALPTLYLGHSDKIVFLNNREEQQQLDHDDIGAAAGLAAGGINHGVGGGVAGGLVVANNQDNSRGSSQGSLARIISPHDLEDHDSSYYEDFEEYYGQDD*
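The C2CH consensus quoted in the Description (Cys, 2-4 residues, Cys, 35-50 residues, Cys, 2 residues, His) can be written as a regular expression and scanned against a protein sequence. The sketch below is a crude spacing filter only, not a substitute for the Pfam profile search. The `find_c2ch` helper is a hypothetical name, and the test fragment is the first 93 residues of the protein sequence above.

```python
import re

# Literal translation of the consensus spacing from the Description:
# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His
C2CH_RE = re.compile(r"C.{2,4}C.{35,50}C.{2}H")

def find_c2ch(protein):
    """Return 1-based inclusive (start, end) spans of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in C2CH_RE.finditer(protein.upper())]

# First 93 residues of the protein sequence listed above.
FRAGMENT = (
    "MLGPQIPKCAVANCNSDKTSDDESIKLFNFPTDENLLKKWCDNLKMSHRLTPLQKICSLHF"
    "EKSCLGSCRIRSWAIPTLNLGHDEAPEHPNKK"
)

hits = find_c2ch(FRAGMENT)
print(hits)  # [(9, 60)]
```

The single hit, residues 9 to 60, begins at the same residue as the first domain in the Hmmscan output (ali coord 9-80). Because `finditer` reports only non-overlapping matches and the pattern ignores the other conserved residues, the number of regex hits on the full protein need not equal the number of HMM domains.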
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -