Papo027182.1
Basic Information
- Insect
- Parnassius apollo
- Gene Symbol
- -
- Assembly
- GCA_907164705.1
- Location
- CAJQZP010001624.1:1655773-1668033[-]
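The Location value above packs contig, coordinates, and strand into one string. A minimal Python sketch for splitting it into fields follows. The names `Locus`, `parse_location`, and `locus_length` are hypothetical helpers, not part of any database API, and the length calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical container for a parsed Location string."""
    contig: str
    start: int
    end: int
    strand: str


# Format inferred from this record: contig:start-end[strand]
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Locus:
    """Split a 'contig:start-end[strand]' string into typed fields."""
    m = LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(m["contig"], int(m["start"]), int(m["end"]), m["strand"])


def locus_length(locus: Locus) -> int:
    """Genomic span in bp, ASSUMING 1-based inclusive coordinates."""
    return locus.end - locus.start + 1


locus = parse_location("CAJQZP010001624.1:1655773-1668033[-]")
print(locus.contig, locus.strand, locus_length(locus))
```

Under that coordinate assumption the gene spans about 12.3 kb on the minus strand of the contig, which is longer than the coding sequence listed further down and would be consistent with introns.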
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers (including the ranges in parentheses) indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined as the asymmetric hexanucleotide core GGGCGG, but this core does not cover every known site: for example, the GAG (=CTC) repeat constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  45       1.3   1.3e+02    4.0   0.1         3      23        69      90        67      90  0.93
 2  45    0.0013      0.13   13.4   0.4         2      23       160     181       159     181  0.96
 3  45     0.045       4.5    8.5   0.5         1      21       185     205       185     206  0.94
 4  45      0.28        29    6.0   0.2         1      23       247     270       247     270  0.96
 5  45      0.27        28    6.1   0.3         2      23       296     318       295     318  0.95
 6  45     0.027       2.7    9.2   0.2         2      23       342     363       341     363  0.94
 7  45   4.9e-05     0.005   17.9   0.0         1      23       367     390       367     390  0.97
 8  45     0.014       1.4   10.1   0.5         1      23       394     417       394     417  0.98
 9  45     0.013       1.3   10.2   0.7         1      21       502     522       502     523  0.95
10  45     0.034       3.4    8.9   0.9         2      23       595     616       594     616  0.97
11  45      0.01         1   10.6   0.2         1      23       620     642       620     642  0.95
12  45      0.21        22    6.4   1.5         1      21       647     667       647     670  0.74
13  45     0.018       1.9    9.8   1.8         1      23       676     699       676     699  0.95
14  45    0.0014      0.15   13.2   0.6         1      23       706     729       706     729  0.98
15  45   2.8e-06   0.00028   21.8   1.2         1      23       736     758       736     758  0.97
16  45   1.2e-06   0.00013   22.9   0.5         1      23       764     786       764     786  0.99
17  45   0.00016     0.016   16.2   0.6         2      23       790     812       789     812  0.96
18  45    0.0019       0.2   12.8   0.2         1      23       900     923       900     923  0.95
19  45     0.016       1.6   10.0   1.3         2      23       950     972       949     972  0.96
20  45      0.03         3    9.1   1.5         1      23       994    1016       994    1016  0.98
21  45   0.00061     0.062   14.4   0.3         1      23      1020    1042      1020    1042  0.97
22  45       0.3        30    5.9   0.1         1      21      1076    1096      1076    1099  0.87
23  45   0.00044     0.045   14.9   1.2         1      23      1106    1129      1106    1129  0.97
24  45      0.74        75    4.7   0.8         1      18      1136    1153      1136    1153  0.95
25  45       4.4   4.4e+02    2.3   0.5         1      23      1203    1225      1203    1225  0.92
26  45    0.0074      0.76   11.0   0.5         1      23      1295    1317      1295    1317  0.98
27  45       3.1   3.2e+02    2.7   0.2         1      23      1369    1391      1369    1391  0.94
28  45       1.5   1.5e+02    3.8   0.0         2      23      1415    1437      1414    1437  0.93
29  45     0.023       2.4    9.4   4.9         1      23      1459    1481      1459    1481  0.97
30  45    0.0031      0.32   12.2   2.3         3      23      1487    1507      1486    1507  0.97
31  45      0.23        24    6.3   1.8         1      23      1514    1537      1514    1537  0.95
32  45      0.29        29    6.0   0.7         1      23      1543    1566      1543    1566  0.94
33  45      0.11        11    7.4   1.5         1      23      1572    1595      1572    1595  0.96
34  45     0.086       8.7    7.7   0.8         1      23      1601    1623      1601    1623  0.97
35  45   1.9e-05    0.0019   19.2   2.7         1      23      1629    1651      1629    1651  0.99
36  45      0.11        11    7.4   0.9         1      18      1657    1674      1657    1674  0.95
37  45     0.011       1.1   10.4   0.2         2      23      1781    1802      1780    1802  0.96
38  45     0.023       2.3    9.5   4.5         1      23      1824    1846      1824    1846  0.97
39  45   0.00029     0.029   15.5   0.6         1      23      1850    1872      1850    1872  0.98
40  45     0.039       3.9    8.7   0.1         1      23      1877    1900      1877    1900  0.96
41  45   0.00016     0.016   16.3   2.0         1      23      1906    1929      1906    1929  0.98
42  45      0.14        14    7.0   1.2         1      23      1936    1959      1936    1959  0.95
43  45      0.12        13    7.1   4.1         1      23      1965    1987      1965    1987  0.95
44  45   2.1e-06   0.00022   22.1   1.0         1      23      1993    2015      1993    2015  0.99
45  45   7.6e-06   0.00077   20.4   1.5         1      23      2021    2043      2021    2044  0.96
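The pattern quoted in the Description, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be written as a regular expression. The Python sketch below is illustrative only. It treats every `#` position as "any residue", which is a simplifying assumption because the Description does not say which residues are allowed there, and `find_c2h2` is a hypothetical helper name. A regex scan of this kind is much cruder than the profile-HMM search that produced the Hmmscan table, so the two will not necessarily report the same set of fingers.

```python
import re

# Each piece mirrors one element of the pattern from the Description.
C2H2_RE = re.compile(
    r"."        # '#'  fold-stabilising position (ASSUMED: any residue)
    r"."        # X
    r"C"        # first zinc-coordinating Cys
    r".{1,5}"   # X(1-5)
    r"C"        # second zinc-coordinating Cys
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first zinc-coordinating His
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final His or Cys
)


def find_c2h2(protein: str):
    """Return (start, end, match) tuples, 1-based inclusive, non-overlapping."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_RE.finditer(protein.upper())
    ]


# Short fragment copied from the protein sequence of this record.
fragment = "YQCDICKKAYCRIKTLREHMRIHNND"
print(find_c2h2(fragment))
```

Because `finditer` returns non-overlapping matches, closely spaced or degenerate fingers may be merged or missed. Treat the output as a rough cross-check against the table above rather than as a domain call.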
Sequence Information
- Coding Sequence
- ATGTACAGGAACAAAATAATGAAGCAgttcttttctttgtttttagGGTCCAGAAAGGAAATTGACGGATTAAAGAAAGAGAATAAACCAAAAAAGAGACTCTTTGAAAGGTCCGTCAACCAGAACCCTCAGCGGCAAAACGCCGTGCTGCTATTAAAGTATTCGACTGCCTTTCCCTTTAAAACGCGCTTTAACAGAATTATTTGCTCATACTGTCATGATGAGTTCGAACCTATGGCAGCTCTACGAGAGCACATTAAGGCAGAGCACTCAAATGCCGATTTCAATAGCGCATTCTATAAAGTCGTAGACGACTTAAAAATAGACATATCTCAATTTAAATGTAATCTTTGCTCACAGGATATGCTAAATGTAGAAACATTTATGATGCATATCTCTCGCGATCACAGCGTATCTGTCAATTTTGACGTACCGTTTGGGGTGTTGCCATACCGACAGGGTCCCACCGGGCTTTGGATGTGCCTGGACTGTGATAAAACTTTTACCGAGTTCTCACAAATCAACGGTCATTTGCGTAGTCATGTTAAAATATTCACATGCGACAAATGTGGGGCTACTTTCTTATCTGAACATGGGCTGCGGCAACACGAGCGCCGCTTCAAATGCTACAAATCAACGTATAAACCTCGATTCGGTAAAACCCTTAAGCATCGTAGCAATACGGAGATAATCCTGCAGTGTTCGACGGCTTGCCCTTTTCGGACTTGGGGACTAAATTTCAACTGTGTGCTCTGCCGCGTTCAGTCAAATGACCCGAACGGTTTAAGGACGCACATGGCGACCCGTCACGCAAATTTTGATATTCAACTTGTCTTCAGTCGGAAACTGCGTAAAGAGTTCCTCAAAGTGGATATCAGCGATCTCCAGTGCAAGTTGTGTTTTATACGTATCGATACTCTGGACGATTTGATGACGCACCTCAAGAACGATCACAAGCAACCTATCAACTTCGACGTACAGCCCGGCGTGTTACCGTTTAAACTAAACGACGGCTCCAGTTGGAAATGTGCTATATGCAAGATGCAGTTTCCAGATTTTATCTCCTTGAAGAAGCACACCGCGGAGCATTACCAAAACTTCGTCTGCGATACCTGCGGCGAAGGGTTTATCACGGAGTCCGCTCTGATCGCGCATACTCGGATTCCACACGGCAATAAGTACAGTTGTAGTCGGTGTGTAGCGACCTTCTCCACCCTTGAAGAGCGAAACGTACATATCAAAACACAGCACACAACATTGCCGTACATGTGTATGCATTGCAAGGATAAGCCCCGTTTCGGAGACCAGGGACAAATGGAAGATGAGAGTATTGTTAAGGTAGAAACCAATGATAATATACCATTAGGGGAGTCTATTAACCTTAAGGAAAAACGACGCTATATCAGGAGCGCACGAGCCGAAGCGAGAATTGTAACAAAGAGAAACGCCAAATCTCTGTTGGAATGTTGGTCCTTGTGCCCGTTTAGATGGCAACGGAACCGGTTTAAATGCGCATTCTGCGAGGAGAGCTTTATACAGTGCAATGATCTGAGGGAACATGTTAACGAGTGCTCATCTAAGCATAACGTCAAGGATATTTACAGCAAATTCAAAGAAATGTCCCTCATCAACGTAGACATCACAGGTGCTTCGTGCCGTCTCTGTAGGTGTCCTTACTCCGGGATTAACCAAATGCGACAGCACGCCGTCCAGCACGGATATGAGTTCAACACGTCTCAGCCGGACGGTGTCCTGCCTTTCAGTCTCGACAAGGAGTGCTGGCGTTGTGTTATATGCCACGAAGATTTCAACAATTTTCTCAAGCTCTACGAGCACATGAATGTTCACTACCAACATTATATCTGCTCCACTTGTGGGAAAGGTTACATGACCGCGCCACGTCTGAGGAAACATTCCGAGGTCCACATAACAGGGTCGTTCCCATGCGATAAATGCGATAGGGCGTTCACGATGCGTGCAGCTAGGGACCATCATAAG
GCTCACGCCCACGCGAAAGGTCCTCGATACGAATGCCCACATTGCAACATGCGTTTCAACGGGTACTATGACAGAATGAATCATTTGAACGAAACGCACAGAGAGAAGGAGGTATCTTACCGTTGTAACGTCTGTGAACTGACGTTCAAAACGAGCGGTAGACGCGCAATGCACATAAGAACTGTACATTTGCCGCAGCCTCGTAACTTCGCTTGTCCGTATTGCGAATGGTTTTTCAAAACGCGCTACGAGCTGAAGAGGCACATGGTGAAGCACACGGGCGAAAGGAATTATTCTTGCACGATCTGCGGCAAGGCGTACCCGAGGAACCGGGCTCTGAGGACGCATTTGAAGACCCACGAGGACCTCACGTGCAAGTGGTGTGGAGCCTTCTTCAAACAGCGCGCTCAATTACTGACGCACACGCGAATCAATCATCCGGATTTGAGCGATTTGATAGCGGTCGGCTCGGATAAAATAGCTTGTTCCGCAAATGCGGAAGCGGTGGCAAACGTAGTTAAACGAGTTAGGAAGTTGAAAGAGAACGTAAGTGCGCGCCAAATGCGAAGACGGCGACGTGCCAACAACCAGCTACCGGAAGAGTCGGAGAGGCGAATCTCGAAAACAATGATGAGAAGGAACACCATGGCCATCTTAGAATGTTCCACTGCTTGGGCCTTTCGATGGTTTCGCAATGCCTTTTTCTGCTCCTATTGCGATGAAAAATTTGTCGATCCGCAGCCATTGCGTGAACACGTACTCTCATCCCACATCAGCTGTTCCCCCACCGTGCGTATATTCTCGAAACTGACAGAGAACAACATGGTGAAAATTGACATCACCAATCTAAGGTGTAGGTTGTGTAACTATGGATGCAACAACATAGACGcacttaaaaatcatttaaaaactgCTCATAGAAAAACACTTAATAGTGAATACAGCGATGGAGTTCTGCCATTCAAATTAGATGAGATCGGATTCTATTGTCAGGCGTGTTTTGAGTATTTCACTAGTTTCGCGAAGTTGAACGAACATATGAATTCTCACTATCAGAATTACATTTGTGACGCCTGCGGGAAGGCTTTCATTTCGAAGTCTAGGTTCAGGACTCACGTGGAGTCACACGAGATCGGGAGTTTCCCTTGCGGTGAATGCGACGAAGTATTACAGACACGGGCGGCCCGTACGTGTCATAGAATGAAGGTACACCGAAAAGGTATTCGGTACACGTGCCCTCGATGTCCTGCAGTGTTCACCGCGTATTACGCGAGAGCGAGGCACTTAGTAGACGGGCACGCGCAACAGAGGATGGATTATGATTGTACCACGTGCGGTAAAACTTTCGAGACGAGTTGCAAACGGGCGGCGCACATCCGCGTAGCCCATATGCCTGTCGAGAAACGATACGAGTGCCTTTATTGTCCGTCGTACTTCGTCAGCAAATCGAAACTGCGACgacaAGAAGAAATCACTCCGGATACATCagtaaaagaaaagttaaatataaaatggAAAAAGATACGCAGGGTTGCGGAAGACAAGGCTAACGCAGCTATTATCTTGCAGAATTCCAACGCTGTTGCTTTCAGATGGCATCGTGGAAGATTCATGTGTGCTTACTGCCCGCTCATTTGCACAAGTGTCACAGAGATACGTCTGCATTCCAATGAACACGCTAACAAGCAAGACATCTTTCTGAATAGTGGAGTCCGCAATTCTTTCCCATTAAGAGTTGATGTCACGGATTTAACTTGTATGCTATGCTTTGACAGAATCGaaaatttggaaaattttaaGGTTCATCTGTCAAATAACCATGATAAATTCCTCAATCCCGACTATAGTGACGGCTTAGTGCCATTCGTGTTGACGGGTAAAGACTATAAATGCGTTAATTGTGGCACATTGTTCGAGAACTTCATGAGCCTATATATTCACATGAATGAACATTACCAGACCTACGTATGCTATACGTGTGGTAAAGgGCCCAAGGAAAG
CAATGTTAAATGGAAACCAAGACGGAAGTTTAACGATCACAGAGATAACGCTGCTATTATTCTCGAATGTTCCAACGCGTGCCCATTTAGATGGAAAAGCGGTGCGTTCGTCTGCGCATTCTGTCCGAACTCGTTCGGGGACTTTACGGGAGTCAGGGAACACACTTCAGAACACCCGAACAGGGTTGAAGCAGTACGCTTAGCGCGTCCCTTCGACACAATAAAAGCCGACATAACAAATCTCAGATGCGATCTATGTTTACAAATCGTAAAGGACTTAGATGAGCTCGCTGATCACTTAATCAATGGACACGAGAAACCTATTGTCAAAGACCGCGGTGTCGGCATCActcctttttatttgaaaggcaAAGAATTTATCTGTTCGCACTGCAGTGAACACTTCGATTTGTTCACTAATCTGAACACTCACATGAAtcaacattataaaaataatatatgctaTAAGTGCGGTAAAGCGTTCTCAGCCTCTCACAGACTGCATGCACACATGGTCACCCACGAACTGGAAGACGATGGGTTCAAATGCACTAAATGCGATGATATATTCGCCACTCGCAAACTCAGGAGTAGTCACATGTCCTATATGCACGGACCAAAGCTGAGATATAGGTGCCCGTATTGTAAAGATAAGTTTAAGGCGTACGGCGACAGAGCAAAACATTTAAAGGAAGTCCACGACAGGAAAGTTGAGTATCCTTGCCACTTATGTCCGGCTGTTTTTGCCCTTTGTAATCAGAGAACGAAACATATCCAGCAAGTTCATATAAAACACAAACCGTTTCAATGCGAGTACTGTGTGTTTAAGTCCTCGACGGCGGCCCGGTTGAGGAGTCACTTGGTCAAACATATAGGCATCAGGAAGTATCAATGTGACATTTGTAAAAAAGCCTATTGCAGGATCAAAACTCTGAGAGAGCACATGCGGATCCATAATAACGACAAGCGGTTCGTCTGCGAGTATTGCAACGCCGCGTTTGTGCAGAAATGCAGCTTACAGAgaAGAAAATGTGAAGACATACAGATTAAGCAGGAAGTGGAATCCGACTCTGAGAATATCAAAGCTGCCGTTATTGTGCCAGCTAGGGACGCGGGCAACGAGAGGCGCGCCGCGTTCcggaataatattaaaattatattagaatCGTGCACGGCATGTCCTTTCAAGTATAGGAAGGGGACGTACTTATGCTTTTTCTGTAAAACTTCGTTCCTTGAACCGGAACGCTTACGCGAGCACACCCAGCAGCTACACTCTGACGTTAAGCAACTGTTGAAGCCACGCAAGTACGAACCGCTGAAGATGGACTTCGCTGTGACGACCTGCAAAATGTGTGGCACTGCTATACCCGACTATGAGATGCTTAAATCGCATCTGCGCGATCACGGTAAAGTGCTCGACTGCACGCATGGCGACAGCGTGCTGCCTTACAGCCTATCTAAAGATGATCATCGTTGTCAGATATGCGGCAAGCGTTACGAGATGTTCCTTAGTCTCCACAAGCACATGAACGATCATTATGAACACTTTATTTGCGAAACCTGCGGCAAACGTTTCGCGACATCACAGCGTATGGTCAATCACGCCAGGACGCACGAGCGCGGCGAATTTCCGTGCAAACGGTGTCAAGATTCGTTCCCTTCATACGCCTCTCTGTACGCCCACATAGCCAAGGTTCACCGATCGAATAAACGGTACAAGTGTCCAATATGTGACGAAAAATTCGCCTCGTATAAACACAGGCTAAAACATTTGAACACGGTGCATGGAGAGAAGACAGCTATATTTCCTTGTCCGTCCTGTCCGCGTGTCTTCGACCTGTGCAGTCGGCGCACAGCCCACATTCGCTTCCAGCACTTGCAGGAACGGAACCACTCATGCTCGTTGTGTTTCATGAAATTTTTCACTAAATACGAGCTCCAAGAGCATTCGATAAAGCATGGCGGCGAGAGGATTTATCAGTGTGACGTGTGCAAGA
AGTCCTACGCCCGGTTGAAAACCTTACGAGAGCACATGCGGATACACAATAATGATAGACGATTTGTATGCCCCGTATGCGGGCAGGCGTTCATACAAAATTGCAGTCTGAAGCAGCATGTAAGGGTTCACCATCCGAGTCACACGAAAACAGACGTATTTTAG
- Protein Sequence
- MYRNKIMKQFFSLFLGSRKEIDGLKKENKPKKRLFERSVNQNPQRQNAVLLLKYSTAFPFKTRFNRIICSYCHDEFEPMAALREHIKAEHSNADFNSAFYKVVDDLKIDISQFKCNLCSQDMLNVETFMMHISRDHSVSVNFDVPFGVLPYRQGPTGLWMCLDCDKTFTEFSQINGHLRSHVKIFTCDKCGATFLSEHGLRQHERRFKCYKSTYKPRFGKTLKHRSNTEIILQCSTACPFRTWGLNFNCVLCRVQSNDPNGLRTHMATRHANFDIQLVFSRKLRKEFLKVDISDLQCKLCFIRIDTLDDLMTHLKNDHKQPINFDVQPGVLPFKLNDGSSWKCAICKMQFPDFISLKKHTAEHYQNFVCDTCGEGFITESALIAHTRIPHGNKYSCSRCVATFSTLEERNVHIKTQHTTLPYMCMHCKDKPRFGDQGQMEDESIVKVETNDNIPLGESINLKEKRRYIRSARAEARIVTKRNAKSLLECWSLCPFRWQRNRFKCAFCEESFIQCNDLREHVNECSSKHNVKDIYSKFKEMSLINVDITGASCRLCRCPYSGINQMRQHAVQHGYEFNTSQPDGVLPFSLDKECWRCVICHEDFNNFLKLYEHMNVHYQHYICSTCGKGYMTAPRLRKHSEVHITGSFPCDKCDRAFTMRAARDHHKAHAHAKGPRYECPHCNMRFNGYYDRMNHLNETHREKEVSYRCNVCELTFKTSGRRAMHIRTVHLPQPRNFACPYCEWFFKTRYELKRHMVKHTGERNYSCTICGKAYPRNRALRTHLKTHEDLTCKWCGAFFKQRAQLLTHTRINHPDLSDLIAVGSDKIACSANAEAVANVVKRVRKLKENVSARQMRRRRRANNQLPEESERRISKTMMRRNTMAILECSTAWAFRWFRNAFFCSYCDEKFVDPQPLREHVLSSHISCSPTVRIFSKLTENNMVKIDITNLRCRLCNYGCNNIDALKNHLKTAHRKTLNSEYSDGVLPFKLDEIGFYCQACFEYFTSFAKLNEHMNSHYQNYICDACGKAFISKSRFRTHVESHEIGSFPCGECDEVLQTRAARTCHRMKVHRKGIRYTCPRCPAVFTAYYARARHLVDGHAQQRMDYDCTTCGKTFETSCKRAAHIRVAHMPVEKRYECLYCPSYFVSKSKLRRQEEITPDTSVKEKLNIKWKKIRRVAEDKANAAIILQNSNAVAFRWHRGRFMCAYCPLICTSVTEIRLHSNEHANKQDIFLNSGVRNSFPLRVDVTDLTCMLCFDRIENLENFKVHLSNNHDKFLNPDYSDGLVPFVLTGKDYKCVNCGTLFENFMSLYIHMNEHYQTYVCYTCGKGPKESNVKWKPRRKFNDHRDNAAIILECSNACPFRWKSGAFVCAFCPNSFGDFTGVREHTSEHPNRVEAVRLARPFDTIKADITNLRCDLCLQIVKDLDELADHLINGHEKPIVKDRGVGITPFYLKGKEFICSHCSEHFDLFTNLNTHMNQHYKNNICYKCGKAFSASHRLHAHMVTHELEDDGFKCTKCDDIFATRKLRSSHMSYMHGPKLRYRCPYCKDKFKAYGDRAKHLKEVHDRKVEYPCHLCPAVFALCNQRTKHIQQVHIKHKPFQCEYCVFKSSTAARLRSHLVKHIGIRKYQCDICKKAYCRIKTLREHMRIHNNDKRFVCEYCNAAFVQKCSLQRRKCEDIQIKQEVESDSENIKAAVIVPARDAGNERRAAFRNNIKIILESCTACPFKYRKGTYLCFFCKTSFLEPERLREHTQQLHSDVKQLLKPRKYEPLKMDFAVTTCKMCGTAIPDYEMLKSHLRDHGKVLDCTHGDSVLPYSLSKDDHRCQICGKRYEMFLSLHKHMNDHYEHFICETCGKRFATSQRMVNHARTHERGEFPCKRCQDSFPSYASLYAHIAKVHRSNKRYKCPICDEKFASYKHRLKHLNTVHGEKTAIFPCPSCPRVFDLCSRRTAHIRFQHLQERNHSCSLCFMKFFTKYELQEHSIKHGGERIYQCDVC
KKSYARLKTLREHMRIHNNDRRFVCPVCGQAFIQNCSLKQHVRVHHPSHTKTDVF
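One way to check the two sequences above against each other is to translate the coding sequence and compare it with the protein sequence. The Python sketch below assumes the standard genetic code, which this record does not state explicitly. It uppercases the input because the coding sequence mixes upper- and lower-case bases, and `translate` is a hypothetical helper written for this example, not a library function.

```python
from itertools import product

# Standard genetic code in TCAG order (ASSUMPTION: standard code applies).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks, ignore case
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 27 bases and last 15 bases of the coding sequence in this record.
print(translate("ATGTACAGGAACAAAATAATGAAGCAg"))
print(translate("ACAGACGTATTTTAG"))
```

To run the check on the whole record, join the wrapped sequence lines into one string before calling `translate`, then compare the result with the protein sequence after joining its lines in the same way.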
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -