g9572.t1
Basic Information
- Insect
- Nymphalis c-album
- Gene Symbol
- -
- Assembly
- GCA_905332345.1
- Location
- CAJOBY010000016.1:1541328-1546761[+]
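The Location value above packs contig, coordinates and strand into one string. A minimal sketch of splitting it into fields follows; the `Location` type and `parse_location` helper are hypothetical names introduced for illustration, and the `contig:start-end[strand]` layout is inferred from this single record, not from a documented specification.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for one location string from this record."""
    contig: str
    start: int
    end: int
    strand: str


# Assumed layout: <contig>:<start>-<end>[<strand>], strand being '+' or '-'.
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into contig, start, end and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("CAJOBY010000016.1:1541328-1546761[+]")
print(loc.contig, loc.start, loc.end, loc.strand)
```

Whether the coordinates are 0-based or 1-based is not stated in the record, so the sketch leaves them as plain integers.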
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers give the number of residues (a range where shown in brackets). The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
1   36      0.31        24    6.1   0.1         2      23       125     148       124     148  0.94
2   36     0.001     0.076   14.0   1.5         2      23       154     176       153     176  0.95
3   36     0.045       3.4    8.8   0.6         2      23       182     204       181     204  0.92
4   36   0.00082     0.062   14.2   1.7         1      23       209     232       209     232  0.95
5   36    0.0033      0.25   12.3   9.2         1      23       236     259       236     259  0.97
6   36   0.00089     0.068   14.1   0.2         2      23       266     287       265     287  0.96
7   36    0.0036      0.28   12.2   1.4         2      23       326     347       325     347  0.95
8   36    0.0038      0.29   12.1   0.1         2      23       375     397       374     397  0.96
9   36    0.0043      0.33   12.0   0.1         2      23       418     440       417     440  0.95
10  36   0.00039     0.029   15.3   1.5         3      23       447     468       445     468  0.97
11  36     0.031       2.3    9.3   2.8         1      23       472     495       472     495  0.96
12  36    0.0061      0.46   11.5   2.4         2      23       498     520       498     520  0.97
13  36    0.0023      0.18   12.8   2.6         1      23       597     620       597     620  0.93
14  36      0.54        41    5.4   0.1         2      23       648     670       647     670  0.94
15  36   2.6e-05     0.002   19.0   0.6         1      23       692     715       692     715  0.93
16  36   4.6e-05    0.0035   18.2   2.8         3      23       722     742       721     743  0.94
17  36   0.00028     0.021   15.7   1.1         1      23       749     772       749     772  0.95
18  36   4.3e-05    0.0033   18.2   0.2         1      23       777     800       777     800  0.96
19  36      0.13       9.7    7.3   7.5         1      23       804     827       804     827  0.97
20  36     0.032       2.4    9.2   0.7         2      23       834     855       833     855  0.96
21  36   0.00028     0.022   15.7   0.8         1      23       904     926       904     926  0.95
22  36       1.9   1.4e+02    3.7   0.3         2      23       954     976       953     976  0.87
23  36    0.0043      0.32   12.0   0.4         2      23       997    1019       996    1019  0.95
24  36    0.0046      0.35   11.9   1.4         2      23      1025    1047      1024    1047  0.97
25  36     0.001     0.077   13.9   1.6         1      23      1050    1073      1050    1073  0.95
26  36      0.93        71    4.6   4.5         1      23      1075    1098      1075    1098  0.95
27  36       1.9   1.5e+02    3.6   1.6         2      20      1104    1122      1103    1124  0.90
28  36   0.00082     0.062   14.2   2.4         1      23      1172    1195      1172    1195  0.97
29  36       1.1        82    4.4   0.5         2      23      1223    1245      1222    1245  0.93
30  36     0.039         3    9.0   0.4         1      23      1267    1290      1267    1290  0.93
31  36    0.0001    0.0079   17.1   1.1         3      23      1297    1317      1296    1318  0.94
32  36   0.00016     0.012   16.4   1.9         1      23      1323    1346      1323    1346  0.95
33  36     0.018       1.4   10.0   0.6         2      23      1352    1374      1351    1374  0.90
34  36    0.0033      0.25   12.3   2.2         1      23      1378    1401      1378    1401  0.97
35  36    0.0016      0.12   13.3   0.7         3      23      1409    1429      1408    1429  0.97
36  36     0.014       1.1   10.3   2.7         1      23      1435    1457      1435    1457  0.97
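The residue pattern quoted in the Description above can be approximated with a regular expression. The sketch below is a rough illustration only: it treats both X and the fold-stabilising # positions as "any residue", which is looser than the profile HMM behind the hmmscan hits, so its matches need not coincide with the domains listed in the table. The names `C2H2_REGEX` and `find_c2h2_like` are hypothetical, and the sample string is a short fragment copied from this record's protein sequence.

```python
import re

# Pattern from the description: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' and 'X' are both matched as any single residue ('.').
C2H2_REGEX = re.compile(
    "".join([
        ".",        # '#'
        ".",        # X
        "C",
        ".{1,5}",   # X(1-5)
        "C",
        ".{3}",     # X3
        ".",        # '#'
        ".{5}",     # X5
        ".",        # '#'
        ".{2}",     # X2
        "H",
        ".{3,6}",   # X(3-6)
        "[HC]",     # final His or Cys
    ])
)


def find_c2h2_like(protein: str):
    """Return (start, end, text) for non-overlapping matches, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_REGEX.finditer(protein.upper())
    ]


# Fragment taken from the protein sequence of this record.
sample = "YKCPQCVEKFITQHLMRKHQILMH"
print(find_c2h2_like(sample))
```

Because `finditer` returns non-overlapping matches, closely spaced fingers could be merged or skipped; a profile search such as the hmmscan run above remains the more reliable detector.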
Sequence Information
- Coding Sequence
- ATGCAATCAGACGAATGCTCTACTGGACTGACGGCGGTGCAGAGTCAAAATCTTCGAACTTTATTTACGATCACAAGTATAATACCATTCGAATGTAACGACGGTCTAGCATACTTTACCGATTTCGATAATTTACGTGATCATGCGTTATCCGATCATCCTCATTTCAATCAGGAATGTAAAGCAATGAAAATGATCAAGGGTCGAGATATAAACGttaaaatagatatttctaATCTAACGTGTAAGATTTGTTTTGAAAGCAGCCTGGCTttaaacgatttaatcaatCACCTCATAACGAAACATGGCGTCAATTATGACAAGTCGATAGAGTGCTTGCAACCGTTTCGAGTTGTCAAAGATAATATGCCCTGTCCGATTTGTCCGAACATTGTGTTTAGATATTTTAAGAAACTTCTCGAACATATGAATGAATCTCATTCGGATAATAATATAGTGTGTGCGCATTGCGGTCTCACATTTAGGGGACATCCGAATTACAGAGCTCACATGTCCCGCTATCACAGGAGCAAAGCGTGCAAATGTCCCGATTGTAATATGGAATTTTGGAATTTGGAGAAATTGGCTCGTCACAGAGCAAACGTACATGGCACGAAGAAATATAAATGTCCTCAATGCGTAGAGAAGTTTATAACACAGCATCTGATGCGTAAACATCAGATACTCATGCACGGCTTCGGCCACAAGTGTCCGTACTGTTGTAAAATGTTCACTCGAAATTCGCACATGAAAAATCACGTAAGACGTCTCCATTTGAAGGAGAAGAATGTAGAATGTACAGTTTGCAAGGAGAAGTTTTTCGATAGAGCCCTTTTGAATGTGCATATGGTGAAGCATGTGGgtGTAGACGGACCTGCTAAGACCCTAAGTCCTAATACGTTGAGGCGGAGGAATCTTCTAATACTATTCAACAATACATCAATTATACCGTTCAAATGCCGTGGAAAATGTCGATGTTTTTATTGCGGCGAAGAGTTCCCCATTTATGACGAACTGAGGAAACATACTAAAGCTCACGGCCCGTGTTCAGAGAGAGACAGGGCTATTAAGTTAGTTAAAACCGACGAGGCGGAAGTGAAAATTGACGTGTCCGATATAACGTGCGAATTGTGCAACGAATCGTTTGTAAACTTAGATGATATTATATCACATTTGATAACAAAACATATGTTACCTTACGATATAGacgttaaattgattattatgacGTATCGGCTTATCGATAAACAATGTCTCGTGTGTGATGAAAGGTTTAATCAAGTAAGCGAATTAGTAGTACACGTTAATAACGAACATCCCGTGCAGTGTTTTGGTTGCGAGGTCTGTCATCAAAAGTTTATCAGGAAACAGTATTTAGACGCCCATATGAGAGTTAAACATTCAACCgtatataaatgtttgaagTGCTCACAGACCTTTCAGTCTCACGTGGCTCTTCAAGAGCACAAAATCAAATCGCACGTTGCAGTCTGTAAcatatgttttacaaaattttcaaCACAAATGAAAAGGCTGAAGCATATGAAAACCGAACACGCTAACGAGCCCTTGAAGTGTGGTTTCTGTTTGAAGTTTATGAGCACTAAATTAGGTTTCCTTCGGCACGCTGCTAAATGTACGGaaaaaaacgaaaatgttaATGAAACTTTTGTAATCGACGACGACGATGACGACAAGAAACCGACAGTTATAcagataagaaaaaatatagcttgtatatttaatatgtcgACGGCGATACCGTTTAAGTATTTTATGAGTAGATTTAGGTGTTTTTATTGTCCGCACGATTTCACTGATTGCGATGATTTAAGAGCTCACACTGTCATTGAACATCCCATTTGTGatgttaatttgaaatgtatgaGGTTGCGTAACAGACAGGAAGGCTGTGTCAAAGTTGACACGTCCGTTCTCTCTTGTAAAATGTGCTTCGAAAATCTGCCGAATTTAGAATCTTTAATCGAACATCTA
ACATCTGAACATAAAGCTTTATATGATAAATCTGTCGATTGCAATATTCAAcagtatagattaattaaagaCAATTATCCATGTCCAGTTTGTGACGAAGCGTTTACACATTTTAGCACTTTGTTGAAACATATGGGACAATATCACaccgataataaaaatatttgcatgcaTTGCGGGAAATCGTTTCGGAATTTGCCCAATTTGCGTGTGCACATATCGAATCATCACAAAACGTCCGGTAGTTACAAATGCGATAGATGCGAACAAGAATTTTCgacgaataaatatttacaaactcaCTTGGGTCGTGCGCACGGTATTAAAGTTTACGAATGTCCCGAGTGTGCGGAGAGGTTTACATCGAATTACGCGATGCAAAGGCATATGATAAACACGCACAGCTCCGGACACAAATGTCTACATTGTGGGAAATTATTCACATCAAACTGTTTCATGATTGATCATATAAAACGGACGCATTTGAAAGAGAAAAACGTCGAATGTCAAGTATGTTACGAAAGGTTTTACGATACGCAACGCCTGAAGACGCATATGGTTAAACATAACGGTGAACGGAATTTTCATTGCGATATTTGTGATGAAGAGAACAGCCCTGTCAAGAAAACTAGTGCCAATAAATTGAGAAGGTTAAATCTTCAAATTTTATTCCACAATACGTCAATAATACCCTTCAAGTGGCGcggtaaatatttatgtttttactgCGGAGAAgattttaaatcttatgaaaattTGAAGGAACACACACAAGAACATGGCGTGTGCTCGGATAAAGATAGAGCTTTACGATTGGTTAAATCAGCCGATGTCGAAGTCAAAATTGATGTATCACAAATTTCTTGTAACTTATGTCGTGAGAACTTTGTGCATTTAGAGGAGACAATTTCTCATTTAGTCGAAAGTCATAATTTGCCGTATGATAGGAACGTGAATTTGTCAATAGCACCATATCGGTTGTCAGATTTAAGCTGTCTATTATGTGATGAAAAATTTACTTACCTAAAAAAGTTAATAGTTCACGTAAACACTAATCACCCCAGTCAAAATTTGAAGTGTGTTCAGTGTCAACAAAAGTTTAACAAGACAAGGGATCTAGACGCGCATATACGCACCAAACATAAGAATCATGTATGTCCAAAATGCCTCCTTAATTTTCGTACTCGTTCAGAACTTTTAAACCATACGAAGAATGCGCATACTTTTAAATGCATTGTTTGTAACAGAAGCTTTTCCTCGATGAGTAAATGTTTTAAGCACATAAAGAATGAGCATACAGGAGCTGTGATGAAGTGTGGCTTTTGTTTTACATCCTCAACTTCAAAGCAAGATTTTCACAGACACGCTATTAAATGTGCGGACTTCTGTAAGAAATCCGTTGAAGCCGTGACTGTTGACGTTGATAAGAAACCGCGCGTTACACAAATACGGGACAACATAGCTTGTATCTTTAACATGTCAACAGCTATaccctttaaatattttatgagcaAATTCAGATGCTTCTATTGTCCAAAAGATTTCAATGAGTGCGACGATCTTAAACAGCATACTATGATCGAACATCCCCTTTGCGATACTAAGCTGAAATCTATGAAACTACGCCACAGACACGACGGTGTCATTAAAGTCGACACATCATCTCTATCCTGTAAAATTTGTTACGAAAATATACAAGATTTGGAGTCTCTGATAAAACATCTACataatgaacataaaatatatttcgataaatCATTGTCGATTAACTTGCAGTCgtataaacttataaaagataattttccTTGTCCTTTTTGCGGGGATGTCTTTAGATACTTTCGAACGTTGTTAAATCATGTAGTTAAAATGCATTCGGATAATAAGAATATATGCATGCACTGTGGTATTGCTTTTAGGAATGCACCAAATTTACGAACGCACATCGCACGTCACCATAAGGCTGCTAATTATAAATGCTCCAAATGTGACTTAAGCTTCTT
TTCGAATTACTATCTTCAAACTCATTTAGGTCGAGTGCACGGTACGAAAGTCGTAGAATGTCTTGAGTGTCATGAGAAATTCACATCAGTGTATGAAATGCAAAGGCACAAAATAGACGTTCACGGTACAGGTCACGAGTGTTCGTACTGTCGTAAGTTGTTCACGGGAAAATCATCCGTCGTCGATCATATAAGACGGACTCATTTGAAAGAGAAGAATGTCGCGTGCACGGTTTGTCAGGAGAAATTTTTCGATAGGCAAAGTTTGAAAGTGCATATGGTGAAACATTACGGTGAGAGGAATTTCCATTGCGACATCTGCGGAAAGAAATTTCTTTGGAAGAAAAATCTTAGGGAGCACATGACGTCACATAACAAGAGCTCGAATAGTCAGTTCTCAAATGATAACTGA
- Protein Sequence
- MQSDECSTGLTAVQSQNLRTLFTITSIIPFECNDGLAYFTDFDNLRDHALSDHPHFNQECKAMKMIKGRDINVKIDISNLTCKICFESSLALNDLINHLITKHGVNYDKSIECLQPFRVVKDNMPCPICPNIVFRYFKKLLEHMNESHSDNNIVCAHCGLTFRGHPNYRAHMSRYHRSKACKCPDCNMEFWNLEKLARHRANVHGTKKYKCPQCVEKFITQHLMRKHQILMHGFGHKCPYCCKMFTRNSHMKNHVRRLHLKEKNVECTVCKEKFFDRALLNVHMVKHVGVDGPAKTLSPNTLRRRNLLILFNNTSIIPFKCRGKCRCFYCGEEFPIYDELRKHTKAHGPCSERDRAIKLVKTDEAEVKIDVSDITCELCNESFVNLDDIISHLITKHMLPYDIDVKLIIMTYRLIDKQCLVCDERFNQVSELVVHVNNEHPVQCFGCEVCHQKFIRKQYLDAHMRVKHSTVYKCLKCSQTFQSHVALQEHKIKSHVAVCNICFTKFSTQMKRLKHMKTEHANEPLKCGFCLKFMSTKLGFLRHAAKCTEKNENVNETFVIDDDDDDKKPTVIQIRKNIACIFNMSTAIPFKYFMSRFRCFYCPHDFTDCDDLRAHTVIEHPICDVNLKCMRLRNRQEGCVKVDTSVLSCKMCFENLPNLESLIEHLTSEHKALYDKSVDCNIQQYRLIKDNYPCPVCDEAFTHFSTLLKHMGQYHTDNKNICMHCGKSFRNLPNLRVHISNHHKTSGSYKCDRCEQEFSTNKYLQTHLGRAHGIKVYECPECAERFTSNYAMQRHMINTHSSGHKCLHCGKLFTSNCFMIDHIKRTHLKEKNVECQVCYERFYDTQRLKTHMVKHNGERNFHCDICDEENSPVKKTSANKLRRLNLQILFHNTSIIPFKWRGKYLCFYCGEDFKSYENLKEHTQEHGVCSDKDRALRLVKSADVEVKIDVSQISCNLCRENFVHLEETISHLVESHNLPYDRNVNLSIAPYRLSDLSCLLCDEKFTYLKKLIVHVNTNHPSQNLKCVQCQQKFNKTRDLDAHIRTKHKNHVCPKCLLNFRTRSELLNHTKNAHTFKCIVCNRSFSSMSKCFKHIKNEHTGAVMKCGFCFTSSTSKQDFHRHAIKCADFCKKSVEAVTVDVDKKPRVTQIRDNIACIFNMSTAIPFKYFMSKFRCFYCPKDFNECDDLKQHTMIEHPLCDTKLKSMKLRHRHDGVIKVDTSSLSCKICYENIQDLESLIKHLHNEHKIYFDKSLSINLQSYKLIKDNFPCPFCGDVFRYFRTLLNHVVKMHSDNKNICMHCGIAFRNAPNLRTHIARHHKAANYKCSKCDLSFFSNYYLQTHLGRVHGTKVVECLECHEKFTSVYEMQRHKIDVHGTGHECSYCRKLFTGKSSVVDHIRRTHLKEKNVACTVCQEKFFDRQSLKVHMVKHYGERNFHCDICGKKFLWKKNLREHMTSHNKSSNSQFSNDN*
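The coding sequence and protein sequence above can be cross-checked by translating one into the other. The following is a minimal sketch assuming the standard genetic code (NCBI translation table 1) and treating lowercase bases as ordinary bases; the `translate` helper is a hypothetical name, and unknown codons are rendered as `X` by choice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon, 'X' an unknown codon."""
    # Remove whitespace and line breaks, and ignore case differences.
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3)
    )


# First 30 and last 12 nucleotides of the coding sequence in this record.
print(translate("ATGCAATCAGACGAATGCTCTACTGGACTG"))  # MQSDECSTGL
print(translate("AATGATAACTGA"))                    # NDN*
```

Both fragments agree with the start (`MQSDECSTGL`) and end (`NDN*`) of the protein sequence listed above.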
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_01080077;
- 90% Identity
- iTF_01080077;
- 80% Identity
- -