Basic Information

Gene Symbol
-
Assembly
GCA_963971445.1
Location
OZ020529.1:56767022-56774651[+]
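
The Location field packs a sequence accession, a coordinate range, and a strand into one string. A minimal sketch of splitting it apart follows. The names `Location`, `parse_location`, and `span_length` are hypothetical helpers and not part of the database. The sketch also assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for a parsed Location field."""
    seqid: str
    start: int
    end: int
    strand: str


# Expected shape: <accession>:<start>-<end>[<strand>]
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a Location string into accession, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["seqid"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(loc: Location) -> int:
    """Genomic span in bases, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1
```

Calling `parse_location` on the Location value of this record gives the accession, both coordinates as integers, and the strand. `span_length` then reports the genomic span under the inclusive-coordinate assumption.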

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger follows the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in brackets indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this core does not cover every known site: among others, it omits the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
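
The residue pattern in the description can be approximated with a regular expression. This is a rough sketch and not how the domains in this record were called (those come from the PF00096 profile HMM). It assumes the positions marked # accept any residue, since the description does not say which residues are allowed there. The names `C2H2_PATTERN` and `find_c2h2` are hypothetical.

```python
import re

# #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# ASSUMPTION: '#' positions are treated as "any residue" (a plain dot).
C2H2_PATTERN = re.compile(
    r"..C"         # '#', X, then the first Cys
    r".{1,5}C"     # X(1-5), then the second Cys
    r".{3}."       # X3, then '#'
    r".{5}."       # X5, then '#'
    r".{2}H"       # X2, then the first His
    r".{3,6}[HC]"  # X(3-6), then the final His or Cys
)


def find_c2h2(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping matches."""
    return [
        (m.start(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]
```

A regular expression like this is far looser than a profile HMM, so it is best used as a quick sanity check. Match counts should not be expected to agree with the hmmscan domain count.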
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 49 0.0064 0.53 11.2 0.5 3 23 9 30 8 30 0.92
2 49 0.12 9.7 7.3 0.6 3 23 33 54 32 54 0.88
3 49 0.033 2.7 9.0 0.0 3 20 58 75 56 77 0.94
4 49 1.5 1.2e+02 3.8 3.3 3 23 84 105 81 106 0.93
5 49 0.011 0.91 10.5 1.9 2 23 283 305 283 305 0.92
6 49 0.00026 0.022 15.6 0.7 2 23 307 329 307 329 0.95
7 49 0.0038 0.32 11.9 0.3 3 23 333 354 332 354 0.95
8 49 0.002 0.17 12.8 1.1 2 23 356 378 355 378 0.96
9 49 0.0052 0.43 11.5 0.5 3 23 382 403 381 403 0.92
10 49 0.025 2.1 9.3 0.2 2 22 405 425 405 427 0.89
11 49 0.0076 0.63 11.0 2.2 2 23 454 476 454 476 0.94
12 49 0.032 2.6 9.0 0.3 3 23 481 502 479 502 0.91
13 49 0.0019 0.15 12.9 1.5 2 23 504 526 504 526 0.94
14 49 0.032 2.6 9.0 0.3 3 23 531 552 529 552 0.91
15 49 0.011 0.91 10.5 1.9 2 23 554 576 554 576 0.92
16 49 0.00026 0.022 15.6 0.7 2 23 578 600 578 600 0.95
17 49 0.0038 0.32 11.9 0.3 3 23 604 625 603 625 0.95
18 49 0.002 0.17 12.8 1.1 2 23 627 649 626 649 0.96
19 49 0.0011 0.09 13.6 0.3 3 23 653 674 652 674 0.94
20 49 0.0083 0.69 10.9 1.4 2 23 676 698 676 698 0.95
21 49 0.0038 0.32 11.9 0.3 3 23 702 723 701 723 0.95
22 49 0.002 0.17 12.8 1.1 2 23 725 747 724 747 0.96
23 49 0.0052 0.43 11.5 0.5 3 23 751 772 750 772 0.92
24 49 0.025 2.1 9.3 0.2 2 22 774 794 774 796 0.89
25 49 0.0076 0.63 11.0 2.2 2 23 823 845 823 845 0.94
26 49 0.01 0.85 10.6 0.2 3 23 850 871 848 871 0.93
27 49 0.0019 0.15 12.9 1.5 2 23 873 895 873 895 0.94
28 49 0.032 2.6 9.0 0.3 3 23 900 921 898 921 0.91
29 49 0.0019 0.15 12.9 1.5 2 23 923 945 923 945 0.94
30 49 0.032 2.6 9.0 0.3 3 23 950 971 948 971 0.91
31 49 0.0019 0.15 12.9 1.5 2 23 973 995 973 995 0.94
32 49 0.032 2.6 9.0 0.3 3 23 1000 1021 998 1021 0.91
33 49 0.0019 0.15 12.9 1.5 2 23 1023 1045 1023 1045 0.94
34 49 0.032 2.6 9.0 0.3 3 23 1050 1071 1048 1071 0.91
35 49 0.0019 0.15 12.9 1.5 2 23 1073 1095 1073 1095 0.94
36 49 0.92 76 4.4 0.2 3 23 1100 1121 1098 1121 0.91
37 49 0.0019 0.15 12.9 1.5 2 23 1123 1145 1123 1145 0.94
38 49 0.032 2.6 9.0 0.3 3 23 1150 1171 1148 1171 0.91
39 49 0.00083 0.069 14.0 1.4 2 23 1173 1195 1173 1195 0.95
40 49 0.02 1.7 9.6 0.7 3 23 1199 1220 1198 1220 0.92
41 49 0.009 0.74 10.8 1.7 2 23 1222 1244 1222 1244 0.92
42 49 0.12 9.7 7.3 0.6 3 23 1247 1268 1246 1268 0.88
43 49 0.0066 0.54 11.2 0.4 3 23 1272 1293 1271 1293 0.92
44 49 0.12 9.7 7.3 0.6 3 23 1296 1317 1295 1317 0.88
45 49 0.0023 0.19 12.6 0.4 3 23 1321 1342 1320 1342 0.95
46 49 3.1 2.6e+02 2.8 0.4 2 23 1344 1366 1344 1366 0.82
47 49 0.065 5.3 8.1 0.2 5 23 1372 1391 1369 1392 0.91
48 49 0.019 1.6 9.7 0.6 3 23 1395 1416 1394 1417 0.94
49 49 1 86 4.3 3.8 3 23 1441 1462 1438 1462 0.94
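
Each domain row above carries thirteen whitespace-separated fields. A minimal sketch of loading the rows and keeping only hits below an independent E-value cutoff follows. The names `DomainHit`, `parse_domain_rows`, and `filter_by_i_evalue` are hypothetical, and any cutoff passed in is the caller's choice, not a threshold used by this database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One row of the per-domain hmmscan table (field names are illustrative)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_rows(text: str) -> list[DomainHit]:
    """Parse data rows; the header and any malformed lines are skipped."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        # Data rows have exactly 13 fields and start with the domain number.
        if len(f) != 13 or not f[0].isdigit():
            continue
        hits.append(
            DomainHit(
                int(f[0]), int(f[1]),
                float(f[2]), float(f[3]), float(f[4]), float(f[5]),
                int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                int(f[10]), int(f[11]),
                float(f[12]),
            )
        )
    return hits


def filter_by_i_evalue(hits: list[DomainHit], cutoff: float) -> list[DomainHit]:
    """Keep hits whose independent E-value is at or below the cutoff."""
    return [h for h in hits if h.i_evalue <= cutoff]
```

Passing the table text to `parse_domain_rows` and then calling `filter_by_i_evalue` with a chosen cutoff separates the stronger zinc-finger hits from weak ones such as domain 46.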

Sequence Information

Coding Sequence
ATGGACGCggaacataaaaaaatttgtaccaAGTGTGATGCTGCCTTCTTCAATAAGGAGGACTTAGAAAATCATTTCAAGGATACACATCCGACTTGTTCATGTTGTAATGGCAGCTTCATTGATAAAGATGCTTTAACTAGTCACATGGACGCGGAACATAAAGTAATTTGTACGAAGTGTGATGCTGCCTTCTTCAATAAGGCTGACTTAGATAATCACATAGAGGATAAACGTATTTGTGAAATGTGTCTAGATTGTAGTGCCATATTCCTTCACAAGGATGAGTTAGTTAATCACATGAAAGCTAAGCACCATGAAATGTGTCCAGACAACAATGATGATTTGATTGAAACGAATGAAAATCCGGAAAATGCTATAAATGCATCCAATAATATTCAAATTAATAGAATTGCCTTAAGAGAATATTGGGGACGTACCAGTGTTGGTTTAACTACAATCGACTGTAAGCCAATTGATAGTAGATTACAAAAATTGGGCACCGTTGCTtccatatataaaatgtacaaaatcgTGGAATTCAACTGTCACATTGGTCATGTGGAGGGTCAGCAAGCCTCTGGCAGTTATTTTGTTGGCATATCCTATGGCAACAAACATCCACAGCATAAAAATGATATAACTTCTCTAACTACCGCTGTGTCTGCGGCAATTACCCAGAACACTTCGATTAATGTATCAACTGATAAATTAATGACCAAATCGTGGTTACGGATGAATGAAGAATCGCCCGGAGCTATTTTAATCTATTCGTATGCACGTGAAGAGTTGGATGTTTGGATTACATATTGCATAGAACTAAAAGACAAAACTGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTCAAAGATAAACACCCGACTTGTCCAAATTGTAAGCGCAACTTCGTTGATAAAGATGCCTTAGATAAACACTTGAGAGATAAACATAAAGAAATTTGTTCGCAGTGTGATGCTGTCTTCTTCAATAAGGAGGACTTAGATAATCACTTGAGGGATAAACACACGTCTTGTCCAATTTGTAAATGCAGCTTTATTGATGGGATCGCTTTGACTCATCACATGGAGACGGAACATAAAGAAATTTGTACGAAGTGTGATGCTGCCTTCTCCAATAAGGCTGACTTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAATTTGTAAATGCAGCTTTATTAATGAGATCGCTTTGATTTATCACATCGGAGCAAAACATATAGAAATTTGTCCGAAATATGATGCTGCCTTTTTCAATACGGATGACTTAGAAAATTACTTCAAGGATAAACATCCGACTTGTTCGAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTCAAAGATAAACACCCGACTTGTCCAAATTGTAAGCGCAACTTCGTTGATAAAGATGCCTTAGATAAACACTTGAGAGATAAACATAAAGAAATTTGTTCGCAGTGTGATGCTGTCTTCTTCAATAAGGAGGACTTAGATAATCACTTGAGGGATAAACACACGTCTTGTCCAATTTGTAAATGCAGCTTTATTGATGGGATCGCTTTGACTCATCACATGGAGACGGAACATAAAGAAATTTGTACCAAGTGTGATGCTGCCTTCTCCAATAAGGCTGACTTAGA
TAATCACTTAAAGGATAAACACCCGACTTGCCCAAATTGTAAGAGCATCTTCGTTCATAAAGATGCCTTAGATAAACACTTGAGAGATAAACATAAAGAAATTTGTTCGCAGTGTGATGCTGTCTTCTTCAATAAGGAGGACTTAGATAATCACTTGAGGGATAAACACACGTCTTGTCCAATTTGTAAATGCAGCTTTATTGATGGGATCGCTTTGACTCATCACATGGAGACGGAACATAAAGAAATTTGTACGAAGTGTGATGCTGCCTTCTCCAATAAGGCTGACTTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAATTTGTAAATGCAGCTTTATTAATGAGATCGCTTTGATTTATCACATCGGAGCAAAACATATAGAAATTTGTCCGAAATATGATGCTGCCTTTTTCAATACGGATGACTTAGAAAATTACTTCAAGGATAAACATCCGACTTGTTCGAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTACAAAGATACACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCGTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAAAGATAAACATAAAGGCACGATTTGTTCGGTATGTGATGCTGCCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCACTTGAGAGATAAACATAAAGAAATTTGTTCGAAGTGTGATGCTGTCTTCTTCAATAAGGATGAATTAGATAATCACTTCAAGGATAAACATCCGACTTGTCCAAATTGTAAGCGCATCTTCGTTCATAAAGATGCCTTAGATAATCATTTCAAGGATACACATCCGACTTGTTCATGTTGTAATGGCAGCTTCATTGATAAAGATGCTTTAACTAGTCACATGGACGCggaacataaaaaaatttgttcgCAGTGTGATGCTGTCTTCTTCAATAAGGAGGACTTAGAAAATCATTTCAAGGATACACATCCGACTTGTTCATGTTGTAATGGCAGCTTCATTGATAAAGATGCTTTAACTAGTCACATGGACGCggaacataaaaaaatttgtaccaAGTGTGATGCTGCCTTCTTCAATAAGGATGACT
TAGATAATCACTTGAGGGATAAACACCCGTCTTGTCCAATTTGTAAATGCAGCTTTCTTGATGGGATCGCTTTGACTTATCACATAGAGGAAAGACATAAAGAAATTTGCATAATTTGTAATGCTGTATTCTTCAGTTGGGTTCAATTGGCTGATCATATGAAAGCTAAGCACCATGAAATGTGTCCAAATTGCAATGCCGTATTCCTCTACAAGGATGACTTGGTTAATCACATGAAAGCTAAACACCACGAAATCTGTCCAGATTGTAATGCGGATGCCTCAGATAATCACATAGAGGATAAACGTATCTGTGAAATGTGTCTAGATTGTAATGCCATATTCCTTCACAAGCATGAGTTAGTTAATCACATGAAAGCTAAGCACCTTGTAATGTGTCCAGACAACAATGATGATTTGATTGAAACGAATGAAAATCCGGAAAATGCTATAAATGCATCCAATAATATTCACATTGATATAATTGCCTTAAGAGAATATTGGGGACGTACCAGTGTTGGTTTAACTACAATCGACTGTAAGCCAATTGATAGTAGATTACAAAAATTGGGCACCGTTGCTtccatatataaaatgtacaaaatcgTGGCATTCGACTGTCACATTGGTCATGTGGAGGGTCAGCAAGCCTCTGGCAGTTATTTTGTTGGCATATCCTATGGTAACAAACATCCACAGCATAAAAATGATATAACTTCTCTAACTACCGCTGTGTCTGCGGCAATTACCCAGAACACTTCGATTAATGTATCAACTGATAAATTAATGACCAAACCGTGGTTAAGTGGTGAAGAGTTGGATGTTTGGATCACATATTGCATAGAACTAAAAGACAAAACTAAGGACTTAGATAATAACTTCAAAGATAAACATCCGACGTGTCCAAGTTATAATGTCAACTTCATTAATAAAGATGCTTGA
Protein Sequence
MDAEHKKICTKCDAAFFNKEDLENHFKDTHPTCSCCNGSFIDKDALTSHMDAEHKVICTKCDAAFFNKADLDNHIEDKRICEMCLDCSAIFLHKDELVNHMKAKHHEMCPDNNDDLIETNENPENAINASNNIQINRIALREYWGRTSVGLTTIDCKPIDSRLQKLGTVASIYKMYKIVEFNCHIGHVEGQQASGSYFVGISYGNKHPQHKNDITSLTTAVSAAITQNTSINVSTDKLMTKSWLRMNEESPGAILIYSYAREELDVWITYCIELKDKTDKHPTCPNCKRIFVHKDALDNHFKDKHPTCPNCKRNFVDKDALDKHLRDKHKEICSQCDAVFFNKEDLDNHLRDKHTSCPICKCSFIDGIALTHHMETEHKEICTKCDAAFSNKADLDNHFKDKHPTCPICKCSFINEIALIYHIGAKHIEICPKYDAAFFNTDDLENYFKDKHPTCSNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHFKDKHPTCPNCKRNFVDKDALDKHLRDKHKEICSQCDAVFFNKEDLDNHLRDKHTSCPICKCSFIDGIALTHHMETEHKEICTKCDAAFSNKADLDNHLKDKHPTCPNCKSIFVHKDALDKHLRDKHKEICSQCDAVFFNKEDLDNHLRDKHTSCPICKCSFIDGIALTHHMETEHKEICTKCDAAFSNKADLDNHFKDKHPTCPICKCSFINEIALIYHIGAKHIEICPKYDAAFFNTDDLENYFKDKHPTCSNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHYKDTHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAVFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLKDKHKGTICSVCDAAFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHLRDKHKEICSKCDAVFFNKDELDNHFKDKHPTCPNCKRIFVHKDALDNHFKDTHPTCSCCNGSFIDKDALTSHMDAEHKKICSQCDAVFFNKEDLENHFKDTHPTCSCCNGSFIDKDALTSHMDAEHKKICTKCDAAFFNKDDLDNHLRDKHPSCPICKCSFLDGIALTYHIEERHKEICIICNAVFFSWVQLADHMKAKHHEMCPNCNAVFLYKDDLVNHMKAKHHEICPDCNADASDNHIEDKRICEMCLDCNAIFLHKHELVNHMKAKHLVMCPDNNDDLIETNENPENAINASNNIHIDIIALREYWGRTSVGLTTIDCKPIDSRLQKLGTVASIYKMYKIVAFDCHIGHVEGQQASGSYFVGISYGNKHPQHKNDITSLTTAVSAAITQNTSINVSTDKLMTKPWLSGEELDVWITYCIELKDKTKDLDNNFKDKHPTCPSYNVNFINKDA
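
The protein sequence can be cross-checked against the coding sequence by translating it. The sketch below assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly. The helper name `translate` is hypothetical. Lowercase (soft-masked) bases are uppercased before lookup.

```python
from itertools import product

# Standard genetic code (ASSUMED: NCBI table 1), codons ordered over T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T become 'X'.
    A trailing partial codon is ignored.
    """
    seq = "".join(cds.split()).upper()  # drop line breaks, unify case
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

The whitespace stripping means the coding sequence can be passed in even when it is wrapped over several lines. The result can then be compared with the Protein Sequence field using `==`.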

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-