Basic Information

Gene Symbol
-
Assembly
GCA_963855885.1
Location
OY979654.1:2362110-2371494[-]
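
The Location field packs contig, coordinates, and strand into one string. A minimal sketch of how it could be split is given here; the `Location` type and helper names are hypothetical, and the span length assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for a 'contig:start-end[strand]' string."""
    contig: str
    start: int
    end: int
    strand: str


# Pattern for strings such as "OY979654.1:2362110-2371494[-]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into contig, start, end, and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["contig"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(loc: Location) -> int:
    """Genomic span in bp, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("OY979654.1:2362110-2371494[-]")
print(loc, span_length(loc))
```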

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers give the number of residues at that position (a range where shown in parentheses). The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either histidine or cysteine. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
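
The pattern in the description can be approximated with a regular expression. The sketch below is only an illustration: it treats the # positions as "any residue" (a simplification, since the description does not say which residues are allowed there), and the function name `find_c2h2` is hypothetical. It is not a substitute for the profile HMM search reported under Hmmscan Out.

```python
import re

# #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# ASSUMPTION: '#' is matched as any residue, so the stretch between the
# second Cys and the first His collapses to 3 + 1 + 5 + 1 + 2 = 12 residues.
C2H2_RE = re.compile(r"..C.{1,5}C.{12}H.{3,6}[HC]")


def find_c2h2(protein: str):
    """Return (start, end, matched_text) for non-overlapping pattern matches.

    Coordinates are 0-based, end-exclusive (Python slice convention).
    """
    return [(m.start(), m.end(), m.group()) for m in C2H2_RE.finditer(protein)]


# Fragment copied from the protein sequence of this record.
fragment = "KPFQCSICLKRFTQLGGLQQHTRMHTGDRP"
for start, end, text in find_c2h2(fragment):
    print(start, end, text)
```

Because `finditer` reports non-overlapping matches only, closely packed or degenerate fingers may be missed or merged; a lookahead-based scan would be needed to list every overlapping candidate.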
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 49 0.0019 0.11 13.6 0.2 2 23 17 39 16 39 0.97
2 49 0.00031 0.019 16.0 0.0 1 23 190 213 190 213 0.97
3 49 2.4e-05 0.0015 19.5 1.0 1 23 219 241 219 241 0.99
4 49 4.5e-06 0.00028 21.8 0.5 1 23 247 269 247 269 0.98
5 49 1.9e-07 1.2e-05 26.1 4.8 1 23 275 297 275 297 0.99
6 49 4.7e-06 0.00029 21.8 2.3 1 21 303 323 303 324 0.96
7 49 0.25 15 6.9 0.9 10 23 327 340 325 340 0.88
8 49 4.7e-06 0.00029 21.8 2.3 1 21 346 366 346 367 0.96
9 49 0.25 15 6.9 0.9 10 23 370 383 368 383 0.88
10 49 4.7e-06 0.00029 21.8 2.3 1 21 389 409 389 410 0.96
11 49 0.25 15 6.9 0.9 10 23 413 426 411 426 0.88
12 49 4.7e-06 0.00029 21.8 2.3 1 21 432 452 432 453 0.96
13 49 0.25 15 6.9 0.9 10 23 456 469 454 469 0.88
14 49 4.7e-06 0.00029 21.8 2.3 1 21 475 495 475 496 0.96
15 49 0.25 15 6.9 0.9 10 23 499 512 497 512 0.88
16 49 4.7e-06 0.00029 21.8 2.3 1 21 518 538 518 539 0.96
17 49 0.25 15 6.9 0.9 10 23 542 555 540 555 0.88
18 49 4.7e-06 0.00029 21.8 2.3 1 21 561 581 561 582 0.96
19 49 0.25 15 6.9 0.9 10 23 585 598 583 598 0.88
20 49 4.7e-06 0.00029 21.8 2.3 1 21 604 624 604 625 0.96
21 49 0.25 15 6.9 0.9 10 23 628 641 626 641 0.88
22 49 4.7e-06 0.00029 21.8 2.3 1 21 647 667 647 668 0.96
23 49 0.25 15 6.9 0.9 10 23 671 684 669 684 0.88
24 49 4.7e-06 0.00029 21.8 2.3 1 21 690 710 690 711 0.96
25 49 0.25 15 6.9 0.9 10 23 714 727 712 727 0.88
26 49 4.7e-06 0.00029 21.8 2.3 1 21 733 753 733 754 0.96
27 49 0.25 15 6.9 0.9 10 23 757 770 755 770 0.88
28 49 4.7e-06 0.00029 21.8 2.3 1 21 776 796 776 797 0.96
29 49 0.25 15 6.9 0.9 10 23 800 813 798 813 0.88
30 49 4.7e-06 0.00029 21.8 2.3 1 21 819 839 819 840 0.96
31 49 0.25 15 6.9 0.9 10 23 843 856 841 856 0.88
32 49 0.00063 0.039 15.1 0.4 5 21 866 882 865 883 0.94
33 49 0.25 15 6.9 0.9 10 23 886 899 884 899 0.88
34 49 4.7e-06 0.00029 21.8 2.3 1 21 905 925 905 926 0.96
35 49 0.25 15 6.9 0.9 10 23 929 942 927 942 0.88
36 49 4.7e-06 0.00029 21.8 2.3 1 21 948 968 948 969 0.96
37 49 0.25 15 6.9 0.9 10 23 972 985 970 985 0.88
38 49 4.7e-06 0.00029 21.8 2.3 1 21 991 1011 991 1012 0.96
39 49 0.25 15 6.9 0.9 10 23 1015 1028 1013 1028 0.88
40 49 4.7e-06 0.00029 21.8 2.3 1 21 1034 1054 1034 1055 0.96
41 49 0.25 15 6.9 0.9 10 23 1058 1071 1056 1071 0.88
42 49 4.7e-06 0.00029 21.8 2.3 1 21 1077 1097 1077 1098 0.96
43 49 0.25 15 6.9 0.9 10 23 1101 1114 1099 1114 0.88
44 49 4.7e-06 0.00029 21.8 2.3 1 21 1120 1140 1120 1141 0.96
45 49 0.25 15 6.9 0.9 10 23 1144 1157 1142 1157 0.88
46 49 4.7e-06 0.00029 21.8 2.3 1 21 1163 1183 1163 1184 0.96
47 49 0.25 15 6.9 0.9 10 23 1187 1200 1185 1200 0.88
48 49 4.7e-06 0.00029 21.8 2.3 1 21 1206 1226 1206 1227 0.96
49 49 0.25 15 6.9 0.9 10 23 1230 1243 1228 1243 0.88
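
Each row of the table above has 13 whitespace-separated fields. A minimal sketch for loading the rows and keeping only the better-supported domains might look like this; the `DomainHit` class, the helper names, and the 0.01 i-Evalue cutoff are illustrative assumptions, not values taken from this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One row of the per-domain hmmscan table (hypothetical field names)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-separated table row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


def confident_hits(lines, max_i_evalue: float = 0.01):
    """Keep rows whose independent E-value is at or below the cutoff.

    ASSUMPTION: 0.01 is an arbitrary illustrative threshold.
    """
    hits = [parse_hit(line) for line in lines if line.strip()]
    return [h for h in hits if h.i_evalue <= max_i_evalue]


rows = [
    "5 49 1.9e-07 1.2e-05 26.1 4.8 1 23 275 297 275 297 0.99",
    "6 49 4.7e-06 0.00029 21.8 2.3 1 21 303 323 303 324 0.96",
    "7 49 0.25 15 6.9 0.9 10 23 327 340 325 340 0.88",
]
for hit in confident_hits(rows):
    print(hit.index, hit.ali_from, hit.ali_to, hit.i_evalue)
```

With this cutoff, the weak partial matches (i-Evalue 15, HMM coordinates 10 to 23) are dropped while the stronger hits are retained.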

Sequence Information

Coding Sequence
ATGCTCCAGATAGAACAACCGGAAAGTGATGACGATGCAGATGCTGTTCAGTGCAAGGTGTGCTCCAAGAGCTTTGTGACGGAGATGGCGGTGAACAACCACGCTCGGATGGAGCACATCGAGCAGTACCTCGCCGGGGAACAGCTCGCTGTCAGATTGCAGCCAGACAAGAAACGCAAGCAGCCTGATGATGGACAGGATCCAAAGATCTCGAAGCTGGTGTCCGTGATGCAGACAGCCACGAGCCTCGCCCCCCACGACTTCAGCTACATCATCATCAAGGAAAACGACCTTCAGGGAGCCCATATGGTCAAAAGAGCTAAAAAGGCAGAGAAAGAAAAACCTGCGAAAGTAGTAGCAAAGCCTAAGGAGAAAGCTGTAAAAGCTATAAGCGGTCCATTCGAGTGCCTGCAGCCGTCTCCCCACGACCCCGATACACTGTGCCACCAGATTTTCCTCTCCTGCTGCGAGTACTCAATGCACTTCCGCGACGAACACACACGGCGTCGCAAGGGCAACCGCTGCCAGGTCTGCGAGAAGCCACTCACGGGAGAGCAACAGGGCCCGTACTCTTGCCAGGTTTGCGGCATGGGTTTTGAAAACAACAAAGATTTAGCAGCGCACGGTCAGACCGCCCACGTCAAACTCAAGCCGTTCCAGTGCAGCATATGCCTGAAGCGGTTCACGCAGCTGGGAGGGCTGCAGCAGCACACGCGCATGCACACGGGGGACCGCCCCTTCGAGTGCACCTTCTGCCCCAAGGCGTTCACGCAGAAGTCCGGCCTGGACCAGCACCTGAGGATACACACCAAGGTCCGCCCGTACCGCTGCGtaatatgcggcaagtcgttcTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCGACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCGACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAG
GTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTTCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAATTTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTTTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAATTTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAATTTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAACGTGGCGCCGTTCCAGTGCGGCATCTGCGACAAGCGGTTCAAACAGAGCAGCCACCTCAACTACCACCTCAGGTACGAGTCGCTCTGCCAGTCGGTGCACTTGCAGCAGCACATGCGCACGCACACCAATTTGGCGCCGTTCCAGTACCACAATGTCGCCAACATGACAGATGAGCAAAAGACCAAGTACGCGGAACTGATAGCACAGATCATCGCGCAAGAGGCGTCTGGAAAATCCAACCAGCAACGCTCTCCACAGTCTAAACAGAGAGTTCTGAAGCACCAAACCAATACCGACCAAGGAAATGGCCAACAAAATTGGTCACAAGCATCTGGTTACGAACAACAAATGTTAGAATATAGCCAGTCGGATTTGACGCAAGAGGAAATTTCCC
AGTCGGATTTGACGCAGGAGGAAATTGCCCAGTCGGAATTGTCGCTGGAGGAATACGAGCAGTCGGAATTGACGCAGGAGTATGACCAGTCGGAATTGACGCAGGAGTATGACCAGTCGGAATTGACGCAGGAGTATGACCAGTCGGATTTGACTCAGGAGGAAACAGACCAGTCGGAGTCGAATGAACTGACAGGTGAAACGGTGTATATTTTGAACGAAGGAGAAGTTTGGTGCGAGGGAGTTGTCTGCGAATAG
Protein Sequence
MLQIEQPESDDDADAVQCKVCSKSFVTEMAVNNHARMEHIEQYLAGEQLAVRLQPDKKRKQPDDGQDPKISKLVSVMQTATSLAPHDFSYIIIKENDLQGAHMVKRAKKAEKEKPAKVVAKPKEKAVKAISGPFECLQPSPHDPDTLCHQIFLSCCEYSMHFRDEHTRRRKGNRCQVCEKPLTGEQQGPYSCQVCGMGFENNKDLAAHGQTAHVKLKPFQCSICLKRFTQLGGLQQHTRMHTGDRPFECTFCPKAFTQKSGLDQHLRIHTKVRPYRCVICGKSFCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTDVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTDVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQFGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNLAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNLAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNLAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNVAPFQCGICDKRFKQSSHLNYHLRYESLCQSVHLQQHMRTHTNLAPFQYHNVANMTDEQKTKYAELIAQIIAQEASGKSNQQRSPQSKQRVLKHQTNTDQGNGQQNWSQASGYEQQMLEYSQSDLTQEEISQSDLTQEEIAQSELSLEEYEQSELTQEYDQSELTQEYDQSELTQEYDQSDLTQEETDQSESNELTGETVYILNEGEVWCEGVVCE
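
The relationship between the coding sequence and the protein sequence can be checked by translating the CDS. The sketch below assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly; the `translate` helper is hypothetical. It upper-cases the input, since part of the CDS appears in lower case, and strips whitespace so that a sequence wrapped over several lines can be passed in directly.

```python
BASES = "TCAG"
# Standard genetic code (ASSUMED), amino acids in TCAG x TCAG x TCAG order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon.

    Codons containing non-ACGT symbols are rendered as 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 36 nt and last 15 nt of the coding sequence in this record.
print(translate("ATGCTCCAGATAGAACAACCGGAAAGTGATGACGAT"))
print(translate("GTTGTCTGCGAATAG"))
```

The two printed fragments can be compared with the start and end of the protein sequence listed above.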

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-