Basic Information

Gene Symbol
-
Assembly
GCA_905404275.1
Location
FR990120.1:3916939-3943105[+]

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in parentheses give the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
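As a rough illustration, the pattern in the description can be written as a regular expression and scanned against a protein sequence. This is only a sketch under stated assumptions: the function name `find_c2h2` is hypothetical, the `#` positions are treated as any residue (the pattern does not say which residues are allowed there), and a regex match is much cruder than the profile HMM search behind PF00096. The example fragment is residues 21-43 of the protein sequence in this record.

```python
import re

# Pattern from the description: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' positions accept any residue, so X3-#-X5-#-X2 collapses
# to 12 unconstrained residues between the second Cys and the first His.
C2H2_REGEX = re.compile(r"..C.{1,5}C.{12}H.{3,6}[HC]")


def find_c2h2(protein):
    """Return (start, end, text) for non-overlapping pattern matches.

    Coordinates are 1-based and inclusive.
    """
    hits = []
    for m in C2H2_REGEX.finditer(protein.upper()):
        hits.append((m.start() + 1, m.end(), m.group(0)))
    return hits


# Residues 21-43 of the protein sequence in this record.
fragment = "FKCDICSIWLKNRHHLVEHMMLH"
for start, end, text in find_c2h2(fragment):
    print(start, end, text)
```

Because the quantifiers are greedy and matches are non-overlapping, the reported boundaries on a full-length protein may differ by a few residues from the hmmscan alignment coordinates.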
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 48 0.37 41 6.2 4.1 1 23 21 43 21 44 0.96
2 48 0.032 3.6 9.6 0.3 1 23 48 70 48 70 0.94
3 48 0.0027 0.3 13.0 1.4 1 23 74 97 74 97 0.97
4 48 0.0046 0.51 12.2 0.2 2 23 102 124 101 124 0.95
5 48 1.5 1.7e+02 4.3 0.3 1 21 139 159 139 163 0.90
6 48 0.44 49 6.0 0.4 6 23 176 195 176 195 0.90
7 48 0.019 2.2 10.3 0.2 3 23 238 257 236 257 0.97
8 48 0.037 4.1 9.4 1.8 1 21 290 310 290 311 0.93
9 48 0.037 4.1 9.4 1.8 1 21 331 351 331 352 0.93
10 48 0.037 4.1 9.4 1.8 1 21 372 392 372 393 0.93
11 48 0.037 4.1 9.4 1.8 1 21 413 433 413 434 0.93
12 48 0.037 4.1 9.4 1.8 1 21 454 474 454 475 0.93
13 48 0.037 4.1 9.4 1.8 1 21 495 515 495 516 0.93
14 48 0.037 4.1 9.4 1.8 1 21 536 556 536 557 0.93
15 48 0.037 4.1 9.4 1.8 1 21 577 597 577 598 0.93
16 48 0.037 4.1 9.4 1.8 1 21 618 638 618 639 0.93
17 48 0.037 4.1 9.4 1.8 1 21 659 679 659 680 0.93
18 48 0.037 4.1 9.4 1.8 1 21 700 720 700 721 0.93
19 48 0.037 4.1 9.4 1.8 1 21 741 761 741 762 0.93
20 48 0.037 4.1 9.4 1.8 1 21 782 802 782 803 0.93
21 48 0.037 4.1 9.4 1.8 1 21 823 843 823 844 0.93
22 48 0.037 4.1 9.4 1.8 1 21 864 884 864 885 0.93
23 48 0.037 4.1 9.4 1.8 1 21 905 925 905 926 0.93
24 48 0.037 4.1 9.4 1.8 1 21 946 966 946 967 0.93
25 48 0.037 4.1 9.4 1.8 1 21 987 1007 987 1008 0.93
26 48 0.037 4.1 9.4 1.8 1 21 1028 1048 1028 1049 0.93
27 48 0.037 4.1 9.4 1.8 1 21 1069 1089 1069 1090 0.93
28 48 0.037 4.1 9.4 1.8 1 21 1110 1130 1110 1131 0.93
29 48 0.037 4.1 9.4 1.8 1 21 1151 1171 1151 1172 0.93
30 48 0.037 4.1 9.4 1.8 1 21 1192 1212 1192 1213 0.93
31 48 0.037 4.1 9.4 1.8 1 21 1233 1253 1233 1254 0.93
32 48 0.037 4.1 9.4 1.8 1 21 1274 1294 1274 1295 0.93
33 48 0.037 4.1 9.4 1.8 1 21 1315 1335 1315 1336 0.93
34 48 0.037 4.1 9.4 1.8 1 21 1356 1376 1356 1377 0.93
35 48 0.037 4.1 9.4 1.8 1 21 1397 1417 1397 1418 0.93
36 48 0.037 4.1 9.4 1.8 1 21 1438 1458 1438 1459 0.93
37 48 0.037 4.1 9.4 1.8 1 21 1479 1499 1479 1500 0.93
38 48 0.037 4.1 9.4 1.8 1 21 1520 1540 1520 1541 0.93
39 48 0.037 4.1 9.4 1.8 1 21 1561 1581 1561 1582 0.93
40 48 0.037 4.1 9.4 1.8 1 21 1602 1622 1602 1623 0.93
41 48 0.037 4.1 9.4 1.8 1 21 1643 1663 1643 1664 0.93
42 48 0.037 4.1 9.4 1.8 1 21 1684 1704 1684 1705 0.93
43 48 0.085 9.5 8.2 2.4 1 21 1725 1745 1725 1746 0.93
44 48 5.1 5.7e+02 2.7 0.2 5 21 1770 1786 1769 1787 0.90
45 48 0.037 4.1 9.4 1.8 1 21 1807 1827 1807 1828 0.93
46 48 0.037 4.1 9.4 1.8 1 21 1848 1868 1848 1869 0.93
47 48 0.037 4.1 9.4 1.8 1 21 1889 1909 1889 1910 0.93
48 48 2.6 2.9e+02 3.6 3.1 1 14 1930 1943 1930 1944 0.92
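The rows of the table above can be read programmatically. The following is a minimal sketch, assuming each row has exactly 13 whitespace-separated columns in the order shown in the header; the names `DomainHit`, `parse_domain_row`, and `I_EVALUE_CUTOFF` are hypothetical and the cutoff value is chosen only to illustrate filtering, not as a recommended threshold.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int        # domain number within the protein
    total: int        # total number of domains reported
    c_evalue: float   # conditional E-value
    i_evalue: float   # independent E-value
    score: float      # bit score
    bias: float       # biased-composition score correction
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float        # mean posterior probability of aligned residues


def parse_domain_row(row):
    """Parse one whitespace-separated row of the table into a DomainHit."""
    f = row.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 columns, got {len(f)}")
    return DomainHit(
        int(f[0]), int(f[1]), float(f[2]), float(f[3]), float(f[4]),
        float(f[5]), int(f[6]), int(f[7]), int(f[8]), int(f[9]),
        int(f[10]), int(f[11]), float(f[12]),
    )


# Rows 3-5 copied from the table above.
rows = [
    "3 48 0.0027 0.3 13.0 1.4 1 23 74 97 74 97 0.97",
    "4 48 0.0046 0.51 12.2 0.2 2 23 102 124 101 124 0.95",
    "5 48 1.5 1.7e+02 4.3 0.3 1 21 139 159 139 163 0.90",
]
hits = [parse_domain_row(r) for r in rows]

# Hypothetical cutoff, chosen only to illustrate filtering.
I_EVALUE_CUTOFF = 1.0
strong = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]
for h in strong:
    print(h.index, h.ali_from, h.ali_to, h.score)
```

Keeping each row as a typed record makes it straightforward to sort or filter domains by E-value, score, or alignment coordinates.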

Sequence Information

Coding Sequence
ATGAGCGACCTCGAGGGTCCTAGGAAGGGCCGTATAATTTACGCAGGCCCAAATTCCTGGTTCAAATGCGACATATGCTCCATATGGCTGAAGAATCGGCATCACCTTGTGGAACACATGATGCTGCATCACCGGTACATGTATACCTGCAACATTTGCAAGGAGGTGTTCACTATACGTGATCAAGCTGTCACGCACAACGACACGCATAAGAAGCGATTCGAATGTGAGGAATGCAGCGTAGTGTTTAAGACTGACGCCAAGCTCCAACAGCATCACAAGGTGAAACACGTGGTCAAGATCCCGTGCAACGTGTGCGGAATTTCCGTGAGGACCGAGAGGCTGCTTTACAAGCACCAGTTGCGGAAACACAAGTATATCAGCATAAACGACCCGCTAGTAGGCGACCCGGAGCACAAGTGCTCGCAGTGCGAGGTACTGTTCATGGATAAGGGCGCCTTGGACACGCATCTCGTCAGCGCCAACCACCAGGGCGCCGACAACGTGCAGTTCGCATGTATCCCGTGCAAGAAGCGCTTCGAGGACGAGGAGCAGCTGGAAACGCACTACATCACCTGCAACCACTCGCGGAAGTACCCTCGCAAGTGCTGCTACTGCCCTGAGGTGCTACTAAACCCGGCGTCGTACAAGAGCCACTACATCCAACAGCACCCCTCGTTGCCGTACCGCTATATCAGCAAGAACTCCATCTGCGACATCTGCGGCAAGTCGATCTACAAACCGTACCTGGAAACCCACCTGAAGACCCACGAAAGCAGCGTGCACCAGTGCCGGCTGTGCGAGGCGAGTCACCCCACGCCCGCCACGCACGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACATGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCG
CTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACA
TACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCTCGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGTGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCTACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTACGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACATGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCACGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCGACTGCAAGGACCACATACTGAAGGTGACCCCGTGTGAGCCACCAGTGGCGCACGTGCTCAGCCGCCTCGCGCGCCGCCCCTACCGCTGCGGCGTGTGCCCGCGCCGCTTCCGGTACCGCTCCTGCCCGACTGCAatgaaaaaactatttgtatGGGAACTTTGA
Protein Sequence
MSDLEGPRKGRIIYAGPNSWFKCDICSIWLKNRHHLVEHMMLHHRYMYTCNICKEVFTIRDQAVTHNDTHKKRFECEECSVVFKTDAKLQQHHKVKHVVKIPCNVCGISVRTERLLYKHQLRKHKYISINDPLVGDPEHKCSQCEVLFMDKGALDTHLVSANHQGADNVQFACIPCKKRFEDEEQLETHYITCNHSRKYPRKCCYCPEVLLNPASYKSHYIQQHPSLPYRYISKNSICDICGKSIYKPYLETHLKTHESSVHQCRLCEASHPTPATHVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHMLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRLARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHVRRPYRCGVCPRRFRYRSYCKDHILKVTPCEPPVAHVLSRHARRPYRYGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHMLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRHARRPYRCGVCPRRFRYRSDCKDHILKVTPCEPPVAHVLSRLARRPYRCGVCPRRFRYRSCPTAMKKLFVWEL*
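The relationship between the two sequence fields can be checked with a small translation routine. This is a sketch only, assuming the standard genetic code and a coding sequence that starts in frame; `translate` and the table names are hypothetical helpers, not part of any tool used to build this record. The two example inputs are the first six and the last ten codons of the coding sequence above.

```python
BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon, 'X' an unknown codon."""
    cds = cds.upper()  # lower-case bases are translated like upper-case ones
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


print(translate("ATGAGCGACCTCGAGGGT"))
print(translate("atgaaaaaactatttgtatGGGAACTTTGA"))
```

The translated fragments can be compared with the start and end of the protein sequence; the trailing `*` corresponds to the stop codon.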

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-