Basic Information

Gene Symbol
-
Assembly
GCA_958510825.1
Location
OY294047.1:53149204-53153378[-]
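
The location string above can be split into its parts with a short parser. This is a minimal sketch: the layout `accession:start-end[strand]` is inferred from this single record, the function name `parse_location` is hypothetical, and the length assumes 1-based inclusive coordinates, which the record does not state.

```python
import re

# Assumed layout, inferred from this record only:
#   <sequence accession>:<start>-<end>[<strand>]
LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)

def parse_location(text):
    """Split a location string into accession, coordinates and strand."""
    m = LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(m["start"]), int(m["end"])
    return {
        "seqid": m["seqid"],
        "start": start,
        "end": end,
        "strand": m["strand"],
        # Assumption: coordinates are 1-based and inclusive.
        "length": end - start + 1,
    }

loc = parse_location("OY294047.1:53149204-53153378[-]")
print(loc)
```

Under that assumption the locus spans 4175 bp on the minus strand.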

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
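
The pattern in the description can be written as a regular expression. The sketch below is an illustration, not the method used to annotate this record (the record relies on the PF00096 profile HMM). It assumes that the positions marked # accept any residue, because the description does not say which residues are allowed there. The test string is residues 101-123 of the protein sequence in this record, which correspond to the first domain hit.

```python
import re

# #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' positions are matched by any residue.
C2H2_PATTERN = re.compile(
    r"""
    .        # '#' position
    .        # X
    C        # first zinc-coordinating cysteine
    .{1,5}   # X(1-5)
    C        # second zinc-coordinating cysteine
    .{3}     # X3
    .        # '#' position
    .{5}     # X5
    .        # '#' position
    .{2}     # X2
    H        # first zinc-coordinating histidine
    .{3,6}   # X(3-6)
    [HC]     # final position: His or Cys
    """,
    re.VERBOSE,
)

finger = "HICEYCAKVFTLKYNLKAHLLTH"   # residues 101-123 of this protein
mutant = "HIAEYCAKVFTLKYNLKAHLLTH"   # first cysteine replaced by alanine

print(bool(C2H2_PATTERN.fullmatch(finger)))
print(bool(C2H2_PATTERN.fullmatch(mutant)))
```

A plain pattern like this is far less sensitive than a profile HMM, so it is best used as a quick sanity check on individual fingers.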
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 31 3e-05 0.0017 19.6 1.4 1 23 101 123 101 123 0.97
2 31 0.00011 0.006 17.9 7.6 1 23 129 151 129 151 0.96
3 31 2e-06 0.00011 23.3 2.6 1 23 159 181 159 181 0.97
4 31 6.6e-07 3.7e-05 24.8 0.3 1 23 187 209 187 209 0.98
5 31 0.00011 0.0062 17.8 2.2 1 23 215 237 215 237 0.97
6 31 9.4e-05 0.0053 18.0 1.6 1 23 243 265 243 265 0.99
7 31 0.00039 0.022 16.1 1.8 1 23 271 293 271 293 0.97
8 31 0.00063 0.036 15.4 3.1 1 23 300 322 300 322 0.98
9 31 4.2e-07 2.4e-05 25.4 1.6 1 23 328 350 328 350 0.98
10 31 7.9e-08 4.5e-06 27.7 0.8 1 23 356 378 356 378 0.97
11 31 4.5e-06 0.00025 22.2 0.7 1 23 388 411 388 411 0.97
12 31 2.4 1.4e+02 4.2 0.1 8 23 464 479 463 479 0.94
13 31 1e-06 5.7e-05 24.2 1.2 1 23 701 723 701 723 0.97
14 31 4.6e-07 2.6e-05 25.3 2.3 1 23 733 755 733 755 0.97
15 31 4e-06 0.00023 22.3 2.2 1 22 765 786 765 786 0.95
16 31 5.2e-06 0.0003 22.0 1.5 1 21 797 817 797 818 0.94
17 31 1.8e-06 0.0001 23.5 1.4 1 23 829 851 829 851 0.98
18 31 4e-07 2.3e-05 25.5 3.2 1 23 860 883 860 883 0.98
19 31 8.9e-06 0.0005 21.3 0.4 1 23 913 935 913 935 0.97
20 31 5.1e-05 0.0029 18.9 0.4 1 23 947 969 947 969 0.92
21 31 8.3e-05 0.0047 18.2 0.9 1 23 977 999 977 999 0.97
22 31 3.4e-06 0.0002 22.6 1.3 1 23 1010 1032 1010 1032 0.98
23 31 0.23 13 7.4 4.9 1 23 1038 1061 1038 1061 0.97
24 31 0.053 3 9.4 2.2 1 23 1067 1089 1067 1089 0.96
25 31 7.7e-06 0.00044 21.5 5.6 1 23 1096 1118 1096 1118 0.99
26 31 1.2e-06 6.7e-05 24.0 0.5 1 23 1124 1146 1124 1146 0.96
27 31 2e-08 1.1e-06 29.6 0.6 1 23 1152 1174 1152 1174 0.97
28 31 1.7e-05 0.00095 20.4 1.6 1 23 1180 1202 1180 1202 0.98
29 31 0.00015 0.0088 17.4 0.0 1 23 1208 1230 1208 1230 0.98
30 31 0.033 1.9 10.0 5.5 1 23 1237 1259 1237 1259 0.98
31 31 7.9e-05 0.0045 18.3 1.9 1 21 1265 1285 1265 1286 0.93
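
Each row of the domain table above has thirteen whitespace-separated columns, so it can be read into a small record type and filtered. The following is a minimal sketch: the field names in `DomainHit`, the function `parse_domain_row` and the i-Evalue cutoff of 0.01 are illustrative assumptions, not values taken from this record. The three sample rows are copied from the table.

```python
from typing import NamedTuple

class DomainHit(NamedTuple):
    index: int       # domain number
    total: int       # total number of domains reported
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float

def parse_domain_row(line):
    """Parse one whitespace-separated row of the domain table."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 columns, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )

rows = """\
10 31 7.9e-08 4.5e-06 27.7 0.8 1 23 356 378 356 378 0.97
12 31 2.4 1.4e+02 4.2 0.1 8 23 464 479 463 479 0.94
27 31 2e-08 1.1e-06 29.6 0.6 1 23 1152 1174 1152 1174 0.97
"""

hits = [parse_domain_row(line) for line in rows.splitlines()]

# Assumed cutoff for illustration only; pick one suited to the analysis.
I_EVALUE_CUTOFF = 0.01
confident = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]

for h in confident:
    print(h.index, h.ali_from, h.ali_to, h.score)
```

With this cutoff, domain 12 is dropped: it is a partial match (HMM positions 8-23) with an i-Evalue of 1.4e+02.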

Sequence Information

Coding Sequence
ATGAGTGAGCCAAAGAAAAGAGGCAGAAAAAAGAAGgtgaaaaaagttgtagaaGAGGACGATAGCGACGATGAGTTTTTAATCGCCAAAATAACTCCCAAAAAGAAAAAGGAGAAGAAAGAAAGTGAAAATGAAGATGATAATGATGACaatgaagaaaaagaagaagaagaaggcgGGGCTAAATCCGGGAGCGACTGGGAGCAAGAAATCGTGTGCGAAGTGGACGTAGAAATTGACGaaaacgtcaaaacaaaaaagaagaagaaaaagagaaACAGAAAGAGCAAGAAGAAAGAACACATCTGCGAGTATTGTGCGAAAGTTTTCACGCTGAAATACAACCTAAAAGCGCACTTGTTGACGCACACGGGCGAAAAACCGCATGCCTGTACGTATTGCGACAAAAAATTCTCGCAAAAATGCAACTATATGCGCCATCTGAGGTCGCACTTCGGCAACCAAGACACAAACTTCGCTTGTGAAATTTGCGGCAGCACTTTTACGCAGAAAACCCACCTAAATCGGCACATGATGAAGCATAACGGAGAAAAGCCTTTTCCTTGCCCTTACTGCGAAAAAGGTTTCGTGGACCAGACTAATTTAACCCTTCACATCCGAAAACATACGGGAGAAAAACCATTTACGTGCGAGTGTTGTGGAAAAACTTTCGCACGCAAGACTCAACTTACTATTCACGCACGGGTCCATACAGGAGAAAAGCCCTACGTGTGCAGCTACTGTCAAAAAGCCTTCTTCCTGAAGAAGAGTCTCACCCTACACATGGAGACCCACATCGCCGTGAAGCCCTTCGCGTGCAGCTTCTGCAACTACCGCTACTCCAACGAAACCCTGCTAAGACGTCACATAACCACCCATACAGGTGAGCAAAAAACTCACGAATGTCCCCAATGCCAGAAAAAATGGTCGAGAAAGCACGATTTAATGATACACATGCGCATTCACACGGGCGAAAAGCCGCATCAATGTCAAATTTGCGGGAAAGACTTTGCGCGAAAACCTGATTTAACCAAGCATCTCCGCACGCATAGCGGCGAGAAACCTTTCGCCTGTACTTATTGCGAAAAAGCATTTTCTGATCAATCTAATCTTACTCAACACATACGGTCTCATACGGGTGAAAAACGCGGCGAAAAACGCTTTCCTTGTCCGAGCTGTCGCAAAACTTTTGCTTATGAATCTAACCTACAGCAACACATACGCATGCGCCATACGCAACAACCGGTGGTAGAAGTAAaacctcctcctcctcctcctcctgtGGTTCCCGAGAAACCATTGACGGATCAATTAAATACTTTATCGCAATATATATTACCCACCCTACCGGTAATAGAAAAACGTGCGGAAGAACCTACTAGTTACGAGAAATCTTTCGTAGATCCTTTTAACTTGGTACAACACTTTTTATCGCACGCTCAGGCGCAAGATAAACGGTATGGTTACGAAAAAACGTTTGATCACACACTTGGTAATGATAAACCCGTGGTTCCTGTGCGAAATTTTGACAAACCCATGTATCCGGATCAAACCAATTTACCGCAAAGTTTGTTAAACATAGGTGACAGGCGTGTGGAGAAACCTTACATGCCTCCCCAACAAATGAATTACGAGAAAACGTTTGAGGAACATTTGAATATGACGCAAAAACAACAACAGTTTTCGATGCCAATTTACGAAAAACCGAAACAGCAACAGTTCCATATGTCGAGTTACGAAAAACCGAAACCGCAACAGTTTTCTTTGTTGAATTATGAAAAACCAAAACAGCAGTTTACTTTGTCGAATTATGATAGACCTTTCCCTGGACAGTCTAATCTAATGAATATGATGGATATGCGTAGAGAAAACAAACCGTTTGGTTTGACAAATTACGAGAAAAGTTACGAAAAACCGTATGAAATGCATAACAGTAGTATCCAAAGTATCAATAGTATGGATAATGATCCTGACCATAGCCGTGATGACAGTGTTAGATC
TTTCGGTGATTATATGGACTATGATACCAAACCTTATGAACAGCAAAACATTCCCCAACCTATTGTCCCTGAAGTAATAGAAAAACGCAAAGAAAAACCTTTTGCTTGCACTTACTGCGGAAAAAATTTCGCATATAAAACGAATCTAACGCAACATATTAGATTGCATACCGGGGAAAAGAGGGGGGACAAGCCATTTGCTTGTCCTCACTGTAAGAAAACTTTCGCGTATCAGTCCAACTTGCAACAGCACATTCAGTCGCATAGCGATGAAAAACGTGCGGAGAAACCGTACGCTTGTACACATTGCGACAAAAAGTTCAAATACCGATCAAATTTGATGCAACATATACGTTCGGTAACCGGCGAAAAACGTGCGGAAAAACCTTTTGCGTGCACACATTGCAACAAAACATTTGCGTATAAATCGAACCTTACGCAACATATTCAATCTGTAAGCGGAGAAAAACGAGGGGATAGAACGTACCCTTGTAACTTGTGCGATAAAAAATTCTCGTATAAATCTACCCTTACGCAACATATTTTATCACATACTGGCGAGACGTGCGAAAAACCGTTTAAATGTCACCTGTGCGATAAAGGTTTTGCGTATAAGTCAAATTTAACGCAACATATGCGAATCAGAcacaaaaattacgaaaaagcTTGTGCGGAGTTGTCAAATATAACGCAAGACATTTTATCGCATCCGGGTCAAGAGGAGAATAAGATCGCGTTTACTTGCACACAGTGCGATAAAACGTTTACCGACCAGCCGCTTTTAGCGCAACATATTTTACAGCATTTCGAGGAGAAAACCGtaaagaaagaaaaattgttCAACTGCGAAAAGTGCGATAAAACGTTTGCGAACGAGTCCACTTTAATTCGGCATTCTGCCGCGCACGACCTTGCTCCGGAAAAAGTTTTTACGTGCGAACAGTGTAGTAAATCGTTTAACGACGAAACTAAGCTTACGCAGCACATTTTGTGGCACGCTGTGGAGTCTGGTACGGAAAAAGCTACGTTCATTTGTTCGCAGTGCGATAAAACGTTCATGGATCAGTTAAATTTAACGCAACATATGAGGTCGCACACGGGCGAAAAACGGTTTAAGTGCGATTGTTGCGAGCTAAGTTTTTCGTCGAAATTAGATTTATGCTCGCATTTGAACAAAGTGCATGTAGATGAATTGCCGCACCAATGTTGTATGTGCGAGTTTAGATTCGCCACAGAACCGTTGCTAAACAAGCACGTTTTGGCGCACACAGGGGATATCAAACAGTTCCAATGTCCGCATTGCGATAAAAAGTGTGCGAGAAAACACGATCTCACTGTACATATAAGAATACACACGGGCGAAAAACCGCACGCGTGCGCTATTTGCGATAAAGTTTTCGCACGCAAGCCGGATTTAACGAAACATATGCGCTTGCATAGCGGCGAGAAACCGTTCGATTGTACCTATTGCGATAAATCGTTTGCGGACCAATCGAATTTAACCGCACACATTCGCTCGCATACGGGCGAGAAAAAATACAAGTGCGATTTTTGCGATTACAGTTTCGGTTTGAAGTCGCACCTAGCTCGGCATATGATCGTACACACCCGAGAAAAGCCTTTTGTTTGTGAAGTTTGCGAGGAAGCCTATCCGCAAAAATCGTTACTTGACAGACACATGATCTCGCACACTGGCGAGAAGAAAATTCATAAATGTCCGTACTGCGAGAAAAAATGGGCACGGAGACATGACATGCACGTGCATATCCGCACTCATACCGGAGAAAAACCGCACGTTTGTCCGTATTGCGAAAAAGAGTTTACCCGGAAACCGGATTTGACGCGGCACAAACACACGTGCAGACAGGCGAAGCCGAACATGGACAACCCGTCGGGTATGCCCGATCCGACCGCGCTGGATTTGATGCAATATATGAAGATTGAGTTGGGGGATATTCATCCGGTGAAATTAGAATGA
Protein Sequence
MSEPKKRGRKKKVKKVVEEDDSDDEFLIAKITPKKKKEKKESENEDDNDDNEEKEEEEGGAKSGSDWEQEIVCEVDVEIDENVKTKKKKKKRNRKSKKKEHICEYCAKVFTLKYNLKAHLLTHTGEKPHACTYCDKKFSQKCNYMRHLRSHFGNQDTNFACEICGSTFTQKTHLNRHMMKHNGEKPFPCPYCEKGFVDQTNLTLHIRKHTGEKPFTCECCGKTFARKTQLTIHARVHTGEKPYVCSYCQKAFFLKKSLTLHMETHIAVKPFACSFCNYRYSNETLLRRHITTHTGEQKTHECPQCQKKWSRKHDLMIHMRIHTGEKPHQCQICGKDFARKPDLTKHLRTHSGEKPFACTYCEKAFSDQSNLTQHIRSHTGEKRGEKRFPCPSCRKTFAYESNLQQHIRMRHTQQPVVEVKPPPPPPPVVPEKPLTDQLNTLSQYILPTLPVIEKRAEEPTSYEKSFVDPFNLVQHFLSHAQAQDKRYGYEKTFDHTLGNDKPVVPVRNFDKPMYPDQTNLPQSLLNIGDRRVEKPYMPPQQMNYEKTFEEHLNMTQKQQQFSMPIYEKPKQQQFHMSSYEKPKPQQFSLLNYEKPKQQFTLSNYDRPFPGQSNLMNMMDMRRENKPFGLTNYEKSYEKPYEMHNSSIQSINSMDNDPDHSRDDSVRSFGDYMDYDTKPYEQQNIPQPIVPEVIEKRKEKPFACTYCGKNFAYKTNLTQHIRLHTGEKRGDKPFACPHCKKTFAYQSNLQQHIQSHSDEKRAEKPYACTHCDKKFKYRSNLMQHIRSVTGEKRAEKPFACTHCNKTFAYKSNLTQHIQSVSGEKRGDRTYPCNLCDKKFSYKSTLTQHILSHTGETCEKPFKCHLCDKGFAYKSNLTQHMRIRHKNYEKACAELSNITQDILSHPGQEENKIAFTCTQCDKTFTDQPLLAQHILQHFEEKTVKKEKLFNCEKCDKTFANESTLIRHSAAHDLAPEKVFTCEQCSKSFNDETKLTQHILWHAVESGTEKATFICSQCDKTFMDQLNLTQHMRSHTGEKRFKCDCCELSFSSKLDLCSHLNKVHVDELPHQCCMCEFRFATEPLLNKHVLAHTGDIKQFQCPHCDKKCARKHDLTVHIRIHTGEKPHACAICDKVFARKPDLTKHMRLHSGEKPFDCTYCDKSFADQSNLTAHIRSHTGEKKYKCDFCDYSFGLKSHLARHMIVHTREKPFVCEVCEEAYPQKSLLDRHMISHTGEKKIHKCPYCEKKWARRHDMHVHIRTHTGEKPHVCPYCEKEFTRKPDLTRHKHTCRQAKPNMDNPSGMPDPTALDLMQYMKIELGDIHPVKLE
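
The coding and protein sequences above can be checked against each other by translating the CDS. This is a minimal sketch that assumes the standard genetic code and treats the lowercase runs in the CDS as ordinary bases (the record does not say what the lowercase marks). The helper name `translate` is hypothetical; the example input is the first 60 bases of the coding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    seq = cds.upper()  # lowercase bases are treated like uppercase ones
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

cds_start = "ATGAGTGAGCCAAAGAAAAGAGGCAGAAAAAAGAAGgtgaaaaaagttgtagaaGAGGAC"
print(translate(cds_start))
```

The result can be compared with the first 20 residues of the protein sequence; running the same function on the full CDS gives a whole-sequence check.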

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
iTF_00244431;
90% Identity
iTF_00244431;
80% Identity
iTF_00244431;