Basic Information

Gene Symbol
-
Assembly
GCA_009761765.1
Location
chr4:30383081-30396032[+]
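
As a minimal sketch, the Location value above can be split into its parts with a regular expression. The helper name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

# Hypothetical helper; the pattern assumes the "chrom:start-end[strand]"
# layout used by the Location field above.
LOCATION_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text):
    """Split a location string into chromosome, start, end, and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    return {
        "chrom": match["chrom"],
        "start": int(match["start"]),
        "end": int(match["end"]),
        "strand": match["strand"],
    }


loc = parse_location("chr4:30383081-30396032[+]")
# Assumption: coordinates are 1-based and inclusive, so span = end - start + 1.
span = loc["end"] - loc["start"] + 1
print(loc, span)
```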

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers (or bracketed ranges) give the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
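
The pattern in the description can be written as a regular expression. The sketch below is a literal translation only; the names `C2H2_RE` and `find_c2h2` are hypothetical, and the # positions are matched as any residue because the description does not say which amino acids they allow. A plain regex is not equivalent to the profile HMM search behind the Hmmscan table, so the two need not report the same hits. The test fragment is taken from the Protein Sequence of this record, around residues 132-154.

```python
import re

# Literal translation of #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C].
# Assumption: '#' is treated as "any residue".
C2H2_RE = re.compile(
    r"."        # '#': fold-stabilizing position (any residue here)
    r"."        # X
    r"C"        # first zinc-coordinating cysteine
    r".{1,5}"   # X(1-5)
    r"C"        # second zinc-coordinating cysteine
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first zinc-coordinating histidine
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final position: His or Cys
)


def find_c2h2(protein):
    """Return (start, end, text) for non-overlapping matches.

    Coordinates are 1-based and inclusive, relative to the input string.
    """
    return [(m.start() + 1, m.end(), m.group()) for m in C2H2_RE.finditer(protein)]


# Fragment copied from the Protein Sequence of this record.
fragment = "TDKPYKCHICDKTFSRNSSLKCHIRLHVEG"
hits = find_c2h2(fragment)
print(hits)
```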
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 37 7.9e-06 0.00053 19.7 6.6 1 23 132 154 132 154 0.98
2 37 5.7e-06 0.00039 20.1 1.2 1 23 187 209 187 209 0.98
3 37 0.038 2.6 8.1 4.8 1 23 235 257 235 257 0.94
4 37 2.1e-07 1.4e-05 24.7 3.7 1 23 286 308 286 308 0.99
5 37 3.7e-06 0.00025 20.8 1.4 1 23 338 360 338 360 0.98
6 37 9.4e-05 0.0063 16.3 0.4 1 23 394 416 394 416 0.98
7 37 0.015 1 9.4 0.1 1 23 422 444 422 444 0.96
8 37 0.0058 0.39 10.7 0.8 1 21 461 481 461 483 0.92
9 37 0.52 35 4.5 0.2 2 20 543 561 542 563 0.87
10 37 0.37 25 5.0 1.6 2 23 597 618 596 618 0.94
11 37 0.00043 0.029 14.3 0.9 2 23 641 662 640 662 0.96
12 37 2.2e-05 0.0015 18.3 0.1 1 23 681 703 681 703 0.99
13 37 0.049 3.3 7.8 0.1 2 20 716 734 715 736 0.93
14 37 9.8 6.6e+02 0.5 0.0 6 20 775 789 775 791 0.92
15 37 2.8 1.9e+02 2.3 4.2 2 20 806 824 805 826 0.91
16 37 0.013 0.89 9.6 0.2 2 23 840 861 839 861 0.96
17 37 1.1e-05 0.00074 19.3 2.6 2 23 876 897 875 897 0.96
18 37 1.9e-05 0.0013 18.5 1.4 1 23 910 932 910 932 0.97
19 37 0.0017 0.11 12.4 0.0 1 23 955 977 955 977 0.98
20 37 4.1e-05 0.0028 17.5 0.2 1 20 985 1004 985 1007 0.93
21 37 0.0039 0.26 11.2 0.1 1 23 1019 1041 1019 1041 0.94
22 37 8.2e-05 0.0055 16.5 0.2 1 23 1055 1077 1055 1077 0.98
23 37 6.4e-06 0.00043 20.0 0.2 1 23 1092 1114 1092 1114 0.98
24 37 3.2e-05 0.0021 17.8 0.8 1 23 1128 1150 1128 1150 0.97
25 37 2.5e-05 0.0017 18.1 1.6 1 23 1168 1190 1168 1190 0.98
26 37 0.00059 0.04 13.8 4.0 1 20 1204 1223 1204 1226 0.95
27 37 0.024 1.6 8.7 1.9 1 23 1240 1262 1240 1262 0.97
28 37 0.00056 0.038 13.9 3.5 1 23 1273 1295 1273 1295 0.98
29 37 0.0035 0.23 11.4 1.6 1 23 1335 1358 1335 1358 0.95
30 37 0.0003 0.02 14.7 1.4 1 23 1381 1403 1381 1403 0.96
31 37 0.0037 0.25 11.3 1.1 1 23 1418 1440 1418 1440 0.95
32 37 0.086 5.8 7.0 0.1 5 21 1536 1552 1535 1553 0.92
33 37 0.00046 0.031 14.1 0.5 3 23 1567 1587 1567 1587 0.98
34 37 3.5e-05 0.0024 17.7 1.9 1 23 1603 1625 1603 1625 0.92
35 37 0.0029 0.2 11.6 1.3 2 20 1642 1660 1641 1664 0.90
36 37 0.001 0.07 13.0 0.8 1 23 1674 1696 1674 1696 0.98
37 37 0.011 0.73 9.8 4.7 1 23 1709 1731 1709 1731 0.92
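
The rows above are whitespace-separated with 13 values each, so they can be loaded and filtered with a few lines of code. This is a sketch under assumptions: the names `DomainHit` and `parse_domain_rows` are hypothetical, the column order is taken from the table header, and the i-Evalue cutoff of 0.01 is an illustrative choice rather than a threshold used by this database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int       # domain number ('#' column)
    i_evalue: float  # independent E-value
    score: float     # bit score
    ali_from: int    # alignment start on the protein
    ali_to: int      # alignment end on the protein


def parse_domain_rows(text):
    """Parse whitespace-separated hmmscan domain rows (13 columns each)."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header and any line that is not a 13-column data row.
        if len(fields) != 13 or not fields[0].isdigit():
            continue
        hits.append(
            DomainHit(
                index=int(fields[0]),
                i_evalue=float(fields[3]),
                score=float(fields[4]),
                ali_from=int(fields[8]),
                ali_to=int(fields[9]),
            )
        )
    return hits


# First four rows of the table above.
rows = """
1 37 7.9e-06 0.00053 19.7 6.6 1 23 132 154 132 154 0.98
2 37 5.7e-06 0.00039 20.1 1.2 1 23 187 209 187 209 0.98
3 37 0.038 2.6 8.1 4.8 1 23 235 257 235 257 0.94
4 37 2.1e-07 1.4e-05 24.7 3.7 1 23 286 308 286 308 0.99
"""

CUTOFF = 0.01  # hypothetical i-Evalue threshold, for illustration only
confident = [h for h in parse_domain_rows(rows) if h.i_evalue <= CUTOFF]
print([h.index for h in confident])
```

Filtering on the i-Evalue rather than the score is a design choice made here for illustration; either column could be used.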

Sequence Information

Coding Sequence
ATGAATTCTCCATTATCGAATGAAAATCACGTCATTGCAACGGTAAACCATGGTTATATAGAAGAAATCGAATACTCTGCAATCGCCGAAGAAATAATCATCGAAGATGAGAGCGATTTCTACTCGTTATCTCCGAATTCCAATTCATCAGAAAGTAAAATTTATAAGAACGATAGAAGCAATGACGTAACAAACGATGATAATCATTCAGTTTCTGACAAAAATGCCAGTTCGAAGTTTATTATTGTCAACAACTCTACCTCATCTGGCGAAATCGCGCCAATCGACGCGGCCGGAGATCACACGTACGAAGCGTCGAAAAAAGCCGGCGAATGTATCGAGCAGGCGGTCACCGGGGCGAAAATCTCCACAAACGAAGGTACCGATAAACCTTACAAATGCCATATATGCGATAAAACGTTTTCGCGAAATAGCAGTCTGAAATGTCACATTAGACTTCACGTCGAAGGGCCGGATAATATGTTACGTAACACTTCATCGGCGCCTACGGCCGTCGTATGTGCGCGTACCGACGAAACCGAGTTATCGGAAGATACTTATAGTTGCGTTACTTGCGATAAGACATTTACTAATAAGAAAGCTCTACAGTATCACATTGAATCGCATAATGCCTCGAAAGCAGACGACGACGAATCTGGTAGCGAATGCAGTCACGAGCCGAATGAGGGCGAAAGTCGCGATTATAAATGTATCATGTGTAACAAACGATTCTCGCGTAGGAGCAGTTTGACGTTTCATTACAAATTTCACCCGGTAGAGAAATCGCCTTCTGCCAAAAAGCCCAGTGATAACGAATCGCCAGAAATGAACGACTCGGAAGACCGTAACAAACATTACCAGTGCTCTCACTGCGATCGTAAGTTCACTCGTAAGAATAGCCTGCAGTGGCATATAAGAATACACGATAATCAAACGCGTACGGAATCTAACGATAACGAACGGCGACAAGACGTTAAAGTCGACAAACCAGTAGACACTGTCGCCGGTACGTATAAATGTCGTAAATGCGATCAGTCGTTTTCACGGGTTCGTAATCTGAAATTACATATGGCCGTTCATACCGATACACAAGAGCAACCGGAGACCACATCTGATGTTTTCGAACGCCTCTACGAGCAAGTATCTAGTCACTGCGACTTCGTGCCTATACCGGGCCTGTATAAATGTGGCGTTTGCGATAAAGCCTTCGACCACGTAGAATCGCTCAAAGATCACGTAACCGTGCATCTACGACGTCGCACGTTCCGATGCGGCGTCTGCGACCAACTATTACCGACGAAAAACCAGCTGAAGGAGCACCTGATAGGGCACGAACGAAAATCGTCCAGCGTTGCCGAATCGAACAGCTCCAAAGAATCGTTCTTGTGCGTAGTGTGCGACCGCAGCTTCGCATCCGGCAACGCGTTGGAGAAGCATATGAACTGTCATCGTAAAATGGAAGAAGAAATTATCCCGAGGGTTAGCTCGTCTGGGCGTCGAATCAAAGACCCTCGTAATCGCGAAAACGACTTCGCCTATAATTCGGACAAAGTGTACAAATCATCGCCGAGCACTCGCCATAAGCTTATCGAAGACGAACCGGCGCAGGAAAAAGATGTAGAAGTGTCCTGCGCATCGTGTCGCAAGGCGTTCCGAAATCCTATATCTCTCAAGTTTCACCTACTCAGGAGCCAACGTTGTACACGAAATCAACGGCCGCCTTCGCCCGCACCTCGAATCGAAGATAAACCCGCCGCCGATGCCAAAGATCGTGATTTAGATTCGTTAAAGTGTAGCTGCTGCGATAGACTTTTCACCAGTATCGTAGCGCTTACGAAACACGTATCCAATCATTCGAAAGGCGTTCCCGACGAACACGTGGAAGATGAAGAAGAGTCGACGGCAGCCGGTAATTTCATCAAGTGCGATCGTTGCGATAAATTCTTCACCTCGATTCGATCGCTGAAGTTACACATCTCCGAGCATACCAGGAATCCCAA
TCCTAAAAGCGAGGAAGCGCCGGACGAGCAGGCCAAGTCGTTCAAATGCGACGTCTGCGGGGAAAAATTCGACGACGAGGAATCTCTGGATGAGCACATGAACTCGCACGAGACGCCGGACAACGACCAGCCGGGTAATGAGGTGCAGTGCGATCGATGTAACAAAGTGTTCCCGGCGTTGAGATCGTTGAATATCCACAAAGGTAGATTTTGTCGCGGAGAGCTCGAAAAATACGACAAACAGGAGGAAACGAACGAGCAAAACGAGCAAATGGAGGTAGACGAAGACTCTGATTTGGACGCCACAGATACGTTGGAATGTTGCGACGAAGTGTTCGACTCGGAACGAGCCTATAATATACACCGCACTCGTTACTGTACCAAAGAGACCGAATCGTTGCCGCAGCTTTCGTTGACGTGTAAATGCTGTAATTTCACATTCTGCAGCGTTAAATCGCTAAAGGTGCATTTGGCGCGAAAAGCCAAACAAGCGGACGAGCACGAAAACGTGTTGAGCCAGTGCGACCGATGCGATAAATATTTCCTGAATGCGGCCTCGCTGAGTAACCACATGGCCGAGCATAGCGAGCTCGAAGACTCGGACGAAGACGTATTCGTGCTTTTGCACTGCGACGTTTGCGATAAGAGCTTCAAGAATAGTACAGCGTTGAAAAAACACAAAGCCACGCACGTCCAGCCCGAAGACGACGATTCGTCCGATGAAACGTTCAAGTGCTCGAAATGCGATAAGGTGTTCGAAACGGAGAGATCGCTCAGGGTGCACAGAGCCTATCACACGAAAATCGAGCGAGTCGATAGCTTCCAAGCTGACAACCTTTACGAGGAGGTGGAAGGTGGCAAATACAAGTGTACGATATGCGGCCGAGTGTTAGACTCGACCAAAGCCGTAGAAGCCCATATCACCGTACACAAGGTGAACTCCGGTTTGGCCTATCAGTGTAACACGTGCGGTCGAGCGTTCAACTCGGGTGGCGCGTTGAGAAGGCATATGCCGAGTCATCGTCTTAAATCCGGCGACGAAGATGGTACCGAATACCCGTGCCTGAAATGCGACAAAATTTTCACCAGCCTCGAACAAGTGGCCGAGCATAAAGTCGTTCACGAGAACGAAAGCAGCGACGAAGAAGACGCCTCGGAGACGTACCCGTGTTTAAAATGCCCTCGTACGTTCGCCACGCGTAAAGGTTTGGGGGCCCATAGACGAGTGCACGAAGACGACAGCGAGCCCGAAACGCAATCCGATACGGATGTCTATCCGTGTCCCAAATGCGATCGTATATTTTCCACTCGCAAAGGCCTAGGGGCTCATCGAAGGGTACACGAAGGCGTCGACAGCGAAGACGAACCGAACGAACTGGAATACCCGTGCGCTCATTGTAATCGTTCGTTCCCCACTCCGAAGGGTTTGGCTAGGCATAAGTCGAGTCATAAAAACGAACTCGAAGACGGTCAAGATGCCGATATCGACGCAGAAGGGCCGTACCAGTGTGCGCATTGCGATCGCGCCTTTCCAACTCGTAAAGGCTTGAATAAACATAAGAGCATTCACAGAGACGACGACGAAGAGGCCGAATCGATGGACGTAGATTACCAGTGTTCTCATTGCGATCGGTCATTCCCCACTGCCAAAGGTTTGGCCAAACATAAAAGCTGCCATAAAGAGGTCGACGAATCGGATCACGAGGAAGCCGATGAATACGAATGCGAACTATGTAATCGACGGTTTCACTCGGTCAGAGGCTTGTCCATACATAAGAGCAGCCATAAAGACGTGAGTAGCGACGACGAGGATACCTATTCGTGTAGAATATGCGATTCTAAGTTCCACAACGAGAAAGCCTTGAATACCCATATGAGGAAACATTTACGCGAGATCACCTCCAGCGATGACCAAGCTGGTGCCGATGACGTAGACGAAGCGAAAACGTGCGAGTTATCGTGTACCATGGAGAATACCACCGAAGATGGACGAAGGTCGG
CGCATCGGTGCGATCTGTGTCAGTTATCGTTCGCCGATTTAAAGAGCTTGTCTAAACACTCGACGACGATTCACCAAAACGACGATAAAAACGTAGATGACGTGGCAGACGACGACGTTAAGGAGTTACGCATTTTAATCCATCAGTGTAAAATGTGCGACGAAGCGTTCACCGATCACGTTAGCTTGCAGGATCATTCGATCGTGCATACGTGCGACAAGAAAACGGAGCTGGAGAAAACAGCCGATGGGTTCCAGTGCCAGTATTGCCCTCGGGTTTTCCCCACCATTCACGGCATCCGGAAACACAGCACGATACATAAGAACGATAAGAGAAGGGCGCAGAGGTTACAAAATGCCCATAAAGTTCGTTCTAGTCGCATTGCGGAGTCGAATACGCGACAACAGGTGGACAGTACGGTGGATACGGAGCAGAGCATCTCTTCGATACCAGAAAGCTGCGTGGATAAGGATACGACCGAGGCGACCAGCGCGAACACAACGGTGGTGGAGCCGATACAAATATCGTTGAATTTAAGCGAATTAGAACGGCCGATTAAAGACGACGCCAACTCGGAGGCGCTTTCGAACGAGGTGATTTCGTTGAAATGCGACGAGCAGTTCGATTCGTCGGATGATCTGAATAACCATAGAAGTACCCGTGAACTGGAACGTGTTCGAGAGACGCCATCGCTCGGGTGTAATATCTGCGGGCGACAGTTCAACGCTCCTAAGGCTCTGAAGAAACACCTGATGTTTCATAAATTCAACGACTTCGCCGAAGACACCGAAATCACGAGCGAACCGTACGAGTGTAATTTATGCCCGAAAACGTTCAAAAATGAAAACGACCTGAAGCGTCACTCGACGTTTCACGGCGCCGTTGCGAATGGCTCTGTGTCCAAATCTATAGGGCCGATTTTAAAGTGCAATTTGTGCGATGAAACGTTCACACGTGTGGAATGCCTGAAAAACCATTCGGTCGAGGTGCACGATCGGACGATTCACGACGAGATTACCTTCAAATGCGAACTATGCGACGAACAGTTCGCCAAAGCGAGCGCCTTTATGTTGCACAAGAAATACCACCGAAACGAGACCTGCGCGGATGAAGAGGCGATGCCTTTCGAATGTATATCGTGCGAAAAATGCTTCCGTTTTAAAGACCATTTGTTGGAGCACGTGAAAACGCACACAGACAACGAACTGGTGCTGGACGATTTGGAATCGTTGGTCGAACAGCGATCCGACGAGTTCGTTGATAAACCTGCGCCAGATTTAGAACATTTCATT
Protein Sequence
MNSPLSNENHVIATVNHGYIEEIEYSAIAEEIIIEDESDFYSLSPNSNSSESKIYKNDRSNDVTNDDNHSVSDKNASSKFIIVNNSTSSGEIAPIDAAGDHTYEASKKAGECIEQAVTGAKISTNEGTDKPYKCHICDKTFSRNSSLKCHIRLHVEGPDNMLRNTSSAPTAVVCARTDETELSEDTYSCVTCDKTFTNKKALQYHIESHNASKADDDESGSECSHEPNEGESRDYKCIMCNKRFSRRSSLTFHYKFHPVEKSPSAKKPSDNESPEMNDSEDRNKHYQCSHCDRKFTRKNSLQWHIRIHDNQTRTESNDNERRQDVKVDKPVDTVAGTYKCRKCDQSFSRVRNLKLHMAVHTDTQEQPETTSDVFERLYEQVSSHCDFVPIPGLYKCGVCDKAFDHVESLKDHVTVHLRRRTFRCGVCDQLLPTKNQLKEHLIGHERKSSSVAESNSSKESFLCVVCDRSFASGNALEKHMNCHRKMEEEIIPRVSSSGRRIKDPRNRENDFAYNSDKVYKSSPSTRHKLIEDEPAQEKDVEVSCASCRKAFRNPISLKFHLLRSQRCTRNQRPPSPAPRIEDKPAADAKDRDLDSLKCSCCDRLFTSIVALTKHVSNHSKGVPDEHVEDEEESTAAGNFIKCDRCDKFFTSIRSLKLHISEHTRNPNPKSEEAPDEQAKSFKCDVCGEKFDDEESLDEHMNSHETPDNDQPGNEVQCDRCNKVFPALRSLNIHKGRFCRGELEKYDKQEETNEQNEQMEVDEDSDLDATDTLECCDEVFDSERAYNIHRTRYCTKETESLPQLSLTCKCCNFTFCSVKSLKVHLARKAKQADEHENVLSQCDRCDKYFLNAASLSNHMAEHSELEDSDEDVFVLLHCDVCDKSFKNSTALKKHKATHVQPEDDDSSDETFKCSKCDKVFETERSLRVHRAYHTKIERVDSFQADNLYEEVEGGKYKCTICGRVLDSTKAVEAHITVHKVNSGLAYQCNTCGRAFNSGGALRRHMPSHRLKSGDEDGTEYPCLKCDKIFTSLEQVAEHKVVHENESSDEEDASETYPCLKCPRTFATRKGLGAHRRVHEDDSEPETQSDTDVYPCPKCDRIFSTRKGLGAHRRVHEGVDSEDEPNELEYPCAHCNRSFPTPKGLARHKSSHKNELEDGQDADIDAEGPYQCAHCDRAFPTRKGLNKHKSIHRDDDEEAESMDVDYQCSHCDRSFPTAKGLAKHKSCHKEVDESDHEEADEYECELCNRRFHSVRGLSIHKSSHKDVSSDDEDTYSCRICDSKFHNEKALNTHMRKHLREITSSDDQAGADDVDEAKTCELSCTMENTTEDGRRSAHRCDLCQLSFADLKSLSKHSTTIHQNDDKNVDDVADDDVKELRILIHQCKMCDEAFTDHVSLQDHSIVHTCDKKTELEKTADGFQCQYCPRVFPTIHGIRKHSTIHKNDKRRAQRLQNAHKVRSSRIAESNTRQQVDSTVDTEQSISSIPESCVDKDTTEATSANTTVVEPIQISLNLSELERPIKDDANSEALSNEVISLKCDEQFDSSDDLNNHRSTRELERVRETPSLGCNICGRQFNAPKALKKHLMFHKFNDFAEDTEITSEPYECNLCPKTFKNENDLKRHSTFHGAVANGSVSKSIGPILKCNLCDETFTRVECLKNHSVEVHDRTIHDEITFKCELCDEQFAKASAFMLHKKYHRNETCADEEAMPFECISCEKCFRFKDHLLEHVKTHTDNELVLDDLESLVEQRSDEFVDKPAPDLEHFI
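
The relationship between the Coding Sequence and the Protein Sequence can be checked by translating codons in frame. The sketch below assumes the standard genetic code, which this record does not state explicitly; the names `CODON_TABLE` and `translate` are hypothetical. Whitespace is stripped first because the coding sequence above is wrapped across several lines.

```python
from itertools import product

# Standard genetic code (assumption), in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    # Ignore a trailing partial codon, if any.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the Coding Sequence above.
print(translate("ATGAATTCTCCATTATCGAATGAAAATCAC"))
```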

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-