Basic Information

Gene Symbol
-
Assembly
GCA_951394285.1
Location
OX596164.1:3787945-3795559[+]
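
The Location value can be split into its parts programmatically. The following is a minimal sketch, assuming the format is `seqid:start-end[strand]` as inferred from this single record, and assuming 1-based inclusive coordinates; `Location` and `parse_location` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Format inferred from the one Location value in this record (assumption).
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a 'seqid:start-end[strand]' string into typed fields."""
    m = _LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(m["seqid"], int(m["start"]), int(m["end"]), m["strand"])


loc = parse_location("OX596164.1:3787945-3795559[+]")
# Span length assumes 1-based inclusive coordinates (assumption).
span = loc.end - loc.start + 1
print(loc, span)
```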

Transcription Factor Domain

TF Family
zf-C2H2
Domain
zf-C2H2 domain
PFAM
PF00096
TF Group
Zinc-Coordinating Group
Description
The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers in brackets give the number of residues. The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
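
The pattern in the description can be translated fairly directly into a regular expression. The sketch below is a loose illustration, not a substitute for the profile HMM search reported under Hmmscan Out: it treats the # positions as "any residue" (an assumption, since the description does not list which residues are allowed there), so it will miss degenerate fingers and may report spurious ones. `C2H2_PATTERN` and `find_c2h2` are hypothetical names; the test fragment is copied from the protein sequence in this record.

```python
import re

# Literal translation of #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C].
# '#' positions are matched as any residue (assumption).
C2H2_PATTERN = re.compile(
    r"."        # '#'
    r"."        # X
    r"C"        # first zinc-coordinating cysteine
    r".{1,5}"   # X(1-5)
    r"C"        # second zinc-coordinating cysteine
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first zinc-coordinating histidine
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final His or Cys
)


def find_c2h2(protein: str):
    """Return (start, end, match) tuples with 1-based inclusive coordinates.

    Uses non-overlapping matching, so closely packed fingers may be skipped.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]


fragment = "FQCNVCSKAFARKKGLRDHMRTH"  # taken from the protein sequence in this record
hits = find_c2h2(fragment)
print(hits)
```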
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 36 0.21 14 6.8 1.0 2 23 55 77 54 77 0.95
2 36 0.034 2.3 9.3 0.2 3 23 104 124 102 124 0.95
3 36 0.01 0.7 10.9 4.5 1 23 128 151 128 151 0.94
4 36 0.17 11 7.1 1.5 1 22 155 176 155 176 0.95
5 36 0.0018 0.12 13.4 1.7 1 23 183 206 183 206 0.94
6 36 0.00038 0.026 15.5 0.2 1 23 213 236 213 236 0.97
7 36 0.073 4.9 8.3 4.0 1 23 241 263 241 263 0.91
8 36 3.2e-07 2.1e-05 25.1 1.1 1 23 269 291 269 291 0.99
9 36 0.037 2.5 9.2 0.6 3 23 405 425 403 425 0.96
10 36 0.00011 0.0076 17.1 0.4 1 23 429 452 429 452 0.95
11 36 0.0015 0.097 13.6 2.2 2 23 485 507 484 507 0.93
12 36 0.0098 0.65 11.0 1.7 1 23 514 537 514 537 0.98
13 36 9.3e-07 6.2e-05 23.7 1.0 2 23 543 564 542 564 0.96
14 36 8e-05 0.0054 17.6 0.5 1 23 570 592 570 592 0.93
15 36 2.9 2e+02 3.2 0.6 3 23 716 736 714 736 0.90
16 36 4.9 3.3e+02 2.5 0.8 1 19 740 758 740 763 0.72
17 36 0.0044 0.29 12.1 1.2 1 22 767 788 767 788 0.97
18 36 9.9 6.6e+02 1.6 2.3 1 21 795 815 791 818 0.83
19 36 0.0062 0.42 11.6 2.0 1 23 825 848 825 848 0.97
20 36 0.0035 0.23 12.4 1.5 1 23 853 875 853 875 0.96
21 36 0.0069 0.46 11.5 1.5 1 23 881 903 881 903 0.87
22 36 0.65 44 5.3 0.1 2 23 988 1010 987 1010 0.94
23 36 0.065 4.4 8.4 0.7 3 23 1038 1058 1036 1058 0.96
24 36 0.09 6.1 8.0 0.4 1 23 1062 1085 1062 1085 0.92
25 36 0.23 15 6.7 2.3 1 19 1089 1107 1089 1108 0.94
26 36 0.065 4.4 8.4 0.4 3 23 1135 1156 1134 1156 0.95
27 36 1.5 97 4.2 0.1 2 23 1181 1203 1180 1203 0.90
28 36 0.14 9.6 7.3 1.3 3 23 1228 1248 1227 1248 0.98
29 36 0.00048 0.032 15.1 1.4 1 21 1252 1272 1252 1273 0.96
30 36 0.022 1.5 9.9 0.5 1 23 1314 1337 1314 1337 0.96
31 36 3.6 2.4e+02 2.9 0.3 2 23 1363 1385 1362 1385 0.93
32 36 0.012 0.81 10.7 0.5 2 23 1409 1430 1408 1430 0.94
33 36 0.0048 0.33 12.0 0.0 1 23 1434 1457 1434 1457 0.97
34 36 0.47 31 5.7 0.5 1 23 1461 1484 1461 1484 0.97
35 36 0.00078 0.052 14.5 4.0 1 23 1523 1546 1523 1546 0.97
36 36 0.051 3.5 8.7 0.6 2 23 1557 1579 1556 1579 0.89
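
The domain rows above are whitespace-separated with 13 values each, so they can be loaded into typed records and filtered. This is a minimal sketch: the field names on `DomainHit` are my own labels for the 13 columns, and the i-Evalue cutoff of 0.01 is an arbitrary illustrative threshold, not one stated in this record. The sample rows are copied from the table.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int
    of: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


# One converter per column, in the same order as the DomainHit fields.
_CONVERTERS = (int, int, float, float, float, float,
               int, int, int, int, int, int, float)


def parse_domain_rows(text: str):
    """Parse whitespace-separated domain rows; skip headers and malformed lines."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(_CONVERTERS):
            continue
        try:
            values = [conv(f) for conv, f in zip(_CONVERTERS, fields)]
        except ValueError:
            continue  # non-numeric line such as the header
        hits.append(DomainHit(*values))
    return hits


SAMPLE = """\
1 36 0.21 14 6.8 1.0 2 23 55 77 54 77 0.95
8 36 3.2e-07 2.1e-05 25.1 1.1 1 23 269 291 269 291 0.99
13 36 9.3e-07 6.2e-05 23.7 1.0 2 23 543 564 542 564 0.96
16 36 4.9 3.3e+02 2.5 0.8 1 19 740 758 740 763 0.72
"""

all_hits = parse_domain_rows(SAMPLE)
# Keep domains whose independent E-value is at or below the illustrative cutoff.
strong = [h for h in all_hits if h.i_evalue <= 0.01]
for h in strong:
    print(h.index, h.ali_from, h.ali_to, h.i_evalue)
```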

Sequence Information

Coding Sequence
ATGTGCGTTTATTGCTGTGACGAAATAGAAGACCCGGATGACTACAGATGTCATATGGATAAAATGCATAAGGATGTCGATCTATTCACAGCTTTCGCGCACGTTAAATCGCGAAATAGAGTGAGTGAGAAAGACTATCTAAAAGTAGACTGCACTAATCTTAAATGCAGACTGTGTGATAAACCATGTGATACAATCAGTCTAGTAGCTCAGCATCTTAAACTTCGACACGAAAACAATGTTAAAGATATAGACCTGAACTATGAAGTTAGCCTACATCCGTTCCGATTAATAAAAGACAAATGGATATGCATGGTGTGCGATACTAAACTGCCGACGTTGACGAAACTCGTCAGGCACATGTCCTCTCATTTTTCAGATCACGTTTGCGATCTGTGTGATAAAAGCTACTGTACTGTTCAAAGCTTAAGATACCATATAAAATTCAACCATTCGAAGGACCACGTGTGCCGAAAATGCAGGATGGAGTTTTCTACTTCGGAAGAGAAGAAAGAGCACTTGAAAACCTCTTCAAAGTGTTGGGGTTTCGTCTGTGTTCATTGCAAAGAAAGATTTTCTTCATGGGACAGTAAACGTAGACATCTTGTGAGCGAACACAACGCCAAAGCTGTGACGTATCCGTGTCCAGATTGCGGGGCTGTGTTCGATTCTGGATCAAATTTCTACTATCATTATAAAATGTCGCATTCTGATTTACCGCACATGTGTTCCTTCTGTGGTAGACGGTTTGGGAAGAAATGCCAGTTGGAGGATCATCTACCCACGCATACCAACCGAAAGCAATTCCAATGCAATGTATGCTCAAAAGCGTTCGCTAGAAAAAAGGGATTGAGAGATCACATGAGAACGCACTTCAAAACTAAGAGAACAAAAGGTAATAGTTTGTTGTGCGGATACTGCTACGAAAAAATTGAAGATCCTGACGAATTCCGATGTCACATGGATGAATTCCACGAAACTGTAAAAATTAACACAGCTTTCGCGCATACAAAGAGCAGAGATGACTTTCGAATTAAGGTGGATTGTGTTGATTTAAAGTGCAGAGTATGCTCTGACCCTTTCGATACAGTGGCTGATATAGCTCAACATCTTATTGATTTGCACAAGAACGTAGATAATATTATCAGAGATTTGGATTTAAACTACGGTTTGGGATTGCATCCGTATCGACTGATTAAAGACAATTTGTTTTGCCTTGTTTGCAATATTAAACTGCCGACCTTAACTAAACTTACAAGACACATGTCCTCACATTACGCTGATTACGCGTGCGATTCCTGCGATAAAAGCTACCCGAACGAGGAAAATCTCAGATCACATATCAAGTTCAGTCATTCAGAAAAGCACATCTGCAGAAAGTGCTGCTTAGAGTTCTCCACGCTTGAAGAAAGAAAACAACACCTAAAAACGTCAACACAATGTCGGGGTACCGTGTGCACCCATTGTGGGGAGAGATTTTCATCATGGGACAAGAAGCAGAAACATTTAGTCAGTAAACACAATGTCCAACCCCTTACATACCCATGCCCGGATTGTGATCTTGTCTCCAATACGAGGTATCGATTCTACCATCATTACAACACCGTACACTCTGAATTGGCGCTTATTTGCTCGTATTGCGGCAAGAAGTTCAAAAACAAGAGTTATCTAAAGAGTCACCTTATCGTTCATACAGGGGCAAGAGATTTTACATGCAATGTTTGCTCAAAGGCGTTCAATACAGCAAAAGGCTTGCAGAATCATATGTGGACACACAGCGGCACGAAGCGATATCCTTGCGGGTATTCCACGCTGTACCCCTTCAGGTTGCGCGGAAATACTTTGATGTGCGTATACTGCTGCGACCAAATTGAAGACCCCGATGAGTACAGATGTCATATGGATGAAACTCACGATACTGTTGATCTTTACACAGCTTTCTCACACACTAAATGCAGGGACAGAGACTATCTCAAAGTAGACTGCATGAACTTAAAGTG
CAGATTATGTAACGAACCTCATAAAACAGTAGTTGATATGGCTCGTCATCTTAAAGAACGACACGATAATAAGGATATCAAAGATATGGATTTAAATTATGAAGTCGGTCTGCACCCATACCGTCTTATCAAAGACAAATGGTTCTGCCTTGTCTGCAATATGAAACAGCCTACTATAACTAAACTCACTAGACACATGTCTTCACATTTCGCAGACCACGCATGTGATGTCTGCGATAAAAGCTATTTGAACATTGAAAGCTTAAAGTTCCACGTCAAGTTTAGCCATTCGGAGAAATACGTCTGTAGGAAATGTACCAAGGAGTTTTCTTCGTCAAATGAAAGAAAGGAACACTTGAGAACATCTTCGAAATGCTGGGGATTTGCTTGCATTCATTGCAAAGAAAGATTTGCATCCTGGGAACGTAAACAAAAGCATTTAGTAGACAAACACAATGTTCAAGTGGTGTCTTATCCGTGCTCTGACTGTGGTGTGGTCTTCGATACAAGAGTACGCTTCTATCACCATTATAATAAGACACACTCAGATCTGGCGCACATTTGTTCTGCTTGTGGAAAAAAATTTGGGAGTAAAAGGTTTCTTGATGATCACATGCTAACTCACACACAGGAGAAACCGTTTCACTGTATTGTCTGTGATAAGGAATTCTCTAGAAAAAAGGGCTTAAATGAACATATGTGGACGCATAGTGATAAAAAAAGACACTCTTttCGTGATACGGCGAAACGGAACGCTGAAGTTGTGTTACAATATTCGACACTGTACCCGTTCAGGTTGCGCGGAAAAATCTTGGTCTGCGTTTACTGCTGTGACGAATTTGAGGAGCCGGACGAATACAGATGTCACATGGATTTGACGCACAAGGAAGCAATTCTGTACACTGCTTTCGCTCACACCGGCAGAAACAGGGACTACCTCAAGGTAGACTGTGTCAATCTCAAATGCAGAATATGCAACGCACCCTTTCAAACAGTTGTTGAAATTGCGAAGCACCTTAAAGAAACCCACGTCAATGAAGACATCAAGCAAATGGACTTGAGCTTTGAAGTTGCCTTACACCCATACCGTTTAACTAAAGACAAATGGTTCTGTGTCCTTTGCAACATGAAGCTACCGTCATTGACCAAACTTCGAAGACACATGTCTTCGCATTACGCGGATTATGTTTGTGATATTTGTGGACGAAGTTATTTGAAAGTTGATAGATTGCAATATCACGTCAAGTCCTCACATTCGGGAAAATTCCCATGTAGAAAATGTTGGAAggtttttcaaacaaaagaagAGAAGAAACATCATACTCCACAGAGGCGTAATGCCGAACTGATTCTAAAGCACTCAACTGCTTATCCGTTCAAAACTCGATTTAGTCAGATACTATGCGCTTTTTGTCACGAAGAGTACAACACTCTGACAGAATTACGATATCACATGGCCATTGATCACATTAATTCAGATTATAACAACGTCTTCTATAGAATAAACGACAATTTGATCAAAGTAGACATAACTGATTTAAAGTGTAAGATATGCTCTCTTGAAGTCCCTGATATAGACTCATTGATGACTCATTTTTCTCAGGAACATCAAAAACCCGTAAAATTTAATGCCCGTTTCGGGGTACTTCCTTACAAGCAGAATTCAGAGAACCAATGGCTATGTGTGTACTGTCAGAGACATTACTCTGAATTTGTGGCTTTTAAACGTCATATAGGAACacattttatgaacttttcTTGTGATAAATGCGGTACCACTTTTGTATCCGAACATGCTTTGCGAGATCATCATCGTCAAGTAAAATGCTTCCGAACTGCCTACCATCCTCGGAACGGGAAGATCATGAGGTCTCGCGGCAATGCTGAGATAATACTACAGTGTTCCACCGCTAATCCCTTTAGGACATGGAAGAACAATTTCAATTGCGTGTTTTGCCGTATTCAAACTAACGACCCCAGCAGTTTGCGGATGCACATGG
CAACCCGACATTCAAACTACGATGTACAATCTGCTTTCTATAAGAAACTAGGCAAAGACTTTCTCAGTGTAGATATAACAGATCTCCAATGTAAACTTTGCTTGGCGCCCCTACAAACTGTCGATGCTCTTACGCATCACTTGAAAAATGATCATCAACAACCTTTAAATTTAGAGGCTCAATTAGGTTTGCTACCCTTCAGATTGAACGATGGTTCCGTTTGGAAATGCACAATCTGTCCGAACCAATTCAAAGACTTCATATCACTAAAGAAGCACACTGCtgatcattttcaaaattatgtgTGCGATGCTTGTGGTGAAGGATTTATCACGGAGTCTGCGATGATAGCGCATACAAAAATACcacatgaaaataaatatagctGCAGTAGATGTATCGCAACGTTTTCGACTCTAGATGAAAGAAATATCCACGTAAAGACGCAACATACGTCGATGCCATACATGTGCATCTACTGTACTGAAAAGCCAAGGTTTGCCAACTGGGAATTGCGAAAAAGGCATCTAATGGAAGTTCACAACTATAGAACCGGAGCTGATAAATATGAATGCACCACGTGCCAAAAGACCTTCAAGACTAGGTCTGGAAAGTTCAACCACATGACGCGGAcgcacaaaattaaaaaagatgcAGAGCTTAATTTTCCGTGTATCAGTTGTCCGAAAGCGTTTACGACGAAACTGTTTTTAGACAAACACATGGCTAAGAAGCATTTTGATACTTGA
Protein Sequence
MCVYCCDEIEDPDDYRCHMDKMHKDVDLFTAFAHVKSRNRVSEKDYLKVDCTNLKCRLCDKPCDTISLVAQHLKLRHENNVKDIDLNYEVSLHPFRLIKDKWICMVCDTKLPTLTKLVRHMSSHFSDHVCDLCDKSYCTVQSLRYHIKFNHSKDHVCRKCRMEFSTSEEKKEHLKTSSKCWGFVCVHCKERFSSWDSKRRHLVSEHNAKAVTYPCPDCGAVFDSGSNFYYHYKMSHSDLPHMCSFCGRRFGKKCQLEDHLPTHTNRKQFQCNVCSKAFARKKGLRDHMRTHFKTKRTKGNSLLCGYCYEKIEDPDEFRCHMDEFHETVKINTAFAHTKSRDDFRIKVDCVDLKCRVCSDPFDTVADIAQHLIDLHKNVDNIIRDLDLNYGLGLHPYRLIKDNLFCLVCNIKLPTLTKLTRHMSSHYADYACDSCDKSYPNEENLRSHIKFSHSEKHICRKCCLEFSTLEERKQHLKTSTQCRGTVCTHCGERFSSWDKKQKHLVSKHNVQPLTYPCPDCDLVSNTRYRFYHHYNTVHSELALICSYCGKKFKNKSYLKSHLIVHTGARDFTCNVCSKAFNTAKGLQNHMWTHSGTKRYPCGYSTLYPFRLRGNTLMCVYCCDQIEDPDEYRCHMDETHDTVDLYTAFSHTKCRDRDYLKVDCMNLKCRLCNEPHKTVVDMARHLKERHDNKDIKDMDLNYEVGLHPYRLIKDKWFCLVCNMKQPTITKLTRHMSSHFADHACDVCDKSYLNIESLKFHVKFSHSEKYVCRKCTKEFSSSNERKEHLRTSSKCWGFACIHCKERFASWERKQKHLVDKHNVQVVSYPCSDCGVVFDTRVRFYHHYNKTHSDLAHICSACGKKFGSKRFLDDHMLTHTQEKPFHCIVCDKEFSRKKGLNEHMWTHSDKKRHSFRDTAKRNAEVVLQYSTLYPFRLRGKILVCVYCCDEFEEPDEYRCHMDLTHKEAILYTAFAHTGRNRDYLKVDCVNLKCRICNAPFQTVVEIAKHLKETHVNEDIKQMDLSFEVALHPYRLTKDKWFCVLCNMKLPSLTKLRRHMSSHYADYVCDICGRSYLKVDRLQYHVKSSHSGKFPCRKCWKVFQTKEEKKHHTPQRRNAELILKHSTAYPFKTRFSQILCAFCHEEYNTLTELRYHMAIDHINSDYNNVFYRINDNLIKVDITDLKCKICSLEVPDIDSLMTHFSQEHQKPVKFNARFGVLPYKQNSENQWLCVYCQRHYSEFVAFKRHIGTHFMNFSCDKCGTTFVSEHALRDHHRQVKCFRTAYHPRNGKIMRSRGNAEIILQCSTANPFRTWKNNFNCVFCRIQTNDPSSLRMHMATRHSNYDVQSAFYKKLGKDFLSVDITDLQCKLCLAPLQTVDALTHHLKNDHQQPLNLEAQLGLLPFRLNDGSVWKCTICPNQFKDFISLKKHTADHFQNYVCDACGEGFITESAMIAHTKIPHENKYSCSRCIATFSTLDERNIHVKTQHTSMPYMCIYCTEKPRFANWELRKRHLMEVHNYRTGADKYECTTCQKTFKTRSGKFNHMTRTHKIKKDAELNFPCISCPKAFTTKLFLDKHMAKKHFDT
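
A quick consistency check between the coding sequence and the protein sequence is to translate the former. The sketch below assumes the standard genetic code (the record does not state which translation table was used) and uses a hypothetical helper, `translate`. It strips whitespace, so a coding sequence wrapped across several lines can be passed in directly, and it upper-cases the input so lower-case bases are handled.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (assumption: standard translation table).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence; unknown codons become 'X', stops become '*'."""
    seq = "".join(cds.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First eight codons and last three codons of the coding sequence in this record.
start = translate("ATGTGCGTTTATTGCTGTGACGAA")
tail = translate("GATACTTGA")
print(start, tail)
```

The translated start can be compared with the first residues of the protein sequence, and the final `*` marks the stop codon that the protein sequence omits.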

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-