Agos001882.1
Basic Information
- Insect
- Aphis gossypii
- Gene Symbol
- Zbtb41
- Assembly
- GCA_004010815.1
- Location
- NW:479514-490410[+]
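The Location field packs scaffold, coordinates, and strand into one string. A minimal sketch of splitting it apart is shown below; the `Location` type and `parse_location` helper are hypothetical names introduced here, and the span length assumes 1-based inclusive coordinates, which the entry does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for the parts of a Location string."""
    scaffold: str
    start: int
    end: int
    strand: str


# Expected shape: <scaffold>:<start>-<end>[<strand>]
LOCATION_RE = re.compile(
    r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a Location string into scaffold, start, end, and strand."""
    m = LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(m["scaffold"], int(m["start"]), int(m["end"]), m["strand"])


loc = parse_location("NW:479514-490410[+]")
# Assumption: coordinates are 1-based and inclusive.
span = loc.end - loc.start + 1
print(loc, span)
```

The scaffold name is kept exactly as it appears in the entry.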
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the number after each X gives the number of residues (a range where shown in parentheses). The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
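| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 23 | 5.9e-06 | 0.00045 | 20.0 | 0.4 | 1 | 23 | 107 | 129 | 107 | 129 | 0.96 |
| 2 | 23 | 5.6e-05 | 0.0042 | 17.0 | 2.7 | 2 | 23 | 139 | 161 | 138 | 161 | 0.97 |
| 3 | 23 | 0.014 | 1 | 9.5 | 4.7 | 1 | 23 | 194 | 216 | 194 | 216 | 0.99 |
| 4 | 23 | 0.29 | 22 | 5.3 | 1.1 | 2 | 23 | 222 | 243 | 221 | 243 | 0.92 |
| 5 | 23 | 0.18 | 14 | 5.9 | 3.5 | 1 | 20 | 248 | 267 | 248 | 269 | 0.91 |
| 6 | 23 | 0.028 | 2.1 | 8.5 | 0.9 | 2 | 21 | 291 | 310 | 290 | 315 | 0.92 |
| 7 | 23 | 0.00027 | 0.02 | 14.8 | 0.5 | 1 | 23 | 316 | 338 | 316 | 338 | 0.97 |
| 8 | 23 | 0.038 | 2.8 | 8.1 | 1.9 | 1 | 23 | 715 | 737 | 711 | 737 | 0.88 |
| 9 | 23 | 0.00055 | 0.041 | 13.8 | 0.3 | 1 | 23 | 807 | 829 | 807 | 829 | 0.96 |
| 10 | 23 | 0.043 | 3.2 | 7.9 | 3.6 | 1 | 23 | 835 | 857 | 835 | 857 | 0.97 |
| 11 | 23 | 0.036 | 2.7 | 8.1 | 10.4 | 3 | 23 | 913 | 934 | 912 | 934 | 0.96 |
| 12 | 23 | 0.0001 | 0.0076 | 16.2 | 1.1 | 3 | 23 | 977 | 997 | 976 | 997 | 0.98 |
| 13 | 23 | 0.00019 | 0.014 | 15.3 | 3.1 | 1 | 23 | 1003 | 1025 | 1003 | 1025 | 0.99 |
| 14 | 23 | 1.1e-05 | 0.0008 | 19.2 | 1.5 | 1 | 23 | 1067 | 1089 | 1067 | 1089 | 0.98 |
| 15 | 23 | 0.0024 | 0.18 | 11.8 | 0.9 | 1 | 23 | 1094 | 1116 | 1094 | 1116 | 0.97 |
| 16 | 23 | 0.0023 | 0.17 | 11.9 | 0.8 | 2 | 21 | 1148 | 1167 | 1147 | 1168 | 0.94 |
| 17 | 23 | 4.6e-05 | 0.0035 | 17.2 | 2.2 | 1 | 23 | 1198 | 1221 | 1198 | 1221 | 0.97 |
| 18 | 23 | 0.00015 | 0.012 | 15.6 | 0.1 | 2 | 23 | 1267 | 1289 | 1266 | 1289 | 0.95 |
| 19 | 23 | 6.4e-05 | 0.0048 | 16.8 | 4.1 | 2 | 23 | 1297 | 1319 | 1296 | 1319 | 0.96 |
| 20 | 23 | 0.0008 | 0.061 | 13.3 | 0.9 | 1 | 23 | 1351 | 1373 | 1351 | 1373 | 0.97 |
| 21 | 23 | 0.00021 | 0.016 | 15.2 | 0.2 | 1 | 23 | 1378 | 1401 | 1378 | 1401 | 0.95 |
| 22 | 23 | 0.00049 | 0.037 | 14.0 | 0.2 | 2 | 23 | 1433 | 1454 | 1433 | 1454 | 0.98 |
| 23 | 23 | 2.9e-05 | 0.0022 | 17.9 | 2.6 | 1 | 22 | 1460 | 1481 | 1460 | 1481 | 0.96 |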
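
The C2H2 pattern quoted in the Description can be written as a regular expression. The sketch below is one possible reading, not the profile HMM that hmmscan actually uses: it treats each `#` position as "any residue" (an assumption, since the description only says those positions matter for the fold), and it checks the pattern against residues 107-129 of this entry's protein sequence, the alignment span of the first hmmscan hit.

```python
import re

# Regex reading of  #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C].
# Assumption: '#' is matched as any single residue.
C2H2_PATTERN = re.compile(
    r"."        # '#'  fold-stabilising position
    r"."        # X
    r"C"        # first zinc-coordinating Cys
    r".{1,5}"   # X(1-5)
    r"C"        # second zinc-coordinating Cys
    r".{3}"     # X3
    r"."        # '#'
    r".{5}"     # X5
    r"."        # '#'
    r".{2}"     # X2
    r"H"        # first zinc-coordinating His
    r".{3,6}"   # X(3-6)
    r"[HC]"     # final His or Cys
)

# Residues 107-129 of the protein sequence in this entry.
fragment = "FYCNICGKSFALNYLLKNHLAEH"

match = C2H2_PATTERN.fullmatch(fragment)
print("matches C2H2 pattern:", match is not None)
```

A plain regex like this is stricter and cruder than a profile search, so it should not be expected to recover every domain that hmmscan reports.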
Sequence Information
- Coding Sequence
- ATGACTCCAATGAAAGATTACATAAAAGTTGAAGTCACTGGTGATGCTGATGCTGTTACCagtGAATTATATGTTGAAAACACTATTGAAAATCCTGCATGGTATACAGTTGAAGATAATCATGTATCTTTGGTatctgaaaatattgaaaaggtTATCAAAGTTGAAACCGATATGTTGGATGACTGTATGTGGGAAAGTAGAATTGAAATTTCTAAAGATGTTGCCAACCAAAATAATGCCGAGTTATCCTTGGAATCTCAGAATGTTGCAGTCAATAATATTGACTTAAATCCTGAAAAGacaactaataaaatgttttactgtaatatttgTGGAAAAAGTTTCgctttaaattatcttttaaaaaatcacttGGCAGAACATAAAAgacgaataaatattaaaaacaacaaatgtaaatattgtggAAGGcagttcaaattaaaatttggtttGAATAGACACAttcaaaaatttcataatGAACACATGAAAAGCCATGAAAATAATGAAGCTAGGATtcctaatataaaattgcgAAAATTAAATGGTGTTTGGAAAAGTAagacaaatgtatttaaatgtaacatttgttttctaaaatgtcatagtgaaaaaaatttagtagaACATATGGAAACTCATGACAATTGTCTAACAGAGTGTAATGATTGCTTTAAGTGTGTATCAGTTTCTGAATTGGATAATCATATGgaacaattacataataaacaattgtataaatgtCAGAAATGTAACCtacaatttgtaaaacaaCAACATTTAAACTGTCATTTAATTCATTGCTTAAAAAGTTTGCCAACTAAATCAAATAAGGTTTGTGGACCATTCAAAGATGGTTCAACAACTTGCTCTGCTTGtggaattatatttaataattctttgtcattgaaaaatcatataacTATGGGCTGTAAGCATTACACTTGTAAATATTGTGGacaatattttagttcaaGGATTTTATGGCTAAGACATGTGGATGTTCATGAAAAAGATGAACAACGTGTAATGTTtagagtaaataaaatatctacaattagaaaatataatcaaaatgagGCTGTAAATGATACTTCTTCTAATACAAGTGAAGGACAATTATTATCCAGAAATGTTACAAGAAGAaataattcagaaaaaaacgatttaatgAACAGCAATTCAACAACTGTATCAACTGCTTCAAATAATTCAATGGCAAATGATtatgttattcaaaatataaagcaagaaaaatacttaaatgcaAATGCCAAAGAAAGAAATACTTCGCATCAAGTGATTAACAATCAATCAACTGTATCTGTTGTTCCtcaaaattcaataacaaacaattacctttttttaaatattaaacaagaaaaaatcACGAGTACAGATGCCGTAGAAACAAATAGTTcagatcaaataaatttagcaAACAGTAATGAATCAACAATTTCAACTTTTTCACATAACTCAATAGTAAACGATTGCCATGTCTTAAATATTAAGCAAGAAAAAAACACAAGTACAAATGTCATAGAAACAAATAGTTcagatcaaataaatttagcaAACAGTAATGAATCAACAATTTCAACTTTTTCACATAACTCAATAGTAAACGATTGCCATGTCTTAAATATTAAGCAAGAAAAAAACACAAGTACAAATGTCATAGAAACAAATAGTTcagatcaaataaatttagcaAACAATAATGAATCAATAATTTCAACTGTTTCGCAGAACTTAATAGTGAACGATTGCCTTGTCTTAAATATAAAGCAAGAAAAAAACACGAGTTTAAATGTCAAAGGAAGGGAAAATTCTGATCAAATGGACTTAGTGAATGACAATGAATCAACTGTGGAAAACAGACTTCATGATTTAATGGTACATGAAGaccctattaatattaatgtagctAATAATTCTGTTTCTACAATTAGAGAAACACCTTATATCTGTATTAGAAAATTgccttat
attaaaaatgtattttgcaaTGATGTTAATGATCAGACatcacaaaataatgttatatttatatgtagaatttgtaaaaaatctcCGTCTACAGATATACATTCATTCGCATTGCACATGAGTGAACATGTTGAATGCAACATGCACGAATGTATAGTGTGTGATAAGACATTTAGCACTGTACTTCTTTGGACAAACCATATGACTAATCATCAACAgcaattagatttaaatatatcagcAGTTCAGTGTAATCTTACAGAAATTGAATCTAATACATCAGGCCCAATTGATTCTGGAAATGTGGAACCACAAGTTTGTTCAACCTCAAGAAATAGAAAATTCAAGATGTCTTTATTGGATGATTCATATTCAGATATAACCAGCATGAATAATAGTgtcaataaagtaaaatatcaatatgacTGTAGCacatgtaacataatatttccttCCAAAGTCAAATTAAAAGCTCATCAAACTCTTCATGCTAAACCACAATCATTTTCATGTAGGTATTGTGATAGAATATTTTCTGGAAAGGGTCAGTGTACAAACCATGAAAAATCTCACATTGGTTTAGACCAAGcgtgttttcaaaataatgagaATTATACACCTGAAGCAACTTTAATCCAAAACAATATAAGCATAgaaaataacaacattaaCACGGAGACAACgtcaattaaacataaaaataaaaacaaactatcatcaactataaaaacaaatcattgtcacttatgtaataaaaaattttcaaagcgctgttattttacaaatcatatgcaaataaagcataatattaatccAAATTTACCAAAACAACTTATATCAAATTctgataatgaaaataaattaccagtTCCGGTAGATAGTGATTTGAATAATCAtcttaataaaagtaatgagaagaaatcaaaattttgtaccatttgtaataaatattttgcacaTATGGGTGCCTTAACCAATCACATGAATATTCATTTAGATCGTAAACCATACAAATGTCAATATTGTTCCAAACAATTTAGTAAGGAAGCCTCTTATATTTTGCATCAGAAAAAACATGTGCCAAAAAATGAAGTTAATGAAGAACCACAAATCGAAAATAGTAATGATTTTGAGTTGGGTTATAATCCAGCTGATGGGAATTCTaatgatacaaatttaattaacatgaaAGATTTATGGTTTACCTGCGATGTATGTGAGAAAAAATTCTCTACTCCGTTCCAATTGAATTTACACCGAAAATTACATTCTAATGTACCATATGTATGTAAGATTTGTAATAGATCTTATTCATTGAAATTTCGATGGAATTGGCATTTAAAAGGCCATTACCTAAGacatattaaaagtttaaaaagtaaacgaaatactaatgatataaaatcaaaGATTGCATCTATACAACATTATCAAGATAGAATAAAATGTCGATAttgtaaaaaagaatataactCAATATCTCAGTGGAAAAAACATATGACAATGAACAAGGAATGTCGTCGCCATTGTAAAAATAACCTTCCAGAATTTGCATCAAATCATGCTTCAAAGAATCGTACCAAATCTTGTCGTTTTAAGTGTAATATTTGTAAGAAAACATATAGCACCTCTTACAATCGttcaatacatataaaaaatgttcacaaCGAGTTAGATTctacagttttaaataatcataataataccaattttCAAAAAGACAAACCAATGGAAATAATTCAACAGAAACCAGTgattaaaaaacgaaatagAAGTAATCATATTAATGGCAGTGGCACTAAATGTAAGTTATGTGGCAAAGTATATAGTAGTGTAGCTAATTTAGGCCGCCATGTATCTATAGTTCACACAAAGTGTTATGAACCTATGACCTGTAATGTTTGTGGAAGGACattcaaacataaattttcttttagagaacatttaaaaattaaacacaaaaaattgtttaaaaattatgaagaaataaacgttaaaaa
taaaagagccAACAGgaaaattataccaataaataaaaaaagaattatgaaatacttttgcaaaatttgtaaaatgaaGTTTGCtgacaatattacattacaaGAACATTCAAAGATACATATAGTagatatgtataattgtaaagACTGTGGTCAGCAATTTGAGACAAATGTTACTCTAGGAAATCATATTATGGAAAACCACAGtggtgataattttatatctaataaaaacaatcatatTAAGAATTATGAGAACCAAAATACTTCtttggaaataattaataataatcctgCTCAATGTAAAGTTTGTTTGAAAATCTTAAAAGATCCTGGATATTTACGTGAACACATGAGATTACATACAGGTGACAAGCCGTTTAAATGTGATCTGTGCAATATGACTTTCAGATTTAAATCGAACTTAAGAATGCATCAGAAGAAAGATATGCCATgttatataccataa
- Protein Sequence
- MTPMKDYIKVEVTGDADAVTSELYVENTIENPAWYTVEDNHVSLVSENIEKVIKVETDMLDDCMWESRIEISKDVANQNNAELSLESQNVAVNNIDLNPEKTTNKMFYCNICGKSFALNYLLKNHLAEHKRRINIKNNKCKYCGRQFKLKFGLNRHIQKFHNEHMKSHENNEARIPNIKLRKLNGVWKSKTNVFKCNICFLKCHSEKNLVEHMETHDNCLTECNDCFKCVSVSELDNHMEQLHNKQLYKCQKCNLQFVKQQHLNCHLIHCLKSLPTKSNKVCGPFKDGSTTCSACGIIFNNSLSLKNHITMGCKHYTCKYCGQYFSSRILWLRHVDVHEKDEQRVMFRVNKISTIRKYNQNEAVNDTSSNTSEGQLLSRNVTRRNNSEKNDLMNSNSTTVSTASNNSMANDYVIQNIKQEKYLNANAKERNTSHQVINNQSTVSVVPQNSITNNYLFLNIKQEKITSTDAVETNSSDQINLANSNESTISTFSHNSIVNDCHVLNIKQEKNTSTNVIETNSSDQINLANSNESTISTFSHNSIVNDCHVLNIKQEKNTSTNVIETNSSDQINLANNNESIISTVSQNLIVNDCLVLNIKQEKNTSLNVKGRENSDQMDLVNDNESTVENRLHDLMVHEDPININVANNSVSTIRETPYICIRKLPYIKNVFCNDVNDQTSQNNVIFICRICKKSPSTDIHSFALHMSEHVECNMHECIVCDKTFSTVLLWTNHMTNHQQQLDLNISAVQCNLTEIESNTSGPIDSGNVEPQVCSTSRNRKFKMSLLDDSYSDITSMNNSVNKVKYQYDCSTCNIIFPSKVKLKAHQTLHAKPQSFSCRYCDRIFSGKGQCTNHEKSHIGLDQACFQNNENYTPEATLIQNNISIENNNINTETTSIKHKNKNKLSSTIKTNHCHLCNKKFSKRCYFTNHMQIKHNINPNLPKQLISNSDNENKLPVPVDSDLNNHLNKSNEKKSKFCTICNKYFAHMGALTNHMNIHLDRKPYKCQYCSKQFSKEASYILHQKKHVPKNEVNEEPQIENSNDFELGYNPADGNSNDTNLINMKDLWFTCDVCEKKFSTPFQLNLHRKLHSNVPYVCKICNRSYSLKFRWNWHLKGHYLRHIKSLKSKRNTNDIKSKIASIQHYQDRIKCRYCKKEYNSISQWKKHMTMNKECRRHCKNNLPEFASNHASKNRTKSCRFKCNICKKTYSTSYNRSIHIKNVHNELDSTVLNNHNNTNFQKDKPMEIIQQKPVIKKRNRSNHINGSGTKCKLCGKVYSSVANLGRHVSIVHTKCYEPMTCNVCGRTFKHKFSFREHLKIKHKKLFKNYEEINVKNKRANRKIIPINKKRIMKYFCKICKMKFADNITLQEHSKIHIVDMYNCKDCGQQFETNVTLGNHIMENHSGDNFISNKNNHIKNYENQNTSLEIINNNPAQCKVCLKILKDPGYLREHMRLHTGDKPFKCDLCNMTFRFKSNLRMHQKKDMPCYIP
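
The Protein Sequence should be the translation of the Coding Sequence. A minimal sketch of checking this on short excerpts follows, assuming the standard genetic code (NCBI translation table 1). The `translate` helper is a hypothetical name; the mixed-case nucleotides are uppercased before lookup, because the entry does not say what the lowercase letters mean.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 and last 15 nucleotides of the Coding Sequence in this entry.
cds_start = "ATGACTCCAATGAAAGATTACATAAAAGTT"
cds_end = "Tgttatataccataa"

print(translate(cds_start))  # compare with the start of the Protein Sequence
print(translate(cds_end))    # compare with the end of the Protein Sequence
```

Running the same function over the full Coding Sequence, with the line breaks removed, would give a complete comparison against the Protein Sequence field.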
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00135596; iTF_00134739;
- 90% Identity
- iTF_00135596;
- 80% Identity
- -