Sinv013237.1
Basic Information
- Insect
- Solenopsis invicta
- Gene Symbol
- ZNF236_1
- Assembly
- GCA_016802725.1
- Location
- NC:7458142-7470657[-]
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and the numbers, including the ranges in brackets, give the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
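| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|----|----------|----------|-------|------|----------------|--------------|----------------|--------------|----------------|--------------|-----|
| 1 | 23 | 0.00017 | 0.015 | 15.4 | 0.6 | 2 | 23 | 123 | 144 | 123 | 144 | 0.98 |
| 2 | 23 | 1.6e-06 | 0.00014 | 21.8 | 0.7 | 1 | 23 | 165 | 187 | 165 | 187 | 0.99 |
| 3 | 23 | 3.8e-07 | 3.3e-05 | 23.8 | 2.5 | 1 | 23 | 193 | 215 | 193 | 215 | 0.98 |
| 4 | 23 | 6e-06 | 0.00053 | 20.0 | 1.2 | 1 | 23 | 221 | 244 | 221 | 244 | 0.96 |
| 5 | 23 | 0.0026 | 0.23 | 11.7 | 1.9 | 1 | 23 | 253 | 276 | 253 | 276 | 0.97 |
| 6 | 23 | 2.3e-06 | 0.0002 | 21.3 | 0.3 | 1 | 23 | 399 | 421 | 399 | 421 | 0.97 |
| 7 | 23 | 0.00041 | 0.036 | 14.2 | 0.8 | 1 | 23 | 427 | 449 | 427 | 449 | 0.99 |
| 8 | 23 | 0.0041 | 0.36 | 11.1 | 3.5 | 1 | 23 | 455 | 477 | 455 | 477 | 0.97 |
| 9 | 23 | 0.00013 | 0.012 | 15.7 | 2.2 | 1 | 23 | 481 | 503 | 481 | 503 | 0.98 |
| 10 | 23 | 0.0034 | 0.29 | 11.3 | 7.4 | 1 | 23 | 559 | 581 | 559 | 581 | 0.98 |
| 11 | 23 | 2.3e-05 | 0.0021 | 18.1 | 4.9 | 1 | 23 | 587 | 609 | 587 | 609 | 0.99 |
| 12 | 23 | 2e-06 | 0.00018 | 21.4 | 2.1 | 1 | 23 | 615 | 637 | 615 | 637 | 0.98 |
| 13 | 23 | 0.00013 | 0.012 | 15.8 | 6.4 | 1 | 23 | 643 | 665 | 643 | 665 | 0.98 |
| 14 | 23 | 0.6 | 53 | 4.2 | 2.0 | 2 | 23 | 1015 | 1036 | 1014 | 1036 | 0.92 |
| 15 | 23 | 1.3e-05 | 0.0011 | 18.9 | 2.7 | 1 | 23 | 1042 | 1064 | 1042 | 1064 | 0.98 |
| 16 | 23 | 0.36 | 32 | 4.9 | 1.0 | 2 | 23 | 1071 | 1092 | 1070 | 1092 | 0.96 |
| 17 | 23 | 2.1e-08 | 1.8e-06 | 27.7 | 1.4 | 2 | 23 | 1168 | 1189 | 1167 | 1189 | 0.98 |
| 18 | 23 | 6.6e-05 | 0.0058 | 16.7 | 5.0 | 1 | 23 | 1195 | 1217 | 1195 | 1217 | 0.99 |
| 19 | 23 | 0.00021 | 0.019 | 15.1 | 4.0 | 1 | 23 | 1223 | 1245 | 1223 | 1245 | 0.97 |
| 20 | 23 | 2.5e-05 | 0.0022 | 18.0 | 2.6 | 1 | 23 | 1251 | 1273 | 1251 | 1273 | 0.98 |
| 21 | 23 | 2.3e-06 | 0.0002 | 21.3 | 0.6 | 2 | 23 | 1510 | 1531 | 1509 | 1531 | 0.97 |
| 22 | 23 | 9.3e-05 | 0.0081 | 16.2 | 6.2 | 1 | 23 | 1537 | 1559 | 1537 | 1559 | 0.98 |
| 23 | 23 | 0.0015 | 0.13 | 12.4 | 8.3 | 1 | 23 | 1565 | 1588 | 1565 | 1588 | 0.98 |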
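The rows of the hmmscan domain table above can be read programmatically to separate strong zinc finger hits from marginal ones. The following is a minimal Python sketch, assuming each row is available as 13 whitespace-separated fields in the column order shown above. The names `DomainHit`, `parse_hit`, and `confident_hits`, and the i-Evalue cutoff of 0.01, are illustrative assumptions and are not part of this record or of HMMER itself.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    index: int       # domain number (column "#")
    i_evalue: float  # independent E-value
    score: float     # bit score
    ali_from: int    # alignment start on the protein (1-based)
    ali_to: int      # alignment end on the protein (inclusive)


def parse_hit(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the 13-column domain table."""
    fields = row.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    return DomainHit(
        index=int(fields[0]),
        i_evalue=float(fields[3]),
        score=float(fields[4]),
        ali_from=int(fields[8]),
        ali_to=int(fields[9]),
    )


def confident_hits(rows: List[str], max_i_evalue: float = 0.01) -> List[DomainHit]:
    """Keep hits whose i-Evalue is at or below the (assumed) cutoff."""
    return [hit for hit in map(parse_hit, rows) if hit.i_evalue <= max_i_evalue]


# Three rows copied from the table above (domains 1, 14 and 17).
ROWS = [
    "1 23 0.00017 0.015 15.4 0.6 2 23 123 144 123 144 0.98",
    "14 23 0.6 53 4.2 2.0 2 23 1015 1036 1014 1036 0.92",
    "17 23 2.1e-08 1.8e-06 27.7 1.4 2 23 1168 1189 1167 1189 0.98",
]

for hit in confident_hits(ROWS):
    print(hit.index, hit.ali_from, hit.ali_to, hit.i_evalue)
```

With the assumed cutoff, only domain 17 of the three sample rows is kept. A looser cutoff such as 0.05 would also keep domain 1, so the choice of threshold should be matched to the analysis at hand.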
Sequence Information
- Coding Sequence
- ATGTTGTGTTCATTTATTACCGGCGTTGATACCGCGAAAAAGACGTTTTTTCGAAACGACCGATCGATAGCCGCGATAATTCGGTGGGTCACCACGTTGTGTTCGTCCTACTTTTTACACCAAACACTGCCATATAAGTACCCACTGCCTGCCGGCGCGTATTACTTTATGTTACACGATATAATCCGAGCCGCAGTATCGTCTTTCGATATGTCGTTCGCCGTAGACTGCCTGAATTTCGCGCTTCTCTACATCTGCACGCGCGTGTTCCATCTCGCGCTTGGTATCAGCGCTCAGTATTATAAGGCTCAATTGGACGCGCATTTAAAGCTTCATGGAGAAAAGTGGGCGAGTGAGGATGTGCGACAATGTAAATTATGCAAAAAACAATTCGTGCAACCTGCACTCTACCGAATGCACATTCGCGAGCACTATAGGCAACAGACGAAGATTGTAAAACAAACCAAGAGGGGAACGAAACATAAAACAATGTACAAATGCACCATATGCCTGAAGATTTTTCAAAAACCGAGCCAATTAATGCGACATATCAGAGTACACACTGGAGAAAAGCCCTTTAAGTGTACTATCTGCAACAGAGCATTTACGCAGAAAAGTTCGTTAAAAATCCATACGAGGCAACACAAGGGCATTCGGCCGCATACATGTAATCTCTGTAACGCCAAGTTTAGTCAGAAAGGGAATCTAAAAGCGCATATACTTAGAGTACACAATGCACCAGAAGGTGAGCCCACGTATGCATGCTCCTACTGTTCTTGCATCTTTAAGAAGCTAGGCAGTTTAAACGGACATATAAAGCGAATGCATTCCATTTCTTCAGAGGAATCTGCCGCCAAGTCGTCAGAAGCCAATCTTGATATCTCGTCGGATGCTGATATGCGCGCGACCGTCGACAGTGTTATATCGCAATTGGCATCTTTGGAATCCGTTGTGAATAATACCGCAGATTCTACGAAAACCACGACCCTGAGTACGGAGCAGAATGATATTCTGCAACAAGCGTTGAAAAACAGCGGTTTGCCGAACAAAAACGAGGTCTCTTCGAAAGAAACCGCAGAATCGAAGAAAAGTGATTCGCCAACGTCGTACGTCACGCTATTGGACAGAACCGCCGGCGGTACTTCGAGGAAGTATCTCACTATAAAACAACGCTGCGTAGGAAACGTACGATGGTACGCATGTTTGTTCTGCCCAAAAGAATTCAAAAAACCGTCGGACTTGATACGCCATTTGCGCGTACATACTCAAGAGAAGCCTTTCaagTGTATGTATTGTGTGCGTTCTTTCGCGCTAAAATCGACAATGATAGCGCACGAGCGAACTCATACCGGAGTGAAGAAATACGCTTGCGATTCCTGCGACAAAACCTTTGCGTGTCATAGTAGCCTTACCGCTCATACGAGATTACACACGAAACCGCACAAATGTAACATATGCGACAAGTCATTTAGTGCGAGCACCATCCTGAAGAATCACATAAAGAGTCACATGCGGGAGAAGCCCAAGATATCGCCGGAAGCGGAGAGTCTGGTGCCGCAAGTGGTGCTGCAAGAACCGCTCGTCATCAGCGATGTCGGTAATCAAATCAGCGTGGCGCAGGTGCAATCAAAACAGAAACACTTGTATGAAAATGTCAATGGCGTGGCTAGGCCGCACAAATGTTTGGTCTGTCACGCGGCGTTCAGAAAAGTTAGCCACTTGAAGCAACATTATCGTAGGCATACGGGCGAGCGTCCTTATAAGTGTTCCAAGTGCGACAGGAGATTTACATCAAACAGTGTCCTGAAGTCTCACTTACATACGCATGACGACGCAAGGCCATACAGTTGTTCGTTATGTGACACGAAATTCTCCACGCAGAGCAGCATGAAAAGGCACATGGTCACTCATAGCAATAAAAGGCCGTATATGTGTCCGTATTgccaaaaaacatttaagacgTATGTTAATTGTCGGAAACATATGAAAAGGCACAAG
CATGAGCTTGCACAACGGCAATTGGAGGAACAAAAGACGCAGAATCAGAAGGAGCTGCAGCCTTTAAGTGAAAATAAAGAACCTACCTCGACATTGTCGAAGAACCTCACTTTGGCCGAGAATATGATATTCCAACCGCAAATGGCACCAGATCTTACACAGGATTTCTCCGATCAGTTTCAAAATATCAATGCGGAGAAAGAGaaatccttttctctttctctgtcggACAACGGAACGTCATCGATAGGGAATCAAAACTTATCTATTGACACAACTAATTTAGGAACATCGCAAACTCTGCACGCCGATGAAACCGGTACTATTACGTTATCTCATTACACAGGGGATCAAACTCTTACGCCgGAAAGTATACGAGAGATCGAAGAGCTGTTTAATATTGGCGTGAATCTCGGACTGGGAGCGAATCTACCAAGGCAAATGGATGAAACAAACGCGAGTCCTCTCGACAATTCTAGGGAACAACCAGTGctcaatattatatatgaacATAACAAGACTTTGGAATCGTCAGGCAACACTATATTTATGTCGCAATTTGATTCGTTCGATATGAACCAGATTTCATTGCAGACGGATAACGATTTAGACATCAGTCTGAATCCGAGCAATTCCACGAGTATGTCCAATATTTTGCCGAGATCCGTGGGAAGCAACGAGCAAGAGGAGCGACAAGCTGCTTCAGTTACTACCGTTAGCAATATAGACGCTTCTCAAATTACGAGAAACACTCATCTAGTGGTGATAAATCCTAAGGAGAATTGCTCTGAATTAGTGAGACTTTCACAAGTATGCCCGCCAAAATATTCGAAAGTCATCGCGAAGGCGGACACCACGAATGAATCGGAGATACATGGCGTGAATGAGCACGAAAAAGCATCGCAGAAAAGTAACGCTGTGTCGTGCGACAACGGAAGAAATTCTTTATTCATGAAAAATTTCCAAACTACAATGCAGTCGGAGACGCTTGTTTCTCCCGGTATGGACAGCTCGGTCAAAATATCGGAGTGTAACACTTTATTACAGTGTCACATGTGCGGCAATCAAGGTTTCACGACAGAAAGATTGAAGGAACACTTGAAATCTCATCGCGGTGCTAAAGAGTTTGAGTGTTCGGAGTGTTCCCAGCGATTCTGCACCAACGGCGGTCTCAGCAGGCATATGAAGATGCACACAAACAAACAAAGATGGAAATGTACGACCTGTCAAAAGATAATGGGCAGCAAACTGCAATTGAAAGTTCACAACAAAATGCATACCGAGACTTGGAACGTTTTGCCCATGGAACCGGACAGCTCTCAGAAAGAAAGTTCTCTTATTTCCTCGAGCTCGGGTCTACCGGCGATAACAACGCTAGATGTCCAGGTAGTTCCGAACCCGTCAGTTTCCGAGAAGGTTTTGATGGCCGCAGTCGCCGAAAGAAAAAGCATGGATCGTTTTAATGAGAATGTGGAGAAGAAAGAGACAAGAGAGTACACTAATAAATGCAAGTACTGTCCGAAGACTTTCCGCAAACCAAGCGATCTCATCAGGCACATTCGCACGCATACGGGCGAACGGCCGTACAAGTGCGACCACTGCAACAAGAGCTTCGCGGTTAAGTGCACTTTGGATTCGCATATGAAGGTTCACACGGGCAAGAAGACATTCTGCTGCCATGTTTGCAGCAGCATGTTTGCTACGAAAGGCAGTTTGAAAGTCCACATGCGATTGCATACAGgTTCAAAACCGTTTAAGTGTCCCGTGTGCGATCTTCGATTCCGTACTTCAGGTCACAAGAAGGTACATATGTTAAAGCACGCAAGAGAGCATAAGGGTGGCACGAAACGCAAGCCGAAACACTCGAAAATTGCTGCAGTGGCGGAAGCGGCAGCCAGCCTCGAAAAAATCGGCAGTACTCTGGACAATGCATCGATGACAGCGACGACGACCACTACAGCCCTAATCAACAATGAGACCGTGCCATCGCAGAATCTGGA
TTATTCCACATTGGAACAAACAGTAAATCTCGATACTACAGTTCACCTGCCGAATCAACTCGCGTGCAACCACGCTGACGCGACAACGATTCTTaacaataattcaatattatccGTGAACGAGAATAACGAGCTGGTAGCGAATCTACAGTTCCTGTTGGCGAACGGTCTCGTCACCATTCAAACGGATGACACCTTGTTGACACAGCCAGCGACAAACGATGCCACCAGTGCCGGAATGCCAACTAACGTGATCGAGTTGATCAATCCAGATCCTTTTCAGGAGGCTTGCAACTTGGGCAATACCAATCATGTGGTGATAACCAGCCAGGTACCGACGGAGACAACTACCGCGGACATGAGCAATGCGACAATAATTCAGATGAATGATTGTATGCCTCCTGTGCCCCTGGTACAGATGCAGACTCAAGATACAAGCAGCGTTAGCATGTCTGAAATGAGGACGAGCAATCAAACGACAATGACGACATCAACAACAACGTCAAAAGCTTCGCAGCCTTCAAGAAAGGAGTGTGATGTGTGCGGCAAAACGTTTATGAAACCCTATCAATTGGAACGACACAAGCGCATTCATACAGGGGAACGGCCGTACAAGTGTGAACAATGCGGCAAATCATTCGCGCAGAGGTTTACGCTGCATCTGCACCAAAAGCACCACACCGGCGACCGGCCGTACTCTTGCCCACACTGCAAGCACCTGTTCACGCAGAAGTGCAACCTACAAACGCATCTGAAACGTGTCCACCAACTGGTCATGCTCGACGTGAAGAAGCTGAAGAGTGGCCAACAGATGCTGGGCGCGCTTCTGCAGGACAATCAAAGCAGTGATACCAAGTTACTAAATCTGGATGATATACTAGTTGTAGATTTTCTCAAGTAA
- Protein Sequence
- MLCSFITGVDTAKKTFFRNDRSIAAIIRWVTTLCSSYFLHQTLPYKYPLPAGAYYFMLHDIIRAAVSSFDMSFAVDCLNFALLYICTRVFHLALGISAQYYKAQLDAHLKLHGEKWASEDVRQCKLCKKQFVQPALYRMHIREHYRQQTKIVKQTKRGTKHKTMYKCTICLKIFQKPSQLMRHIRVHTGEKPFKCTICNRAFTQKSSLKIHTRQHKGIRPHTCNLCNAKFSQKGNLKAHILRVHNAPEGEPTYACSYCSCIFKKLGSLNGHIKRMHSISSEESAAKSSEANLDISSDADMRATVDSVISQLASLESVVNNTADSTKTTTLSTEQNDILQQALKNSGLPNKNEVSSKETAESKKSDSPTSYVTLLDRTAGGTSRKYLTIKQRCVGNVRWYACLFCPKEFKKPSDLIRHLRVHTQEKPFKCMYCVRSFALKSTMIAHERTHTGVKKYACDSCDKTFACHSSLTAHTRLHTKPHKCNICDKSFSASTILKNHIKSHMREKPKISPEAESLVPQVVLQEPLVISDVGNQISVAQVQSKQKHLYENVNGVARPHKCLVCHAAFRKVSHLKQHYRRHTGERPYKCSKCDRRFTSNSVLKSHLHTHDDARPYSCSLCDTKFSTQSSMKRHMVTHSNKRPYMCPYCQKTFKTYVNCRKHMKRHKHELAQRQLEEQKTQNQKELQPLSENKEPTSTLSKNLTLAENMIFQPQMAPDLTQDFSDQFQNINAEKEKSFSLSLSDNGTSSIGNQNLSIDTTNLGTSQTLHADETGTITLSHYTGDQTLTPESIREIEELFNIGVNLGLGANLPRQMDETNASPLDNSREQPVLNIIYEHNKTLESSGNTIFMSQFDSFDMNQISLQTDNDLDISLNPSNSTSMSNILPRSVGSNEQEERQAASVTTVSNIDASQITRNTHLVVINPKENCSELVRLSQVCPPKYSKVIAKADTTNESEIHGVNEHEKASQKSNAVSCDNGRNSLFMKNFQTTMQSETLVSPGMDSSVKISECNTLLQCHMCGNQGFTTERLKEHLKSHRGAKEFECSECSQRFCTNGGLSRHMKMHTNKQRWKCTTCQKIMGSKLQLKVHNKMHTETWNVLPMEPDSSQKESSLISSSSGLPAITTLDVQVVPNPSVSEKVLMAAVAERKSMDRFNENVEKKETREYTNKCKYCPKTFRKPSDLIRHIRTHTGERPYKCDHCNKSFAVKCTLDSHMKVHTGKKTFCCHVCSSMFATKGSLKVHMRLHTGSKPFKCPVCDLRFRTSGHKKVHMLKHAREHKGGTKRKPKHSKIAAVAEAAASLEKIGSTLDNASMTATTTTTALINNETVPSQNLDYSTLEQTVNLDTTVHLPNQLACNHADATTILNNNSILSVNENNELVANLQFLLANGLVTIQTDDTLLTQPATNDATSAGMPTNVIELINPDPFQEACNLGNTNHVVITSQVPTETTTADMSNATIIQMNDCMPPVPLVQMQTQDTSSVSMSEMRTSNQTTMTTSTTTSKASQPSRKECDVCGKTFMKPYQLERHKRIHTGERPYKCEQCGKSFAQRFTLHLHQKHHTGDRPYSCPHCKHLFTQKCNLQTHLKRVHQLVMLDVKKLKSGQQMLGALLQDNQSSDTKLLNLDDILVVDFLK
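The C2H2 pattern quoted in the Description, #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], can be approximated as a regular expression and scanned against the protein sequence. The sketch below is a rough illustration in Python, not the method used to produce the hmmscan results: it assumes the positions marked # may be any residue (the record does not say which residues are allowed there), and the names `C2H2_PATTERN` and `find_c2h2` are hypothetical. Because `re.finditer` returns non-overlapping matches, this simple scan can miss or misplace fingers that a profile HMM would find.

```python
import re
from typing import List, Tuple

# #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' (fold-stabilising positions) is treated as any residue.
C2H2_PATTERN = re.compile(
    r".{2}"      # '#' and X before the first cysteine
    r"C.{1,5}C"  # two zinc-coordinating cysteines, 1-5 residues apart
    r".{12}"     # X3, '#', X5, '#', X2 (3 + 1 + 5 + 1 + 2 residues)
    r"H.{3,6}"   # first histidine, then 3-6 residues
    r"[HC]"      # final histidine or cysteine
)


def find_c2h2(protein: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, segment) with 1-based inclusive coordinates."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]


# A fragment of the protein sequence above containing two adjacent fingers.
fragment = "YKCTICLKIFQKPSQLMRHIRVHTGEKPFKCTICNRAFTQKSSLKIHTRQH"

for start, end, segment in find_c2h2(fragment):
    print(start, end, segment)
# 1 23 YKCTICLKIFQKPSQLMRHIRVH
# 29 51 FKCTICNRAFTQKSSLKIHTRQH
```

Coordinates reported for the fragment are relative to the fragment, not to the full-length protein. Treating # as a wildcard keeps the expression faithful to what the Description states, at the cost of more false positives than a profile-based search.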
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_01523538
- 90% Identity
- iTF_01354768
- 80% Identity
- None