Fsel021571.1
Basic Information
- Insect
- Formica selysi
- Gene Symbol
- -
- Assembly
- GCA_009859135.1
- Location
- WHNR01000135.1:112351-122450[-]
Transcription Factor Domain
- TF Family
- zf-C2H2
- Domain
- zf-C2H2 domain
- PFAM
- PF00096
- TF Group
- Zinc-Coordinating Group
- Description
- The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and numbers in brackets indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger, and the final position can be either histidine or cysteine. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this core does not cover, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
- Hmmscan Out
-
#   of  c-Evalue  i-Evalue  score  bias  hmm-from  hmm-to  ali-from  ali-to  env-from  env-to  acc
1   32  6.2e-06   0.001     20.2   1.0   1         23      207       229     207       229     0.98
2   32  0.009     1.5       10.2   3.7   1         23      252       275     252       275     0.96
3   32  0.00061   0.1       13.9   0.5   2         23      282       304     281       304     0.96
4   32  0.0088    1.5       10.3   0.1   1         21      324       344     324       345     0.95
5   32  0.00029   0.048     14.9   0.2   2         23      444       465     443       465     0.94
6   32  0.00028   0.047     15.0   0.3   1         23      471       494     471       494     0.96
7   32  8.7e-05   0.014     16.6   0.1   1         23      564       587     564       587     0.96
8   32  0.008     1.3       10.4   1.1   1         23      593       616     593       616     0.95
9   32  0.061     10        7.6    0.8   1         23      629       651     629       651     0.97
10  32  3.5e-05   0.0058    17.8   0.9   3         23      696       717     695       717     0.96
11  32  1.5       2.5e+02   3.2    0.4   1         23      750       773     750       773     0.92
12  32  0.0048    0.79      11.1   0.8   2         23      780       803     779       803     0.94
13  32  0.89      1.5e+02   4.0    0.0   2         23      812       834     812       834     0.90
14  32  0.018     3         9.3    0.0   2         23      841       863     840       863     0.93
15  32  1.3e-05   0.0021    19.2   1.2   1         23      871       893     871       893     0.98
16  32  1.1       1.8e+02   3.7    1.2   2         23      940       963     939       963     0.95
17  32  0.0041    0.69      11.3   1.0   1         23      966       989     966       989     0.97
18  32  0.00068   0.11      13.8   0.1   2         23      996       1018    995       1018    0.95
19  32  0.00091   0.15      13.4   3.9   1         21      1046      1066    1046      1067    0.95
20  32  0.0027    0.44      11.9   1.8   2         23      1081      1102    1080      1103    0.93
21  32  0.36      59        5.2    1.1   1         23      1106      1129    1106      1129    0.94
22  32  0.026     4.3       8.8    2.1   2         23      1136      1159    1135      1159    0.90
23  32  3         5e+02     2.3    0.2   2         23      1167      1189    1167      1189    0.94
24  32  3.2       5.3e+02   2.2    0.0   2         23      1196      1218    1195      1218    0.88
25  32  0.22      37        5.8    2.7   2         23      1227      1249    1227      1249    0.96
26  32  1.1       1.9e+02   3.6    0.4   2         23      1255      1277    1254      1277    0.91
27  32  0.017     2.7       9.4    0.1   1         23      1339      1362    1339      1362    0.95
28  32  0.00015   0.026     15.8   0.9   1         21      1390      1410    1390      1411    0.96
29  32  0.32      53        5.4    1.9   2         23      1512      1534    1511      1534    0.94
30  32  0.91      1.5e+02   3.9    0.4   2         23      1541      1563    1540      1563    0.94
31  32  0.69      1.1e+02   4.3    2.5   1         23      1574      1596    1574      1596    0.97
32  32  0.17      28        6.2    0.1   1         19      1600      1618    1600      1621    0.91
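The pattern given in the Description can be written as a regular expression for a quick sanity check against the hmmscan coordinates. This is a minimal sketch, not the method used to build this record. It assumes the positions marked # match any residue, since the description does not restrict them, and `find_c2h2` is a hypothetical helper name. The test fragment is residues 201-235 of the protein sequence in this record.

```python
import re

# C2H2 pattern from the description:
#   #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: the fold-stabilising '#' positions are matched as any residue.
C2H2_PATTERN = re.compile(
    r"""
    .        # fold-stabilising position
    .        # X
    C        # first zinc-coordinating cysteine
    .{1,5}   # X(1-5)
    C        # second zinc-coordinating cysteine
    .{3}     # X3
    .        # fold-stabilising position
    .{5}     # X5
    .        # fold-stabilising position
    .{2}     # X2
    H        # first zinc-coordinating histidine
    .{3,6}   # X(3-6)
    [HC]     # final histidine or cysteine
    """,
    re.VERBOSE,
)


def find_c2h2(seq, offset=1):
    """Return (start, end, motif) tuples in 1-based inclusive coordinates.

    `offset` is the protein position of the first residue of `seq`.
    Matches are non-overlapping, so tandem fingers may be under-reported.
    """
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in C2H2_PATTERN.finditer(seq.upper())
    ]


# Residues 201-235 of the protein sequence in this record.
fragment = "SQQHEVYVCSLCNKAFSSKGHLSLHARIHVGEGDV"
for start, end, motif in find_c2h2(fragment, offset=201):
    print(start, end, motif)
```

On this fragment the single match spans residues 207-229, the same alignment coordinates reported for domain 1 in the hmmscan output. A plain regular expression is much cruder than the profile HMM: it gives no score or E-value, so it is only useful as a cross-check.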
Sequence Information
- Coding Sequence
- ATGCACATATTCGGCCAGGAAGGCAGAAATCGGCAGCTCGTTGACAAAATTCAGACATGCCTGCCGTTCAAGATAGAGGAAGATGATCGTTTGCCGAAAGTCCTGTGTTATCGATgcatgtataatttggaaaatttttacGACTTCAGAACCGCATGCGTGAACGCATCTGCTTGGTTGGAAAGAAATAGGCCCAAGGAAGGCGTGAACGATGACGGTGCAAATGATAGCGCGCAATGCAACGAGATGCACACGGAGCTTCTCAAGGGAAAGGAAAATATGCCAATACTTATCCCGGAGGCGCCCGTGGTCAATCCGAACGCCGCATTAGGTACACCACCGAGATTGAATTCAGACGGAGAGGCCGATCCCGAGATCGAAGAGATTCTCGATGCGAGTGAAGGTACGGACGAAGTGCTCGATGATTCGACGGAGGATCGACGGTCGGAATACGAATACGAGATGGACATGGAGACGAATCCTAGCGACTTTTTAGAAATGACACCGATGGTAACCGAAGAAAACGAGGAGGAATGCGGCGCAAACAATACGAGTGCCGCGACCGGTCAACAAGAAGCCACGACGGTCTTTCCACCTACTTCGCAGCAGCACGAGGTCTATGTATGTTCTCTATGCAACAAGGCATTTAGCTCCAAGGGTCATCTATCGTTGCACGCGAGAATTCACGTCGGTGAGGGTGATGtgatcggtgaaagggttatTACTGACGACCATACCTCGTATCAGCGACCATATCAATGCGATCTCTGTCATAAGTCGTATTCTACCGCGAAGCATCGCTGGGGACACGTTTCTACGACACATCGGGGACATCCTGCTGTAACATGTGCGTACTGCTCGCGCATATATTCGACGCGGTACAACCTCGACGAGCATATAAAATCGCGACACGCCGGTCTACCGCCACCCCCGGAATTATCAGTTTCCCTTTCCCGCGCGGAGACTCGTTATCAATGCCAAACGTGTCCGAtgatttataaagatttggcAGATTTCAATGCGCATCGTCAGATATGCATCGAGGAACAACGCACTGATCTATTGGGGCAAACCGACGCGCAGAACAATAAGAGTTTAGTCGATATGTCCGACATGTCGAGTATTATAGACTCTGACGACGAGAACAAAGACTTTAGGAGTGCCGAGGCTAAACTGGCCAAGAATCCGCAATTGACCATTTTGAAACAAGCGTTGACTAAAGGAGACAGTTTAAAAAGGATTTACGATGATGATGGTTCAACGTTGAGTTGCAAGGCGAGAAAAATGGTCAAGATAGAAGGTGAGACGAATCCTGAGATAAAAAGGTGGTATTGCGAATATTGTCCGCAAAGTTTTACATCGGTGGACAATCTAAAGTTGCACGAAGTCAGGCATGATGCCGAGAAGCCGTTTATCTGCGTACTATGCAAGAAGGATTTTGTTTTGAAATCTTCATTGATAAGGCACATTACGACAATACACGGCGTTGATCCTACTCCTATCATTGACAGCGACAAGTGTCTAAAGACATCTGTGATGCCTCAGAATTGGAATCGAATGGATGTTAGCGTTTACGAGCAGAACGATATAAAGGAACCACCAGAATTTTCGTCGTCACCCGagACAAATTTGGACAATGATGAGAAAGATTTGAAAAACAACCACGAGAATATAGAAATCGAAACAGTATTTATATGTGAGATTTGTACGAGGGACTTTAATGATAGAGCGTCATTATGGTTACACATGCGTGCAACGCATAAGGAACTTGCTGCATATGCCTGTGGAGTATGCTTAAAGATTTGTTCCGATAATACACAACTCCAGAGTCATCTCTACATGTATCACGGAAAATCCAAGCTTTTAATATCGGAACAAAGAAGGTACAGTTGCACAATATGTGGCAGGCAACACGATTCAAGAAAGAAGCTAATAGCTCATGTCTCGATACACAATATCGATTCTGGGTTTGATCCTGCAATTTTTGTACAGTTAAAC
AGTAATTATTACAACGAGAACTTAAACGGTACCGAAGGAAATGAACAAGTAATGGATTTTGATGGAGAAGACGGCGAGAAAGTCGATTGTTACATTTGTTACAAATCTTTTCCAAACGAGGATCATCTTATACGACATCAGAGAAATGCGCATAAGTCCGAACAAATAATTCCGTTAGGAGACGCCGCCGGAAGTGGAAACGCTCCGAATATCAACGGCAACGGTAATAGGGCACAGTAtcatttgttttttgtttgtGAAATTTGCGGTAGTTCTCATTCGAGTAAATGGGAACGCTGGTTGCATATAAACAATATGCATAGTAACGAATCTTCCATCAAGTGCGAATGGGAAAACTGTGGAAAAATATTCGCGACGAAATCACTGCGCAATGACCATCTCCAGCATCATTTGATCCAAGGCCCATCGCCGAATACGTGCGAGATATGCGGTAAATTATGGCCTACCCGTGTCGATTACTGGAAACACGTGATGGGTGTTCACGCGGATACGGTACCCCTGATCTGCGGCGTTTGTCTGAAAGTATTTTCCGACGTGGTGCAGTTAAGTGCCCACGTAAAGGCGAAACATTGGCCACTGACCAATGGTGATTTCAGCTGTGATATTTGCGGTAGGCCATATTCCAATAAATCCAAGATGTCCCGGCATAGAAAGATCCACGGTTTGGAAATGGCAgcggcgacggcggcggcggcggtggctgCGGATGTCGCATGTGATAATAGCAACCTCAATGACACAACCAACGAATCGGTGAAATTCGAGCACGGCAACAATAGGGCCGTagatttcaaattaaaatgcgAACAATGCCCCGAGCACAAGTTCACGACTCTGGACATTTTACGCAATCATCGTCGGGTAGTGCACAATCTCTTTCCGTGCGACCTATGCGTTAAGTACTATGGTAGAACATCGCACTTATGGAAACACGTGAACAGAGTACACAAGGGTCACGCGGACGTGACTTGTCCATACTGCGCGAAAACGAGCGCGTCGAGGGATCATCTCGCGGCGCACATCGCGAAGATTCACAGGTACGTGCCCACGATGGGTGGTAAGGATAGTCAGAACTGCGTTGTTTCCAAGTCCTTGAATATGGAAGATGGTGTCCTGCATTACTGCGAGAAGTGTAACAAGGGATTCCATAAGCGCTATCTACTCCGACGTCATATGAAGGGCTGTCAAAACTACCGTAAGGATCCTGGAGCACTATTGACCCGCTGCCGAGCCTGCGAGAGGATATTCAAGGATCGTGCGAGTCTACAAAAGCACATCGAGAATCACCACAGCACATATACCTGCCATTTGTGTAACGAGACCATTACGTCCAAACTGGGCATCATGACGCACAATCGCGTCAATCATATGGATCACCCGGATCTGACATGCGATCATCCGAGCTGTAAGAAGCTTTTCCGCACCAAGGAGGATCTCGAGTCTCATCGAAAGGATCACAAATATCACAGCAATCCGAATGTCTGCGATTTCTGCGGTGACACCGTGGAGAATAAACTAAAGTTAAAGATGCACGTGCTATCGTTACACCGGAACGAGATCGGTGTGTCCTGTGGCGTTTGTCTCATTCCTATGAAGGATCCGAAAGATTTGAAGAAACACGTCGAGGCGGAACACAGTAGCGTTCTTTCCAATCCGAATACATGTCAAGTATGTGGCAAGCAGTATGCATCCAAGTGGAAGGCTTTTGATCACACGAAAAAGTGTCATGGGAAAGTTTTTCTCACGTGCAAACAATGTTTAGCAGTTTTCACGGATGAGAACGCTATACGCGATCACTACGAACATGTACATAACGTTCCAAAGGATCAATTAGCCGTTTTCGAATATAGAATGGACATCGGTGCGAAGAGGGAAGATTACGAGACGCCTGATATTATCGTGAAAGAAGAACCGGATGATCTTGAATTTGACGAGGAGATGTGCGATGAAAGTTCGAGCGATTCCCGCAAACGTAGACG
ATCGCCGAACGATACGTACGATTGTGAAATGTGCCCCGAGATCTTTCTCAATTCGGATACACTCGCCAAGCATTATCAGAACGTTCACAACACCGATCCCGTCCGTATGtttaaaaagtttaagaaGAACAACGGCGACGGTAAGCGTAGAATGAGAAACAGAAACAATTACGAATGCAAGAATTGCAAGAAGCAGTTCTCCACTAAGACCTTATTCTGGAATCACATCAATGCGTGCACGCGACGAAACTCGGTATGCAGATTCGACGTCCCGAATAATATCTCGACATCGATTCTGGAGTCGCACTTGAAGAACAATAATCAGATTCAGCGAGAAGAACCGGTATCGCTAACGAACGAATCCAATTTGAACATTCCCGATTTTAATCTGTTCGAGGACATCAATTTACAATTGTCATCCCAGAAACCGGTGCCGAATCTCATGCCGTTGTCGCAGGTAAAATCGGCAGGTAATGGCAAATGCTCGCGCAAAGACTCGCGCAAGGTGTACGACGAATCGACCAATACTGAGTGCACGTGCGAGGTCTGCGGCAAACAGTGGCCCGCTAAGAAGCACTTGTGGCAACACTTGATTCGCTTCCATCGTGCCGAAGCCGCCGTTACCTGTGGTGTATGCTTGAAACTATGCAAATCCTATCAAGACCTAGCCGATCACCTGAAGGCGGAGCACGCCCCTGTTTTGTCGCCGGAGGGCAACAATTTCACATGCAAGACATGCGGCAGATATCACAATGCGAGAAGTAAACTGCTGCTACATATGAGCATCCATATCGGATACTTTCGGTGCGAGAAATGTCAGCAGGGTTTCGCGAGTGAAGAGAAACTCGGCGAGCACGCGACGAGCTGCAACGGCAAGTCGGAATTTGAGAATCACGCAGTAACGGCGGATATCGAAGATAACGCGAAAAACGACAATGATGAAAAGGGCAGTTTAATCGCCGACGAAACGTCAGTCATCGAAGAGGAAGTTGAAGAAGCGGATTTCGAATCGGAAGGCGAGGGTAGTAGAGGCATCCAAAATGAAGAAAACAATAGCGAAGAAGACAATTCGGATAGCGATGACTCGGATAGCGGTAGCAATAGCAGTTCGAGTGAGAACGAAGGCGAAGAGGAAgtagaagaggaagaggaagaggaggaggaggaggaggaagaggaagaggaagaggagaacGAAAACGAAAACGAAAACGAAAATGAGTCTGATACAAGGACTGAGCCGGACACGAGAACCTCGAGCAGGGCGAGCGGTGACAGTGAATCGTGTAATTCCGAAAGCGATGAATCGGATGTGGACGAAGCAGGAGTGCGCGCGATGCAGAAGAAAGCGCCGCGATTGAGCGATATCAATAGGTTCAGGATACACGGCGAAGAGAACTCGCCGGCGATGGAAAAGTACATCGAGGATCAAAGCACCGCTTCCGTTATCGCAGCGACTGGTGGAGACCGAATTAAACAGAGTACCTTGAATAATCTATTGATTCCCGGTGCATCTGCGAACGTAGACAAATTTAAGACATTGCGTCTCCAGGAATCCACTGCAGCTACGGTGAGTGATGTAGACTTCTCTAATGATAACGAAAATGATAATGAAGAAGATGACAACGacgaaaataatgaaaacgacgaacaaaaagagagaggtgATGGTGAAGATGAAGGCGAAGGCGAAGGCGAAGGCGAAGGCGAAGGCGAAGGCGAAGGCGAAGGTGAAGGTGAAGGTGAAGGTGAAGGTGAAGGTGAAGGTGAAGGTGAAGATGAAGTTGAAGATGAAGGTAAAGGTGAAGGTGAGGGTGAAGATGAAGATGAAGATGAAGATGAAGATGAAGATGAAGATGAAAGTGAAGGTGAGGGTACCAGCGAAGCCGAGGCTGAGAGTGGAGAGGCTGAGGGCGAAGACAACGCTGAAGGTGAGGGggaagaagaggaggaggaggaaaatGATGaggatgacgacgacgacggacCGCCTGTGTTAA
GTCCGATAATGCCTTTGCTACCTGAAAACGAATCCGAGGAGCACAGCAATACGATGGATCGTACAAGGCATAAGCTTAGTCCGATGGTATCGCTGAATATGGACAAATTAATGGAAGAATGTGAGATAACGGAGATCAAAACTGATACGGAGAACACGGTGGCACTGTCGAATGCCTCCAATTTCTTCGCGGCTAATAACAACGATTTATCCGTGACATGGGACGAGGATGAGGAACGCGACTGTAATTCCGATGTCGGAGATAGAGATGTGATGGTAATGAAGAACGAGGAGTTTGATAAGGAGTATACTAAGAGGAATATCAACGATTTGGAGGGCGACGACTATGAGGAGGATTCTGCGGACGAAAATGTGATGGATGATAGGGACGGTGGTGGTGGAGATCAAGTGCACGAGATACATAGTCTAGACGGGACAGTGTTAATGATGACTAATGATGCGGAAGGTAATCCGATTTTGATAGAGCATAACGTGTTAGATATCGATAACGAGGACTCCAACGCTGAGGTGGCGCAGTATATTTACCCAGATAATGCTTACGAGATCGAGGAAGAGGACGAGGAAGATTTTGCGACGCGAAACGAAACCGACGCCATGCAGGGTGAGATACAAGGTATGTCCTACATTCAGGATATGTCAGAGAACGACAATAGCACGGGAGATGATGTAGAAGGGGACAGTAATGACGCCCAGAAATAG
- Protein Sequence
- MHIFGQEGRNRQLVDKIQTCLPFKIEEDDRLPKVLCYRCMYNLENFYDFRTACVNASAWLERNRPKEGVNDDGANDSAQCNEMHTELLKGKENMPILIPEAPVVNPNAALGTPPRLNSDGEADPEIEEILDASEGTDEVLDDSTEDRRSEYEYEMDMETNPSDFLEMTPMVTEENEEECGANNTSAATGQQEATTVFPPTSQQHEVYVCSLCNKAFSSKGHLSLHARIHVGEGDVIGERVITDDHTSYQRPYQCDLCHKSYSTAKHRWGHVSTTHRGHPAVTCAYCSRIYSTRYNLDEHIKSRHAGLPPPPELSVSLSRAETRYQCQTCPMIYKDLADFNAHRQICIEEQRTDLLGQTDAQNNKSLVDMSDMSSIIDSDDENKDFRSAEAKLAKNPQLTILKQALTKGDSLKRIYDDDGSTLSCKARKMVKIEGETNPEIKRWYCEYCPQSFTSVDNLKLHEVRHDAEKPFICVLCKKDFVLKSSLIRHITTIHGVDPTPIIDSDKCLKTSVMPQNWNRMDVSVYEQNDIKEPPEFSSSPETNLDNDEKDLKNNHENIEIETVFICEICTRDFNDRASLWLHMRATHKELAAYACGVCLKICSDNTQLQSHLYMYHGKSKLLISEQRRYSCTICGRQHDSRKKLIAHVSIHNIDSGFDPAIFVQLNSNYYNENLNGTEGNEQVMDFDGEDGEKVDCYICYKSFPNEDHLIRHQRNAHKSEQIIPLGDAAGSGNAPNINGNGNRAQYHLFFVCEICGSSHSSKWERWLHINNMHSNESSIKCEWENCGKIFATKSLRNDHLQHHLIQGPSPNTCEICGKLWPTRVDYWKHVMGVHADTVPLICGVCLKVFSDVVQLSAHVKAKHWPLTNGDFSCDICGRPYSNKSKMSRHRKIHGLEMAAATAAAAVAADVACDNSNLNDTTNESVKFEHGNNRAVDFKLKCEQCPEHKFTTLDILRNHRRVVHNLFPCDLCVKYYGRTSHLWKHVNRVHKGHADVTCPYCAKTSASRDHLAAHIAKIHRYVPTMGGKDSQNCVVSKSLNMEDGVLHYCEKCNKGFHKRYLLRRHMKGCQNYRKDPGALLTRCRACERIFKDRASLQKHIENHHSTYTCHLCNETITSKLGIMTHNRVNHMDHPDLTCDHPSCKKLFRTKEDLESHRKDHKYHSNPNVCDFCGDTVENKLKLKMHVLSLHRNEIGVSCGVCLIPMKDPKDLKKHVEAEHSSVLSNPNTCQVCGKQYASKWKAFDHTKKCHGKVFLTCKQCLAVFTDENAIRDHYEHVHNVPKDQLAVFEYRMDIGAKREDYETPDIIVKEEPDDLEFDEEMCDESSSDSRKRRRSPNDTYDCEMCPEIFLNSDTLAKHYQNVHNTDPVRMFKKFKKNNGDGKRRMRNRNNYECKNCKKQFSTKTLFWNHINACTRRNSVCRFDVPNNISTSILESHLKNNNQIQREEPVSLTNESNLNIPDFNLFEDINLQLSSQKPVPNLMPLSQVKSAGNGKCSRKDSRKVYDESTNTECTCEVCGKQWPAKKHLWQHLIRFHRAEAAVTCGVCLKLCKSYQDLADHLKAEHAPVLSPEGNNFTCKTCGRYHNARSKLLLHMSIHIGYFRCEKCQQGFASEEKLGEHATSCNGKSEFENHAVTADIEDNAKNDNDEKGSLIADETSVIEEEVEEADFESEGEGSRGIQNEENNSEEDNSDSDDSDSGSNSSSSENEGEEEVEEEEEEEEEEEEEEEEEENENENENENESDTRTEPDTRTSSRASGDSESCNSESDESDVDEAGVRAMQKKAPRLSDINRFRIHGEENSPAMEKYIEDQSTASVIAATGGDRIKQSTLNNLLIPGASANVDKFKTLRLQESTAATVSDVDFSNDNENDNEEDDNDENNENDEQKERGDGEDEGEGEGEGEGEGEGEGEGEGEGEGEGEGEGEGEDEVEDEGKGEGEGEDEDEDEDEDEDEDESEGEGTSEAEAESGEAEGEDNAEGEGEEEEEEENDEDDDDDGPPV
LSPIMPLLPENESEEHSNTMDRTRHKLSPMVSLNMDKLMEECEITEIKTDTENTVALSNASNFFAANNNDLSVTWDEDEERDCNSDVGDRDVMVMKNEEFDKEYTKRNINDLEGDDYEEDSADENVMDDRDGGGGDQVHEIHSLDGTVLMMTNDAEGNPILIEHNVLDIDNEDSNAEVAQYIYPDNAYEIEEEDEEDFATRNETDAMQGEIQGMSYIQDMSENDNSTGDDVEGDSNDAQK
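The coding and protein sequences above are related by the genetic code, which can be spot-checked on short stretches. This is a minimal sketch that assumes the standard genetic code (NCBI translation table 1) and treats lowercase bases as ordinary bases, since the record does not say what the lowercase runs mark. `translate` is an illustrative helper, not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon.

    Codons with unexpected characters are rendered as 'X'.
    """
    cds = cds.upper()  # assumption: lowercase letters are ordinary bases
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 18 and last 15 bases of the coding sequence in this record.
print(translate("ATGCACATATTCGGCCAG"))
print(translate("GACGCCCAGAAATAG"))
```

The first call yields MHIFGQ and the second DAQK, matching the start and end of the protein sequence. The final TAG is a stop codon and is not emitted.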
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00869056;
- 90% Identity
- iTF_00729511;
- 80% Identity
- -