Basic Information

Insect: Megalurothrips usitatus
Gene Symbol: -
Assembly: GCA_026979955.1
Location: CM049601.1:2874576-2881868[+]
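
The Location field packs the sequence accession, coordinates, and strand into one string. A minimal Python sketch for splitting it apart follows. The names `Location` and `parse_location` are hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """One genomic interval parsed from a 'seqid:start-end[strand]' string."""
    seqid: str
    start: int
    end: int
    strand: str

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Parse a location string such as 'CM049601.1:2874576-2881868[+]'."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    return Location(
        seqid=match["seqid"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


loc = parse_location("CM049601.1:2874576-2881868[+]")
print(loc.seqid, loc.start, loc.end, loc.strand, loc.length)
```

The length covers the whole gene span on the assembly, introns included, so it is expected to be larger than the coding sequence.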

Transcription Factor Domain

TF Family: zf-C2H2
Domain: zf-C2H2 domain
PFAM: PF00096
TF Group: Zinc-Coordinating Group
Description: The C2H2 zinc finger is the classical zinc finger domain. Two conserved cysteines and two conserved histidines coordinate a zinc ion. The zinc finger is described by the pattern #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid and numbers in brackets indicate the number of residues. The positions marked # are important for the stable fold of the zinc finger. The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. In DNA-binding zinc fingers, the amino-terminal part of the helix binds the major groove. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter [1].
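
The pattern in the description can be turned into a regular expression for a quick scan of a protein sequence. The sketch below is only an approximation: it treats both X and # as "any residue" (the description does not say which residues are allowed at the # positions), so it is far looser than the PF00096 profile HMM and will not reproduce the hmmscan results exactly. The names `C2H2_PATTERN` and `find_c2h2_candidates` are hypothetical.

```python
import re

# #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
# Assumption: '#' and 'X' are both treated as any residue, so the stretch
# X3-#-X5-#-X2 between the second Cys and the first His is exactly 12 residues.
C2H2_PATTERN = re.compile(r"..C.{1,5}C.{12}H.{3,6}[HC]")


def find_c2h2_candidates(protein: str) -> list[tuple[int, int, str]]:
    """Return (start, end, matched_text) with 1-based inclusive coordinates.

    Matches are non-overlapping and scanned left to right.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]


# A fragment taken from the Protein Sequence field of this record.
fragment = "HECDVCGASFSQLANMQKHRLLH"
for start, end, text in find_c2h2_candidates(fragment):
    print(start, end, text)
```

Because the expression only checks spacing between the Cys and His residues, any hit should be read as a candidate finger to confirm against the profile-based domain table.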
Hmmscan Out:

#	of	c-Evalue	i-Evalue	score	bias	hmm coord from	hmm coord to	ali coord from	ali coord to	env coord from	env coord to	acc
1	13	0.00049	0.037	13.9	0.0	4	23	42	62	42	63	0.94
2	13	0.0005	0.038	13.9	0.1	2	23	97	118	96	118	0.96
3	13	0.25	19	5.4	2.4	1	23	123	145	123	145	0.97
4	13	6.6e-06	0.0005	19.8	0.5	1	23	151	173	151	173	0.96
5	13	3.7e-06	0.00028	20.6	3.8	1	23	179	201	179	201	0.98
6	13	0.00017	0.013	15.3	0.1	1	23	207	230	207	230	0.97
7	13	1.1	82	3.4	0.1	1	14	236	249	236	250	0.91
8	13	1.4e-06	0.00011	21.9	0.9	3	23	255	275	254	275	0.99
9	13	0.21	16	5.6	0.2	1	17	282	298	282	300	0.93
10	13	0.00015	0.011	15.6	0.2	5	23	300	318	297	318	0.95
11	13	0.00041	0.031	14.2	1.6	3	23	326	346	324	346	0.98
12	13	3.2e-05	0.0025	17.6	0.3	3	23	354	374	352	374	0.98
13	13	0.016	1.2	9.2	0.1	1	20	384	403	384	406	0.95
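
The domain table above lists 13 candidate zinc finger hits of very different strength. A minimal Python sketch for reading such rows and keeping the better-supported ones follows. The names `DomainHit`, `parse_domain_rows`, and `significant_hits` are hypothetical, and the i-Evalue cutoff of 0.01 is an illustrative assumption rather than a threshold stated in this record.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    index: int       # '#' column
    i_evalue: float  # independent E-value
    score: float     # bit score
    ali_from: int    # alignment start on the protein
    ali_to: int      # alignment end on the protein


def parse_domain_rows(table_text: str) -> list[DomainHit]:
    """Parse whitespace- or tab-separated domain rows; skip header and blank lines."""
    hits = []
    for line in table_text.splitlines():
        fields = line.split()
        # A data row has exactly 13 columns and starts with the domain number.
        if len(fields) != 13 or not fields[0].isdigit():
            continue
        hits.append(
            DomainHit(
                index=int(fields[0]),
                i_evalue=float(fields[3]),
                score=float(fields[4]),
                ali_from=int(fields[8]),
                ali_to=int(fields[9]),
            )
        )
    return hits


def significant_hits(hits: list[DomainHit], max_i_evalue: float = 0.01) -> list[DomainHit]:
    """Keep hits whose i-Evalue is below the (assumed) cutoff."""
    return [hit for hit in hits if hit.i_evalue < max_i_evalue]


# Four rows copied from the table above.
sample = """\
4\t13\t6.6e-06\t0.0005\t19.8\t0.5\t1\t23\t151\t173\t151\t173\t0.96
5\t13\t3.7e-06\t0.00028\t20.6\t3.8\t1\t23\t179\t201\t179\t201\t0.98
7\t13\t1.1\t82\t3.4\t0.1\t1\t14\t236\t249\t236\t250\t0.91
8\t13\t1.4e-06\t0.00011\t21.9\t0.9\t3\t23\t255\t275\t254\t275\t0.99
"""
for hit in significant_hits(parse_domain_rows(sample)):
    print(hit.index, hit.i_evalue, hit.ali_from, hit.ali_to)
```

Weak rows such as domain 7 (i-Evalue 82, HMM coordinates 1 to 14) cover only part of the model, which is one reason to filter before counting fingers.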

Sequence Information

Coding Sequence: ATGCCCCCGCTCACCTACCTCAAGACCTTCAACGCCGAGGCCCGAGGCCCTGGTCTGGCGCTCGCCCGGCGAGAGGCCGCCATCCGGGCCCACTCCAACGAGGCCGAGGCCCTGGCTGGCGCCCCGGAGTGCCCCAAGGAGTTCGCGAGCGCGAATGCTCTTCAGTCCCACATCCGCGCCGTTCACCACGAAGTGCTCAAGGGCCTTTGGAGTCCCCGGGGAGACTCCGCCAgcgcgacggagcaggcggacggcgacggcgacggcgccaggctggcggacaagctggtgtgccccgagtgcggcatgcggtgcccgaacccggtggcgctgtccatgcactacggcacgcacaccaggcgcgagcaccggtgcggcgtctgccaggcggtgcactgcagcgcctacgccctcgggctgcacctccgcacgcacacgggcgacaagccgcacgagtgcgacgtgtgcggggcgagcttctcccagctggccaacatgcagaagcaccgcctgctgcactcgggcgccaaggcgttcgagtgcgcggtctgcaaggacaagttctcccgcaaggagcacctgacgcggcacatgcgccggcacacgggcgagaagccgtacaggtgcagcatctgcctcaaggcgttcgcccgccgcagccccataaaggcccacgtgctgatggtgcacaccgagaacaggccgttcaagtgcgaggcgtgcgagctgcggttcgccacctccgacgccaggccgcacggctgcaccgtgtgcggcaagcggttcaagaggctgaccgacctccgcctccacgtgcgcagccacacggcggacaggccgtcctaccagtgcgcggtgtgcgagggcaagttcgtctcggcgtactccctcaagtgccactgcccgaagaggttctcgcagaagggcaacctggacgcgcaccgcgccgtacactccggggataagcggttcggctgcacgctctgcggcaagaagtgcagcgacggcgtgaaactacgcgcgcacctgcggcagcacacgggggagaggccgttcggctgcacggagtgcggcaagaggtacatgtcggccacggggctccgcttgcacctgcgggtgcacacggatgggGAGGCGGGGCCGGGCAGCTTCAGGTGCTCCGAGTGTCGGGCGACGTTCGCCAGCGACGCGGGCTTGCTCGCGCACCTCTGGGATCACATTCGGGGCAGGCCGGAGCCCCCGTCGTAG
Protein Sequence: MPPLTYLKTFNAEARGPGLALARREAAIRAHSNEAEALAGAPECPKEFASANALQSHIRAVHHEVLKGLWSPRGDSASATEQADGDGDGARLADKLVCPECGMRCPNPVALSMHYGTHTRREHRCGVCQAVHCSAYALGLHLRTHTGDKPHECDVCGASFSQLANMQKHRLLHSGAKAFECAVCKDKFSRKEHLTRHMRRHTGEKPYRCSICLKAFARRSPIKAHVLMVHTENRPFKCEACELRFATSDARPHGCTVCGKRFKRLTDLRLHVRSHTADRPSYQCAVCEGKFVSAYSLKCHCPKRFSQKGNLDAHRAVHSGDKRFGCTLCGKKCSDGVKLRAHLRQHTGERPFGCTECGKRYMSATGLRLHLRVHTDGEAGPGSFRCSECRATFASDAGLLAHLWDHIRGRPEPPS
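
The coding sequence and protein sequence above should agree under translation. A minimal Python sketch of that check follows. It assumes the standard genetic code and treats the lowercase stretch of the coding sequence as ordinary bases; the name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon.

    Case is ignored; codons containing non-ACGT characters become 'X'.
    """
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First six codons of the Coding Sequence field in this record.
print(translate("ATGCCCCCGCTCACCTAC"))
```

To check the full record, pass the complete Coding Sequence string to `translate` and compare the result with the Protein Sequence string.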

Similar Transcription Factors

Clustering based on sequence similarity, computed with MMseqs2

100% Identity: -
90% Identity: -
80% Identity: -