Basic Information

Insect
Euclidia mi
Gene Symbol
-
Assembly
GCA_944738845.2
Location
CALYKX020000092.1:1835116-1839750[+]
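The Location value has the form `sequence:start-end[strand]`. A minimal sketch of splitting it into fields follows. The names `Location`, `parse_location` and `span_length` are hypothetical, and the record does not state its coordinate convention, so the 1-based inclusive reading used for the span length is an assumption.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings such as "CALYKX020000092.1:1835116-1839750[+]".
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a 'seqid:start-end[strand]' string into typed fields."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["seqid"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(loc: Location) -> int:
    """Length of the genomic span, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("CALYKX020000092.1:1835116-1839750[+]")
print(loc.seqid, loc.start, loc.end, loc.strand, span_length(loc))
```

If the coordinates were instead 0-based and half-open, the span length would be `end - start` without the `+ 1`.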

Transcription Factor Domain

TF Family
GCFC
Domain
GCFC domain
PFAM
PF07842
TF Group
Unclassified Structure
Description
This entry describes a domain found in a number of GC-rich sequence DNA-binding factor proteins and homologues [4, 5], as well as in a number of other proteins including Tuftelin-interacting protein 11 [1]. While the function of the domain is unknown, some of the proteins it is found in are reported to be involved in pre-mRNA splicing [1, 2]. This domain is also found in Sip1, a septin interacting protein [3].
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 23 0.08 1.4e+03 0.9 0.0 141 206 27 91 22 98 0.74
2 23 0.085 1.5e+03 0.8 0.0 141 206 86 150 83 156 0.73
3 23 0.093 1.6e+03 0.7 0.0 142 206 146 209 143 215 0.72
4 23 0.076 1.3e+03 0.9 0.0 141 206 204 268 199 283 0.76
5 23 0.071 1.2e+03 1.0 0.0 141 206 263 327 260 357 0.77
6 23 0.083 1.4e+03 0.8 0.0 141 206 322 386 319 393 0.73
7 23 0.073 1.3e+03 1.0 0.0 141 206 381 445 378 471 0.77
8 23 0.09 1.6e+03 0.7 0.0 141 206 440 504 437 508 0.73
9 23 0.072 1.2e+03 1.0 0.0 141 206 499 563 496 574 0.74
10 23 0.093 1.6e+03 0.6 0.0 142 206 559 622 556 628 0.72
11 23 0.0096 1.6e+02 3.9 0.0 141 206 617 681 612 685 0.83
12 23 0.08 1.4e+03 0.9 0.0 141 206 735 799 727 805 0.75
13 23 0.075 1.3e+03 0.9 0.0 141 206 794 858 789 873 0.76
14 23 0.0088 1.5e+02 4.0 0.0 141 206 853 917 849 925 0.83
15 23 0.076 1.3e+03 0.9 0.0 141 206 971 1035 955 1042 0.77
16 23 0.073 1.3e+03 1.0 0.0 141 206 1030 1094 1026 1119 0.77
17 23 0.088 1.5e+03 0.7 0.0 141 206 1089 1153 1084 1157 0.73
18 23 0.069 1.2e+03 1.1 0.0 141 206 1148 1212 1140 1240 0.78
19 23 0.076 1.3e+03 0.9 0.0 141 206 1207 1271 1204 1288 0.75
20 23 0.15 2.7e+03 -0.1 0.0 141 206 1266 1330 1259 1339 0.75
21 23 0.083 1.4e+03 0.8 0.0 141 206 1325 1389 1322 1396 0.73
22 23 0.071 1.2e+03 1.0 0.0 141 206 1384 1448 1380 1475 0.77
23 23 0.081 1.4e+03 0.8 0.0 141 206 1443 1507 1439 1515 0.74
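Each row of the table above holds 13 whitespace-separated fields: the domain index, the domain count, the conditional and independent E-values, score, bias, the HMM, alignment and envelope coordinate pairs, and the accuracy. A minimal sketch of reading such rows into typed records follows. The class `DomainHit`, the function `parse_hit` and the cutoff `MAX_I_EVALUE` are hypothetical and are not part of the record or of HMMER itself. The three sample rows are copied from the table.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-delimited row of the per-domain table."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}: {line!r}")
    return DomainHit(
        int(fields[0]),
        int(fields[1]),
        float(fields[2]),
        float(fields[3]),
        float(fields[4]),
        float(fields[5]),
        *(int(x) for x in fields[6:12]),  # hmm, ali and env coordinate pairs
        float(fields[12]),
    )


# Rows 11, 14 and 20, copied from the table.
rows = [
    "11 23 0.0096 1.6e+02 3.9 0.0 141 206 617 681 612 685 0.83",
    "14 23 0.0088 1.5e+02 4.0 0.0 141 206 853 917 849 925 0.83",
    "20 23 0.15 2.7e+03 -0.1 0.0 141 206 1266 1330 1259 1339 0.75",
]
hits = [parse_hit(row) for row in rows]

# Hypothetical cutoff, chosen only for illustration.
MAX_I_EVALUE = 0.01
passing = [h for h in hits if h.i_evalue <= MAX_I_EVALUE]
best = min(hits, key=lambda h: h.i_evalue)

print(f"{len(passing)} of {len(hits)} sample rows pass i-Evalue <= {MAX_I_EVALUE}")
print(f"lowest i-Evalue in sample: row {best.index} ({best.i_evalue:g})")
```

Filtering on the independent E-value rather than the conditional one is a design choice in this sketch. Either column can be substituted in the list comprehension.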

Sequence Information

Coding Sequence
ATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGTTGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCCTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCCCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGC
CTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAACAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCCCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAACAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGCCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGC
GCATCAGCGGCGTGTCACCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTCACATACATGTTACAGTAGTATATCTCACCTTGGACCTGGCCCTCATGGCGATATGCAGCGCGGTGTCCCCGCGGTTGTCGGCCACGTTGACGCGCGCCTTGCGCTCCAGCAGCAGCTGCACCATCTCCGCGTGGCGCGACCGCACCGCGCGCATCAGCGGCGTGTCGCCCGTCTGTACACATACATTCAATACATACAGACATAGACTCACGCCCACATATTGCTTCACCCACTTTTCGCCAGATTTATTTTAG
Protein Sequence
MAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISVVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYPTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSNSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYPTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSNSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSPTLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCHIHVTVVYLTLDLALMAICSAVSPRLSATLTRALRSSSSCTISAWRDRTARISGVSPVCTHTFNTYRHRLTPTYCFTHFSPDLF
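The Protein Sequence corresponds to a codon-by-codon translation of the Coding Sequence. The sketch below shows how that relationship could be checked, assuming the standard genetic code (NCBI translation table 1), which the record does not state explicitly. The helper `translate` is hypothetical. The two fragments are the first 30 and the last 12 nucleotides of the Coding Sequence above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate in frame 1, stopping at the first stop codon.

    Trailing bases that do not fill a whole codon are ignored; a codon
    containing a letter other than A, C, G or T is reported as 'X'.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon: end of the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


cds_start = "ATGGCGATATGCAGCGCGGTGTCCCCGCGG"  # first 30 nt of the CDS
cds_end = "GATTTATTTTAG"                      # last 12 nt, ending in a stop codon

print(translate(cds_start))
print(translate(cds_end))
```

Applied to the two fragments, the output can be compared against the first ten residues (`MAICSAVSPR`) and the last three residues (`DLF`) of the Protein Sequence.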

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-