Basic Information

Insect
Leuctra nigra
Gene Symbol
-
Assembly
GCA_934046545.1
Location
CAKOHC010002133.1:41844-45366[+]
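
The location string above packs a contig accession, a coordinate range, and a strand into one field. As a minimal sketch of how such a string could be split into its parts, the snippet below uses a regular expression. The names `Location`, `parse_location`, and `span_length` are hypothetical helpers, not part of any database API, and the length calculation assumes 1-based inclusive coordinates, which the record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Matches strings shaped like "contig:start-end[strand]".
_LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a 'contig:start-end[strand]' string into typed fields."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        contig=match["contig"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand=match["strand"],
    )


def span_length(loc: Location) -> int:
    """Genomic span in bases, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("CAKOHC010002133.1:41844-45366[+]")
print(loc.contig, loc.start, loc.end, loc.strand, span_length(loc))
```

Under the inclusive-coordinate assumption the span is 3523 bases; if the coordinates are 0-based half-open, it would be one base shorter.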

Transcription Factor Domain

TF Family
GCFC
Domain
GCFC domain
PFAM
PF07842
TF Group
Unclassified Structure
Description
This entry describes a domain found in a number of GC-rich sequence DNA-binding factor proteins and homologues [4, 5], as well as in a number of other proteins including Tuftelin-interacting protein 11 [1]. While the function of the domain is unknown, some of the proteins it is found in are reported to be involved in pre-mRNA splicing [1, 2]. This domain is also found in Sip1, a septin interacting protein [3].
Hmmscan Out
 #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to   acc
 1  15   0.00028       3.9    8.2   0.1       103     153        48      98        33     127  0.82
 2  15     0.061   8.6e+02    0.5   0.0       105     133        97     125        93     150  0.88
 3  15      0.11   1.5e+03   -0.3   0.0       103     152       189     238       174     242  0.87
 4  15   0.00019       2.6    8.8   0.1       103     152       236     285       214     288  0.86
 5  15   0.00093        13    6.5   0.1       105     152       285     332       282     335  0.91
 6  15    0.0007       9.7    6.9   0.0       105     152       332     379       327     382  0.90
 7  15    0.0003       4.1    8.1   0.0       105     152       379     426       374     429  0.90
 8  15   0.00019       2.7    8.7   0.1       105     153       426     474       421     484  0.88
 9  15      0.02   2.8e+02    2.1   0.0       105     154       473     522       468     550  0.86
10  15      0.11   1.5e+03   -0.3   0.0       103     152       612     661       592     664  0.86
11  15   0.00019       2.6    8.8   0.1       103     152       659     708       637     711  0.86
12  15    0.0007       9.7    6.9   0.0       105     152       708     755       703     758  0.90
13  15   0.00025       3.5    8.4   0.0       105     153       755     803       750     805  0.90
14  15   0.00042       5.9    7.6   0.0       105     152       802     849       800     852  0.91
15  15   0.00015       2.1    9.1   0.1       105     153       849     897       845     927  0.83
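
The per-domain rows above are whitespace-separated, with thirteen fields per row. As a rough sketch of how they might be loaded for filtering, the snippet below parses a few of the rows into typed records and keeps hits under an independent E-value cutoff. `DomainHit`, `parse_hit`, and the cutoff `MAX_I_EVALUE = 10.0` are illustrative assumptions, not thresholds used by this database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    index: int       # "#": rank of this domain hit
    total: int       # "of": number of domain hits reported
    c_evalue: float  # conditional E-value
    i_evalue: float  # independent E-value
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Parse one whitespace-separated domain row (13 fields expected)."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *map(int, f[6:12]),
        float(f[12]),
    )


# First three rows of the table, copied verbatim.
ROWS = [
    "1 15 0.00028 3.9 8.2 0.1 103 153 48 98 33 127 0.82",
    "2 15 0.061 8.6e+02 0.5 0.0 105 133 97 125 93 150 0.88",
    "3 15 0.11 1.5e+03 -0.3 0.0 103 152 189 238 174 242 0.87",
]

MAX_I_EVALUE = 10.0  # illustrative cutoff only (assumption)

hits = [parse_hit(row) for row in ROWS]
kept = [h for h in hits if h.i_evalue <= MAX_I_EVALUE]

for h in kept:
    # Alignment length on the query, treating coordinates as inclusive.
    print(h.index, h.i_evalue, h.ali_to - h.ali_from + 1)
```

Because the parser splits on any run of whitespace, it should accept the rows whether they are single-space separated or column-aligned.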

Sequence Information

Coding Sequence
ATGTCCCCGGCCGCTGCATGGTGGAGATGGGAAGCAACGACCCGGTGCGACGTTCGAATTCAACACGGAGTGTCCTGCACATCATACGTCACGGTTACAGTCGCTGAGCGCGAAACAAATTTAGGGCGGCTGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTATCCTAGAGTAACTGTACCTGAAGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAAGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTAGGGCGGCTGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTAGGGCGGCTGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAAATGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAACTGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTAGGGCGGCTGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTAGGGCGGCTGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACGGCTGCTCGGGAGTAAATGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCG
ACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCACCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAATGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGGCTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGGTACGTGCTCCATGATCATGCAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGACGACGTGCGACTCGCTTCAGCATGGCGCAACTGCTATCCTAGAGTAACTGTACCTGAGGTATTGAAGCAGGTCGCCGCAAGTATGGCGGCCGGAGCGTCACACATCCCAACAGAGTGA
Protein Sequence
MSPAAAWWRWEATTRCDVRIQHGVSCTSYVTVTVAERETNLGRLERHTSQQNDDVRLASAWRNGYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVTARYVLHDHAVGRLERHTSQQNDDVRLASAWRNGCSGVTVPEVLKQVAARYVLHDHAVGRLERHTSQQNDDVRLASAWRNGCSGVNVPEVLKQVTARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVTARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVTARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNGCSGVTVPEVLKQVTARYVLHDHAVWRPERHTSQQNDDVRLASAWRNGCSGVTVPEVLKQVAARYVLHDHAVGRLERHTSQQNDDVRLASAWRNGCSGVTVPEVLKQVAARYVLHDHAVGRLERHTSQQNDDVRLASAWRNGCSGVNVPEVLKQVTARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVTARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQNDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAARYVLHDHAVWRPERHTSQQSDDVRLASAWRNCYPRVTVPEVLKQVAASMAAGASHIPTE
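
The protein sequence corresponds to a codon-by-codon translation of the coding sequence. The sketch below shows that relationship on the first and last few codons of the record, assuming the standard genetic code (NCBI translation table 1); the record does not say which table was used. `translate` is a hypothetical helper written for illustration, and a library such as Biopython would normally be preferred for real work.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon.

    Whitespace and line breaks are removed first, so a sequence wrapped over
    several lines can be passed in directly. Codons containing characters
    other than T, C, A, G become 'X'; a trailing partial codon is ignored.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons of the coding sequence above.
print(translate("ATGTCCCCGGCCGCTGCATGGTGGAGATGG"))
# Last six codons, ending in the TGA stop.
print(translate("CACATCCCAACAGAGTGA"))
```

The two fragments translate to `MSPAAAWWRW` and `HIPTE`, which match the start and end of the protein sequence listed in the record.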

Similar Transcription Factors

Clustering by sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-