Aaes010078.3
Basic Information
- Insect
- Alsophila aescularia
- Gene Symbol
- NFIA
- Assembly
- GCA_946251895.1
- Location
- CAMIUG010000381.1:1116401-1135396[-]
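The Location field packs contig, coordinates, and strand into one string. A minimal sketch of splitting it apart follows. The names `Location`, `parse_location`, and `span_length` are hypothetical helpers, not part of any database API, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    contig: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "contig:start-end[strand]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into contig, start, end, and strand."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Location(
        match["contig"], int(match["start"]), int(match["end"]), match["strand"]
    )


def span_length(loc: Location) -> int:
    """Genomic span in bp, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


if __name__ == "__main__":
    loc = parse_location("CAMIUG010000381.1:1116401-1135396[-]")
    print(loc, span_length(loc))
```

Under that assumption the gene spans introns as well as exons, so the genomic span is expected to be much longer than the coding sequence listed under Sequence Information.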
Transcription Factor Domain
- TF Family
- CTF_NFI
- Domain
- CTF/NFI and MH1 domain
- PFAM
- PF00859
- TF Group
- Unclassified Structure
- Description
- Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) proteins [2, 1, 5], also known as TGGCA-binding proteins, are a family of vertebrate nuclear proteins that recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. The family was first described for its role in stimulating the initiation of adenovirus DNA replication [6]. Vertebrates have four members (NFIA, NFIB, NFIC, and NFIX), and an orthologue from Caenorhabditis elegans, called Nuclear factor I family protein (NFI-I), has been described [4]. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thereby regulate cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans [5]. A given species has a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino acids, almost perfectly conserved in all species and genes sequenced, mediate site-specific DNA recognition, protein dimerisation, and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors, and it interacts with basal transcription factors and with histone H3 [3].
- Hmmscan Out
-
  #  of  c-Evalue  i-Evalue  score  bias  hmm from  hmm to  ali from  ali to  env from  env to  acc
  1  6   0.0086    1e+02     3.5    0.0   4         30      214       240     211       247     0.88
  2  6   0.00046   5.5       7.7    7.6   145       207     293       362     262       401     0.70
  3  6   0.033     4e+02     1.6    0.1   187       207     408       428     397       438     0.79
  4  6   0.025     3e+02     2.0    0.0   180       205     435       459     429       468     0.83
  5  6   0.0095    1.1e+02   3.4    0.1   184       204     477       497     469       506     0.82
  6  6   4e-13     4.8e-09   37.4   3.6   179       271     506       595     500       599     0.85
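The six hmmscan domain rows above differ widely in significance. The sketch below parses the rows (copied from this record) and keeps those under an independent E-value cutoff. The cutoff of 1e-3 is an illustrative assumption, not a threshold stated by the database, and `DomainHit`, `parse_rows`, and `significant` are hypothetical names.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    index: int
    c_evalue: float
    i_evalue: float
    score: float
    ali_from: int
    ali_to: int
    acc: float


# Domain rows from this record, in the 13-column order:
# #, of, c-Evalue, i-Evalue, score, bias, hmm from/to, ali from/to, env from/to, acc
ROWS = """\
1 6 0.0086 1e+02 3.5 0.0 4 30 214 240 211 247 0.88
2 6 0.00046 5.5 7.7 7.6 145 207 293 362 262 401 0.70
3 6 0.033 4e+02 1.6 0.1 187 207 408 428 397 438 0.79
4 6 0.025 3e+02 2.0 0.0 180 205 435 459 429 468 0.83
5 6 0.0095 1.1e+02 3.4 0.1 184 204 477 497 469 506 0.82
6 6 4e-13 4.8e-09 37.4 3.6 179 271 506 595 500 599 0.85
"""


def parse_rows(text: str) -> List[DomainHit]:
    """Turn whitespace-separated domain rows into DomainHit records."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 13:  # skip blank or malformed lines
            continue
        hits.append(
            DomainHit(
                index=int(fields[0]),
                c_evalue=float(fields[2]),
                i_evalue=float(fields[3]),
                score=float(fields[4]),
                ali_from=int(fields[8]),
                ali_to=int(fields[9]),
                acc=float(fields[12]),
            )
        )
    return hits


def significant(hits: List[DomainHit], max_i_evalue: float = 1e-3) -> List[DomainHit]:
    """Keep hits whose i-Evalue is at or below the (assumed) cutoff."""
    return [h for h in hits if h.i_evalue <= max_i_evalue]


if __name__ == "__main__":
    for hit in significant(parse_rows(ROWS)):
        print(hit.index, hit.i_evalue, hit.ali_from, hit.ali_to)
```

With this cutoff only domain 6 (alignment positions 506 to 595, i-Evalue 4.8e-09) is retained; the other five rows have i-Evalues of 5.5 or more.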
Sequence Information
- Coding Sequence
- ATGCCATTTGTAAAGTCCTTTTCGTATACTTGGTTCAATTTACAAGCGGCCAAACGGAAATATTACAAGAAACACGAAAAACGGATGAGTCTCGAGGAGGAACGGCATACTAAATATGAATTACAGAACGAAAAAGCAGAAGTAAAACAGAAATGGGCATCGCGTCTGCTTGGGAAGCTGCGGAAAGACATCACGCAAGACTGCAGGGAAGACTTCGTGCTCAGCATCACAGGGAAGAGGCCGGCAGTATGCGTGTTGTCCAACCCGGACCAGAAGGGGAAAATGAGGAGAATAGACTGCCTGAGACAGGCCGATAAGGTGTGGCGATTAGACCTTGTAATGGTAATCCTATTCAAAGCTATACCGCTGGAAAGCACAGACGGGGAGCGACTGGAAAAGCACCCAGAATGCACGCAGCCAGGCCTCTGCGTGAACCCTTACCACATCAACGTCTCCGTGAGAGAGCTGGACCTGTACCTGGCCAACTTCATCAATAGCTACGGTAATGGGAGCGACGGTCTGCCAGATATTCTAAGCGGTTCCCTTTCGCCCCAGCCTAGGGACAAAGAGAACGAACACGACGCTAAGAGCAAAGGCTATAGTCACAATCCTTACAACGGCGTAATTTGCAACGATATTATTCTAGCAACTGGCGTCTTCTCTTCAAAGGAACTATGGAAGTTGTCCAAAGCGTCAATCCTCCAAGAGACGGGGTCCGAGTCTCCCAGTGGGTCCGGCGGGGGCGGCATCAAGCTGGAGTCGGGGTACTACTGCGGGTACAACAGCCCTGCGCCCCCTGCGTTCGAGCCCACTGCTGAGCGCGCGACCCCGTCTGCCATGCTCGTCGGGCAGTGCTTCAGCATACCTCACAGCTCCTCGGGCCTGAGCTCCCAGTCCCCCCTCGTACCATCAAACACCATATTCTACCAACACGCACCGCCCCCCACCGAAACGCACCCTACTGCATCAGAAGCGCTGTCGAGTCAGCATGCCAGCGCCAAATACGACACTGCGCCGCAGGACTCTCTGGGTGACTTCGTGACATTCGTGTGTCAGGAGCCGAGTGACGTCCAGCAGTTACAGGTTAGCTACATATCTAAATACGACAAAGCGCCGCAGGTCTCGCTGGGTGACTTCGTGACCTTATTGTGTCAGGGGCCGAGTGACGTCCAGCAGTTACAGGTTAGCTGCATATCTAAATATGACAAAGCGCCGCTGGTCTCGCTGGGTGACTTCGTGACCTTTGTGTGTCAAGAGCCGAGTGACGTCCAGCAGTTACAGGTTAGCTGCATATCCAAATACGACAAAGCACCGCAGGTCTCGCTGGGTCACTTCGTGACCTTCGTGTGTCAGGAGCCGAGTGCCGTCCAGCAGTTACAGGTTAGATGCATACCCAAATACGATAAAGCACCGCAGGTCTCGCTGGCGCCGCAGGACTTGCTAAGTGAATTCGTGACATTTGTGTGTCAGAAGCAGAGTGACATCCAGCAGTTACAGGTTAGCTGCAAATTCAAATACGACACAGCGCCGCAGGACTCACTGGGTGACTTCGTGACCTTCGTGTGTCAAGAGTCAAGAGACATCCAGCAGTTTCAGGTCCATAGCCTCTCCAGGAGTCCAAAGCCGTACTTCAGCAGCTCAATGCTGCCCCCTCCTCCCCTCCCACCGATGGCCAGGCCTGTAGCCATCATCAGATCAACTGCGAGCGAGCTGGTGGGAAACAATGCAGGAGGCGCGTCCCCCTCCTCCCCGGAGGTGCGGTCCCCCCCAGCGTCCCCCCGCCGCCGCTCCCCGCACCCCTACCGCACGGACTGCCACCCGCCGATATCGCACTTCAATCATTTCCATACGCAGCCTCAGCAAATGTTCACCTACGGGTCTCTTGGTGGGCTGGGGAGTGGTGCCAGTGGCGGAGGCGCGGTGCTGTCTGGGGCGGGGGGAGTGTCACCCCCCGGAGGGCTTGGACTTTATGCTCCGAGAACTGCTACTAGAGCACCG
CCGAGGTGGAACGCACCATTCCCTACCCTAGAAGAAGAGTTCAACATAATGGCGGCGCCGGCGGGGCCCGACCACGTCGTGCTGCTTGATGACGAGCGATTCTTCCAATCCAGTGTGATCACATCAGAGGCGACCGCCATGGACACGACCGGATCGGTCGGCCAGGTCGTCGACACGTCCGACGATATCGCACCTGACAAACCAGCACCCGATCTGCGGTCACCGTGA
- Protein Sequence
- MPFVKSFSYTWFNLQAAKRKYYKKHEKRMSLEEERHTKYELQNEKAEVKQKWASRLLGKLRKDITQDCREDFVLSITGKRPAVCVLSNPDQKGKMRRIDCLRQADKVWRLDLVMVILFKAIPLESTDGERLEKHPECTQPGLCVNPYHINVSVRELDLYLANFINSYGNGSDGLPDILSGSLSPQPRDKENEHDAKSKGYSHNPYNGVICNDIILATGVFSSKELWKLSKASILQETGSESPSGSGGGGIKLESGYYCGYNSPAPPAFEPTAERATPSAMLVGQCFSIPHSSSGLSSQSPLVPSNTIFYQHAPPPTETHPTASEALSSQHASAKYDTAPQDSLGDFVTFVCQEPSDVQQLQVSYISKYDKAPQVSLGDFVTLLCQGPSDVQQLQVSCISKYDKAPLVSLGDFVTFVCQEPSDVQQLQVSCISKYDKAPQVSLGHFVTFVCQEPSAVQQLQVRCIPKYDKAPQVSLAPQDLLSEFVTFVCQKQSDIQQLQVSCKFKYDTAPQDSLGDFVTFVCQESRDIQQFQVHSLSRSPKPYFSSSMLPPPPLPPMARPVAIIRSTASELVGNNAGGASPSSPEVRSPPASPRRRSPHPYRTDCHPPISHFNHFHTQPQQMFTYGSLGGLGSGASGGGAVLSGAGGVSPPGGLGLYAPRTATRAPPRWNAPFPTLEEEFNIMAAPAGPDHVVLLDDERFFQSSVITSEATAMDTTGSVGQVVDTSDDIAPDKPAPDLRSP
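The coding and protein sequences above can be cross-checked by translation. The sketch below assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly, and uses two short excerpts from the coding sequence (its first 18 and last 15 nucleotides) rather than the full sequence. `translate` is a hypothetical helper.

```python
BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG codon order (assumed table 1).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, dropping one trailing stop codon if present."""
    # Remove whitespace so a sequence wrapped across several lines still works.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    # Codons containing ambiguous bases are reported as "X".
    protein = "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )
    return protein[:-1] if protein.endswith("*") else protein


if __name__ == "__main__":
    print(translate("ATGCCATTTGTAAAGTCC"))  # first 18 nt of the coding sequence
    print(translate("CTGCGGTCACCGTGA"))     # last 15 nt, ending in the stop codon
```

The two excerpts translate to MPFVKS and LRSP, matching the start and end of the listed protein sequence. Passing the full coding sequence to `translate` and comparing the result with the protein field would extend the check to the whole record.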
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- iTF_00055447;
- 90% Identity
- iTF_00055447;
- 80% Identity
- iTF_00055447;