Basic Information

Gene Symbol
lilli
Assembly
GCA_004794745.1
Location
QBLH01002806.1:931344-1143157[-]
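
The Location field packs the scaffold accession, the coordinate range and the strand into one string. A minimal sketch of how it could be split into its parts is shown below. The function name `parse_location` and the returned keys are hypothetical, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re

# Assumed layout: <scaffold accession>:<start>-<end>[<strand>]
LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text):
    """Split a location string into scaffold, start, end, strand and span."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "seqid": match["seqid"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        # Assumption: coordinates are 1-based and inclusive.
        "span": end - start + 1,
    }


loc = parse_location("QBLH01002806.1:931344-1143157[-]")
print(loc)
```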

Transcription Factor Domain

TF Family
AF-4
Domain
AF-4 domain
PFAM
PF05110
TF Group
Unclassified Structure
Description
This family consists of the nuclear proteins AF4 (Proto-oncogene AF4) and FMR2 (Fragile X syndrome). These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental disabilities [1]. The family also contains Lilliputian, a Drosophila homologue of AF4 that contains an AT-hook domain. Lilliputian is a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila [2].
Hmmscan Out
 #  of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to   acc
 1  10       0.1   1.3e+03   -2.2   0.2         4      26        24      46        22      52  0.81
 2  10   9.2e-06      0.12   11.1   0.3        49     130        91     172        55     183  0.61
 3  10   6.9e-10   8.8e-06   24.8   1.8       347     444       278     371       261     386  0.78
 4  10       0.6   7.8e+03   -4.8  26.7       448     503       451     501       430     533  0.47
 5  10         1   1.3e+04  -10.0  18.7       136     250       521     638       497     652  0.33
 6  10     0.033   4.3e+02   -0.6  10.2       426     480       642     698       638     711  0.58
 7  10         1   1.3e+04   -7.5  12.4       426     488       685     745       683     754  0.54
 8  10         1   1.3e+04   -8.9  14.3        91     161       729     799       702     864  0.37
 9  10     0.035   4.5e+02   -0.7   9.8        90     274       961    1143       937    1148  0.52
10  10    0.0079     1e+02    1.5   4.4       126     225      1229    1331      1199    1373  0.49
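
The domain rows above are whitespace-delimited, so they can be read into typed records and filtered by independent E-value. The sketch below does this for the first three rows only. The `DomainHit` field names and the 0.01 cutoff are illustrative assumptions, not values taken from this record or from HMMER.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    # Hypothetical field names for the 13 columns of the table.
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_row(line):
    """Convert one whitespace-delimited domain row into a DomainHit."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 fields, got {len(f)}: {line!r}")
    return DomainHit(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]), float(f[5]),
        *(int(x) for x in f[6:12]),
        float(f[12]),
    )


# First three rows of the table, copied as-is.
ROWS = """
1 10 0.1 1.3e+03 -2.2 0.2 4 26 24 46 22 52 0.81
2 10 9.2e-06 0.12 11.1 0.3 49 130 91 172 55 183 0.61
3 10 6.9e-10 8.8e-06 24.8 1.8 347 444 278 371 261 386 0.78
"""

hits = [parse_row(line) for line in ROWS.strip().splitlines()]

I_EVALUE_CUTOFF = 0.01  # illustrative threshold, chosen for this example only
confident = [h for h in hits if h.i_evalue <= I_EVALUE_CUTOFF]

for h in confident:
    print(f"domain {h.index}: protein residues {h.ali_from}-{h.ali_to}, "
          f"score {h.score}, i-Evalue {h.i_evalue}")
```

With this cutoff, only domain 3 among the three sample rows is kept; a different threshold would change the selection.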

Sequence Information

Coding Sequence
ATGCCTTCGAACGGCGGTGGATATTATGACGACAGGAATCCTCTGCTCAAGGGCACCTTATCGAGCGTGGATCGGGACCGGCTTCGGGAACGAGAGCGACAGGCCCGCGCGGCGATGTCGGTCCAGGCGGAGCAGGCGGCTGCGGGAGGTGGTCCTGATACGAGAcaccatcaccaccaccaccataaCCACGCGCATCATCATCACCCCAACATACATGTCTCCACGCCGAACCTCTTCCGTGCCCCCGTCAGGGTGAATCCTGATCCACAAATCCAGTCCAAGCTGGGCAACTATTCGCTAGTGAAGCATCTACTCGACCAGCCTAAGCGCCTGATCGGCATCGATGGCATCCCGCCGAGTCCGGCGCCCTCGCCAACGCCCTCCGCCCACAAATCAGGCACGGATAGCTCCGGCAGGAGCTGTTCTCCGTCCTCGGGACAGGAGTTCAAGAAGCCCGGTGGGCTACGGGGCACGAGCGCCGCGTCTTCCTCGTCGAGCGCGAGCCATCAGAAACGAGGCGGCTTCGTCAAACCCGCTGACGGCAAGCCGCCTTACAACGGTCGGGGTGGCTACCCCGGCCAACCAGTCAAGCACGGCGGCGGCAGCAACGATCATCGCAACCACGGTTTGCTTCCAGCTAAgggcccgccgccgccgacccAGCCCTCCTCCGTGGGGAACAACGGCACGCTGCCGATCGGCAATTCCGGCGGAAGTGTCGGCAATctcggcggcagcggcggcagcCTCGCGAGCCGCACTCAATTCGCCGGCCGACTGAAGCTCATCGAAGTTAACgGGTCTAACTCGCGATCATTGATCGATACTCCAGATGTAGAAAACATCCTGAAGGAGATGGCTGTGCCACTCACGCCATTGACGGCGATCGCACAGACACCGCGGAAGGAGCACGAGTCCAAGTTCACCTTCAACCCCCATCTGGCGAAGtTGACGGAAGTCGCCCCGCCAGAGCCTGCGAAGCCTCAACGCCAGCACAACGCCAGCAGACTGTCTGCGGATCTCGCGCGGGATCTGAGTCTGTCGGATAGCGAGGATGAAGAGAATAAGGAAATATTGTCGAGGCCGACGAGAGCAAACAGGAGTCCGGATCTCACGGTCGTTCTATCGACGCCGTTGATGACGTCGGCGCCGCCGCCTCTGACACCCATGTCGCCCATGGTCATGTCACCCGTGGGCCCGTTGTCACCCCCGCGGCCGATCAGCCCGCCCAGGTACCCGTCACCGCCCCCGAAGCGGATGACGCCGAAGCGAGAGCTGTCTCCTCCTCCACCTGTTCTCAATCATCCCCTGCCATCCGCGTGCTCCCCGACGAATCCGGTCGTAGTCGGCCAGCCGTCGAGTCCAGCCGAGGCGCCGCAGAGCTCCGGAAGCGCCAGCTCGAGCTCGGACTCCGGCTCGGATTCCGGCTCGGATAGCAGCGACGACtccgaggacgaggacgaccTGGCGTCGGCGCCTCCGCCGTCCAAGGGGCCCACGACGCCGCCCTCGATCTCGCCGAAGCAGGAGAATTTGGTCGAGGAACCGCCGCCCGCTGTGGAGGAACGTCGCTGGGATCTCAGTTCTTTCTTCAACAAGACGGCGGTACAGCATGGCGAACAGAACTCCGAGTCGAAGCCGGCTCAGGATAACGCCAGGCGGGAGAATACGCTCGAGATAAAGACGGAGACCAGCAGGTCGCACAGGGAACAGCCCCGCGATTGGCAGCTCGACGAGGCCTTGAAGAGGACTCACAACATGAGTTCCCTCAGCGACAGCGATCATCACTCCGATCAGGAGAAGATTCATAAGCTGGTCGAAGATAATCGCGCCCAGACGGAAAAGCCGAAGGTGGCCGATACTAGAAAACGACCCGGGCGACCCAGGAAATCCATCAAGAGCCCGAAACGCAGTCATCGGACATCGGACGAGAGTCTGAAGAACGGCAAGCCGCGCAGCCGGACGAGAACGGTTGCCACTTCCAGaaaacggtcgcc
actcgtCAAGAAGAAGCAACCGAAATCGAAGGAAATGGTGCCCACgagcgacgacgacaacgactcGAGATCGCACAGTGATTCCGACAGCGATCGTCGACCTACCAAAGTGTCGGCCGTGGCACCGACGAGAAAAGAAATCGAAAAGAGATCGAGACTGAGCGTGTCATCCAGCGACGATGAGAGTTCGCCACCGAACAGGAAAAATAATCACAGTGCCTCCGAGGACGACGCCGCGCGATGGACAAGAGTTCCCCCCATCAAGCGGACCAATCTGTTGGACTCGCCGAAGAAGCAAGACCAGAAGAAAAATTCCGCCAAGGGCAAGCCCAGGCAGCCGAGGTCGAGAGTGGCCAACGTTGTCGGCGGTTCGGACTCCGACAGCGAAACGGAGGTGTCTGTAAGAACTAATCGCATCAAAGTCGCTAGagTGCCACCCAGGCCGCGAGCACCACCAACGAGGACGACCTCGCCTGATAACTCTGATAGCGATAACAGCCCGGCGTCGAAACTGCAAGAGGACGACGCCGGCAACGTACAGGACAAGAAGAAAAGCGACACGCTGCGCCACGTCTTCTCTACGTCGAAGGGCGGCGGGAAGGGCGGTGGCAAAGGTGGAAAAGGTGGAAAAGGTGGTGGTAAATGCGGTATCTACGTGGAAGAGTACACGAGTAATTCTGCTACGCACACGCCGACCGGCGGGGACAGCCCATACAAAAGACCGTCCTCGCGGACGTCCAGTGGTGGAAACAATATTCTTCTGCGCTCGCCTCTAGCACTCACACACGTGAATGGCGTACCGAGTCTCGTGTGCAAGATCGATCTCGCCAGGATATCCTCGCAATTACTGCAACTATCGAGAGGGCAAGAGTTACGACAGCGTACGGAATTGCCCGACACCAGGCCGTCTTCGAGGCAGAGACCGTCCTCTAGCTTGGCGACCTCGCAACCACCGAGGCCGTCCACGCCAGAGGAGGGTGAGATCATAGACACGCCACCCCCGCAGCAGCAGGTCGTGTCAGATCGCGCGAGGATTCACCGTTCTGATGGACTGTTGGGCGAGGGTGACGGTAGGAGTTCACGTTCTGTGATCAAAGGCCAGCCGATATCGTCGGACTCGAAGAGCAGCGGTGCTATTCTCGGAGGTGTTGGAAGTGCTAGTGGTACCGGTGCGATCGGTAATGCACTCAAGAGGAAACGTAATCCGAGTTGCAGTTCCGTGTCCAGTTTGAGCCCTGTTCAGTGTTCGATGGACGCGAAAACGAAGAGTTCGTCCGAGCATAAGGACAGGAGCCGCAAAAGACAACGGAGGCATGTCACAAACGACGGATTAATGTCCAGTCAGCAAAGTGATATTCAACCGACAAATCACGAGAGGGACGAGAAACAAGATACAAGTTTATTACCACCACCACCTCTTCCAGCTCAGCGCGTCTACTATTCTTACTTCGATCCTCagaatgaaatattagaagaTCAGGATAGggACCACGACCAGTACCTGACCGAAGCTAAGCGACTAAAGCACAATGCCGACGAGGAGAATGATCTTACGGCACAGGGCATGATGTATCTGGAGGCCGCTCTGTACTTCCTTCTAACGGGCGACGCGATGGAATCAGACCAAGTTACGGAAAAAGCATCGTATACTATGTACAAAGATACTCTTAGTCTCATCAAATACATCTCGTCGAAATTCAAGAGCCAATCCAACAACTCACCCGAGAACAGTATACACACTAAGCTGGCCATTCTGAGCCTTTGGTGCCAGTCACGCTTGTACTCCAAACTTTATAATATGCGCAAACAAGAGATGAAAGAGGTTCAGAAGATCGTCAACGATTTCAATCAAAagcaatCACAGCAATCAGCAGCTCAGACAACGCCTGCTCAGGGGGAGGGACAAGGTACGCCTTCTCTTTCGCCTACACCGTCGCCCGCCGGTTCTGTAGGTTCCGTCGGTAGTCAAAGTTCCTCCGGATACAGCAGCGGTGGGCAAC
ATCCGGCAACATCGGTCAATGGCCAATACATTAGCGTGCCTGTGCACGTCTACAATGCAATGATAAAGCAAAATCAGTATTCAGGTTTACTTATGAATGGCCATGACCTCTGGGAACAGGCAATAAAGCAGGCGAAACAGGAGGAGAATAGAAGCTTTTTCATCGACTTGGATCGAAGATTGGGACCCCTGACATCGTATAGCTCGCTACGCGAGCTCGTGCGTTACGTTCAAGCGGGTATAAAGAAGTTACGAGCTCTCTGA
Protein Sequence
MPSNGGGYYDDRNPLLKGTLSSVDRDRLRERERQARAAMSVQAEQAAAGGGPDTRHHHHHHHNHAHHHHPNIHVSTPNLFRAPVRVNPDPQIQSKLGNYSLVKHLLDQPKRLIGIDGIPPSPAPSPTPSAHKSGTDSSGRSCSPSSGQEFKKPGGLRGTSAASSSSSASHQKRGGFVKPADGKPPYNGRGGYPGQPVKHGGGSNDHRNHGLLPAKGPPPPTQPSSVGNNGTLPIGNSGGSVGNLGGSGGSLASRTQFAGRLKLIEVNGSNSRSLIDTPDVENILKEMAVPLTPLTAIAQTPRKEHESKFTFNPHLAKLTEVAPPEPAKPQRQHNASRLSADLARDLSLSDSEDEENKEILSRPTRANRSPDLTVVLSTPLMTSAPPPLTPMSPMVMSPVGPLSPPRPISPPRYPSPPPKRMTPKRELSPPPPVLNHPLPSACSPTNPVVVGQPSSPAEAPQSSGSASSSSDSGSDSGSDSSDDSEDEDDLASAPPPSKGPTTPPSISPKQENLVEEPPPAVEERRWDLSSFFNKTAVQHGEQNSESKPAQDNARRENTLEIKTETSRSHREQPRDWQLDEALKRTHNMSSLSDSDHHSDQEKIHKLVEDNRAQTEKPKVADTRKRPGRPRKSIKSPKRSHRTSDESLKNGKPRSRTRTVATSRKRSPLVKKKQPKSKEMVPTSDDDNDSRSHSDSDSDRRPTKVSAVAPTRKEIEKRSRLSVSSSDDESSPPNRKNNHSASEDDAARWTRVPPIKRTNLLDSPKKQDQKKNSAKGKPRQPRSRVANVVGGSDSDSETEVSVRTNRIKVARVPPRPRAPPTRTTSPDNSDSDNSPASKLQEDDAGNVQDKKKSDTLRHVFSTSKGGGKGGGKGGKGGKGGGKCGIYVEEYTSNSATHTPTGGDSPYKRPSSRTSSGGNNILLRSPLALTHVNGVPSLVCKIDLARISSQLLQLSRGQELRQRTELPDTRPSSRQRPSSSLATSQPPRPSTPEEGEIIDTPPPQQQVVSDRARIHRSDGLLGEGDGRSSRSVIKGQPISSDSKSSGAILGGVGSASGTGAIGNALKRKRNPSCSSVSSLSPVQCSMDAKTKSSSEHKDRSRKRQRRHVTNDGLMSSQQSDIQPTNHERDEKQDTSLLPPPPLPAQRVYYSYFDPQNEILEDQDRDHDQYLTEAKRLKHNADEENDLTAQGMMYLEAALYFLLTGDAMESDQVTEKASYTMYKDTLSLIKYISSKFKSQSNNSPENSIHTKLAILSLWCQSRLYSKLYNMRKQEMKEVQKIVNDFNQKQSQQSAAQTTPAQGEGQGTPSLSPTPSPAGSVGSVGSQSSSGYSSGGQHPATSVNGQYISVPVHVYNAMIKQNQYSGLLMNGHDLWEQAIKQAKQEENRSFFIDLDRRLGPLTSYSSLRELVRYVQAGIKKLRAL
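
One way to check that the coding sequence and the protein sequence agree is to translate the former and compare. The sketch below assumes the standard genetic code (the record does not say which table was used) and, to stay short, translates only the first 54 and last 27 nucleotides of the coding sequence above. The helper name `translate` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame nucleotide string; '*' marks a stop codon."""
    cds = cds.upper()  # lower-case (soft-masked) bases are treated as normal
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Fragments copied from the start and end of the coding sequence.
cds_start = "ATGCCTTCGAACGGCGGTGGATATTATGACGACAGGAATCCTCTGCTCAAGGGC"
cds_end = "GGTATAAAGAAGTTACGAGCTCTCTGA"

print(translate(cds_start))
print(translate(cds_end))
```

The two translated fragments can then be compared with the beginning and end of the protein sequence; the full coding sequence can be checked the same way once its three lines are joined.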

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2