Basic Information

Gene Symbol
-
Assembly
GCA_963082685.1
Location
OY720123.1:399525-408451[-]
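
The location string packs contig, coordinates, and strand into one field. A minimal sketch of splitting it apart follows; the function name `parse_location` is hypothetical, and the span calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re

# Pattern for strings shaped like "contig:start-end[strand]".
LOCATION_RE = re.compile(
    r"^(?P<contig>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)

def parse_location(text):
    """Split a location string into contig, start, end, strand and span."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "contig": match["contig"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        # Assumption: coordinates are 1-based and inclusive.
        "span": end - start + 1,
    }

loc = parse_location("OY720123.1:399525-408451[-]")
print(loc)
```

Under that assumption the genomic span is longer than the coding sequence listed further down, as would be expected if the locus contains introns.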

Transcription Factor Domain

TF Family
HTH
Domain
HTH_psq domain
PFAM
PF05225
TF Group
Helix-turn-helix
Description
This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster [1]. In pipsqueak, this domain binds to the GAGA sequence [1].
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 21 4.9e-09 1.8e-06 28.0 0.0 4 39 143 178 140 183 0.89
2 21 9.8e-10 3.6e-07 30.2 0.0 4 39 197 232 196 236 0.93
3 21 1.9e-08 6.9e-06 26.1 0.2 2 30 250 278 249 279 0.93
4 21 8.1e-08 3e-05 24.1 0.3 6 30 358 382 356 383 0.93
5 21 8.1e-08 3e-05 24.1 0.3 6 30 462 486 460 487 0.93
6 21 8.1e-08 3e-05 24.1 0.3 6 30 566 590 564 591 0.93
7 21 8.1e-08 3e-05 24.1 0.3 6 30 670 694 668 695 0.93
8 21 8.1e-08 3e-05 24.1 0.3 6 30 774 798 772 799 0.93
9 21 8.1e-08 3e-05 24.1 0.3 6 30 878 902 876 903 0.93
10 21 8.1e-08 3e-05 24.1 0.3 6 30 982 1006 980 1007 0.93
11 21 8.1e-08 3e-05 24.1 0.3 6 30 1086 1110 1084 1111 0.93
12 21 8.1e-08 3e-05 24.1 0.3 6 30 1190 1214 1188 1215 0.93
13 21 8.1e-08 3e-05 24.1 0.3 6 30 1294 1318 1292 1319 0.93
14 21 8.1e-08 3e-05 24.1 0.3 6 30 1398 1422 1396 1423 0.93
15 21 8.1e-08 3e-05 24.1 0.3 6 30 1502 1526 1500 1527 0.93
16 21 8.1e-08 3e-05 24.1 0.3 6 30 1606 1630 1604 1631 0.93
17 21 8.1e-08 3e-05 24.1 0.3 6 30 1710 1734 1708 1735 0.93
18 21 8.1e-08 3e-05 24.1 0.3 6 30 1814 1838 1812 1839 0.93
19 21 8.1e-08 3e-05 24.1 0.3 6 30 1918 1942 1916 1943 0.93
20 21 2.5e-14 9.1e-12 44.9 0.2 6 45 2022 2062 2020 2062 0.96
21 21 2.3e-15 8.7e-13 48.2 0.0 2 42 2070 2111 2069 2112 0.95
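
Each row above has 13 whitespace-separated fields, and hits 4 through 19 share the same scores with alignment starts offset by a constant. A minimal sketch of parsing a few rows and measuring that offset follows; the `Hit` tuple, its field names, and `parse_hit` are hypothetical, chosen here to mirror the column header.

```python
from collections import namedtuple

# Field names are illustrative labels for the 13 columns of the table.
Hit = namedtuple(
    "Hit",
    "index total c_evalue i_evalue score bias "
    "hmm_from hmm_to ali_from ali_to env_from env_to acc",
)

def parse_hit(line):
    """Convert one whitespace-separated domain row into a typed Hit."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    return Hit(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]),
        float(fields[4]), float(fields[5]),
        *(int(value) for value in fields[6:12]),
        float(fields[12]),
    )

# Rows 3 to 6, copied from the table.
rows = """\
3 21 1.9e-08 6.9e-06 26.1 0.2 2 30 250 278 249 279 0.93
4 21 8.1e-08 3e-05 24.1 0.3 6 30 358 382 356 383 0.93
5 21 8.1e-08 3e-05 24.1 0.3 6 30 462 486 460 487 0.93
6 21 8.1e-08 3e-05 24.1 0.3 6 30 566 590 564 591 0.93
"""

hits = [parse_hit(line) for line in rows.splitlines() if line.strip()]

# Distance between consecutive alignment starts, in residues.
spacing = [b.ali_from - a.ali_from for a, b in zip(hits, hits[1:])]
print(spacing)
```

A constant spacing between identical-scoring hits is consistent with a tandemly repeated unit in the protein, although the record itself does not comment on this.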

Sequence Information

Coding Sequence
ATGCCGGGGCACGGCTCCGCGCCGGTAAAGAGCGAGCCGCCTCCTCCAGACGACGACGAGTCTCTGCATCTCTCCACGATATTCTTCACTAAGGTGTCAGGCAGCCTCGAAGAGCTGGCGCCCTTGGCCCCGGCGGGGCCGGCGGCGACGCCCGTGTGCGTCAAGCAGGAGCCCGACGACACGCCCGCACACACCCTCGAGACGCTCGAGTCCACGCCGCACAACCAGGGTTCTTTGCTGTTAGAGCGGCACCTGGCGATGCTGGGTTCAAGTGAGCCCGACATAGACTTCGTGCCACTGCAGCCTGTTAAAGATGAACCGTCTTCGGAGGGAGAACAACCTCATCTCAGCAGCGACGAGATGAGCGATTCGTGCGGCGTGTCGGAGCCGGCCGCGCGGGGCAGCCCCAAGAGCTGGACGCAGCGAGACATGGACGGGGCCTTGGACGCGCTGCGCTCGCAGAACATGAGCCTCACTAAGGCATCGTGCGTGTTCGGCATCCCGTCGACGACGCTGTggcagcgcgcgcggcggctcGGCATCGACACGCCGAAGCGCGagggcgccgcgcgcgcctgGAGCCAGCCCGACCTGCGGCGCGCGCTCGACCACCTGCGCGCCGGCACGCTCTCCGCCAACAAGGCCAGCAAGGCTTACGGCATCCCGAGCAGCACGCTGTACAAGATCGCGCGGCGGGAAGGCATCCGCCTGTCGGCGCCGTTCAACGCGGCGCCGGTGGCGTGGCGCCGCGCCGACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTG
TCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGC
CatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCcgcgcgcgctcgccgccatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGT
GCAGCGAGCCGCGCGACACTACGGGATACCCACGGGTCAGTACAGTGTCACGAGTGTCACCTGCCGCGCGCGCTGGCCGCCatccgcgccggcgccgcctccGTGCAGCGAGCCGCGCGACACTACGGGATACCCACGGGCACCCTGTACGGGCGCTGCAAGCGCGAGGGCATCGAGTTGTCGCGCTCCACGCCCACGCCGTGGTCCGAGGGCGCCATGGGCGAGGCGCTGGACGCCGTCAGGGTCGGTCAGATGTCAATAAACCAAGCGGCTATTCACTTCAACCTGCCCTACTCGTCGCTCTATGGGAGATTTAAGAGGTGCAAGTATCAGCCACCGCAATCGCTTCACGATGTACAAACCAATCCAGACGCTATGCAAGAAGTGTACTACAACCAGACACAGATGGCGCCACCGCACTTGGAAGAAAACTTGCAACAGACGTATTCCCAGGAGTTCAACACAGATGGAGACATTGAGCCGTACTCCATGTGTTATCACCAGGGGTACAGCAGTATAGCCACCAGTTGA
Protein Sequence
MPGHGSAPVKSEPPPPDDDESLHLSTIFFTKVSGSLEELAPLAPAGPAATPVCVKQEPDDTPAHTLETLESTPHNQGSLLLERHLAMLGSSEPDIDFVPLQPVKDEPSSEGEQPHLSSDEMSDSCGVSEPAARGSPKSWTQRDMDGALDALRSQNMSLTKASCVFGIPSTTLWQRARRLGIDTPKREGAARAWSQPDLRRALDHLRAGTLSANKASKAYGIPSSTLYKIARREGIRLSAPFNAAPVAWRRADLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARSPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARARRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARSPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARARRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARSPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLRAASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGQYSVTSVTCRARWPPSAPAPPPCSEPRDTTGYPRVSTVSRVSPAARAGRHPRRRRLR
AASRATLRDTHGSVQCHECHLPRALAAIRAGAASVQRAARHYGIPTGTLYGRCKREGIELSRSTPTPWSEGAMGEALDAVRVGQMSINQAAIHFNLPYSSLYGRFKRCKYQPPQSLHDVQTNPDAMQEVYYNQTQMAPPHLEENLQQTYSQEFNTDGDIEPYSMCYHQGYSSIATS
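
The protein sequence can be checked against the coding sequence by translating it codon by codon. The sketch below assumes the standard genetic code, which the record does not state, and uppercases the input because the coding sequence is given in mixed case; `translate` and `CODON_TABLE` are hypothetical names.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
# Assumption: the record uses the standard code (not stated in the record).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    # Drop whitespace and line breaks, and ignore case.
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        # Codons containing ambiguous bases are reported as X.
        amino = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

# First six codons and last four codons of the coding sequence above.
head = translate("ATGCCGGGGCACGGCTCC")
tail = translate("GCCACCAGTTGA")
print(head, tail)
```

Passing the full coding sequence, with its line breaks, to `translate` should reproduce the protein sequence if the assumption about the genetic code holds.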

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-