Basic Information

Insect
Cydia amplana
Gene Symbol
ATP13A2
Assembly
GCA_948474715.1
Location
OX419662.1:18170497-18201459[-]
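
The Location value packs a sequence accession, a coordinate range, and a strand into one string. A minimal sketch of splitting it into fields follows; the `GenomicLocation` type and `parse_location` helper are hypothetical names introduced here, and the pattern assumes the `accession:start-end[strand]` layout seen in this record.

```python
import re
from typing import NamedTuple


class GenomicLocation(NamedTuple):
    """Fields of a location string (hypothetical container, not part of the record)."""
    seqid: str
    start: int
    end: int
    strand: str


# Assumed layout: <accession>:<start>-<end>[<strand>], strand being + or -
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> GenomicLocation:
    """Split a location string into accession, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    return GenomicLocation(
        match["seqid"], int(match["start"]), int(match["end"]), match["strand"]
    )


loc = parse_location("OX419662.1:18170497-18201459[-]")
print(loc.seqid, loc.start, loc.end, loc.strand)
```

Whether the coordinates are 0-based or 1-based, and whether the end is inclusive, is not stated in the record, so any span length derived from `start` and `end` should be checked against the source database's convention.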

Transcription Factor Domain

TF Family
HMG
Domain
HMG_box domain
PFAM
PF00505
TF Group
Other Alpha-Helix Group
Description
High mobility group (HMG) box domains bind DNA and may also mediate protein-protein interactions. The structure of the HMG-box domain consists of three helices in an irregular array. HMG-box domains are found in one or more copies in HMG-box proteins, which form a large, diverse family involved in the regulation of DNA-dependent processes such as transcription, replication, and strand repair, all of which require the bending and unwinding of chromatin. Many of these proteins are regulators of gene expression. HMG-box proteins are found in a variety of eukaryotic organisms and can be broadly divided into two groups, based on sequence-dependent and sequence-independent DNA recognition; the former usually contain one HMG-box motif, while the latter can contain multiple HMG-box motifs. HMG-box domains can be found in single or multiple copies in the following protein classes: HMG1 and HMG2 non-histone components of chromatin; SRY (sex-determining region Y protein) involved in differential gonadogenesis; the SOX family of transcription factors [1]; sequence-specific LEF1 (lymphoid enhancer-binding factor 1) and TCF-1 (T-cell factor 1) involved in regulation of organogenesis and thymocyte differentiation [2]; structure-specific recognition protein SSRP involved in transcription and replication; MTF1 mitochondrial transcription factor; nucleolar transcription factors UBF 1/2 (upstream binding factor) involved in transcription by RNA polymerase I; Abf2 yeast ARS-binding factor [3]; yeast transcription factors Ixr1, Rox1, Nhp6b and Spp41; mating type proteins (MAT) involved in the sexual reproduction of fungi [4]; and the YABBY plant-specific transcription factors.
Hmmscan Out
# of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc
1 21 0.086 67 4.5 0.1 25 59 40 74 23 81 0.86
2 21 0.086 67 4.5 0.1 25 59 102 136 85 143 0.86
3 21 0.086 67 4.5 0.1 25 59 164 198 147 205 0.86
4 21 0.086 67 4.5 0.1 25 59 284 318 267 325 0.86
5 21 0.086 67 4.5 0.1 25 59 346 380 329 387 0.86
6 21 0.086 67 4.5 0.1 25 59 466 500 449 507 0.86
7 21 0.086 67 4.5 0.1 25 59 528 562 511 569 0.86
8 21 0.086 67 4.5 0.1 25 59 648 682 631 689 0.86
9 21 0.086 67 4.5 0.1 25 59 768 802 751 809 0.86
10 21 0.086 67 4.5 0.1 25 59 830 864 813 871 0.86
11 21 0.17 1.3e+02 3.5 0.1 25 59 892 926 875 933 0.86
12 21 0.17 1.3e+02 3.5 0.1 25 59 954 988 937 995 0.86
13 21 0.17 1.3e+02 3.5 0.1 25 59 1016 1050 999 1057 0.86
14 21 0.086 67 4.5 0.1 25 59 1078 1112 1061 1119 0.86
15 21 0.086 67 4.5 0.1 25 59 1198 1232 1181 1239 0.86
16 21 0.086 67 4.5 0.1 25 59 1260 1294 1243 1301 0.86
17 21 0.17 1.3e+02 3.5 0.1 25 59 1322 1356 1305 1363 0.86
18 21 0.17 1.3e+02 3.5 0.1 25 59 1384 1418 1367 1425 0.86
19 21 0.17 1.3e+02 3.5 0.1 25 59 1446 1480 1429 1487 0.86
20 21 0.17 1.3e+02 3.5 0.1 25 59 1508 1542 1491 1549 0.86
21 21 0.086 67 4.5 0.1 25 59 1570 1604 1553 1611 0.86
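
The rows above are whitespace-separated, with thirteen values per domain hit. A minimal sketch of loading them into typed records follows; the `DomainHit` class and `parse_domain_row` function are hypothetical names, and the field names are assumptions derived from the table header rather than from any parser library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One row of the per-domain table (field names assumed from the header)."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(line: str) -> DomainHit:
    """Convert one whitespace-separated row into a DomainHit."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}: {line!r}")
    return DomainHit(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]), float(fields[4]), float(fields[5]),
        int(fields[6]), int(fields[7]), int(fields[8]), int(fields[9]),
        int(fields[10]), int(fields[11]),
        float(fields[12]),
    )


# Two rows copied from the table above (hits 1 and 11)
rows = """\
1 21 0.086 67 4.5 0.1 25 59 40 74 23 81 0.86
11 21 0.17 1.3e+02 3.5 0.1 25 59 892 926 875 933 0.86
"""
hits = [parse_domain_row(row) for row in rows.splitlines() if row.strip()]

for hit in hits:
    # Aligned length on the protein, treating both coordinates as inclusive
    aligned_length = hit.ali_to - hit.ali_from + 1
    print(hit.index, hit.i_evalue, hit.score, aligned_length)
```

Keeping E-values as floats lets values written in scientific notation, such as `1.3e+02`, be compared directly against whatever significance cutoff an analysis chooses.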

Sequence Information

Coding Sequence
ATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCC
GGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCA
CAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGGCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGCGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGAATGGGACAGACACAAGCCGGACCGCAAGATGAAGTGGGTGGACGCGCAGCGCGTCAAGCGGGAGGGGCTCGAGTGCGACATGACGTTCCTCGGCTTCCTCGTCATGCAGAACAGCCTCAAGCCGGAGACCACGCAGGTCATCAGGGAGCTGCACGAGGCGGACATCAAGCAGGTTATGGTCACAGGTGACAACATAATGACAGCGATGTCCGTGGCCAGAGGGTGTTTCATGGTGCAGCCGCACCAAAAGCTGGTGCTCATCACGGTCGGCCATCAGCAGACTGACGACACGCGGCCGCCGCTGTGTATGGAAGTGGTCGGGGAAGGCGGGCCTCCGAAGCTCGCTAGCGACATCTATGTGCTGGCGTTGGAAGGGAAAACGTGGTCCGTCATACGAAATTACTACCCGGAGATGATGACCACTGTACTTAATAGAGGCATAGTGTTCGGCCGCTTCGGTCCCGACCAGAAGACCCAGCTGGTAACGGCGCTGCAGGGCGAGGGGCGCGTGGTGGGGATGTGCGGGGACGGCGCCAACGACTGCGGCGCGCTCAAGGCCGCGCACGTCGGGATCTCGCTGTCCGAAGCCGATGCCTCAGTGGCGGCGCCTTTCACGTCACAAGAGCAGAACATTCGTTGCGTCAAGTTACTGACACTCGAAGGCAGATGCGCACTCAGTACTAGCTTCGCTATCTTCAAATACATGGCTCTCTATTCGCTCATACAGTTCTTCTCTATCCTCATATTATATAACTACTTCTCTATCCTGGGCAACTACCAGTTCCTGTACATCGACCTGGTGCTGACGACGCTGCTGGCCTTATCCCTCGGCCGCGCGGCGCCGGGGCCCGTGCTGGCGCCGCGCTCGCCGCCCGTCTCGCTCGTCGCCGCCTCCAGCATACTGCCTCTAGTCGCACAGGTGGCGCTGGTGCTACTTCTTCAACTCGCCGCTCTGTACTTGTTGCGAGCACAGGACTGGTTCCACCAAGTGGAAGGCAACCCGGAACTGGAAGTGGTCCTCTGCTGGGAGAATACTGTCATCTTCATCGTCTCCGCATTCCAGTACTTAGTCATGGCTTGCGTCTATGCCAAAGGCTGGCCCTTCAGGGAGCCGTTCTGTGCCAATTATTACATGGTACTGACTCTAGTCACGCAGTCGGTGTTCGTGGTTCTGTTGCTGTTCTGTCCGTGGCAGGAGCTGGCCGACCTCATGGAGGTGAAGGTGTTCAAGTGGGAAGCTCAGGCGGAGAAC
ATCTTCCGCATCTACCTGCTGCTGGTGCCGGCGCTACATCTCATCCTAGCTATCGCCATTGAGGCGACCCTATCCGACACGCACAGGTTCCACGACATGTTCCGCGGCTTACGGCGGCGTCGATGCGACAAGCAACCAGGCGCCGGCGACGCCGCGCATGTGTGCTGA
Protein Sequence
MKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTDKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTDKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTDKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTDKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTDKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKADRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTEWDRHKPDRKMKWVDAQRVKREGLECDMTFLGFLVMQNSLKPETTQVIRELHEADIKQVMVTGDNIMTAMSVARGCFMVQPHQKLVLITVGHQQTDDTRPPLCMEVVGEGGPPKLASDIYVLALEGKTWSVIRNYYPEMMTTVLNRGIVFGRFGPDQKTQLVTALQGEGRVVGMCGDGANDCGALKAAHVGISLSEADASVAAPFTSQEQNIRCVKLLTLEGRCALSTSFAIFKYMALYSLIQFFSILILYNYFSILGNYQFLYIDLVLTTLLALSLGRAAPGPVLAPRSPPVSLVAASSILPLVAQVALVLLLQLAALYLLRAQDWFHQVEGNPELEVVLCWENTVIFIVSAFQYLVMACVYAKGWPFREPFCANYYMVLTLVTQSVFVVLLLFCPWQELADLMEVKVFKWEAQAEN
IFRIYLLLVPALHLILAIAIEATLSDTHRFHDMFRGLRRRRCDKQPGAGDAAHVC
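
The protein sequence can be checked against the coding sequence by translating it codon by codon. The sketch below assumes the standard genetic code (NCBI translation table 1), which the record does not state explicitly; `CODON_TABLE` and `translate` are hypothetical names introduced for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
# Using this table for the record is an assumption.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    # Drop whitespace so a sequence wrapped over several lines is accepted
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons and last four codons of the coding sequence above
print(translate("ATGAAGTGGGTGGACGCGCAGCGCGTCAAG"))  # MKWVDAQRVK
print(translate("CATGTGTGCTGA"))  # HVC
```

Both fragments agree with the start (`MKWVDAQRVK`) and end (`HVC`) of the listed protein sequence. Passing the full coding sequence, line breaks included, to `translate` would extend the same check to the whole record.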

Similar Transcription Factors

Sequence clustering based on sequence similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-