Basic Information

Gene Symbol
-
Assembly
GCA_907165275.1
Location
OU015678.1:37858-43760[+]
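
The location string packs a sequence accession, a coordinate range, and a strand into one token. A minimal sketch of splitting it apart is shown here; the function name `parse_location` and the reading of the coordinates as 1-based and inclusive are assumptions for illustration, not something the entry states.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings shaped like "OU015678.1:37858-43760[+]".
_LOCATION_RE = re.compile(r"^([^:]+):(\d+)-(\d+)\[([+-])\]$")


def parse_location(text: str) -> Location:
    """Split 'seqid:start-end[strand]' into its four parts (hypothetical helper)."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    return Location(seqid, int(start), int(end), strand)


loc = parse_location("OU015678.1:37858-43760[+]")
# Span length, ASSUMING 1-based inclusive coordinates.
span_length = loc.end - loc.start + 1
print(loc, span_length)
```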

Transcription Factor Domain

TF Family
HMGA
Domain
HMGA domain
PFAM
AnimalTFDB
TF Group
Unclassified Structure
Description
This entry represents the HMGA family, whose members contain DNA-binding domains known as AT-hooks, named for their ability to bind the narrow minor groove of AT-rich DNA sequences. They play an important role in chromatin organisation [1]. The high mobility group (HMG) proteins are the most abundant and ubiquitous non-histone chromosomal proteins. They bind to DNA and to nucleosomes and are involved in regulating DNA-dependent processes such as transcription, replication, recombination, and DNA repair. They are grouped into three families: HMGB (HMG 1/2), HMGN (HMG 14/17) and HMGA (HMG I/Y). Each family has a characteristic domain: the AT-hook for HMGA, the HMG box for HMGB, and the nucleosome-binding domain (NBD) for HMGN [2].
Hmmscan Out
#   of  c-Evalue  i-Evalue  score  bias  hmm_from  hmm_to  ali_from  ali_to  env_from  env_to  acc
1   14  0.00062   1.4       9.2    4.0   5         13      150       158     148       159     0.89
2   14  0.00062   1.4       9.2    4.0   5         13      200       208     198       209     0.89
3   14  0.00062   1.4       9.2    4.0   5         13      250       258     248       259     0.89
4   14  0.00062   1.4       9.2    4.0   5         13      300       308     298       309     0.89
5   14  0.00027   0.61      10.4   2.8   5         13      357       365     355       366     0.89
6   14  0.00062   1.4       9.2    4.0   5         13      575       583     573       584     0.89
7   14  0.00062   1.4       9.2    4.0   5         13      625       633     623       634     0.89
8   14  0.00062   1.4       9.2    4.0   5         13      675       683     673       684     0.89
9   14  0.00027   0.61      10.4   2.8   5         13      725       733     723       734     0.89
10  14  0.00062   1.4       9.2    4.0   5         13      900       908     898       909     0.89
11  14  0.00062   1.4       9.2    4.0   5         13      950       958     948       959     0.89
12  14  0.00062   1.4       9.2    4.0   5         13      1000      1008    998       1009    0.89
13  14  0.00027   0.61      10.4   2.8   5         13      1057      1065    1055      1066    0.89
14  14  5         1.1e+04   -3.3   3.2   2         11      1134      1143    1133      1144    0.94
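
The domain hits above can be read into typed records and filtered, for example to separate the thirteen repeated hits from the weak final one. The following is a minimal sketch: the `DomainHit` type, the `parse_rows` helper, and the conditional E-value cutoff of 0.01 are illustrative assumptions, not values used by the database.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    index: int
    c_evalue: float
    i_evalue: float
    score: float
    ali_from: int
    ali_to: int


# Rows copied from the hmmscan output above (13 whitespace-separated fields each).
ROWS = """\
1 14 0.00062 1.4 9.2 4.0 5 13 150 158 148 159 0.89
2 14 0.00062 1.4 9.2 4.0 5 13 200 208 198 209 0.89
3 14 0.00062 1.4 9.2 4.0 5 13 250 258 248 259 0.89
4 14 0.00062 1.4 9.2 4.0 5 13 300 308 298 309 0.89
5 14 0.00027 0.61 10.4 2.8 5 13 357 365 355 366 0.89
6 14 0.00062 1.4 9.2 4.0 5 13 575 583 573 584 0.89
7 14 0.00062 1.4 9.2 4.0 5 13 625 633 623 634 0.89
8 14 0.00062 1.4 9.2 4.0 5 13 675 683 673 684 0.89
9 14 0.00027 0.61 10.4 2.8 5 13 725 733 723 734 0.89
10 14 0.00062 1.4 9.2 4.0 5 13 900 908 898 909 0.89
11 14 0.00062 1.4 9.2 4.0 5 13 950 958 948 959 0.89
12 14 0.00062 1.4 9.2 4.0 5 13 1000 1008 998 1009 0.89
13 14 0.00027 0.61 10.4 2.8 5 13 1057 1065 1055 1066 0.89
14 14 5 1.1e+04 -3.3 3.2 2 11 1134 1143 1133 1144 0.94
"""


def parse_rows(text: str) -> List[DomainHit]:
    """Parse domain rows; lines without exactly 13 fields are skipped."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 13:
            continue
        hits.append(DomainHit(int(f[0]), float(f[2]), float(f[3]),
                              float(f[4]), int(f[8]), int(f[9])))
    return hits


hits = parse_rows(ROWS)
C_EVALUE_CUTOFF = 0.01  # hypothetical threshold, chosen only for illustration
kept = [h for h in hits if h.c_evalue < C_EVALUE_CUTOFF]
best = max(hits, key=lambda h: h.score)
print(len(hits), len(kept), best.index, best.score)
```

Under this assumed cutoff only the last hit, with a negative bit score, is dropped.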

Sequence Information

Coding Sequence
ATGAGTGCCCCTCAGAATGACGACGATGCGCCGGAACACGCGACGCACCGCCGCCCTCCTCCCTCTCTCGGGGACCTTGTGCAAAGGACGCCGGTCGGGCCCATGACCGGCGCCCAAAGGCCCCCTGCGACGGGTCAGGTAGATGGCCCCGCCGCCCACCACGACGGCCGCAGGGGGAGGAAAAGCATCTCCACGGCCTCTAACACTCAGCCGCGGTGCTCCCCGAGCCACCGCGGCCCGCCTCAGTGGCGCCGCCGCCCCGGAGGACGACAGCGCCACCTACATTGGCCGCCTCGACCACGTGCTTGCTGCCTGCAAGTGGGTGAGACGGCCCCCTCCCGGTGCCTCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAGCATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAACGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACGCTCCCCTCTCCCGGCCACGGTTTTTTACGCCGCGGCCCACCTCAGTTGCACCCCCACGCCAAAGCGCCCGAGAGGCGCAACCTACCGTCCCCTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGTCGCGCCCCCACGCCAAAGCGCCCGAGAGGCGCAACCTACCCGTCTCGGTGCCTCATGTCGGGCGCAGCGCTGGCCCTTCTCAGCGCCGCACTCACCGGCATTCCGGCCCGCCTCAGGGTTCCTCTTGGCTTCACCAATGACGACGATGCGCCGGAACACGCGACGCACCGCCGCCCTCCTCCCTCTCTCGGGGACCTTGTGCAGAGGACGCCGGTCGGGCCCATGACCGGCGCCCAAGGCCCCCTGCGACGGGTCGGGCCAGTGGCCCCGCCGCCCACCACGACGGCCGCAGGGGGTGAGGAGGGCCGACTAGCTCCGCGCCAGCACTCCCCTCCCCCCGAAGCCCCTCCACGCCGCCACTCACCGGGGCGGCCCCGTCGAGGCCGCAGAGGAAAACACCCCCACGGCCTCCAACCCCCGGCCGCGGCGCTCGAGCTACGCCGCAGCCCGCCTCAGTGGCGCCGCCGTCCCGGAGGACGACAGCGCCACCTACATGGGCCGCCTCGACCACGTGCTTGCTGCCTGCAAGTGGGTGAGACGGCCCCCTCCCGGTGCCTCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAGCATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTA
CGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCGCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGTCGCGCCCCCACGCCAAAGCGCCCGAGAGGCGCAACCTACCCGTCTCGGTGCCTCATGCCGGGCGCAGCGCTGGCCCTTCTCAGCGCCGCACTCACCGGCATTCCGGCCCGCCTCAGGGTTCCTCTTGGCTTCACCAATGACGACGATGCGCCGGAACACGCGGCGCACCGCCGCCCTCCTCCCTCTCTCGGGGACCTTGTGCAAAGGACGCCGGTCGGGCCCATGACCGGCGCCCAAGGCCCCCTGCGACGGGTCAGAGGAAAAGCATCTCCACGGCCTCTAACACTCAGCCGCGGTGCTCCCCGAGCCACCGCGGCCCGCCTCAGTGGCGCCGCCGCCCCGGAGGACGACAGCGCCACCTACATGGGCCGCCTCGACCACGTGCTTGCTGCCTGCAAGTGGGTGAGACGGCCCCCTCCCGGTGCCTCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAGCATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACCCGTCCCGGTGCCGCATGCCGGGCGCGGCGTCGGATGAAATCGACGGCGCCGCACCCGAACATTCCGGCCCTTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGCCGCGCCCCCACGCCAAAGCGCCCGAGAGGCACGGCCTACGCTCCCCTCTCCCGGCCACGGTTTTTTACGCCGCGGCCCACCTCAGTTGCACCCCCACGCCAAAGCGCCCGAGAGGCGCAACCTACCGTCCCCTCTCCCGGCCACGGTTTTTACGCCGCGACCCGCCTCAGTCGCGCCCCCACGCCAAAGCGCCCGAGAGGCGCAACCTACCCGTCTCGGTGCCTCATGTCGGGCGCAGCGCTGGCCCTTCTCAGCGCCGCACTCACCGGCATTCCGGCCCGCCTCAGGGTTCCTCTTGGCTTCACCTGGGCGCCGCACGGCCCCACGTCCGACGCCCGAAGTCCCCGGCGGCGGGATGGGGACCCGCCGCCAAAAGGCCGCAGGGGTTGGGAGGTTCGGCACTCTCCCACCGACGCTCCCCCCCCCCACCGACCCTCCACGCCGCCGCCCGTGAGGGCAGCAGCGCCTCCCACTCGGGCCACTTCGTCCGGGCTGTCCACACACCCCGAGGAAGCGGCCCGTGGCCTCGGGAAGGGCGGCACCGGGAGTGCGCACCGGCGCCGCTACCGACCATACTGGCCCCGCCTCTGGGTTCCTCTTGGCTTCACCGACGCCGGTCGGGCCCTTGACCGGCGCCCAAGGCCCCCTGCGACGGGTGAGATCAGTGGCCCCGCCGCCAACCACGACGGCCGCAGGGGGAGGGAAGATTGGTTCGCCCGCACCGACACTCCCCCTCCCCCGCAGCCCCTCCACGCCGCCATCAGGCGCCACCGCCTCCCCAGGAGGACCGCAGCCCACCTCAGTTGCGCCGCCGCTCCGAGGGACGACGGCACCACCTACTCGGGCCACTTCGTCGCGGGTTGGAGCCTGAACCCGGAGAAGCAGCCCGTGGCCTCGGGAAGGGCAGCACCGGGAGCGCATCACCGGCGCCGCTGCCGACCATTCTGGCCCCGCCTCTGGGTTCCTCTTGGCTTCACCGTGGGTGTGTGTTTCTCCACCCACGGTT
ACTAA
Protein Sequence
MSAPQNDDDAPEHATHRRPPPSLGDLVQRTPVGPMTGAQRPPATGQVDGPAAHHDGRRGRKSISTASNTQPRCSPSHRGPPQWRRRPGGRQRHLHWPPRPRACCLQVGETAPSRCLMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYAPLSRPRFFTPRPTSVAPPRQSAREAQPTVPSPGHGFYAATRLSRAPTPKRPRGATYPSRCLMSGAALALLSAALTGIPARLRVPLGFTNDDDAPEHATHRRPPPSLGDLVQRTPVGPMTGAQGPLRRVGPVAPPPTTTAAGGEEGRLAPRQHSPPPEAPPRRHSPGRPRRGRRGKHPHGLQPPAAALELRRSPPQWRRRPGGRQRHLHGPPRPRACCLQVGETAPSRCLMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGATYPSRCLMPGAALALLSAALTGIPARLRVPLGFTNDDDAPEHAAHRRPPPSLGDLVQRTPVGPMTGAQGPLRRVRGKASPRPLTLSRGAPRATAARLSGAAAPEDDSATYMGRLDHVLAACKWVRRPPPGASCRARRRMKSTAPHPKHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYPSRCRMPGAASDEIDGAAPEHSGPSPGHGFYAATRLSRAPTPKRPRGTAYAPLSRPRFFTPRPTSVAPPRQSAREAQPTVPSPGHGFYAATRLSRAPTPKRPRGATYPSRCLMSGAALALLSAALTGIPARLRVPLGFTWAPHGPTSDARSPRRRDGDPPPKGRRGWEVRHSPTDAPPPHRPSTPPPVRAAAPPTRATSSGLSTHPEEAARGLGKGGTGSAHRRRYRPYWPRLWVPLGFTDAGRALDRRPRPPATGEISGPAANHDGRRGREDWFARTDTPPPPQPLHAAIRRHRLPRRTAAHLSCAAAPRDDGTTYSGHFVAGWSLNPEKQPVASGRAAPGAHHRRRCRPFWPRLWVPLGFTVGVCFSTHGY
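
The protein sequence corresponds to the coding sequence read codon by codon. A minimal sketch of that translation follows, assuming the standard genetic code; the `translate` function is a hypothetical helper, and only the first and last few codons of the coding sequence are used here as examples.

```python
from itertools import product

# Standard genetic code, built with codons ordered over bases T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
                "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR"
                "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str, stop_at_terminator: bool = True) -> str:
    """Translate a coding sequence in frame 0 (hypothetical helper).

    Unknown codons become 'X'; trailing bases that do not fill a
    complete codon are ignored.
    """
    cds = "".join(cds.split()).upper()  # tolerate line-wrapped input
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*" and stop_at_terminator:
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons and last four codons of the coding sequence above.
head = translate("ATGAGTGCCCCTCAGAATGACGACGATGCG")
tail = translate("CACGGTT\nACTAA")
print(head, tail)
```

Whitespace is stripped before translation, so a sequence wrapped across several lines, as the coding sequence is here, can be passed in directly.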

Similar Transcription Factors

Sequences clustered by similarity using MMseqs2

100% Identity
-
90% Identity
-
80% Identity
-