Hcor066202.1
Basic Information
- Insect
- Hymenopus coronatus
- Gene Symbol
- -
- Assembly
- GCA_030762935.1
- Location
- CM060896.1:3736312-3741819[-]
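The Location value packs a sequence accession, a coordinate range, and a strand into one string. A minimal sketch of splitting it into fields is shown here. The names `Location`, `parse_location`, and `span_length` are hypothetical helpers, not part of the database, and the length calculation assumes 1-based inclusive coordinates, which this record does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical container for a 'seqid:start-end[strand]' string."""
    seqid: str
    start: int
    end: int
    strand: str


# Pattern for strings such as "CM060896.1:3736312-3741819[-]".
_LOCATION_RE = re.compile(
    r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)\[(?P<strand>[+-])\]$"
)


def parse_location(text: str) -> Location:
    """Split a location string into accession, start, end, and strand."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Location(match["seqid"], start, end, match["strand"])


def span_length(loc: Location) -> int:
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return loc.end - loc.start + 1


loc = parse_location("CM060896.1:3736312-3741819[-]")
print(loc.seqid, loc.start, loc.end, loc.strand, span_length(loc))
```

The span covers the genomic interval of the gene model, so it can be longer than the coding sequence listed under Sequence Information.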
Transcription Factor Domain
- TF Family
- T-box
- Domain
- T-box domain
- PFAM
- PF00907
- TF Group
- Unclassified Structure
- Description
- The T-box encodes a 180-amino-acid domain that binds to DNA. Genes encoding T-box proteins are found in a wide range of animals, but not in other kingdoms such as plants. Family members are all thought to bind to the DNA consensus sequence TCACACCT. They are found exclusively in the nucleus, and perform DNA-binding and transcriptional activation/repression roles. They are generally required for development of the specific tissues they are expressed in, and mutations in T-box genes are implicated in human conditions such as DiGeorge syndrome and X-linked cleft palate, which feature malformations [1].
- Hmmscan Out
-
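| # | of | c-Evalue | i-Evalue | score | bias | hmm coord from | hmm coord to | ali coord from | ali coord to | env coord from | env coord to | acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 13 | 0.64 | 5.3e+03 | -0.2 | 0.0 | 113 | 136 | 75 | 98 | 53 | 112 | 0.73 |
| 2 | 13 | 0.61 | 5.1e+03 | -0.1 | 0.0 | 92 | 133 | 114 | 155 | 106 | 178 | 0.80 |
| 3 | 13 | 0.78 | 6.4e+03 | -0.5 | 0.0 | 91 | 128 | 229 | 266 | 211 | 278 | 0.88 |
| 4 | 13 | 5.2 | 4.3e+04 | -3.2 | 0.0 | 111 | 131 | 309 | 329 | 289 | 337 | 0.68 |
| 5 | 13 | 0.33 | 2.7e+03 | 0.8 | 0.0 | 103 | 134 | 389 | 423 | 367 | 438 | 0.70 |
| 6 | 13 | 0.74 | 6.1e+03 | -0.4 | 0.0 | 110 | 135 | 451 | 479 | 432 | 494 | 0.69 |
| 7 | 13 | 1 | 8.4e+03 | -0.8 | 0.0 | 91 | 130 | 495 | 534 | 485 | 552 | 0.85 |
| 8 | 13 | 0.068 | 5.6e+02 | 3.0 | 0.0 | 91 | 135 | 555 | 599 | 544 | 626 | 0.84 |
| 9 | 13 | 0.96 | 8e+03 | -0.8 | 0.0 | 91 | 133 | 632 | 674 | 610 | 687 | 0.85 |
| 10 | 13 | 0.073 | 6e+02 | 2.9 | 0.0 | 91 | 133 | 731 | 773 | 719 | 812 | 0.78 |
| 11 | 13 | 0.032 | 2.7e+02 | 4.0 | 0.0 | 89 | 135 | 829 | 875 | 816 | 898 | 0.84 |
| 12 | 13 | 0.027 | 2.2e+02 | 4.3 | 0.0 | 82 | 136 | 931 | 985 | 912 | 1001 | 0.78 |
| 13 | 13 | 1.1 | 9.1e+03 | -1.0 | 0.0 | 93 | 134 | 1002 | 1043 | 990 | 1057 | 0.75 |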
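Each domain row in the hmmscan output above carries 13 whitespace-separated fields in the order given by its header. As a rough sketch, the rows can be read into typed records and ranked by independent E-value. `DomainHit` and `parse_hit` are hypothetical names introduced for illustration, and only three of the thirteen rows are copied here to keep the example short.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one per-domain hmmscan row."""
    index: int
    total: int
    c_evalue: float
    i_evalue: float
    score: float
    bias: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_hit(line: str) -> DomainHit:
    """Convert one whitespace-separated row into a DomainHit."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}: {line!r}")
    return DomainHit(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]),
        float(fields[4]), float(fields[5]),
        *(int(value) for value in fields[6:12]),
        float(fields[12]),
    )


# Rows 1, 11, and 12 copied from the record's hmmscan output.
ROWS = """
1 13 0.64 5.3e+03 -0.2 0.0 113 136 75 98 53 112 0.73
11 13 0.032 2.7e+02 4.0 0.0 89 135 829 875 816 898 0.84
12 13 0.027 2.2e+02 4.3 0.0 82 136 931 985 912 1001 0.78
"""

hits = [parse_hit(line) for line in ROWS.strip().splitlines()]

# The lowest independent E-value marks the strongest domain match.
best = min(hits, key=lambda hit: hit.i_evalue)
print(best.index, best.i_evalue, best.score)
```

Sorting on `i_evalue` rather than `score` is a choice made for this sketch; in these rows both orderings pick the same domain.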
Sequence Information
- Coding Sequence
- ATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACATAGAACCCTCTCCACCTGGTGAACTCTACATGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCACTGGATGTACTGCGAGGTGGGCAGCAAGTCTTCCAGACATCAACATAAAGAGACCTCTCCACCTGGTGGGCCACAAGTCTTCCAGAAAACAACACAAAGAGACCTCTCCACCTGGTGAACTCTACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCAGTAGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCGACTGtGGATGTACTGTGAGGTGGCCCTCAAGTCTTCCAGACATCAACACAAAATGACCTCTCTACCTGGTGAACTCTATGTAGTCCGCAAGCATGTTTACATCCCTGTTATTATTGTCAGTGGATGTACTGCAAGGTGTCCATCAAGTCTTCCAGACATCAACACAAAGAGAAGTCTCCACCTGGTGAACTCTATGTACTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACAAAAAAAAGACCTCTCCACCTGGTGAGTTCTACATGCTCTGCAAGCATGTTCACATCCTGTTATTATTGTCACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAAAGACCTCTCCACCTGGTAAACTCTACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGGCAGTAGATGTACTGTGAGGTGGGCCTCAAGTCTCCCATACATCAACATAAAGAGACCTCTCCGACTGTGGATGTACTGCAAGGTGGGCCTCAAGTCTGCCAGACATCAACACATAGAACCCTCTCCACCTGGTGAACTCCACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCAGTGGATGTACTGCGAGGGGGGTAGCAAGTTTTCCAGACATTAACACATTGAACCCTCTCCACCTGGTGAACCCTACGTGCTCTGCAAGCTTGTTTACATCCCTGTTATTATTGTCAGTGGATGTACTGCGAGGGGGGCAGCAAGTCTTCCAGACATCAACACAAAGTGACCTCTCCACCTGCAAGTCTTCCAGGCATCAACACACAGAACCCTCTCCACCTGGTGAACTCTACATGCACTGCAAGCATGATTACATCCCTGTTATTATAGTCACTGGATGTACTGCGAGGTGGGCAGCAAGTCTTCTAAACATCAACACACAGGACCCTCTCCACCTGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCACCTAGTGAACTCTACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCAGTAGATATACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCGACCGtGGATGTACTGTGAGGTGGCCGTCAAGTCTTCCCGACATCAACACAAAATGACCTCTCCACCTGGTGAACTCTATGTAGTCTGCAAGCATGTTTACATTCCTGTTATTATTGTCGGTGGATGTACTGTGAGGTGTCCATTAAGTCTTCCAGACATCAACACAAAGAGACTTCTCCACCTGTGGATGTACTGTGAGGTGTCCATTAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCACCTGGTGAACTCTATGTACTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAAAGACCTCTCCACCTGGTGAGTTCTACATGCTCTGCAAGCATGTTTACATCCTGTTATTATTGTCACTGGATGTACTGCGAAGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCACCTGATGAACTCTACATGCTCTGCACACATCTTTACATT
CCTGATATTATTGTCACTGGATGTACTGCTAGGTGGCCCTCAAGTCTTCCAGACATCAACACATTGAAACCCACTCTACCTGTGGATGTACTGCAAGGTGGGCCTCCAGTCTTCCAGACATCAACACATAGAACCCTCTCCAGCTGGTTAACTCTACACGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCACTGGATGTATTGCGAGGTGGGCCTTAAGTCTTCTAGACATCAACACAAAGAGACCTCTCCACCTGGTAAACTCTACGTGCTCTGCACACATGTTTACATCCCTGTTATTATTGTCAGTGGATGTACTGCGAGGGGGGCAGCAAGTCTTCCAAACATCAACACAAAGAGACCTCTCCACATGGTGAACTCTATGTACTCCGCCAGCATGTTTACATCCCTGTTATTTGTGTCAGTGGATGTACAGCGAGGAGGCCCTCAATTCTTCCAGACATCAACAAAAAGAGACCTGTCCACCTGTCAGTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACTTCAACACACAGTACTCTCTCCACCTGGTGAACTATACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTATTGTCACTGGATGTACTGCCAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAGACCTCTCCACCTGGTGAACTCTTTTGTACTCTGCAAGCATGTTTACATCCCTGCTATTATAGTTACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACACAGAAACCTCTCCACCTGGTGAACTTACATCCCTGTTATTATAGTCACTGGATGTACTGCGAGTTGGGCCTCAAGTCTTCCAGACATCAACACAAAAAGACCTCTCCACCTGGTGAACTCTATGTTCTCTGCAAGCATGTTTACATCCCTGTTATTATAGTCACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTTCAGACATCAACACAAAGAACCCTCTCCACCTGTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACAAAGAACCCTCTCCACCTGGTGAACTCTACGTGCTCTGCAAGCACGTTTATATCCCTGTTATTATTGTCAGTGGATGTACTGCGAGGTGGGCAGCAAGTCCTCCAGACATCAACACAAAGAACCCTCTCCACCTGGTGAACTCTACGTGCTCTGCAAGCATGTTTACATCCCTGTTATTACTGTCACTGGATGTACTGCGAGGTGGGCCTCAAGTCTTCCAGACATCAACACATAGAACCCTCTCCACCTGGTGA
- Protein Sequence
- MYCEVGLKSSRHQHIEPSPPGELYMLCKHVYIPVIIVTGCTARWAASLPDINIKRPLHLVGHKSSRKQHKETSPPGELYVLCKHVYIPVIIVSRCTARWASSLPDINTKRPLRLWMYCEVALKSSRHQHKMTSLPGELYVVRKHVYIPVIIVSGCTARCPSSLPDINTKRSLHLVNSMYSASMFTSLLLLSLDVLRGGPQVFQTSTKKRPLHLVSSTCSASMFTSCYYCHWMYCEVGLKSSRHQHKKTSPPGKLYVLCKHVYIPVIIGSRCTVRWASSLPYINIKRPLRLWMYCKVGLKSARHQHIEPSPPGELHVLCKHVYIPVIIVSGCTARGVASFPDINTLNPLHLVNPTCSASLFTSLLLLSVDVLRGGQQVFQTSTQSDLSTCKSSRHQHTEPSPPGELYMHCKHDYIPVIIVTGCTARWAASLLNINTQDPLHLVGLKSSRHQHKETSPPSELYVLCKHVYIPVIIVSRYTARWASSLPDINTKRPLRPWMYCEVAVKSSRHQHKMTSPPGELYVVCKHVYIPVIIVGGCTVRCPLSLPDINTKRLLHLWMYCEVSIKSSRHQHKETSPPGELYVLCKHVYIPVIIVTGCTARWASSLPDINTKRPLHLVSSTCSASMFTSCYYCHWMYCEVGLKSSRHQHKETSPPDELYMLCTHLYIPDIIVTGCTARWPSSLPDINTLKPTLPVDVLQGGPPVFQTSTHRTLSSWLTLHALQACLHPCYYCHWMYCEVGLKSSRHQHKETSPPGKLYVLCTHVYIPVIIVSGCTARGAASLPNINTKRPLHMVNSMYSASMFTSLLFVSVDVQRGGPQFFQTSTKRDLSTCQWMYCEVGLKSSRLQHTVLSPPGELYVLCKHVYIPVIIVTGCTARWASSLPDINTKRPLHLVNSFVLCKHVYIPAIIVTGCTARWASSLPDINTQKPLHLVNLHPCYYSHWMYCELGLKSSRHQHKKTSPPGELYVLCKHVYIPVIIVTGCTARWASSLSDINTKNPLHLWMYCEVGLKSSRHQHKEPSPPGELYVLCKHVYIPVIIVSGCTARWAASPPDINTKNPLHLVNSTCSASMFTSLLLLSLDVLRGGPQVFQTSTHRTLSTW
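The Protein Sequence is the conceptual translation of the Coding Sequence. The sketch below shows that relationship on the first 30 nucleotides only, assuming the standard genetic code (the record does not say which translation table was used). `translate` is a hypothetical helper; it uppercases its input because the listed coding sequence contains a few lowercase bases.

```python
from itertools import product

# Standard genetic code (an assumption for this record), written in the
# conventional T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks unknown codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 nucleotides of the Coding Sequence shown above.
prefix = "ATGTACTGCGAGGTGGGCCTCAAGTCTTCC"
print(translate(prefix))
```

Under that assumption the translated prefix matches the first ten residues of the listed protein.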
Similar Transcription Factors
Sequence clustering based on sequence similarity using MMseqs2
- 100% Identity
- -
- 90% Identity
- -
- 80% Identity
- -