In recent years, a large number of papers on DNA methylation have been published; in 2022 alone there were more than 7,000.
Concept of DNA methylation
DNA methylation is the process by which a specific base in a DNA sequence acquires a methyl group through a covalent bond, catalyzed by a DNA methyltransferase (DNMT). It is a chemical modification of DNA that can change the activity of a DNA segment without changing its sequence, thereby altering how genes are expressed. It is a highly conserved epigenetic modification and plays a crucial role in regulating gene expression, embryonic development, cell proliferation and differentiation, maintaining genome stability, and defending against invading foreign DNA such as viral DNA. DNA methylation can occur at the C-5 position of cytosine, the N-6 position of adenine, the N-7 position of guanine, and elsewhere. These reactions are catalyzed by different DNA methyltransferases and produce 5-methylcytosine (5-mC), N6-methyladenine (N6-mA), and 7-methylguanine (7-mG), respectively. In most research, DNA methylation refers to methylation of the fifth carbon atom of cytosine at a CpG site (cytosine-phosphate-guanine, that is, a site where a cytosine is immediately followed by a guanine in the DNA sequence), which produces 5-mC. 5-mC is widespread in eukaryotic genomes, including those of plants and animals, and is currently the most studied form of DNA methylation.
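The definition of a CpG site above (a cytosine immediately followed by a guanine on the same strand) can be illustrated with a short Python sketch. The function name `find_cpg_sites` and the example sequence are made up for illustration; the code only scans one strand and knows nothing about whether a site is actually methylated.

```python
from typing import List


def find_cpg_sites(sequence: str) -> List[int]:
    """Return the 0-based position of every C that is directly followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Hypothetical example sequence, read 5' to 3'.
example = "ATCGGCGTACCGA"
print(find_cpg_sites(example))  # [2, 5, 10]
```

Each reported position is the cytosine whose C-5 carbon would carry the methyl group if that site were methylated.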
Mechanism of DNA methylation
During DNA methylation, the target cytosine flips out of the DNA double helix and into the catalytic pocket of the enzyme. Catalyzed by a cytosine methyltransferase (Dnmt), the activated methyl group is transferred from S-adenosylmethionine to the C-5 position of cytosine to form 5-methylcytosine (5-mC). DNA methylation patterns are established and maintained by DNA methyltransferases.
(1) DNA methyltransferases fall into two categories: the maintenance methyltransferase (Dnmt1) and the de novo methyltransferases (Dnmt3a and Dnmt3b, assisted by the catalytically inactive cofactor Dnmt3L).
(2) DNA methylation reactions are of two types. The first acts on DNA in which neither strand is methylated and is called de novo methylation. It occurs mainly in the early stages of embryonic development, where it sets the methylation state of early embryonic cells.
The second acts on double-stranded DNA in which one strand is already methylated (hemimethylated DNA): the unmethylated strand is methylated to match. This type is called maintenance methylation.
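The difference between the two reaction types can be made concrete with a toy model in Python. This is a deliberately simplified sketch, not a description of enzyme kinetics: each CpG site is reduced to a pair of booleans, and the names `Site`, `classify`, `maintenance_methylation` and `de_novo_methylation` are hypothetical. Maintenance (retained) methylation only completes sites where one strand is already methylated, while de novo methylation marks chosen sites regardless of their prior state.

```python
from typing import Iterable, List, Tuple

# One CpG site in a duplex: (top strand methylated?, bottom strand methylated?)
Site = Tuple[bool, bool]


def classify(site: Site) -> str:
    """Name the methylation state of a single CpG site."""
    top, bottom = site
    if top and bottom:
        return "fully methylated"
    if top or bottom:
        return "hemimethylated"
    return "unmethylated"


def maintenance_methylation(sites: List[Site]) -> List[Site]:
    """Dnmt1-like step (toy model): complete hemimethylated sites, leave unmethylated sites alone."""
    return [(True, True) if (top or bottom) else (False, False) for top, bottom in sites]


def de_novo_methylation(sites: List[Site], targets: Iterable[int]) -> List[Site]:
    """Dnmt3-like step (toy model): methylate both strands at the chosen site indices."""
    chosen = set(targets)
    return [(True, True) if i in chosen else site for i, site in enumerate(sites)]


# A freshly replicated duplex: sites 0 and 2 are hemimethylated, site 1 is unmethylated.
replicated = [(True, False), (False, False), (False, True)]
print([classify(s) for s in maintenance_methylation(replicated)])

# A duplex with no methylation at all; site 1 is targeted for de novo methylation.
print([classify(s) for s in de_novo_methylation([(False, False)] * 3, [1])])
```

In the first call the unmethylated site stays unmethylated, which is why maintenance methylation copies an existing pattern but cannot create a new one.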
Role of DNA methylation and demethylation
DNA methylation can change chromatin structure, DNA conformation, DNA stability, and the way DNA interacts with proteins, thereby controlling gene expression. Exactly how DNA methylation represses expression remains unclear, and several hypotheses have been proposed.
1. For some transcription factors, such as AP-2, c-Myc, CREB/ATF, E2F and NF-κB, DNA methylation is thought to act as a physical barrier that blocks access to their binding sites in the promoter.
2. Methylated DNA is bound by methyl-CpG-binding domain proteins (MBDs), which recruit histone deacetylases (HDACs) and other complexes (such as Sin3, NuRD, NCoR). As a result, chromatin is compacted and gene expression is repressed.
5-mC is not a permanent DNA modification; it can also be removed by demethylation. DNA methylation can switch off the activity of some genes, whereas demethylation can reactivate their expression. DNA demethylation is regulated by sequences within the gene and by the factors that bind to them. Two hypotheses have been proposed for its molecular mechanism. The first, passive demethylation, is linked to semiconservative DNA replication. If methylated DNA is replicated and the new strand is not methylated, the resulting DNA is hemimethylated. If that hemimethylated DNA is replicated again while DNA methyltransferase activity is still inhibited, 50% of the daughter DNA molecules remain hemimethylated and the other 50% are completely unmethylated. The second hypothesis is that DNA demethylation is catalyzed by a DNA demethylase. In this view, a DNA glycosylase removes the methylated base, and the reaction resembles the repair of damaged DNA, in which glycosylase action is coupled to cleavage by an apurinic/apyrimidinic (AP) endonuclease.
How DNA demethylation is initiated is still not fully understood.
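The arithmetic behind passive demethylation can be checked with a small Python sketch. It assumes an idealized population that starts fully methylated, replicates semiconservatively, and has no methyltransferase activity at all; the function name `passive_demethylation` and the state labels are illustrative only.

```python
from fractions import Fraction
from typing import Dict


def passive_demethylation(rounds: int) -> Dict[str, Fraction]:
    """Fractions of duplexes in each state after replication without any methylation."""
    state = {"full": Fraction(1), "hemi": Fraction(0), "none": Fraction(0)}
    for _ in range(rounds):
        # A fully methylated duplex yields two hemimethylated daughters.
        # A hemimethylated duplex yields one hemimethylated and one unmethylated daughter.
        # An unmethylated duplex yields two unmethylated daughters.
        hemi = state["full"] + state["hemi"] / 2
        none = state["none"] + state["hemi"] / 2
        state = {"full": Fraction(0), "hemi": hemi, "none": none}
    return state


for n in range(4):
    s = passive_demethylation(n)
    print(n, "hemi:", s["hemi"], "unmethylated:", s["none"])
```

Under these assumptions every duplex is hemimethylated after one round, half are after two rounds, and the hemimethylated fraction keeps halving with each further round.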
Biological function of DNA methylation
DNA methylation plays a crucial role in maintaining normal cell function, X-chromosome inactivation in females, silencing of parasitic DNA sequences, genome structural stability, genomic imprinting, embryonic development, and the onset and progression of tumors and other diseases.
(1) DNA methylation and tumorigenesis: A tumor originates from a single cell that has accumulated many changes, which make its phenotype differ from that of its normal precursor. Although this process can be driven by key genes that control cell growth, many changes in expression may be due to epigenetic changes (mainly DNA methylation). Altered DNA methylation is common in tumors and is characterized by a decrease in the overall methylation level together with local increases in methylation.
● Some genes in cancer cells are inactivated by hypermethylation of CpG islands in their regulatory regions, whereas these regions are unmethylated in normal tissue. An example is the retinoblastoma gene RB1, a tumor suppressor.
● Abnormal CpG island methylation promotes the development and progression of cancer by affecting genes involved in key cellular processes. An example is death-associated protein kinase 1 (DAPK1).
In addition, in tumor cells oncogenes tend to be activated in a hypomethylated state, whereas tumor suppressor genes are silenced in a hypermethylated state. Although DNA methylation may not play a leading role in every cancer, changes in these modification patterns ultimately affect cell susceptibility and tumor phenotype.
(2) DNA methylation and embryonic development: During embryonic development, genome-wide DNA methylation levels change dramatically, most of all during gametogenesis and early embryonic development. Establishing incorrect methylation patterns can lead to human disease, such as fragile X syndrome.
(3) DNA methylation and the stability of genetic material: Studies have shown that the initiation of bacterial DNA replication is related to DNA methylation and to the interaction between DNA and the bacterial plasma membrane. DNA methylation acts as a mark that designates the origin of replication, controls the initiation of replication, and keeps DNA replication in step with cell division. DNA mismatch repair is an important means of correcting replication errors during cell proliferation. After replication, double-stranded DNA stays hemimethylated for a short time (several minutes), which lets the mismatch repair system distinguish the old strand from the new one and so identify misincorporated bases in the new strand.
(4) DNA methylation and regulation of gene expression: DNA methylation provides an effective mechanism for long-term silencing of noncoding regions (such as introns).
Application of DNA methylation
DNA methylation is widely applied in research on disease onset and progression, exposure and response to environmental factors, development and differentiation, and disease biomarkers. Clinically, it is mainly used in the diagnosis of tumors.
(1) Early diagnosis and screening: Normal cells become tumor cells through malignant transformation. Tumor cells also go through a life cycle and die, and their contents can be released into the bloodstream. Tumor-cell DNA that travels with the blood circulation in this way is called circulating tumor DNA (ctDNA). Methylation patterns are highly specific to the type of cancer, and ctDNA levels vary with the stage of the cancer. Liquid biopsy can therefore be used to measure ctDNA levels and methylation features to support early diagnosis and staging of tumors.
(2) Prognostic risk prediction: The methylation features of ctDNA can predict the risk of postoperative recurrence and death in cancer patients, help adjust treatment plans, assess the need for postoperative chemotherapy, and guide the choice of chemotherapy regimen.
(3) Evaluation of therapeutic efficacy: Because liquid biopsy can be repeated, ctDNA can be sampled many times, which helps in evaluating therapeutic efficacy over the course of the disease and in monitoring the patient's condition in near real time.
Detection of DNA methylation
There are many techniques for analyzing DNA methylation, which can be roughly grouped as follows.
(1) High-performance liquid chromatography (HPLC): This technique is based on base separation. DNA is hydrolyzed into single deoxyribonucleosides, and deoxycytidine and 5-methyldeoxycytidine are separated and quantified to calculate the 5-mC content of the genome. In 1980, Kuo et al. first used HPLC coupled with ultraviolet detection to measure genomic 5-mC content. With the development of mass spectrometry (MS), HPLC-tandem mass spectrometry (HPLC-MS/MS) and ultra-high-performance liquid chromatography-tandem mass spectrometry (UHPLC-MS/MS) have significantly improved the selectivity and sensitivity of 5-mC detection. Because of its high accuracy and sensitivity, HPLC-MS is the gold standard for measuring 5-mC content across the whole genome.
(2) Methylation-sensitive restriction endonucleases (MSREs): These are restriction endonucleases whose cutting is sensitive to methylated bases in their recognition sites, so digestion can discriminate between methylated and unmethylated sequences. Each enzyme generally recognizes only one methylated-base site. Amplification after MSRE digestion is simple and widely applicable, but it gives only a qualitative or semi-quantitative readout of the methylation status at a specific site in the target fragment, not a precise quantification. Incomplete digestion also distorts the results.
(3) Sequencing based on bisulfite treatment: Classical bisulfite sequencing is currently the most important method for studying DNA methylation. Bisulfite sequencing (BS-seq) relies on bisulfite efficiently deaminating unmethylated cytosine to uracil, which is read as thymine in subsequent sequencing. In contrast, 5-mC and 5-hydroxymethylcytosine (5-hmC) both resist bisulfite conversion and are still read as cytosine, so DNA methylation can be determined by directly sequencing the PCR products. The method is highly accurate and reveals the methylation status of every CG, CHG and CHH site in the target fragment, and it is regarded as the "gold standard" for DNA methylation detection. Its drawbacks are: ① it cannot distinguish 5-mC from 5-hmC; ② conversion can be incomplete; ③ it is time-consuming; ④ estimating the degree of methylation from the number of sequenced clones introduces some error.
(4) Sequencing based on affinity enrichment: This includes methylated DNA immunoprecipitation followed by microarray detection or sequencing, and enrichment sequencing with methylated-DNA-binding proteins. The approach is cost-effective and suits methylation detection in large batches of samples. Its disadvantages are that it is not quantitative and has low resolution (usually around 150 bp, short of single-base resolution), which makes it difficult to obtain information on regions of low methylation.
(5) Third-generation sequencing: Second-generation sequencing (next-generation sequencing, NGS) has become the standard tool for measuring DNA methylation levels because it is inexpensive and its experimental workflow is mature and reliable. However, short-read technology has inherent limitations, such as difficulty with de novo assembly, haplotype phasing and structural variant detection. Third-generation sequencing mainly comprises single-molecule real-time (SMRT) sequencing and single-molecule nanopore sequencing. Compared with second-generation sequencing, its higher error rate and higher cost limit its application.
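The bisulfite principle described in item (3) can be sketched in Python as an in-silico conversion followed by a per-site methylation estimate. This is a minimal illustration under strong assumptions: conversion is treated as 100% efficient, only one strand is modeled, and every read is taken to be already aligned to the full reference. The function names `bisulfite_convert` and `methylation_level` and the sequences are hypothetical, and because 5-mC and 5-hmC both stay as C, the "methylated" set here stands for either.

```python
from typing import List, Set


def bisulfite_convert(sequence: str, protected: Set[int]) -> str:
    """Read unprotected C as T (C to U, then T after PCR); protected C stays C."""
    out = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in protected:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def methylation_level(reference: str, reads: List[str], position: int) -> float:
    """Fraction of aligned reads that still show C at a reference cytosine."""
    if reference[position].upper() != "C":
        raise ValueError("position is not a cytosine in the reference")
    calls = [read[position].upper() for read in reads]
    informative = [c for c in calls if c in ("C", "T")]
    if not informative:
        raise ValueError("no informative reads at this position")
    return informative.count("C") / len(informative)


reference = "ACGTCGA"  # cytosines at positions 1 and 4, both in a CpG context
# Three molecules carry a modified C at position 1; one molecule is unmodified.
reads = [bisulfite_convert(reference, {1}) for _ in range(3)]
reads.append(bisulfite_convert(reference, set()))

print(reads)
print(methylation_level(reference, reads, 1))  # 0.75
print(methylation_level(reference, reads, 4))  # 0.0
```

The same counting idea also shows why incomplete conversion is a drawback: any unmethylated C that escapes conversion is counted as methylated and inflates the estimate.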