Identification of Liver Cancer driver mutations from COSMIC Data
Abstract
Background
Liver cancer is the third most frequent cause of cancer-related death, accounting for more than 700,000 deaths each year. The development of liver cancer is largely attributed to genetic alterations in tumor cells, known as driver mutations. Most patients die because the disease is diagnosed late; therefore, biomarkers are needed for prediction and early diagnosis, when treatment is still possible.
Objective
The objective of this study is to identify novel genes that have not previously been reported to cause liver cancer. It also aims to identify pathogenic alleles that may act as potential biomarkers for predicting liver cancer.
Method
The mutation data of hepatocellular carcinoma (HCC) [anonimizat] (Catalogue of Somatic Mutations in Cancer) databases. Different bioinformatics tools were used to study genetic alterations that are associated with liver cancer.
Results
The present study has identified a set of novel genes and pathogenic alleles (consistent mutations) that might be involved in liver cancer. [anonimizat]2 cells. Mutations near human genes and near genes involved in the Ras/MAPK signaling pathway of Hepatitis B virus were also identified.
Conclusion
The identification of novel cancer-causing genes and driver mutations may support targeted therapy for the disease. The pathogenic alleles identified in this study may help to understand the progression of liver cancer at the molecular level. They may also act as potential biomarkers and therapeutic targets for liver cancer prediction and treatment.
Keywords: liver cancer; driver mutations; consistent mutations; transcription factors; HepG2 cells; biomarkers
INTRODUCTION
Cancer continues to pose a global challenge. Even though standards of health care and rehabilitation and cancer survival rates have improved, liver cancer is the seventh most common cancer and the third leading cause of cancer-related death [1]. It has an annual incidence of more than 800,000 cases and accounts for approximately 700,000 deaths each year [2]. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and accounts for more than 80% of all liver cancers [3]. [anonimizat] [4]. Viral hepatitis is the predominant cause of HCC worldwide: more than 75% of all HCC cases are due to hepatitis B virus (HBV) and hepatitis C virus (HCV) infections [5]. Regions with a higher burden of viral hepatitis have a higher burden of HCC [6]. The increase in incidence and death rates of HCC is therefore largely attributed to the increase in HBV and HCV infections. Other factors that substantially increase the risk of liver cancer include tobacco smoking, heavy alcohol consumption and obesity [7].
[anonimizat]. It has been found that only a small fraction (about 1%) [anonimizat] (about 99%) [anonimizat], i.e., it does not code for protein [8]. [anonimizat]. [anonimizat] (also known as mutations) in regulatory elements, which cause dysregulation of oncogenes and of tumor suppressor genes (genes that protect the body from cancer). Somatic mutations that confer a selective growth advantage on tumor cells, and therefore recur at a higher rate, are called ‘driver mutations’. Driver mutations can be present in genes that are involved in the maintenance of genome and chromosomal stability [9].
The analysis of non-coding regions is difficult, and the challenges involved are distinct from those of coding regions. Driver mutations in non-coding regions play a significant role in the progression of cancer. Different approaches have been developed to identify candidate cancer driver genes, but it is still difficult to distinguish which epigenetic and genetic changes drive cancer development. Many researchers have investigated the role of regulatory mutations in non-coding regions and attempted to identify driver mutations in regulatory regions. In one study, recurrent non-coding mutations were identified within the TAL1 enhancer region in acute lymphoblastic leukemia, suggesting that mutations in the TAL1 enhancer region affect regulatory factors of the disease [10]. Similarly, Puente et al. discovered recurrent non-coding mutations in an enhancer region close to the PAX5 gene in chronic lymphocytic leukaemia (CLL) patients [11]. Another important class of non-coding mutations comprises mutations in functional RNA molecules (long non-coding RNA (lncRNA) and micro RNA (miRNA)). The lncRNA MALAT1 was found to be mutated in breast cancer [12]. The role of mutations in binding sites and in non-coding DNA was examined by Katainen et al., who observed frequent mutations at CTCF/cohesin-binding sites in cancers. These results indicated that mutations at CTCF binding sites are important in cancers [13]. Some studies have identified recurrent somatic mutations in TERT promoter regions across various cancer patients. One study identified mutations in the TERT promoter region at known and novel sites, which suggested a significant role of regulatory mutations in diseases like cancer [14]. The cancer types in which mutations in TERT promoter regions were found to affect patient survival include bladder cancer [15], gliomas [16] and renal cell carcinoma [17].
Recently, a study by Schulze et al. identified TERT promoter mutations in alcohol-related hepatocellular carcinoma patients. In this study, these mutations were thought to be responsible for tumor progression [18]. Some recurrent mutations in the promoter region of NFKBIE have also been identified in desmoplastic melanoma [19].
The main objective of the present study is to identify novel genes that have not previously been reported to cause liver cancer. It also aims to identify pathogenic alleles that may act as potential biomarkers for prediction of liver cancer. It focuses on mutations reported in non-coding regions of human DNA. Bioinformatics tools are used to study genetic alterations associated with liver cancer at the molecular level. Liver cancer is mostly asymptomatic at early stages, and symptoms usually appear at later stages when cure becomes difficult. Most patients fail to receive successful treatment because the disease is diagnosed late. For patients with few or no symptoms, there is therefore a need for biomarkers that can detect liver cancer at early stages, when treatment is possible. Such biomarkers may also help in reducing the risk of developing liver cancer. The results of this study may help in the identification of driver mutations and genes involved in liver cancer progression. The biomarkers (pathogenic alleles) identified in this study can be used in further studies for verification.
MATERIAL AND METHODS
The data file (Cosmic non-coding variants) of release v87 and genome version GRCh38 was downloaded from the COSMIC (Catalogue of Somatic Mutations in Cancer) database. The file contains the complete data of non-coding mutations in different types of cancer. The first step was to extract all the non-coding mutations that were reported in liver cancer. The complete methodology of this study is illustrated in Figure 1.
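The extraction step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a tab-separated file whose primary-site column is named `Primary site` (an assumption about the file layout), and the rows in `example` are made-up toy data.

```python
import csv
import io

def filter_by_primary_site(tsv_text, site="liver", site_column="Primary site"):
    """Return the rows whose primary-site column equals `site` (case-insensitive).

    `site_column` is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row[site_column].strip().lower() == site]

# Toy input standing in for the downloaded non-coding variants file.
example = (
    "Primary site\tgenome position\n"
    "liver\tchrA:1260-1260\n"
    "breast\tchrB:50-50\n"
    "Liver\tchrA:1260-1260\n"
)
liver_rows = filter_by_primary_site(example)
```

For a file of realistic size, the same filter would be applied while streaming the file line by line rather than loading it into memory.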
Identification of consistent mutations at HepG2 Transcription factors binding sites
The consistent (recurrent) non-coding mutations were found using customized Python code. The next step was to determine whether these recurrent non-coding mutations lie at transcription factor binding sites (TFBS). The data files of transcription factor binding sites for HepG2 cells were downloaded from UCSC ENCODE. The binding regions obtained from ChIP-Seq experiments are large; therefore, in order to obtain more precise results, their size was reduced to 100 base-pairs. The TFBS files with actual and reduced size were then each overlapped with the consistent non-coding mutations to identify which transcription factors (TFs) bind at those mutations. The mutations were then searched in the VISTA Enhancer Browser to determine whether they fall within an identified human gene enhancer. The list of 1912 elements with enhancer activity for human was downloaded from the VISTA Enhancer Browser. All downloaded files were of genome version GRCh37/hg19. These files were then converted to genome version GRCh38/hg38 using UCSC Genome Assembly.
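The counting and overlap steps described above can be illustrated with a short sketch. It is not the customized code used in the study: the function names are hypothetical, peaks are assumed to be shrunk symmetrically around their midpoint (the text does not say how the 100 base-pair window was placed), all coordinates are treated as 0-based half-open for simplicity, and the peaks and positions are toy values.

```python
from collections import Counter

def consistency_counts(mutations):
    """Count how many times each (chrom, pos) mutation is reported."""
    return Counter(mutations)

def resize_peak(start, end, width=100):
    """Shrink a peak to `width` bp centred on its midpoint (half-open interval)."""
    mid = (start + end) // 2
    new_start = max(0, mid - width // 2)
    return new_start, new_start + width

def tfs_at(chrom, pos, peaks, width=None):
    """Names of TFs whose peak (optionally resized to `width`) covers the position."""
    hits = set()
    for p_chrom, p_start, p_end, tf in peaks:
        if p_chrom != chrom:
            continue
        if width is not None:
            p_start, p_end = resize_peak(p_start, p_end, width)
        if p_start <= pos < p_end:
            hits.add(tf)
    return sorted(hits)

# Toy data: one mutation reported three times, two overlapping ChIP-Seq peaks.
mutations = [("chrA", 1260)] * 3 + [("chrB", 50)]
peaks = [("chrA", 1000, 1500, "TF1"), ("chrA", 1100, 2100, "TF2")]

counts = consistency_counts(mutations)
full_size_tfs = tfs_at("chrA", 1260, peaks)           # both peaks cover the site
reduced_tfs = tfs_at("chrA", 1260, peaks, width=100)  # only the TF1 core does
```

The toy case shows why the number of bound TFs can drop when the peaks are reduced: a wide peak may cover a mutation that lies far from its centre.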
Significance of reported non-coding mutations
The significance of each reported non-coding mutation was assessed by calculating a score and an empirical p-value on the basis of its consistency and the number of transcription factors binding at its position. For scoring, equal points (5 each) were assigned to both components. The highest consistency was 410 and the lowest was 1. Because the second highest consistency was only 15, the mutation with consistency 410 was considered an outlier and ranking was done from the mutation with consistency 15. The maximum and minimum numbers of TFs binding were 39 and 1, respectively. The following formula was used for scoring non-coding mutations:
\[
\text{Score} = \frac{\text{Consistency}}{\text{Maximum consistency}} \times \text{Consistency score} + \frac{\text{No. of TF binding}}{\text{Maximum no. of TF binding}} \times \text{TF binding score}
\]
Where,
Maximum consistency = 15, Consistency score = 5, Maximum no. of TF binding = 39, TF binding score = 5
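As a worked illustration of the scoring, the sketch below assumes the score is the sum of two normalised terms, (consistency / 15) × 5 + (number of TFs / 39) × 5, which is consistent with the constants listed above. The function name and defaults are hypothetical. Under this assumption a mutation with consistency 1 and one TF scores about 0.461, and one with consistency 15 and three TFs scores about 5.385, matching the lowest and highest scores reported in the Results.

```python
def mutation_score(consistency, n_tf, max_consistency=15, max_tf=39,
                   consistency_points=5.0, tf_points=5.0):
    """Score out of 10: normalised consistency plus normalised TF count."""
    consistency_term = (consistency / max_consistency) * consistency_points
    tf_term = (n_tf / max_tf) * tf_points
    return consistency_term + tf_term

lowest = mutation_score(consistency=1, n_tf=1)    # 1/15*5 + 1/39*5
example = mutation_score(consistency=15, n_tf=3)  # 5 + 3/39*5
```

How the outlier with consistency 410 was scored is not stated in the text, so the sketch does not cap or otherwise special-case it.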
The statistical significance of the results was determined by randomization, in order to reduce bias. For this purpose, 10000 random samples were selected from the complete file of non-coding variants. This file also included mutations with no TF binding in HepG2 cells. The cutoff value (alpha) for significance was set to 0.05. The lower the p-value, the more significant the mutation.
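One common way to turn such a randomization into an empirical p-value is sketched below. The exact procedure used in the study is not described, so this is an assumption: the p-value is taken as the fraction of randomly drawn background scores that are at least as high as the observed score, with a +1 correction so that it is never exactly zero. Function names and the toy background scores are hypothetical.

```python
import random

def draw_null_scores(background_scores, n_samples, seed=0):
    """Draw `n_samples` scores at random (without replacement) from the background."""
    rng = random.Random(seed)
    return rng.sample(background_scores, n_samples)

def empirical_p_value(observed_score, null_scores):
    """Fraction of null scores >= observed, with a +1 correction."""
    exceed = sum(1 for s in null_scores if s >= observed_score)
    return (exceed + 1) / (len(null_scores) + 1)

# Toy background: mostly low-scoring mutations, a few high-scoring ones.
background = [0.33] * 150 + [0.46] * 40 + [5.4] * 10
null = draw_null_scores(background, n_samples=100)

p = empirical_p_value(5.0, null)
is_significant = p < 0.05  # alpha = 0.05, as in the text
```

In the study's setting `n_samples` would be 10000 and the background would be the scores of all mutations in the non-coding variants file.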
Association of genes with non-coding mutations
The genes that lay closest to a large number of non-coding mutations were identified. It was also determined whether these mutations were in the upstream region, the downstream region or within the coding region of these genes. The closest distance of a mutation from the Transcription Start Site (TSS) of the corresponding gene was also recorded.
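A minimal sketch of this gene-association step is given below. It assumes gene records of the form (name, start, end, strand) on the same chromosome as the mutation, takes the TSS as the gene start on the plus strand and the gene end on the minus strand, and follows the convention of Table III that the distance is written as 0 for mutations inside a gene. The gene names and coordinates are hypothetical.

```python
def classify_relative_to_gene(pos, gene_start, gene_end, strand):
    """Return (location, distance_to_TSS) of a mutation relative to one gene."""
    if gene_start <= pos <= gene_end:
        return "within", 0  # convention: distance 0 inside the gene
    tss = gene_start if strand == "+" else gene_end
    before_gene = pos < gene_start
    # Upstream means before the start on '+' and after the end on '-'.
    upstream = before_gene if strand == "+" else not before_gene
    return ("up" if upstream else "down"), abs(pos - tss)

def nearest_gene(pos, genes):
    """Pick the gene whose TSS is closest to the mutation."""
    best = None
    for name, start, end, strand in genes:
        location, distance = classify_relative_to_gene(pos, start, end, strand)
        if best is None or distance < best[2]:
            best = (name, location, distance)
    return best

# Toy annotation with two hypothetical genes on one chromosome.
genes = [("GENE_A", 1000, 2000, "+"), ("GENE_B", 5000, 6000, "-")]
hit_upstream = nearest_gene(940, genes)
hit_minus = nearest_gene(6100, genes)
hit_inside = nearest_gene(1500, genes)
```

Counting how often each gene name is returned over all mutations then gives the genes with the greatest number of nearby mutations.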
Mapping non-coding mutations with CTCF binding sites
CTCF is a transcription factor which acts as an activator, repressor or insulator protein. It controls gene expression either by insulating enhancers or by activating or repressing promoters, as it is able to bind a wide range of sequences. This diversified role of CTCF has led researchers to map its binding sites in different species [20]. Therefore, the non-coding mutations were mapped against CTCF binding sites of HepG2 cells. Before mapping, the non-coding mutations were grouped into clusters. For each cluster, the maximum distance between mutations was set to 100 base-pairs, i.e., mutations lying within 100 base-pairs of each other were combined into one cluster. Overlapping clusters were also combined.
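The clustering rule can be sketched as a single pass over sorted positions. This is an illustrative implementation under the assumption that "within 100 base-pairs" means a gap of at most 100 between neighbouring mutations on the same chromosome; chaining neighbours in this way also merges clusters that would otherwise overlap. The function name and the positions are hypothetical.

```python
def cluster_positions(positions, max_gap=100):
    """Group positions on one chromosome so neighbours are at most `max_gap` apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # extend the current cluster
        else:
            clusters.append([pos])     # start a new cluster
    return clusters

# Toy positions; repeated positions are kept so recurrent mutations still count.
clusters = cluster_positions([1000, 100, 240, 150, 500, 1090, 150])
sizes = [len(c) for c in clusters]
```

Each cluster can then be reported as the interval from its first to its last position, together with the number of mutations it contains.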
Graphical analysis of important non-coding mutations and clusters
The graphical profiles of important non-coding mutations and clusters were obtained from the UCSC Genome Browser (https://genome.ucsc.edu), which provides annotations for specific regions of a genome. The browser is highly customizable, so that only relevant information is displayed. The regions showing variation in the results were selected for analysis. Only a few HepG2 TFs (CTCF, FOXA1, SP1 and SIN3A) were displayed from the regulation feature because of the limited window. The conservation track was also selected; it shows regions that are most likely conserved across different species.
Analysis of Ras/MAPK signaling pathway of Hepatitis B virus
The mitogen-activated protein kinase (MAPK) pathway plays a significant role in the survival and growth of cells. It also regulates the expression of genes [21]. Any abnormality in the Ras/MAPK signaling pathway may lead to resistance to apoptosis, causing increased and uncontrolled cell proliferation. Several studies have shown its involvement in some cancers [22]. Ras/MAPK is also activated in 50-100% of cases of primary liver cancer (HCC) [23]. Therefore, it is considered a potential target for treating HCC. In this study, mutations reported near genes involved in the MAPK signaling pathway were identified.
RESULTS
The following section summarizes the results obtained from this study.
Consistent mutations at HepG2 Transcription factors binding sites
The complete list of identified non-coding mutations is given in Supplementary file 1. Some non-coding mutations that are bound by HepG2 TFs with both actual and precise (100 base-pairs) size are shown in Table I. The highest consistency was 410. The second highest consistency, at another genomic position, was 15, which is far lower than 410. It was also observed that the number of TFs binding at a specific location changed greatly when the size of the TFBS was reduced to 100 base-pairs. Table I also gives information about non-coding mutations that were located within VISTA Enhancer Browser elements. It indicates that the mutations located in regions with enhancer activity had low consistency. The mutation with consistency 4 was located within an enhancer region whose bracketing gene is RCAN1. This location is also a TF binding site, where 5 TFs bind. Another mutation, located within an enhancer region of NDRG4, was bound by 7 TFs.
Significance of non-coding mutations
The non-coding mutations were ranked on the basis of their scores and p-values (Supplementary file 1). Table II shows the scores and p-values of some non-coding mutations. The highest score was 5.385 out of 10, while the lowest score was 0.461. Some mutations in Table II were not highly consistent but still had high scores because they were bound by a large number of TFs, while other mutations were consistent but bound by only a few TFs. A few mutations had similar scores although their consistency and number of bound TFs differed. Many mutations in Table II are also statistically significant (p-value less than 0.05). However, the p-value of a few mutations was above 0.05, which means that those mutations are not significant.
Association of genes with non-coding mutations
Table III lists the genes that were closest to the greatest numbers of non-coding mutations. It was found that 75 non-coding mutations were near the ALB gene. Some of them were in the upstream region, while others were within the coding region of ALB. Mutations were also reported in the upstream and coding regions of the SYN3 gene. The mutations near the MLLTP10P1, CNTNAP2, NPAS3 and LSAMP genes were in upstream, downstream and coding regions. The PLCB1, LINC00511, LINC01410 and WWOX genes had non-coding mutations in their downstream and coding regions. For some genes, i.e., EYS, ZFHX3, PTPRN2 and AC0976344, the mutations occurred only within the coding regions.
Mapping non-coding mutations with CTCF binding sites
The results of mapping against the CTCF binding sites of HepG2 cells are shown in Table IV. In total, 49492 clusters were formed. Some clusters contained a large number of non-coding mutations. Table IV shows that cluster number 48111 has the highest number of non-coding mutations, i.e., 17, followed by cluster numbers 17609 and 7433 with 15 and 11 mutations, respectively. Other important clusters (2565, 29170 and 32451) have 7, 6 and 6 mutations. In the majority of clusters, no CTCF binding site lay between the mutation and the TSS of the gene. In two clusters, CTCF was found to bind between all reported non-coding mutations and the TSS of the gene.
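The check of whether CTCF binds between a mutation and the TSS can be sketched as an interval test. This is an assumed formulation, since the exact criterion is not given: a CTCF site counts as "between" when it overlaps the open interval spanned by the mutation and the TSS on the same chromosome. Function names, coordinates and sites are hypothetical.

```python
def ctcf_between(mutation_pos, tss_pos, ctcf_sites):
    """True if any CTCF site (start, end) overlaps the span between mutation and TSS."""
    lo, hi = sorted((mutation_pos, tss_pos))
    return any(start < hi and end > lo for start, end in ctcf_sites)

def ctcf_between_all(cluster_positions, tss_pos, ctcf_sites):
    """True only if CTCF lies between every mutation of the cluster and the TSS."""
    return all(ctcf_between(pos, tss_pos, ctcf_sites) for pos in cluster_positions)

# Toy example: one CTCF site at 400-600 and a TSS at position 1000.
sites = [(400, 600)]
separated = ctcf_between_all([100, 150], 1000, sites)   # site lies in between
partly = ctcf_between_all([100, 700], 1000, sites)      # 700 is past the site
```

Under this reading, a 'Yes' in Table IV corresponds to `ctcf_between` returning true for the mutation and the TSS of its nearest gene.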
Graphical analysis of important non-coding mutations and clusters
The selected genomic regions are shown graphically in Figures 2 and 3. Figure 2 represents individual non-coding mutations, whereas Figure 3 represents clusters containing a large number of non-coding mutations. All parts of Figures 2 and 3 indicate the presence of clinical variants in the given genomic regions. The red and blue bars indicate copy number variation: red bars mark variants with loss of genetic material, whereas blue bars mark gain of genetic material. This suggests that these regions are clinically significant as well. The genes expressed near these locations are also displayed. In Figure 3, some regions are also found to be conserved among different species. These conserved regions are generated from pair-wise alignments. The cluster shown in Figure 3 (B) has CTCF binding, which agrees with the result shown in Table IV.
Analysis of Ras/MAPK signaling pathway of Hepatitis B virus
The non-coding mutations near genes that take part in the Ras/MAPK signaling pathway are shown in Figure 4, which is taken from the KEGG pathway database. Figure 4 shows that mutations are reported near most of the genes. The highest number of mutations was reported close to the PKC gene. Other genes with larger numbers of non-coding mutations near them include STAT3 and Grb2. For a few genes, i.e., Raf, MEK, CBP and ELK1, no nearby mutations were reported.
DISCUSSION
Diseases like cancer can often be prevented. The risk factors and causes of most cancers are known, and this knowledge can be used to avoid many cancer-related deaths. In the case of liver cancer, viral hepatitis is the most common risk factor, so the risk of developing liver cancer can be reduced by active treatment of viral hepatitis. Today, only a small proportion of these patients are successfully treated because the disease is diagnosed late. It is therefore necessary to identify biomarkers for predicting liver cancer at early stages. The main focus of this study is on non-coding mutations that occur at transcription factor binding sites of HepG2 cells.
Transcription factors are proteins that bind to cis-regulatory elements. They regulate various cellular processes and control gene expression levels. If mutations occur at the binding sites of transcription factors, the binding of TFs to their sites may be disrupted. As a result, gene expression will be affected: the expression level of the gene may be either increased or reduced. Therefore, mutations at TF binding sites can be regarded as candidate driver mutations.
Table I shows that four TFs were binding at a highly consistent location (5:1295113-1295113), but when the binding-site size was reduced only one TF (GABP) bound there. Similarly, 13 TFs were found to bind at another consistent location (22:40856967-40856967), but this number fell to three after the reduction in size. It was also observed that the mutations at TF binding sites within enhancer regions were not highly consistent. Their consistency was 2, which suggests that the nucleotide changes at these TF binding sites occur randomly. Therefore, these mutations may be considered random mutations. In Tables I and II, there are a few mutations that were not consistent but were still bound by a large number of TFs with both actual and precise sizes, and they had high scores as well. These mutations may be important because, if they recurred in larger numbers, they could contribute to disease. The significance of mutations can be inferred from the p-value. Some mutations were not statistically significant because, for consistency, all alleles that were mutated at least once were considered, whereas for TF binding, alleles with no TF binding were considered along with TF-bound alleles.
The genes listed in Table III and Table IV have not been identified as driver genes in liver cancer, yet large numbers of non-coding mutations are reported near them. This implies that they may have some importance in liver cancer development. In Table III, the closest distance from the TSS of some genes was very small; for example, for the MLLTP10P1 gene a mutation was reported at a distance of 60 base-pairs from the TSS. Similarly, the closest distances of mutations from the TSS of the WWOX, SYN3 and ALB genes were below 500 base-pairs.
In approximately 70% of cases, the regulatory region of a gene has been found to lie within 100 kb of it [24]. The coding region of one gene can be a regulatory region for another gene. Therefore, mutations reported within the coding regions of some genes are significant as well: they may be coding with respect to the gene in which they lie and non-coding with respect to any gene located upstream or downstream. In Table IV, the clusters where no CTCF binding site was present between the mutation and the TSS might be considered regulatory regions of the corresponding genes. The mutations reported in these regulatory regions are therefore of particular interest, as they may have the potential to drive the disease. However, the clusters where CTCF was binding between the TSS and the mutations cannot be regarded as regulatory regions for those particular genes.
Figure 2 shows that no gene is expressed at the location of mutation 20:17859269-17859269, whereas the SLCO1B3 gene is expressed at location 12:20815732-20815732. In Figure 3, the regions that are conserved among different species may be mutated in those species as well. There are more conserved regions in Figure 3 (A) than in Figure 3 (B). The bars for HepG2 TFs are displayed only where the corresponding TF binds. The darkness of these bars indicates how strongly a location is enriched for the specific TF. CTCF binding in cluster 1:152018685-152018775 suggests that the mutations reported in that region are not in regulatory regions of specific genes. Figures 2 and 3 support the results obtained. They indicate that the regions selected for graphical analysis are important and can be considered as candidate epigenetic markers for predicting liver cancer. However, detailed analysis is required for better understanding.
Ras, Raf, MEK and ERK are signaling molecules in the Ras/MAPK signaling pathway. These molecules activate the pathway, which results in gene transcription. The transcribed genes code for proteins that are involved in cellular growth and proliferation. Figure 4 indicates that no non-coding mutation is reported near the Raf and MEK genes, but they might carry coding mutations.
CONCLUSION
The present study provides a comprehensive analysis of non-coding mutations using bioinformatics tools. The identification of recurrent/consistent somatic mutations at TF binding sites among non-coding variants suggests that they may play a significant role in driving Hepatocellular Carcinoma (HCC). This information will help in analyzing non-coding regions that contribute to the development of liver cancer. The results of this study are also useful for designing appropriate research strategies, because mutations in non-coding regions are more likely to affect the regulatory elements of genes. They may also cause structural variations in genes, resulting in gene disruption. The identified pathogenic alleles can be considered as candidate biomarkers for liver cancer diagnosis and prognosis. They may also act as therapeutic targets for treatment of liver cancer. However, further assessment is required to confirm these results.
SUPPLEMENTARY MATERIAL
Supplementary file 1 (9.15 MB)
Sheet 1: Identified significant non-coding mutations sorted on the basis of their scores
Sheet 2: Significant non-coding mutations on the basis of empirical p-value
ACKNOWLEDGEMENTS
This research was funded by Higher Education Commission (HEC) Pakistan and Ministry of Planning Development and Reforms under National Center in Big Data and Cloud computing at Exascale Open Data Analytics Lab (Genomics Lab) NED University of Engineering & Technology.
AUTHOR DISCLOSURE STATEMENT
No competing financial interests exist.
REFERENCES
[1] F. Bray, J. Ferlay, I. Soerjomataram, et al., “Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.,” CA. Cancer J. Clin., vol. 68, no. 6, pp. 394–424, 2018. https://doi.org/10.3322/caac.21492
[2] American Cancer Society, “Cancer Facts & Figures 2019,” Atlanta: American Cancer Society, 2019. Available at: www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2019/cancer-facts-and-figures-2019.pdf. Accessed May 5, 2019.
[3] I. Tischoff and A. Tannapfel, “Hepatocellular carcinoma and cholangiocarcinoma – Different prognosis, pathogenesis and therapy,” Zentralbl. Chir., vol. 132, no. 4, pp. 300–305, 2007. https://doi.org/10.1055/s-2007-981195
[4] S. Yapali and N. Tozun, “Epidemiology and viral risk factors for hepatocellular carcinoma in the Eastern Mediterranean countries,” Hepatoma Res., vol. 4, no. 6, pp. 24–33, 2018. https://doi.org/10.20517/2394-5079.2018.57
[5] A. Arzumanyan, H. M. G. P. V. Reis, and M. A. Feitelson, “Pathogenic mechanisms in HBV- and HCV-associated hepatocellular carcinoma,” Nat. Rev. Cancer, vol. 13, no. 2, pp. 123–135, 2013. https://doi.org/10.1038/nrc3449
[6] P. P. Michielsen, S. M. Francque, and J. L. van Dongen, “Viral hepatitis and hepatocellular carcinoma,” World J. Surg. Oncol., vol. 3, no. 27, pp. 1–18, 2005. https://doi.org/10.1186/1477-7819-3-27
[7] M. Kew, “Hepatocellular carcinoma: epidemiology and risk factors,” J. Hepatocell. Carcinoma, vol. 1, pp. 115–125, 2014. https://doi.org/10.2147/JHC.S44381
[8] ENCODE Project Consortium, “An integrated encyclopedia of DNA elements in the human genome,” Nature, vol. 489, no. 7414, pp. 57–74, 2012. https://doi.org/10.1038/nature11247
[9] J. R. Pon and M. A. Marra, “Driver and Passenger Mutations in Cancer,” Annu. Rev. Pathol. Mech. Dis., vol. 10, no. 1, pp. 25–50, 2015. https://doi.org/10.1146/annurev-pathol-012414-040312
[10] M. R. Mansour et al., “An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element,” Science, vol. 346, no. 6215, pp. 1373–1377, 2014. https://doi.org/10.1126/science.1259037
[11] X. S. Puente et al., “Non-coding recurrent mutations in chronic lymphocytic leukaemia.,” Nature, vol. 526, no. 7574, pp. 519–524, 2015. https://doi.org/10.1038/nature14666
[12] M. J. Ellis et al., “Whole-genome analysis informs breast cancer response to aromatase inhibition,” Nature, vol. 486, no. 7403, pp. 353–360, 2012. https://doi.org/10.1038/nature11143
[13] R. Katainen et al., “CTCF/cohesin-binding sites are frequently mutated in cancer,” Nat. Genet., vol. 47, no. 7, pp. 818–821, 2015. https://doi.org/10.1038/ng.3335
[14] C. Melton, J. A. Reuter, D. V. Spacek, and M. Snyder, “Recurrent somatic mutations in regulatory regions of human cancer genomes,” Nat. Genet., vol. 47, no. 7, pp. 710–716, 2015. https://doi.org/10.1038/ng.3332
[15] P. S. Rachakonda et al., “TERT promoter mutations in bladder cancer affect patient survival and disease recurrence through modification by a common polymorphism,” Proc. Natl. Acad. Sci., vol. 110, no. 43, pp. 17426–17431, 2013. https://doi.org/10.1073/pnas.1310522110
[16] J. E. Eckel-Passow et al., “Glioma Groups Based on 1p/19q, IDH, and TERT Promoter Mutations in Tumors,” N. Engl. J. Med., vol. 372, no. 26, pp. 2499–2508, 2015. https://doi.org/10.1056/NEJMoa1407279
[17] I. Hosen et al., “TERT promoter mutations in clear cell renal cell carcinoma,” Int. J. Cancer, vol. 136, no. 10, pp. 2448–2452, 2015. https://doi.org/10.1002/ijc.29279
[18] K. Schulze et al., “Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets,” Nat. Genet., vol. 47, no. 5, pp. 505–511, 2015. https://doi.org/10.1038/ng.3252
[19] A. H. Shain et al., “Exome sequencing of desmoplastic melanoma identifies recurrent NFKBIE promoter mutations and diverse activating mutations in the MAPK pathway,” Nat. Genet., vol. 47, no. 10, pp. 1194–1199, 2015. https://doi.org/10.1038/ng.3382
[20] S. Kim, N. K. Yu, and B. K. Kaang, “CTCF as a multifunctional protein in genome regulation and gene expression,” Exp. Mol. Med., vol. 47, p. e166, 2015. https://doi.org/10.1038/emm.2015.33
[21] T. Knight and J. A. E. Irving, “Ras/Raf/MEK/ERK pathway activation in childhood acute lymphoblastic leukemia and its therapeutic targeting,” Front. Oncol., vol. 4, pp. 1–13, 2014. https://doi.org/10.3389/fonc.2014.00160
[22] L. Santarpia, S. M. Lippman, and A. K. El-Naggar, “Targeting the MAPK-RAS-RAF signaling pathway in cancer therapy,” Expert Opin. Ther. Targets, vol. 16, no. 1, pp. 103–119, 2012. https://doi.org/10.1517/14728222.2011.645805
[23] B. Delire and P. Stärkel, “The Ras/MAPK pathway and hepatocarcinoma: Pathogenesis and therapeutic implications,” Eur. J. Clin. Invest., vol. 45, no. 6, pp. 609–623, 2015. https://doi.org/10.1111/eci.12441
[24] M. Yaragatti, C. Basilico, and L. Dailey, “Identification of active transcriptional regulatory modules by the functional assay of DNA from nucleosome-free regions,” Genome Res., vol. 18, no. 6, pp. 930–938, Jun. 2008. https://doi.org/10.1101/gr.073460.107
Figure legends
Figure 1: Schematic diagram of methodology followed for analysis of non-coding mutations. The arrows represent final results. The lines represent files used for corresponding operation.
Figure 2: Graphical profiles of significant mutations. (A) Represents mutation 20:17859269-17859269, (B) Represents mutation 12:20815732-20815732. The red bars and blue bars in clinical variants represent copy number loss and gain, respectively. The green bar in ‘ClinVar Short Variant’ represents a benign clinical variant.
Figure 3: Graphical profiles of significant clusters (A) Represents cluster 11:62841559-62841872, (B) Represents cluster 1:152018685-152018775. The Gencode v29 track displays basic genes present close to the given cluster. The Conservation tracks ‘Cons 100 Verts’ track and ‘Multiz Alignment of 100 vertebrates’ display regions that are conserved in multiple species in condensed form.
Figure 4: Ras/MAPK signaling pathway of Hepatitis B virus taken from KEGG pathway. The numbers of mutations that occurred near genes are mentioned in red beside gene names.
Table Legends
Table I: Non-coding mutations identified at TF binding sites of HepG2 cells. The column ‘Bracketing Gene in Enhancer Vista Browser’ provides names of genes showing enhancer activity where identified non-coding mutations were present.
Table II: Significance of non-coding mutations on the basis of their scores and empirical p-values. The scoring formula and calculations of p-values were based on the consistency of a particular mutation and the number of transcription factors (TFs) binding there with precise size (100 base-pairs).
Table III: Genes located closest to the greatest numbers of non-coding mutations. The terms ‘up’ and ‘down’ represent upstream and downstream regions of genes. The closest distance from the Transcription Start Site (TSS) was written as 0 for mutations present within coding regions of genes.
Table IV: Mapping of clusters having a great number of non-coding mutations with CTCF binding sites of HepG2 cells. ‘Yes’ is written when CTCF binds between the mutation and the TSS of the gene, and ‘No’ otherwise.