Altered metabolic pathways in clear cell renal cell carcinoma: A meta-analysis and validation study focused on the deregulated genes and their associated networks.

Clear cell renal cell carcinoma (ccRCC) is the predominant subtype of renal cell carcinoma (RCC). It is one of the most therapy-resistant carcinomas, responding very poorly or not at all to radiotherapy, hormonal therapy and chemotherapy. A more comprehensive understanding of the deregulated pathways in ccRCC can lead to the development of new therapies and prognostic markers. We performed a meta- analysis of 5 publicly available gene expression datasets and identified a list of co- deregulated genes, for which we performed extensive bioinformatic analysis coupled with experimental validation on the mRNA level. Gene ontology enrichment showed that many proteins are involved in response to hypoxia/oxygen levels and positive regulation of the VEGFR signaling pathway. KEGG analysis revealed that metabolic pathways are mostly altered in ccRCC. Similarly, Ingenuity Pathway Analysis showed that the antigen presentation, inositol metabolism, pentose phosphate, glycolysis/gluconeogenesis and fructose/mannose metabolism pathways are altered in the disease. Cellular growth, proliferation and carbohydrate metabolism, were among the top molecular and cellular functions of the co-deregulated genes. qRT-PCR validated the deregulated expression of several genes in Caki-2 and ACHN cell lines and in a cohort of ccRCC tissues. NNMT and NR3C1 increased expression was evident in ccRCC biopsies from patients using immunohistochemistry. ROC curves evaluated the diagnostic performance of the top deregulated genes in each dataset. We show that metabolic pathways are mostly deregulated in ccRCC and we highlight those being most responsible in its formation. We suggest that these genes are candidate predictive markers of the disease.


INTRODUCTION
Renal-cell carcinoma (RCC) is the most common kidney neoplasm, comprising 3% of all adult malignancies [1]. Its incidence has increased steadily over the past 20 years in the United States and Europe; 35,000 new cases and 12,000 deaths now occur annually in the United States alone. If detected in early stages, it is potentially curable by surgical resection; however, to date there is no curative treatment for metastatic RCC. Clear cell renal cell carcinoma (ccRCC) represents the most common subtype (83%) of RCC [2]. The most striking phenotypic feature of ccRCC is its clear cell morphology, which has been linked to lipid and glycogen accumulation [3]. Moreover, ccRCC is one of the most therapy-resistant carcinomas, responding very poorly or not at all to radiotherapy, hormonal therapy and chemotherapy. All these facts emphasize the importance of developing early diagnostic markers for this particular RCC subtype.
Meta-analysis consists of statistical techniques to combine results from several studies in order to increase the statistical power and reproducibility compared with any single study [4]. Microarray gene expression profiling has been used in the past by various groups [5][6][7][8][9][10][11][12][13] to distinguish the various histological subtypes of RCC and consequently to identify novel tumor markers. The general procedure identifies markers in accordance with average differential expression level (fold change) and/or some level of significance as measured by the t-test. All these publicly available microarray expression datasets provide a rich resource for genome-wide information on RCC and provide the ideal opportunity to perform a meta-analysis study using a large number of cases.
The Oncomine database [14] is an online collection of microarray expression data from various cancer-related sources. We hypothesized that a meta-analysis of several gene expression datasets of ccRCC can give a potentially significant list of co-deregulated genes (co-DEGs) in ccRCC, which is important to define pathways in which the genes of interest are involved. To increase the likelihood of revealing truly significant deregulated genes, which should have higher potentials to be used as novel markers for the disease, we analyzed their expression profile over 5 independent studies, greatly increasing the significance of results.

Identification of candidate ccRCC markers for network analysis
The workflow of our study is summarized in Figure  1. Data mining of five microarray datasets from the Figure 1: Workflow of the study. Initially, five Oncomine microarray datasets were compared and the co-deregulated genes (co-DEGs) among them were retrieved. The co-DEGs were further enquired regarding their use as candidate markers for ccRCC. Next, the canonical pathways in which these co-DEGs are implicated were identified, as well as the networks that they form, and the top deregulated molecules among them. Following, validation of the deregulated expression levels of these genes was performed both in clear cell renal cell carcinoma cell lines, as well as in a cohort of ccRCC patients. Immunohistochemistry was performed in biopsies from the patient cohort for the top deregulated genes. ROC analysis was used to evaluate the discriminatory potential of the candidate biomarker genes. Further enrichment analysis was finally performed for the co-deregulated genes.
Oncomine repository for co-deregulated genes in ccRCC vs. their non-tumor kidney tissue, led to the identification of a list of 93 up-and 76 down-regulated genes, respectively. These genes belonged simultaneously within the top 1% of the up-or down-regulated genes, in at least two datasets, with a p<1E-4 and fold change>2 (Table S1). None of the DEGs belonged simultaneously to the top 1% of the up-regulated genes, in all 5 datasets. The most coup-regulated gene among 4 datasets was BTN3A3. Twenty genes were co-up-regulated among 3 datasets and 93 genes were co-up-regulated between 2 datasets, respectively (Figure 2.A). A similar approach was performed in order to identify the genes that belonged to the top 1% of the down-regulated genes, simultaneously in all 5 datasets. KCNJ1 and TMPRSS2 were the top co-down-regulated genes, exhibiting low levels of expression in 4 datasets. Fifteen and 81 genes were co-down-regulated in 3 and 2 datasets, respectively (Figure 2.B).
Top up-regulated genes Co-Up-regulated among four datasets Co-up-regulated among three datasets Co-down-regulated among four datasets

Diagnostic performance
A ROC test was performed for the top 20 DEGs in ccRCC using the extracted MAS5-calculated signal intensity values of each gene from each GEO datasets ( Table 2). The best discriminatory genes (AUC>0.75 and p<0.05) between ccRCC and the normal tissue in each dataset are depicted in Figure 7.

Validation of the DEGs in ccRCC cell lines and a ccRCC patient cohort
In order to validate the DEGs we used two ccRCC cell lines (ACHN and Caki-2) and the HEK-293 cells were used as a control. We also validated the DEGs in a cohort of 10 ccRCC freshly frozen patient samples. The significantly reduced expression levels of ARHGDIB, NKG7, IGFBP3, LAIR1, RNASET, TMPRSS2 and EGLN3 and the significantly high expression of AHNAK2 were validated in the ACHN vs. HEK-293 cells. Furthermore, PDIA5, ARHGDIB and ATP2B4 down-regulation EGLN3 along with NDUFA4L2 and AHNAK2 up-regulation was validated in the Caki-2 vs. HEK-293 cells ( Figure 8).

Upregulation of NNMT and NR3C1 expression in renal biopsies
Immunohistochemical (IHC) staining of NNMT and NR3C1 proteins was examined in FFPE biopsy samples of 24 confirmed ccRCC patients. Both NNMT and NR3C1 expression was increased in the kidney sections of these patients as compared to biopsies from non-cancerous kidney tissue ( Figure 11).

Further Enrichment analyses
GO enrichment for the top DEGs revealed genes that participate in biological processes such as response to organic substance (adjP=0.0002), response to chemical stimulus (adjP=0.0055), excretion (adjP=0.0067), regulation of epithelial cell proliferation (adjP=0.0067), response to hormone stimulus (adjP=0.0067), response to steroid hormone stimulus (adjP=0.0067), response to oxygen levels (adjP=0.0081), response to endogenous stimulus (adjP=0.0081), response to hypoxia (adjP=0.0081) and epithelial cell proliferation (adjP=0.0081). GO enrichment for the co-upregulated and the co-downregulated genes was also performed (Table  S2). Among the most important biological processes of the up-regulated genes, response to hypoxia/oxygen levels (EGLN3, VEGFA, CASP1, FLT1 and CA9) and positive regulation of the vascular endothelial growth factor receptor signaling pathway (VEGFA and FLT1) could be discriminated.

DISCUSSION
In the present study, we performed a meta-analysis in order to identify deregulated genes in ccRCC compared to the normal kidney and to measure their discriminatory capability between the two tissues. For this purpose, we extracted data from five Oncomine datasets and analyzed them, accordingly. A list of co-deregulated genes among the five microarray datasets was identified and IPA revealed the canonical pathways in which they are implicated, as well as the networks that they form and their associated functions. The deregulated expression pattern of these genes was validated in two ccRCC cell lines and in a cohort of ccRCC patients. We also attempted to gain a better understanding of the molecular pathways/ mechanisms involved in ccRCC, through comprehensive bioinformatics analyses.
Knowledge of gene regulatory networks is considered to be of crucial importance for the understanding of diseases such as cancer, as it may lead to new therapeutic approaches. Our investigation revealed that the top 1% of the co-DEGs take part in the Antigen Presentation pathway, the Inositol Metabolism pathway, the Pentose Phosphate pathway, in Glycolysis/ Gluconeogenesis, as well as in the Metabolism of Fructose and Mannose. Similar observations corroborating our results were recently reported by others [15][16][17][18][19]. White et al. [15] used three independent algorithms and also showed that the aforementioned pathways are among the  top deregulated pathways in metastatic RCC.
In the current study, we found that there was a significant overlap (77 genes) between genes deregulated in ccRCC and those reported to be deregulated in other cancers, suggesting that different cancer types share common pathways. Furthermore, 32 of them play a significant role in renal and urological diseases. The most significant molecular and cellular functions of these co-DEGs involved cell-to-cell signaling, cellular function and maintenance, molecular transport, cellular growth and proliferation.
Our analysis revealed that the antigen presentation pathway was the most altered pathway in ccRCC and BTN3A3 was the most up-regulated gene among all five ccRCC datasets. The butyrophilin (BTN) genes are a group of major histocompatibility complex (MHC)associated genes that encode type I membrane proteins with 2 extracellular immunoglobulin (Ig) domains and an intracellular B30.2 (PRYSPRY) domain. Autoantibodies that are produced against tumor-associated antigens attract a growing interest for cancer detection, differential diagnostics and prognosis. In line with our analysis, BTN3A3 antigen and its cognate autoantibody was recently suggested as a valuable target for further analysis as potential cancer biomarker [20].
Extensive data in the literature show that cancer cells reprogram their metabolism and shift from aerobic to anaerobic respiration even in the presence of oxygen. This theory was initially proposed by Warburg over 50 years ago [21] and has been recently refreshed [9,[22][23][24][25]. Reversal of the Warburg effect has also been explored as a novel treatment for cancer [26]. Now considered a hallmark of cancer, metabolic reprogramming of cells results in an unusually high rate of glycolysis and lactate production even in the presence of oxygen. Both our IPA and KEGG enrichments supported that the majority of the deregulated pathways in ccRCC are related to metabolic processes. Glycolysis was among the top deregulated pathways among the five datasets. Our observations are in line with the increasing evidence for the role of altered metabolism in the pathogenesis of renal cancer [2,15,22,[27][28][29]. Our data are also consistent with reports from other cancers, such as metastatic cervical carcinoma and head and neck cancers [30][31][32]. In these works the authors reported increased glycolysis as measured by high levels of lactate in these tumours. In metastasis, it has been hypothesized that the glycolytic phenotype arises as a result of transient hypoxic episodes that occur while cells travel to distant sites through the bloodstream [33]. Ultimately, cells that are able to perform glycolysis and are resistant to hypoxia, will be selectively favored for survival and growth and result in successful metastasis [33]. Another proposed explanation for the shift to glycolysis is that cancer cells may have damage to their mitochondria through acquired mutations, or may even shut down their mitochondria being the powerhouse of the cell and helps regulate apoptosis. In our meta-analysis, the platelet isoform of phosphofructokinase (PFKP) (ATP D-fructose-6-phosphate-1-phosphotransferase) was among the most co-upregulated genes among all ccRCC datasets. Of specific note is that our data show that PFKP participates in almost all of the major deregulated canonical pathways in the disease: the pentose phosphate pathway, glycolysis/ gluconeogenesis and fructose and mannose metabolism pathway. PFK catalyzes the irreversible conversion of fructose-6-phosphate to fructose-1,6-bisphosphate and is a key regulatory enzyme in glycolysis. It has been shown to be abundantly expressed in human tumors and its expression was linked to long-standing observations concerning the apparent coupling of enhanced glycolysis and cell proliferation [34,35].
A comprehensive understanding of the deregulated metabolic pathways in cancer has much potential for the development of novel therapies. For example, the use of glufosfamide (D-19575), a cytotoxic alkylating agent in which isophosphoramide mustard, the cytotoxic metabolite of ifosfamide, is covalently linked to β-Dglucose [8,36], has been studied for its effectiveness as a treatment for cancers alone or in combination with other treatments [37][38][39]. In addition, the identification of novel pathways involved in tumorigenesis may allow for the discovery of new compounds that can induce synthetic lethality. After the discovery that RCC tumors, like many others, depend on glycolysis, Chan et al. [40] determined this dependence to be, in part, a result of induction of the glucose transporter 1 (GLUT1). They identified a compound that inhibited the uptake of glucose through GLUT1, which resulted in cancer cell death. Because GLUT1 dependence was observed only in cancer cells lacking the von Hippel-Lindau (VHL) gene, treatment with the GLUT1 inhibiting compound selectively killed cancer cells and had no effect on normal kidney cells. Exploitation of this dependence has also been shown to be promising for the treatment of other cancers.
The data from the present study corroborate that kidney cancer cells manipulate more than one molecular mechanisms and a number of biological pathways to achieve their aggressive phenotype. Renal carcinoma is made up of a number of cancers that occur in the kidney, each having a different histology, following a different clinical course, responding differently to therapy and caused by different genes. Here, we highlight that ccRCC is fundamentally a metabolic disorder. Understanding the mechanisms that lead to altered metabolic pathways in this disease should provide the foundation for the development of novel targeted therapies and development of novel biomarkers. www.impactjournals.com/oncoscience

Data mining and gene expression analysis
The data mining strategy for selecting ccRCC marker genes was based on the Oncomine v4.4.3 cancer microarray platform [14]. Oncomine incorporates more than 628 independent microarray datasets, totaling more than 62,015 microarray experiments and spanning 41 cancer types. It unifies a large compendium of other published cancer microarray data, including Gene Expression Omnibus (GEO) and Stanford Microarray Database (SMD) and uniquely provides differential expression analyses comparing most major types of cancer with their respective normal tissues.

Microarray expression datasets.
Clear cell Renal Cell Carcinoma was used in the <profile search> function in the Oncomine database to find the available microarray datasets related to the specific cancer type. The analysis type <cancer vs. normal> was then applied to filter those microarray datasets exploring cancer relative to its non-tumor kidney tissue. Five publicly available datasets of gene expression profiles were chosen in this study. These were Gumz Renal [ [42]. The datasets were carefully selected with the criterion of using the same platform (Affymetrix HU133A and HU133B) ( Table 4).

Gene selection procedure.
Concept filters in the Oncomine database were used to identify the differentially expressed genes in ccRCC vs. the normal kidney tissue, using the following parameters: p-value=1E-4; Odds Ratio=2.0 and gene rank=top 1%. Next, a corrected Q value threshold (Q≤0.05) was used to filter and retrieve those DEGs with a high confidence. This filtering approach yielded 53 up-and 53 down-regulated genes from the "Higgins Renal", 126 up-and 126 downregulated genes from the "Gumz Renal", 126 up-and 126 down-regulated genes from the "Jones Renal", 177 up-and 177 down-regulated genes from the "Lenburg Renal" and 195 up-and 195 down-regulated genes from the "Yusenko Renal" datasets, respectively. Then, the co-DEGs among the five microarray datasets were selected as candidate genes for downstream network analysis and their expression levels was further validated in ccRCC cell lines as well as in a cohort of 10 ccRCC patient samples. The median fold change (±SD) value among the datasets was calculated and scored (Table S1).

Pathway analysis of the co-deregulated genes
DEGs were investigated for network interrelation by Ingenuity Pathway Analysis, version 7 (IPA; Ingenuity Systems, USA). IPA scans the set of input genes to identify networks by using Ingenuity Pathways Knowledge Base for interactions between identified "Focus Genes." In this study, the DEGs between ccRCC and normal tissue samples and hypothetical interacting genes stored in the knowledge base in IPA software, were used to generate a set of networks with a maximum network size of 35 genes/ proteins. Networks were displayed graphically as genes/ gene products ("nodes") and the biological relationships between the nodes ("edges"). All edges are from canonical information stored in the Ingenuity Pathways Knowledge Base. Networks of these genes were generated by IPA based on their connectivity, each ranked by a score. This score indicates the likelihood of the Focus Genes in a network from Ingenuity's knowledge base being found together due to random chance. It is based on the hypergeometric distribution, calculated with the righttailed Fisher's Exact Test, and corresponds to the negative log of this p-value. A score of 1.5 was set as the cutoff for identifying gene networks. Furthermore, we used IPA in order to identify the top 1% of the deregulated genes and the top canonical pathways in which they participate. Also, IPA was used to reveal the top molecular and cellular functions, as well as the top biological functions of the co-DEGs. Of these genes, the top 10 deregulated molecules were scored.

Gene expression validation in ccRCC cell lines
The human ccRCC cell lines Caki-2 and ACHN (kindly provided by Dr Nicoletta Gagliano [43]) were used. Cells were cultured in RPMI-1640 medium supplemented with 10% fetal bovine serum (FBS), 2 mmol/l glutamine, antibiotics (100 U/ml penicillin, 0.1 mg/ml streptomycin) and amphotericin B (2.5 μg/ml). Cells were incubated at 37°C at 5% CO 2 in 75 cm 2 flasks. The non-cancerous Human Embryonic Kidney cell line (HEK-293) was used as control. Total RNA was isolated from cells at 80% confluency using the Total RNA isolation NucleoSpin RNA II kit (Macherey-Nagel, Duren) and 400 ng were reverse transcribed to cDNA using the ProtoScript M-MuLV first-strand cDNA synthesis kit (New England Biolabs, Ipswich, MA). Real-time PCRs for the validation of the expression profile of the top or co-DEGs were performed in triplicate 20μl reactions on a ViiA™ 7 Real-Time PCR System using SYBR ® Green Fast Master Mix (Applied Biosystems). The primer pairs were designed to span at least one intron in order to avoid amplification of the contaminating genomic DNA along with cDNA. Their sequence and the corresponding PCR product sizes are listed in Table 5. Relative mRNA levels were calculated by the ΔΔCt method [44,45] using ACTB, GAPDH and RPL13A as reference genes. The expression levels of the most deregulated genes were measured in the Caki-2 and ACHN cell lines and compared to the corresponding levels in the HEK-293 cells. The correct size of the PCR products was confirmed by electrophoresis on a 2% agarose gel stained with ethidium bromide. Purity of the amplified products was assessed by melting curve analysis.

Gene expression validation in a cohort of ccRCC patient samples
Tissue samples were obtained from kidney specimens from 10 patients with sporadic ccRCC who underwent a radical tumour nephrectomy. Immediately after resection, the samples were stored at -80°C. Histological classification was performed according to the WHO and staging according to the UICC-TNM classification (2002). Nuclear grade was scored according to the Fuhrman classification system [46]. Informed consent was obtained from all patients and the study protocol was approved by the Ethics Committee of the Asklipieio General Hospital, Athens. A matched normal kidney biopsy was also collected from each patient. Total RNA was isolated from ccRCC and normal kidney samples using the TRIzol ® Reagent and further purified using the RNeasy minikit (Qiagen, Hilden, Germany). RNA yield and quality were determined with a NanoDrop 1000 Spectrophotometer. cDNA synthesis and qRT-PCR was performed in triplicate 10 μL reaction volumes on a 384-plate format of a ViiA™ 7 Real-Time PCR System using the Fast SYBR Green Master Mix (Applied Biosystems, Foster City, CA), as previously described [47][48][49].

Immunohistochemistry (IHC)
Formalin-fixed and paraffin embedded (FFPE) kidney tissue samples from patients with ccRCC were retrieved from the archives of the Asklipieio General Hospital, Athens. All patients in this study underwent a radical tumour nephrectomy. Paraffin sections from each specimen were reviewed by a pathologist, were histologically classified according to the WHO classification and staged according to the UICC-TNM classification (2009). Nuclear grade was scored according to the Fuhrman classification system [46]. Informed Then sections were de-paraffinized using xylene and rehydrated through washes in graded alcohols. Endogenous peroxidase was neutralized using 0.5% v/v hydrogen peroxide/methanol for 10 minutes. Slides were washed with water. Retrieval was carried out using 0.01M citrate retrieval solution (pH 6.0). Sections were subsequently washed with TBS and blocked for 10 minutes using 3% BSA in PBS. Sections were incubated with primary polyclonal antibodies anti-NNMT (PA5-11143) and anti-NR3C1 (PA1-511A) (NNMT, 1:50 dilution; NR3C1, 1:200 dilution) overnight at 4°C. Slides were then washed and incubated with DAKO REAL EnVision, HRP Rabbit/ Mouse (ENV). Subsequently, slides were incubated with 3, 3'-diaminobenzidine (DAB) washed thoroughly in running tap water and counterstained with hematoxylin before being dehydrated and mounted. Haematoxylin and eosin (H&E) staining and IHC for several routinely used ccRCC-specific markers such as AE1/AE3 keratins, Vimentin, Ki67, p53 and S-100 was also performed for all sections.

Diagnostic performance of the top 20 deregulated genes
We performed a Receiver Operating Characteristic (ROC) curves analysis to evaluate the diagnostic performance of the top 20 deregulated genes in each microarray dataset. The MAS5-calculated Signal intensity values extracted from each dataset were used for the analysis. Sensitivity and specificity scores defining the area under the curve (AUC) were used for each candidate gene in order to discriminate between those individuals with ccRCC and those without the disease [50]. ROC curves were constructed using GraphPad Prism v.5 software (San Diego, CA).

Further Enrichment Analysis
The DEGs were further explored for the pathways in which they participate, using the WebGestalt web-tool (http://bioinfo.vanderbilt.edu/webgestalt), as previously reported [44,51]. Specifically, we applied Gene Ontology (GO), KEGG pathways, Wikipathways and Pathway Commons Analysis. Since the knowledge of common transcriptional regulatory networks could potentially lead to the key treatment for ccRCC, our attention was also focused on the transcription factors (TFs) that regulate the co-DEGs in ccRCC. We performed Transcription Factor Target Analysis. The hypergeometric test, with Bonferroni correction was used for enrichment evaluation analysis. The R function adjP was used in order to adjust the nominal p values of the large number of categories at the same time. The significance level for the adjusted p-value was set at 0.01 and the number of minimum genes for a category was set at 2.

Statistical Analysis
Differences in gene expression levels between ccRCC and normal kidney tissues were explored using the paired t-test. Numerical values were expressed as the mean±standard deviation (SD). Statistical significance was set at the 95% confidence level (p<0.05). The statistical package GraphPad Prism v.5 (La Jolla, CA) was used for calculations.