CellLineNavigator - Help

Help Sections

Data access and query tools

Data section:

The data section lets the user explore tissue sites (default) or disease states for all 54613 entries in the CellLineNavigator database. By default, 50 entries are displayed per page; this value can be changed to 100, 250 or 500 per page. Entries can be sorted by Symbol, Transcript, Entrez Gene Id or Probe ID. In addition, entries can be filtered by expression level; the available thresholds are 1.5-fold, 2-fold, 2.5-fold, 3-fold, 4-fold and 5-fold. The default filter lists all genes whose expression level differs by at least 2-fold from the control (see Material / Data Sources). The user may also set the filter to none (no filter) or no regulation (list all genes whose M-values are in the range of -1 to +1).
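The relationship between the fold-change filter and the M-values can be illustrated with a short sketch. It assumes that an M-value is a log2 ratio against the control, so that a 2-fold difference corresponds to |M| >= 1, that the filter is applied symmetrically to up- and down-regulation, and that the boundaries are inclusive. The function name passes_filter and its setting argument are hypothetical and are not part of CellLineNavigator.

```python
import math


def passes_filter(m_value, setting=2.0):
    """Decide whether an M-value (log2 ratio vs. control) passes a filter.

    setting is either a fold change such as 1.5, 2, 2.5, 3, 4 or 5,
    the string "none" (no filter), or the string "no regulation"
    (M-value between -1 and +1). Inclusive boundaries are an assumption.
    """
    if setting == "none":
        return True
    if setting == "no regulation":
        return -1.0 <= m_value <= 1.0
    # A fold change f on the linear scale equals log2(f) on the M-value scale.
    return abs(m_value) >= math.log2(float(setting))


# Hypothetical M-values for three probe sets.
m_values = {"probe_a": 1.4, "probe_b": -0.3, "probe_c": -2.5}
regulated = [name for name, m in m_values.items() if passes_filter(m, 2.0)]
```

With the default 2-fold setting, only probe_a and probe_c are retained in this made-up example, because their absolute M-values are at least 1.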

Search section:

To give users a high degree of flexibility in accessing CellLineNavigator, we implemented an advanced search section that offers Fulltext search and Explore profile options:

  • Fulltext search - Single or multiple gene names or IDs can be entered into the query form to search CellLineNavigator for expression levels within specific cell lines, tissue sites or disease states (or any combination of these).
  • Explore profile - The user may query for specific expression levels classified in the fields of (a combination of all query types is possible):

Download section:

Download the CellLineNavigator data as a tab-separated text file.

Material / Data Sources

Genome-wide expression data, freely available at ArrayExpress as experiment E-MTAB-37, were kindly provided by J. Greshock et al., Laboratory of Cancer Metabolism Drug Discovery, GlaxoSmithKline, USA. The transcript abundance of 317 cancer cell lines was analyzed using the Affymetrix HG-U133 Plus2 GeneChip technology. This chip covers the complete human genome, allowing the analysis of over 45,000 transcripts representing more than 19,000 genes. All data were available in technical triplicates. Corresponding information on tissue site and disease state was provided for each cell line.

Methods

The analyses were implemented in R using the Bioconductor packages affy, hgu133plus2.db and frma (McCall et al. 2011, McCall et al. 2010).

Processing Affymetrix U133 Plus2:

After quality control, two microarray experiments (SNU398 - Replicate 1 and SNU423 - Replicate 2) were excluded from further analysis due to insufficient RNA level detection. All data were normalized using the expresso function of the affy package with the following settings: background adjustment method: mas, normalization method: quantiles, PM adjustment method: mas, and method used for the computation of expression values: medianpolish. Next, we calculated the median expression for each probe set over all cell lines. After the median expression had been calculated for each cancer cell line, these values were used as the control to calculate log2-transformed expression ratios (M-values). M-values representing the expression levels of tissue sites and disease states were calculated accordingly. Official gene symbols and NCBI Entrez GeneIDs were assigned to the data using the hgu133plus2.db package.
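The M-value calculation described above can be sketched as follows. This is an illustration only, not the original R code. It assumes that the input values are on the linear intensity scale, that the control for a probe set is the median over all arrays of all cell lines, and that the value for a cell line is the median of its replicates. The function name compute_m_values and the nested-dictionary layout are hypothetical.

```python
import math
from statistics import median


def compute_m_values(expression):
    """Compute M-values (log2 ratios against a median control).

    expression maps probe set -> cell line -> list of replicate
    intensities on the linear scale (assumed layout).
    Returns probe set -> cell line -> M-value.
    """
    result = {}
    for probe_set, by_cell_line in expression.items():
        # Control: median of this probe set over all arrays (assumption).
        pooled = [v for values in by_cell_line.values() for v in values]
        control = median(pooled)
        result[probe_set] = {}
        for cell_line, values in by_cell_line.items():
            # Median over the technical replicates of one cell line.
            cell_line_median = median(values)
            result[probe_set][cell_line] = math.log2(cell_line_median / control)
    return result


# Made-up intensities for one probe set in three cell lines.
example = {
    "probe_1": {
        "LineA": [100.0, 110.0, 90.0],
        "LineB": [200.0, 210.0, 190.0],
        "LineC": [400.0, 390.0, 410.0],
    }
}
m = compute_m_values(example)
```

If the expression values are already on the log2 scale, the ratio becomes a difference: the M-value is then the cell line median minus the control.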

Gene barcode generation / Z-Score Transformation:

Gene expression barcodes were generated using the frma (default options) and barcode (output: z-score) functions implemented in the frma package (McCall et al. 2011, McCall et al. 2010). An fRMA Z-Score > 5 suggests that a gene is expressed in a particular tissue. The fRMA Z-Score was generated to allow comparison of the expression profiles with data already present at medicalgenomics.org and with other microarray datasets processed with the fRMA method. Finally, the Z-Score was summarized as the mean for each cell line, tissue site and disease state. Official gene symbols and NCBI Entrez GeneIDs were assigned to the data using the hgu133plus2.db package.
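The summarization step and the Z-Score threshold can be sketched as follows. The sketch takes per-array Z-Scores for a single probe set as given and does not reproduce the frma or barcode computations. The function names summarize_z_scores and is_expressed and the example labels are hypothetical.

```python
from statistics import mean


def summarize_z_scores(z_scores, groups):
    """Average per-array Z-Scores within each group.

    z_scores maps array name -> Z-Score for one probe set.
    groups maps array name -> group label (a cell line, tissue site
    or disease state).
    """
    by_group = {}
    for array_name, z in z_scores.items():
        by_group.setdefault(groups[array_name], []).append(z)
    return {label: mean(values) for label, values in by_group.items()}


def is_expressed(z_score, threshold=5.0):
    """Apply the Z-Score > 5 rule of thumb for calling a gene expressed."""
    return z_score > threshold


# Made-up Z-Scores for one probe set measured on five arrays.
z = {"a1": 6.0, "a2": 7.0, "a3": 8.0, "b1": 1.0, "b2": 3.0}
labels = {"a1": "LineA", "a2": "LineA", "a3": "LineA", "b1": "LineB", "b2": "LineB"}
summary = summarize_z_scores(z, labels)
calls = {label: is_expressed(value) for label, value in summary.items()}
```

In this made-up example the probe set is called expressed in LineA (mean Z-Score 7) but not in LineB (mean Z-Score 2).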