{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "suppressPackageStartupMessages(library(\"affy\"))\n",
    "suppressPackageStartupMessages(library(\"affyio\"))\n",
    "\n",
    "\n",
    "################################ functions ##########################################\n",
    "\n",
    "# download and install BrainArray files \n",
    "\n",
    "getBrainArrayFile <-  function(chipname,what,version=\"22.0.0\",gene_id_type=\"ensg\", download_dir=getwd()){\n",
    "    fname <- paste0(chipname,gene_id_type,what,\"_\",version,\".tar.gz\")\n",
    "    link <- paste0(\"http://mbni.org/customcdf/\",version,\"/\",gene_id_type,\".download/\",fname)\n",
    "    #print(link)\n",
    "    # file.path() inserts the separator, so download_dir works with or without a trailing slash\n",
    "    fpath <- file.path(download_dir,fname)\n",
    "    download.file(link, destfile = fpath, method = \"wget\")\n",
    "    suppressWarnings(install.packages(fpath, repos = NULL, type=\"source\"))\n",
    "    libname <- paste0(chipname,gene_id_type,what)\n",
    "    suppressPackageStartupMessages(library(libname,character.only = TRUE))\n",
    "    return(libname)\n",
    "}\n",
    "# for Ensembl:\n",
    "# entrezg -> ensg\n",
    "# .db -> probe\n",
    "installBrainArrayCDFandDB <- function(chipname,version=\"22.0.0\",gene_id_type=\"entrezg\", download_dir=getwd()){\n",
    "    customCDFname <- getBrainArrayFile(chipname,\"cdf\",version=version,\n",
    "                            gene_id_type=gene_id_type, download_dir=download_dir)\n",
    "    # the annotation DB package is downloaded and installed the same way\n",
    "    \n",
    "    customDBname <- getBrainArrayFile(chipname,\".db\",version=version,\n",
    "                                   gene_id_type=gene_id_type, download_dir=download_dir)\n",
    "    return (list(customCDFname=customCDFname, customDBname = customDBname))\n",
    "}\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The effect of CDF and annotation on the results of affy::justRMA()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### There are two crucial reference files required to preprocess raw microarray data, i.e. to get gene expression profiles from .CEL files:\n",
    "\n",
    "* CDF file (or, more precisely, the environment generated from the source file) defines how probes are united into probesets and stores the mapping from probe locations to probesets. In Affymetrix microarrays each probeset includes ~10-20 oligonucleotide probes. Some of them are not specific enough or are designed for QC (names starting with \"AFFX\") and are therefore not suitable for gene expression estimation. Different CDFs define slightly different mappings with completely different probe names. \n",
    "\n",
    "Options:\n",
    " - AnnotationDbi. By default affy imports CDF environments from http://master.bioconductor.org/packages/release/data/annotation/.\n",
    " - BrainArray provides updated probeset definitions (current v22.0.0, Jun 21, 2017) corresponding to the latest genome and transcriptome assemblies, whereas the Affymetrix definitions may be outdated. The impact of the CDF on the results of differential expression analysis may be high:\n",
    " https://onlinelibrary.wiley.com/doi/abs/10.1002/bies.20433\n",
    " https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-8-48\n",
    "\n",
    "\n",
    "* Annotation file or DB contains the mapping of probeset IDs to gene or transcript IDs. Since more than one probeset exists for some genes, per-gene expression is computed as the average over the probesets belonging to the same gene.\n",
    "Options:\n",
    " - AnnotationDbi - annotation is missing for the HuGene-1_0-st-v1 platform.\n",
    " - BrainArray DB - covers approximately 100 more unique genes than AnnotationDbi\n",
    " - biomaRt - allows selecting a specific Ensembl version, e.g. 37.75. Ensembl describes its own pipeline of probe-to-gene mapping https://uswest.ensembl.org/info/genome/microarray_probe_set_mapping.html. No annotation for the HGU-219 platform. \n",
    " \n",
    "### Selection of gene nomenclature\n",
    "Options:\n",
    " - HGNC gene symbols\n",
    "     + human-readable\n",
    "     - not stable, e.g. IL8 has been renamed to CXCL8; ~10% of IDs from PDX do not map to the current HGNC. \n",
    "     + despite that, in many studies gene symbols are the only IDs reported. PDX reports only gene symbols in expression and CNA data tables. \n",
    "     - many IDs do not correspond to genes but to loci associated with a phenotype; this is very confusing for transcriptomics. At the same time, many RNAs and pseudogenes are not represented in this nomenclature.     \n",
    " - Entrez IDs\n",
    "     + numeric gene IDs, seem to be more stable\n",
    "     - not human-readable\n",
    "     \n",
    " - ENSEMBL IDs\n",
    "     + these IDs correspond to a certain transcriptome assembly, so there are no loci and non-coding RNAs are included\n",
    "     - BrainArray and AnnotationDbi mappings to Ensembl mix up ENSEMBL IDs from GRCh37 and GRCh38, although here https://www.sciencedirect.com/science/article/pii/S1535610817302532 BrainArray CDF was used together with transcriptome annotation \n",
    " \n",
    "# Conclusions\n",
    "In the current version we use: \n",
    " - the latest CDF from BrainArray \n",
    "and \n",
    " - BrainArray annotation to map probesets to Entrez gene IDs, since they are more stable than HGNC symbols. \n",
    " \n",
    "In general, for transcriptomic and proteomic studies Ensembl nomenclature would be more appropriate, because it covers more transcripts, does not include \"phenotype only\" loci and allows controlling assembly versions. However, in both BrainArray and AnnotationDbi some microarray probes are mapped to multiple Ensembl IDs due to mixing up the 37th and 38th assemblies. \n",
    "BiomaRt, developed by Ensembl, would be an alternative solution, but GDSC, a key dataset for our study, is profiled on the Affymetrix HGU-219 platform, which is absent from biomaRt. \n",
    "\n",
    "\n",
    "# To Do:\n",
    "- try RMA vs gcRMA vs MAS5 normalizations\n",
    "- perform QC "
   ]
  },
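  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The per-gene averaging mentioned above can be illustrated with a minimal base-R sketch. The matrix values, the probeset names (`ps1`...`ps4`) and the gene labels are made up for illustration and are not taken from any annotation package; this is only a toy version of the idea, not the code used later in the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy probeset-level matrix: 4 probesets x 3 samples (made-up values)\n",
    "expr <- matrix(c(6.2, 5.0, 5.5,\n",
    "                 6.0, 5.2, 5.3,\n",
    "                 3.1, 2.9, 3.2,\n",
    "                 8.9, 7.7, 7.9),\n",
    "               nrow = 4, byrow = TRUE,\n",
    "               dimnames = list(c(\"ps1\", \"ps2\", \"ps3\", \"ps4\"),\n",
    "                               c(\"s1\", \"s2\", \"s3\")))\n",
    "\n",
    "# Hypothetical probeset-to-gene mapping: ps1 and ps2 target the same gene\n",
    "probe2gene <- c(ps1 = \"geneA\", ps2 = \"geneA\", ps3 = \"geneB\", ps4 = \"geneC\")\n",
    "genes <- unname(probe2gene[rownames(expr)])\n",
    "\n",
    "# rowsum() adds up the rows of each gene; dividing by the number of\n",
    "# probesets per gene turns the sums into per-gene averages\n",
    "sums <- rowsum(expr, group = genes)\n",
    "n_probesets <- as.vector(table(genes)[rownames(sums)])\n",
    "gene_expr <- sums / n_probesets\n",
    "gene_expr"
   ]
  },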
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Statistics of gene annotations in BrainArray\n",
    "\"hgu219hs\",\"hgu133plus2hs\",\"hgu95av2hs\",\"hgu133ahs\",\"hgu133bhs\",\"hugene10sths\"\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "hgu219hs    ENTREZ    18646    ENSEMBL    20342    ENSEMBL dups    20757    SYMBOL    18646    \n",
      "hgu133plus2hs    ENTREZ    20343    ENSEMBL    20928    ENSEMBL dups    22473    SYMBOL    20343    \n",
      "hgu95av2hs    ENTREZ    8450    ENSEMBL    9443    ENSEMBL dups    9554    SYMBOL    8450    \n",
      "hgu133ahs    ENTREZ    12250    ENSEMBL    13538    ENSEMBL dups    13842    SYMBOL    12250    \n",
      "hgu133bhs    ENTREZ    9074    ENSEMBL    8887    ENSEMBL dups    9671    SYMBOL    9074    \n",
      "hugene10sths    ENTREZ    20663    ENSEMBL    21371    ENSEMBL dups    22943    SYMBOL    20663    \n"
     ]
    }
   ],
   "source": [
    "# how many unique gene ids in each annotation\n",
    "for (platform in c(\"hgu219hs\",\"hgu133plus2hs\",\"hgu95av2hs\",\"hgu133ahs\",\"hgu133bhs\",\"hugene10sths\")){\n",
    "    customAnno <- installBrainArrayCDFandDB(platform,version='22.0.0',gene_id_type=\"entrezg\",\n",
    "                                        download_dir=\"/home/olya/SFU/Hossein/arrays/\")\n",
    "    anno_ENTR <- suppressMessages(select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"ENTREZID\"), keys=keys(get(customAnno$customDBname))))\n",
    "    anno_ENS <- suppressMessages(select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"ENSEMBL\"), keys=keys(get(customAnno$customDBname))))\n",
    "    anno_SYM <- suppressMessages(select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"SYMBOL\"), keys=keys(get(customAnno$customDBname))))\n",
    "    # NOTE: length(duplicated(x)) equals the number of rows returned by select(),\n",
    "    # so the \"ENSEMBL dups\" column reports PROBEID-ENSEMBL pairs rather than duplicates;\n",
    "    # sum(duplicated(x)) would count the rows added by multi-mapping probesets\n",
    "    cat(paste(platform,\"ENTREZ\",length(unique(anno_ENTR$ENTREZID)),\n",
    "                \"ENSEMBL\",length(unique(anno_ENS$ENSEMBL)),\"ENSEMBL dups\",\n",
    "                length(duplicated(anno_ENS$PROBEID)),\n",
    "                \"SYMBOL\",length(unique(anno_SYM$SYMBOL)),\"\\n\",sep=\"    \"))\n",
    "    \n",
    "}"
   ]
  },
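  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A note on counting duplicated probesets: `length(duplicated(x))` returns the length of `x`, whereas `sum(duplicated(x))` counts the repeated entries. The toy example in the next cell illustrates the difference on a made-up annotation table (the probeset and Ensembl IDs are hypothetical); it is a sketch and does not query the BrainArray packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy annotation table: probeset p2 maps to two (made-up) Ensembl IDs\n",
    "anno <- data.frame(PROBEID = c(\"p1\", \"p2\", \"p2\", \"p3\"),\n",
    "                   ENSEMBL = c(\"E1\", \"E2\", \"E3\", \"E4\"),\n",
    "                   stringsAsFactors = FALSE)\n",
    "\n",
    "# total number of rows, regardless of duplication\n",
    "length(duplicated(anno$PROBEID))\n",
    "# number of extra rows caused by probesets with more than one Ensembl ID\n",
    "sum(duplicated(anno$PROBEID))\n",
    "# number of distinct probesets that map to more than one Ensembl ID\n",
    "length(unique(anno$PROBEID[duplicated(anno$PROBEID)]))"
   ]
  },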
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Here we show that our microarray preprocessing pipeline produces almost the same results as in He et al. 2018:\n",
    "https://academic.oup.com/bioinformatics/article/34/16/2808/4924716\n",
    "by Xiao He, Lukas Folkman and Karsten Borgwardt \n",
    "\n",
    "For that, we preprocess raw expression data from GDSC and GSE (docetaxel clinical trials) and compare them with the expression profiles provided by Lukas Folkman\n",
    "\n"
   ]
  },
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Preprocess raw GDSC data from E-MTAB-3610 with BrainArray CDF and DB\n",
    "  * get platform names from CEL headers\n",
    "  * download and install the corresponding BrainArray CDF and DB\n",
    "  * read full CEL files with affy::justRMA() with the specified cdfname (thus we replace the default CDF with the custom CDF from BrainArray); justRMA summarizes CEL intensities according to the provided CDF, background-corrects, RMA-normalizes and log2-transforms them.   \n",
    "  \n",
    " [docs. : \"cdfname If set to NULL, then the usual cdf package based on Affymetrix' mappings will be used.\"]\n",
    " According to the affy manual, default CDFs are taken from here: http://bioconductor.org/help/bioc-views/release/data/annotation/ (and most of them are AnnotationDbi packages)\n",
    " \n"
   ]
  },
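  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the normalization step: RMA uses quantile normalization, which forces all arrays to share the same intensity distribution. The function `quantile_normalize` in the next cell is a hypothetical helper written in base R and applied to made-up intensities; it is not the implementation called by `justRMA`, and it leaves out background correction and probeset summarization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helper: quantile-normalize the columns (arrays) of a matrix\n",
    "quantile_normalize <- function(x) {\n",
    "    # rank of every value within its own column\n",
    "    ranks <- apply(x, 2, rank, ties.method = \"first\")\n",
    "    # reference distribution: mean of the sorted columns, rank by rank\n",
    "    target <- rowMeans(apply(x, 2, sort))\n",
    "    # replace each value by the reference value of the same rank\n",
    "    out <- apply(ranks, 2, function(r) target[r])\n",
    "    dimnames(out) <- dimnames(x)\n",
    "    out\n",
    "}\n",
    "\n",
    "# Made-up raw intensities: 4 probes x 3 arrays\n",
    "raw <- matrix(c(120, 300,  80,\n",
    "                 60, 150,  40,\n",
    "                200, 500, 130,\n",
    "                 90, 220,  70),\n",
    "              nrow = 4, byrow = TRUE,\n",
    "              dimnames = list(paste0(\"probe\", 1:4), paste0(\"array\", 1:3)))\n",
    "norm <- quantile_normalize(raw)\n",
    "\n",
    "# after normalization every column contains the same set of values\n",
    "apply(norm, 2, sort)\n",
    "# log2 transform of the normalized intensities\n",
    "log2(norm)"
   ]
  },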
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "1018"
      ],
      "text/latex": [
       "1018"
      ],
      "text/markdown": [
       "1018"
      ],
      "text/plain": [
       "[1] 1018"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "'hgu219cdf'"
      ],
      "text/latex": [
       "'hgu219cdf'"
      ],
      "text/markdown": [
       "'hgu219cdf'"
      ],
      "text/plain": [
       "[1] \"hgu219cdf\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "cel_file_dir = \"arrays/E-MTAB-3610/\"\n",
    "# strip the file extension to get the sample name\n",
    "getSnames <- function(sname,ext=\".cel\"){\n",
    "    # fixed = TRUE: treat ext as a literal string, not as a regular expression\n",
    "    sname <- sub(ext, \"\", sname, fixed = TRUE)\n",
    "    sname <- unlist(sname)[1]\n",
    "    return(sname)\n",
    "}\n",
    "# escape the dot and anchor the pattern so that only *.cel files are listed\n",
    "fnames <- list.files(cel_file_dir ,pattern = \"\\\\.cel$\")\n",
    "snames <- lapply(fnames, getSnames)\n",
    "snames <- unlist(snames)\n",
    "# number of CEL files:\n",
    "length(snames)\n",
    "# get CDF names from the platform names stored in CEL headers \n",
    "getCDFname <- function(fname,dir=getwd()){\n",
    "    # file.path() works whether or not dir ends with a slash\n",
    "    cel_header <-affyio::read.celfile.header(file.path(dir,fname))\n",
    "    return(cleancdfname(cel_header$cdfName))\n",
    "}\n",
    "cdf_basenames <- unique(unlist(lapply(fnames,getCDFname, dir=cel_file_dir)))\n",
    "cdf_basenames \n",
    "# add processing of batches in future\n",
    "cdf_basename <- sub(\"cdf\",\"\",cdf_basenames[1])\n",
    "\n",
    "# load BrainArray\n",
    "customAnno <- installBrainArrayCDFandDB(paste0(cdf_basename,\"hs\"),version='22.0.0',gene_id_type=\"entrezg\",\n",
    "                                        download_dir=\"/home/olya/SFU/Hossein/arrays/\")\n",
    "#customAnno"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ExpressionSet (storageMode: lockedEnvironment)\n",
       "assayData: 18653 features, 1018 samples \n",
       "  element names: exprs, se.exprs \n",
       "protocolData\n",
       "  sampleNames: 5500994157493061613625_A01 5500994157493061613625_A02\n",
       "    ... 5500994175999120813240_H11 (1018 total)\n",
       "  varLabels: ScanDate\n",
       "  varMetadata: labelDescription\n",
       "phenoData\n",
       "  sampleNames: 5500994157493061613625_A01 5500994157493061613625_A02\n",
       "    ... 5500994175999120813240_H11 (1018 total)\n",
       "  varLabels: sample\n",
       "  varMetadata: labelDescription\n",
       "featureData: none\n",
       "experimentData: use 'experimentData(object)'\n",
       "Annotation: hgu219hsentrezgcdf "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>5500994157493061613625_A01</th><th scope=col>5500994157493061613625_A02</th><th scope=col>5500994157493061613625_A03</th><th scope=col>5500994157493061613625_A04</th><th scope=col>5500994157493061613625_A05</th><th scope=col>5500994157493061613625_A06</th><th scope=col>5500994157493061613625_A07</th><th scope=col>5500994157493061613625_A08</th><th scope=col>5500994157493061613625_A09</th><th scope=col>5500994157493061613625_A10</th><th scope=col>⋯</th><th scope=col>5500994175999120813240_H02</th><th scope=col>5500994175999120813240_H03</th><th scope=col>5500994175999120813240_H04</th><th scope=col>5500994175999120813240_H05</th><th scope=col>5500994175999120813240_H06</th><th scope=col>5500994175999120813240_H07</th><th scope=col>5500994175999120813240_H08</th><th scope=col>5500994175999120813240_H09</th><th scope=col>5500994175999120813240_H10</th><th scope=col>5500994175999120813240_H11</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>1_at</th><td>6.210671 </td><td>5.029226 </td><td>5.513632 </td><td>4.208249 </td><td>3.402282 </td><td>4.914035 </td><td>3.827411 </td><td>5.147878 </td><td>3.110751 </td><td>5.054884 </td><td>⋯        </td><td>4.273165 </td><td>3.442682 </td><td> 4.923977</td><td>2.901377 </td><td>4.528475 </td><td>5.079415 </td><td>2.959578 </td><td>3.094308 </td><td>4.045297 </td><td>5.339831 </td></tr>\n",
       "\t<tr><th scope=row>10_at</th><td>3.091616 </td><td>2.925953 </td><td>3.186898 </td><td>3.090349 </td><td>5.070067 </td><td>2.912126 </td><td>3.028838 </td><td>3.049457 </td><td>2.996803 </td><td>3.487389 </td><td>⋯        </td><td>3.169651 </td><td>3.094766 </td><td> 2.978914</td><td>3.312476 </td><td>3.271839 </td><td>3.004895 </td><td>3.039933 </td><td>3.346097 </td><td>3.049132 </td><td>3.238387 </td></tr>\n",
       "\t<tr><th scope=row>100_at</th><td>8.956673 </td><td>7.687656 </td><td>7.944700 </td><td>9.259829 </td><td>7.017582 </td><td>6.905274 </td><td>9.751193 </td><td>7.540148 </td><td>5.773222 </td><td>3.324055 </td><td>⋯        </td><td>6.993797 </td><td>8.793967 </td><td>11.941098</td><td>9.371423 </td><td>7.939735 </td><td>9.046963 </td><td>6.055670 </td><td>7.746151 </td><td>6.814890 </td><td>5.698509 </td></tr>\n",
       "\t<tr><th scope=row>1000_at</th><td>7.613752 </td><td>3.066216 </td><td>6.634775 </td><td>8.708245 </td><td>8.754138 </td><td>6.827789 </td><td>3.282731 </td><td>8.739057 </td><td>9.572730 </td><td>3.037323 </td><td>⋯        </td><td>3.816762 </td><td>3.022264 </td><td> 5.862574</td><td>6.649221 </td><td>8.565891 </td><td>6.079522 </td><td>4.536474 </td><td>4.141976 </td><td>8.703439 </td><td>4.722706 </td></tr>\n",
       "\t<tr><th scope=row>10000_at</th><td>5.836163 </td><td>4.045650 </td><td>2.832221 </td><td>4.290599 </td><td>4.776181 </td><td>4.715363 </td><td>3.575370 </td><td>4.213761 </td><td>5.838059 </td><td>2.894429 </td><td>⋯        </td><td>5.388321 </td><td>2.827153 </td><td> 3.599740</td><td>5.266808 </td><td>3.960827 </td><td>5.060720 </td><td>5.723751 </td><td>3.494836 </td><td>3.297643 </td><td>5.769323 </td></tr>\n",
       "\t<tr><th scope=row>10001_at</th><td>6.927791 </td><td>7.396852 </td><td>7.097197 </td><td>7.389636 </td><td>8.015398 </td><td>7.107112 </td><td>7.659327 </td><td>6.855786 </td><td>7.808900 </td><td>7.290094 </td><td>⋯        </td><td>7.876365 </td><td>8.314665 </td><td> 8.197616</td><td>7.431327 </td><td>8.092348 </td><td>6.973491 </td><td>8.852970 </td><td>8.436557 </td><td>7.824003 </td><td>7.584112 </td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & 5500994157493061613625\\_A01 & 5500994157493061613625\\_A02 & 5500994157493061613625\\_A03 & 5500994157493061613625\\_A04 & 5500994157493061613625\\_A05 & 5500994157493061613625\\_A06 & 5500994157493061613625\\_A07 & 5500994157493061613625\\_A08 & 5500994157493061613625\\_A09 & 5500994157493061613625\\_A10 & ⋯ & 5500994175999120813240\\_H02 & 5500994175999120813240\\_H03 & 5500994175999120813240\\_H04 & 5500994175999120813240\\_H05 & 5500994175999120813240\\_H06 & 5500994175999120813240\\_H07 & 5500994175999120813240\\_H08 & 5500994175999120813240\\_H09 & 5500994175999120813240\\_H10 & 5500994175999120813240\\_H11\\\\\n",
       "\\hline\n",
       "\t1\\_at & 6.210671  & 5.029226  & 5.513632  & 4.208249  & 3.402282  & 4.914035  & 3.827411  & 5.147878  & 3.110751  & 5.054884  & ⋯         & 4.273165  & 3.442682  &  4.923977 & 2.901377  & 4.528475  & 5.079415  & 2.959578  & 3.094308  & 4.045297  & 5.339831 \\\\\n",
       "\t10\\_at & 3.091616  & 2.925953  & 3.186898  & 3.090349  & 5.070067  & 2.912126  & 3.028838  & 3.049457  & 2.996803  & 3.487389  & ⋯         & 3.169651  & 3.094766  &  2.978914 & 3.312476  & 3.271839  & 3.004895  & 3.039933  & 3.346097  & 3.049132  & 3.238387 \\\\\n",
       "\t100\\_at & 8.956673  & 7.687656  & 7.944700  & 9.259829  & 7.017582  & 6.905274  & 9.751193  & 7.540148  & 5.773222  & 3.324055  & ⋯         & 6.993797  & 8.793967  & 11.941098 & 9.371423  & 7.939735  & 9.046963  & 6.055670  & 7.746151  & 6.814890  & 5.698509 \\\\\n",
       "\t1000\\_at & 7.613752  & 3.066216  & 6.634775  & 8.708245  & 8.754138  & 6.827789  & 3.282731  & 8.739057  & 9.572730  & 3.037323  & ⋯         & 3.816762  & 3.022264  &  5.862574 & 6.649221  & 8.565891  & 6.079522  & 4.536474  & 4.141976  & 8.703439  & 4.722706 \\\\\n",
       "\t10000\\_at & 5.836163  & 4.045650  & 2.832221  & 4.290599  & 4.776181  & 4.715363  & 3.575370  & 4.213761  & 5.838059  & 2.894429  & ⋯         & 5.388321  & 2.827153  &  3.599740 & 5.266808  & 3.960827  & 5.060720  & 5.723751  & 3.494836  & 3.297643  & 5.769323 \\\\\n",
       "\t10001\\_at & 6.927791  & 7.396852  & 7.097197  & 7.389636  & 8.015398  & 7.107112  & 7.659327  & 6.855786  & 7.808900  & 7.290094  & ⋯         & 7.876365  & 8.314665  &  8.197616 & 7.431327  & 8.092348  & 6.973491  & 8.852970  & 8.436557  & 7.824003  & 7.584112 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | 5500994157493061613625_A01 | 5500994157493061613625_A02 | 5500994157493061613625_A03 | 5500994157493061613625_A04 | 5500994157493061613625_A05 | 5500994157493061613625_A06 | 5500994157493061613625_A07 | 5500994157493061613625_A08 | 5500994157493061613625_A09 | 5500994157493061613625_A10 | ⋯ | 5500994175999120813240_H02 | 5500994175999120813240_H03 | 5500994175999120813240_H04 | 5500994175999120813240_H05 | 5500994175999120813240_H06 | 5500994175999120813240_H07 | 5500994175999120813240_H08 | 5500994175999120813240_H09 | 5500994175999120813240_H10 | 5500994175999120813240_H11 | \n",
       "|---|---|---|---|---|---|\n",
       "| 1_at | 6.210671  | 5.029226  | 5.513632  | 4.208249  | 3.402282  | 4.914035  | 3.827411  | 5.147878  | 3.110751  | 5.054884  | ⋯         | 4.273165  | 3.442682  |  4.923977 | 2.901377  | 4.528475  | 5.079415  | 2.959578  | 3.094308  | 4.045297  | 5.339831  | \n",
       "| 10_at | 3.091616  | 2.925953  | 3.186898  | 3.090349  | 5.070067  | 2.912126  | 3.028838  | 3.049457  | 2.996803  | 3.487389  | ⋯         | 3.169651  | 3.094766  |  2.978914 | 3.312476  | 3.271839  | 3.004895  | 3.039933  | 3.346097  | 3.049132  | 3.238387  | \n",
       "| 100_at | 8.956673  | 7.687656  | 7.944700  | 9.259829  | 7.017582  | 6.905274  | 9.751193  | 7.540148  | 5.773222  | 3.324055  | ⋯         | 6.993797  | 8.793967  | 11.941098 | 9.371423  | 7.939735  | 9.046963  | 6.055670  | 7.746151  | 6.814890  | 5.698509  | \n",
       "| 1000_at | 7.613752  | 3.066216  | 6.634775  | 8.708245  | 8.754138  | 6.827789  | 3.282731  | 8.739057  | 9.572730  | 3.037323  | ⋯         | 3.816762  | 3.022264  |  5.862574 | 6.649221  | 8.565891  | 6.079522  | 4.536474  | 4.141976  | 8.703439  | 4.722706  | \n",
       "| 10000_at | 5.836163  | 4.045650  | 2.832221  | 4.290599  | 4.776181  | 4.715363  | 3.575370  | 4.213761  | 5.838059  | 2.894429  | ⋯         | 5.388321  | 2.827153  |  3.599740 | 5.266808  | 3.960827  | 5.060720  | 5.723751  | 3.494836  | 3.297643  | 5.769323  | \n",
       "| 10001_at | 6.927791  | 7.396852  | 7.097197  | 7.389636  | 8.015398  | 7.107112  | 7.659327  | 6.855786  | 7.808900  | 7.290094  | ⋯         | 7.876365  | 8.314665  |  8.197616 | 7.431327  | 8.092348  | 6.973491  | 8.852970  | 8.436557  | 7.824003  | 7.584112  | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "         5500994157493061613625_A01 5500994157493061613625_A02\n",
       "1_at     6.210671                   5.029226                  \n",
       "10_at    3.091616                   2.925953                  \n",
       "100_at   8.956673                   7.687656                  \n",
       "1000_at  7.613752                   3.066216                  \n",
       "10000_at 5.836163                   4.045650                  \n",
       "10001_at 6.927791                   7.396852                  \n",
       "         5500994157493061613625_A03 5500994157493061613625_A04\n",
       "1_at     5.513632                   4.208249                  \n",
       "10_at    3.186898                   3.090349                  \n",
       "100_at   7.944700                   9.259829                  \n",
       "1000_at  6.634775                   8.708245                  \n",
       "10000_at 2.832221                   4.290599                  \n",
       "10001_at 7.097197                   7.389636                  \n",
       "         5500994157493061613625_A05 5500994157493061613625_A06\n",
       "1_at     3.402282                   4.914035                  \n",
       "10_at    5.070067                   2.912126                  \n",
       "100_at   7.017582                   6.905274                  \n",
       "1000_at  8.754138                   6.827789                  \n",
       "10000_at 4.776181                   4.715363                  \n",
       "10001_at 8.015398                   7.107112                  \n",
       "         5500994157493061613625_A07 5500994157493061613625_A08\n",
       "1_at     3.827411                   5.147878                  \n",
       "10_at    3.028838                   3.049457                  \n",
       "100_at   9.751193                   7.540148                  \n",
       "1000_at  3.282731                   8.739057                  \n",
       "10000_at 3.575370                   4.213761                  \n",
       "10001_at 7.659327                   6.855786                  \n",
       "         5500994157493061613625_A09 5500994157493061613625_A10 ⋯\n",
       "1_at     3.110751                   5.054884                   ⋯\n",
       "10_at    2.996803                   3.487389                   ⋯\n",
       "100_at   5.773222                   3.324055                   ⋯\n",
       "1000_at  9.572730                   3.037323                   ⋯\n",
       "10000_at 5.838059                   2.894429                   ⋯\n",
       "10001_at 7.808900                   7.290094                   ⋯\n",
       "         5500994175999120813240_H02 5500994175999120813240_H03\n",
       "1_at     4.273165                   3.442682                  \n",
       "10_at    3.169651                   3.094766                  \n",
       "100_at   6.993797                   8.793967                  \n",
       "1000_at  3.816762                   3.022264                  \n",
       "10000_at 5.388321                   2.827153                  \n",
       "10001_at 7.876365                   8.314665                  \n",
       "         5500994175999120813240_H04 5500994175999120813240_H05\n",
       "1_at      4.923977                  2.901377                  \n",
       "10_at     2.978914                  3.312476                  \n",
       "100_at   11.941098                  9.371423                  \n",
       "1000_at   5.862574                  6.649221                  \n",
       "10000_at  3.599740                  5.266808                  \n",
       "10001_at  8.197616                  7.431327                  \n",
       "         5500994175999120813240_H06 5500994175999120813240_H07\n",
       "1_at     4.528475                   5.079415                  \n",
       "10_at    3.271839                   3.004895                  \n",
       "100_at   7.939735                   9.046963                  \n",
       "1000_at  8.565891                   6.079522                  \n",
       "10000_at 3.960827                   5.060720                  \n",
       "10001_at 8.092348                   6.973491                  \n",
       "         5500994175999120813240_H08 5500994175999120813240_H09\n",
       "1_at     2.959578                   3.094308                  \n",
       "10_at    3.039933                   3.346097                  \n",
       "100_at   6.055670                   7.746151                  \n",
       "1000_at  4.536474                   4.141976                  \n",
       "10000_at 5.723751                   3.494836                  \n",
       "10001_at 8.852970                   8.436557                  \n",
       "         5500994175999120813240_H10 5500994175999120813240_H11\n",
       "1_at     4.045297                   5.339831                  \n",
       "10_at    3.049132                   3.238387                  \n",
       "100_at   6.814890                   5.698509                  \n",
       "1000_at  8.703439                   4.722706                  \n",
       "10000_at 3.297643                   5.769323                  \n",
       "10001_at 7.824003                   7.584112                  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "GDSC_rma_ba <- affy::justRMA(celfile.path=\"arrays/E-MTAB-3610/\",filenames = fnames,sampleNames=snames,\n",
    "                     verbose=FALSE,compress=FALSE,cdfname=customAnno$customCDFname)\n",
    "#GDSC_rma_ba\n",
    "dim(exprs(GDSC_rma_ba))\n",
    "head(exprs(GDSC_rma_ba))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "'select()' returned 1:1 mapping between keys and columns\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0    8    NA values in columns    PROBEID    ENTREZID    \n",
      "18645    rows remain after removing NA from     ENTREZID    \n",
      "duplicates in column    PROBEID    0    \n",
      "duplicates in column    ENTREZID    0    \n",
      "unique genes    18645    \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>18645</li>\n",
       "\t<li>2</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 18645\n",
       "\\item 2\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 18645\n",
       "2. 2\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 18645     2"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "'select()' returned 1:1 mapping between keys and columns\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0    8    NA values in columns    PROBEID    SYMBOL    \n",
      "18645    rows remain after removing NA from     SYMBOL    \n",
      "duplicates in column    PROBEID    0    \n",
      "duplicates in column    SYMBOL    0    \n",
      "unique genes    18645    \n"
     ]
    }
   ],
   "source": [
    "# report NA values and duplicates in a two-column annotation (probeset ID, gene ID)\n",
    "diagnose_annotation <- function(anno,drop_NA=TRUE){\n",
    "    \n",
    "    cat(paste(sum(is.na(anno[,1])),sum(is.na(anno[,2])),\n",
    "            \"NA values in columns\",colnames(anno)[1],colnames(anno)[2],\"\\n\",sep=\"    \"))\n",
    "    a <- anno[!is.na(anno[,2]),]\n",
    "    cat(paste(dim(a)[1],\"rows remain after removing NA from \",colnames(anno)[2],\"\\n\",sep=\"    \"))\n",
    "    # probeset IDs that occur more than once, i.e. map to several genes\n",
    "    dup_probes <- unique(a[,1][duplicated(a[,1])])\n",
    "    cat(paste(\"duplicates in column\",colnames(a)[1], length(dup_probes),\"\\n\",sep=\"    \"))\n",
    "    # gene IDs that occur more than once, i.e. are covered by several probesets\n",
    "    dup_genes <- unique(a[,2][duplicated(a[,2])])\n",
    "    cat(paste(\"duplicates in column\",colnames(a)[2], length(dup_genes),\"\\n\",sep=\"    \"))\n",
    "    cat(paste(\"unique genes\",length(unique(a[,2])),\"\\n\",sep=\"    \"))\n",
    "    if (drop_NA){\n",
    "        # drop unmapped probes with NA gene IDs\n",
    "        anno <- anno[!is.na(anno[,2]),]\n",
    "        return(anno)\n",
    "    }\n",
    "    else{\n",
    "        return(anno)\n",
    "    }\n",
    "}\n",
    "anno_ENTREZID <- select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"ENTREZID\"), keys=row.names(exprs(GDSC_rma_ba)))\n",
    "anno_ENTREZID <- diagnose_annotation(anno_ENTREZID)\n",
    "dim(anno_ENTREZID)\n",
    "anno_SYMBOL <- select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"SYMBOL\"), keys=row.names(exprs(GDSC_rma_ba)))\n",
    "anno_SYMBOL <- diagnose_annotation(anno_SYMBOL)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Annotate probesets with genes and summarize to gene level\n",
    "- Use Entrez gene IDs\n",
    "- Apply WGCNA::collapseRows(method=\"Average\")\n",
    "- write to file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>18645</li>\n",
       "\t<li>1018</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 18645\n",
       "\\item 1018\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 18645\n",
       "2. 1018\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 18645  1018"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>5500994157493061613625_A01</th><th scope=col>5500994157493061613625_A02</th><th scope=col>5500994157493061613625_A03</th><th scope=col>5500994157493061613625_A04</th><th scope=col>5500994157493061613625_A05</th><th scope=col>5500994157493061613625_A06</th><th scope=col>5500994157493061613625_A07</th><th scope=col>5500994157493061613625_A08</th><th scope=col>5500994157493061613625_A09</th><th scope=col>5500994157493061613625_A10</th><th scope=col>⋯</th><th scope=col>5500994175999120813240_H02</th><th scope=col>5500994175999120813240_H03</th><th scope=col>5500994175999120813240_H04</th><th scope=col>5500994175999120813240_H05</th><th scope=col>5500994175999120813240_H06</th><th scope=col>5500994175999120813240_H07</th><th scope=col>5500994175999120813240_H08</th><th scope=col>5500994175999120813240_H09</th><th scope=col>5500994175999120813240_H10</th><th scope=col>5500994175999120813240_H11</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>A1BG</th><td>6.210671</td><td>5.029226</td><td>5.513632</td><td>4.208249</td><td>3.402282</td><td>4.914035</td><td>3.827411</td><td>5.147878</td><td>3.110751</td><td>5.054884</td><td>⋯       </td><td>4.273165</td><td>3.442682</td><td>4.923977</td><td>2.901377</td><td>4.528475</td><td>5.079415</td><td>2.959578</td><td>3.094308</td><td>4.045297</td><td>5.339831</td></tr>\n",
       "\t<tr><th scope=row>A1CF</th><td>2.986196</td><td>2.952698</td><td>2.876904</td><td>3.078416</td><td>2.859689</td><td>3.226181</td><td>3.002562</td><td>2.899220</td><td>2.761808</td><td>2.991242</td><td>⋯       </td><td>2.948285</td><td>3.162107</td><td>2.989301</td><td>3.122200</td><td>2.981742</td><td>2.912033</td><td>2.950125</td><td>2.788239</td><td>2.874734</td><td>2.930411</td></tr>\n",
       "\t<tr><th scope=row>A2M</th><td>3.138822</td><td>3.339745</td><td>3.288143</td><td>3.035124</td><td>3.266711</td><td>6.878105</td><td>3.306808</td><td>3.376042</td><td>3.326467</td><td>3.532250</td><td>⋯       </td><td>3.898669</td><td>3.478617</td><td>3.251410</td><td>6.346759</td><td>3.532383</td><td>3.161646</td><td>3.203922</td><td>3.483303</td><td>5.147751</td><td>8.860099</td></tr>\n",
       "\t<tr><th scope=row>A2ML1</th><td>2.659014</td><td>2.740146</td><td>3.941870</td><td>2.782538</td><td>2.716182</td><td>2.912234</td><td>2.791071</td><td>2.829216</td><td>2.736346</td><td>2.977346</td><td>⋯       </td><td>2.904620</td><td>2.795384</td><td>2.714010</td><td>2.835382</td><td>2.763315</td><td>2.752095</td><td>2.911493</td><td>2.920386</td><td>2.794910</td><td>2.991144</td></tr>\n",
       "\t<tr><th scope=row>A3GALT2</th><td>2.765093</td><td>2.871171</td><td>3.009816</td><td>2.634662</td><td>2.697715</td><td>2.593990</td><td>2.929163</td><td>2.744387</td><td>2.998624</td><td>3.043765</td><td>⋯       </td><td>2.964269</td><td>3.067023</td><td>2.768809</td><td>2.607127</td><td>2.778964</td><td>2.850118</td><td>2.703913</td><td>2.663925</td><td>2.763491</td><td>2.604637</td></tr>\n",
       "\t<tr><th scope=row>A4GALT</th><td>3.342432</td><td>3.138229</td><td>3.513538</td><td>3.166587</td><td>3.240146</td><td>3.246475</td><td>3.801788</td><td>3.377124</td><td>3.288002</td><td>3.361292</td><td>⋯       </td><td>3.384112</td><td>3.070601</td><td>2.986975</td><td>3.334498</td><td>3.398661</td><td>3.195717</td><td>3.277447</td><td>4.407162</td><td>3.008970</td><td>3.053538</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & 5500994157493061613625\\_A01 & 5500994157493061613625\\_A02 & 5500994157493061613625\\_A03 & 5500994157493061613625\\_A04 & 5500994157493061613625\\_A05 & 5500994157493061613625\\_A06 & 5500994157493061613625\\_A07 & 5500994157493061613625\\_A08 & 5500994157493061613625\\_A09 & 5500994157493061613625\\_A10 & ⋯ & 5500994175999120813240\\_H02 & 5500994175999120813240\\_H03 & 5500994175999120813240\\_H04 & 5500994175999120813240\\_H05 & 5500994175999120813240\\_H06 & 5500994175999120813240\\_H07 & 5500994175999120813240\\_H08 & 5500994175999120813240\\_H09 & 5500994175999120813240\\_H10 & 5500994175999120813240\\_H11\\\\\n",
       "\\hline\n",
       "\tA1BG & 6.210671 & 5.029226 & 5.513632 & 4.208249 & 3.402282 & 4.914035 & 3.827411 & 5.147878 & 3.110751 & 5.054884 & ⋯        & 4.273165 & 3.442682 & 4.923977 & 2.901377 & 4.528475 & 5.079415 & 2.959578 & 3.094308 & 4.045297 & 5.339831\\\\\n",
       "\tA1CF & 2.986196 & 2.952698 & 2.876904 & 3.078416 & 2.859689 & 3.226181 & 3.002562 & 2.899220 & 2.761808 & 2.991242 & ⋯        & 2.948285 & 3.162107 & 2.989301 & 3.122200 & 2.981742 & 2.912033 & 2.950125 & 2.788239 & 2.874734 & 2.930411\\\\\n",
       "\tA2M & 3.138822 & 3.339745 & 3.288143 & 3.035124 & 3.266711 & 6.878105 & 3.306808 & 3.376042 & 3.326467 & 3.532250 & ⋯        & 3.898669 & 3.478617 & 3.251410 & 6.346759 & 3.532383 & 3.161646 & 3.203922 & 3.483303 & 5.147751 & 8.860099\\\\\n",
       "\tA2ML1 & 2.659014 & 2.740146 & 3.941870 & 2.782538 & 2.716182 & 2.912234 & 2.791071 & 2.829216 & 2.736346 & 2.977346 & ⋯        & 2.904620 & 2.795384 & 2.714010 & 2.835382 & 2.763315 & 2.752095 & 2.911493 & 2.920386 & 2.794910 & 2.991144\\\\\n",
       "\tA3GALT2 & 2.765093 & 2.871171 & 3.009816 & 2.634662 & 2.697715 & 2.593990 & 2.929163 & 2.744387 & 2.998624 & 3.043765 & ⋯        & 2.964269 & 3.067023 & 2.768809 & 2.607127 & 2.778964 & 2.850118 & 2.703913 & 2.663925 & 2.763491 & 2.604637\\\\\n",
       "\tA4GALT & 3.342432 & 3.138229 & 3.513538 & 3.166587 & 3.240146 & 3.246475 & 3.801788 & 3.377124 & 3.288002 & 3.361292 & ⋯        & 3.384112 & 3.070601 & 2.986975 & 3.334498 & 3.398661 & 3.195717 & 3.277447 & 4.407162 & 3.008970 & 3.053538\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | 5500994157493061613625_A01 | 5500994157493061613625_A02 | 5500994157493061613625_A03 | 5500994157493061613625_A04 | 5500994157493061613625_A05 | 5500994157493061613625_A06 | 5500994157493061613625_A07 | 5500994157493061613625_A08 | 5500994157493061613625_A09 | 5500994157493061613625_A10 | ⋯ | 5500994175999120813240_H02 | 5500994175999120813240_H03 | 5500994175999120813240_H04 | 5500994175999120813240_H05 | 5500994175999120813240_H06 | 5500994175999120813240_H07 | 5500994175999120813240_H08 | 5500994175999120813240_H09 | 5500994175999120813240_H10 | 5500994175999120813240_H11 | \n",
       "|---|---|---|---|---|---|\n",
       "| A1BG | 6.210671 | 5.029226 | 5.513632 | 4.208249 | 3.402282 | 4.914035 | 3.827411 | 5.147878 | 3.110751 | 5.054884 | ⋯        | 4.273165 | 3.442682 | 4.923977 | 2.901377 | 4.528475 | 5.079415 | 2.959578 | 3.094308 | 4.045297 | 5.339831 | \n",
       "| A1CF | 2.986196 | 2.952698 | 2.876904 | 3.078416 | 2.859689 | 3.226181 | 3.002562 | 2.899220 | 2.761808 | 2.991242 | ⋯        | 2.948285 | 3.162107 | 2.989301 | 3.122200 | 2.981742 | 2.912033 | 2.950125 | 2.788239 | 2.874734 | 2.930411 | \n",
       "| A2M | 3.138822 | 3.339745 | 3.288143 | 3.035124 | 3.266711 | 6.878105 | 3.306808 | 3.376042 | 3.326467 | 3.532250 | ⋯        | 3.898669 | 3.478617 | 3.251410 | 6.346759 | 3.532383 | 3.161646 | 3.203922 | 3.483303 | 5.147751 | 8.860099 | \n",
       "| A2ML1 | 2.659014 | 2.740146 | 3.941870 | 2.782538 | 2.716182 | 2.912234 | 2.791071 | 2.829216 | 2.736346 | 2.977346 | ⋯        | 2.904620 | 2.795384 | 2.714010 | 2.835382 | 2.763315 | 2.752095 | 2.911493 | 2.920386 | 2.794910 | 2.991144 | \n",
       "| A3GALT2 | 2.765093 | 2.871171 | 3.009816 | 2.634662 | 2.697715 | 2.593990 | 2.929163 | 2.744387 | 2.998624 | 3.043765 | ⋯        | 2.964269 | 3.067023 | 2.768809 | 2.607127 | 2.778964 | 2.850118 | 2.703913 | 2.663925 | 2.763491 | 2.604637 | \n",
       "| A4GALT | 3.342432 | 3.138229 | 3.513538 | 3.166587 | 3.240146 | 3.246475 | 3.801788 | 3.377124 | 3.288002 | 3.361292 | ⋯        | 3.384112 | 3.070601 | 2.986975 | 3.334498 | 3.398661 | 3.195717 | 3.277447 | 4.407162 | 3.008970 | 3.053538 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "        5500994157493061613625_A01 5500994157493061613625_A02\n",
       "A1BG    6.210671                   5.029226                  \n",
       "A1CF    2.986196                   2.952698                  \n",
       "A2M     3.138822                   3.339745                  \n",
       "A2ML1   2.659014                   2.740146                  \n",
       "A3GALT2 2.765093                   2.871171                  \n",
       "A4GALT  3.342432                   3.138229                  \n",
       "        5500994157493061613625_A03 5500994157493061613625_A04\n",
       "A1BG    5.513632                   4.208249                  \n",
       "A1CF    2.876904                   3.078416                  \n",
       "A2M     3.288143                   3.035124                  \n",
       "A2ML1   3.941870                   2.782538                  \n",
       "A3GALT2 3.009816                   2.634662                  \n",
       "A4GALT  3.513538                   3.166587                  \n",
       "        5500994157493061613625_A05 5500994157493061613625_A06\n",
       "A1BG    3.402282                   4.914035                  \n",
       "A1CF    2.859689                   3.226181                  \n",
       "A2M     3.266711                   6.878105                  \n",
       "A2ML1   2.716182                   2.912234                  \n",
       "A3GALT2 2.697715                   2.593990                  \n",
       "A4GALT  3.240146                   3.246475                  \n",
       "        5500994157493061613625_A07 5500994157493061613625_A08\n",
       "A1BG    3.827411                   5.147878                  \n",
       "A1CF    3.002562                   2.899220                  \n",
       "A2M     3.306808                   3.376042                  \n",
       "A2ML1   2.791071                   2.829216                  \n",
       "A3GALT2 2.929163                   2.744387                  \n",
       "A4GALT  3.801788                   3.377124                  \n",
       "        5500994157493061613625_A09 5500994157493061613625_A10 ⋯\n",
       "A1BG    3.110751                   5.054884                   ⋯\n",
       "A1CF    2.761808                   2.991242                   ⋯\n",
       "A2M     3.326467                   3.532250                   ⋯\n",
       "A2ML1   2.736346                   2.977346                   ⋯\n",
       "A3GALT2 2.998624                   3.043765                   ⋯\n",
       "A4GALT  3.288002                   3.361292                   ⋯\n",
       "        5500994175999120813240_H02 5500994175999120813240_H03\n",
       "A1BG    4.273165                   3.442682                  \n",
       "A1CF    2.948285                   3.162107                  \n",
       "A2M     3.898669                   3.478617                  \n",
       "A2ML1   2.904620                   2.795384                  \n",
       "A3GALT2 2.964269                   3.067023                  \n",
       "A4GALT  3.384112                   3.070601                  \n",
       "        5500994175999120813240_H04 5500994175999120813240_H05\n",
       "A1BG    4.923977                   2.901377                  \n",
       "A1CF    2.989301                   3.122200                  \n",
       "A2M     3.251410                   6.346759                  \n",
       "A2ML1   2.714010                   2.835382                  \n",
       "A3GALT2 2.768809                   2.607127                  \n",
       "A4GALT  2.986975                   3.334498                  \n",
       "        5500994175999120813240_H06 5500994175999120813240_H07\n",
       "A1BG    4.528475                   5.079415                  \n",
       "A1CF    2.981742                   2.912033                  \n",
       "A2M     3.532383                   3.161646                  \n",
       "A2ML1   2.763315                   2.752095                  \n",
       "A3GALT2 2.778964                   2.850118                  \n",
       "A4GALT  3.398661                   3.195717                  \n",
       "        5500994175999120813240_H08 5500994175999120813240_H09\n",
       "A1BG    2.959578                   3.094308                  \n",
       "A1CF    2.950125                   2.788239                  \n",
       "A2M     3.203922                   3.483303                  \n",
       "A2ML1   2.911493                   2.920386                  \n",
       "A3GALT2 2.703913                   2.663925                  \n",
       "A4GALT  3.277447                   4.407162                  \n",
       "        5500994175999120813240_H10 5500994175999120813240_H11\n",
       "A1BG    4.045297                   5.339831                  \n",
       "A1CF    2.874734                   2.930411                  \n",
       "A2M     5.147751                   8.860099                  \n",
       "A2ML1   2.794910                   2.991144                  \n",
       "A3GALT2 2.763491                   2.604637                  \n",
       "A4GALT  3.008970                   3.053538                  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Strictly speaking, collapseRows() is not needed here because these mappings are one-to-one\n",
    "CR <- WGCNA::collapseRows(exprs(GDSC_RMA_BrainArray), rowGroup = anno_ENTREZID$ENTREZID,\n",
    "                          rowID = anno_ENTREZID$PROBEID, method = \"Average\")\n",
    "exprs_GDSC_RMA_BrainArray_ENTREZ <- CR$datETcollapsed\n",
    "\n",
    "# the same for SYMBOL\n",
    "CR <- WGCNA::collapseRows(exprs(GDSC_RMA_BrainArray), rowGroup = anno_SYMBOL$SYMBOL,\n",
    "                          rowID = anno_SYMBOL$PROBEID, method = \"Average\")\n",
    "exprs_GDSC_RMA_BrainArray_SYMBOL <- CR$datETcollapsed\n",
    "\n",
    "dim(exprs_GDSC_RMA_BrainArray_SYMBOL)\n",
    "head(exprs_GDSC_RMA_BrainArray_SYMBOL)\n",
    "\n",
    "# write both gene-level matrices as tab-separated files\n",
    "write.table(exprs_GDSC_RMA_BrainArray_SYMBOL,\n",
    "            file = \"Geeleher_with_GDSCr6/data/GDSC_micro.BrainArray.RMAlog2Average.Symbol.Expr.tsv\",\n",
    "            na = \"\", quote = FALSE, sep = \"\\t\")\n",
    "write.table(exprs_GDSC_RMA_BrainArray_ENTREZ,\n",
    "            file = \"Geeleher_with_GDSCr6/data/GDSC_micro.BrainArray.RMAlog2Average.Entrez.Expr.tsv\",\n",
    "            na = \"\", quote = FALSE, sep = \"\\t\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compare with Lukas's data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>17419</li>\n",
       "\t<li>962</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 17419\n",
       "\\item 962\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 17419\n",
       "2. 962\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 17419   962"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>CAL.120</th><th scope=col>DMS.114</th><th scope=col>CAL.51</th><th scope=col>H2869</th><th scope=col>H290</th><th scope=col>A2058</th><th scope=col>CESS</th><th scope=col>H2803</th><th scope=col>H2722</th><th scope=col>HCC1419</th><th scope=col>⋯</th><th scope=col>SW1271</th><th scope=col>NCI.H716</th><th scope=col>LB831.BLC</th><th scope=col>HC.1</th><th scope=col>Jurkat</th><th scope=col>MOG.G.UVW</th><th scope=col>JIMT.1</th><th scope=col>PEO1</th><th scope=col>PL4</th><th scope=col>TT</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>TSPAN6</th><td> 7.632023</td><td> 7.548671</td><td>8.712338 </td><td>7.797142 </td><td> 7.729268</td><td>7.074533 </td><td> 3.285198</td><td> 6.961606</td><td> 5.943046</td><td> 3.455951</td><td>⋯        </td><td> 6.287819</td><td> 3.607123</td><td> 7.105637</td><td> 3.236503</td><td> 3.038892</td><td> 8.373223</td><td> 8.441628</td><td> 8.422922</td><td> 8.089255</td><td>3.112333 </td></tr>\n",
       "\t<tr><th scope=row>TNMD</th><td> 2.964585</td><td> 2.777716</td><td>2.643508 </td><td>2.817923 </td><td> 2.957739</td><td>2.889677 </td><td> 2.828203</td><td> 2.874751</td><td> 2.686874</td><td> 3.290184</td><td>⋯        </td><td> 2.842090</td><td> 2.866483</td><td> 2.798847</td><td> 2.745137</td><td> 2.976406</td><td> 2.852552</td><td> 2.639276</td><td> 2.879890</td><td> 2.521169</td><td>2.870468 </td></tr>\n",
       "\t<tr><th scope=row>DPM1</th><td>10.379553</td><td>11.807341</td><td>9.880733 </td><td>9.883471 </td><td>10.418840</td><td>9.773987 </td><td>10.264385</td><td>10.205931</td><td>10.299757</td><td>11.570155</td><td>⋯        </td><td>10.370130</td><td>10.246750</td><td>10.486486</td><td>10.442951</td><td>10.311962</td><td>10.454830</td><td>11.463742</td><td>10.557777</td><td>10.792750</td><td>9.873902 </td></tr>\n",
       "\t<tr><th scope=row>SCYL3</th><td> 3.614794</td><td> 4.066887</td><td>3.956230 </td><td>4.063701 </td><td> 4.341500</td><td>4.270903 </td><td> 5.968168</td><td> 3.715033</td><td> 3.848112</td><td> 5.560883</td><td>⋯        </td><td> 3.789814</td><td> 3.780485</td><td> 3.696835</td><td> 4.624013</td><td> 4.348524</td><td> 3.858121</td><td> 4.425849</td><td> 3.550390</td><td> 4.443337</td><td>4.266828 </td></tr>\n",
       "\t<tr><th scope=row>C1orf112</th><td> 3.380681</td><td> 3.732485</td><td>3.236620 </td><td>3.558414 </td><td> 3.840373</td><td>3.815055 </td><td> 3.011867</td><td> 3.268449</td><td> 3.352835</td><td> 3.571228</td><td>⋯        </td><td> 3.318119</td><td> 3.499160</td><td> 3.726833</td><td> 3.947744</td><td> 3.806584</td><td> 3.196988</td><td> 4.384732</td><td> 4.247189</td><td> 3.071359</td><td>3.230197 </td></tr>\n",
       "\t<tr><th scope=row>FGR</th><td> 3.324692</td><td> 3.152404</td><td>3.241246 </td><td>3.101247 </td><td> 3.001802</td><td>3.298915 </td><td> 9.565308</td><td> 3.036333</td><td> 3.278688</td><td> 3.370885</td><td>⋯        </td><td> 3.154849</td><td> 3.056503</td><td> 3.245301</td><td> 8.969347</td><td> 3.562548</td><td> 3.098083</td><td> 3.229511</td><td> 3.176336</td><td> 3.238305</td><td>3.027742 </td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & CAL.120 & DMS.114 & CAL.51 & H2869 & H290 & A2058 & CESS & H2803 & H2722 & HCC1419 & ⋯ & SW1271 & NCI.H716 & LB831.BLC & HC.1 & Jurkat & MOG.G.UVW & JIMT.1 & PEO1 & PL4 & TT\\\\\n",
       "\\hline\n",
       "\tTSPAN6 &  7.632023 &  7.548671 & 8.712338  & 7.797142  &  7.729268 & 7.074533  &  3.285198 &  6.961606 &  5.943046 &  3.455951 & ⋯         &  6.287819 &  3.607123 &  7.105637 &  3.236503 &  3.038892 &  8.373223 &  8.441628 &  8.422922 &  8.089255 & 3.112333 \\\\\n",
       "\tTNMD &  2.964585 &  2.777716 & 2.643508  & 2.817923  &  2.957739 & 2.889677  &  2.828203 &  2.874751 &  2.686874 &  3.290184 & ⋯         &  2.842090 &  2.866483 &  2.798847 &  2.745137 &  2.976406 &  2.852552 &  2.639276 &  2.879890 &  2.521169 & 2.870468 \\\\\n",
       "\tDPM1 & 10.379553 & 11.807341 & 9.880733  & 9.883471  & 10.418840 & 9.773987  & 10.264385 & 10.205931 & 10.299757 & 11.570155 & ⋯         & 10.370130 & 10.246750 & 10.486486 & 10.442951 & 10.311962 & 10.454830 & 11.463742 & 10.557777 & 10.792750 & 9.873902 \\\\\n",
       "\tSCYL3 &  3.614794 &  4.066887 & 3.956230  & 4.063701  &  4.341500 & 4.270903  &  5.968168 &  3.715033 &  3.848112 &  5.560883 & ⋯         &  3.789814 &  3.780485 &  3.696835 &  4.624013 &  4.348524 &  3.858121 &  4.425849 &  3.550390 &  4.443337 & 4.266828 \\\\\n",
       "\tC1orf112 &  3.380681 &  3.732485 & 3.236620  & 3.558414  &  3.840373 & 3.815055  &  3.011867 &  3.268449 &  3.352835 &  3.571228 & ⋯         &  3.318119 &  3.499160 &  3.726833 &  3.947744 &  3.806584 &  3.196988 &  4.384732 &  4.247189 &  3.071359 & 3.230197 \\\\\n",
       "\tFGR &  3.324692 &  3.152404 & 3.241246  & 3.101247  &  3.001802 & 3.298915  &  9.565308 &  3.036333 &  3.278688 &  3.370885 & ⋯         &  3.154849 &  3.056503 &  3.245301 &  8.969347 &  3.562548 &  3.098083 &  3.229511 &  3.176336 &  3.238305 & 3.027742 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | CAL.120 | DMS.114 | CAL.51 | H2869 | H290 | A2058 | CESS | H2803 | H2722 | HCC1419 | ⋯ | SW1271 | NCI.H716 | LB831.BLC | HC.1 | Jurkat | MOG.G.UVW | JIMT.1 | PEO1 | PL4 | TT | \n",
       "|---|---|---|---|---|---|\n",
       "| TSPAN6 |  7.632023 |  7.548671 | 8.712338  | 7.797142  |  7.729268 | 7.074533  |  3.285198 |  6.961606 |  5.943046 |  3.455951 | ⋯         |  6.287819 |  3.607123 |  7.105637 |  3.236503 |  3.038892 |  8.373223 |  8.441628 |  8.422922 |  8.089255 | 3.112333  | \n",
       "| TNMD |  2.964585 |  2.777716 | 2.643508  | 2.817923  |  2.957739 | 2.889677  |  2.828203 |  2.874751 |  2.686874 |  3.290184 | ⋯         |  2.842090 |  2.866483 |  2.798847 |  2.745137 |  2.976406 |  2.852552 |  2.639276 |  2.879890 |  2.521169 | 2.870468  | \n",
       "| DPM1 | 10.379553 | 11.807341 | 9.880733  | 9.883471  | 10.418840 | 9.773987  | 10.264385 | 10.205931 | 10.299757 | 11.570155 | ⋯         | 10.370130 | 10.246750 | 10.486486 | 10.442951 | 10.311962 | 10.454830 | 11.463742 | 10.557777 | 10.792750 | 9.873902  | \n",
       "| SCYL3 |  3.614794 |  4.066887 | 3.956230  | 4.063701  |  4.341500 | 4.270903  |  5.968168 |  3.715033 |  3.848112 |  5.560883 | ⋯         |  3.789814 |  3.780485 |  3.696835 |  4.624013 |  4.348524 |  3.858121 |  4.425849 |  3.550390 |  4.443337 | 4.266828  | \n",
       "| C1orf112 |  3.380681 |  3.732485 | 3.236620  | 3.558414  |  3.840373 | 3.815055  |  3.011867 |  3.268449 |  3.352835 |  3.571228 | ⋯         |  3.318119 |  3.499160 |  3.726833 |  3.947744 |  3.806584 |  3.196988 |  4.384732 |  4.247189 |  3.071359 | 3.230197  | \n",
       "| FGR |  3.324692 |  3.152404 | 3.241246  | 3.101247  |  3.001802 | 3.298915  |  9.565308 |  3.036333 |  3.278688 |  3.370885 | ⋯         |  3.154849 |  3.056503 |  3.245301 |  8.969347 |  3.562548 |  3.098083 |  3.229511 |  3.176336 |  3.238305 | 3.027742  | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "         CAL.120   DMS.114   CAL.51   H2869    H290      A2058    CESS     \n",
       "TSPAN6    7.632023  7.548671 8.712338 7.797142  7.729268 7.074533  3.285198\n",
       "TNMD      2.964585  2.777716 2.643508 2.817923  2.957739 2.889677  2.828203\n",
       "DPM1     10.379553 11.807341 9.880733 9.883471 10.418840 9.773987 10.264385\n",
       "SCYL3     3.614794  4.066887 3.956230 4.063701  4.341500 4.270903  5.968168\n",
       "C1orf112  3.380681  3.732485 3.236620 3.558414  3.840373 3.815055  3.011867\n",
       "FGR       3.324692  3.152404 3.241246 3.101247  3.001802 3.298915  9.565308\n",
       "         H2803     H2722     HCC1419   ⋯ SW1271    NCI.H716  LB831.BLC\n",
       "TSPAN6    6.961606  5.943046  3.455951 ⋯  6.287819  3.607123  7.105637\n",
       "TNMD      2.874751  2.686874  3.290184 ⋯  2.842090  2.866483  2.798847\n",
       "DPM1     10.205931 10.299757 11.570155 ⋯ 10.370130 10.246750 10.486486\n",
       "SCYL3     3.715033  3.848112  5.560883 ⋯  3.789814  3.780485  3.696835\n",
       "C1orf112  3.268449  3.352835  3.571228 ⋯  3.318119  3.499160  3.726833\n",
       "FGR       3.036333  3.278688  3.370885 ⋯  3.154849  3.056503  3.245301\n",
       "         HC.1      Jurkat    MOG.G.UVW JIMT.1    PEO1      PL4       TT      \n",
       "TSPAN6    3.236503  3.038892  8.373223  8.441628  8.422922  8.089255 3.112333\n",
       "TNMD      2.745137  2.976406  2.852552  2.639276  2.879890  2.521169 2.870468\n",
       "DPM1     10.442951 10.311962 10.454830 11.463742 10.557777 10.792750 9.873902\n",
       "SCYL3     4.624013  4.348524  3.858121  4.425849  3.550390  4.443337 4.266828\n",
       "C1orf112  3.947744  3.806584  3.196988  4.384732  4.247189  3.071359 3.230197\n",
       "FGR       8.969347  3.562548  3.098083  3.229511  3.176336  3.238305 3.027742"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "GEX <- read.csv(\"Geeleher_with_GDSCr6/data/GEX.csv\",row.names=1,header = T)\n",
    "dim(GEX)\n",
    "head(GEX)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABLAAAAFoCAMAAABwqvIYAAADAFBMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tM\nTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1e\nXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29w\ncHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGC\ngoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OU\nlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWm\npqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4\nuLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnK\nysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc\n3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u\n7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////i\nsF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO2dB5gURfrGv564ObEsyy6byDmK\nhCWKCooKgiKoCIIZAU9UDAiKmADTCWZA4RQPBT0x8BfTiRkwnIjKEowoIiBIWli2/hN7pmdn\nerq7ema6h/f3POx0qK7+pvvlrZrq6ipiAABgEijRAQAAgFJgWAAA0wDDAgCYBhgWAMA0wLAA\nAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANMAwwIAmAYYFgDANMCwAACmAYYFADANMCwA\ngGmAYQEATAMMCwBgGmBYAADTAMMCAJgGGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwDDAsA\nYBpgWAAA0wDDAgCYBhgWAMA0wLAAAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANMAwwIA\nmAYYFgDANMCwAACmAYYFADANMCwAgGmAYQFgXl4gciY6hrgCw1LJq+Nb5dmyOlzxgXf1SQri\ncvZnAdH53j1jier9mbg4wXEBDAvIsamL6E5D/3JvCDEs9m/Xx9vuHR8IRIsSGis4DvAb1u9W\n+jbRscQFGJYavs91+ZG1fa8ylxtR5VHmNazTh/h41LVhKFHLasaOtiPqk+hwQdLjN6x/EgwL\nhFLb0WVPk3a5ljb0dy3NZV7D2hOcZnsO0Z2MzSVyHB8KAonEb1iVMCxQh1dc7nS9d/HwCWln\nvsDCGBZbQJS67ZcMolvjHyAwHnvv7JZrLzh1YY175Raibu7Pl1wVdebRSp+jk/IL/GmD1r+7\nvKkz84SH3LV4dvTxAfVt9U+4a2fdHLyGNdjbJDFZmjQpgWGpYDRR9gHf
8o5Dno+6hsVOITpj\nOFHTQ/ENDhiSL4t9DZw93E9gQuzmOaL293kXPQTWl6d4jzrJpaIj/XxZNN4czbAkSZMSGJYK\nmhEND9kUxrC2pXsk82bcwgLGZbfLryoefXmqjWgwC2c35SX2ji38qcX1ramuuvz3a/sS3cTY\no0Qtl360ylUI9o1gWN/8xyW4Z9dslSZNSmBYKkj36EdCcKP7pb5tD7n96vx4BweMyO1EWb+6\nPpe4JLEunN1Qs58DqcX1q4n6uT52ZlDmIXcHmftcK0dGXX3PsfCGxX4jbxuWJGlSAsNSgUA0\ny/151Ffv/kLaraHMl2ype2VS4sIExqED0Vj3Z00u0cywhvVcUGpxvQnRLYdc9CF6i00iKl38\nuy+JvGFJkiYlMCwVZBLd4P6UN6xdBe4V6+cJDBQYhFqbt8rDWA+iUWENK9he/Ou1loCqHmJf\npLk/m4xf7m63lzcsSdKkBIalgjaun3/uz2MTXAiiYYW0YY0hat2R6MQkrZMDFex3yeNxz9LJ\nRGeEsxtrsEz86/uDisEZjL3bxrtY/lE0w5IkTUpgWCoYT5Sy279ijWBYb7m2/PcTVxH5SNzj\nA0bDXcOa41nqRnSBx246utcel9iNiH+91iWuB4Oz+WjmoGyXrurvj5CDaFjBSZMSGJYK3F50\niW/5WATDOtiEaAxjlxPlJHVbAlBER49PMXYkk+hexu4mKqhlHnnIGRZr7ulUJaHmP7lEr0fI\nIWBYQUmTEhiWGnq4dDHW7UNH3uhJ4Q3rOqLcPxjbXeBTKjiumUmU7n7s9wSR8L33YeFKxr5N\ni2JYE4mKDric5/yLb/zl4F1jz/L8bjyV6KUIOfzu2rrGVVpKkiYlMCw1bHa3p1s7nNTF09Xq\n3IMh7xIO2cfWuypej7mTLibfW9DgeGZPI6Km85df7/TWzbcILgO7/Lq8nlEMa0sqUeVr/zeM\nqE2Nu5Y2/PV1799uJ+eOCDnU2Il6L/s/adKkBIaliq3dAs8EF7g3SEZroJ1HOxF18zaj9iNq\nUZ3QYIEBEHu6D/O8+XCFZ7nZB64KV62MYbEXnN6jil0/9L5u5MvCsiBiDqe5Nw4OSZqMwLBU\n8uq4FjmOoi4TXvWaUYhh3Rvoz7DR4el5A45z9s7qmm1vePYr3rWau5s5ii/b8YtLLAfkDItt\nHFfhTGt7i+cZz+93nNDAntbysq8i5/DL0JyUijtDkiYjMCwAgGmAYQEATAMMCwBgGmBYAADT\nAMMCAJgGGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwDDAsAYBpgWAAA0wDDAgCYBhgWAMA0\nwLAAAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANMAwwIAmAYYFgDANMCwAACmAYYFADAN\nMCwAgGmAYQEATAMMCwBgGmBYAADTAMMCAJgGGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwD\nDAsAYBpgWAAA0wDDAgCYBhgWAMA0wLAAAKYhDob15Tpgcr6MvUrUA12ZHg26ir1hrSVgetbG\nXCaqga6SAPW6io1hHVu5TOQOqubNrkr6LS3N0pfpESVQSDV9mOgQ6vIhv67Y4WBZCdQ4d4oO\ngQGlaNFVbAxra0GuSBrt580uTWpYAl1wqh5RAoUkrWE1kgrL3iB9tw6RAYUYx7CCeYz+5s0i\npB5Zj4baftcjNKAMwxjW3lumipyng2GFCMtCRffqECVQSLIaVmWIrgR7eqOHdYkNKMIwhvXH\nqHNFutA+3uwGhAqLTmpeq0ecQBHJalh1mupa0bn9dYkNKMIwhhUMv65q6whLyLWu0SU4oIQk\nNawldXRVTC1tO/WJDiggSQ3r1TrCqke9r9IlOKCEJDWslDq6Iqcl/2l9ogMKSFLDyqmjqxRq\nXXBUl+iAApLTsPbV9StqSZ3O1Sk8EJ3kNKzqMMJKsdne1SU6oIDkNKxZYX
RVaGmQfUSn+EBU\nktOwXggjrBJqdY0+4YHoJKdhZYTRFZFgfV+n+EBUktOwWoWRVSoVVegTHohOUhrWsbB2lUHF\n0/QKEEQjOQ1LCF8S0kZ94gNRSUrDejOsrOo7c7vqFSCIRlIa1i9hhZVDBXN0ChBEIykNa2BY\nXVkyybpLrwhBFJLSsGaGFVampd4AnQIE0UhKw7KFr7gL5FyuV4QgCklpWKVhhUU2wcH97itQ\nRmIMa+/Ub2X38wmrbm9kL3ZbNrr4xYukNKxITViC43WdIgRRSIxh/UwrZffzCevjCIaVUmxt\nyZEtUENCDCu2BSE7EElYQi46NsSJuBrWeD+j6NTx42US8gnr3KCyT0IpCds58gUqSIhhxbYg\nZMsiGJY129qWJ1+gnLgalvQ2yyTkE1bg9YlMyQktdnI8x5EvUEE8DStOBSHrE8GwhExB+I0n\nY6CYuBrWP6wdV+1x8w09v2ePTEI+YQXqVZl2qbLyHJdx5AtUEE/DilNByFLFczikhkUoCeNF\nfNuw1nYUrvyLxbjqfiRISS2kVawTCH2S40Q8DStOBWGQMba2SIWV4kRJGB/i3Oh+9J7Uohdj\nbFhLg5Q0SVrF6iLQr9ozBiqIaxtWXArC4NHcO1A9ibAqqIwjY6CcuD8l3DyAzvwppsIKtDSk\nW5YNlhSEeRbbUu0ZAxXEt9E9HgUhWxBQUpnjCslvwkYW+okjZ6CYBHRrWJSXMSOWwsoSdVRC\nmx+UFITUzIKqe3yI91PC2BeEwcNup7SQjNsg2MixmCNnoJhE9MPaMZJiKaxAm3uho/YrqWGd\njap7nIh/t4ZYF4QsW5RRKl3wu7TLTL7lIo6cgWIS03H09Snyr7dzCStQ7qU3YbXZEl2dJhDm\nzokLCeiHFeOCMKgkbCk8wZpIhNVNKOTIGSgm+d4lXBNUUx/G2NnBsrI0s1te0DFOEJGEdByN\naUEY9JCwC33BLpH8Jmxnoa0cWQOlJJ9hjRBVZKX7GbtfUhDautIlOsYJIpJ87xJ+GSj3Wlmq\n2TPBvwmFXMG2SLcwQWQSZVhVlX2GzJF0mfmxRWORfI754wJv1Dd0f7d1EsOiSUIpd+xAAcln\nWJcG7KmwAWNbpMJqKpynX5wgIokyrC+IFgy7LXjLkcWPi1zAIazA2DLltJexmnSJrlyqw2Rf\n8SBBhlW3IDww9x6Rs3kMq3lARY7ervUGEmGdZanHGTpQQqIMq7qqirGDkfbylISBcrBhpntd\nMpqfpWeK8JLmrIFyEmRYdQvC7QNPFmnFY1hOUUV5dK1rfXiwsIT+FtrCGTtQQNK1YQW6I1sz\nW7s3SEaJFHL60cX6xQkikiDDil1BGPSQsD39x7V+n6QRq9xmeUp73kAp8Tas2i2rV6x4O1qv\nYA5hPR1UnRrh3vCOtMPMPZYSrVkDFSRdG1bgFVVLF0+39o8lwrKdQEP0CxREIr6GtXtKgff2\nls6MWAq64RDWoICuaK57wwHpgMkzSfhTa95AOXE2rJgXhOw9UULOVvZa14bDTomwrrBma84b\nKCauhrW9gpqNnTF79rRRRdRht0xCDmE1FBVUSJ95tnSV/CYcmoFGrHgQV8OKQ0EY9JDQ2rDY\ns6UyWFg0SiD54U6BHsR3xFH7Mt9SzXxhskxCDmEF6lONyTszwGSJYZWcSaO05g2UE0/DikdB\nyFoGRJTS17NlikRYnVKEeZozB0qJq2EVjgssnyfXlMQhrECjQkm6d8sySVuD5RFrQ615A+XE\ndcTROBSELNA9JpO8I7i/IBkUK2MAnaQ5c6AUXsPq/thfyg+03xlYvs0hk1C7sA4FisG8Jt5N\nv0pq7vQsemLFA07DUqWreBSEgYeEQhvyNiqETKt6hy3tmObcgUJ4DctGqaPeVHqfykYEloeU\nyyTULqzAPAGCc5BvW4lEV1fm0780Zg6Uw2lYqnQVh4KQ1QSq7l38o0BKhXUD
0TqtuQOl8BrW\nn48PsFLJLVWKDpwszDnsXdo/nabKJNQurBFBCrrRt22UpK2h9QXCGRozB8rhNCxVuopDQci+\nCfz2a2et9W47VyKsk3OEWVpzB0rRoQ3rj0f7W6jXUwre/tvTmTIHjL16wph+adRbTjrahdVY\nFFA2vebb9rCkEcu52JqrMXOgHP42LOW6ikNBGNT/OLWswLdN0nWU6p8jdNaaO1CKPo3u2x/o\nQGlXfB/9bPd3tLrvrb37EzVy6bQLK0Us8MrJ/7xovbStYQ3RjxpzB4rRpdFdoa7iUBCy3gEB\nZXf1bftIYljC43abbKcKoAO6GNbBF4anUqndfltt9IMPbVq/vqo6SiLtwhIlZK+w+7fVpEkM\n69YKekBj7kAxehiWYl3FviBkBaJ+LJYxvm2HpLN9zSfhTa3ZA4XoYFgfXJJFqRe8y34aTjP0\nCUq7sGpF9WQWFYhbB0h0deIkS0+d4gQR4TcsdbqKcUHIrGJFqpTm+zd2k9SwRpTS5VqzBwrh\nNayfZjUj6jTPM6BH7ckFEY5Qi2ZhBX79pWSdIG69VVJ1T19pScHz51jDaVgG01Vw/74OtMG/\n8RqJsEqvshbrEyeICK9hWSj7CvFh7jxBj5AYh7BuCcjHOlrcukr6/vN3Vjx/jjmchmUwXbGd\nonqyOwriz85/S7qOWl62onU01vAaVu+ng9oZq1bwB+RBs7B6BPwqqKHqL6lhze1GV+sUKIgE\np2EZTFfsVVE99VpkiFtDuo6+YhUe1SdQEAlew1qzy7fw6Yu6xONFs7Dqiw0KDejzwOY2El31\nudMm11kH6AGnYRlMVywwb6q9QdPA5mKJsC7uQb11iRNEhNewyD/0wVw9OzdpFpbYNGprLAQ9\nL5og0VXmWhL2RM4D6AGnYRlMV6x9QD5pgwKbR0iE1WSWw35Il0BBJLgMq+qNN2j6Gx5WnJim\nY1SahSVqJ6s8OJ6l0t+E27JpkR5hgsjwGJbxdBU0nXiqcENg84OSRiybqyR8Q5dAQSS4DOvu\n4Lt1jo5RaRXWXjGaegVlQdt/lvgVPTDS0lePMEFkeAzLcLpyPwTwYWlGLwc2r5WWhGvyhLG6\nBAoiwfeTcPt/aPTdHma/eETHqLQK6yVROvb0PsE7CiW66r3Y5pTtYAi44fpJaDRdsWOidjI7\nUtCIW0dTJcK69EJbnoLO00A7vG1Ygz/WL5YAWoUVGBWSLFcE7xgt0VXaDoH+q0ugIBKcbVjG\n0hXbLGqnqJ0teEc/ibCaLrWjx0xs4TGs31xFzW8BdIxKq7DaitJxSFupFkh/E37TkS7UI04Q\nEQ7DMp6u2DOidPIr8oN3TJf8JrTutFpujJQH0AMew6KBQa3cxDv516EH+Ce8FJtGLUW0NXhH\nyDS9d8yw53GGC+ThMCx9dRWMZsMK1NAteR2Cd7wpbcR6q4eASZliCo9hnXe3618Azkh+7dFF\npFTjVPWiejIaW6R7ciW66ryW6H+c8QJZOAxLX11VL9RhRvGgAd3t5wbv+NsqEdboe9Poa86A\ngRzJNZGqKJwGpVnSPZKx1sh2qIDkxv4G3BhmXsKfWjcWyddYEAYGdBdy6G7Jng4SYZV8Q/bp\nOgQNIsFvWO7HbYc/+VzXhyMaDetHUTj185tKd4U0Yr13iR1TUcQUbsMykK6Cqu7OFrRGsucG\nia6EA2UC3qKIJbyGVXPVOYxta0zUi2MS8DpoFNYTonCsqadKd/0gNawrXrEG3rkHMYDTsAyl\nK3ZYFE7D9iTtZ/GatBHr2X+kB78TBvSG17DupmsZO1248irL3RHTq0ejsIJelBCmhOzLk1bd\nDzrrpAB6wmlYhtIV+1oUTkWrFOmuvZK+7jToPUsKhBVDeA2r7TDGfhHGMzauo35BaRVWM9Gu\nUumFkH0jpVWs7UNtBejiF0M4DctQunId5yevJHTIq1YSXeXU5FIBOiXHDl7Dyn
jM3T70FmPz\nc/QLSquwxG7HzhL6I2SfdDpVenKxQ/hIh0hBBDgNy1C6Cq66Z/UI2XettCTcNi7Duoo/UhAB\nXsPKdAlrVHo1Y/PS9QtKq7BET6rf2Bq6b7fUsPrutgkX8wcKIsFpWIbSFWsSEI51fMi+1VJh\n3fSaLXMkf6QgAtw/Cc9nv2ec7Vq4tIVuMWkVVmCuy9JGdQclkc56aTs80J6OoUBiB+9PQgPp\nKmguJmt9ejxk3yGbRFgtD2eSc1fYXIAO8BrWXdSjiN5j7BnH9foFpVFYn4uqqZ/bps5e6ZhY\n9MZTKbal/JGCCHAalpF0FVR1z21JdSYdk/bEEg5flJ35EHekIAK8hnVobGr2P12fDdvtjpRc\nA9qENTugGvvQOns/kRrWha7fhH3CZAL0gdOwjKQrtk9UTdM2dYeXv00qrOdfteXULS6BTujV\n0/3jo/yxBNAmrNOCZHN7nb3HpJPIpdeeniIomwkdaECnnu5G0FVQWde0SWadvf+TGlbf6lzB\ngrFAYkUyvZrTyK8Za7b7+VIolVJhfbTYmX4dd6QgAoZ5NScYjYY1VxRNZkHTurszJLqy116W\nVz6CN1IQAV7Dql12Rsc2XtRlsnubzE5twhJbP7NL6UDd3U9JDevK/amUHSYZ0AVOwzKSrtiZ\nAdWkDqy7e6BUWGv+K2TYf9FyHhAdXsOaQ5SW7UXBkV+dXtZrvrdb3VS5epo2YYlNo2VljjC7\nD0ifP6fXnp+d+piG0wAlcBqWkXQVqLoLluAB3f0skxrWGbWlzkY3aTkPiA6vYTUauEX5gR84\nKc1OfT3NqPoL629RMhUNwr7ZXCEV1idvWNNaoLd7jOA0LAPpijG7XzJpxfSfursPSt/OcbBp\nua3z9ms5EYgKr2HZP1Fx4GD7S7WH77d3dd9M/YX1hiiZrIwTwyWYJjWsC2oKbfZX1J8HKIHT\nsAykq6Cqe0lLCvfQsnVISVglWPLRsyE2cNew1Iy9XeIZl/htx+k1sRDWNQHJWMJ2Yt8j1ZX9\n6I25JZXqzwOUwFvDMo6u2HZRMu2b28MlmCkVVj/WJ69fmZ6TZwARXsO6/ioVB/rGNltMk2Ih\nrK5+wQhWCj9jeJFUWK9tEizW99SfCCiA07AMpKugqnt5o/rhEuyW6srKFqfWz16o4UwgKryG\n9ffA81dtrPIQ/cBGZ3k/b6LZMRCWOKB7WgF9GzbF9VJhdWL981ucrP5EQAGchmUgXbmyFCvl\n2V3CppC+9kUvHcxNPauprj3IgA/uqeoDRD9wkvCwp6JcO4aumai7sMSmz+IyIfwAHyElofDX\nUme6ZU3YpIAT3qnqjaMr1iOo8nRR2BT/kAqrMbu2oGPm0xpOBaLBa1ijxoz3E/3AP0vJW6Gp\nnSQvRC3CqhX10qq4bndkL6VSYV17pGFW576qzwQUwGlYxtEVYzl+vTjSaF7YFDukuqKDVRbL\n6IpqDecCUYhvT/edV13jW1reRG9hfSvKpSQ30hv+90p15WTT8wvtr6s+FYhOXHu6x1JXQVX3\nwnLaGD5JyG/CMWxwySm54c0NcKGDYe3bsEenYES0COuhQL3dflaENAekPWbo2d+daX3bY4DI\nGMBvWAbRVdCA7h2aR2hrYDdKdWVjq63CNQ00TtEDZOA2rPe6EL3B2JlhXt7TjhZhDQkSzG2R\nEvWQCquAjSsty3ySK1YQFl7DMoyu2FeiXNqWZUVIs0v6FgUtYh3Lzi65lStYEA5ew/rUkTnQ\nJaw/Ch3r9AtKk7CKxR96mRSxr8Iqqa5ozbeWjCEN/uIKFoSD07CMoys2X1RLTm6rSInaSB0r\niy1NscxK+4knWBAOXsMaXPrzb+6ScEfpEFV5VFX2GTJHUuXfLJlDV72wxMOLGkWeLvNYlkRX\nVMrObtGw4h+qTwaiwWlY+umq5uVlIpdqMaxhAbmEGWbNR8i8l/R2TfMmg08cpf5s
QB5ew6p3\nN/MIi91Vd1BiOb4gWjBM+svty3UiN2sQlljEtW2UEjnV9BBhfbBeqD/K9qXqs4EocBqWfrra\nVpgrkqZl5mex6m6z0V2REh2SjrZG9djiNMujFnRL1htew7L9yyesRWFfWoh8XneHwIOR9mqo\nuv8qaqVZXlnkZDtCmt3rsyHt0wZ2O6b2dCAKnIZlGF0FjVrUoIgiv+F4YUhJuKqmVZtO41vj\nBR2d4X6X8BafsC4u0yskpklYC0WpOB1yvddHhzSPvviVpd0pWQ9yBAvCwfsuoVF0xY6JgunU\nhCJaYei4o5R+bIUjY07+LJ5wQV14Deuy3PVuYe2+mZS9/FW7ZfWKFW9Ha4zUIKzz/Epx6Utu\nMKJvQgzLemh0O8f4dBWDmQAlcBqWYXTFtopS6VwqN+VYV6mu6NHayk45D6V8p/qEQA5ew/qt\nxNaZOnZ0UunvCo7cPaXAezdLZ0Yuq5gmYYmd2FPSaLVcwmEhPwov+il1WL3KPvhRqC+chmUY\nXbFnRKUUROyQ7OaVEF0Jf31qaXnmoEoIS1e4+2HtuLKe6+7kX7lDwYHbK6jZ2BmzZ08bVUQd\n5GZD0SAssaWhpDjsmEUiX4cIi76eUXRij6zZak8IZOHth2UUXbFRAaXYIj4kdHGsWYiuerJx\nzVLuyb5P9RmBDDr0dK/9vUpJKehivH2Zb6lmvjBZJqEGYYk66VQs85DQzVi7VFj2vxuPzRvs\nWK/2jEAO/p7uxtAVK/PrxGkl2VLtWemMqkTLd9Y7I+2OlA2qTwkiw21YmxbPffAFJcWgi8Jx\ngeXzSmQSqhfWNlEmTXOayCf9NS2kGavTKuv9th7N8SKFnvAalkF0FVR1L21In8slPNY29Efh\nH0ucpzQ7oz0mGNcRTsP6tLv31gxRNMOf/c7A8m3h5onwo15Yge7INtuZUdLOTpEKi24Z23yu\no+g8lacEcvAZlmF0xWpElXSvEA7LJn0ttLEhs3po+za9itSMRQiiwGdYq1Ko803zHphQTtkf\nKTiwLGi6tiHlMgnVC0ucRNUlmnuipD3aJeRHIS0uv+ymFCf6NugIl2EZR1fsm0BFvGFOlLSD\nQ0vCNtsbji3qb/u32pOCiHAZ1p6CtBc8CzXz7EpeyJsszPGVUfun01SZhOqFle/XSG4ORW2N\n+iZUWPSQbem1Dtu7Kk8KIsNjWAbSFXtY1EhWeucoabelhzQ2UO/VtnvzO2Z+o/asIBJchvUA\niQNXz4v81kKAPZ0pc8DYqyeM6ZdGveWko15YolLaFEepuLtZGFrFojHpX86wZGxWeVYQER7D\nMpCu2KBA1V24LFriR52huuo1K/W5gsKmug+Uc9zCZVgnNxI7mRwr7abkbPd39LyjbO/+hOwg\nVKqFdVRUSJcCJS+fTa6jrPZl2xdasjFhr17wGJZxdMVYrlh1T6MXoqYelhqqqzYjGr7bOKU3\nBnjXCS7DahD0NvoYJTP0uji0af36qmiDx6oW1juiQPJTws5JGMKxc0OfQVN2+z3vOlI/UHde\nEAkewzKOroKq7p1LaHvU1PvaWUN1ld+16YZK4VS15wXh4TIs+7WBjddrHzi5LqqFNcEvD5e+\nJio54MjZoY90yNpu77fpwg2yXaWBUngMyzi6Cpq3pGe0/n0efioJlRXZKlr8OJ467NISMAiF\ny7CCGzhlp1dSi2phNfGrI8NBryk64ujoOsqigi2bclPLlmsIGITCY1jG0RVbKYqjNKuZkgO2\nNKwjK3tuo2/mCJlP12qIGISQJIYlVsRbFtKfCo+5pa5jWWZuaNDE0W+t6oBBKEliWBcHqu6W\nkYqO+Lm0jqystrRXF1kc3dDewA+fYVXOEKlMpLACU3x1L8hQfNS/Qh9CuyicW3riWZYReMee\nFy7DMoquGGvkF0a2jRYpO+TvzmFKQmHsQlsny1D0b+CFz7Ak6BiVWmF9KgZRnNJJxW
F1eje4\nStLioub/7me7eKvKkIEULsMyiq6CpvjqUkiKJTE8jKwo73rHhSdbRmMcIz64DGuJBB2jUius\niQFdKGtz97GpoK60XBV4x32ruzsu+1FdzEACj2EZRldsn6iKykKn8jao6WHq7gI1yjzjlR72\ny39WGTQIJr4TqSpFrbDEN+pznZ5hKhWzq18Yx3IXh7MWdXFciUlPtBPXiVSVotqwXhc1UZrW\nWsVxT9d5Bu3BVu+dVzunTFY4CAUIQ3IYliiPExqQujm7jt0TpjD0mFa7kc2dE1AaaiU5DOsC\nsdpNdImaAz8sDC8rKlnyXOv0m2RHbAMyJIVhHRHl0D23ntpz/a8sgrRcppVirVwR/UUfEIbk\nMKz6fikUO+gVVUfuGhWhJLhEqxMAABRUSURBVCSh+dkNs2dhMCNtJIVhvSiKIdcqNwFFeI79\nM3z9XRSYPb/D5a9qmG7luCY5DEs0nd4NFHeX8fNa3R5ZQaKyFA+fuepXlVmC5DCsU/wycBA9\noOF0B8+TkZbUu7KKOw26bO7Sl17/+Nstu9ATMDJJYVg/iXe+e47qqjs7Mr/Oezp1saaXDJi6\n8L2f8IKFMoxjWHsmXCbSW52w0v13v1UOfa/p5Ic6KbSsOgbWbMTcd7Zi9rk6GMawat9fLTJZ\npWHdI97oDIuWtwGrp6kVlGB1FnQdOWvZF1vwKk9YjGNYf44+V6SLOmGJ97tvXqrWWs/29pos\nKyC1jH7TX9oA5/JjGMPaIhmYQ13TUWv/YbkCPaLp7LUX8akqrcWoJ7/Yjpl3RIxjWMGoq7p/\nJN7fYruSwUgisGWoghq8Ao0JqS0veGTD8d5WbxjDCkbz0+fKPOXdRkPYNTZS67sqrPVOmvWZ\n7Ng5xwfJYFjD/DfVQXRn9OSR2f9srzB937UipPV9WF0fi2QiGQxrl3gv++ZkaT/r7jkNdFRV\nap9//nIc17iSwbAy/DezbSbxvgRY8+VD/XN0KRIDEss4dd57x91T7GQwrLniLcwSTuE68aY5\nbXWpvgew1O8/9fn1vx531pUEhhV487lPjvI3n2XZ+fwVHTLlOzuoR8jpc+/xI7BkMKzm/lvX\nQBCe4j759qWDdS4KPapyNL7osXXHz4CmSWBYgV5YORa+cjCUQ6/dcGrz+qkWXWVmyRvz1nGg\nr2QwLPG+98+j33QKoeaTewdVZNt1dy5L/ZPmHQ8vVieBYYmP9/IEei52MR344sV7RnUry3bo\n4l9C1oXJ/aZiEhjWd+Ld6pJSrHss1RtXPnbndUNOKM9yKUov/xKyB/9fcj+pTgLDEn+69cqx\n7I1dTMEc2PrVmysWzJk2acy5Azo1zU/XWmBaG09O1hGSksCwRvhvk8NCE2IWk48j/3v5gVuv\nHnNG/8oObVs2KWmQXWc6CxW2lXnOox8fiHXEicH8hhUoB5vb28QwJnl2/+earjYhJz+VBGeq\nI+BfCpxMqHgoCR8mJoFhiU+Mu6bRpzGLSZbq9U9N7JtLtoaleQ61haLQZmESzi5mfsMa4L9B\naURzYhhTdA5+8tikQc21lYxC+fzkeoPf/Ib1p3hzeqdlJfQtrO1vPzplaCf3XMFWu1Wdbwml\ny5OrwdT8hiX+IuyZJRhioKEDP3297v3Vq1e9tnTpgvlzZl437uS2RZnKysfMq5JmoFPzG9bl\n/rtis9NFsYtJOYe2fvTKM/PumXbl6DN7tizKTbUp9q6US5Nmwk3TG9aX4l1pbmsby5h4qd7+\nyZxzW2cKURtYbR2fTYJC0fyG5fDfkBNT6ZPYxcTD0Z3fv71w8lntC+zkkpVgla1+CYXzk6A9\nPt6GVbtl9YoVb0d7QKZCWEHPCIUnOQKLF3vfmX1OKdlK2nWsqJ8aWWBC6fOJjpSPOBuW7rpi\nW8V70dXRgC+4OPD76gfOb+Uke7a7+6BVpgth5p3mLg3ja1i7p/gGUS+dKTuchnJhBXqN
npTu\nMM0QHX+uuvOcJgIV9R590z3TL+lfP5LA0u4ybZkYV8PSX1eMVfpvQpYgzOANMD4c27x8+lkl\nZCvt1LNn+2JHBFG5cN6V6FA1E1fD2l5BzcbOmD172qgi6iDXxqxcWDf474EllUZFT24k9n22\nZMaYQR3Kc9MiK8tDwX2JDlUD8TSsGOiK1YiV35NSrX/oEGTc2Ll6ztgTc12B5xQU5mc7Ita2\nhM5rTPicJ66GNd6+zLdUM1+YLJNQubDEN7R6pAibtcZlCPbv3r311WvaRnr7Wii/74dEh6iG\neBpWDHTF/iGWhOk0lC+8hPDXhndXLHv88aeXLXvzpXmT+jeM9Cqj0OQRU01cEFfDKhwXWD6v\nRCahYmHNFy98vtBba1gGY/vTY8sijgduP2W5OfQVT8PSX1esVrwDvRzCtzzBGYYfnhrXLiVS\ndctW+X6i41NGXA3LHjT4y20OmYSKhRWYLschbNQalhH5e82DZ+RGUpdLX50fMXhn03galv66\nCsx0acmk03hiMxiH1y+6rn3kgSOsPd/an+gQ5YmrYZWNCCwPKZdJqFRYY0VZpZmy3h6NX99/\n5OyKiKUiCULGhe8dSnSQ4YmnYemuK7ZfvMbdbdbke6e4etMbN3dJj9wHQsifYNT2lbga1mRh\njm8gzv3TaapMQoXCqhKvcHtbqtpJTczEj2/e2l6u66lg6fAvg03rE0/D0ltXjBX5r2y6RbiO\nOz7DcnjDw33lHvkI2XcYbSC3uBrWns6UOWDs1RPG9Euj3nLSUSasY+IPwgbWWI7TYBgObZo/\nqJ7sMF2CtdkCY3SWj6dh6awrxq4XL2g+lR8Hg13veKBHtryu7N2frKpOdJhu4tsPq/r+jp7f\nz/buT8gOT61MWOJbezYLjdUckyk5urxXtJfKBHur29cnrmdaXPth6asrtkC8iMVCirZpmMzJ\n4XfH1ov2IoYgZJ+3KnEhxv3VnEOb1q+P6tVKhPW32HQoWOjU43WKwL8Wd1AwPpdgazE3zo30\n8X41RzddMXajeN1yBNsHegRnOva9OjBFyZuKgqXhtM/j2kxv2ncJrw1cNKKRx83IwxH5Y9Vl\nDZQN6iwIgrXFJZ/FuM3LtO8S7gg8Q7NRxmdxiMrQ/L64b6rC8QUFt7LKz3/001hOqWhOw/rG\nFnSZKHVxzAMyE4e3rBhfYlU7LqrgxWKxOZsNW/Thb7xDKZnTsI6NlFyUAcn8JEc1Nd8/d3GF\nXcPoqD5pWW2lFy77aufu3TxjCybKsKoq+wyZI/1PsWGdyM0BYW1b0NRq8eP54iFXI3uaIRoD\njcrfv7w2sakjzHVTjxCKJTzOgtZX3DqDnkjE11Wsq9qXrykJijmMsizd1sc9fDOxZ91T5zaw\n6CAspbryQq1UR6qHYX1BtGDYbcFbNgd/8ZxdB32Uy33VzPZTNx8Eqtmx4d/TBxc7ff9P/fAK\nLwS5DlExQ7GuXpWN3Vp05st7E3mHTMqB7T98+PBFnevZpcrSV1qqRaGHYVVXVTEmfYa1b7fI\nKhJrTYe3PDwo1xlE7m0ffLH5OHjUbEj2vHbVSfWdcuQ06jPypkf+u34tvZ6IABXrim3/7L7K\nHEnofV7+8Ktdx+vTm8RycM3jFxbJ6spDZmHrnjRNde6xb8P6kPAzz+QYsg0LujI9xhzAD8Iy\nPYYcwA+6Mj3GHMAPwjI9hhzAD7oyPcYcwG+tro10ICGs1SoT6ArIoF5XsR/Aj325LhpZE5fw\nMTGLM4MlrYdxZnAvzefM4dSunBkssdzImcG44vA36EutKlGPnrpa1/tUzgtyvZ0zgyV5V3Bm\nMLwFZwbzaDZnDp1P58zg2lTddBX7AfwUUH9Z9DSyLKvPmQHrP4Mzg43EOy/ZxOGcGTDrW5wZ\nPN6cNwRu9NQVGz6RM4M3UjgzYCW8PaFn8g5m+RvxDlp45hTODF7J5MwgQOwH8FMADMsNDM
uN\nnrqCYbmBYXlQPNCaAmBYbmBYbvTUFQzLDQzLg+KB1hQAw3IDw3Kjp65gWG5gWB4UD7SmABiW\nGxiWGz11BcNyA8PyonSgNQXAsNzAsDzoqCsYlhsYloiigdYUAMNyA8Pyo5euYFhuYFi6A8Ny\nA8PSHRgWg2HFABiWGxiW7sCwGAwrBsCw3MCwdAeGxWBYMaD4Zc4MXi7mDWHgLM4MNgu8o19P\nGcWZAUvlnaJ8UVveEIzFKN7/aW9n8YbQ5HnODO4ZwJnBnwLv7LHDb+LMYFU9zgwCGMKwfuB9\nHFTzA28Iv3NPF8I9p/Denbw5bOUdsa462pguJmPnXs4MjnFPDPnjUc4MDvzGGwK3Mv/gnYH1\n2DbeEEQMYVgAAKAEGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwDDAsAYBpgWAAA0wDDAgCY\nBhgWAMA0wLAAAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANOQeMPaPaXUUT7kY85c/kHj\nOY5+vU9Gdv93OTL49sJCW/7QT7UdfORGSxfv0p7JZfaG47dz5KDxagYycMN3MY0ChMUtLAPq\nKuGGtaucBt96gS3lf1y5rLXyXIuF1GTadfUdH2rOYENm3vTFdxTa3tZy8MbOmb67Wt2Zht85\nzl6xW3MOGq9mIAM3fBfTKEBY3MIyoq4SblgT6GHX3+V0Ok8mRzt24LgWOzI67WesKuMqzTmc\nT++4/n5F/TQcuzf1hCqn967eT/e6/v6bVI5EHpSDtqsZlAHjvZiGAcLiFZYhdZVww7pmwBHX\n39rUMp5M7hHe4LgWc2iV+4NjQPRu5P4SLKtcw7G7phxhvrvaMfOw+6NpgbpQgnLQdjWDMmC8\nF9MwQFi8wjKkrhJuWF4O2ys5jt6ceuUejmsxMPUIO8w1XcEY+tr1d6flNI3He+/qIat3gpSx\npH7aAGdQU4GmqylmwHkxDQaExbiEZThdGcSwHvJUObUyoOFfPNeirPXnlQI1WaQ9gI25Hdb8\n9vmAtE80Hu+9q5torGdtBq3WmIMPLVdTzIDzYhoMCItxCctwujKGYb3n6MUxG9IiepHxXIvM\nsoZTXnyolJ7VHsJ3rYmo9COth3vv6nqa4FmbQys05uBF09X0Z8B7MY0FhOX+yyEsw+nKEIb1\nnLMzxyykO/LOYFzXwknPuP5uzyjUPD3ixoqS+1YuaJOtvmbki8Cnq6s9a7PpJY05eNB2NX0Z\ncF9MQwFh8QrLcLoygGHVTqdBPDM1jsz4ke9a1LMecH+cS5qfgHdP+8X190Bx8RFtx3vvahWN\n8axNI/VTzovC0no1fRlwX0wDAWHxC8twukq8YdWOo4k8Mz+/Trf+/PPP39Con7U2b3axeuRw\nFWntL/O30N/zeRFt0JaB965W27xPr0fRjxpz4Lia3gz4L6ZxgLB0EJbhdJV4w5pMd3EdP4X8\nTNWYw9XkadM8lbRO1P4H9fB8jqB12jLwyaJbmrtEPlZUojkH7VfTmwH/xTQOEJYOwjKcrhJu\nWMtpMl8GG1e6eZ5OXfmtxhzWCScdZmytpb3mGCrs37v+7snLOqzteJ8snqDbXH8fpds156D9\nanoz4L+YhgHCYjoIy3C6SrhhNaGJUz2ofR9FCtfP42uo4+2Xpjre1ZzBCku9WxbeWUHzNRz7\nnuu7Wwtdf/5kNb1pyO0jhXYHNOeg7WoGZeAhKdqwICxeYRlSVwk3LLGuuI0rG65rUftYh5Ts\n0z/jOP1HQ+vbck9+Tcuhd/svQBVjf19XZi+eoPZZTFAO2q5mcAhuksKwICxeYRlSVwk3LAAA\nUAoMCwBgGmBYAADTAMMCAJgGGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwDDAsAYBpgWAAA\n0wDDAgCYBhgWAMA0wLAAAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANMAwwIAmAYYFgDA
\nNMCwAACmAYYFADANMCwAgGmAYQEATAMMCwBgGmBYAADTAMMCAJgGGBYAwDTAsAAApgGGBQAw\nDTAsAIBpgGEBAEwDDEsvzqOfEx0CSEKgKwkwLDmWkAdHxSVbvWurfDsmEx31Lk2h7IOeBRlh\n7Z5S6igf8nGMgwWmAbrSDAxLjiVUOdXFZZ0o+2v3mnWkd/vRAqtPWNX5FnrGsxRZWLvKafCt\nF9hS/heHiIEZgK40A8OSYwnN8C7MoTPdaz1S9nhWV1Inn7Ceo6uEXp6lyMKaQA+7/i6n02Mb\nLDAN0JVmYFhyiMKqduS5126nxzyr55SO9AmrH23qTRvdSx5h/TC2yF7vzE/d6692TW0w6WCj\nToxdM+CIa702tSzu8QNjAl1pBoYlhyisw7YS99rKiu7utT3O68/zCut76smepGvdi25h/VSQ\ncf3TdxY71zD2X2vh7fP7nZXdzZ/XYXtl3OMHxgS60gwMSw5RWLfTOI+wbqPvXGuP0dc+YU2h\nJ9m+tPxq5hXWGFrhWtpodcnvFFrLWE1/EoX1kKcCDwB0xQEMS44l1HeGi8ndqekvHmFtE250\nbe7RhXmFdTg/dS9jo+l55hFWbXaDWvdhvehPltLSvbRKFNZ7jl5HE/MlgOGArjQDw5LD9/iZ\nCm7e5VlbyfoVH2NV9E+fsJ6lC11/36GTmUdY2+kkz2Hj6aM9dIZ7aZ9fWM85O+9KyFcABgS6\n0gwMSw5v1f1geebP3rWV7BlaxW617/QJqy89VVVVtamBsMUjrCr3Mx8XV9PqzTTCs2j1CKt2\nOg3al5BvAIwIdKUZGJYcvraGl2mod20lO5A5srZ8KPMK6zvyc7NHWL/5SsKL6ZMf6Sz30gFP\nSVg7jibWJOgrAAMCXWkGhiWHv3H0NHqJeYXFLk5/373iEda1dMkLbpZYGx71NI7mNfS0NXQT\n9lRbOriX3vEIazLdlahvAIwIdKUZGJYcfmFtcjba5xPW+9Qn/4hXWIfrOf/wphtOL3uEdYlH\ngF8IAxg7UfiWsZqBbmEtp8kJ+wbAiEBXmoFhySE+fr6ZJvqExZq4Fz3CepYu9qV7jwZ7hPVr\nYcbNz9xekPkVYy9QxZzHe49xdvMcMdXD7sR8C2A0oCvNwLDkEIV1oMTyqU9Yd9A65hVWH/rS\nn7Cd9WdPj+SfLm5oKxjp6aG8oIWj7JYjjp6ua+xnW/y/ATAi0JVmYFgxZa+3jRQAXTl+dQXD\nihEL+7oLzIdodqIDAUnF8a4rGFaM+MRZePuTV9lK9yQ6EJBUHO+6gmHFig9OK7AXj/s10WGA\nJOM41xUMCwBgGmBYAADTAMMCAJgGGBYAwDTAsAAApgGGBQAwDTAsAIBpgGEBAEwDDAsAYBpg\nWAAA0wDDAgCYBhgWAMA0wLAAAKYBhgUAMA0wLACAaYBhAQBMAwwLAGAaYFgAANMAwwIAmAYY\nFgDANMCwAACmAYYFADANMCwAgGmAYQEATAMMCwBgGmBYAADTAMMCAJiG/wcvGObHHLD3LAAA\nAABJRU5ErkJggg==",
      "text/plain": [
       "Plot with title “our results”"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
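    "# Overlay per-sample density curves of an expression matrix (one curve per column)\n",
    "myhist <- function(expr, title = \"Density\", xlabel = \"RMAlog2\"){\n",
    " # the first sample sets up the axes\n",
    " plot(density(expr[,1]), main = title, xlab = xlabel)\n",
    " # the remaining samples are added as lines; seq_len()[-1] is empty for a one-column matrix\n",
    " for(i in seq_len(ncol(expr))[-1]){\n",
    "  lines(density(expr[,i]))\n",
    " }\n",
    "}\n",
    "options(repr.plot.width=10, repr.plot.height=3)\n",
    "par(mfrow = c(1,2))\n",
    "myhist(GEX, title = \"GEX\")\n",
    "myhist(exprs_GDSC_RMA_BrainArray_SYMBOL, title = \"our results\")"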
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compare the two expression matrices after annotation\n",
    "(they now share most of their row names)\n",
    " - count the rows in each matrix and the row names they share\n",
    " - correlate gene profiles between the two matrices\n",
    " - correlate sample profiles between the two matrices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>5500994157493061613625_A01</th><th scope=col>5500994157493061613625_A02</th><th scope=col>5500994157493061613625_A03</th><th scope=col>5500994157493061613625_A04</th><th scope=col>5500994157493061613625_A05</th><th scope=col>5500994157493061613625_A06</th><th scope=col>5500994157493061613625_A07</th><th scope=col>5500994157493061613625_A08</th><th scope=col>5500994157493061613625_A09</th><th scope=col>5500994157493061613625_A10</th><th scope=col>⋯</th><th scope=col>5500994175999120813240_H02</th><th scope=col>5500994175999120813240_H03</th><th scope=col>5500994175999120813240_H04</th><th scope=col>5500994175999120813240_H05</th><th scope=col>5500994175999120813240_H06</th><th scope=col>5500994175999120813240_H07</th><th scope=col>5500994175999120813240_H08</th><th scope=col>5500994175999120813240_H09</th><th scope=col>5500994175999120813240_H10</th><th scope=col>5500994175999120813240_H11</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>A1BG</th><td>6,210671</td><td>5,029226</td><td>5,513632</td><td>4,208249</td><td>3,402282</td><td>4,914035</td><td>3,827411</td><td>5,147878</td><td>3,110751</td><td>5,054884</td><td>⋯       </td><td>4,273165</td><td>3,442682</td><td>4,923977</td><td>2,901377</td><td>4,528475</td><td>5,079415</td><td>2,959578</td><td>3,094308</td><td>4,045297</td><td>5,339831</td></tr>\n",
       "\t<tr><th scope=row>A1CF</th><td>2,986196</td><td>2,952698</td><td>2,876904</td><td>3,078416</td><td>2,859689</td><td>3,226181</td><td>3,002562</td><td>2,899220</td><td>2,761808</td><td>2,991242</td><td>⋯       </td><td>2,948285</td><td>3,162107</td><td>2,989301</td><td>3,122200</td><td>2,981742</td><td>2,912033</td><td>2,950125</td><td>2,788239</td><td>2,874734</td><td>2,930411</td></tr>\n",
       "\t<tr><th scope=row>A2M</th><td>3,138822</td><td>3,339745</td><td>3,288143</td><td>3,035124</td><td>3,266711</td><td>6,878105</td><td>3,306808</td><td>3,376042</td><td>3,326467</td><td>3,532250</td><td>⋯       </td><td>3,898669</td><td>3,478617</td><td>3,251410</td><td>6,346759</td><td>3,532383</td><td>3,161646</td><td>3,203922</td><td>3,483303</td><td>5,147751</td><td>8,860099</td></tr>\n",
       "\t<tr><th scope=row>A2ML1</th><td>2,659014</td><td>2,740146</td><td>3,941870</td><td>2,782538</td><td>2,716182</td><td>2,912234</td><td>2,791071</td><td>2,829216</td><td>2,736346</td><td>2,977346</td><td>⋯       </td><td>2,904620</td><td>2,795384</td><td>2,714010</td><td>2,835382</td><td>2,763315</td><td>2,752095</td><td>2,911493</td><td>2,920386</td><td>2,794910</td><td>2,991144</td></tr>\n",
       "\t<tr><th scope=row>A3GALT2</th><td>2,765093</td><td>2,871171</td><td>3,009816</td><td>2,634662</td><td>2,697715</td><td>2,593990</td><td>2,929163</td><td>2,744387</td><td>2,998624</td><td>3,043765</td><td>⋯       </td><td>2,964269</td><td>3,067023</td><td>2,768809</td><td>2,607127</td><td>2,778964</td><td>2,850118</td><td>2,703913</td><td>2,663925</td><td>2,763491</td><td>2,604637</td></tr>\n",
       "\t<tr><th scope=row>A4GALT</th><td>3,342432</td><td>3,138229</td><td>3,513538</td><td>3,166587</td><td>3,240146</td><td>3,246475</td><td>3,801788</td><td>3,377124</td><td>3,288002</td><td>3,361292</td><td>⋯       </td><td>3,384112</td><td>3,070601</td><td>2,986975</td><td>3,334498</td><td>3,398661</td><td>3,195717</td><td>3,277447</td><td>4,407162</td><td>3,008970</td><td>3,053538</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & 5500994157493061613625\\_A01 & 5500994157493061613625\\_A02 & 5500994157493061613625\\_A03 & 5500994157493061613625\\_A04 & 5500994157493061613625\\_A05 & 5500994157493061613625\\_A06 & 5500994157493061613625\\_A07 & 5500994157493061613625\\_A08 & 5500994157493061613625\\_A09 & 5500994157493061613625\\_A10 & ⋯ & 5500994175999120813240\\_H02 & 5500994175999120813240\\_H03 & 5500994175999120813240\\_H04 & 5500994175999120813240\\_H05 & 5500994175999120813240\\_H06 & 5500994175999120813240\\_H07 & 5500994175999120813240\\_H08 & 5500994175999120813240\\_H09 & 5500994175999120813240\\_H10 & 5500994175999120813240\\_H11\\\\\n",
       "\\hline\n",
       "\tA1BG & 6,210671 & 5,029226 & 5,513632 & 4,208249 & 3,402282 & 4,914035 & 3,827411 & 5,147878 & 3,110751 & 5,054884 & ⋯        & 4,273165 & 3,442682 & 4,923977 & 2,901377 & 4,528475 & 5,079415 & 2,959578 & 3,094308 & 4,045297 & 5,339831\\\\\n",
       "\tA1CF & 2,986196 & 2,952698 & 2,876904 & 3,078416 & 2,859689 & 3,226181 & 3,002562 & 2,899220 & 2,761808 & 2,991242 & ⋯        & 2,948285 & 3,162107 & 2,989301 & 3,122200 & 2,981742 & 2,912033 & 2,950125 & 2,788239 & 2,874734 & 2,930411\\\\\n",
       "\tA2M & 3,138822 & 3,339745 & 3,288143 & 3,035124 & 3,266711 & 6,878105 & 3,306808 & 3,376042 & 3,326467 & 3,532250 & ⋯        & 3,898669 & 3,478617 & 3,251410 & 6,346759 & 3,532383 & 3,161646 & 3,203922 & 3,483303 & 5,147751 & 8,860099\\\\\n",
       "\tA2ML1 & 2,659014 & 2,740146 & 3,941870 & 2,782538 & 2,716182 & 2,912234 & 2,791071 & 2,829216 & 2,736346 & 2,977346 & ⋯        & 2,904620 & 2,795384 & 2,714010 & 2,835382 & 2,763315 & 2,752095 & 2,911493 & 2,920386 & 2,794910 & 2,991144\\\\\n",
       "\tA3GALT2 & 2,765093 & 2,871171 & 3,009816 & 2,634662 & 2,697715 & 2,593990 & 2,929163 & 2,744387 & 2,998624 & 3,043765 & ⋯        & 2,964269 & 3,067023 & 2,768809 & 2,607127 & 2,778964 & 2,850118 & 2,703913 & 2,663925 & 2,763491 & 2,604637\\\\\n",
       "\tA4GALT & 3,342432 & 3,138229 & 3,513538 & 3,166587 & 3,240146 & 3,246475 & 3,801788 & 3,377124 & 3,288002 & 3,361292 & ⋯        & 3,384112 & 3,070601 & 2,986975 & 3,334498 & 3,398661 & 3,195717 & 3,277447 & 4,407162 & 3,008970 & 3,053538\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | 5500994157493061613625_A01 | 5500994157493061613625_A02 | 5500994157493061613625_A03 | 5500994157493061613625_A04 | 5500994157493061613625_A05 | 5500994157493061613625_A06 | 5500994157493061613625_A07 | 5500994157493061613625_A08 | 5500994157493061613625_A09 | 5500994157493061613625_A10 | ⋯ | 5500994175999120813240_H02 | 5500994175999120813240_H03 | 5500994175999120813240_H04 | 5500994175999120813240_H05 | 5500994175999120813240_H06 | 5500994175999120813240_H07 | 5500994175999120813240_H08 | 5500994175999120813240_H09 | 5500994175999120813240_H10 | 5500994175999120813240_H11 | \n",
       "|---|---|---|---|---|---|\n",
       "| A1BG | 6,210671 | 5,029226 | 5,513632 | 4,208249 | 3,402282 | 4,914035 | 3,827411 | 5,147878 | 3,110751 | 5,054884 | ⋯        | 4,273165 | 3,442682 | 4,923977 | 2,901377 | 4,528475 | 5,079415 | 2,959578 | 3,094308 | 4,045297 | 5,339831 | \n",
       "| A1CF | 2,986196 | 2,952698 | 2,876904 | 3,078416 | 2,859689 | 3,226181 | 3,002562 | 2,899220 | 2,761808 | 2,991242 | ⋯        | 2,948285 | 3,162107 | 2,989301 | 3,122200 | 2,981742 | 2,912033 | 2,950125 | 2,788239 | 2,874734 | 2,930411 | \n",
       "| A2M | 3,138822 | 3,339745 | 3,288143 | 3,035124 | 3,266711 | 6,878105 | 3,306808 | 3,376042 | 3,326467 | 3,532250 | ⋯        | 3,898669 | 3,478617 | 3,251410 | 6,346759 | 3,532383 | 3,161646 | 3,203922 | 3,483303 | 5,147751 | 8,860099 | \n",
       "| A2ML1 | 2,659014 | 2,740146 | 3,941870 | 2,782538 | 2,716182 | 2,912234 | 2,791071 | 2,829216 | 2,736346 | 2,977346 | ⋯        | 2,904620 | 2,795384 | 2,714010 | 2,835382 | 2,763315 | 2,752095 | 2,911493 | 2,920386 | 2,794910 | 2,991144 | \n",
       "| A3GALT2 | 2,765093 | 2,871171 | 3,009816 | 2,634662 | 2,697715 | 2,593990 | 2,929163 | 2,744387 | 2,998624 | 3,043765 | ⋯        | 2,964269 | 3,067023 | 2,768809 | 2,607127 | 2,778964 | 2,850118 | 2,703913 | 2,663925 | 2,763491 | 2,604637 | \n",
       "| A4GALT | 3,342432 | 3,138229 | 3,513538 | 3,166587 | 3,240146 | 3,246475 | 3,801788 | 3,377124 | 3,288002 | 3,361292 | ⋯        | 3,384112 | 3,070601 | 2,986975 | 3,334498 | 3,398661 | 3,195717 | 3,277447 | 4,407162 | 3,008970 | 3,053538 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "        5500994157493061613625_A01 5500994157493061613625_A02\n",
       "A1BG    6,210671                   5,029226                  \n",
       "A1CF    2,986196                   2,952698                  \n",
       "A2M     3,138822                   3,339745                  \n",
       "A2ML1   2,659014                   2,740146                  \n",
       "A3GALT2 2,765093                   2,871171                  \n",
       "A4GALT  3,342432                   3,138229                  \n",
       "        5500994157493061613625_A03 5500994157493061613625_A04\n",
       "A1BG    5,513632                   4,208249                  \n",
       "A1CF    2,876904                   3,078416                  \n",
       "A2M     3,288143                   3,035124                  \n",
       "A2ML1   3,941870                   2,782538                  \n",
       "A3GALT2 3,009816                   2,634662                  \n",
       "A4GALT  3,513538                   3,166587                  \n",
       "        5500994157493061613625_A05 5500994157493061613625_A06\n",
       "A1BG    3,402282                   4,914035                  \n",
       "A1CF    2,859689                   3,226181                  \n",
       "A2M     3,266711                   6,878105                  \n",
       "A2ML1   2,716182                   2,912234                  \n",
       "A3GALT2 2,697715                   2,593990                  \n",
       "A4GALT  3,240146                   3,246475                  \n",
       "        5500994157493061613625_A07 5500994157493061613625_A08\n",
       "A1BG    3,827411                   5,147878                  \n",
       "A1CF    3,002562                   2,899220                  \n",
       "A2M     3,306808                   3,376042                  \n",
       "A2ML1   2,791071                   2,829216                  \n",
       "A3GALT2 2,929163                   2,744387                  \n",
       "A4GALT  3,801788                   3,377124                  \n",
       "        5500994157493061613625_A09 5500994157493061613625_A10 ⋯\n",
       "A1BG    3,110751                   5,054884                   ⋯\n",
       "A1CF    2,761808                   2,991242                   ⋯\n",
       "A2M     3,326467                   3,532250                   ⋯\n",
       "A2ML1   2,736346                   2,977346                   ⋯\n",
       "A3GALT2 2,998624                   3,043765                   ⋯\n",
       "A4GALT  3,288002                   3,361292                   ⋯\n",
       "        5500994175999120813240_H02 5500994175999120813240_H03\n",
       "A1BG    4,273165                   3,442682                  \n",
       "A1CF    2,948285                   3,162107                  \n",
       "A2M     3,898669                   3,478617                  \n",
       "A2ML1   2,904620                   2,795384                  \n",
       "A3GALT2 2,964269                   3,067023                  \n",
       "A4GALT  3,384112                   3,070601                  \n",
       "        5500994175999120813240_H04 5500994175999120813240_H05\n",
       "A1BG    4,923977                   2,901377                  \n",
       "A1CF    2,989301                   3,122200                  \n",
       "A2M     3,251410                   6,346759                  \n",
       "A2ML1   2,714010                   2,835382                  \n",
       "A3GALT2 2,768809                   2,607127                  \n",
       "A4GALT  2,986975                   3,334498                  \n",
       "        5500994175999120813240_H06 5500994175999120813240_H07\n",
       "A1BG    4,528475                   5,079415                  \n",
       "A1CF    2,981742                   2,912033                  \n",
       "A2M     3,532383                   3,161646                  \n",
       "A2ML1   2,763315                   2,752095                  \n",
       "A3GALT2 2,778964                   2,850118                  \n",
       "A4GALT  3,398661                   3,195717                  \n",
       "        5500994175999120813240_H08 5500994175999120813240_H09\n",
       "A1BG    2,959578                   3,094308                  \n",
       "A1CF    2,950125                   2,788239                  \n",
       "A2M     3,203922                   3,483303                  \n",
       "A2ML1   2,911493                   2,920386                  \n",
       "A3GALT2 2,703913                   2,663925                  \n",
       "A4GALT  3,277447                   4,407162                  \n",
       "        5500994175999120813240_H10 5500994175999120813240_H11\n",
       "A1BG    4,045297                   5,339831                  \n",
       "A1CF    2,874734                   2,930411                  \n",
       "A2M     5,147751                   8,860099                  \n",
       "A2ML1   2,794910                   2,991144                  \n",
       "A3GALT2 2,763491                   2,604637                  \n",
       "A4GALT  3,008970                   3,053538                  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "934"
      ],
      "text/latex": [
       "934"
      ],
      "text/markdown": [
       "934"
      ],
      "text/plain": [
       "[1] 934"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>Assay.Name</th><th scope=col>Characteristics.cell.line.</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>5500994157493061613625_A01</th><td>5500994157493061613625_A01</td><td>CAL.120                   </td></tr>\n",
       "\t<tr><th scope=row>5500994157493061613625_A02</th><td>5500994157493061613625_A02</td><td>DMS.114                   </td></tr>\n",
       "\t<tr><th scope=row>5500994157493061613625_A03</th><td>5500994157493061613625_A03</td><td>CAL.51                    </td></tr>\n",
       "\t<tr><th scope=row>5500994157493061613625_A04</th><td>5500994157493061613625_A04</td><td>H2869                     </td></tr>\n",
       "\t<tr><th scope=row>5500994157493061613625_A05</th><td>5500994157493061613625_A05</td><td>H290                      </td></tr>\n",
       "\t<tr><th scope=row>5500994157493061613625_A06</th><td>5500994157493061613625_A06</td><td>A2058                     </td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ll}\n",
       "  & Assay.Name & Characteristics.cell.line.\\\\\n",
       "\\hline\n",
       "\t5500994157493061613625\\_A01 & 5500994157493061613625\\_A01 & CAL.120                     \\\\\n",
       "\t5500994157493061613625\\_A02 & 5500994157493061613625\\_A02 & DMS.114                     \\\\\n",
       "\t5500994157493061613625\\_A03 & 5500994157493061613625\\_A03 & CAL.51                      \\\\\n",
       "\t5500994157493061613625\\_A04 & 5500994157493061613625\\_A04 & H2869                       \\\\\n",
       "\t5500994157493061613625\\_A05 & 5500994157493061613625\\_A05 & H290                        \\\\\n",
       "\t5500994157493061613625\\_A06 & 5500994157493061613625\\_A06 & A2058                       \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | Assay.Name | Characteristics.cell.line. | \n",
       "|---|---|---|---|---|---|\n",
       "| 5500994157493061613625_A01 | 5500994157493061613625_A01 | CAL.120                    | \n",
       "| 5500994157493061613625_A02 | 5500994157493061613625_A02 | DMS.114                    | \n",
       "| 5500994157493061613625_A03 | 5500994157493061613625_A03 | CAL.51                     | \n",
       "| 5500994157493061613625_A04 | 5500994157493061613625_A04 | H2869                      | \n",
       "| 5500994157493061613625_A05 | 5500994157493061613625_A05 | H290                       | \n",
       "| 5500994157493061613625_A06 | 5500994157493061613625_A06 | A2058                      | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "                           Assay.Name                \n",
       "5500994157493061613625_A01 5500994157493061613625_A01\n",
       "5500994157493061613625_A02 5500994157493061613625_A02\n",
       "5500994157493061613625_A03 5500994157493061613625_A03\n",
       "5500994157493061613625_A04 5500994157493061613625_A04\n",
       "5500994157493061613625_A05 5500994157493061613625_A05\n",
       "5500994157493061613625_A06 5500994157493061613625_A06\n",
       "                           Characteristics.cell.line.\n",
       "5500994157493061613625_A01 CAL.120                   \n",
       "5500994157493061613625_A02 DMS.114                   \n",
       "5500994157493061613625_A03 CAL.51                    \n",
       "5500994157493061613625_A04 H2869                     \n",
       "5500994157493061613625_A05 H290                      \n",
       "5500994157493061613625_A06 A2058                     "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>Assay.Name</th><th scope=col>Characteristics.cell.line.</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>5500994175999120813240_H05</th><td>5500994175999120813240_H05</td><td>MOG.G.UVW                 </td></tr>\n",
       "\t<tr><th scope=row>5500994175999120813240_H06</th><td>5500994175999120813240_H06</td><td>MET.2B                    </td></tr>\n",
       "\t<tr><th scope=row>5500994175999120813240_H07</th><td>5500994175999120813240_H07</td><td>JIMT.1                    </td></tr>\n",
       "\t<tr><th scope=row>5500994175999120813240_H08</th><td>5500994175999120813240_H08</td><td>PEO1                      </td></tr>\n",
       "\t<tr><th scope=row>5500994175999120813240_H09</th><td>5500994175999120813240_H09</td><td>PL4                       </td></tr>\n",
       "\t<tr><th scope=row>5500994175999120813240_H10</th><td>5500994175999120813240_H10</td><td>TT                        </td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ll}\n",
       "  & Assay.Name & Characteristics.cell.line.\\\\\n",
       "\\hline\n",
       "\t5500994175999120813240\\_H05 & 5500994175999120813240\\_H05 & MOG.G.UVW                   \\\\\n",
       "\t5500994175999120813240\\_H06 & 5500994175999120813240\\_H06 & MET.2B                      \\\\\n",
       "\t5500994175999120813240\\_H07 & 5500994175999120813240\\_H07 & JIMT.1                      \\\\\n",
       "\t5500994175999120813240\\_H08 & 5500994175999120813240\\_H08 & PEO1                        \\\\\n",
       "\t5500994175999120813240\\_H09 & 5500994175999120813240\\_H09 & PL4                         \\\\\n",
       "\t5500994175999120813240\\_H10 & 5500994175999120813240\\_H10 & TT                          \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | Assay.Name | Characteristics.cell.line. | \n",
       "|---|---|---|---|---|---|\n",
       "| 5500994175999120813240_H05 | 5500994175999120813240_H05 | MOG.G.UVW                  | \n",
       "| 5500994175999120813240_H06 | 5500994175999120813240_H06 | MET.2B                     | \n",
       "| 5500994175999120813240_H07 | 5500994175999120813240_H07 | JIMT.1                     | \n",
       "| 5500994175999120813240_H08 | 5500994175999120813240_H08 | PEO1                       | \n",
       "| 5500994175999120813240_H09 | 5500994175999120813240_H09 | PL4                        | \n",
       "| 5500994175999120813240_H10 | 5500994175999120813240_H10 | TT                         | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "                           Assay.Name                \n",
       "5500994175999120813240_H05 5500994175999120813240_H05\n",
       "5500994175999120813240_H06 5500994175999120813240_H06\n",
       "5500994175999120813240_H07 5500994175999120813240_H07\n",
       "5500994175999120813240_H08 5500994175999120813240_H08\n",
       "5500994175999120813240_H09 5500994175999120813240_H09\n",
       "5500994175999120813240_H10 5500994175999120813240_H10\n",
       "                           Characteristics.cell.line.\n",
       "5500994175999120813240_H05 MOG.G.UVW                 \n",
       "5500994175999120813240_H06 MET.2B                    \n",
       "5500994175999120813240_H07 JIMT.1                    \n",
       "5500994175999120813240_H08 PEO1                      \n",
       "5500994175999120813240_H09 PL4                       \n",
       "5500994175999120813240_H10 TT                        "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>5500994157493061613625_A01</th><th scope=col>5500994157493061613625_A02</th><th scope=col>5500994157493061613625_A03</th><th scope=col>5500994157493061613625_A04</th><th scope=col>5500994157493061613625_A05</th><th scope=col>5500994157493061613625_A06</th><th scope=col>5500994157493061613625_A07</th><th scope=col>5500994157493061613625_A08</th><th scope=col>5500994157493061613625_A09</th><th scope=col>5500994157493061613625_A10</th><th scope=col>⋯</th><th scope=col>5500994175999120813240_H01</th><th scope=col>5500994175999120813240_H02</th><th scope=col>5500994175999120813240_H03</th><th scope=col>5500994175999120813240_H04</th><th scope=col>5500994175999120813240_H05</th><th scope=col>5500994175999120813240_H06</th><th scope=col>5500994175999120813240_H07</th><th scope=col>5500994175999120813240_H08</th><th scope=col>5500994175999120813240_H09</th><th scope=col>5500994175999120813240_H10</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>A1BG</th><td>6,210671</td><td>5,029226</td><td>5,513632</td><td>4,208249</td><td>3,402282</td><td>4,914035</td><td>3,827411</td><td>5,147878</td><td>3,110751</td><td>5,054884</td><td>⋯       </td><td>3,853365</td><td>4,273165</td><td>3,442682</td><td>4,923977</td><td>2,901377</td><td>4,528475</td><td>5,079415</td><td>2,959578</td><td>3,094308</td><td>4,045297</td></tr>\n",
       "\t<tr><th scope=row>A1CF</th><td>2,986196</td><td>2,952698</td><td>2,876904</td><td>3,078416</td><td>2,859689</td><td>3,226181</td><td>3,002562</td><td>2,899220</td><td>2,761808</td><td>2,991242</td><td>⋯       </td><td>7,746021</td><td>2,948285</td><td>3,162107</td><td>2,989301</td><td>3,122200</td><td>2,981742</td><td>2,912033</td><td>2,950125</td><td>2,788239</td><td>2,874734</td></tr>\n",
       "\t<tr><th scope=row>A2M</th><td>3,138822</td><td>3,339745</td><td>3,288143</td><td>3,035124</td><td>3,266711</td><td>6,878105</td><td>3,306808</td><td>3,376042</td><td>3,326467</td><td>3,532250</td><td>⋯       </td><td>3,801339</td><td>3,898669</td><td>3,478617</td><td>3,251410</td><td>6,346759</td><td>3,532383</td><td>3,161646</td><td>3,203922</td><td>3,483303</td><td>5,147751</td></tr>\n",
       "\t<tr><th scope=row>A2ML1</th><td>2,659014</td><td>2,740146</td><td>3,941870</td><td>2,782538</td><td>2,716182</td><td>2,912234</td><td>2,791071</td><td>2,829216</td><td>2,736346</td><td>2,977346</td><td>⋯       </td><td>2,788015</td><td>2,904620</td><td>2,795384</td><td>2,714010</td><td>2,835382</td><td>2,763315</td><td>2,752095</td><td>2,911493</td><td>2,920386</td><td>2,794910</td></tr>\n",
       "\t<tr><th scope=row>A3GALT2</th><td>2,765093</td><td>2,871171</td><td>3,009816</td><td>2,634662</td><td>2,697715</td><td>2,593990</td><td>2,929163</td><td>2,744387</td><td>2,998624</td><td>3,043765</td><td>⋯       </td><td>2,579725</td><td>2,964269</td><td>3,067023</td><td>2,768809</td><td>2,607127</td><td>2,778964</td><td>2,850118</td><td>2,703913</td><td>2,663925</td><td>2,763491</td></tr>\n",
       "\t<tr><th scope=row>A4GALT</th><td>3,342432</td><td>3,138229</td><td>3,513538</td><td>3,166587</td><td>3,240146</td><td>3,246475</td><td>3,801788</td><td>3,377124</td><td>3,288002</td><td>3,361292</td><td>⋯       </td><td>3,378919</td><td>3,384112</td><td>3,070601</td><td>2,986975</td><td>3,334498</td><td>3,398661</td><td>3,195717</td><td>3,277447</td><td>4,407162</td><td>3,008970</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & 5500994157493061613625\\_A01 & 5500994157493061613625\\_A02 & 5500994157493061613625\\_A03 & 5500994157493061613625\\_A04 & 5500994157493061613625\\_A05 & 5500994157493061613625\\_A06 & 5500994157493061613625\\_A07 & 5500994157493061613625\\_A08 & 5500994157493061613625\\_A09 & 5500994157493061613625\\_A10 & ⋯ & 5500994175999120813240\\_H01 & 5500994175999120813240\\_H02 & 5500994175999120813240\\_H03 & 5500994175999120813240\\_H04 & 5500994175999120813240\\_H05 & 5500994175999120813240\\_H06 & 5500994175999120813240\\_H07 & 5500994175999120813240\\_H08 & 5500994175999120813240\\_H09 & 5500994175999120813240\\_H10\\\\\n",
       "\\hline\n",
       "\tA1BG & 6,210671 & 5,029226 & 5,513632 & 4,208249 & 3,402282 & 4,914035 & 3,827411 & 5,147878 & 3,110751 & 5,054884 & ⋯        & 3,853365 & 4,273165 & 3,442682 & 4,923977 & 2,901377 & 4,528475 & 5,079415 & 2,959578 & 3,094308 & 4,045297\\\\\n",
       "\tA1CF & 2,986196 & 2,952698 & 2,876904 & 3,078416 & 2,859689 & 3,226181 & 3,002562 & 2,899220 & 2,761808 & 2,991242 & ⋯        & 7,746021 & 2,948285 & 3,162107 & 2,989301 & 3,122200 & 2,981742 & 2,912033 & 2,950125 & 2,788239 & 2,874734\\\\\n",
       "\tA2M & 3,138822 & 3,339745 & 3,288143 & 3,035124 & 3,266711 & 6,878105 & 3,306808 & 3,376042 & 3,326467 & 3,532250 & ⋯        & 3,801339 & 3,898669 & 3,478617 & 3,251410 & 6,346759 & 3,532383 & 3,161646 & 3,203922 & 3,483303 & 5,147751\\\\\n",
       "\tA2ML1 & 2,659014 & 2,740146 & 3,941870 & 2,782538 & 2,716182 & 2,912234 & 2,791071 & 2,829216 & 2,736346 & 2,977346 & ⋯        & 2,788015 & 2,904620 & 2,795384 & 2,714010 & 2,835382 & 2,763315 & 2,752095 & 2,911493 & 2,920386 & 2,794910\\\\\n",
       "\tA3GALT2 & 2,765093 & 2,871171 & 3,009816 & 2,634662 & 2,697715 & 2,593990 & 2,929163 & 2,744387 & 2,998624 & 3,043765 & ⋯        & 2,579725 & 2,964269 & 3,067023 & 2,768809 & 2,607127 & 2,778964 & 2,850118 & 2,703913 & 2,663925 & 2,763491\\\\\n",
       "\tA4GALT & 3,342432 & 3,138229 & 3,513538 & 3,166587 & 3,240146 & 3,246475 & 3,801788 & 3,377124 & 3,288002 & 3,361292 & ⋯        & 3,378919 & 3,384112 & 3,070601 & 2,986975 & 3,334498 & 3,398661 & 3,195717 & 3,277447 & 4,407162 & 3,008970\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | 5500994157493061613625_A01 | 5500994157493061613625_A02 | 5500994157493061613625_A03 | 5500994157493061613625_A04 | 5500994157493061613625_A05 | 5500994157493061613625_A06 | 5500994157493061613625_A07 | 5500994157493061613625_A08 | 5500994157493061613625_A09 | 5500994157493061613625_A10 | ⋯ | 5500994175999120813240_H01 | 5500994175999120813240_H02 | 5500994175999120813240_H03 | 5500994175999120813240_H04 | 5500994175999120813240_H05 | 5500994175999120813240_H06 | 5500994175999120813240_H07 | 5500994175999120813240_H08 | 5500994175999120813240_H09 | 5500994175999120813240_H10 | \n",
       "|---|---|---|---|---|---|\n",
       "| A1BG | 6,210671 | 5,029226 | 5,513632 | 4,208249 | 3,402282 | 4,914035 | 3,827411 | 5,147878 | 3,110751 | 5,054884 | ⋯        | 3,853365 | 4,273165 | 3,442682 | 4,923977 | 2,901377 | 4,528475 | 5,079415 | 2,959578 | 3,094308 | 4,045297 | \n",
       "| A1CF | 2,986196 | 2,952698 | 2,876904 | 3,078416 | 2,859689 | 3,226181 | 3,002562 | 2,899220 | 2,761808 | 2,991242 | ⋯        | 7,746021 | 2,948285 | 3,162107 | 2,989301 | 3,122200 | 2,981742 | 2,912033 | 2,950125 | 2,788239 | 2,874734 | \n",
       "| A2M | 3,138822 | 3,339745 | 3,288143 | 3,035124 | 3,266711 | 6,878105 | 3,306808 | 3,376042 | 3,326467 | 3,532250 | ⋯        | 3,801339 | 3,898669 | 3,478617 | 3,251410 | 6,346759 | 3,532383 | 3,161646 | 3,203922 | 3,483303 | 5,147751 | \n",
       "| A2ML1 | 2,659014 | 2,740146 | 3,941870 | 2,782538 | 2,716182 | 2,912234 | 2,791071 | 2,829216 | 2,736346 | 2,977346 | ⋯        | 2,788015 | 2,904620 | 2,795384 | 2,714010 | 2,835382 | 2,763315 | 2,752095 | 2,911493 | 2,920386 | 2,794910 | \n",
       "| A3GALT2 | 2,765093 | 2,871171 | 3,009816 | 2,634662 | 2,697715 | 2,593990 | 2,929163 | 2,744387 | 2,998624 | 3,043765 | ⋯        | 2,579725 | 2,964269 | 3,067023 | 2,768809 | 2,607127 | 2,778964 | 2,850118 | 2,703913 | 2,663925 | 2,763491 | \n",
       "| A4GALT | 3,342432 | 3,138229 | 3,513538 | 3,166587 | 3,240146 | 3,246475 | 3,801788 | 3,377124 | 3,288002 | 3,361292 | ⋯        | 3,378919 | 3,384112 | 3,070601 | 2,986975 | 3,334498 | 3,398661 | 3,195717 | 3,277447 | 4,407162 | 3,008970 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "        5500994157493061613625_A01 5500994157493061613625_A02\n",
       "A1BG    6,210671                   5,029226                  \n",
       "A1CF    2,986196                   2,952698                  \n",
       "A2M     3,138822                   3,339745                  \n",
       "A2ML1   2,659014                   2,740146                  \n",
       "A3GALT2 2,765093                   2,871171                  \n",
       "A4GALT  3,342432                   3,138229                  \n",
       "        5500994157493061613625_A03 5500994157493061613625_A04\n",
       "A1BG    5,513632                   4,208249                  \n",
       "A1CF    2,876904                   3,078416                  \n",
       "A2M     3,288143                   3,035124                  \n",
       "A2ML1   3,941870                   2,782538                  \n",
       "A3GALT2 3,009816                   2,634662                  \n",
       "A4GALT  3,513538                   3,166587                  \n",
       "        5500994157493061613625_A05 5500994157493061613625_A06\n",
       "A1BG    3,402282                   4,914035                  \n",
       "A1CF    2,859689                   3,226181                  \n",
       "A2M     3,266711                   6,878105                  \n",
       "A2ML1   2,716182                   2,912234                  \n",
       "A3GALT2 2,697715                   2,593990                  \n",
       "A4GALT  3,240146                   3,246475                  \n",
       "        5500994157493061613625_A07 5500994157493061613625_A08\n",
       "A1BG    3,827411                   5,147878                  \n",
       "A1CF    3,002562                   2,899220                  \n",
       "A2M     3,306808                   3,376042                  \n",
       "A2ML1   2,791071                   2,829216                  \n",
       "A3GALT2 2,929163                   2,744387                  \n",
       "A4GALT  3,801788                   3,377124                  \n",
       "        5500994157493061613625_A09 5500994157493061613625_A10 ⋯\n",
       "A1BG    3,110751                   5,054884                   ⋯\n",
       "A1CF    2,761808                   2,991242                   ⋯\n",
       "A2M     3,326467                   3,532250                   ⋯\n",
       "A2ML1   2,736346                   2,977346                   ⋯\n",
       "A3GALT2 2,998624                   3,043765                   ⋯\n",
       "A4GALT  3,288002                   3,361292                   ⋯\n",
       "        5500994175999120813240_H01 5500994175999120813240_H02\n",
       "A1BG    3,853365                   4,273165                  \n",
       "A1CF    7,746021                   2,948285                  \n",
       "A2M     3,801339                   3,898669                  \n",
       "A2ML1   2,788015                   2,904620                  \n",
       "A3GALT2 2,579725                   2,964269                  \n",
       "A4GALT  3,378919                   3,384112                  \n",
       "        5500994175999120813240_H03 5500994175999120813240_H04\n",
       "A1BG    3,442682                   4,923977                  \n",
       "A1CF    3,162107                   2,989301                  \n",
       "A2M     3,478617                   3,251410                  \n",
       "A2ML1   2,795384                   2,714010                  \n",
       "A3GALT2 3,067023                   2,768809                  \n",
       "A4GALT  3,070601                   2,986975                  \n",
       "        5500994175999120813240_H05 5500994175999120813240_H06\n",
       "A1BG    2,901377                   4,528475                  \n",
       "A1CF    3,122200                   2,981742                  \n",
       "A2M     6,346759                   3,532383                  \n",
       "A2ML1   2,835382                   2,763315                  \n",
       "A3GALT2 2,607127                   2,778964                  \n",
       "A4GALT  3,334498                   3,398661                  \n",
       "        5500994175999120813240_H07 5500994175999120813240_H08\n",
       "A1BG    5,079415                   2,959578                  \n",
       "A1CF    2,912033                   2,950125                  \n",
       "A2M     3,161646                   3,203922                  \n",
       "A2ML1   2,752095                   2,911493                  \n",
       "A3GALT2 2,850118                   2,703913                  \n",
       "A4GALT  3,195717                   3,277447                  \n",
       "        5500994175999120813240_H09 5500994175999120813240_H10\n",
       "A1BG    3,094308                   4,045297                  \n",
       "A1CF    2,788239                   2,874734                  \n",
       "A2M     3,483303                   5,147751                  \n",
       "A2ML1   2,920386                   2,794910                  \n",
       "A3GALT2 2,663925                   2,763491                  \n",
       "A4GALT  4,407162                   3,008970                  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>CAL.120</th><th scope=col>DMS.114</th><th scope=col>CAL.51</th><th scope=col>H2869</th><th scope=col>H290</th><th scope=col>A2058</th><th scope=col>CESS</th><th scope=col>H2803</th><th scope=col>H2722</th><th scope=col>HCC1419</th><th scope=col>⋯</th><th scope=col>NCI.H716</th><th scope=col>LB831.BLC</th><th scope=col>HC.1</th><th scope=col>JURKAT</th><th scope=col>MOG.G.UVW</th><th scope=col>MET.2B</th><th scope=col>JIMT.1</th><th scope=col>PEO1</th><th scope=col>PL4</th><th scope=col>TT</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>A1BG</th><td>6,210671</td><td>5,029226</td><td>5,513632</td><td>4,208249</td><td>3,402282</td><td>4,914035</td><td>3,827411</td><td>5,147878</td><td>3,110751</td><td>5,054884</td><td>⋯       </td><td>3,853365</td><td>4,273165</td><td>3,442682</td><td>4,923977</td><td>2,901377</td><td>4,528475</td><td>5,079415</td><td>2,959578</td><td>3,094308</td><td>4,045297</td></tr>\n",
       "\t<tr><th scope=row>A1CF</th><td>2,986196</td><td>2,952698</td><td>2,876904</td><td>3,078416</td><td>2,859689</td><td>3,226181</td><td>3,002562</td><td>2,899220</td><td>2,761808</td><td>2,991242</td><td>⋯       </td><td>7,746021</td><td>2,948285</td><td>3,162107</td><td>2,989301</td><td>3,122200</td><td>2,981742</td><td>2,912033</td><td>2,950125</td><td>2,788239</td><td>2,874734</td></tr>\n",
       "\t<tr><th scope=row>A2M</th><td>3,138822</td><td>3,339745</td><td>3,288143</td><td>3,035124</td><td>3,266711</td><td>6,878105</td><td>3,306808</td><td>3,376042</td><td>3,326467</td><td>3,532250</td><td>⋯       </td><td>3,801339</td><td>3,898669</td><td>3,478617</td><td>3,251410</td><td>6,346759</td><td>3,532383</td><td>3,161646</td><td>3,203922</td><td>3,483303</td><td>5,147751</td></tr>\n",
       "\t<tr><th scope=row>A2ML1</th><td>2,659014</td><td>2,740146</td><td>3,941870</td><td>2,782538</td><td>2,716182</td><td>2,912234</td><td>2,791071</td><td>2,829216</td><td>2,736346</td><td>2,977346</td><td>⋯       </td><td>2,788015</td><td>2,904620</td><td>2,795384</td><td>2,714010</td><td>2,835382</td><td>2,763315</td><td>2,752095</td><td>2,911493</td><td>2,920386</td><td>2,794910</td></tr>\n",
       "\t<tr><th scope=row>A3GALT2</th><td>2,765093</td><td>2,871171</td><td>3,009816</td><td>2,634662</td><td>2,697715</td><td>2,593990</td><td>2,929163</td><td>2,744387</td><td>2,998624</td><td>3,043765</td><td>⋯       </td><td>2,579725</td><td>2,964269</td><td>3,067023</td><td>2,768809</td><td>2,607127</td><td>2,778964</td><td>2,850118</td><td>2,703913</td><td>2,663925</td><td>2,763491</td></tr>\n",
       "\t<tr><th scope=row>A4GALT</th><td>3,342432</td><td>3,138229</td><td>3,513538</td><td>3,166587</td><td>3,240146</td><td>3,246475</td><td>3,801788</td><td>3,377124</td><td>3,288002</td><td>3,361292</td><td>⋯       </td><td>3,378919</td><td>3,384112</td><td>3,070601</td><td>2,986975</td><td>3,334498</td><td>3,398661</td><td>3,195717</td><td>3,277447</td><td>4,407162</td><td>3,008970</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll}\n",
       "  & CAL.120 & DMS.114 & CAL.51 & H2869 & H290 & A2058 & CESS & H2803 & H2722 & HCC1419 & ⋯ & NCI.H716 & LB831.BLC & HC.1 & JURKAT & MOG.G.UVW & MET.2B & JIMT.1 & PEO1 & PL4 & TT\\\\\n",
       "\\hline\n",
       "\tA1BG & 6,210671 & 5,029226 & 5,513632 & 4,208249 & 3,402282 & 4,914035 & 3,827411 & 5,147878 & 3,110751 & 5,054884 & ⋯        & 3,853365 & 4,273165 & 3,442682 & 4,923977 & 2,901377 & 4,528475 & 5,079415 & 2,959578 & 3,094308 & 4,045297\\\\\n",
       "\tA1CF & 2,986196 & 2,952698 & 2,876904 & 3,078416 & 2,859689 & 3,226181 & 3,002562 & 2,899220 & 2,761808 & 2,991242 & ⋯        & 7,746021 & 2,948285 & 3,162107 & 2,989301 & 3,122200 & 2,981742 & 2,912033 & 2,950125 & 2,788239 & 2,874734\\\\\n",
       "\tA2M & 3,138822 & 3,339745 & 3,288143 & 3,035124 & 3,266711 & 6,878105 & 3,306808 & 3,376042 & 3,326467 & 3,532250 & ⋯        & 3,801339 & 3,898669 & 3,478617 & 3,251410 & 6,346759 & 3,532383 & 3,161646 & 3,203922 & 3,483303 & 5,147751\\\\\n",
       "\tA2ML1 & 2,659014 & 2,740146 & 3,941870 & 2,782538 & 2,716182 & 2,912234 & 2,791071 & 2,829216 & 2,736346 & 2,977346 & ⋯        & 2,788015 & 2,904620 & 2,795384 & 2,714010 & 2,835382 & 2,763315 & 2,752095 & 2,911493 & 2,920386 & 2,794910\\\\\n",
       "\tA3GALT2 & 2,765093 & 2,871171 & 3,009816 & 2,634662 & 2,697715 & 2,593990 & 2,929163 & 2,744387 & 2,998624 & 3,043765 & ⋯        & 2,579725 & 2,964269 & 3,067023 & 2,768809 & 2,607127 & 2,778964 & 2,850118 & 2,703913 & 2,663925 & 2,763491\\\\\n",
       "\tA4GALT & 3,342432 & 3,138229 & 3,513538 & 3,166587 & 3,240146 & 3,246475 & 3,801788 & 3,377124 & 3,288002 & 3,361292 & ⋯        & 3,378919 & 3,384112 & 3,070601 & 2,986975 & 3,334498 & 3,398661 & 3,195717 & 3,277447 & 4,407162 & 3,008970\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | CAL.120 | DMS.114 | CAL.51 | H2869 | H290 | A2058 | CESS | H2803 | H2722 | HCC1419 | ⋯ | NCI.H716 | LB831.BLC | HC.1 | JURKAT | MOG.G.UVW | MET.2B | JIMT.1 | PEO1 | PL4 | TT | \n",
       "|---|---|---|---|---|---|\n",
       "| A1BG | 6,210671 | 5,029226 | 5,513632 | 4,208249 | 3,402282 | 4,914035 | 3,827411 | 5,147878 | 3,110751 | 5,054884 | ⋯        | 3,853365 | 4,273165 | 3,442682 | 4,923977 | 2,901377 | 4,528475 | 5,079415 | 2,959578 | 3,094308 | 4,045297 | \n",
       "| A1CF | 2,986196 | 2,952698 | 2,876904 | 3,078416 | 2,859689 | 3,226181 | 3,002562 | 2,899220 | 2,761808 | 2,991242 | ⋯        | 7,746021 | 2,948285 | 3,162107 | 2,989301 | 3,122200 | 2,981742 | 2,912033 | 2,950125 | 2,788239 | 2,874734 | \n",
       "| A2M | 3,138822 | 3,339745 | 3,288143 | 3,035124 | 3,266711 | 6,878105 | 3,306808 | 3,376042 | 3,326467 | 3,532250 | ⋯        | 3,801339 | 3,898669 | 3,478617 | 3,251410 | 6,346759 | 3,532383 | 3,161646 | 3,203922 | 3,483303 | 5,147751 | \n",
       "| A2ML1 | 2,659014 | 2,740146 | 3,941870 | 2,782538 | 2,716182 | 2,912234 | 2,791071 | 2,829216 | 2,736346 | 2,977346 | ⋯        | 2,788015 | 2,904620 | 2,795384 | 2,714010 | 2,835382 | 2,763315 | 2,752095 | 2,911493 | 2,920386 | 2,794910 | \n",
       "| A3GALT2 | 2,765093 | 2,871171 | 3,009816 | 2,634662 | 2,697715 | 2,593990 | 2,929163 | 2,744387 | 2,998624 | 3,043765 | ⋯        | 2,579725 | 2,964269 | 3,067023 | 2,768809 | 2,607127 | 2,778964 | 2,850118 | 2,703913 | 2,663925 | 2,763491 | \n",
       "| A4GALT | 3,342432 | 3,138229 | 3,513538 | 3,166587 | 3,240146 | 3,246475 | 3,801788 | 3,377124 | 3,288002 | 3,361292 | ⋯        | 3,378919 | 3,384112 | 3,070601 | 2,986975 | 3,334498 | 3,398661 | 3,195717 | 3,277447 | 4,407162 | 3,008970 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "        CAL.120  DMS.114  CAL.51   H2869    H290     A2058    CESS     H2803   \n",
       "A1BG    6,210671 5,029226 5,513632 4,208249 3,402282 4,914035 3,827411 5,147878\n",
       "A1CF    2,986196 2,952698 2,876904 3,078416 2,859689 3,226181 3,002562 2,899220\n",
       "A2M     3,138822 3,339745 3,288143 3,035124 3,266711 6,878105 3,306808 3,376042\n",
       "A2ML1   2,659014 2,740146 3,941870 2,782538 2,716182 2,912234 2,791071 2,829216\n",
       "A3GALT2 2,765093 2,871171 3,009816 2,634662 2,697715 2,593990 2,929163 2,744387\n",
       "A4GALT  3,342432 3,138229 3,513538 3,166587 3,240146 3,246475 3,801788 3,377124\n",
       "        H2722    HCC1419  ⋯ NCI.H716 LB831.BLC HC.1     JURKAT   MOG.G.UVW\n",
       "A1BG    3,110751 5,054884 ⋯ 3,853365 4,273165  3,442682 4,923977 2,901377 \n",
       "A1CF    2,761808 2,991242 ⋯ 7,746021 2,948285  3,162107 2,989301 3,122200 \n",
       "A2M     3,326467 3,532250 ⋯ 3,801339 3,898669  3,478617 3,251410 6,346759 \n",
       "A2ML1   2,736346 2,977346 ⋯ 2,788015 2,904620  2,795384 2,714010 2,835382 \n",
       "A3GALT2 2,998624 3,043765 ⋯ 2,579725 2,964269  3,067023 2,768809 2,607127 \n",
       "A4GALT  3,288002 3,361292 ⋯ 3,378919 3,384112  3,070601 2,986975 3,334498 \n",
       "        MET.2B   JIMT.1   PEO1     PL4      TT      \n",
       "A1BG    4,528475 5,079415 2,959578 3,094308 4,045297\n",
       "A1CF    2,981742 2,912033 2,950125 2,788239 2,874734\n",
       "A2M     3,532383 3,161646 3,203922 3,483303 5,147751\n",
       "A2ML1   2,763315 2,752095 2,911493 2,920386 2,794910\n",
       "A3GALT2 2,778964 2,850118 2,703913 2,663925 2,763491\n",
       "A4GALT  3,398661 3,195717 3,277447 4,407162 3,008970"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
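   ],
   "source": [
    "# rename columns of our expression table from assay names to cell line names\n",
    "cl_names <- read.csv(\"GDSC/E-MTAB-3610.annotation.tsv\", header = TRUE, sep = \"\\t\")\n",
    "# replace \"-\" with \".\" so the names can be matched against colnames(GEX)\n",
    "cl_names$Characteristics.cell.line. <- gsub(\"-\", \".\", cl_names$Characteristics.cell.line.)\n",
    "# drop every cell line whose name occurs more than once in the annotation\n",
    "dup_names <- cl_names$Characteristics.cell.line.[duplicated(cl_names$Characteristics.cell.line.)]\n",
    "cl_names <- cl_names[!cl_names$Characteristics.cell.line. %in% dup_names,]\n",
    "# number of remaining cell lines that are also present in GEX\n",
    "length(intersect(cl_names$Characteristics.cell.line.,colnames(GEX)))\n",
    "# index the annotation by assay name and keep only assays present in our table\n",
    "row.names(cl_names) <- cl_names$Assay.Name\n",
    "cl_names <- cl_names[colnames(exprs_GDSC_RMA_BrainArray_SYMBOL),]\n",
    "cl_names <- cl_names[!is.na(cl_names$Assay.Name),]\n",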
    "\n",
    "exprs_GDSC_RMA_BrainArray_SYMBOL_ <- exprs_GDSC_RMA_BrainArray_SYMBOL[,cl_names$Assay.Name]\n",
    "colnames(exprs_GDSC_RMA_BrainArray_SYMBOL_) <- cl_names$Characteristics.cell.line.\n",
    "head(exprs_GDSC_RMA_BrainArray_SYMBOL_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'our data    18645'</span>"
      ],
      "text/latex": [
       "'our data    18645'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'our data    18645'</span>"
      ],
      "text/plain": [
       "[1] \"our data    18645\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'GEX    17419'</span>"
      ],
      "text/latex": [
       "'GEX    17419'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'GEX    17419'</span>"
      ],
      "text/plain": [
       "[1] \"GEX    17419\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'Intersection: Shared Genes    16639    Shared samples    934'</span>"
      ],
      "text/latex": [
       "'Intersection: Shared Genes    16639    Shared samples    934'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'Intersection: Shared Genes    16639    Shared samples    934'</span>"
      ],
      "text/plain": [
       "[1] \"Intersection: Shared Genes    16639    Shared samples    934\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "paste(\"our data\",dim(exprs_GDSC_RMA_BrainArray_SYMBOL_)[1], sep =\"    \")\n",
    "paste(\"GEX\",dim(GEX)[1], sep =\"    \")\n",
    "shared_genes <- intersect(row.names(GEX),row.names(exprs_GDSC_RMA_BrainArray_SYMBOL_))\n",
    "shared_samples <- intersect(colnames(GEX),colnames(exprs_GDSC_RMA_BrainArray_SYMBOL_))\n",
    "paste(\"Intersection: Shared Genes\",length(shared_genes),\"Shared samples\",length(shared_samples),sep=\"    \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "scrolled": true
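   },
   "outputs": [],
   "source": [
    "# per-gene correlation across shared samples (diagonal of the gene-by-gene cross-correlation matrix)\n",
    "gene_corrs <- diag(cor(t(GEX[shared_genes,shared_samples]), t(exprs_GDSC_RMA_BrainArray_SYMBOL_[shared_genes,shared_samples])))\n",
    "# per-sample correlation across shared genes\n",
    "sample_corrs <- diag(cor(GEX[shared_genes,shared_samples], exprs_GDSC_RMA_BrainArray_SYMBOL_[shared_genes,shared_samples]))"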
   ]
  },
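  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above takes the diagonal of a full cross-correlation matrix, so for the per-gene case `cor()` builds a genes x genes matrix only to keep its diagonal. A minimal sketch of a lighter alternative follows, assuming both tables can be coerced to numeric matrices with the same rows and columns in the same order and contain no missing values. `rowwise_cor` is a hypothetical helper name introduced here for illustration; it is not part of this notebook's original code or of any package.\n",
    "\n",
    "```r\n",
    "# Pearson correlation between matching rows of two equally shaped matrices.\n",
    "# Assumption: a and b are numeric matrices without NA values.\n",
    "rowwise_cor <- function(a, b) {\n",
    "  stopifnot(identical(dim(a), dim(b)))\n",
    "  a <- a - rowMeans(a)  # centre every row of a\n",
    "  b <- b - rowMeans(b)  # centre every row of b\n",
    "  rowSums(a * b) / sqrt(rowSums(a^2) * rowSums(b^2))\n",
    "}\n",
    "\n",
    "# toy comparison with the diag(cor(...)) formulation used in this notebook\n",
    "set.seed(1)\n",
    "x <- matrix(rnorm(20), nrow = 4)\n",
    "y <- matrix(rnorm(20), nrow = 4)\n",
    "all.equal(rowwise_cor(x, y), diag(cor(t(x), t(y))))\n",
    "```\n",
    "\n",
    "Under those assumptions, `rowwise_cor(as.matrix(GEX[shared_genes, shared_samples]), as.matrix(exprs_GDSC_RMA_BrainArray_SYMBOL_[shared_genes, shared_samples]))` should agree with `gene_corrs`, and transposing both inputs with `t()` should agree with `sample_corrs`."
   ]
  },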
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Result:\n",
    "* Correlation between samples is uniformly high: 0.977-0.981, 0.98 on average;\n",
    "* Gene profile correlation is also high for most genes (0.99 on average), but very poor for some: the minimum is -0.05"
   ]
  },
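  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `diag(cor(...))` form used above builds the full gene-by-gene cross-correlation matrix (16639 x 16639 values for the shared genes, roughly 2.2 GB as doubles) only to keep its diagonal. A minimal sketch of a lighter alternative, assuming two numeric matrices with identical dimensions and matching row order; `rowwise_cor` and the toy matrices `a` and `b` are hypothetical names, not part of the original analysis:\n",
    "\n",
    "```r\n",
    "# Toy stand-ins for two expression matrices (genes x samples); hypothetical data.\n",
    "set.seed(1)\n",
    "a <- matrix(rnorm(5 * 8), nrow = 5,\n",
    "            dimnames = list(paste0(\"g\", 1:5), paste0(\"s\", 1:8)))\n",
    "b <- a + matrix(rnorm(5 * 8, sd = 0.1), nrow = 5)\n",
    "\n",
    "# Pearson correlation of matching rows, without the full cross-correlation matrix.\n",
    "rowwise_cor <- function(x, y) {\n",
    "    stopifnot(identical(dim(x), dim(y)))\n",
    "    xc <- x - rowMeans(x)   # centre each row\n",
    "    yc <- y - rowMeans(y)\n",
    "    rowSums(xc * yc) / sqrt(rowSums(xc^2) * rowSums(yc^2))\n",
    "}\n",
    "\n",
    "# Compare with the diag(cor(...)) form; the two should agree up to floating-point error.\n",
    "all.equal(unname(rowwise_cor(a, b)), unname(diag(cor(t(a), t(b)))))\n",
    "```\n",
    "\n",
    "Under the same assumptions, `rowwise_cor(t(a), t(b))` would give the per-sample correlations."
   ]
  },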
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'Corr. between genes (mean, max, min)    0,987574861022879    0,999999501930154    -0,0466895425476415'</span>"
      ],
      "text/latex": [
       "'Corr. between genes (mean, max, min)    0,987574861022879    0,999999501930154    -0,0466895425476415'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'Corr. between genes (mean, max, min)    0,987574861022879    0,999999501930154    -0,0466895425476415'</span>"
      ],
      "text/plain": [
       "[1] \"Corr. between genes (mean, max, min)    0,987574861022879    0,999999501930154    -0,0466895425476415\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'Corr. between samples (mean, max, min)    0,97923761177824    0,980946265835985    0,977003232824752'</span>"
      ],
      "text/latex": [
       "'Corr. between samples (mean, max, min)    0,97923761177824    0,980946265835985    0,977003232824752'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'Corr. between samples (mean, max, min)    0,97923761177824    0,980946265835985    0,977003232824752'</span>"
      ],
      "text/plain": [
       "[1] \"Corr. between samples (mean, max, min)    0,97923761177824    0,980946265835985    0,977003232824752\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# mean, max and min of the per-gene correlations\n",
    "paste(\"Corr. between genes (mean, max, min)\",mean(gene_corrs),max(gene_corrs),min(gene_corrs), sep =\"    \")\n",
    "# mean, max and min of the per-sample correlations\n",
    "paste(\"Corr. between samples (mean, max, min)\",mean(sample_corrs),max(sample_corrs),min(sample_corrs), sep =\"    \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<dl class=dl-horizontal>\n",
       "\t<dt>NDUFA13</dt>\n",
       "\t\t<dd>-0,0466895425476415</dd>\n",
       "\t<dt>AGRN</dt>\n",
       "\t\t<dd>-0,0357717955240473</dd>\n",
       "\t<dt>CCDC122</dt>\n",
       "\t\t<dd>-0,00548321411413838</dd>\n",
       "\t<dt>AKTIP</dt>\n",
       "\t\t<dd>0,0184622017401918</dd>\n",
       "\t<dt>TET3</dt>\n",
       "\t\t<dd>0,0814979327351481</dd>\n",
       "\t<dt>EXOSC7</dt>\n",
       "\t\t<dd>0,0932529618544275</dd>\n",
       "\t<dt>STARD10</dt>\n",
       "\t\t<dd>0,118468067661345</dd>\n",
       "\t<dt>NAA38</dt>\n",
       "\t\t<dd>0,120170347929768</dd>\n",
       "\t<dt>EFR3B</dt>\n",
       "\t\t<dd>0,12613722046339</dd>\n",
       "\t<dt>SPIN3</dt>\n",
       "\t\t<dd>0,150719264107254</dd>\n",
       "\t<dt>SMCR8</dt>\n",
       "\t\t<dd>0,1582612418071</dd>\n",
       "\t<dt>HS6ST3</dt>\n",
       "\t\t<dd>0,196708977687097</dd>\n",
       "\t<dt>OFCC1</dt>\n",
       "\t\t<dd>0,213518347005132</dd>\n",
       "\t<dt>RPL10</dt>\n",
       "\t\t<dd>0,245640291331545</dd>\n",
       "\t<dt>DLEU2</dt>\n",
       "\t\t<dd>0,25076045963443</dd>\n",
       "\t<dt>KLHL7</dt>\n",
       "\t\t<dd>0,259364034605027</dd>\n",
       "\t<dt>PPM1E</dt>\n",
       "\t\t<dd>0,264811047564632</dd>\n",
       "\t<dt>MKL2</dt>\n",
       "\t\t<dd>0,268757227041272</dd>\n",
       "\t<dt>DISC1</dt>\n",
       "\t\t<dd>0,290808335661287</dd>\n",
       "\t<dt>C3orf30</dt>\n",
       "\t\t<dd>0,303613399674166</dd>\n",
       "</dl>\n"
      ],
      "text/latex": [
       "\\begin{description*}\n",
       "\\item[NDUFA13] -0,0466895425476415\n",
       "\\item[AGRN] -0,0357717955240473\n",
       "\\item[CCDC122] -0,00548321411413838\n",
       "\\item[AKTIP] 0,0184622017401918\n",
       "\\item[TET3] 0,0814979327351481\n",
       "\\item[EXOSC7] 0,0932529618544275\n",
       "\\item[STARD10] 0,118468067661345\n",
       "\\item[NAA38] 0,120170347929768\n",
       "\\item[EFR3B] 0,12613722046339\n",
       "\\item[SPIN3] 0,150719264107254\n",
       "\\item[SMCR8] 0,1582612418071\n",
       "\\item[HS6ST3] 0,196708977687097\n",
       "\\item[OFCC1] 0,213518347005132\n",
       "\\item[RPL10] 0,245640291331545\n",
       "\\item[DLEU2] 0,25076045963443\n",
       "\\item[KLHL7] 0,259364034605027\n",
       "\\item[PPM1E] 0,264811047564632\n",
       "\\item[MKL2] 0,268757227041272\n",
       "\\item[DISC1] 0,290808335661287\n",
       "\\item[C3orf30] 0,303613399674166\n",
       "\\end{description*}\n"
      ],
      "text/markdown": [
       "NDUFA13\n",
       ":   -0,0466895425476415AGRN\n",
       ":   -0,0357717955240473CCDC122\n",
       ":   -0,00548321411413838AKTIP\n",
       ":   0,0184622017401918TET3\n",
       ":   0,0814979327351481EXOSC7\n",
       ":   0,0932529618544275STARD10\n",
       ":   0,118468067661345NAA38\n",
       ":   0,120170347929768EFR3B\n",
       ":   0,12613722046339SPIN3\n",
       ":   0,150719264107254SMCR8\n",
       ":   0,1582612418071HS6ST3\n",
       ":   0,196708977687097OFCC1\n",
       ":   0,213518347005132RPL10\n",
       ":   0,245640291331545DLEU2\n",
       ":   0,25076045963443KLHL7\n",
       ":   0,259364034605027PPM1E\n",
       ":   0,264811047564632MKL2\n",
       ":   0,268757227041272DISC1\n",
       ":   0,290808335661287C3orf30\n",
       ":   0,303613399674166\n",
       "\n"
      ],
      "text/plain": [
       "     NDUFA13         AGRN      CCDC122        AKTIP         TET3       EXOSC7 \n",
       "-0,046689543 -0,035771796 -0,005483214  0,018462202  0,081497933  0,093252962 \n",
       "     STARD10        NAA38        EFR3B        SPIN3        SMCR8       HS6ST3 \n",
       " 0,118468068  0,120170348  0,126137220  0,150719264  0,158261242  0,196708978 \n",
       "       OFCC1        RPL10        DLEU2        KLHL7        PPM1E         MKL2 \n",
       " 0,213518347  0,245640291  0,250760460  0,259364035  0,264811048  0,268757227 \n",
       "       DISC1      C3orf30 \n",
       " 0,290808336  0,303613400 "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# the 20 genes with the lowest correlation\n",
    "head(sort(gene_corrs,decreasing = FALSE),20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "587"
      ],
      "text/latex": [
       "587"
      ],
      "text/markdown": [
       "587"
      ],
      "text/plain": [
       "[1] 587"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# number of genes with correlation below 0.9\n",
    "sum(gene_corrs < 0.9)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Comparison of expression values provided by Lukas Folkman with our results\n",
    "\n",
    "\n",
    "### GSE6434\n",
    "Correlation of sample profiles for GSE6434"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading objects:\n",
      "  doceVivoNorm_syms\n",
      "  groupMembership\n"
     ]
    }
   ],
   "source": [
    "load(\"Geeleher_with_GDSCr6/data/doce_rma_syms_brainArray.RData\",verbose = TRUE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>8399</li>\n",
       "\t<li>24</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 8399\n",
       "\\item 24\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 8399\n",
       "2. 24\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 8399   24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>GSM4903</th><th scope=col>GSM4907</th><th scope=col>GSM4908</th><th scope=col>GSM4914</th><th scope=col>GSM4915</th><th scope=col>GSM4917</th><th scope=col>GSM4919</th><th scope=col>GSM4920</th><th scope=col>GSM4921</th><th scope=col>GSM4923</th><th scope=col>⋯</th><th scope=col>GSM4906</th><th scope=col>GSM4909</th><th scope=col>GSM4910</th><th scope=col>GSM4911</th><th scope=col>GSM4912</th><th scope=col>GSM4913</th><th scope=col>GSM4916</th><th scope=col>GSM4918</th><th scope=col>GSM4922</th><th scope=col>GSM4924</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>AKT3</th><td>4,471063</td><td>4,624944</td><td>4,692121</td><td>4,466994</td><td>5,754121</td><td>4,554908</td><td>5,429423</td><td>4,312592</td><td>4,582778</td><td>4,528860</td><td>⋯       </td><td>4,815075</td><td>4,543328</td><td>4,204645</td><td>4,361776</td><td>5,068274</td><td>4,937692</td><td>4,759488</td><td>4,357240</td><td>4,177261</td><td>4,285073</td></tr>\n",
       "\t<tr><th scope=row>MED6</th><td>3,839062</td><td>3,584472</td><td>3,326740</td><td>3,535239</td><td>3,742069</td><td>3,774019</td><td>3,677544</td><td>3,898045</td><td>3,855050</td><td>3,555893</td><td>⋯       </td><td>3,364237</td><td>3,667986</td><td>3,521200</td><td>3,502689</td><td>3,351606</td><td>3,525377</td><td>3,429070</td><td>3,956674</td><td>4,073611</td><td>3,682697</td></tr>\n",
       "\t<tr><th scope=row>NR2E3</th><td>7,290057</td><td>7,399590</td><td>7,468425</td><td>7,677972</td><td>7,797163</td><td>8,160049</td><td>7,042885</td><td>7,989204</td><td>7,698802</td><td>7,518574</td><td>⋯       </td><td>7,511533</td><td>8,709372</td><td>8,222946</td><td>7,912787</td><td>7,671314</td><td>7,479153</td><td>8,212303</td><td>8,233507</td><td>7,851094</td><td>8,244293</td></tr>\n",
       "\t<tr><th scope=row>NAALAD2</th><td>3,242835</td><td>3,710604</td><td>3,535582</td><td>4,681849</td><td>4,179551</td><td>3,944334</td><td>3,007184</td><td>4,539145</td><td>3,058529</td><td>4,224176</td><td>⋯       </td><td>4,132053</td><td>6,250989</td><td>4,565469</td><td>4,741568</td><td>3,704555</td><td>3,302500</td><td>4,680904</td><td>5,732379</td><td>4,786829</td><td>4,913436</td></tr>\n",
       "\t<tr><th scope=row>NAALADL1</th><td>4,458982</td><td>4,539672</td><td>4,233429</td><td>4,309195</td><td>4,778892</td><td>4,536552</td><td>4,131638</td><td>4,760834</td><td>4,371971</td><td>4,365722</td><td>⋯       </td><td>4,455732</td><td>5,010709</td><td>4,587314</td><td>4,458041</td><td>4,298968</td><td>4,551020</td><td>4,380551</td><td>4,852454</td><td>4,650280</td><td>4,854401</td></tr>\n",
       "\t<tr><th scope=row>ACOT8</th><td>5,100625</td><td>5,307493</td><td>5,405129</td><td>5,246431</td><td>5,714727</td><td>5,812253</td><td>5,115640</td><td>5,806674</td><td>5,922718</td><td>5,290012</td><td>⋯       </td><td>5,636612</td><td>6,111452</td><td>5,990468</td><td>5,492492</td><td>5,349489</td><td>5,440780</td><td>5,528125</td><td>5,955764</td><td>5,688114</td><td>5,585995</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllll}\n",
       "  & GSM4903 & GSM4907 & GSM4908 & GSM4914 & GSM4915 & GSM4917 & GSM4919 & GSM4920 & GSM4921 & GSM4923 & ⋯ & GSM4906 & GSM4909 & GSM4910 & GSM4911 & GSM4912 & GSM4913 & GSM4916 & GSM4918 & GSM4922 & GSM4924\\\\\n",
       "\\hline\n",
       "\tAKT3 & 4,471063 & 4,624944 & 4,692121 & 4,466994 & 5,754121 & 4,554908 & 5,429423 & 4,312592 & 4,582778 & 4,528860 & ⋯        & 4,815075 & 4,543328 & 4,204645 & 4,361776 & 5,068274 & 4,937692 & 4,759488 & 4,357240 & 4,177261 & 4,285073\\\\\n",
       "\tMED6 & 3,839062 & 3,584472 & 3,326740 & 3,535239 & 3,742069 & 3,774019 & 3,677544 & 3,898045 & 3,855050 & 3,555893 & ⋯        & 3,364237 & 3,667986 & 3,521200 & 3,502689 & 3,351606 & 3,525377 & 3,429070 & 3,956674 & 4,073611 & 3,682697\\\\\n",
       "\tNR2E3 & 7,290057 & 7,399590 & 7,468425 & 7,677972 & 7,797163 & 8,160049 & 7,042885 & 7,989204 & 7,698802 & 7,518574 & ⋯        & 7,511533 & 8,709372 & 8,222946 & 7,912787 & 7,671314 & 7,479153 & 8,212303 & 8,233507 & 7,851094 & 8,244293\\\\\n",
       "\tNAALAD2 & 3,242835 & 3,710604 & 3,535582 & 4,681849 & 4,179551 & 3,944334 & 3,007184 & 4,539145 & 3,058529 & 4,224176 & ⋯        & 4,132053 & 6,250989 & 4,565469 & 4,741568 & 3,704555 & 3,302500 & 4,680904 & 5,732379 & 4,786829 & 4,913436\\\\\n",
       "\tNAALADL1 & 4,458982 & 4,539672 & 4,233429 & 4,309195 & 4,778892 & 4,536552 & 4,131638 & 4,760834 & 4,371971 & 4,365722 & ⋯        & 4,455732 & 5,010709 & 4,587314 & 4,458041 & 4,298968 & 4,551020 & 4,380551 & 4,852454 & 4,650280 & 4,854401\\\\\n",
       "\tACOT8 & 5,100625 & 5,307493 & 5,405129 & 5,246431 & 5,714727 & 5,812253 & 5,115640 & 5,806674 & 5,922718 & 5,290012 & ⋯        & 5,636612 & 6,111452 & 5,990468 & 5,492492 & 5,349489 & 5,440780 & 5,528125 & 5,955764 & 5,688114 & 5,585995\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | GSM4903 | GSM4907 | GSM4908 | GSM4914 | GSM4915 | GSM4917 | GSM4919 | GSM4920 | GSM4921 | GSM4923 | ⋯ | GSM4906 | GSM4909 | GSM4910 | GSM4911 | GSM4912 | GSM4913 | GSM4916 | GSM4918 | GSM4922 | GSM4924 | \n",
       "|---|---|---|---|---|---|\n",
       "| AKT3 | 4,471063 | 4,624944 | 4,692121 | 4,466994 | 5,754121 | 4,554908 | 5,429423 | 4,312592 | 4,582778 | 4,528860 | ⋯        | 4,815075 | 4,543328 | 4,204645 | 4,361776 | 5,068274 | 4,937692 | 4,759488 | 4,357240 | 4,177261 | 4,285073 | \n",
       "| MED6 | 3,839062 | 3,584472 | 3,326740 | 3,535239 | 3,742069 | 3,774019 | 3,677544 | 3,898045 | 3,855050 | 3,555893 | ⋯        | 3,364237 | 3,667986 | 3,521200 | 3,502689 | 3,351606 | 3,525377 | 3,429070 | 3,956674 | 4,073611 | 3,682697 | \n",
       "| NR2E3 | 7,290057 | 7,399590 | 7,468425 | 7,677972 | 7,797163 | 8,160049 | 7,042885 | 7,989204 | 7,698802 | 7,518574 | ⋯        | 7,511533 | 8,709372 | 8,222946 | 7,912787 | 7,671314 | 7,479153 | 8,212303 | 8,233507 | 7,851094 | 8,244293 | \n",
       "| NAALAD2 | 3,242835 | 3,710604 | 3,535582 | 4,681849 | 4,179551 | 3,944334 | 3,007184 | 4,539145 | 3,058529 | 4,224176 | ⋯        | 4,132053 | 6,250989 | 4,565469 | 4,741568 | 3,704555 | 3,302500 | 4,680904 | 5,732379 | 4,786829 | 4,913436 | \n",
       "| NAALADL1 | 4,458982 | 4,539672 | 4,233429 | 4,309195 | 4,778892 | 4,536552 | 4,131638 | 4,760834 | 4,371971 | 4,365722 | ⋯        | 4,455732 | 5,010709 | 4,587314 | 4,458041 | 4,298968 | 4,551020 | 4,380551 | 4,852454 | 4,650280 | 4,854401 | \n",
       "| ACOT8 | 5,100625 | 5,307493 | 5,405129 | 5,246431 | 5,714727 | 5,812253 | 5,115640 | 5,806674 | 5,922718 | 5,290012 | ⋯        | 5,636612 | 6,111452 | 5,990468 | 5,492492 | 5,349489 | 5,440780 | 5,528125 | 5,955764 | 5,688114 | 5,585995 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "         GSM4903  GSM4907  GSM4908  GSM4914  GSM4915  GSM4917  GSM4919 \n",
       "AKT3     4,471063 4,624944 4,692121 4,466994 5,754121 4,554908 5,429423\n",
       "MED6     3,839062 3,584472 3,326740 3,535239 3,742069 3,774019 3,677544\n",
       "NR2E3    7,290057 7,399590 7,468425 7,677972 7,797163 8,160049 7,042885\n",
       "NAALAD2  3,242835 3,710604 3,535582 4,681849 4,179551 3,944334 3,007184\n",
       "NAALADL1 4,458982 4,539672 4,233429 4,309195 4,778892 4,536552 4,131638\n",
       "ACOT8    5,100625 5,307493 5,405129 5,246431 5,714727 5,812253 5,115640\n",
       "         GSM4920  GSM4921  GSM4923  ⋯ GSM4906  GSM4909  GSM4910  GSM4911 \n",
       "AKT3     4,312592 4,582778 4,528860 ⋯ 4,815075 4,543328 4,204645 4,361776\n",
       "MED6     3,898045 3,855050 3,555893 ⋯ 3,364237 3,667986 3,521200 3,502689\n",
       "NR2E3    7,989204 7,698802 7,518574 ⋯ 7,511533 8,709372 8,222946 7,912787\n",
       "NAALAD2  4,539145 3,058529 4,224176 ⋯ 4,132053 6,250989 4,565469 4,741568\n",
       "NAALADL1 4,760834 4,371971 4,365722 ⋯ 4,455732 5,010709 4,587314 4,458041\n",
       "ACOT8    5,806674 5,922718 5,290012 ⋯ 5,636612 6,111452 5,990468 5,492492\n",
       "         GSM4912  GSM4913  GSM4916  GSM4918  GSM4922  GSM4924 \n",
       "AKT3     5,068274 4,937692 4,759488 4,357240 4,177261 4,285073\n",
       "MED6     3,351606 3,525377 3,429070 3,956674 4,073611 3,682697\n",
       "NR2E3    7,671314 7,479153 8,212303 8,233507 7,851094 8,244293\n",
       "NAALAD2  3,704555 3,302500 4,680904 5,732379 4,786829 4,913436\n",
       "NAALADL1 4,298968 4,551020 4,380551 4,852454 4,650280 4,854401\n",
       "ACOT8    5,349489 5,440780 5,528125 5,955764 5,688114 5,585995"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dim(doceVivoNorm_syms)\n",
    "# strip the literal \".CEL\" suffix from the sample names\n",
    "colnames(doceVivoNorm_syms) <- gsub(\".CEL\",\"\",colnames(doceVivoNorm_syms), fixed = TRUE)\n",
    "head(doceVivoNorm_syms)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "24"
      ],
      "text/latex": [
       "24"
      ],
      "text/markdown": [
       "24"
      ],
      "text/plain": [
       "[1] 24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "24"
      ],
      "text/latex": [
       "24"
      ],
      "text/markdown": [
       "24"
      ],
      "text/plain": [
       "[1] 24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "'hgu95av2cdf'"
      ],
      "text/latex": [
       "'hgu95av2cdf'"
      ],
      "text/markdown": [
       "'hgu95av2cdf'"
      ],
      "text/plain": [
       "[1] \"hgu95av2cdf\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "cel_file_dir <- \"arrays/GSE6434/\"\n",
    "# derive the sample name by stripping the literal file extension\n",
    "getSnames <- function(sname,ext=\".CEL.gz\"){\n",
    "    sname <- sub(ext, \"\", sname, fixed = TRUE)\n",
    "    return(sname)\n",
    "}\n",
    "fnames <- list.files(cel_file_dir, pattern = \"\\\\.CEL\\\\.gz$\")\n",
    "snames <- unlist(lapply(fnames, getSnames))\n",
    "length(snames)\n",
    "length(fnames)\n",
    "\n",
    "\n",
    "cdf_basenames <- unique(unlist(lapply(fnames,getCDFname, dir=cel_file_dir)))\n",
    "cdf_basenames \n",
    "\n",
    "cdf_basename <- sub(\"cdf\",\"\",cdf_basenames[1])\n",
    "\n",
    "# download, install and load the BrainArray custom CDF and annotation packages\n",
    "customAnno <- installBrainArrayCDFandDB(paste0(cdf_basename,\"hs\"),version='22.0.0',gene_id_type=\"entrezg\",\n",
    "                                        download_dir=\"/home/olya/SFU/Hossein/arrays/\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "'hgu95av2hsentrezgcdf'"
      ],
      "text/latex": [
       "'hgu95av2hsentrezgcdf'"
      ],
      "text/markdown": [
       "'hgu95av2hsentrezgcdf'"
      ],
      "text/plain": [
       "[1] \"hgu95av2hsentrezgcdf\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>8519</li>\n",
       "\t<li>24</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 8519\n",
       "\\item 24\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 8519\n",
       "2. 24\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 8519   24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "rma_doce_BrainArray <- affy::justRMA(celfile.path=cel_file_dir,sampleNames=snames,\n",
    "                     verbose=FALSE,compress=FALSE,cdfname=customAnno$customCDFname)\n",
    "annotation(rma_doce_BrainArray)\n",
    "dim(exprs(rma_doce_BrainArray))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>GSM4901</th><th scope=col>GSM4902</th><th scope=col>GSM4903</th><th scope=col>GSM4904</th><th scope=col>GSM4905</th><th scope=col>GSM4906</th><th scope=col>GSM4907</th><th scope=col>GSM4908</th><th scope=col>GSM4909</th><th scope=col>GSM4910</th><th scope=col>⋯</th><th scope=col>GSM4915</th><th scope=col>GSM4916</th><th scope=col>GSM4917</th><th scope=col>GSM4918</th><th scope=col>GSM4919</th><th scope=col>GSM4920</th><th scope=col>GSM4921</th><th scope=col>GSM4922</th><th scope=col>GSM4923</th><th scope=col>GSM4924</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>AADAC</th><td>4,246572 </td><td>3,988715 </td><td>3,387175 </td><td>3,692939 </td><td>4,710232 </td><td> 3,788563</td><td> 3,564630</td><td>4,843054 </td><td>4,567442 </td><td>4,268574 </td><td>⋯        </td><td>3,992699 </td><td>3,724247 </td><td>3,401156 </td><td>4,314926 </td><td>5,820770 </td><td>3,928832 </td><td>3,613545 </td><td>4,031425 </td><td>4,874323 </td><td>4,207930 </td></tr>\n",
       "\t<tr><th scope=row>AAK1</th><td>3,503150 </td><td>3,481343 </td><td>3,267147 </td><td>3,799788 </td><td>3,946391 </td><td> 3,486923</td><td> 3,310952</td><td>3,370090 </td><td>4,033090 </td><td>3,521773 </td><td>⋯        </td><td>3,365616 </td><td>3,448040 </td><td>3,277319 </td><td>3,483527 </td><td>3,256345 </td><td>3,302839 </td><td>3,270803 </td><td>3,507228 </td><td>3,485034 </td><td>3,427167 </td></tr>\n",
       "\t<tr><th scope=row>AAMP</th><td>6,542812 </td><td>6,433155 </td><td>7,112246 </td><td>6,726937 </td><td>7,048852 </td><td> 6,697682</td><td> 7,044337</td><td>6,983283 </td><td>6,968767 </td><td>6,134034 </td><td>⋯        </td><td>7,542823 </td><td>6,242356 </td><td>7,463244 </td><td>7,561346 </td><td>6,944381 </td><td>7,543045 </td><td>7,032896 </td><td>6,401135 </td><td>6,753362 </td><td>6,682369 </td></tr>\n",
       "\t<tr><th scope=row>AANAT</th><td>8,262640 </td><td>7,809070 </td><td>7,429048 </td><td>7,881925 </td><td>8,636162 </td><td> 7,612333</td><td> 7,747587</td><td>7,630458 </td><td>8,497340 </td><td>7,720368 </td><td>⋯        </td><td>7,899382 </td><td>7,625594 </td><td>7,761377 </td><td>7,922388 </td><td>7,437147 </td><td>7,905339 </td><td>7,712204 </td><td>7,772773 </td><td>7,795030 </td><td>7,620933 </td></tr>\n",
       "\t<tr><th scope=row>AARS</th><td>8,715532 </td><td>9,549540 </td><td>8,925394 </td><td>9,426015 </td><td>8,139263 </td><td>10,383877</td><td>10,094132</td><td>8,438800 </td><td>8,857150 </td><td>8,694890 </td><td>⋯        </td><td>9,162233 </td><td>9,401812 </td><td>9,758292 </td><td>8,780993 </td><td>9,141654 </td><td>9,511462 </td><td>9,582736 </td><td>8,506947 </td><td>8,874166 </td><td>8,419377 </td></tr>\n",
       "\t<tr><th scope=row>AASDHPPT</th><td>7,313638 </td><td>7,208273 </td><td>8,728579 </td><td>7,394029 </td><td>6,499497 </td><td> 7,896952</td><td> 8,509394</td><td>7,753057 </td><td>6,754180 </td><td>9,084544 </td><td>⋯        </td><td>7,292513 </td><td>8,133136 </td><td>7,531076 </td><td>7,058344 </td><td>7,443340 </td><td>7,703600 </td><td>7,198549 </td><td>8,290108 </td><td>8,486498 </td><td>7,723431 </td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllll}\n",
       "  & GSM4901 & GSM4902 & GSM4903 & GSM4904 & GSM4905 & GSM4906 & GSM4907 & GSM4908 & GSM4909 & GSM4910 & ⋯ & GSM4915 & GSM4916 & GSM4917 & GSM4918 & GSM4919 & GSM4920 & GSM4921 & GSM4922 & GSM4923 & GSM4924\\\\\n",
       "\\hline\n",
       "\tAADAC & 4,246572  & 3,988715  & 3,387175  & 3,692939  & 4,710232  &  3,788563 &  3,564630 & 4,843054  & 4,567442  & 4,268574  & ⋯         & 3,992699  & 3,724247  & 3,401156  & 4,314926  & 5,820770  & 3,928832  & 3,613545  & 4,031425  & 4,874323  & 4,207930 \\\\\n",
       "\tAAK1 & 3,503150  & 3,481343  & 3,267147  & 3,799788  & 3,946391  &  3,486923 &  3,310952 & 3,370090  & 4,033090  & 3,521773  & ⋯         & 3,365616  & 3,448040  & 3,277319  & 3,483527  & 3,256345  & 3,302839  & 3,270803  & 3,507228  & 3,485034  & 3,427167 \\\\\n",
       "\tAAMP & 6,542812  & 6,433155  & 7,112246  & 6,726937  & 7,048852  &  6,697682 &  7,044337 & 6,983283  & 6,968767  & 6,134034  & ⋯         & 7,542823  & 6,242356  & 7,463244  & 7,561346  & 6,944381  & 7,543045  & 7,032896  & 6,401135  & 6,753362  & 6,682369 \\\\\n",
       "\tAANAT & 8,262640  & 7,809070  & 7,429048  & 7,881925  & 8,636162  &  7,612333 &  7,747587 & 7,630458  & 8,497340  & 7,720368  & ⋯         & 7,899382  & 7,625594  & 7,761377  & 7,922388  & 7,437147  & 7,905339  & 7,712204  & 7,772773  & 7,795030  & 7,620933 \\\\\n",
       "\tAARS & 8,715532  & 9,549540  & 8,925394  & 9,426015  & 8,139263  & 10,383877 & 10,094132 & 8,438800  & 8,857150  & 8,694890  & ⋯         & 9,162233  & 9,401812  & 9,758292  & 8,780993  & 9,141654  & 9,511462  & 9,582736  & 8,506947  & 8,874166  & 8,419377 \\\\\n",
       "\tAASDHPPT & 7,313638  & 7,208273  & 8,728579  & 7,394029  & 6,499497  &  7,896952 &  8,509394 & 7,753057  & 6,754180  & 9,084544  & ⋯         & 7,292513  & 8,133136  & 7,531076  & 7,058344  & 7,443340  & 7,703600  & 7,198549  & 8,290108  & 8,486498  & 7,723431 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | GSM4901 | GSM4902 | GSM4903 | GSM4904 | GSM4905 | GSM4906 | GSM4907 | GSM4908 | GSM4909 | GSM4910 | ⋯ | GSM4915 | GSM4916 | GSM4917 | GSM4918 | GSM4919 | GSM4920 | GSM4921 | GSM4922 | GSM4923 | GSM4924 | \n",
       "|---|---|---|---|---|---|\n",
       "| AADAC | 4,246572  | 3,988715  | 3,387175  | 3,692939  | 4,710232  |  3,788563 |  3,564630 | 4,843054  | 4,567442  | 4,268574  | ⋯         | 3,992699  | 3,724247  | 3,401156  | 4,314926  | 5,820770  | 3,928832  | 3,613545  | 4,031425  | 4,874323  | 4,207930  | \n",
       "| AAK1 | 3,503150  | 3,481343  | 3,267147  | 3,799788  | 3,946391  |  3,486923 |  3,310952 | 3,370090  | 4,033090  | 3,521773  | ⋯         | 3,365616  | 3,448040  | 3,277319  | 3,483527  | 3,256345  | 3,302839  | 3,270803  | 3,507228  | 3,485034  | 3,427167  | \n",
       "| AAMP | 6,542812  | 6,433155  | 7,112246  | 6,726937  | 7,048852  |  6,697682 |  7,044337 | 6,983283  | 6,968767  | 6,134034  | ⋯         | 7,542823  | 6,242356  | 7,463244  | 7,561346  | 6,944381  | 7,543045  | 7,032896  | 6,401135  | 6,753362  | 6,682369  | \n",
       "| AANAT | 8,262640  | 7,809070  | 7,429048  | 7,881925  | 8,636162  |  7,612333 |  7,747587 | 7,630458  | 8,497340  | 7,720368  | ⋯         | 7,899382  | 7,625594  | 7,761377  | 7,922388  | 7,437147  | 7,905339  | 7,712204  | 7,772773  | 7,795030  | 7,620933  | \n",
       "| AARS | 8,715532  | 9,549540  | 8,925394  | 9,426015  | 8,139263  | 10,383877 | 10,094132 | 8,438800  | 8,857150  | 8,694890  | ⋯         | 9,162233  | 9,401812  | 9,758292  | 8,780993  | 9,141654  | 9,511462  | 9,582736  | 8,506947  | 8,874166  | 8,419377  | \n",
       "| AASDHPPT | 7,313638  | 7,208273  | 8,728579  | 7,394029  | 6,499497  |  7,896952 |  8,509394 | 7,753057  | 6,754180  | 9,084544  | ⋯         | 7,292513  | 8,133136  | 7,531076  | 7,058344  | 7,443340  | 7,703600  | 7,198549  | 8,290108  | 8,486498  | 7,723431  | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "         GSM4901  GSM4902  GSM4903  GSM4904  GSM4905  GSM4906   GSM4907  \n",
       "AADAC    4,246572 3,988715 3,387175 3,692939 4,710232  3,788563  3,564630\n",
       "AAK1     3,503150 3,481343 3,267147 3,799788 3,946391  3,486923  3,310952\n",
       "AAMP     6,542812 6,433155 7,112246 6,726937 7,048852  6,697682  7,044337\n",
       "AANAT    8,262640 7,809070 7,429048 7,881925 8,636162  7,612333  7,747587\n",
       "AARS     8,715532 9,549540 8,925394 9,426015 8,139263 10,383877 10,094132\n",
       "AASDHPPT 7,313638 7,208273 8,728579 7,394029 6,499497  7,896952  8,509394\n",
       "         GSM4908  GSM4909  GSM4910  ⋯ GSM4915  GSM4916  GSM4917  GSM4918 \n",
       "AADAC    4,843054 4,567442 4,268574 ⋯ 3,992699 3,724247 3,401156 4,314926\n",
       "AAK1     3,370090 4,033090 3,521773 ⋯ 3,365616 3,448040 3,277319 3,483527\n",
       "AAMP     6,983283 6,968767 6,134034 ⋯ 7,542823 6,242356 7,463244 7,561346\n",
       "AANAT    7,630458 8,497340 7,720368 ⋯ 7,899382 7,625594 7,761377 7,922388\n",
       "AARS     8,438800 8,857150 8,694890 ⋯ 9,162233 9,401812 9,758292 8,780993\n",
       "AASDHPPT 7,753057 6,754180 9,084544 ⋯ 7,292513 8,133136 7,531076 7,058344\n",
       "         GSM4919  GSM4920  GSM4921  GSM4922  GSM4923  GSM4924 \n",
       "AADAC    5,820770 3,928832 3,613545 4,031425 4,874323 4,207930\n",
       "AAK1     3,256345 3,302839 3,270803 3,507228 3,485034 3,427167\n",
       "AAMP     6,944381 7,543045 7,032896 6,401135 6,753362 6,682369\n",
       "AANAT    7,437147 7,905339 7,712204 7,772773 7,795030 7,620933\n",
       "AARS     9,141654 9,511462 9,582736 8,506947 8,874166 8,419377\n",
       "AASDHPPT 7,443340 7,703600 7,198549 8,290108 8,486498 7,723431"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>8449</li>\n",
       "\t<li>24</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 8449\n",
       "\\item 24\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 8449\n",
       "2. 24\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 8449   24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# map probe set IDs to gene symbols and drop probe sets without a symbol\n",
    "anno_SYMBOL <- suppressMessages(select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"SYMBOL\"), keys=row.names(exprs(rma_doce_BrainArray))))\n",
    "anno_SYMBOL <- anno_SYMBOL[!is.na(anno_SYMBOL$SYMBOL),]\n",
    "# average probe sets that map to the same symbol\n",
    "CR <- WGCNA::collapseRows(exprs(rma_doce_BrainArray), rowGroup = anno_SYMBOL$SYMBOL, rowID = anno_SYMBOL$PROBEID,\n",
    "                   method=\"Average\")\n",
    "exprs_rma_doce_BrainArray <- CR$datETcollapsed\n",
    "write.table(exprs_rma_doce_BrainArray, file =\"Geeleher_with_GDSCr6/data/GSE6434.BrainArray.RMAlog2Average.Symbol.Expr.tsv\",na=\"\",quote = FALSE,sep=\"\\t\")\n",
    "head(exprs_rma_doce_BrainArray)\n",
    "dim(exprs_rma_doce_BrainArray)"
   ]
  },
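  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The collapsing step in the cell above can be pictured as a per-symbol mean of the probe-set rows. A minimal base-R sketch of that idea on toy data, assuming that `method=\"Average\"` amounts to such a mean; `expr`, `symbols`, `sums` and `averaged` are hypothetical names, and this is not a reimplementation of `WGCNA::collapseRows`:\n",
    "\n",
    "```r\n",
    "# Toy probe-set matrix: rows are probe sets, columns are samples (hypothetical values).\n",
    "expr <- matrix(c( 1,  3,  5,\n",
    "                  2,  4,  6,\n",
    "                 10, 20, 30), nrow = 3, byrow = TRUE,\n",
    "               dimnames = list(c(\"p1\", \"p2\", \"p3\"), c(\"s1\", \"s2\", \"s3\")))\n",
    "symbols <- c(\"GENE_A\", \"GENE_A\", \"GENE_B\")  # symbol of each probe set, in row order\n",
    "\n",
    "# Sum the rows per symbol, then divide by the number of probe sets per symbol.\n",
    "sums <- rowsum(expr, group = symbols)\n",
    "averaged <- sums / as.vector(table(symbols)[rownames(sums)])\n",
    "averaged\n",
    "```\n",
    "\n",
    "In this toy case the two `GENE_A` probe sets are averaged into one row, while `GENE_B` keeps its single row unchanged."
   ]
  },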
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>GSM4901</th><th scope=col>GSM4902</th><th scope=col>GSM4903</th><th scope=col>GSM4904</th><th scope=col>GSM4905</th><th scope=col>GSM4906</th><th scope=col>GSM4907</th><th scope=col>GSM4908</th><th scope=col>GSM4909</th><th scope=col>GSM4910</th><th scope=col>⋯</th><th scope=col>GSM4915</th><th scope=col>GSM4916</th><th scope=col>GSM4917</th><th scope=col>GSM4918</th><th scope=col>GSM4919</th><th scope=col>GSM4920</th><th scope=col>GSM4921</th><th scope=col>GSM4922</th><th scope=col>GSM4923</th><th scope=col>GSM4924</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>10</th><td>3,352156</td><td>3,308508</td><td>3,096914</td><td>3,236103</td><td>3,811192</td><td>3,104723</td><td>3,081062</td><td>3,091837</td><td>3,949944</td><td>3,622625</td><td>⋯       </td><td>3,136446</td><td>3,491678</td><td>3,365999</td><td>3,622488</td><td>3,450723</td><td>3,396406</td><td>3,453915</td><td>3,612785</td><td>3,148515</td><td>3,275832</td></tr>\n",
       "\t<tr><th scope=row>100</th><td>6,737301</td><td>6,659171</td><td>6,455985</td><td>6,695844</td><td>6,965612</td><td>6,603501</td><td>6,461829</td><td>6,246567</td><td>7,155956</td><td>6,826456</td><td>⋯       </td><td>6,775504</td><td>6,891247</td><td>6,983027</td><td>6,862854</td><td>6,768173</td><td>6,709703</td><td>6,484590</td><td>6,502096</td><td>6,745475</td><td>6,842699</td></tr>\n",
       "\t<tr><th scope=row>1000</th><td>3,834283</td><td>3,677712</td><td>3,396575</td><td>3,723289</td><td>3,450899</td><td>3,545390</td><td>3,474213</td><td>3,298312</td><td>3,681804</td><td>3,471742</td><td>⋯       </td><td>3,329445</td><td>3,611272</td><td>3,160738</td><td>3,487775</td><td>3,264067</td><td>3,476193</td><td>3,115201</td><td>3,433925</td><td>3,154068</td><td>3,239380</td></tr>\n",
       "\t<tr><th scope=row>10000</th><td>4,754271</td><td>4,496707</td><td>4,483903</td><td>4,585342</td><td>5,254096</td><td>4,824657</td><td>4,610763</td><td>4,705913</td><td>4,544194</td><td>4,187782</td><td>⋯       </td><td>5,769203</td><td>4,751104</td><td>4,562195</td><td>4,366119</td><td>5,410028</td><td>4,297884</td><td>4,593181</td><td>4,197862</td><td>4,547317</td><td>4,246281</td></tr>\n",
       "\t<tr><th scope=row>10001</th><td>3,424575</td><td>3,490181</td><td>3,820446</td><td>3,465579</td><td>3,489038</td><td>3,349595</td><td>3,558470</td><td>3,299228</td><td>3,656393</td><td>3,511745</td><td>⋯       </td><td>3,746529</td><td>3,405819</td><td>3,760430</td><td>3,949115</td><td>3,669616</td><td>3,876380</td><td>3,835922</td><td>4,054287</td><td>3,528116</td><td>3,664227</td></tr>\n",
       "\t<tr><th scope=row>10002</th><td>7,934315</td><td>7,757096</td><td>7,293050</td><td>7,820109</td><td>8,616442</td><td>7,514208</td><td>7,403382</td><td>7,470628</td><td>8,706345</td><td>8,217301</td><td>⋯       </td><td>7,802912</td><td>8,219931</td><td>8,165052</td><td>8,235681</td><td>7,045313</td><td>7,998694</td><td>7,701926</td><td>7,857424</td><td>7,532069</td><td>8,243082</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllllllllllllll}\n",
       "  & GSM4901 & GSM4902 & GSM4903 & GSM4904 & GSM4905 & GSM4906 & GSM4907 & GSM4908 & GSM4909 & GSM4910 & ⋯ & GSM4915 & GSM4916 & GSM4917 & GSM4918 & GSM4919 & GSM4920 & GSM4921 & GSM4922 & GSM4923 & GSM4924\\\\\n",
       "\\hline\n",
       "\t10 & 3,352156 & 3,308508 & 3,096914 & 3,236103 & 3,811192 & 3,104723 & 3,081062 & 3,091837 & 3,949944 & 3,622625 & ⋯        & 3,136446 & 3,491678 & 3,365999 & 3,622488 & 3,450723 & 3,396406 & 3,453915 & 3,612785 & 3,148515 & 3,275832\\\\\n",
       "\t100 & 6,737301 & 6,659171 & 6,455985 & 6,695844 & 6,965612 & 6,603501 & 6,461829 & 6,246567 & 7,155956 & 6,826456 & ⋯        & 6,775504 & 6,891247 & 6,983027 & 6,862854 & 6,768173 & 6,709703 & 6,484590 & 6,502096 & 6,745475 & 6,842699\\\\\n",
       "\t1000 & 3,834283 & 3,677712 & 3,396575 & 3,723289 & 3,450899 & 3,545390 & 3,474213 & 3,298312 & 3,681804 & 3,471742 & ⋯        & 3,329445 & 3,611272 & 3,160738 & 3,487775 & 3,264067 & 3,476193 & 3,115201 & 3,433925 & 3,154068 & 3,239380\\\\\n",
       "\t10000 & 4,754271 & 4,496707 & 4,483903 & 4,585342 & 5,254096 & 4,824657 & 4,610763 & 4,705913 & 4,544194 & 4,187782 & ⋯        & 5,769203 & 4,751104 & 4,562195 & 4,366119 & 5,410028 & 4,297884 & 4,593181 & 4,197862 & 4,547317 & 4,246281\\\\\n",
       "\t10001 & 3,424575 & 3,490181 & 3,820446 & 3,465579 & 3,489038 & 3,349595 & 3,558470 & 3,299228 & 3,656393 & 3,511745 & ⋯        & 3,746529 & 3,405819 & 3,760430 & 3,949115 & 3,669616 & 3,876380 & 3,835922 & 4,054287 & 3,528116 & 3,664227\\\\\n",
       "\t10002 & 7,934315 & 7,757096 & 7,293050 & 7,820109 & 8,616442 & 7,514208 & 7,403382 & 7,470628 & 8,706345 & 8,217301 & ⋯        & 7,802912 & 8,219931 & 8,165052 & 8,235681 & 7,045313 & 7,998694 & 7,701926 & 7,857424 & 7,532069 & 8,243082\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/markdown": [
       "\n",
       "| <!--/--> | GSM4901 | GSM4902 | GSM4903 | GSM4904 | GSM4905 | GSM4906 | GSM4907 | GSM4908 | GSM4909 | GSM4910 | ⋯ | GSM4915 | GSM4916 | GSM4917 | GSM4918 | GSM4919 | GSM4920 | GSM4921 | GSM4922 | GSM4923 | GSM4924 | \n",
       "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
       "| 10 | 3,352156 | 3,308508 | 3,096914 | 3,236103 | 3,811192 | 3,104723 | 3,081062 | 3,091837 | 3,949944 | 3,622625 | ⋯        | 3,136446 | 3,491678 | 3,365999 | 3,622488 | 3,450723 | 3,396406 | 3,453915 | 3,612785 | 3,148515 | 3,275832 | \n",
       "| 100 | 6,737301 | 6,659171 | 6,455985 | 6,695844 | 6,965612 | 6,603501 | 6,461829 | 6,246567 | 7,155956 | 6,826456 | ⋯        | 6,775504 | 6,891247 | 6,983027 | 6,862854 | 6,768173 | 6,709703 | 6,484590 | 6,502096 | 6,745475 | 6,842699 | \n",
       "| 1000 | 3,834283 | 3,677712 | 3,396575 | 3,723289 | 3,450899 | 3,545390 | 3,474213 | 3,298312 | 3,681804 | 3,471742 | ⋯        | 3,329445 | 3,611272 | 3,160738 | 3,487775 | 3,264067 | 3,476193 | 3,115201 | 3,433925 | 3,154068 | 3,239380 | \n",
       "| 10000 | 4,754271 | 4,496707 | 4,483903 | 4,585342 | 5,254096 | 4,824657 | 4,610763 | 4,705913 | 4,544194 | 4,187782 | ⋯        | 5,769203 | 4,751104 | 4,562195 | 4,366119 | 5,410028 | 4,297884 | 4,593181 | 4,197862 | 4,547317 | 4,246281 | \n",
       "| 10001 | 3,424575 | 3,490181 | 3,820446 | 3,465579 | 3,489038 | 3,349595 | 3,558470 | 3,299228 | 3,656393 | 3,511745 | ⋯        | 3,746529 | 3,405819 | 3,760430 | 3,949115 | 3,669616 | 3,876380 | 3,835922 | 4,054287 | 3,528116 | 3,664227 | \n",
       "| 10002 | 7,934315 | 7,757096 | 7,293050 | 7,820109 | 8,616442 | 7,514208 | 7,403382 | 7,470628 | 8,706345 | 8,217301 | ⋯        | 7,802912 | 8,219931 | 8,165052 | 8,235681 | 7,045313 | 7,998694 | 7,701926 | 7,857424 | 7,532069 | 8,243082 | \n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "      GSM4901  GSM4902  GSM4903  GSM4904  GSM4905  GSM4906  GSM4907  GSM4908 \n",
       "10    3,352156 3,308508 3,096914 3,236103 3,811192 3,104723 3,081062 3,091837\n",
       "100   6,737301 6,659171 6,455985 6,695844 6,965612 6,603501 6,461829 6,246567\n",
       "1000  3,834283 3,677712 3,396575 3,723289 3,450899 3,545390 3,474213 3,298312\n",
       "10000 4,754271 4,496707 4,483903 4,585342 5,254096 4,824657 4,610763 4,705913\n",
       "10001 3,424575 3,490181 3,820446 3,465579 3,489038 3,349595 3,558470 3,299228\n",
       "10002 7,934315 7,757096 7,293050 7,820109 8,616442 7,514208 7,403382 7,470628\n",
       "      GSM4909  GSM4910  ⋯ GSM4915  GSM4916  GSM4917  GSM4918  GSM4919  GSM4920 \n",
       "10    3,949944 3,622625 ⋯ 3,136446 3,491678 3,365999 3,622488 3,450723 3,396406\n",
       "100   7,155956 6,826456 ⋯ 6,775504 6,891247 6,983027 6,862854 6,768173 6,709703\n",
       "1000  3,681804 3,471742 ⋯ 3,329445 3,611272 3,160738 3,487775 3,264067 3,476193\n",
       "10000 4,544194 4,187782 ⋯ 5,769203 4,751104 4,562195 4,366119 5,410028 4,297884\n",
       "10001 3,656393 3,511745 ⋯ 3,746529 3,405819 3,760430 3,949115 3,669616 3,876380\n",
       "10002 8,706345 8,217301 ⋯ 7,802912 8,219931 8,165052 8,235681 7,045313 7,998694\n",
       "      GSM4921  GSM4922  GSM4923  GSM4924 \n",
       "10    3,453915 3,612785 3,148515 3,275832\n",
       "100   6,484590 6,502096 6,745475 6,842699\n",
       "1000  3,115201 3,433925 3,154068 3,239380\n",
       "10000 4,593181 4,197862 4,547317 4,246281\n",
       "10001 3,835922 4,054287 3,528116 3,664227\n",
       "10002 7,701926 7,857424 7,532069 8,243082"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "anno_ENTREZ <- suppressMessages(select(get(customAnno$customDBname), keytype=\"PROBEID\", \n",
    "  columns=c(\"ENTREZID\"), keys=row.names(exprs(rma_doce_BrainArray))))\n",
    "# drop probesets without an Entrez ID (use the full column name instead of relying on partial matching)\n",
    "anno_ENTREZ <- anno_ENTREZ[!is.na(anno_ENTREZ$ENTREZID),]\n",
    "CR <- WGCNA::collapseRows(exprs(rma_doce_BrainArray), rowGroup = anno_ENTREZ$ENTREZID,\n",
    "                          rowID = anno_ENTREZ$PROBEID,method=\"Average\")\n",
    "exprs_rma_doce_BrainArray_ENTREZ <- CR$datETcollapsed\n",
    "write.table(exprs_rma_doce_BrainArray_ENTREZ, file =\"Geeleher_with_GDSCr6/data/GSE6434.BrainArray.RMAlog2Average.Entrez.Expr.tsv\",\n",
    "            na=\"\",quote = FALSE,sep=\"\\t\")\n",
    "head(exprs_rma_doce_BrainArray_ENTREZ)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "8061"
      ],
      "text/latex": [
       "8061"
      ],
      "text/markdown": [
       "8061"
      ],
      "text/plain": [
       "[1] 8061"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "24"
      ],
      "text/latex": [
       "24"
      ],
      "text/markdown": [
       "24"
      ],
      "text/plain": [
       "[1] 24"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "shared_genes <- intersect(row.names(exprs_rma_doce_BrainArray), row.names(doceVivoNorm_syms))\n",
    "length(shared_genes)\n",
    "shared_samples <- intersect(colnames(exprs_rma_doce_BrainArray), colnames(doceVivoNorm_syms))\n",
    "length(shared_samples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'Corr. between genes (mean, max, min)    0,994214009288569    0,999999751668813    -0,0322620349274823'</span>"
      ],
      "text/latex": [
       "'Corr. between genes (mean, max, min)    0,994214009288569    0,999999751668813    -0,0322620349274823'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'Corr. between genes (mean, max, min)    0,994214009288569    0,999999751668813    -0,0322620349274823'</span>"
      ],
      "text/plain": [
       "[1] \"Corr. between genes (mean, max, min)    0,994214009288569    0,999999751668813    -0,0322620349274823\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span style=white-space:pre-wrap>'Corr. between samples (mean, max, min)    0,995224019854658    0,996122633468715    0,993188134393841'</span>"
      ],
      "text/latex": [
       "'Corr. between samples (mean, max, min)    0,995224019854658    0,996122633468715    0,993188134393841'"
      ],
      "text/markdown": [
       "<span style=white-space:pre-wrap>'Corr. between samples (mean, max, min)    0,995224019854658    0,996122633468715    0,993188134393841'</span>"
      ],
      "text/plain": [
       "[1] \"Corr. between samples (mean, max, min)    0,995224019854658    0,996122633468715    0,993188134393841\""
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# per-gene correlation (across shared samples) between the two expression matrices\n",
    "gene_corrs <- diag(cor(t(exprs_rma_doce_BrainArray[shared_genes,shared_samples]), t(doceVivoNorm_syms[shared_genes,shared_samples])))\n",
    "# per-sample correlation (across shared genes) between the two expression matrices\n",
    "sample_corrs <- diag(cor(exprs_rma_doce_BrainArray[shared_genes,shared_samples], doceVivoNorm_syms[shared_genes,shared_samples]))\n",
    "# mean, max and min of the per-gene correlations\n",
    "paste(\"Corr. between genes (mean, max, min)\",mean(gene_corrs),max(gene_corrs),min(gene_corrs), sep =\"    \")\n",
    "# mean, max and min of the per-sample correlations\n",
    "paste(\"Corr. between samples (mean, max, min)\",mean(sample_corrs),max(sample_corrs),min(sample_corrs), sep =\"    \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "116"
      ],
      "text/latex": [
       "116"
      ],
      "text/markdown": [
       "116"
      ],
      "text/plain": [
       "[1] 116"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# number of shared genes whose correlation between the two expression matrices is below 0.9\n",
    "sum(gene_corrs < 0.9)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "R version 3.5.0 (2018-04-23)\n",
       "Platform: x86_64-conda_cos6-linux-gnu (64-bit)\n",
       "Running under: Ubuntu 16.04.5 LTS\n",
       "\n",
       "Matrix products: default\n",
       "BLAS/LAPACK: /home/olya/miniconda2/lib/R/lib/libRblas.so\n",
       "\n",
       "locale:\n",
       " [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=de_DE.UTF-8       \n",
       " [3] LC_TIME=de_DE.UTF-8           LC_COLLATE=en_US.UTF-8       \n",
       " [5] LC_MONETARY=de_DE.UTF-8       LC_MESSAGES=en_US.UTF-8      \n",
       " [7] LC_PAPER=de_DE.UTF-8          LC_NAME=de_DE.UTF-8          \n",
       " [9] LC_ADDRESS=de_DE.UTF-8        LC_TELEPHONE=de_DE.UTF-8     \n",
       "[11] LC_MEASUREMENT=de_DE.UTF-8    LC_IDENTIFICATION=de_DE.UTF-8\n",
       "\n",
       "attached base packages:\n",
       "[1] stats4    parallel  stats     graphics  grDevices utils     datasets \n",
       "[8] methods   base     \n",
       "\n",
       "other attached packages:\n",
       " [1] hugene10sthsentrezg.db_22.0.0  hugene10sthsentrezgcdf_22.0.0 \n",
       " [3] hgu133bhsentrezg.db_22.0.0     hgu133bhsentrezgcdf_22.0.0    \n",
       " [5] hgu133ahsentrezg.db_22.0.0     hgu133ahsentrezgcdf_22.0.0    \n",
       " [7] hgu95av2hsentrezg.db_22.0.0    hgu95av2hsentrezgcdf_22.0.0   \n",
       " [9] hgu133plus2hsentrezg.db_22.0.0 hgu133plus2hsentrezgcdf_22.0.0\n",
       "[11] hgu219hsentrezg.db_22.0.0      org.Hs.eg.db_3.4.1            \n",
       "[13] hgu219hsentrezgcdf_22.0.0      AnnotationDbi_1.38.2          \n",
       "[15] IRanges_2.10.5                 S4Vectors_0.14.7              \n",
       "[17] affyio_1.46.0                  affy_1.54.0                   \n",
       "[19] Biobase_2.36.2                 BiocGenerics_0.22.1           \n",
       "\n",
       "loaded via a namespace (and not attached):\n",
       " [1] matrixStats_0.54.0    fit.models_0.5-14     robust_0.4-18        \n",
       " [4] bit64_0.9-7           doParallel_1.0.11     RColorBrewer_1.1-2   \n",
       " [7] repr_0.15.0           dynamicTreeCut_1.63-1 tools_3.5.0          \n",
       "[10] backports_1.1.2       R6_2.2.2              rpart_4.1-13         \n",
       "[13] Hmisc_4.1-1           DBI_1.0.0             lazyeval_0.2.1       \n",
       "[16] colorspace_1.3-2      nnet_7.3-12           tidyselect_0.2.4     \n",
       "[19] gridExtra_2.3         bit_1.1-14            compiler_3.5.0       \n",
       "[22] preprocessCore_1.38.1 WGCNA_1.64-1          htmlTable_1.12       \n",
       "[25] scales_1.0.0          checkmate_1.8.5       mvtnorm_1.0-8        \n",
       "[28] DEoptimR_1.0-8        robustbase_0.93-2     pbdZMQ_0.3-3         \n",
       "[31] stringr_1.3.1         digest_0.6.17         foreign_0.8-71       \n",
       "[34] rrcov_1.4-4           base64enc_0.1-3       pkgconfig_2.0.2      \n",
       "[37] htmltools_0.3.6       htmlwidgets_1.2       rlang_0.2.2          \n",
       "[40] impute_1.50.1         rstudioapi_0.7        RSQLite_2.1.1        \n",
       "[43] BiocInstaller_1.26.1  bindr_0.1.1           jsonlite_1.5         \n",
       "[46] acepack_1.4.1         dplyr_0.7.6           magrittr_1.5         \n",
       "[49] GO.db_3.4.1           Formula_1.2-3         Matrix_1.2-14        \n",
       "[52] Rcpp_0.12.18          IRkernel_0.8.11       munsell_0.5.0        \n",
       "[55] stringi_1.2.4         MASS_7.3-50           zlibbioc_1.22.0      \n",
       "[58] plyr_1.8.4            grid_3.5.0            blob_1.1.1           \n",
       "[61] crayon_1.3.4          lattice_0.20-35       IRdisplay_0.5.0      \n",
       "[64] splines_3.5.0         knitr_1.20            pillar_1.3.0         \n",
       "[67] fastcluster_1.1.25    uuid_0.1-2            codetools_0.2-15     \n",
       "[70] glue_1.3.0            evaluate_0.11         latticeExtra_0.6-28  \n",
       "[73] data.table_1.11.4     foreach_1.4.4         gtable_0.2.0         \n",
       "[76] purrr_0.2.5           assertthat_0.2.0      ggplot2_3.0.0        \n",
       "[79] pcaPP_1.9-73          survival_2.42-6       tibble_1.4.2         \n",
       "[82] iterators_1.0.10      memoise_1.1.0         bindrcpp_0.2.2       \n",
       "[85] cluster_2.0.7-1      "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sessionInfo()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.5.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}