This is a utility R package containing custom code and scripts developed to
establish a working approach for integration of multi-omics data.
The package provides a unified toolkit for the analysis and integration of
multi-omic high-throughput data. It relies upon the mixOmics toolkit to provide
implementations of many of the underlying projection to latent structures (PLS)
methods used to analyse high-dimensional data. In addition, it includes custom
implementations of data pre-processing, normalisation, collation, model
validation, visualisation and output functions.
The originally individual scripts have been collected into a formal package that
should be installable and usable within an analyst's R environment without
further configuration. The package is fully documented at the function level.
This package requires R v3.6 or above. It is largely built upon the mixOmics
integration framework. The dependencies come from several different sources,
so an installation script is provided to make satisfying them as simple as
possible. mixOmics installs its own dependencies as well. Note that we install
mixOmics from the GitHub repository, as that version is more up to date than the
one on Bioconductor and includes a number of fixes needed to avoid bugs.
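Before installing, it can be worth confirming that the running R session meets the version requirement stated above. The following is a minimal sketch using only base R; the variable names are illustrative and not part of OmicsFold.

```r
# Minimum R version stated in this README.
required <- "3.6.0"

# getRversion() returns the running R version as a comparable version object.
current <- getRversion()

if (current < required) {
  stop("R ", required, " or above is required; this session is R ",
       as.character(current))
} else {
  message("R ", as.character(current),
          " satisfies the minimum requirement of ", required)
}
```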
Notable dependencies that will be installed if they are not already:
See the DESCRIPTION file for a complete dependency list.
Due to the number of dependencies and the number of places those dependencies
come from, there is an installation script available. This can be run by
opening an R session in your preferred environment, ensuring your working
directory is the OmicsFold directory, then issuing the following commands:
source('install.R')
install.omicsfold()
This should install all the dependencies and then finally the OmicsFold package
itself. If there are any issues due to versions changing or changes in which
repository maintains the active version of a package, you may have to update the
script.
If you are having issues installing OmicsFold in a conda environment, please try
the following steps:
First, create the conda environment:
conda create --name OmicsFold
source activate OmicsFold
conda install r=3.6.0
conda install -c conda-forge boost-cpp
Second, launch R in the conda environment (or in a local R installation, if you are not using conda) and manually install the following packages:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("metagenomeSeq")
BiocManager::install("org.Mm.eg.db")
install.packages("XML", repos = "http://www.omegahat.net/R")
source("http://labs.genetics.ucla.edu/horvath/htdocs/CoexpressionNetwork/GeneAnnotation/installAnRichment.R")
installAnRichment()
source('install.R')
install.omicsfold()
For installation using Nextflow (https://www.nextflow.io/docs/latest/getstarted.html), please see https://github.com/AstraZeneca/Omicsfold/tree/master/OmicsFold/nextflow_pipeline.
Load the OmicsFold and mixOmics packages in R and you're ready to go. Some
functions also require dplyr to be loaded, so it's a good idea to load it
anyway. Certain plotting functions may also require ggplot2 to be loaded.
library(OmicsFold)
library(mixOmics)
library(dplyr)
library(ggplot2)  # optional
A number of normalisation functions have been provided. Each has documentation
which can be read in the usual way in R. For example, the help for the function
normalise.tss can be viewed by calling ?normalise.tss. A brief description of
the usage of each function can be read in the Getting Started with
Normalisation document, which also includes example code for a few key
functions.
low.count.removal()
normalise.tss()
normalise.css()
normalise.logit()
normalise.logit.empirical()
normalise.clr()
normalise.clr.within.features()
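To illustrate what two of these transformations do conceptually, here is a minimal base R sketch. It assumes that the tss and clr suffixes refer to total sum scaling and the centred log-ratio transform, as is conventional. It is not OmicsFold's implementation: the helper names tss_sketch and clr_sketch, the samples-in-rows layout and the offset used to avoid log(0) are all assumptions made for the example. See ?normalise.tss and ?normalise.clr for the actual arguments and behaviour.

```r
# Toy count matrix: 3 samples (rows) x 4 features (columns).
counts <- matrix(
  c(10, 20, 30, 40,
     5,  5,  0, 10,
     1,  2,  3,  4),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("feature", 1:4))
)

# Total sum scaling: divide each count by its sample (row) total,
# so every row becomes a set of proportions summing to 1.
tss_sketch <- function(x) {
  sweep(x, 1, rowSums(x), "/")
}

# Centred log-ratio: log-transform, then subtract each sample's mean log value.
# The offset (a hypothetical default) avoids taking log(0) for zero counts.
clr_sketch <- function(x, offset = 1) {
  logged <- log(x + offset)
  sweep(logged, 1, rowMeans(logged), "-")
}

proportions <- tss_sketch(counts)
clr_values  <- clr_sketch(counts)

rowSums(proportions)  # each sample sums to 1
rowSums(clr_values)   # each sample sums to (numerically) 0
```

Both helpers operate row-wise, so samples with very different sequencing depths or total intensities end up on a comparable scale before model fitting.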
Once a mixOmics model has been fitted, OmicsFold can be used to perform a
number of visualisation and data extraction functions. Below is a brief list of
the functionality provided. While these are well documented in the R help
system, descriptions of how to use each function can also be found in the
Getting Started with Model Analysis document.
To contact the maintainers or project director, please refer to the AUTHORS
file. If you are thinking of contributing to OmicsFold, all the information you
will need is in the CONTRIBUTING file.
OmicsFold is licensed under the Apache-2.0 software licence as documented in the
LICENCE file. Separately installed dependencies of OmicsFold
may be licensed under different licence agreements. If you plan to create
derivative works from OmicsFold or use OmicsFold for commercial or profitable
enterprises, please ensure you adhere to all the expectations of these
dependencies and seek legal advice if you are unsure.