Tutorial 3: Import data containers from csv files

SummarizedExperiment (SE) and MultiAssayExperiment (MAE) containers created in R can be imported into Julia from a collection of csv files.

Importing SE

In this example, the R-native OKeefeDSData dataset is first disassembled into its components, which are saved as csv files. Next, these files are read in Julia and re-assembled into a SummarizedExperiment.

In R, you can adjust the code below to retrieve the se and split it into three csv files, which contain the assays, the rowdata and the coldata, respectively.

# R

# load OKeefeDSData container
library(microbiomeDataSets)
se <- OKeefeDSData()

# store assays, coldata and rowdata into variables
assays <- assays(se)
coldata <- colData(se)
rowdata <- rowData(se)

# write out csv files with assays, rowdata and coldata, respectively
write.csv(assays, "DS_assays.csv")
write.csv(rowdata, "DS_rowdata.csv")
write.csv(coldata, "DS_coldata.csv")

In Julia, pass the paths to the three files to the import_se_from_csv function in this order: assays, rowdata, coldata.

# Julia

# assemble se from csv files
se = import_se_from_csv("DS_assays.csv",
                        "DS_rowdata.csv",
                        "DS_coldata.csv")

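Because the import depends on three separate files, it can help to confirm that they are all present before calling the import function. The following is only a minimal sketch using Base Julia; the helper name missing_files is hypothetical and not part of the package, and the file names are the ones written by the R snippet above.

```julia
# Hypothetical helper (not part of the package):
# return the paths that do not point to an existing file.
function missing_files(paths)
    return [p for p in paths if !isfile(p)]
end

# File names written by the R snippet, in the order assays, rowdata, coldata
se_files = ["DS_assays.csv", "DS_rowdata.csv", "DS_coldata.csv"]

absent = missing_files(se_files)
if isempty(absent)
    println("All files found; ready to import.")
else
    println("Missing files: ", join(absent, ", "))
end
```

If any file is reported as missing, check the working directory in both R and Julia before retrying the import.
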
Importing MAE

Similarly, the HintikkaXOData container can be exported from R to Julia as an analogous MultiAssayExperiment object.

This time, the container needs to be split into one assay file per experiment in the mae. For example, HintikkaXOData contains three experiments, so three assay files need to be generated. In addition, the sample data and the sample map are each written to their own csv file.

# R

# load HintikkaXOData container
library(microbiomeDataSets)
mae <- HintikkaXOData()

# split mae into se elements
se1 <- mae[["microbiota"]]
se2 <- mae[["metabolites"]]
se3 <- mae[["biomarkers"]]

# store assays of respective se into variables
microbiota_assays <- assays(se1)
metabolites_assays <- assays(se2)
biomarkers_assays <- assays(se3)

# write out csv files with assays
write.csv(microbiota_assays, "XO_microbiota_assays.csv")
write.csv(metabolites_assays, "XO_metabolites_assays.csv")
write.csv(biomarkers_assays, "XO_biomarkers_assays.csv")

# write out csv files with sample data and sample map, respectively
write.csv(colData(mae), "XO_sample_data.csv")
write.csv(sampleMap(mae), "XO_sample_map.csv",
          row.names = FALSE)

As in the previous case, the csv files are re-assembled into a mae in Julia. The paths to the assay files are passed to import_mae_from_csv as a vector in the first argument, followed by the paths to the sample data and sample map files. The optional keyword argument experiment_names sets custom names for the experiments.

# Julia

# make a list with the paths to the assays files
experiment_files = ["XO_microbiota_assays.csv",
                    "XO_metabolites_assays.csv",
                    "XO_biomarkers_assays.csv"]

# make a list with custom names for the experiments
experiment_names = ["microbiota", "metabolites", "biomarkers"]

# assemble mae from csv files
mae = import_mae_from_csv(experiment_files,
                          "XO_sample_data.csv",
                          "XO_sample_map.csv",
                          experiment_names = experiment_names)
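
When the assay files follow a regular naming pattern, the vector of paths can be derived from the experiment names instead of being typed by hand. This is a minimal sketch that assumes the XO_<name>_assays.csv pattern used in the R snippet above, and that the names should appear in the same order as the files; it does not call the package itself.

```julia
# Names of the experiments, as used in the example above
experiment_names = ["microbiota", "metabolites", "biomarkers"]

# Assumed naming pattern: XO_<name>_assays.csv (matches the R snippet)
experiment_files = ["XO_$(name)_assays.csv" for name in experiment_names]

# Print each name next to its file to check the pairing by eye
for (name, file) in zip(experiment_names, experiment_files)
    println(name, " => ", file)
end
```

The two vectors can then be passed to import_mae_from_csv exactly as in the example above.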

Retrieval of readily available datasets

It is also possible to get started with a ready-made experiment and run analyses on it directly. Several studies come pre-installed with this package, such as HintikkaXOData, which is retrieved below.

To retrieve a specific experiment, call the function that shares its name. The function returns the experiment, which can be stored in a variable and investigated further.

mae = HintikkaXOData()
MultiAssayExperiment object
  experiments(3): microbiota metabolites biomarkers
  sampledata(7): name Sample ... Fat XOS
  metadata(0):