---
title: Saving arrays to artifacts and back again
author:
- name: Aaron Lun
  email: infinite.monkeys.with.keyboards@gmail.com
package: alabaster.matrix
date: "Revised: November 28, 2023"
output:
  BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Saving and loading arrays}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo=FALSE}
library(BiocStyle)
self <- Githubpkg("ArtifactDB/alabaster.matrix")
knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE)
```
# Overview
The `r self` package implements methods to save matrix-like objects to file artifacts and load them back into R.
Check out the `r Githubpkg("ArtifactDB/alabaster.base")` package for more details on the motivation and the **alabaster** framework.
# Quick start
Given an array-like object, we can use `saveObject()` to save it inside a staging directory:
```{r}
library(Matrix)
y <- rsparsematrix(1000, 100, density=0.05)
library(alabaster.matrix)
tmp <- tempfile()
saveObject(y, tmp)
list.files(tmp, recursive=TRUE)
```
We then load it back into our R session with `readObject()`.
This creates an HDF5-backed S4 array that can be easily coerced into the desired format, e.g., a `dgCMatrix`.
```{r}
roundtrip <- readObject(tmp)
class(roundtrip)
```
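As a minimal sketch of that coercion, the chunk below saves a small sparse matrix, reads it back and converts the result into an in-memory `dgCMatrix`.
The names `small`, `path` and `reloaded` are hypothetical, and the example assumes that the usual `as()` coercion from a `DelayedMatrix` to a `dgCMatrix` (provided by `r Biocpkg("DelayedArray")`) applies to the class returned by `readObject()`.

```{r}
library(Matrix)
library(alabaster.matrix)

# Save a small sparse matrix and read it back; all names here are illustrative.
small <- rsparsematrix(50, 10, density=0.1)
path <- tempfile()
saveObject(small, path)
reloaded <- readObject(path)

# Assumption: the reloaded object is a DelayedMatrix subclass, so the
# DelayedArray coercion to dgCMatrix is available.
sparse <- as(reloaded, "dgCMatrix")
class(sparse)

# The dimensions should match those of the matrix that was saved.
identical(dim(sparse), dim(small))
```

Coercing pulls the entire matrix into memory, so for large artifacts it may be preferable to keep working with the file-backed object.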
This process is supported for all base arrays, `r CRANpkg("Matrix")` objects and `r Biocpkg("DelayedArray")` objects.
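For instance, the same round trip can be sketched for an ordinary three-dimensional base array.
The names `arr`, `arr_path` and `reloaded_arr` are hypothetical, and the comparison assumes that `as.array()` can realize the reloaded object in memory, as it can for other `DelayedArray` objects.

```{r}
library(alabaster.matrix)

# A dense 2 x 3 x 4 base array; the names in this chunk are illustrative.
arr <- array(runif(24), dim=c(2, 3, 4))
arr_path <- tempfile()
saveObject(arr, arr_path)

# Read it back and check the dimensions.
reloaded_arr <- readObject(arr_path)
dim(reloaded_arr)

# Realize the reloaded array in memory and compare it to the original.
all.equal(as.array(reloaded_arr), arr)
```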
# Saving delayed operations
🚧🚧🚧 Oops. The delayed operations have yet to be migrated to the new `saveObject()`/`readObject()` world, so stay tuned. 🚧🚧🚧
<!--
For `DelayedArray`s, we may instead choose to save the delayed operations themselves to file, using the `r Githubpkg("LTLA/chihaya")` package.
This creates an HDF5 file following the [**chihaya**](https://ltla.github.io/chihaya) format, containing the delayed operations rather than the results of their evaluation.
```{r, eval=FALSE}
library(DelayedArray)
y <- DelayedArray(rsparsematrix(1000, 100, 0.05))
y <- log1p(abs(y) / 1:100) # adding some delayed ops.
preserveDelayedOperations(TRUE)
meta <- stageObject(y, tmp, "delayed")
.writeMetadata(meta, tmp)
meta <- acquireMetadata(tmp, "delayed/delayed.h5")
roundtrip <- loadObject(meta, tmp)
class(roundtrip)
```
However, it is probably best to avoid preserving delayed operations for file-backed `DelayedArray`s if you want the artifacts to be re-usable on different filesystems.
For example, `HDF5Array`s will be saved with a reference to an absolute file path, which will not be portable.
-->
# Session information {-}
```{r}
sessionInfo()
```