In this tutorial we will show you how to use the .loom files that are generated by ASAP.
In ASAP we chose to store all the input and output data in Loom files.
This format is convenient for many reasons, especially for large-scale data storage such as single-cell experiments:
The main limitation (currently) is that a .loom file handles only ONE fixed dimension for its matrices (in terms of number of cells/genes).
This means that if you filter your main matrix (gene or cell filtering), you will have to create a NEW .loom file with the new dimensions.
That is the reason why, in the same project, a user can have multiple .loom files available to download (one for each matrix dimension).
For example, if you uploaded a count matrix of 10000 cells x 30000 genes, then performed a cell filtering (keeping only 8000 cells) followed by a gene filtering (keeping only 2000 genes), you will end up with 3 matrix dimensions (10000 x 30000, 8000 x 30000 and 8000 x 2000), and thus 3 .loom files:
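As a minimal sketch of this bookkeeping, the dimensions in the example above can be tracked in Python. The helper name `shapes_after_filtering` is hypothetical (it is not part of ASAP or loompy); it only illustrates why each filtering step leads to a new .loom file.

```python
def shapes_after_filtering(n_cells, n_genes, steps):
    """Return the matrix shape at upload and after each filtering step.

    steps is a list of (axis, number_kept) pairs applied in order,
    where axis is either "cells" or "genes".
    """
    shapes = [(n_cells, n_genes)]
    for axis, kept in steps:
        if axis == "cells":
            n_cells = kept
        elif axis == "genes":
            n_genes = kept
        else:
            raise ValueError(f"unknown axis: {axis!r}")
        # Every new shape corresponds to one additional .loom file
        shapes.append((n_cells, n_genes))
    return shapes


# The example from the text: 10000 cells x 30000 genes,
# then a cell filtering (8000 kept), then a gene filtering (2000 kept)
shapes = shapes_after_filtering(10000, 30000, [("cells", 8000), ("genes", 2000)])
for cells, genes in shapes:
    print(f"{cells} cells x {genes} genes -> one .loom file")
```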
Of note
Currently ASAP generates Loom v3 files, so if you use loompy/loomR to load the .loom file into your workspace, check that you have the latest version (one that can handle Loom v3).
ASAP follows the rules of the .loom format specs, so we invite the reader to check this website for more information (also see Figure 1).
In brief:
Retrieving information from the .loom file requires writing a few lines of code. Since it is an HDF5 file, any programming language with an HDF5 library can be used, but for simplicity we recommend using loomR (R language) or loompy (Python language).
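To illustrate that a generic HDF5 library is enough, here is a minimal sketch using the h5py Python package instead of loompy. The function names `list_top_level_groups` and `read_dataset` are hypothetical helpers written for this example, and the sketch assumes the dataset you ask for exists in your file.

```python
import h5py


def list_top_level_groups(path):
    """Return the sorted names of the top-level entries of an HDF5/Loom file."""
    with h5py.File(path, "r") as f:
        return sorted(f.keys())


def read_dataset(path, dataset_path):
    """Read one full dataset (e.g. a column attribute) into a NumPy array."""
    with h5py.File(path, "r") as f:
        # [()] reads the whole dataset into memory, whatever its shape
        return f[dataset_path][()]
```

For instance, `read_dataset("my_loom.loom", "col_attrs/cluster_seurat")` would return the stored values as an array, provided that attribute is present in the file. Note that string attributes may come back as bytes with h5py, which is one reason the dedicated Loom libraries are more convenient.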
Here follows an example in R and Python to retrieve clustering results stored in the "/col_attrs/cluster_seurat" metadata:
# R example (loomR)
# Load the library
require(loomR)
# Connect to the file
ds <- connect(filename = "my_loom.loom", mode = "r") # r for read-only and r+ for read/write
# Note: You may encounter the following error:
# Error in validateLoom(object = self) :
# There can only be 5 groups in the loom file: 'row_attrs', 'col_attrs', 'layers', 'row_graphs', 'col_graphs'
# If this is the case, it is because loomR is not yet applicable to Loom v3 standards (that is generated in ASAP).
# As of today, this is the case (2020.04.14). Maybe later you will simply need to update loomR version.
# You will need to use the option: skip.validate = T
# e.g.
# ds <- connect(filename = "my_loom.loom", mode = "r", skip.validate = T)
# Retrieve the data and put in variable clusters
clusters = ds[["col_attrs/cluster_seurat"]]
# Close the file handle
ds$close_all()

# Python example (loompy)
# Import the package
import loompy
# Connect to the file
ds = loompy.connect("my_loom.loom", "r") # r for read-only and r+ for read/write
# Retrieve the data and put in variable clusters
clusters = ds.col_attrs["cluster_seurat"]
# Close the file handle
ds.close()
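Once the cluster labels are loaded, a common next step is to check how many cells fall in each cluster. The following is a minimal sketch using only the Python standard library; the helper name `cluster_sizes` is hypothetical, and the labels below are toy values for illustration, not real ASAP output.

```python
from collections import Counter


def cluster_sizes(labels):
    """Return a dict {cluster label: number of cells}, sorted by label."""
    # NumPy arrays (as returned by loompy) expose tolist(); other iterables are copied
    values = labels.tolist() if hasattr(labels, "tolist") else list(labels)
    counts = Counter(values)
    return dict(sorted(counts.items()))


# Toy labels for illustration only
example_labels = [0, 1, 1, 2, 0, 1]
print(cluster_sizes(example_labels))  # {0: 2, 1: 3, 2: 1}
```

With the loompy example above, calling `cluster_sizes(clusters)` should give the same kind of summary for the "cluster_seurat" attribute.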