This function is a utility wrapper for SummarizedExperiment that provides automatic subsetting for row and column data, as well as automatic handling of transgenes and spike-ins.

makeSummarizedExperiment(assays, ...)

# S4 method for SimpleList
makeSummarizedExperiment(
  assays,
  rowRanges = GRangesList(),
  rowData = NULL,
  colData = DataFrame(),
  metadata = list(),
  transgeneNames = NULL,
  spikeNames = NULL,
  sort = TRUE,
  sessionInfo = TRUE
)

# S4 method for list
makeSummarizedExperiment(
  assays,
  rowRanges = GRangesList(),
  rowData = NULL,
  colData = DataFrame(),
  metadata = list(),
  transgeneNames = NULL,
  spikeNames = NULL,
  sort = TRUE,
  sessionInfo = TRUE
)

Arguments

assays

SimpleList. Count matrices, which must have matching dimensions. Counts can be passed in as either a dense matrix (matrix) or sparse matrix (sparseMatrix).

...

Additional arguments.

rowRanges

GRanges or GRangesList. Genomic ranges (e.g. genome annotations). Metadata describing the assay rows.

rowData

DataFrame. Metadata describing the assay rows, if genomic ranges are not available. Use rowRanges (GRanges) instead, if possible.

colData

DataFrame. Metadata describing the assay columns. For bulk RNA-seq, this data describes the samples. For single-cell RNA-seq, this data describes the cells.

metadata

list. Metadata.

transgeneNames

character. Vector indicating which assay rows denote transgenes (e.g. EGFP, TDTOMATO).

spikeNames

character. Vector indicating which assay rows denote spike-in sequences (e.g. ERCCs).

sort

logical(1). Ensure all row and column names are sorted alphabetically. This includes columns inside rowData and colData, and metadata slot names. Assay names are required to contain counts as the first assay.

sessionInfo

logical(1). Slot session information into metadata.

Value

  • Providing rowRanges: RangedSummarizedExperiment.

  • Providing rowData: SummarizedExperiment.

Note

Column and rows always return sorted alphabetically.

Updated 2019-09-09.

Session information

This function improves upon the standard constructor by slotting useful session information into the metadata slot by default:

  • date: Today's date, returned from Sys.Date.

  • wd: Working directory, returned from getwd.

  • sessionInfo: sessioninfo::session_info() return.

This behavior can be disabled by setting sessionInfo = FALSE.

See also

Examples

## Rows (genes) genes <- c( sprintf("gene%02d", seq_len(3L)), "EGFP" # transgene ) print(genes)
#> [1] "gene01" "gene02" "gene03" "EGFP"
## Columns (samples) samples <- sprintf("sample%02d", seq_len(4L)) print(samples)
#> [1] "sample01" "sample02" "sample03" "sample04"
## Counts (assay) counts <- matrix( data = seq_len(length(genes) * length(samples)), nrow = length(genes), ncol = length(samples), dimnames = list(genes, samples) ) ## Primary assay must be named "counts". assays <- SimpleList(counts = counts) print(assays)
#> List of length 1 #> names(1): counts
## Row data (genomic ranges) ## Note that we haven't defined the transgene here. ## It will be handled automatically in the function call. rowRanges <- emptyRanges(names = head(genes, n = length(genes) - 1L)) print(rowRanges)
#> GRanges object with 3 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> gene01 unknown 1-100 * #> gene02 unknown 101-200 * #> gene03 unknown 201-300 * #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths
## Column data colData <- DataFrame( age = rep( x = c(3L, 6L), times = length(samples) / 2L ), genotype = rep( x = c("wildtype", "knockout"), times = 1L, each = length(samples) / 2L ), row.names = samples ) print(colData)
#> DataFrame with 4 rows and 2 columns #> age genotype #> <integer> <character> #> sample01 3 wildtype #> sample02 6 wildtype #> sample03 3 knockout #> sample04 6 knockout
## Minimal mode. x <- makeSummarizedExperiment(assays = assays) print(x)
#> class: SummarizedExperiment #> dim: 4 4 #> metadata(3): date sessionInfo wd #> assays(1): counts #> rownames(4): EGFP gene01 gene02 gene03 #> rowData names(0): #> colnames(4): sample01 sample02 sample03 sample04 #> colData names(0):
x <- makeSummarizedExperiment( assays = assays, rowRanges = rowRanges, colData = colData, transgeneNames = "EGFP" ) print(x)
#> class: RangedSummarizedExperiment #> dim: 4 4 #> metadata(3): date sessionInfo wd #> assays(1): counts #> rownames(4): EGFP gene01 gene02 gene03 #> rowData names(0): #> colnames(4): sample01 sample02 sample03 sample04 #> colData names(2): age genotype