The combine generic function dispatches methods for combining or merging different Bioconductor data structures. Given an arbitrary number of arguments of the same class (possibly by inheritance), it should combine them into a single instance in a sensible way. Some methods may combine only 2 objects, ignoring ... in the argument list. Because Bioconductor data structures are complicated, check carefully that combine does what you intend.

combine(x, y, ...)

# S4 method for SummarizedExperiment,SummarizedExperiment
combine(x, y)

# S4 method for SingleCellExperiment,SingleCellExperiment
combine(x, y)

Arguments

x

One of the values.

y

A second value.

...

Additional arguments.

Value

SummarizedExperiment.

Details

There are two basic combine strategies. Under the intersection strategy, the returned value has only the rows (or columns) that are found in all input data objects. Under the union strategy, the returned value has all rows (or columns) found in any one of the input data objects, in which case some indication of what to use for missing values must be provided.
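
As a rough illustration of the two strategies, the sketch below applies them to two small base R matrices with partially overlapping row names. The objects `a` and `b` and the helper `pad` are hypothetical, and this is not how any particular combine method is implemented.

```r
## Two hypothetical inputs with partially overlapping row names.
a <- matrix(
    1:6, nrow = 3L,
    dimnames = list(c("gene1", "gene2", "gene3"), c("s1", "s2"))
)
b <- matrix(
    7:12, nrow = 3L,
    dimnames = list(c("gene2", "gene3", "gene4"), c("s3", "s4"))
)

## Intersection strategy: keep only rows present in every input.
shared <- intersect(rownames(a), rownames(b))
intersection <- cbind(
    a[shared, , drop = FALSE],
    b[shared, , drop = FALSE]
)

## Union strategy: keep rows present in any input, filling gaps with NA.
## `pad` is a hypothetical helper that expands a matrix to a full row set.
pad <- function(m, rows) {
    out <- matrix(
        NA_integer_, nrow = length(rows), ncol = ncol(m),
        dimnames = list(rows, colnames(m))
    )
    out[rownames(m), ] <- m
    out
}
allRows <- union(rownames(a), rownames(b))
unionResult <- cbind(pad(a, allRows), pad(b, allRows))

print(intersection)
print(unionResult)
```

Here NA plays the role of the "indication of what to use for missing values" that the union strategy requires.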

These functions and methods are currently under construction. Please let us know if there are features that you require.

Note

We're attempting to make this as strict as possible, requiring:

  • Rows (genes) across objects must be identical.

  • rowRanges and/or rowData metadata must be identical.

  • colData must contain the same columns.

  • Specific metadata must be identical (see metadata argument).
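
The requirements above could be checked before calling combine with something like the sketch below. The function `checkCombinable` and its `metadataKeys` argument are hypothetical names introduced here for illustration; the actual methods may perform different or additional checks (for example on rowRanges, which this sketch does not inspect).

```r
library(SummarizedExperiment)

## Hypothetical pre-flight check; `metadataKeys` is an assumed argument name.
checkCombinable <- function(x, y, metadataKeys = character()) {
    stopifnot(
        ## Rows (genes) must be identical across objects.
        identical(rownames(x), rownames(y)),
        ## Row-level annotations must be identical.
        identical(rowData(x), rowData(y)),
        ## colData must contain the same columns.
        identical(colnames(colData(x)), colnames(colData(y))),
        ## Selected metadata entries must be identical.
        identical(metadata(x)[metadataKeys], metadata(y)[metadataKeys])
    )
    invisible(TRUE)
}

## Minimal toy objects to exercise the check.
counts <- matrix(
    1:6, nrow = 3L,
    dimnames = list(paste0("gene", 1:3), c("s1", "s2"))
)
x <- SummarizedExperiment(
    assays = list(counts = counts),
    rowData = DataFrame(biotype = c("a", "b", "c"), row.names = rownames(counts)),
    colData = DataFrame(condition = factor(c("A", "B")), row.names = colnames(counts))
)
y <- x
colnames(y) <- c("s3", "s4")

checkCombinable(x, y)
```

Dropping a row from one object, as in `checkCombinable(x, y[1:2, ])`, should make the check fail because the row names no longer match.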

Methods

The following methods are defined in the BiocGenerics package:

combine(x=ANY, missing)

Return the first (x) argument unchanged.

combine(data.frame, data.frame)

Combines two data.frame objects so that the resulting data.frame contains all rows and columns of the original objects. Rows and columns in the returned value are unique, that is, a row or column represented in both arguments is represented only once in the result. To perform this operation, combine makes sure that data in shared rows and columns are identical in the two data.frames. Data differences in shared rows and columns usually cause an error. combine issues a warning when a column is a factor and the levels of the factor in the two data.frames are different.
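
A small sketch of the data.frame method, assuming the BiocGenerics package is installed. The two data.frames are made up for illustration: they share rows "c" to "e" and the column `y`, and the shared cells agree.

```r
library(BiocGenerics)

## Two data.frames sharing rows "c", "d", "e" and the column `y`.
x <- data.frame(
    x = 1:5,
    y = factor(letters[1:5], levels = letters[1:8]),
    row.names = letters[1:5]
)
y <- data.frame(
    z = 3:7,
    y = factor(letters[3:7], levels = letters[1:8]),
    row.names = letters[3:7]
)

## Shared cells agree, so the result should hold the union of rows and columns.
res <- combine(x, y)
print(res)
```

Both factors are given the same levels here to avoid the warning about differing factor levels described above.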

combine(matrix, matrix)

Combines two matrix objects so that the resulting matrix contains all rows and columns of the original objects. Both matrices must have dimnames. Rows and columns in the returned value are unique, that is, a row or column represented in both arguments is represented only once in the result. To perform this operation, combine makes sure that data in shared rows and columns are all equal in the two matrices.
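
A comparable sketch for the matrix method, again assuming BiocGenerics is installed. The matrix `m` is made up for illustration, and both pieces are taken from it so that any shared cells are guaranteed to agree.

```r
library(BiocGenerics)

## A matrix with dimnames, as the method requires.
m <- matrix(1:20, nrow = 5L, dimnames = list(LETTERS[1:5], letters[1:4]))

## Disjoint row sets: the pieces should reassemble into a 5 x 4 matrix.
whole <- combine(m[1:3, ], m[4:5, ])

## Overlapping pieces: row "C" and column "c" appear in both inputs and
## must agree there; cells covered by neither input should come back as NA.
partial <- combine(m[1:3, 1:3], m[3:5, 3:4])

print(whole)
print(partial)
```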

Additional combine methods are defined in the Biobase package for AnnotatedDataFrame, AssayData, MIAME, and eSet objects.

Examples

library(SummarizedExperiment)
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#>
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#>     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#>     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#>     union, unique, unsplit, which, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:base':
#>
#>     expand.grid
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: DelayedArray
#> Loading required package: matrixStats
#>
#> Attaching package: 'matrixStats'
#> The following objects are masked from 'package:Biobase':
#>
#>     anyMissing, rowMedians
#> Loading required package: BiocParallel
#>
#> Attaching package: 'DelayedArray'
#> The following objects are masked from 'package:matrixStats':
#>
#>     colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
#> The following objects are masked from 'package:base':
#>
#>     aperm, apply, rowsum
data(rse, sce, package = "acidtest")

## SummarizedExperiment ====
x <- rse
colnames(x)
#>  [1] "sample01" "sample02" "sample03" "sample04" "sample05" "sample06"
#>  [7] "sample07" "sample08" "sample09" "sample10" "sample11" "sample12"
colData(x)
#> DataFrame with 12 rows and 1 column
#>          condition
#>           <factor>
#> sample01         A
#> sample02         A
#> sample03         A
#> sample04         A
#> sample05         A
#> ...            ...
#> sample08         B
#> sample09         B
#> sample10         B
#> sample11         B
#> sample12         B
## Create a copy of our minimal example.
y <- x
colnames(y) <- paste0(
    "sample",
    seq(from = ncol(y) + 1L, to = ncol(y) * 2L)
)
colnames(y)
#>  [1] "sample13" "sample14" "sample15" "sample16" "sample17" "sample18"
#>  [7] "sample19" "sample20" "sample21" "sample22" "sample23" "sample24"
colData(y)
#> DataFrame with 12 rows and 1 column
#>          condition
#>           <factor>
#> sample13         A
#> sample14         A
#> sample15         A
#> sample16         A
#> sample17         A
#> ...            ...
#> sample20         B
#> sample21         B
#> sample22         B
#> sample23         B
#> sample24         B
## Combine two SummarizedExperiment objects.
c <- combine(x, y)
print(c)
#> class: RangedSummarizedExperiment
#> dim: 500 24
#> metadata(6): interestingGroups version ... wd sessionInfo
#> assays(1): counts
#> rownames(500): gene001 gene002 ... gene499 gene500
#> rowData names(5): geneID geneName geneBiotype broadClass entrezID
#> colnames(24): sample01 sample02 ... sample23 sample24
#> colData names(1): condition
colnames(c)
#>  [1] "sample01" "sample02" "sample03" "sample04" "sample05" "sample06"
#>  [7] "sample07" "sample08" "sample09" "sample10" "sample11" "sample12"
#> [13] "sample13" "sample14" "sample15" "sample16" "sample17" "sample18"
#> [19] "sample19" "sample20" "sample21" "sample22" "sample23" "sample24"
colData(c)
#> DataFrame with 24 rows and 1 column
#>          condition
#>           <factor>
#> sample01         A
#> sample02         A
#> sample03         A
#> sample04         A
#> sample05         A
#> ...            ...
#> sample20         B
#> sample21         B
#> sample22         B
#> sample23         B
#> sample24         B
## SingleCellExperiment ====
x <- sce
head(colnames(x))
#> [1] "cell001" "cell002" "cell003" "cell004" "cell005" "cell006"
sampleData(x)
#> DataFrame with 2 rows and 2 columns
#>         sampleName interestingGroups
#>           <factor>          <factor>
#> sample1    sample1           sample1
#> sample2    sample2           sample2
## Here we're faking a distinct replicate, just as an example.
y <- x
## Increase the cell ID numbers.
cells <- colnames(y) %>%
    sub("cell", "", .) %>%
    as.integer() %>%
    `+`(ncol(y)) %>%
    paste0("cell", .)
colnames(y) <- cells
head(colnames(y))
#> [1] "cell101" "cell102" "cell103" "cell104" "cell105" "cell106"
## Increase the sample ID numbers.
sampleID <- y$sampleID
sampleID <- gsub("1$", "3", sampleID)
sampleID <- gsub("2$", "4", sampleID)
y$sampleID <- as.factor(sampleID)
sampleData(y)
#> DataFrame with 2 rows and 2 columns
#>         sampleName interestingGroups
#>           <factor>          <factor>
#> sample3    sample3           sample3
#> sample4    sample4           sample4
## Combine two SingleCellExperiment objects.
c <- combine(x, y)
print(c)
#> class: SingleCellExperiment
#> dim: 500 200
#> metadata(4): combine date wd sessionInfo
#> assays(1): counts
#> rownames(500): gene1 gene10 ... gene98 gene99
#> rowData names(5): geneID geneName geneBiotype broadClass entrezID
#> colnames(200): cell001 cell002 ... cell199 cell200
#> colData names(2): expLibSize sampleID
#> reducedDimNames(0):
#> spikeNames(0):
sampleData(c)
#> DataFrame with 4 rows and 2 columns
#>         sampleName interestingGroups
#>           <factor>          <factor>
#> sample1    sample1           sample1
#> sample2    sample2           sample2
#> sample3    sample3           sample3
#> sample4    sample4           sample4