Given a set of feature indices defining a module (e.g. a cluster of co-regulated features), computes a sample-by-sample cosine similarity matrix based on the concatenated feature embeddings within that module. This captures how similar two samples are in terms of the connectivity structure of the module's features.
Arguments
- projected_list
Named list of per-sample projected feature matrices (features x latent dims), as returned by
run_MOSAIC()$mosaic_embed_list. Each element is a matrix where row i is feature i's embedding in that sample.
- feature_idx
Integer vector of feature indices (row indices) belonging to the module. For example,
which(feature_clusters == module_id).
Value
A symmetric numeric matrix of dimension n_samples x n_samples,
with cosine similarity values (range -1 to 1). Row and column names
correspond to sample names from projected_list.
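As a rough illustration of the computation described above, the following base-R sketch flattens each sample's module rows into one vector and takes the cosine similarity between those vectors. The function name module_cosine_sketch and the toy data are hypothetical. This is not the package source, and any preprocessing the real function applies (for example centering or scaling) is not documented here and is therefore not reproduced.

```r
# Hypothetical re-implementation for illustration only; not the MOSAIC source.
module_cosine_sketch <- function(projected_list, feature_idx) {
  # One row per sample: the module's feature embeddings flattened into a vector
  flat <- do.call(rbind, lapply(projected_list, function(m) {
    as.vector(m[feature_idx, , drop = FALSE])
  }))
  # Scale each row to unit length so the cross-product is the cosine similarity
  unit <- flat / sqrt(rowSums(flat^2))
  sim <- tcrossprod(unit)
  dimnames(sim) <- list(names(projected_list), names(projected_list))
  sim
}

# Toy input (made-up values): 3 samples, 4 features x 3 latent dimensions each
set.seed(1)
toy <- list(
  s1 = matrix(rnorm(12), nrow = 4),
  s2 = matrix(rnorm(12), nrow = 4),
  s3 = matrix(rnorm(12), nrow = 4)
)
sim_toy <- module_cosine_sketch(toy, feature_idx = c(1, 3))
```

The sketch assumes every matrix in the list has the same features in the same row order and the same latent dimensions, so that the flattened vectors are comparable across samples.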
Details
This is the first step in MOSAIC's unsupervised subgroup detection pipeline:
compute module similarity, then test for a balanced partition with
find_partition_hclust.
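To give a feel for the second step, the sketch below splits samples into two groups from a similarity matrix using base R's hclust and cutree. It is a stand-in under stated assumptions, not find_partition_hclust itself: the conversion to a dissimilarity (1 - similarity), the average linkage, and the toy matrix are all choices made for this illustration, and the balance test performed by the real function is not shown.

```r
# Toy similarity matrix with two obvious groups (made-up values)
sim_toy <- matrix(
  c(1.0, 0.9, 0.1, 0.2,
    0.9, 1.0, 0.2, 0.1,
    0.1, 0.2, 1.0, 0.8,
    0.2, 0.1, 0.8, 1.0),
  nrow = 4,
  dimnames = list(paste0("s", 1:4), paste0("s", 1:4))
)

# Turn similarity into a dissimilarity (assumed conversion: 1 - similarity)
d <- as.dist(1 - sim_toy)

# Hierarchical clustering; average linkage is an arbitrary choice here
tree <- hclust(d, method = "average")

# Cut the tree into two groups and inspect how evenly the samples split
groups <- cutree(tree, k = 2)
group_sizes <- table(groups)
```

Inspecting group_sizes shows whether the two-way split is roughly even or dominated by one group.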
See also
find_partition_hclust for testing whether the
similarity matrix reveals distinct subgroups,
plot_mds_cluster for visualizing the sample similarity.
Examples
if (FALSE) { # \dontrun{
  # Run MOSAIC
  result <- run_MOSAIC(
    list(RNA = seurat_rna, ATAC = seurat_atac),
    assays = c("RNA", "GeneACT"),
    sample_meta = "sample_name",
    condition_meta = "condition"
  )

  # Suppose feature_clusters is a vector of module assignments
  # Compute similarity for module 5
  module_idx <- which(feature_clusters == 5)
  sim_mat <- compute_module_similarity(
    result$mosaic_embed_list,
    feature_idx = module_idx
  )

  # Visualize
  plot_mds_cluster(sim_mat, "Module 5",
                   cluster = result$annotation$Condition)
} # }