This function partitions genes based on their pairwise distances using various clustering methods.

ClusterGenes(
  dist,
  clustering_method = "average",
  min_gene = 10,
  deepSplit = 2,
  return_tree = FALSE,
  filtered = TRUE,
  accu = 0.75
)

Arguments

dist

dist; the gene-gene distance matrix.

clustering_method

character; the method for clustering the genes. Options are "average", "ward.D", "ward.D2", "single", "complete", "mcquitty", "median", "centroid", "dbscan", or "hdbscan". Default is "average".

min_gene

integer; the minimum number of genes each group should contain. Default is 10.

deepSplit

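integer; the deepSplit argument passed to cutreeDynamic, which controls how finely clusters are split (higher values give more, smaller clusters). Default is 2.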

return_tree

logical; if TRUE, returns the gene partition tree together with the gene partition; otherwise returns only the gene partition. Default is FALSE.

filtered

logical; if TRUE, filters genes out of each partition using the SML method (Parisi et al., 2014). Default is TRUE.

accu

numeric; the threshold for filtering out genes, used when filtered is TRUE. Default is 0.75.

Value

If return_tree is TRUE, returns a list containing the gene partition and the gene partition tree. Otherwise, returns the gene partition.

Details

This function performs hierarchical clustering on the distance matrix using the specified linkage method and partitions the genes into groups. The density-based methods "dbscan" and "hdbscan" are available as alternatives, and noisy genes can optionally be filtered out of each partition.
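The hierarchical path described above can be sketched with base R alone. This is an illustration under stated assumptions, not the ClusterGenes implementation: the eight hierarchical options for clustering_method have the same names as the linkage methods of stats::hclust, a fixed cutree(k = 3) stands in for the dynamic tree cut, the size filter is a hypothetical reading of min_gene, and the toy matrix expr is invented for the example.

```r
# Illustrative sketch using base R only; not the ClusterGenes implementation.
set.seed(1)

# Toy data (assumption): 60 "genes" (rows) in three well-separated groups of 20.
expr <- rbind(
  matrix(rnorm(20 * 5, mean = 0),  nrow = 20),
  matrix(rnorm(20 * 5, mean = 10), nrow = 20),
  matrix(rnorm(20 * 5, mean = 20), nrow = 20)
)
rownames(expr) <- paste0("gene", seq_len(nrow(expr)))

# Gene-gene distances, then a tree built with average linkage.
gene_dist <- dist(expr)
tree <- hclust(gene_dist, method = "average")

# Fixed cut into 3 groups; a stand-in for a dynamic cut,
# which would choose the number of groups itself.
partition <- cutree(tree, k = 3)

# Hypothetical size filter: keep only groups with at least min_gene members.
min_gene <- 10
group_size <- table(partition)
keep <- names(group_size)[group_size >= min_gene]
partition <- partition[as.character(partition) %in% keep]

# Group sizes after filtering.
table(partition)
```

The result is a named integer vector mapping each gene to a group label, which is the general shape of a gene partition; the labels and structure ClusterGenes itself returns may differ.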

References

Parisi, F., Strino, F., Nadler, B., & Kluger, Y. (2014). Ranking and combining multiple predictors without labeled data. Proceedings of the National Academy of Sciences, 111(4), 1253-1258. doi:10.1073/pnas.1219097111

Examples

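# Toy distance matrix for 10 genes (rows of a random 10 x 10 matrix)
gene_dist <- dist(matrix(runif(100), nrow = 10))
# Partition the genes using average-linkage hierarchical clustering
gene_partition <- ClusterGenes(gene_dist, clustering_method = "average")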
#>  ..cutHeight not given, setting it to 1.39  ===>  99% of the (truncated) height range in dendro.
#>  cutHeight set too low: no merges below the cut.
#> Filtering out outlier genes in each module:  0  genes left.