Plot embeddings via UMAP, colored by arbitrary labels
Source: R/plot_embeddings.R
Computes a 2D UMAP projection of the embedding columns V1..Vd and returns a scatter plot with points colored by label. Uses cosine distance by default, which matches the similarity measure commonly used for embeddings.
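To illustrate why cosine distance is a common choice for embeddings, here is a minimal base R sketch. The helper `cosine_distance()` is hypothetical and is not part of this package; it only shows that cosine distance depends on the direction of two vectors, not on their length.

```r
# Hypothetical helper (not part of this package):
# cosine distance = 1 - cosine similarity.
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

u <- c(1, 2, 3)
v <- 10 * u          # same direction as u, ten times the length
w <- c(3, -1.5, 0)   # orthogonal to u (dot product is 3 - 3 + 0 = 0)

d_uv <- cosine_distance(u, v)  # approximately 0: the directions are identical
d_uw <- cosine_distance(u, w)  # 1: the vectors are orthogonal
e_uv <- sqrt(sum((u - v)^2))   # Euclidean distance is large even though d_uv is near 0
```

Under a Euclidean metric `u` and `v` would be far apart, while under the cosine metric they are treated as nearly identical.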
Usage
plot_embeddings_umap(
  embeddings,
  labels,
  n_neighbors = 15,
  min_dist = 0.1,
  metric = "cosine",
  n_epochs = 500,
  seed = 42,
  sample_n = NULL,
  point_size = 2,
  alpha = 0.5
)
Arguments
- embeddings
Path to a Parquet file or dataset directory containing columns id and V1..Vd.
- labels
Label mapping for ids. Supported formats:
data frame with columns id and label,
path to a CSV file with columns id and label,
named character vector where names are ids and values are labels,
named list where each element is a vector of ids for that label.
- n_neighbors, min_dist, metric, n_epochs
UMAP parameters passed to uwot::umap(). Defaults: n_neighbors = 15, min_dist = 0.1, metric = "cosine", n_epochs = 500.
- seed
Random seed for reproducibility (set to NULL to skip).
- sample_n
Optional maximum number of rows to sample for plotting (applied before UMAP). If NULL, uses all rows.
- point_size, alpha
Point size and transparency for points in the plot. Defaults: point_size = 2, alpha = 0.5.
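The four supported `labels` formats all carry the same information: one label per id. The sketch below builds small examples of each and converts them to a common id/label data frame. The helper `normalize_labels()` is hypothetical and written only for illustration; how the package handles these formats internally is not documented here.

```r
# Hypothetical helper (not part of the documented API): coerce each
# supported `labels` format to a data frame with columns id and label.
normalize_labels <- function(labels) {
  if (is.data.frame(labels)) {
    # Data frame with columns id and label.
    return(labels[, c("id", "label")])
  }
  if (is.character(labels) && length(labels) == 1L && is.null(names(labels))) {
    # Single unnamed string: treated as a path to a CSV file.
    return(utils::read.csv(labels, stringsAsFactors = FALSE)[, c("id", "label")])
  }
  if (is.list(labels)) {
    # Named list: each element holds the ids belonging to that label.
    return(data.frame(
      id = unlist(labels, use.names = FALSE),
      label = rep(names(labels), lengths(labels)),
      stringsAsFactors = FALSE
    ))
  }
  # Named character vector: names are ids, values are labels.
  data.frame(id = names(labels), label = unname(labels), stringsAsFactors = FALSE)
}

# The same mapping expressed in each in-memory format.
as_df   <- data.frame(id = c("a", "b", "c"), label = c("x", "x", "y"),
                      stringsAsFactors = FALSE)
as_vec  <- c(a = "x", b = "x", c = "y")
as_list <- list(x = c("a", "b"), y = "c")

from_df   <- normalize_labels(as_df)
from_vec  <- normalize_labels(as_vec)
from_list <- normalize_labels(as_list)
```

The named list is convenient when labels are defined as groups of ids, while the named character vector is the most compact form when there is exactly one label per id.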