All functions

backend_config()
Build embedding backend configuration
backend_embed_texts()
Embed texts via configured backend
backend_info()
Get embedding backend model/service information
backend_read()
Read backend configuration from YAML
backend_save()
Save backend configuration to YAML
batch_collect_openai()
Collect completed OpenAI batch embedding jobs
batch_status_openai()
Inspect OpenAI batch state for a label
batch_submit_openai()
Submit OpenAI Batch jobs for corpus embeddings (asynchronous)
calibrate_threshold()
Calibrate threshold from Parquet scores by streaming batches
clean_abstract_for_embedding()
Clean title/abstract rows into embedding-ready text
demo_finalize_openai_batch()
Finalize OpenAI demo batch jobs and compare direct vs batch embeddings
distance_cosine()
Cosine distance between two numeric vectors
distance_reference_cosine()
Pairwise cosine distances with centroid axis between label partitions
distance_ridge()
Compute corpus distance to a reference embedding area
distances()
Join prototype and ridge distances lazily via Arrow
embed_corpus()
Stream a corpus dataset, embed it in batches, and write Parquet files
embed_texts()
Embed texts through a configured backend
fit_ridge()
Fit a reference-area model from an embeddings Parquet file
plot_embeddings_pca()
Plot embeddings via PCA, colored by arbitrary labels
plot_embeddings_umap()
Plot embeddings via UMAP, colored by arbitrary labels
run_demo_openai()
Create and optionally run an OpenAI-based demo project via Quarto
run_demo_openalex()
Create and optionally run a self-contained demo project via Quarto
score_reference_cosine()
Convert reference-cosine distances to scores
score_ridge()
Convert ridge distances to ridge scores
similarity_cosine()
Cosine similarity between two numeric vectors
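
For orientation, similarity_cosine() and distance_cosine() are described as operating on two numeric vectors. The formulas below are the standard textbook definitions, with the distance taken as one minus the similarity. Whether the package uses exactly this convention, and how it handles zero-length vectors, is an assumption to check against each function's own reference page.

```latex
\[
\operatorname{sim}(a, b)
  = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}} \, \sqrt{\sum_{i=1}^{n} b_i^{2}}},
\qquad
d(a, b) = 1 - \operatorname{sim}(a, b)
\]
```

As a worked example under these definitions, a = (1, 0) and b = (1, 1) give sim(a, b) = 1/√2 ≈ 0.707 and d(a, b) ≈ 0.293.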