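library(openalexVectorComp)

# Point the package at a TEI server that is already running
backend <- backend_config(
  provider = "tei",
  base_url = "http://localhost:3000"
)

# Embed two example records through that backend
emb <- embed_texts(
  texts = c("Title: A\nAbstract: B", "Title: C\nAbstract: D"),
  backend = backend
)

# Inspect the dimensions of the result
dim(emb)
Purpose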
This vignette documents TEI server operations outside the package:
- start and stop workflows,
- health and info checks,
- endpoint verification,
- operational troubleshooting.
openalexVectorComp no longer manages the TEI process lifecycle internally.
Start TEI
Local binary
text-embeddings-router --model-id BAAI/bge-small-en-v1.5 --port 3000
Alternative port
text-embeddings-router --model-id BAAI/bge-small-en-v1.5 --port 3001
Verify service
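A freshly started server may need some time to load the model before it answers requests. The following is a minimal sketch of a polling loop around the health endpoint; the function name `wait_for_tei` and its default attempt count and delay are illustrative assumptions, not part of TEI or the package.

```shell
# Poll the TEI health endpoint until it answers or the attempts run out.
# Arguments (hypothetical defaults, adjust to your setup):
#   $1 base URL, $2 maximum attempts, $3 seconds between attempts
wait_for_tei() {
  base_url="${1:-http://localhost:3000}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP error statuses
    if curl -sf -o /dev/null "$base_url/health"; then
      echo "TEI is healthy at $base_url (attempt $i)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "TEI did not become healthy at $base_url after $attempts attempts" >&2
  return 1
}
```

A typical call would be `wait_for_tei http://localhost:3000 30 2` right after launching the router in another terminal.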
Health
curl -s http://localhost:3000/health
Info
curl -s http://localhost:3000/info
Embed smoke test
curl -s -X POST http://localhost:3000/embed \
  -H "Content-Type: application/json" \
  -d '{"inputs":["hello world"]}'
Use TEI endpoint in package
Pass the server's base URL to backend_config(), as in the R example at the top of this vignette.
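Before handing a base URL to `backend_config()`, it can be worth confirming from the shell that the embed endpoint returns something usable. The sketch below wraps the smoke test in a function and checks only the rough shape of the response. It assumes that `/embed` answers with a JSON array of arrays (one vector per input); the function name `tei_embed_ok` is hypothetical.

```shell
# Return 0 if the /embed endpoint at $1 answers with something shaped like
# a JSON array of arrays, e.g. [[0.01,-0.02,...]]; return 1 otherwise.
# The expected shape is an assumption; compare it with your server's output.
tei_embed_ok() {
  base_url="${1:-http://localhost:3000}"
  resp="$(curl -sf -X POST "$base_url/embed" \
    -H "Content-Type: application/json" \
    -d '{"inputs":["hello world"]}')" || return 1
  case "$resp" in
    '[['*']]') return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `tei_embed_ok http://localhost:3000 && echo "endpoint ready"` prints the message only when the check passes.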
Process management (shell)
If TEI is running in a terminal, stop it with Ctrl+C.
If it is running in the background:
pkill -f text-embeddings-router
Or find the PID and stop it explicitly:
ps aux | grep text-embeddings-router
kill <PID>
Recommended production pattern
- Run TEI under a process supervisor (systemd, supervisord, or container runtime).
- Expose one stable embed URL.
- Keep model id fixed for each embedding campaign.
- Record endpoint + model id in run metadata.
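One lightweight way to follow the last recommendation is to append a line to a metadata file at the start of each run. This is only a sketch: the function name `record_tei_run`, the default file name, and the tab-separated layout are all illustrative choices, not a package convention.

```shell
# Append one tab-separated record: UTC timestamp, embed endpoint, model id.
# File name and layout are hypothetical; adapt them to your run metadata.
record_tei_run() {
  base_url="$1"
  model_id="$2"
  out_file="${3:-tei_run_metadata.tsv}"
  stamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '%s\t%s\t%s\n' "$stamp" "$base_url/embed" "$model_id" >> "$out_file"
}
```

A call such as `record_tei_run http://localhost:3000 BAAI/bge-small-en-v1.5` adds one line per embedding campaign, which makes it easier to tell later which model produced which vectors.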
Troubleshooting
Port already in use
Start TEI on another port and update base_url in backend_config() to match.
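To pick a replacement port, a rough probe with curl is often enough. The sketch below treats curl exit code 7 (failed to connect) as a sign that nothing is listening. Any other outcome, including a timeout or a non-HTTP service that accepts the connection, counts as "in use". Both function names are hypothetical, and the check only looks at 127.0.0.1.

```shell
# Succeed when nothing accepts TCP connections on 127.0.0.1:$1.
# Relies on curl exit code 7 ("failed to connect to host").
port_looks_free() {
  curl -s -o /dev/null --connect-timeout 1 "http://127.0.0.1:$1/"
  [ "$?" -eq 7 ]
}

# Print the first port in the range $1..$2 that looks free;
# return 1 if every port in the range appears to be taken.
first_free_port() {
  port="$1"
  while [ "$port" -le "$2" ]; do
    if port_looks_free "$port"; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1
}
```

For instance, `first_free_port 3000 3010` prints a candidate port that can then be passed to `--port` when starting the router.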
Empty/invalid responses
- Check the curl smoke test directly against /embed.
- Reduce max_batch_size in backend_config().
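When it is unclear which batch size the server still accepts, the limit can be probed from the shell before changing the package configuration. The sketch below builds a request body with a given number of placeholder inputs and reports whether `/embed` accepts it. The function names and the placeholder texts are hypothetical, and real documents are longer than these inputs, so treat the result as an upper bound rather than a guarantee.

```shell
# Print a JSON body with $1 short placeholder inputs, e.g. for 2:
# {"inputs":["text 1","text 2"]}
tei_batch_payload() {
  n="$1"
  i=1
  items=""
  while [ "$i" -le "$n" ]; do
    if [ -n "$items" ]; then
      items="$items,"
    fi
    items="$items\"text $i\""
    i=$((i + 1))
  done
  printf '{"inputs":[%s]}\n' "$items"
}

# Return 0 if the server at $1 accepts a batch of $2 inputs.
# -d @- tells curl to read the request body from standard input.
tei_accepts_batch() {
  tei_batch_payload "$2" | curl -sf -o /dev/null -X POST "$1/embed" \
    -H "Content-Type: application/json" -d @-
}
```

Trying a few sizes in descending order, for example `tei_accepts_batch http://localhost:3000 32`, then 16, then 8, gives a starting point for the value to set in `backend_config()`.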
Slow throughput
- Increase TEI resources.
- Use a larger batch_size in embed_corpus() where feasible.
- Verify server-side limits from /info.
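Reading the limits out of `/info` can be scripted so that they are easy to compare with the batch size used on the R side. The sketch below uses a small sed-based helper instead of a JSON parser. The field names it looks for (`max_client_batch_size`, `max_batch_tokens`, `max_concurrent_requests`) are assumptions about the `/info` payload and should be checked against what your server actually returns; both function names are hypothetical.

```shell
# Read JSON on stdin and print the integer value of key $1.
# Deliberately minimal; a real JSON parser is safer for nested payloads.
json_int_field() {
  sed -n "s/.*\"$1\":[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Print selected limits from /info as key=value lines.
# The keys listed here are assumed field names, not a verified schema.
tei_limits() {
  info="$(curl -sf "${1:-http://localhost:3000}/info")" || return 1
  for key in max_client_batch_size max_batch_tokens max_concurrent_requests; do
    printf '%s=%s\n' "$key" "$(printf '%s\n' "$info" | json_int_field "$key")"
  done
}
```

If a key is missing from the response, its value is simply printed empty, which is a hint that the assumed field name does not match the server version in use.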