check
Run environment and capacity preflight checks before heavy operations.
Usage
openalex-snapshot check --root-dir . --dataset all
What it checks
- dependency binaries (duckdb, aws)
- path writability for root/snapshot/parquet/metadata
- download disk requirement (remote manifest + safety margin)
- convert disk requirement (precise source inventory estimate)
- memory/tuning risk based on profile/workers/memory settings
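To illustrate the kind of checks listed above, here is a minimal Python sketch of a dependency, writability, and disk-space preflight. It is not the tool's actual implementation: the function names, the 10% safety margin, and the 1 GiB requirement are assumptions chosen for illustration only.

```python
import shutil
import tempfile


def find_missing_binaries(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def is_writable_dir(path):
    """Check writability by actually creating a temporary file in the directory."""
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


def has_enough_disk(path, required_bytes, safety_margin=0.10):
    """Compare free space against the requirement plus a safety margin.

    The 10% default margin is an illustrative assumption, not the tool's value.
    """
    free = shutil.disk_usage(path).free
    return free >= required_bytes * (1 + safety_margin)


if __name__ == "__main__":
    root = "."  # stands in for --root-dir
    print("missing binaries:", find_missing_binaries(["duckdb", "aws"]))
    print("root writable:", is_writable_dir(root))
    print("room for 1 GiB:", has_enough_disk(root, 1 << 30))
```

Creating a real temporary file is used here instead of a permission-bit test because it also catches read-only mounts and missing directories.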
Options
- --strict: fail on warnings as well as failures
- --json: machine-readable output
- --precise: enable precise source inventory estimate
- shared runtime args: --dataset, --profile, --workers, --max-memory-mb
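The effect of --strict can be pictured with a small Python sketch. This shows only the documented semantics (warnings fail the run when strict mode is on); the status labels "ok", "warning", and "failure" and the exit codes 0 and 1 are assumptions, not the tool's confirmed values.

```python
def exit_code(statuses, strict=False):
    """Map check statuses to a process exit code.

    statuses: iterable of "ok", "warning", or "failure" (hypothetical labels).
    Returns 0 on success and 1 otherwise (assumed exit codes).
    """
    statuses = list(statuses)
    if "failure" in statuses:
        return 1  # failures always fail the run
    if strict and "warning" in statuses:
        return 1  # strict mode promotes warnings to failures
    return 0


if __name__ == "__main__":
    results = ["ok", "warning", "ok"]
    print("default:", exit_code(results))
    print("strict:", exit_code(results, strict=True))
```

A wrapper script could follow the same pattern to decide whether a download or convert step should proceed after the preflight.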