Architecture¶
Reflow has four main layers.
1. Definition layer¶
- Flow — reusable task groups
- Workflow — runtime orchestration (extends Flow)
- @wf.job() — task decorators
This is where you describe the DAG.
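The decorator-registration idea behind @wf.job() can be sketched with plain Python; the class and method bodies below are illustrative assumptions, not Reflow's actual implementation:

```python
from typing import Callable


class Flow:
    """Sketch: a flow collects decorated callables into a task table (its DAG nodes)."""

    def __init__(self, name: str):
        self.name = name
        self.jobs: dict[str, Callable] = {}

    def job(self) -> Callable:
        """Register the decorated function as a task in this flow."""
        def decorator(fn: Callable) -> Callable:
            self.jobs[fn.__name__] = fn
            return fn  # return the function unchanged so it remains directly callable
        return decorator


wf = Flow("demo")


@wf.job()
def prepare() -> str:
    return "data"


@wf.job()
def analyse() -> str:
    return "report"
```

Decorating with @wf.job() leaves the functions usable as ordinary Python while the flow object accumulates the node set it will later orchestrate.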
2. Parameter and typing layer¶
- Param — CLI parameter metadata
- Result — data dependency declarations
- RunDir — working directory injection
- Type inspection helpers in params.py
This layer turns Python function signatures into CLI flags, typed data edges, and wire mode inference.
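The signature-to-flags step can be sketched with the standard inspect and argparse modules; Reflow's actual inference (wire modes, typed edges) is richer than this:

```python
import argparse
import inspect


def parser_from_signature(fn) -> argparse.ArgumentParser:
    """Build an argparse parser whose flags mirror fn's parameters.

    Annotated types become argparse `type` converters; parameters without a
    default become required flags.
    """
    parser = argparse.ArgumentParser(prog=fn.__name__)
    for name, param in inspect.signature(fn).parameters.items():
        kind = param.annotation if param.annotation is not inspect.Parameter.empty else str
        if param.default is inspect.Parameter.empty:
            parser.add_argument(f"--{name}", type=kind, required=True)
        else:
            parser.add_argument(f"--{name}", type=kind, default=param.default)
    return parser


def train(samples: int, rate: float = 0.1) -> None:
    ...


args = parser_from_signature(train).parse_args(["--samples", "100"])
```

Here --samples is required and parsed as int, while --rate gets the default 0.1, mirroring how a function signature can drive a CLI.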
3. Persistence layer¶
- manifest.py — typed JSON serialisation (Path, datetime, UUID, …)
- stores/sqlite.py — SQLite-backed manifest store
- stores/records.py — typed dataclass records (RunRecord, TaskSpecRecord, TaskInstanceRecord)
- cache.py — Merkle-DAG identity hashing and verification
- results.py — file-based worker result protocol
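Typed JSON round-tripping of values like Path, datetime, and UUID can be sketched with a tagging encoder; the `__type__` tag scheme below is an assumption, not manifest.py's actual wire format:

```python
import json
import uuid
from datetime import datetime
from pathlib import Path


class TypedEncoder(json.JSONEncoder):
    """Encode non-JSON-native values as tagged {"__type__": ..., "value": ...} dicts."""

    def default(self, obj):
        for typ, name in ((Path, "path"), (datetime, "datetime"), (uuid.UUID, "uuid")):
            if isinstance(obj, typ):
                return {"__type__": name, "value": str(obj)}
        return super().default(obj)


def typed_decode(d: dict):
    """object_hook that reverses the tagging applied by TypedEncoder."""
    decoders = {"path": Path, "datetime": datetime.fromisoformat, "uuid": uuid.UUID}
    if "__type__" in d:
        return decoders[d["__type__"]](d["value"])
    return d


record = {"run_dir": Path("/tmp/run"), "started": datetime(2024, 1, 2, 3, 4, 5)}
blob = json.dumps(record, cls=TypedEncoder)
restored = json.loads(blob, object_hook=typed_decode)
```

The round trip restores the original Python types, which is what lets manifest rows carry paths and timestamps without lossy string coercion.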
4. Execution layer¶
- executors/__init__.py — Executor ABC, JobResources, key normalisation
- executors/slurm.py — Slurm backend
- executors/pbs.py — PBS Pro / OpenPBS / Torque backend
- executors/lsf.py — LSF backend
- executors/sge.py — SGE / UGE backend
- executors/flux.py — Flux backend
- executors/local.py — local subprocess backend
- executors/helpers.py — default_executor(), resolve_executor()
Each executor translates JobResources into scheduler-native CLI
commands and handles array jobs, dependency chaining, cancellation,
and state queries.
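The translation step can be sketched as an abstract base class whose subclasses render a resource mapping into scheduler-native flags; the class names and flag table below are illustrative, not the shipped executors module:

```python
from abc import ABC, abstractmethod


class Executor(ABC):
    """Sketch of an executor: turn a resource mapping into a submit command."""

    @abstractmethod
    def submit_command(self, resources: dict, script: str) -> list[str]:
        ...


class SlurmExecutor(Executor):
    # Hypothetical resource-key -> sbatch flag table for illustration.
    _FLAGS = {"partition": "--partition", "account": "--account", "cpus": "--cpus-per-task"}

    def submit_command(self, resources: dict, script: str) -> list[str]:
        cmd = ["sbatch"]
        for key, value in resources.items():
            cmd += [self._FLAGS[key], str(value)]
        return cmd + [script]


cmd = SlurmExecutor().submit_command({"partition": "gpu", "cpus": 4}, "run.sh")
```

A PBS or LSF subclass would keep the same submit_command contract and swap only the command name and flag table, which is what keeps workflow code backend-agnostic.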
Workflow subpackage¶
The Workflow class is split across a subpackage for maintainability:
workflow/
├── __init__.py — re-exports (Workflow + public helpers)
├── _core.py — Workflow class: submit, validate, cancel, retry, status, describe
├── _dispatch.py — DispatchMixin: dispatch loop, cache, result wiring, fan-out
├── _worker.py — WorkerMixin: task execution with signal handling
└── _helpers.py — make_run_id, build_kwargs, resolve_index
The MRO is Workflow → DispatchMixin → WorkerMixin → Flow → object.
External code still imports it with from reflow.workflow import Workflow, exactly as before the split.
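The cooperative-mixin layout can be sketched with stub classes to show how the base-class order produces that MRO; the bodies are empty stand-ins, not the real mixins:

```python
class Flow:
    """Stand-in for the definition-layer base class."""


class WorkerMixin:
    """Stand-in for task execution with signal handling."""


class DispatchMixin:
    """Stand-in for the dispatch loop, cache, and result wiring."""


class Workflow(DispatchMixin, WorkerMixin, Flow):
    """Bases are listed so Python's C3 linearisation yields
    Workflow -> DispatchMixin -> WorkerMixin -> Flow -> object."""


mro_names = [cls.__name__ for cls in Workflow.__mro__]
```

Listing DispatchMixin first means dispatch-layer overrides win over worker-layer ones, with Flow as the final shared base.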
Execution flow¶
1. wf.submit(...) validates the graph and checks required parameters.
2. Reflow opens the shared manifest database.
3. It inserts a run row, task specs, and initial task instances.
4. A dispatch job is submitted through the configured executor.
5. The dispatcher ingests worker result files, resolves upstream outputs, tries the Merkle cache, and submits runnable tasks.
6. Workers execute Python callables and write result files to <run_dir>/results/.
7. The dispatcher chains a follow-up dispatch with a scheduler dependency on all submitted jobs.
8. Steps 5–7 repeat until the graph is complete.
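The dispatcher's Merkle-cache check relies on each task's identity hash folding in the hashes of its upstream tasks; a minimal sketch (cache.py's actual hash inputs are richer):

```python
import hashlib


def task_hash(name: str, params: dict, upstream_hashes: list[str]) -> str:
    """Identity hash over the task name, its parameters, and its upstream hashes.

    Because upstream hashes are inputs, any change anywhere above a task in the
    DAG changes its hash, invalidating cached results below the change.
    """
    h = hashlib.sha256()
    h.update(name.encode())
    for key in sorted(params):
        h.update(f"{key}={params[key]}".encode())
    for up in sorted(upstream_hashes):
        h.update(up.encode())
    return h.hexdigest()


root = task_hash("prepare", {"n": 10}, [])
child_same = task_hash("analyse", {}, [root])
child_changed = task_hash("analyse", {}, [task_hash("prepare", {"n": 11}, [])])
```

Identical identities hash identically (so a cached result can be reused), while a parameter change in prepare ripples into analyse's hash even though analyse itself is unchanged.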
Key normalisation¶
Users write scheduler-agnostic keys (partition, queue, account).
Each executor declares a _KEY_ALIASES mapping that rewrites these
to the backend's native vocabulary before rendering CLI flags. This
means a workflow is portable across schedulers without config changes.
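The alias rewrite can be sketched as a single mapping pass; the alias tables below are illustrative, not the shipped _KEY_ALIASES values:

```python
# Hypothetical per-backend alias tables; real executors declare their own _KEY_ALIASES.
SLURM_ALIASES = {"queue": "partition", "project": "account"}
PBS_ALIASES = {"partition": "queue", "account": "project"}


def normalise_keys(resources: dict, aliases: dict) -> dict:
    """Rewrite scheduler-agnostic keys to the backend's native vocabulary.

    Keys without an alias pass through unchanged.
    """
    return {aliases.get(key, key): value for key, value in resources.items()}


portable = {"queue": "short", "cpus": 8}
slurm_native = normalise_keys(portable, SLURM_ALIASES)
```

The same portable dict run through PBS_ALIASES would keep "queue" as-is, so the workflow definition never needs to know which scheduler it lands on.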
Why the shared store matters¶
A separate database in every run directory makes inspection and cache reuse harder. Using one shared manifest store gives a cleaner split:
- run directory — files, logs, outputs
- manifest store — workflow history, task states, cache records
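The split can be sketched with the standard sqlite3 module: one shared database holds a row per run while run directories hold only files. The schema is an assumption (stores/sqlite.py's real tables also cover task specs and instances), and ":memory:" stands in for a real shared path:

```python
import sqlite3

# One shared manifest store for every run, instead of a database per run directory.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE runs (run_id TEXT PRIMARY KEY, run_dir TEXT, state TEXT)")

# Each run registers itself in the same store; its directory keeps only files/logs/outputs.
db.execute("INSERT INTO runs VALUES (?, ?, ?)", ("run-001", "/scratch/run-001", "done"))
db.execute("INSERT INTO runs VALUES (?, ?, ?)", ("run-002", "/scratch/run-002", "running"))

# Cross-run inspection becomes a single query rather than a crawl over run directories.
completed = [row[0] for row in db.execute("SELECT run_id FROM runs WHERE state = 'done'")]
```

A query like this is what makes workflow history and cache reuse cheap: the cache can consult one store for prior task states instead of opening a database inside every past run directory.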