Runtime And Tuning#
Use RuntimeConfig for provider/runtime behavior and ExtractOptions for deterministic pipeline controls.
RuntimeConfig#
Required:
- model: str
Common:
- temperature
- max_tokens
- stream
- storage_dir
- respect_context_window
Retry policy (retry):
- max_attempts
- initial_backoff_seconds
- max_backoff_seconds
- backoff_multiplier
- retry_on_rate_limit
- retry_on_transient_errors
- auto_resume_paused_runs
- max_pause_resumes
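The backoff fields above suggest a capped exponential schedule. The sketch below shows how such a schedule is conventionally computed. The helper `backoff_delays` is hypothetical and not part of the library, and the assumption that each retry waits `initial_backoff_seconds * backoff_multiplier**n`, capped at `max_backoff_seconds`, is not confirmed by this page.

```python
def backoff_delays(max_attempts, initial_backoff_seconds,
                   max_backoff_seconds, backoff_multiplier):
    """Return the wait in seconds before each retry (hypothetical helper).

    Assumption: one delay precedes each retry, so max_attempts attempts
    produce max_attempts - 1 delays. The library may compute this differently.
    """
    delays = []
    delay = initial_backoff_seconds
    for _ in range(max(0, max_attempts - 1)):
        # Cap every delay at the configured maximum.
        delays.append(min(delay, max_backoff_seconds))
        delay *= backoff_multiplier
    return delays


print(backoff_delays(5, 1.0, 8.0, 2.0))
```

Under these assumptions, a higher `backoff_multiplier` reaches the cap sooner, and `max_attempts` bounds the total time a failing call can consume.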
Optional workflows:
- session_refinement: per-document session context across chunks
- reconciliation: canonical claims per document
ExtractOptions#
- max_chunk_chars
- context_window_chars
- max_passes
- batch_concurrency
- enable_fuzzy_alignment
- fuzzy_alignment_threshold
- accept_partial_exact
- stop_when_no_new_extractions
- allow_unresolved
Behavior That Affects Output#
- If allow_unresolved=False (default), unresolved candidates are counted in metrics but not returned in documents[*].extractions.
- If strict_example_alignment=True (default in ExtractionTask) and examples are unresolved, extraction raises ExampleValidationError before runtime execution.
- If the reconciliation workforce fails at runtime, the engine falls back to deterministic canonical-claim construction and records warnings.
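The allow_unresolved rule can be illustrated with a small stand-alone sketch. `Candidate` and `select_extractions` are hypothetical stand-ins, not library types, and the metric names are assumptions. The only behavior taken from this page is that unresolved candidates are still counted but are left out of the returned extractions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:  # hypothetical stand-in, not a sourcery type
    text: str
    resolved: bool


def select_extractions(candidates, allow_unresolved=False):
    """Count every unresolved candidate, but return it only when allowed."""
    unresolved = sum(1 for c in candidates if not c.resolved)
    kept = [c for c in candidates if c.resolved or allow_unresolved]
    metrics = {"total": len(candidates), "unresolved": unresolved}
    return kept, metrics


candidates = [Candidate("Acme Corp", True), Candidate("Q3 revenue", False)]
kept, metrics = select_extractions(candidates)
print([c.text for c in kept], metrics)
```

A nonzero unresolved count alongside a short result list is therefore expected with the default setting and does not by itself indicate a failure.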
Practical Baseline#
from sourcery.contracts import ExtractOptions, RuntimeConfig

runtime = RuntimeConfig(
    model="deepseek/deepseek-chat",
    temperature=0.0,
)

options = ExtractOptions(
    max_chunk_chars=1200,
    context_window_chars=200,
    max_passes=2,
    batch_concurrency=16,
    stop_when_no_new_extractions=True,
    allow_unresolved=False,
)
Throughput-Oriented Profile#
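from sourcery.contracts import ExtractOptions

# Larger chunks, a single pass, and higher concurrency favor speed.
options = ExtractOptions(
    max_chunk_chars=1800,
    context_window_chars=120,
    max_passes=1,
    batch_concurrency=32,
    stop_when_no_new_extractions=True,
)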
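A rough way to compare profiles is to estimate how many model requests a document generates. The sketch below is back-of-the-envelope arithmetic only. `estimate_load` is a hypothetical helper, and it assumes non-overlapping chunks of max_chunk_chars, one request per chunk per pass, and at most batch_concurrency requests in flight at once. None of these assumptions is confirmed by this page.

```python
import math


def estimate_load(document_chars, max_chunk_chars, max_passes, batch_concurrency):
    """Upper-bound estimate of chunks, requests, and sequential request waves."""
    chunks = math.ceil(document_chars / max_chunk_chars)
    # Upper bound: stop_when_no_new_extractions may end the run earlier.
    max_requests = chunks * max_passes
    max_waves = math.ceil(chunks / batch_concurrency) * max_passes
    return {"chunks": chunks, "max_requests": max_requests, "max_waves": max_waves}


# Baseline profile versus throughput profile on a 90,000-character document.
print(estimate_load(90_000, 1200, 2, 16))
print(estimate_load(90_000, 1800, 1, 32))
```

Under these assumptions the baseline needs 75 chunks and at most 150 requests, while the throughput profile needs 50 chunks and at most 50 requests.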
Quality-Oriented Profile#
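from sourcery.contracts import ExtractOptions

# Smaller chunks, more context, extra passes, and fuzzy alignment favor quality.
options = ExtractOptions(
    max_chunk_chars=900,
    context_window_chars=280,
    max_passes=3,
    enable_fuzzy_alignment=True,
    fuzzy_alignment_threshold=0.82,
    allow_unresolved=False,
)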
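To get a feel for what a threshold such as 0.82 admits, the sketch below scores string pairs with `difflib.SequenceMatcher` from the Python standard library. This is a stand-in: the similarity measure the library uses for fuzzy alignment is not stated on this page, so the numbers are illustrative only. `fuzzy_accepts` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def fuzzy_accepts(extracted, source_span, threshold=0.82):
    """Return (accepted, ratio) using difflib's similarity ratio as a stand-in."""
    ratio = SequenceMatcher(None, extracted, source_span).ratio()
    return ratio >= threshold, ratio


# A trailing period is a small difference; unrelated strings score zero.
print(fuzzy_accepts("Acme Corp.", "Acme Corp"))
print(fuzzy_accepts("abc", "xyz"))
```

Raising the threshold rejects more near-misses, which tends to trade unresolved candidates for fewer wrong alignments.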
Tuning Sequence#
- Freeze schema and examples first.
- Set temperature=0.0.
- Measure baseline metrics and warnings.
- Increase max_passes only when extraction recall improves materially.
- Increase concurrency only if provider limits and system resources allow it.
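The max_passes step in the sequence above can be made concrete by measuring recall at each setting and stopping at the first pass that adds too little. The sketch below is one possible decision rule, not something the library provides. `passes_worth_keeping` is a hypothetical helper, and the 0.02 minimum gain is an arbitrary placeholder for "materially".

```python
def passes_worth_keeping(recall_by_passes, min_gain=0.02):
    """Pick the largest max_passes whose added pass improved recall enough.

    recall_by_passes[i] is the recall measured with max_passes = i + 1
    on a fixed evaluation set (measured by you, not by this function).
    """
    best = 1
    for i in range(1, len(recall_by_passes)):
        gain = recall_by_passes[i] - recall_by_passes[i - 1]
        if gain >= min_gain:
            best = i + 1
        else:
            break  # Stop at the first pass that does not pay for itself.
    return best


print(passes_worth_keeping([0.80, 0.88, 0.885]))
```

Because schema, examples, and temperature are frozen first, differences in measured recall can be attributed to the pass count and not to prompt drift.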