Reader Node

What Reader is and why it exists

Reader is the entry boundary for external information entering a NodeFox workflow. In directed-graph systems, the quality of downstream behavior depends heavily on how input arrives, how it is shaped, and how failures are surfaced. Reader exists so ingestion is explicit, inspectable, and testable instead of hidden inside custom code blocks.

In practice, Reader should be treated as a contract boundary rather than a convenience utility. It is where you decide what source of truth is allowed, what data shape can move forward, and what fallback route should execute if upstream data is unavailable or malformed.

Execution behavior

Reader emits payloads into output slots, and those payloads determine which downstream nodes become eligible to run. Depending on variant and configuration, Reader may execute once per run, once per item in a batch window, or once per iteration in list processing. Because NodeFox orchestration is deterministic, keep Reader's output structure stable and normalize it in Code or Data immediately, before any policy branching.

Reader does not replace validation. It fetches or loads data. Validation, policy decisions, and write authorization should happen downstream in Decision and Writer-adjacent paths.

Variants and configuration details

Text variant

Text is used when source content is plain text and may need chunking. It is commonly used to ingest notes, tickets, logs, or free-form documents.

Field	What it controls	Practical guidance
`Path`	Source path in registered storage handles	Use stable, environment-aware naming so the same network can run across staging and production.
`Split By`	Delimiter used for segmentation	Use deterministic delimiters (`\n`, `\n\n`, custom tokens) and document the expected format in workflow notes.
`Batch`	Number of segments emitted per cycle	Keep batch sizes bounded to avoid sudden fan-out cost spikes.
`Emit On Complete`	Whether completion signal is emitted after final segment	Enable when downstream paths depend on explicit completion behavior.
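
The interaction between `Split By`, `Batch`, and `Emit On Complete` can be illustrated with a small standalone sketch. This is not NodeFox's implementation: the function name `read_text_batches`, the event labels `"batch"` and `"complete"`, and the choice to drop blank segments are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def read_text_batches(
    text: str,
    split_by: str = "\n",
    batch: int = 2,
    emit_on_complete: bool = True,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield ("batch", segments) events, then optionally ("complete", [])."""
    if not split_by:
        raise ValueError("split_by must be a non-empty delimiter")
    if batch < 1:
        raise ValueError("batch must be >= 1")

    # Assumption: blank segments are dropped, so a trailing delimiter
    # does not produce an empty item.
    segments = [s for s in text.split(split_by) if s.strip()]

    # Emit at most `batch` segments per cycle to keep fan-out bounded.
    for start in range(0, len(segments), batch):
        yield ("batch", segments[start:start + batch])

    # Explicit completion signal for downstream paths that wait on it.
    if emit_on_complete:
        yield ("complete", [])


if __name__ == "__main__":
    for event in read_text_batches("a\nb\nc\n", split_by="\n", batch=2):
        print(event)
```

With three segments and a batch size of two, the sketch yields two batch events followed by one completion event, which is the signal a downstream aggregation path would wait for.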

CSVExcel variant

CSVExcel is used for structured table ingestion from CSV or spreadsheet-like sources.

Field	What it controls	Practical guidance
`Path`	CSV/XLSX source path	Keep file naming conventions consistent with run timestamping and source ownership.
`Target Row Start/End`	Row window to process	Useful for partial replays, checkpoints, and controlled backfills.
`Search Column` / `Search Value`	Optional in-file filtering	Use only for deterministic key lookup patterns; avoid brittle fuzzy matching at ingest stage.
`Column Selections`	Mapping from source columns to output slots	Document these mappings so downstream nodes can rely on stable slot semantics.
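
As a rough model of how the row window, in-file filter, and column mapping combine, the following sketch uses Python's standard `csv` module. The function name `read_csv_window`, the 1-based row numbering that excludes the header, and slot names such as `slot_1` are assumptions for illustration, not documented NodeFox behavior.

```python
import csv
import io
from typing import Dict, List, Optional


def read_csv_window(
    source: str,
    row_start: int,
    row_end: int,
    column_selections: Dict[str, str],
    search_column: Optional[str] = None,
    search_value: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return rows in [row_start, row_end], filtered and mapped to slot names."""
    reader = csv.DictReader(io.StringIO(source))
    out: List[Dict[str, str]] = []
    # Assumption: rows are numbered from 1 and the header row is not counted.
    for index, row in enumerate(reader, start=1):
        if index < row_start:
            continue
        if index > row_end:
            break
        # Exact-match lookup only; no fuzzy matching at the ingest stage.
        if search_column is not None and row.get(search_column) != search_value:
            continue
        # A missing source column raises KeyError, so mapping drift fails loudly.
        out.append({slot: row[col] for col, slot in column_selections.items()})
    return out


SAMPLE = (
    "id,email,plan\n"
    "1,a@example.com,free\n"
    "2,b@example.com,pro\n"
    "3,c@example.com,pro\n"
)

if __name__ == "__main__":
    mapping = {"id": "slot_1", "email": "slot_2"}
    print(read_csv_window(SAMPLE, 2, 3, mapping, "plan", "pro"))
```

Keeping the mapping in one dictionary mirrors the advice to document `Column Selections`: downstream nodes only ever see the slot names, never the raw column headers.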

File variant

File is used for binary content such as PDFs, images, archives, or raw blobs.

Field	What it controls	Practical guidance
`Path`	Binary source location	Route large binaries through controlled branches and avoid loading oversized payloads into unnecessary paths.

API variant

API issues HTTP requests to external endpoints to pull data into the workflow. It is typically used for system integrations, data pulls, and status lookups.

Field	What it controls	Practical guidance
`URL`	Endpoint (supports `$N` interpolation)	Keep URL construction deterministic and avoid optional query fields that change behavior unpredictably.
`HTTP Method`	Request method (`GET`, `POST`, `PUT`, `PATCH`, `DELETE`)	Use method semantics that match source system contracts; do not overload a single endpoint for multiple actions without clear branch logic.
`API Key Reference`	Credential reference from integrations/settings	Never hardcode secrets in node text fields.
`Headers`	Request header map	Keep auth and version headers explicit and environment-specific.
`Response Type`	Expected parse mode (`JSON`, `Text`, `Blob`)	Align response parsing to downstream slot expectations to reduce schema drift.
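
The exact semantics of `$N` interpolation are not spelled out here, so the sketch below assumes one plausible reading: `$1`, `$2`, and so on refer to input values by 1-based position, and each value is percent-encoded before substitution. The helper names `interpolate_url` and `build_request` are hypothetical, and the request is only constructed, never sent.

```python
import re
import urllib.parse
import urllib.request
from typing import Dict, Optional, Sequence

ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


def interpolate_url(template: str, inputs: Sequence[str]) -> str:
    """Replace $1, $2, ... with percent-encoded input values (assumed semantics)."""

    def replace(match):
        index = int(match.group(1))
        if not 1 <= index <= len(inputs):
            # Fail loudly instead of emitting a URL with a hole in it.
            raise ValueError(f"no input value for placeholder ${index}")
        return urllib.parse.quote(str(inputs[index - 1]), safe="")

    return re.sub(r"\$(\d+)", replace, template)


def build_request(
    template: str,
    inputs: Sequence[str],
    method: str = "GET",
    headers: Optional[Dict[str, str]] = None,
) -> urllib.request.Request:
    """Build, but do not send, a request with an explicit method and headers."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    url = interpolate_url(template, inputs)
    return urllib.request.Request(url, method=method, headers=dict(headers or {}))


if __name__ == "__main__":
    req = build_request(
        "https://api.example.com/users/$1/orders?status=$2",
        ["42", "in progress"],
        headers={"Accept": "application/json"},
    )
    print(req.get_method(), req.full_url)
```

Encoding every interpolated value and rejecting unknown placeholders keeps URL construction deterministic: the same inputs always produce the same URL, and a missing input stops the request rather than silently changing its meaning.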

CloudFile variant

CloudFile retrieves objects from cloud-backed storage providers.

Field	What it controls	Practical guidance
`Provider`	Cloud provider connection target	Isolate by workspace/environment to reduce blast radius.
`File Identifier`	Object key or file id	Keep identifier resolution explicit in upstream branches.
`API Key Reference`	Credential path	Pair with key rotation policy and provider-side scope limits.

How to use Reader well in production

Reader performs best when followed by explicit normalization and validation. A common robust pattern is Reader -> Code(normalize) -> Data(extract) -> Decision(validate/policy). This keeps ingestion concerns separated from business logic and preserves traceability when incidents occur.
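
The Reader -> Code(normalize) -> Data(extract) -> Decision(validate/policy) pattern can be sketched as three small functions applied to a raw Reader payload. The field names `customer_id` and `amount`, the route labels `"accept"` and `"fallback"`, and the validation rules are invented for illustration; they only show how the stages separate concerns.

```python
from typing import Any, Dict, Tuple


def normalize(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Code(normalize): trim and lower-case keys so slot names are stable."""
    return {str(key).strip().lower(): value for key, value in raw.items()}


def extract(record: Dict[str, Any]) -> Dict[str, Any]:
    """Data(extract): keep only the fields downstream nodes rely on."""
    return {
        "customer_id": record.get("customer_id"),
        "amount": record.get("amount"),
    }


def decide(fields: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Decision(validate/policy): return a route name and the payload for it."""
    customer_id = fields["customer_id"]
    amount = fields["amount"]
    if not isinstance(customer_id, str) or not customer_id:
        return ("fallback", {"reason": "missing customer_id"})
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        return ("fallback", {"reason": "invalid amount"})
    return ("accept", fields)


def run(raw: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Chain the three stages in the order the pattern describes."""
    return decide(extract(normalize(raw)))


if __name__ == "__main__":
    print(run({" Customer_ID ": "c-1", "Amount": 12.5}))
    print(run({"Amount": 3}))
```

Because each stage has one job, an incident can be traced to a specific step: a wrong key points at normalization, a missing field at extraction, and an unexpected route at the decision rules.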

For high-volume ingestion, combine Reader batching with Buffer/Stack and explicit retry limits. For policy-sensitive workflows, route all upstream failure modes to deterministic fallback paths instead of allowing ambiguous null-like payloads to flow into high-impact branches.
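
One way to picture explicit retry limits combined with a deterministic fallback route is the sketch below. It assumes a hypothetical `fetch` callable standing in for a Reader call, treats `None` and empty containers as "null-like" payloads, and omits backoff delays for brevity; none of this reflects NodeFox internals.

```python
from typing import Any, Callable, Dict, Tuple


def is_null_like(payload: Any) -> bool:
    """Assumption: None and empty strings, lists, or dicts count as null-like."""
    return payload is None or payload == "" or payload == [] or payload == {}


def read_with_retries(
    fetch: Callable[[], Any],
    max_attempts: int = 3,
) -> Tuple[str, Any]:
    """Return ("ok", payload) or ("fallback", details) after bounded retries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            payload = fetch()
        except Exception as exc:  # upstream unavailable or malformed
            last_error = f"attempt {attempt}: {exc}"
            continue
        if is_null_like(payload):
            # Never let an ambiguous empty payload reach high-impact branches.
            last_error = f"attempt {attempt}: empty payload"
            continue
        return ("ok", payload)
    details: Dict[str, Any] = {"error": last_error, "attempts": max_attempts}
    return ("fallback", details)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky() -> Dict[str, str]:
        state["calls"] += 1
        if state["calls"] < 3:
            raise TimeoutError("upstream timeout")
        return {"status": "ready"}

    print(read_with_retries(flaky, max_attempts=3))
    print(read_with_retries(lambda: None, max_attempts=2))
```

The important property is that every outcome is one of two named routes. A caller never has to guess whether an empty value means "no data" or "the read failed".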

Common mistakes to avoid

A common mistake is to treat Reader as an all-in-one ETL and policy layer. That usually creates brittle graphs where routing logic depends on inconsistent source shapes. Another frequent issue is overloading one Reader configuration for multiple source contracts; this reduces reproducibility and complicates rollback.

When in doubt, use a separate Reader for each source contract and keep each configuration narrow and explicit.