Reader Node

What Reader is and why it exists

Reader is the entry boundary for external information entering a NodeFox workflow. In directed-graph systems, the quality of downstream behavior depends heavily on how input arrives, how it is shaped, and how failures are surfaced. Reader exists so ingestion is explicit, inspectable, and testable instead of hidden inside custom code blocks.

In practice, Reader should be treated as a contract boundary rather than a convenience utility. It is where you decide what source of truth is allowed, what data shape can move forward, and what fallback route should execute if upstream data is unavailable or malformed.

Execution behavior

Reader emits payloads into output slots, and those payloads drive downstream eligibility. Depending on variant and configuration, Reader may execute once per run, once per item in a batch window, or once per iteration in list processing. Because NodeFox is deterministic at the orchestration layer, it is good practice to keep Reader output structure stable and immediately normalize it in Code or Data before policy branching.

Reader does not replace validation. It fetches or loads data. Validation, policy decisions, and write authorization should happen downstream in Decision and Writer-adjacent paths.

Variants and configuration details

Text variant

Text is used when source content is plain text and may need chunking. It is commonly used to ingest notes, tickets, logs, or free-form documents.

Field | What it controls | Practical guidance
Path | Source path in registered storage handles | Use stable, environment-aware naming so the same network can run across staging and production.
Split By | Delimiter used for segmentation | Use deterministic delimiters (\n, \n\n, custom tokens) and document the expected format in workflow notes.
Batch | Number of segments emitted per cycle | Keep batch sizes bounded to avoid sudden fan-out cost spikes.
Emit On Complete | Whether completion signal is emitted after final segment | Enable when downstream paths depend on explicit completion behavior.
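
To make the interaction between Split By, Batch, and Emit On Complete concrete, here is a minimal Python sketch. It is not NodeFox's implementation: the function name `emit_text_batches`, the event tuples, and the choice to drop empty segments are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

Event = Tuple[str, List[str]]


def emit_text_batches(
    text: str,
    split_by: str = "\n\n",
    batch: int = 2,
    emit_on_complete: bool = True,
) -> Iterator[Event]:
    """Yield ("batch", segments) events, then an optional ("complete", []) event.

    Hypothetical model of the Text variant fields; names are illustrative.
    """
    if batch < 1:
        raise ValueError("batch must be >= 1")
    # Assumption: blank segments (e.g. from a trailing delimiter) are dropped.
    segments = [s.strip() for s in text.split(split_by) if s.strip()]
    # Emit at most `batch` segments per cycle so fan-out stays bounded.
    for start in range(0, len(segments), batch):
        yield ("batch", segments[start:start + batch])
    if emit_on_complete:
        # Explicit completion signal after the final segment.
        yield ("complete", [])


events = list(emit_text_batches("a\n\nb\n\nc\n\n", split_by="\n\n", batch=2))
for kind, payload in events:
    print(kind, payload)
```

With three segments and a batch size of two, the sketch produces two batch events followed by one completion event, which is the signal a downstream path would wait on.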

CSVExcel variant

CSVExcel is used for structured table ingestion from CSV or spreadsheet-like sources.

Field | What it controls | Practical guidance
Path | CSV/XLSX source path | Keep file naming conventions consistent with run timestamping and source ownership.
Target Row Start/End | Row window to process | Useful for partial replays, checkpoints, and controlled backfills.
Search Column / Search Value | Optional in-file filtering | Use only for deterministic key lookup patterns; avoid brittle fuzzy matching at ingest stage.
Column Selections | Mapping from source columns to output slots | Document these mappings so downstream nodes can rely on stable slot semantics.
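
The following Python sketch shows one way the row window, exact-match filter, and column-to-slot mapping could combine. It uses only the standard `csv` module. The function `read_rows`, the slot names, and the row-numbering rule (1-based data rows, header excluded, end inclusive) are assumptions, not documented NodeFox behavior.

```python
import csv
import io
from typing import Dict, List, Optional


def read_rows(
    csv_text: str,
    row_start: int,
    row_end: int,
    column_selections: Dict[str, str],
    search_column: Optional[str] = None,
    search_value: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return slot-keyed rows inside the window that match the exact filter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    selected: List[Dict[str, str]] = []
    # Assumption: rows are numbered from 1, not counting the header row.
    for row_number, row in enumerate(reader, start=1):
        if row_number < row_start:
            continue
        if row_number > row_end:
            break
        # Exact string comparison only; no fuzzy matching at the ingest stage.
        if search_column is not None and row.get(search_column) != search_value:
            continue
        # row[column] raises KeyError if a mapped column is absent from the header,
        # so a changed source layout fails loudly instead of shifting slots.
        selected.append({slot: row[column] for column, slot in column_selections.items()})
    return selected


SAMPLE = "id,status,amount\n1,open,10\n2,closed,20\n3,open,30\n4,open,40\n"
rows = read_rows(
    SAMPLE,
    row_start=2,
    row_end=4,
    column_selections={"id": "slot_1", "amount": "slot_2"},
    search_column="status",
    search_value="open",
)
print(rows)
```

Keeping the mapping in one dictionary makes the slot contract easy to document alongside the node.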

File variant

File is used for binary content such as PDFs, images, archives, or raw blobs.

Field | What it controls | Practical guidance
Path | Binary source location | Route large binaries through controlled branches and avoid loading oversized payloads into unnecessary paths.

API variant

API makes outbound HTTP requests to bring external data into the workflow and is typically used for system integrations, data pulls, and status lookups.

Field | What it controls | Practical guidance
URL | Endpoint (supports $N interpolation) | Keep URL construction deterministic and avoid optional query fields that change behavior unpredictably.
HTTP Method | Request method (GET, POST, PUT, PATCH, DELETE) | Use method semantics that match source system contracts; do not overload a single endpoint for multiple actions without clear branch logic.
API Key Reference | Credential reference from integrations/settings | Never hardcode secrets in node text fields.
Headers | Request header map | Keep auth and version headers explicit and environment-specific.
Response Type | Expected parse mode (JSON, Text, Blob) | Align response parsing to downstream slot expectations to reduce schema drift.
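
As a rough illustration of deterministic URL construction and credential references, the Python sketch below builds (but does not send) a request. Several things here are assumptions: that `$1`, `$2`, ... refer to input slots by position, that a plain mapping named `secrets` can stand in for the integrations/settings store, and that the key is sent as a Bearer token. The URL and key name are hypothetical.

```python
import re
import urllib.parse
import urllib.request
from typing import Mapping, Sequence

ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


def interpolate(template: str, slots: Sequence[str]) -> str:
    """Replace $1, $2, ... with URL-quoted slot values; fail on a missing slot."""
    def replace(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        if index < 1 or index > len(slots):
            raise KeyError(f"no input slot for ${index}")
        # Quote everything, including "/", so slot values cannot alter the path.
        return urllib.parse.quote(str(slots[index - 1]), safe="")
    return re.sub(r"\$(\d+)", replace, template)


def build_request(
    url_template: str,
    method: str,
    slots: Sequence[str],
    api_key_ref: str,
    secrets: Mapping[str, str],
    headers: Mapping[str, str],
) -> urllib.request.Request:
    """Assemble a request object from node-style fields without sending it."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    all_headers = dict(headers)
    # The secret is looked up by reference at run time, never written in the node.
    # Assumption: Bearer scheme; the real auth scheme depends on the source system.
    all_headers["Authorization"] = f"Bearer {secrets[api_key_ref]}"
    return urllib.request.Request(
        interpolate(url_template, slots), headers=all_headers, method=method
    )


request = build_request(
    "https://api.example.com/orders/$1/items/$2",  # hypothetical endpoint
    "get",
    slots=["A 17", "9"],
    api_key_ref="orders_api_key",                  # hypothetical reference name
    secrets={"orders_api_key": "demo-key"},        # stand-in for a credential store
    headers={"Accept": "application/json"},
)
print(request.get_method(), request.full_url)
```

Failing on a missing slot, rather than substituting an empty string, keeps the constructed URL predictable from run to run.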

CloudFile variant

CloudFile retrieves objects from cloud-backed storage providers.

Field | What it controls | Practical guidance
Provider | Cloud provider connection target | Isolate by workspace/environment to reduce blast radius.
File Identifier | Object key or file id | Keep identifier resolution explicit in upstream branches.
API Key Reference | Credential path | Pair with key rotation policy and provider-side scope limits.

How to use Reader well in production

Reader performs best when followed by explicit normalization and validation. A common robust pattern is Reader -> Code(normalize) -> Data(extract) -> Decision(validate/policy). This keeps ingestion concerns separated from business logic and preserves traceability when incidents occur.
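
The Reader -> Code(normalize) -> Data(extract) -> Decision(validate/policy) pattern can be sketched as three small Python functions applied to a raw Reader payload. The field names `order_id` and `amount`, the route labels, and the function names are hypothetical; in NodeFox each stage would be its own node rather than a function.

```python
import json
from typing import Any, Dict, Tuple


def normalize(raw: str) -> Dict[str, Any]:
    """Code(normalize): parse JSON and coerce keys to a stable lowercase form."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}  # malformed input becomes an empty, predictable shape
    if not isinstance(parsed, dict):
        return {}
    return {str(key).strip().lower(): value for key, value in parsed.items()}


def extract(record: Dict[str, Any]) -> Dict[str, Any]:
    """Data(extract): keep only the fields downstream nodes rely on."""
    return {"order_id": record.get("order_id"), "amount": record.get("amount")}


def decide(fields: Dict[str, Any]) -> Tuple[str, str]:
    """Decision(validate/policy): return (route, reason) for the extracted fields."""
    order_id = fields["order_id"]
    if not isinstance(order_id, str) or not order_id:
        return ("fallback", "missing order_id")
    amount = fields["amount"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        return ("fallback", "invalid amount")
    return ("proceed", "ok")


def run(raw: str) -> Tuple[str, str]:
    """Chain the three stages in the order the pattern describes."""
    return decide(extract(normalize(raw)))


print(run('{"Order_ID": "A17", "Amount": 12.5}'))
print(run("not json"))
print(run('{"order_id": "A17", "amount": "12"}'))
```

Because each stage has one job, an incident can be traced to the stage that produced the unexpected shape instead of to a single opaque block.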

For high-volume ingestion, combine Reader batching with Buffer/Stack and explicit retry limits. For policy-sensitive workflows, route all upstream failure modes to deterministic fallback paths instead of allowing ambiguous null-like payloads to flow into high-impact branches.
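
A bounded retry with a deterministic fallback route might look like the Python sketch below. It is an illustration only: `read_with_retries`, the route labels, and the definition of a "null-like" payload are assumptions, and Buffer/Stack behavior and backoff delays are not modeled.

```python
from typing import Any, Callable, Dict


def read_with_retries(fetch: Callable[[], Any], max_attempts: int = 3) -> Dict[str, Any]:
    """Call fetch up to max_attempts times; route to fallback if every attempt fails."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            payload = fetch()
        except Exception as exc:  # any upstream failure counts as a failed attempt
            last_error = f"{type(exc).__name__}: {exc}"
            continue
        # Assumption: None and empty containers/strings are treated as failures,
        # so null-like payloads never reach the main branch.
        if payload is None or payload == "" or payload == {} or payload == []:
            last_error = "empty payload"
            continue
        return {"route": "main", "payload": payload, "attempts": attempt}
    # Every attempt failed: emit an explicit fallback record, not a null payload.
    return {"route": "fallback", "error": last_error, "attempts": max_attempts}


calls = {"n": 0}


def flaky() -> Dict[str, str]:
    """Hypothetical source that times out twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("upstream timed out")
    return {"status": "ok"}


result = read_with_retries(flaky, max_attempts=3)
print(result)
print(read_with_retries(lambda: None, max_attempts=2))
```

The fallback record carries the last error and the attempt count, so the fallback path can log or branch on the cause without inspecting a missing payload.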

Common mistakes to avoid

A common mistake is to treat Reader as an all-in-one ETL and policy layer. That usually creates brittle graphs where routing logic depends on inconsistent source shapes. Another frequent issue is overloading one Reader configuration for multiple source contracts; this reduces reproducibility and complicates rollback.

When in doubt, create a separate Reader for each source contract and keep each configuration narrow and explicit.