S3 File Processing

Automated ingestion of lender CSV files from S3 into PostgreSQL RAW tables.

How It Works

Idempotent Processing

Every S3 file is tracked in RAW_S3_FILE_PROCESSING_LOG. Status transitions:

pending → processing → success | failed

Files with status=success are automatically skipped on subsequent runs.
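The skip-on-success logic can be sketched as follows. This is a minimal illustration of the state machine, not the actual flow code; the in-memory `log` dict stands in for `RAW_S3_FILE_PROCESSING_LOG`, and the function names are assumptions.

```python
# Allowed status transitions for a tracked S3 file (illustrative).
VALID_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"success", "failed"},
}


def should_process(log: dict, file_key: str) -> bool:
    """Files already marked success are skipped on subsequent runs."""
    return log.get(file_key) != "success"


def transition(log: dict, file_key: str, new_status: str) -> None:
    """Advance a file's status, rejecting out-of-order transitions."""
    current = log.get(file_key, "pending")
    if new_status not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"invalid transition {current} -> {new_status}")
    log[file_key] = new_status
```

Because the check happens before any load, re-running the flow after a partial failure reprocesses only the files that never reached `success`.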

Lender Registry

The S3_LENDER_CONFIGS dict in tasks/flows/s3_processing.py defines all lender configurations:

| Key | S3 Prefix | RAW Table |
| --- | --- | --- |
| arthmate | arthmate/ | RAW_ARTHMATE_NACH_STATUS_REPORT |
| dmi | dmi/ | RAW_DMI_NACH_STATUS_REPORT |
| dmi_payment | dmi_payment/ | RAW_DMI_PAYMENT_REPORT |
| gb | gb/ | RAW_GB_NACH_PAYMENT_REPORT |
| ikf | ikf/ | RAW_IKF_NACH_STATUS_REPORT |
| shivalik | shivalik/ | RAW_SHIVALIK_NACH_STATUS_REPORT |
| shivalik_payment | shivalik_payment/ | RAW_SHIVALIK_PAYMENT_REPORT |
| ugro | ugro/ | RAW_UGRO_NACH_PAYMENT_REPORT |
| ugro_razorpay | ugro_razorpay/ | RAW_UGRO_RAZORPAY_NACH_STATUS_REPORT |
| vivriti | vivriti/ | RAW_VIVRITI_PAYMENT_REPORT |
| vivriti_nach_status | vivriti_nach_status/ | RAW_VIVRITI_PAYMENT_NACH_STATUS_REPORT |
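A plausible shape for the registry is a dict keyed by lender, mapping each key to its prefix and target table. The field names below are assumptions for illustration, not the actual structure in tasks/flows/s3_processing.py:

```python
# Illustrative sketch of the lender registry (field names assumed).
S3_LENDER_CONFIGS = {
    "dmi": {"s3_prefix": "dmi/", "raw_table": "RAW_DMI_NACH_STATUS_REPORT"},
    "ugro": {"s3_prefix": "ugro/", "raw_table": "RAW_UGRO_NACH_PAYMENT_REPORT"},
    # ... remaining lenders follow the same pattern
}


def get_config(lender_key: str) -> dict:
    """Look up a lender config, failing fast on unknown keys."""
    if lender_key not in S3_LENDER_CONFIGS:
        raise KeyError(f"unknown lender: {lender_key}")
    return S3_LENDER_CONFIGS[lender_key]
```

Keying the registry this way lets the flow accept a lender key as a CLI argument and fall back to iterating over every entry when none is given.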

Running the Flow

```bash
export PREFECT_API_URL=http://127.0.0.1:4201/api

# All lenders
./venv/bin/python tasks/flows/s3_processing.py

# Specific lender
./venv/bin/python tasks/flows/s3_processing.py arthmate
./venv/bin/python tasks/flows/s3_processing.py ugro
./venv/bin/python tasks/flows/s3_processing.py dmi_payment
```

Source File Column

The source_file column is injected by utils/csv_processor.py into every CSV load. dbt's incremental filtering depends on it: rows whose source_file already exists in the target table are skipped, so reloading a file never duplicates data.

TIP

The source_file column must exist in the RAW table for dbt incremental models to work correctly.
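The injection step can be sketched with the standard library alone. This is a hedged illustration of what utils/csv_processor.py does, not its actual code; the function name is an assumption:

```python
import csv
import io


def load_csv_with_source_file(csv_text: str, source_file: str) -> list[dict]:
    """Parse a CSV and tag every row with the originating S3 key,
    so dbt incremental models can filter out already-loaded files."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        # Every row carries the file it came from.
        row["source_file"] = source_file
    return rows
```

With this in place, an incremental model only has to compare each incoming source_file against the distinct values already present in the target table.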

Lender-Product Matrix

| Lender | Product ID | NACH CSV | Payment CSV | Razorpay | Cashfree |
| --- | --- | --- | --- | --- | --- |
| UGRO | 1 | Yes | - | Yes | Yes |
| DMI | 2 | Yes | Yes | - | - |
| ARTHMATE | 3 | Yes | - | - | - |
| SHIVALIK | 4 | Yes | Yes | - | - |
| VIVRITI | 5 | Yes | Yes | Yes | - |
| GB | 6 | Yes | - | - | Yes |
| IKF | 7 | Yes | - | - | - |

Turno Fineract LMS Documentation