Connecticut Foreclosure Intelligence
Legal Tech · PropTech (United States)
B2B platform that turns Connecticut judicial e-filing data into actionable foreclosure intelligence — web app, Excel reports, and a production-grade Python ETL pipeline.
Data pipeline and critical backend evolution since mid-2025 (~9 months). ETL feeds a Django web app used daily by legal and investment analysts.
2025–2026Teams tracking Connecticut foreclosure cases must reconcile e-filing records (parties, addresses, documents, key dates), equity figures buried in PDF worksheets, market valuations from listing sites, and a complex priority of procedural statuses — mediation, default, strict foreclosure, bankruptcy, withdrawn, and more. Manual Excel consolidation does not scale and is error-prone.
We built the data backbone of an enterprise platform: a Python 3.12 pipeline that extracts from a 30+ table PostgreSQL e-filing warehouse, applies Connecticut-specific business rules (forensic business days, county/ZIP geography, status priority engine), enriches cases with PDF equity and property links, and loads production application tables via explicit-column PostgreSQL COPY. Parallel QA tables capture how each status was triggered for audit. Outputs include multi-sheet Excel with conditional formatting and a web UI (login, operational canvas, advanced filters) backed by a Django API.
- Production ETL from 30+ e-filing tables with CTE-based extraction
- Priority status classification with forensic-day change detection
- PDF equity extraction and property listing enrichment with caching
- Explicit-column PostgreSQL COPY into application tables
- Parallel QA validation table for status audit evidence
- Excel reports plus web app (filters, canvas, role-based login)
- Prefect scheduling, dry-run flags, and automated regression tests
Priority status engine under court rule changes
Each case maps to legal buckets (mediation ended, default filed, strict foreclosure, bankruptcy, no longer in foreclosure, etc.) with configurable overrides when new document codes appear in Connecticut courts.
Recent-change detection on forensic business days
Highlights use Connecticut forensic calendars — weekends plus federal and state holidays — not calendar days alone, so analysts see truly recent movement in Excel and the app.
Safe PostgreSQL COPY into existing app schemas
Loads use explicit column lists aligned to live tables, eliminating positional drift that had misaligned equity and defendant counts. In-memory DataFrame loads skip unnecessary Excel round-trips.
App vs. QA data separation
Audit metadata (trigger documents, dock codes, status sources) lands in a parallel validation table; the dashboard table stays lean for performance and UX.
Multi-channel delivery and resilient scraping
Reports reach analysts via email OAuth, cloud drives, and object storage. SOCKS proxy watchdogs and alerts keep PDF retrieval reliable behind anti-bot constraints.