Analysis request
EPA — Waters of the United States (WOTUS) Definition
req_0g2ca0v4f / created 5/31/2026, 8:52:55 PM
Databricks Jobs mode / failed
Rulemaking Metadata
Docket ID: EPA-HQ-OW-2021-0602
Agency ID: EPA
Topic: environment_water
Data Source: regulations_gov
Expected Scale: ~2302 comments
Date Window: Full Historical Ingestion
Notes / Reviewer Context
"Requested via discovered rulemakings panel. (estimated runtime at submission: ~2.6 hr; bottleneck: parsing)"
Status & Control Plane
Execution Failure:
Task main_analysis_job failed with message: Workload failed, see run output for details.
Databricks Workspace Integration
Active Databricks Run ID: 509518781365215
Command-Generation Mode
To run this pipeline locally instead of on hosted Databricks, run the following commands in order in your terminal:
.uv-test-venv\Scripts\python.exe scripts\run_ingestion.py --docket-id EPA-HQ-OW-2021-0602
.uv-test-venv\Scripts\python.exe scripts\run_embedding.py --docket-id EPA-HQ-OW-2021-0602 --backend databricks
.uv-test-venv\Scripts\python.exe scripts\run_clustering.py --docket-id EPA-HQ-OW-2021-0602 --clustering-mode vector_search
Command-generation mode runs comment ingestion and clustering locally via Python scripts, writing directly to your local Delta lakehouse.
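The three-step sequence above could also be driven from a single Python script that stops at the first failing step. This is a minimal sketch, not part of the pipeline itself: the interpreter path, script names, and flags are copied from the commands above, while `build_commands`, `run_pipeline`, and the stop-on-failure behavior are hypothetical helpers introduced here for illustration.

```python
import subprocess
from pathlib import Path

# Interpreter path taken from the commands shown above (Windows venv layout).
VENV_PYTHON = Path(".uv-test-venv") / "Scripts" / "python.exe"


def build_commands(docket_id: str, python: str = str(VENV_PYTHON)) -> list[list[str]]:
    """Return the three pipeline commands, in order, as argument lists.

    Hypothetical helper: only the script names and flags come from the source.
    """
    scripts = Path("scripts")
    return [
        [python, str(scripts / "run_ingestion.py"), "--docket-id", docket_id],
        [python, str(scripts / "run_embedding.py"), "--docket-id", docket_id,
         "--backend", "databricks"],
        [python, str(scripts / "run_clustering.py"), "--docket-id", docket_id,
         "--clustering-mode", "vector_search"],
    ]


def run_pipeline(docket_id: str, python: str = str(VENV_PYTHON)) -> None:
    """Run each step in sequence (assumed policy: abort on first failure).

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so later steps never run on partial data.
    """
    for cmd in build_commands(docket_id, python):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("EPA-HQ-OW-2021-0602")` from the repository root would then run ingestion, embedding, and clustering in order. Passing argument lists rather than a single shell string avoids quoting problems with the backslash-separated Windows paths.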