Edgarjure — SEC EDGAR filings, financials, and XBRL data for Clojure

I’ve published edgarjure, a Clojure library for accessing and analysing SEC EDGAR data.

What is SEC EDGAR?

EDGAR is the U.S. Securities and Exchange Commission’s public filing system. Every public company in the U.S. files here — both structured data (XBRL financials, XML ownership reports) and unstructured text (annual reports, risk disclosures, MD&A narratives). It represents decades of data on thousands of firms, accessible via public HTTP/JSON APIs with no API keys or paid services required.
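
For orientation, the public JSON endpoints are ordinary URLs keyed by a company's CIK, zero-padded to 10 digits. The sketch below is plain Clojure and independent of edgarjure; the function names are illustrative and not part of the library's API.

```clojure
;; Illustrative helpers (not edgarjure functions) that build URLs for
;; two public data.sec.gov endpoints from a numeric CIK.

(defn cik->padded
  "Zero-pads a numeric CIK to the 10 digits used in EDGAR URLs."
  [cik]
  (format "%010d" (long cik)))

(defn company-facts-url
  "URL of the XBRL company-facts JSON for a CIK."
  [cik]
  (str "https://data.sec.gov/api/xbrl/companyfacts/CIK" (cik->padded cik) ".json"))

(defn submissions-url
  "URL of the filing-history (submissions) JSON for a CIK."
  [cik]
  (str "https://data.sec.gov/submissions/CIK" (cik->padded cik) ".json"))

;; Apple Inc. has CIK 320193.
(company-facts-url 320193)
;; => "https://data.sec.gov/api/xbrl/companyfacts/CIK0000320193.json"
```

No key is needed, but the SEC's fair-access policy asks automated clients to send a descriptive User-Agent header and to stay within its published rate limit.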

What does edgarjure do?

It covers both the structured and the unstructured side of EDGAR from Clojure:

  • Financial statements — income statement, balance sheet, and cash flow, with automatic XBRL line-item resolution across accounting tag changes, restatement deduplication, and long- or wide-format output
  • XBRL facts — datasets with human-readable labels, concept discovery, and cross-sectional screening across all filers in a single API call
  • Filing content — full-text section extraction from 10-K/10-Q (MD&A, Risk Factors, any item), HTML table extraction
  • Form parsers — Form 4 (insider trades) and 13F-HR (institutional holdings) into structured maps and datasets
  • Panel datasets — multi-ticker, multi-concept, with point-in-time support for look-ahead-bias-free backtesting
  • Bulk downloads — bounded parallelism, skip-existing, structured result envelopes
  • Unified API — ticker or CIK interchangeably, keyword args throughout, Malli validation at entry
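
Two terms from the list above are easier to see in code. Restatement deduplication means keeping one value per concept and period when a later filing restates it. Point-in-time means using only what had been filed by a given date. The following is a minimal sketch in plain Clojure over hand-made maps. The keys (:concept, :period-end, :filed, :value), the numbers, and the function names are assumptions made for illustration, not edgarjure's API or its implementation.

```clojure
;; Hypothetical facts: the 2022 figure is reported in one filing and
;; restated in a later one. Values are made up.
(def facts
  [{:concept "Revenues" :period-end "2022-12-31" :filed "2023-02-15" :value 100}
   {:concept "Revenues" :period-end "2022-12-31" :filed "2024-02-14" :value 97}
   {:concept "Revenues" :period-end "2023-12-31" :filed "2024-02-14" :value 110}])

(defn as-of
  "Keeps only facts filed on or before `date`.
  ISO-8601 date strings order correctly under plain string comparison."
  [date facts]
  (filter #(<= (compare (:filed %) date) 0) facts))

(defn latest-per-period
  "For each [concept period-end] pair, keeps the most recently filed fact,
  returned in concept/period order."
  [facts]
  (->> facts
       (group-by (juxt :concept :period-end))
       vals
       (map #(last (sort-by :filed %)))
       (sort-by (juxt :concept :period-end))))

;; With everything filed to date, the restated 2022 value wins.
(map :value (latest-per-period facts))
;; => (97 110)

;; As of mid-2023 only the original 2022 figure was known.
(map :value (latest-per-period (as-of "2023-06-30" facts)))
;; => (100)
```

Filtering by filing date before deduplicating is what keeps a backtest from seeing restated numbers earlier than the market could have.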

Tabular results are returned as tech.ml.dataset datasets for easy downstream manipulation.

What’s the goal?

A composable, research-grade Clojure data system for SEC filings — where immutable data, lazy sequences, and a unified API make financial research pipelines clean and reproducible.

This is an early release (0.1.1). The core is solid and tested, but there’s plenty more to build. Feedback and contributions are very welcome.