An agent-native command-line interface over paulgraham.com — list, read, and full-text-search 25 years of essays. This report documents the tool and what its cached corpus reveals.
paulgraham.com is a keyless, static site. The CLI turns it into structured, agent-clean data — no browser, no API key, no scraping fragility on the caller's side.
| Command | Does | Scope |
|---|---|---|
| essays | List all essays, newest first | index page |
| read <slug> | Full clean text of one essay (paragraphs preserved) | one page |
| search <q> | Find essays by keyword (title only) | index page |
| sync | Crawl every essay into a local cache | whole corpus |
| topic <q> | Full-text search of essay bodies, ranked + snippets | cached corpus |
@mvanhorn/printing-press-library) only installs CLIs from its catalog: it has no authoring command, and paulgraham.com isn't in the catalog. opencli is the right tool for authoring a new site adapter.

sync captured

One polite crawl (bounded concurrency, per-essay failure tolerance) mirrors the full archive locally so topic queries run instantly and offline.
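The crawl pattern described above (a fixed number of fetches in flight, with a failed essay counted rather than aborting the run) can be sketched roughly as follows. This is an illustration in Python, not the tool's own code, which appears to run on Node. The names `map_limit`, `fake_fetch`, and `crawl` are hypothetical, and the fetch is a stand-in that never touches the network.

```python
import asyncio


async def map_limit(items, limit, worker):
    """Run worker(item) for every item, at most `limit` at a time.

    Returns (results, failed): results maps item -> value for successes,
    failed lists the items whose worker raised.
    """
    semaphore = asyncio.Semaphore(limit)
    results, failed = {}, []

    async def run_one(item):
        async with semaphore:  # blocks while `limit` workers are already running
            try:
                results[item] = await worker(item)
            except Exception:
                # One bad page is counted, not fatal to the whole crawl.
                failed.append(item)

    await asyncio.gather(*(run_one(item) for item in items))
    return results, failed


async def fake_fetch(slug):
    # Hypothetical stand-in for an HTTP fetch; "broken" simulates a failing page.
    await asyncio.sleep(0)
    if slug == "broken":
        raise IOError("simulated fetch failure")
    return f"text of {slug}"


def crawl(slugs, limit=6):
    # limit=6 mirrors the default concurrency the report mentions; adjust freely.
    return asyncio.run(map_limit(slugs, limit, fake_fetch))
```

Calling `crawl(["a", "b", "broken", "c"], limit=2)` returns the three successful pages plus a one-item failure list, which is the shape a summary line with a failed count would need.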
Shortest cached: Why Twitter is a Big Deal (147w), Charisma / Power (121w), Lisp for Web-Based Applications (58w).
A steady builder: a 2007–2009 startup-advice peak, a quieter mid-2010s, and a strong 2020–2021 resurgence.
Essays containing each term (full-text, via topic), with the single essay that uses it most.
| Term | Essays | Densest essay | Hits |
|---|---|---|---|
| startup | 143 | How to Fund a Startup | 108 |
| ideas | 140 | How to Get Startup Ideas | 74 |
| money | 119 | How to Raise Money | 94 |
| founders | 108 | How to Fund a Startup | 55 |
| writing | 106 | The Best Essay | 35 |
| users | 82 | The Other Road Ahead | 74 |
| growth | 47 | Startup = Growth | 60 |
| wealth | 39 | How to Make Wealth | 70 |
| taste | 31 | How Art Can Be Good | 23 |
| empathy | 3 | Hackers and Painters | 11 |
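A table like the one above could be derived from the cached corpus with a few lines of counting. The sketch below is a guess at the method, not the tool's implementation: it assumes the corpus is a mapping of title to body text and that matching is a case-insensitive substring match (so "startup" also counts "startups"). The function name `term_stats` and the toy corpus are made up for illustration.

```python
import re


def term_stats(corpus, term):
    """Return (essays_containing, densest_title, hits_in_densest) for one term."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    # Count occurrences per essay, keeping only essays with at least one hit.
    matching = {}
    for title, body in corpus.items():
        hits = len(pattern.findall(body))
        if hits > 0:
            matching[title] = hits
    if not matching:
        return 0, None, 0
    densest = max(matching, key=matching.get)
    return len(matching), densest, matching[densest]


# Toy data, not taken from the real archive.
toy_corpus = {
    "Essay A": "Startup founders build a startup; most startups fail.",
    "Essay B": "Nothing relevant here.",
    "Essay C": "One startup mention.",
}
print(term_stats(toy_corpus, "startup"))  # → (2, 'Essay A', 3)
```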
search (titles only) could never surface; it needs full-text topic.

    # one-time: build the local corpus
    $ opencli paulgraham sync
    essays: 231 · words: 565030 · failed: 0

    # find essays ABOUT a concept (body text, ranked)
    $ opencli paulgraham topic "compound growth"
    Superlinear Returns            hits 28
    Do Things that Don't Scale     hits 9
    How to Do Great Work           hits 7

    # read one, clean, as markdown, to feed to an LLM
    $ opencli paulgraham read earn -f md

    # export the whole index to a spreadsheet
    $ opencli paulgraham essays --limit 0 -f csv > pg.csv
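For readers curious how a ranked, snippet-bearing result list like the topic output might be produced from a cached corpus, here is a minimal Python sketch. Several things are assumptions, not documented behavior: that a multi-word query is scored by summing case-insensitive occurrences of each word, that the snippet is a window around the first match, and that ties are broken by title. The function name `topic_scan` and the `snippet_radius` parameter are hypothetical.

```python
def topic_scan(corpus, query, snippet_radius=40):
    """Rank essays by total query-word occurrences and attach a short snippet."""
    words = query.lower().split()
    ranked = []
    for title, body in corpus.items():
        lowered = body.lower()
        hits = sum(lowered.count(word) for word in words)
        if hits == 0:
            continue
        # Centre the snippet on the earliest occurrence of any query word.
        positions = [lowered.find(word) for word in words]
        first = min(pos for pos in positions if pos != -1)
        start = max(0, first - snippet_radius)
        end = min(len(body), first + snippet_radius)
        snippet = " ".join(body[start:end].split())
        ranked.append({"title": title, "hits": hits, "snippet": snippet})
    # Most hits first; title as a deterministic tie-breaker.
    ranked.sort(key=lambda row: (-row["hits"], row["title"]))
    return ranked
```

Because everything is a plain string scan over data already in memory, no index is needed, which fits the report's point that ranking and snippets are cheap once the corpus is synced.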
-f json/-f md gives an LLM the essay, not 19KB of tags.

read → summarizer → digest).

search is one fast HTTP call over titles; topic searches full text after a one-time sync.

`<font face="verdana">` block, ending at the Yahoo/Turbify footer `<script>`. The parser slices between them and converts `<br><br>` → blank lines so paragraphs survive.

node:https with family:4 + transient retry (global fetch stalls on broken-IPv6 networks).

mapLimit runs N fetches at a time (default 6); a single essay failing is counted, not fatal.

sync writes ~/.opencli/cache/paulgraham/corpus.json; topic is then a pure in-memory scan, which is why ranking + snippets are cheap.
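The slicing step described above (take the text between the opening font block and the footer script, then turn double line breaks into paragraph breaks) might look roughly like this in Python. It is a sketch under assumptions, not the adapter's actual parser: the exact marker strings, the handling of single line breaks, and the function name `extract_body` are all illustrative, and real pages may carry extra attributes on the font tag.

```python
import html
import re

# Assumed markers, based on the report's description of the page layout.
START_MARKER = '<font face="verdana">'
END_MARKER = "<script"


def extract_body(page):
    """Return essay text with paragraphs separated by blank lines."""
    start = page.find(START_MARKER)
    if start == -1:
        return ""
    start += len(START_MARKER)
    end = page.find(END_MARKER, start)
    if end == -1:
        end = len(page)
    chunk = " ".join(page[start:end].split())  # collapse source whitespace first
    # A double <br> marks a paragraph break; a lone <br> is treated as a space.
    chunk = re.sub(r"<br\s*/?>\s*<br\s*/?>", "\n\n", chunk, flags=re.IGNORECASE)
    chunk = re.sub(r"<br\s*/?>", " ", chunk, flags=re.IGNORECASE)
    chunk = re.sub(r"<[^>]+>", "", chunk)  # drop any remaining tags
    chunk = html.unescape(chunk)
    paragraphs = [" ".join(p.split()) for p in chunk.split("\n\n")]
    return "\n\n".join(p for p in paragraphs if p)
```

Given a page whose body reads `First para<br><br>Second <i>para</i> &amp; more`, the function returns two paragraphs separated by a blank line, with the inline tag removed and the entity decoded.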