Understanding the Output
After a run, ./results/<session_id>/ contains:
| File | Description |
|---|---|
| | Compiled PDF (requires pdflatex + bibtex) |
| | Full LaTeX source; edit and recompile if needed |
| | BibTeX bibliography |
| theory_state.json | Full proof state: lemmas, proofs, confidence scores |
| | Planning state: directions scored and selected |
| | Numerical validation results (if experiment ran) |
Paused sessions also write a checkpoint to ~/.eurekaclaw/sessions/<session_id>/checkpoint.json.
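
To see which of these files a particular run actually produced, a short script can walk the session directory. The following is a minimal sketch that relies only on the directory layout described above; the helper names `list_session_outputs` and `checkpoint_path` are hypothetical and not part of the tool.

```python
from pathlib import Path


def list_session_outputs(session_id, results_root="./results"):
    """Return the sorted file names found in <results_root>/<session_id>/."""
    session_dir = Path(results_root) / session_id
    if not session_dir.is_dir():
        # No such session directory: report an empty listing.
        return []
    return sorted(p.name for p in session_dir.iterdir() if p.is_file())


def checkpoint_path(session_id, home=None):
    """Build the path where a paused session writes its checkpoint."""
    base = Path(home) if home is not None else Path.home()
    return base / ".eurekaclaw" / "sessions" / session_id / "checkpoint.json"
```

Calling `checkpoint_path(session_id).exists()` is then one way to tell whether a session was paused and left a checkpoint behind.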
Reading theory_state.json
Key fields:
```json
{
  "status": "proved",
  "proof_plan": [
    {
      "lemma_id": "concentration_bound",
      "provenance": "known",
      "statement": "For sub-Gaussian ..."
    }
  ],
  "proven_lemmas": {
    "main_result": {
      "verified": true,
      "confidence_score": 0.91,
      "verification_method": "llm_check",
      "proof_text": "..."
    }
  },
  "failed_attempts": [ ... ],
  "counterexamples": [ ... ]
}
```
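
As a rough illustration of how these fields fit together, the sketch below condenses a parsed state into a handful of counts. The function name `summarize_theory_state` is hypothetical, and the sample dictionary only mirrors the shape of the example above; it is not real output.

```python
def summarize_theory_state(state):
    """Collect headline numbers from a parsed theory_state.json dict."""
    proven = state.get("proven_lemmas", {})
    return {
        "status": state.get("status"),
        "planned_lemmas": len(state.get("proof_plan", [])),
        "proven_lemmas": len(proven),
        "verified_lemmas": sum(1 for lemma in proven.values() if lemma.get("verified")),
        "failed_attempts": len(state.get("failed_attempts", [])),
        "counterexamples": len(state.get("counterexamples", [])),
    }


# Illustrative data shaped like the example above (assumption, not real output).
example_state = {
    "status": "proved",
    "proof_plan": [{"lemma_id": "concentration_bound", "provenance": "known"}],
    "proven_lemmas": {"main_result": {"verified": True, "confidence_score": 0.91}},
    "failed_attempts": [],
    "counterexamples": [],
}
print(summarize_theory_state(example_state))
```

To run this against a real session, the dictionary could be loaded with `json.loads(Path(...).read_text())`, assuming theory_state.json sits in the session's results directory.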
Proof Status
| Status | Meaning |
|---|---|
| proved | All lemmas verified, assembled proof complete |
| | A counterexample was found; the conjecture is false or needs refinement |
| | Hit |
Lemma Provenance
| Provenance | Meaning |
|---|---|
| known | Directly citable; no new proof needed |
| | A known result modified to fit this context |
| | Genuinely novel; fully proved from scratch |
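
To get a quick sense of how much of a proof rests on cited results versus new work, the proof_plan entries can be tallied by their provenance field. This is a small sketch under the assumption that each entry is a dict with a "provenance" key, as in the example state; `count_by_provenance` is a hypothetical helper, and the second lemma ID is made up for illustration.

```python
from collections import Counter


def count_by_provenance(proof_plan):
    """Count proof-plan entries per provenance label."""
    return Counter(entry.get("provenance", "unspecified") for entry in proof_plan)


# Illustrative entries only; "union_bound" is a made-up lemma ID.
plan = [
    {"lemma_id": "concentration_bound", "provenance": "known"},
    {"lemma_id": "union_bound", "provenance": "known"},
]
print(count_by_provenance(plan))
```

Entries that lack the field are grouped under "unspecified" rather than raising an error, which keeps the tally usable on partially written states.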
Low-Confidence Warnings
If a lemma has verified=false, the PDF contains:
[Unverified step] ← orange text
and a Limitations section explaining all unverified steps. Review theory_state.json → proven_lemmas to see which lemmas are flagged.
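
Finding the flagged lemmas can also be scripted. The sketch below filters proven_lemmas for entries whose verified field is false; the function name `unverified_lemmas` and the optional `min_confidence` threshold are assumptions added for illustration, not documented behavior of the tool.

```python
def unverified_lemmas(state, min_confidence=None):
    """Return the proven_lemmas entries that deserve a manual review.

    A lemma is flagged when "verified" is false or missing. If
    min_confidence is given (a hypothetical extra check), lemmas whose
    confidence_score falls below it are flagged as well.
    """
    flagged = {}
    for lemma_id, lemma in state.get("proven_lemmas", {}).items():
        not_verified = not lemma.get("verified", False)
        low_confidence = (
            min_confidence is not None
            and lemma.get("confidence_score", 0.0) < min_confidence
        )
        if not_verified or low_confidence:
            flagged[lemma_id] = lemma
    return flagged
```

With `min_confidence` left at its default, the result should match the steps that the PDF marks as unverified; passing a threshold simply widens the review list.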
Using the Python API to Access Results
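```python
import json

# `result` is assumed to be the value returned by a completed run (see Python API).
state = json.loads(result.theory_state_json)
print("Status:", state["status"])
print("Lemmas:", len(state["proof_plan"]))

# The research brief holds the planning state, including the chosen direction.
brief = json.loads(result.research_brief_json)
direction = brief["selected_direction"]
print("Direction:", direction["title"])
print("Novelty score:", direction["novelty_score"])
```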
See Python API for full details.