Miru Autoresearch Plan

Goal

Build a safe autonomous improvement loop for Miru that can research, propose, and verify changes, and that keeps only the changes that measurably improve a metric.

Principles

  • Follow Karpathy's autoresearch pattern:
    • one narrow editable surface per loop when possible
    • fixed evaluation budget
    • objective metric decides keep or discard
    • every run leaves a durable experiment log
  • Follow Tobi's qmd pattern:
    • keep repo knowledge in markdown
    • search first, synthesize second
    • use local retrieval instead of reloading giant prompts
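
The "search first, synthesize second" idea can be illustrated with a minimal sketch. It ranks markdown notes by plain keyword counts so that only the best-matching notes are handed to the synthesis step. The note names and contents below are hypothetical placeholders, and the scoring is a deliberately naive stand-in for whatever local retrieval tool the loop ends up using.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def search_notes(notes: dict[str, str], query: str, top_k: int = 3) -> list[str]:
    """Return up to top_k note names ranked by how often query terms appear."""
    query_terms = set(tokenize(query))
    scores: dict[str, int] = {}
    for name, body in notes.items():
        counts = Counter(tokenize(body))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[name] = score
    # Highest score first; ties are broken by name so output is deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_k]


# Hypothetical in-memory notes; a real loop would read .md files from the repo.
NOTES = {
    "docs/flaky_specs.md": "Flaky spec notes: the invoice spec fails when the clock crosses midnight.",
    "docs/frontend_build.md": "Frontend build notes: the build fails if node modules are stale.",
    "docs/perf.md": "Perf notes: the dashboard page is the slowest page.",
}

if __name__ == "__main__":
    print(search_notes(NOTES, "flaky spec"))
```

Only the returned notes would then be loaded into the prompt, which keeps the context small compared with reloading every document on each run.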

Phase 1

  • Define safe improvement lanes:
    • flaky test repair
    • focused docs cleanup
    • frontend crash regression checks
    • small performance wins with measurable before/after checks
  • Create one metric per lane:
    • build success
    • focused spec pass rate
    • browser smoke pass rate
    • page-level perf delta
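
One way to keep the lane-to-metric relationship explicit is a small registry. The sketch below is illustrative only: the identifier names are made up, and the pairing of each lane with a metric is an assumption, since the lists above do not say which metric belongs to which lane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lane:
    """A safe improvement lane and the single metric that judges it."""
    name: str
    metric: str


# Assumed pairing; adjust once the real lane-to-metric mapping is decided.
LANES = [
    Lane("flaky_test_repair", "focused_spec_pass_rate"),
    Lane("focused_docs_cleanup", "build_success"),
    Lane("frontend_crash_regression", "browser_smoke_pass_rate"),
    Lane("small_performance_wins", "page_level_perf_delta"),
]


def validate_lanes(lanes: list[Lane]) -> None:
    """Raise ValueError if a lane name repeats or a lane has no metric."""
    seen: set[str] = set()
    for lane in lanes:
        if lane.name in seen:
            raise ValueError(f"duplicate lane: {lane.name}")
        if not lane.metric:
            raise ValueError(f"lane has no metric: {lane.name}")
        seen.add(lane.name)


def metric_for(lane_name: str, lanes: list[Lane] = LANES) -> str:
    """Look up the one metric that decides keep or discard for a lane."""
    for lane in lanes:
        if lane.name == lane_name:
            return lane.metric
    raise KeyError(f"unknown lane: {lane_name}")
```

Keeping exactly one metric per lane means the keep-or-discard decision never has to weigh competing signals.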

Phase 2

  • Build an experiment runner that:
    • takes a narrow task
    • applies a small patch
    • runs fixed verification
    • keeps the patch only if the metric improves or remains green, and reverts it otherwise
    • writes a markdown experiment note after every run
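
The runner steps above can be sketched as follows. This is a minimal outline under stated assumptions, not the actual Miru runner: `apply_patch`, `revert_patch`, and `verify` are hypothetical callables supplied by the caller, the metric is assumed to be "higher is better", and "green" is modelled as a metric at or above a threshold of 1.0.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExperimentResult:
    task: str
    baseline: float
    candidate: float
    kept: bool


def run_experiment(
    task: str,
    apply_patch: Callable[[], None],
    revert_patch: Callable[[], None],
    verify: Callable[[], float],
    green: float = 1.0,
) -> ExperimentResult:
    """Measure, patch, re-measure, and keep the patch only if it earns its place."""
    baseline = verify()
    apply_patch()
    try:
        candidate = verify()
    except Exception:
        # Verification itself crashed: never leave a half-tested patch behind.
        revert_patch()
        raise
    improved = candidate > baseline
    stayed_green = baseline >= green and candidate >= green
    kept = improved or stayed_green
    if not kept:
        revert_patch()
    return ExperimentResult(task, baseline, candidate, kept)


def render_note(result: ExperimentResult, timestamp: str) -> str:
    """Render the markdown experiment note that is written after every run."""
    decision = "kept" if result.kept else "discarded"
    return "\n".join([
        f"# Experiment: {result.task}",
        "",
        f"- when: {timestamp}",
        f"- baseline metric: {result.baseline:.2f}",
        f"- candidate metric: {result.candidate:.2f}",
        f"- decision: {decision}",
        "",
    ])
```

In a real setup `verify` would presumably wrap the lane's fixed verification command, and the rendered note would be saved as a markdown file so that discarded runs leave a record too.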

Phase 3

  • Add nightly or scheduled runs for the safest lanes only.
  • Keep production, migrations, auth, and billing changes out of autonomous write scope until the loop proves reliable.
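
A write-scope restriction like this could be enforced with a simple path check before any patch is applied. The sketch below is hypothetical: the denied prefixes are invented examples, not Miru's real directory layout, and would need to be replaced with the actual locations of migration, auth, billing, and deployment code.

```python
from pathlib import PurePosixPath

# Hypothetical prefixes for areas that stay out of autonomous write scope.
DENIED_PREFIXES = (
    "db/migrate/",
    "app/controllers/auth/",
    "app/services/billing/",
    "config/deploy/",
)


def is_write_allowed(path: str) -> bool:
    """Return True only for relative repo paths outside every denied prefix."""
    candidate = PurePosixPath(path)
    # Reject absolute paths and parent traversal rather than trying to resolve them.
    if candidate.is_absolute() or ".." in candidate.parts:
        return False
    normalized = candidate.as_posix()  # collapses "./" and duplicate slashes
    return not any(normalized.startswith(prefix) for prefix in DENIED_PREFIXES)


def patch_in_scope(changed_paths: list[str]) -> bool:
    """A patch is in scope only if every file it touches is allowed."""
    return all(is_write_allowed(path) for path in changed_paths)
```

Rejecting the whole patch when a single file is out of scope is the conservative choice: a partial apply could leave the change half-finished.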

Immediate TODO

  • Keep the current manual lanes green:
    • docs_consistency
    • frontend_build
    • auth_pages_request_spec
    • browser_root_smoke
  • Add one browser smoke matrix for the most fragile routes.
  • Add the next narrow lanes:
    • docs drift checks with stronger assertions
    • one more focused rspec verification lane
    • one authenticated browser smoke lane
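
As a rough illustration of the browser smoke matrix, the sketch below crosses a list of routes with session states and computes a pass rate from the results. The routes other than `/` are hypothetical placeholders for whichever routes turn out to be most fragile, and the actual browser check is left out entirely.

```python
from itertools import product

# Hypothetical route list; "/" stands in for the existing root smoke check.
ROUTES = ["/", "/login", "/signup"]
SESSIONS = ["anonymous", "authenticated"]


def build_smoke_matrix(routes: list[str], sessions: list[str]) -> list[dict[str, str]]:
    """Return one smoke case per (route, session) combination."""
    return [{"route": route, "session": session} for route, session in product(routes, sessions)]


def pass_rate(results: dict[tuple[str, str], bool]) -> float:
    """Fraction of smoke cases that passed; 0.0 when nothing was run."""
    if not results:
        return 0.0
    return sum(results.values()) / len(results)


if __name__ == "__main__":
    for case in build_smoke_matrix(ROUTES, SESSIONS):
        print(case["session"], case["route"])
```

Treating an empty run as a 0.0 pass rate means a lane that silently ran nothing cannot be mistaken for a green one.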