Diagnose failures with Doctor
io.github.shafthq:shaft-doctor collects explicitly selected local test
evidence into a portable, redacted bundle and applies ordered deterministic
rules. It does not require shaft-ai, a provider credential, or network
access. The complete baseline works with pilot.ai.enabled=false.
Optional provider analysis is advisory only. It is disabled unless explicitly requested, receives only minimized already-redacted evidence, and never replaces the deterministic diagnosis, findings, confidence, or remediation.
Outputs
Each analysis writes:
- doctor-evidence.json: versioned EvidenceBundle with checksums, provenance, relative paths, size-limit decisions, and a redaction summary;
- doctor-report.json: the bundle plus versioned Diagnosis, cited Finding records, confidence, uncertainty, and Remediation actions, plus a separately identified advisory when provider analysis is requested;
- doctor-report.md: a portable human-readable report and evidence index;
- artifacts/: approved binary evidence such as screenshots.
Reports contain no original absolute machine paths. Evidence IDs and bundle IDs are content-derived, and JSON formatting uses LF line endings, so repeated analysis of identical inputs is byte-stable.
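To make "content-derived" concrete, the following is a minimal sketch of how an ID can be computed from content alone. It is not Doctor's actual ID scheme: the class name ContentIds, the "ev-" prefix, the 16-character length, and the choice of SHA-256 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: a hypothetical content-derived ID, not Doctor's real format. */
final class ContentIds {
    private ContentIds() {}

    /** Returns "ev-" plus the first 8 bytes (16 hex characters) of a SHA-256 digest. */
    static String evidenceId(String content) {
        // Normalize CRLF and lone CR to LF so line endings cannot change the hash.
        String normalized = content.replace("\r\n", "\n").replace('\r', '\n');
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(normalized.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return "ev-" + hex;
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the ID depends only on the normalized content, the same evidence produces the same ID on any machine, which is the property that makes repeated reports comparable byte for byte.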
Run Doctor
The same command works in macOS / Linux shells and in PowerShell:
java -jar SHAFT_MCP-<version>.jar doctor analyze --input allure-results --allowed-root "$PWD" --output-dir target/shaft-doctor
The command writes doctor-evidence.json, doctor-report.json, and
doctor-report.md under target/shaft-doctor.
From an MCP chat:
Use doctor_analyze on the allure-results directory. Allow only the current project root, write results to target/shaft-doctor, and do not collect screenshots.
CLI reference
The executable MCP JAR exposes Doctor as a local command:
java -jar shaft-mcp/target/SHAFT_MCP-<version>.jar doctor analyze \
--input allure-results \
--input target/shaft-logs \
--allowed-root "$PWD" \
--output-dir target/shaft-doctor \
--minimum-results 1
Every readable input must resolve under an explicit allowed root. Symlink
targets are resolved before collection. The output directory must also be
inside a declared root. Add repeated --history options to correlate recurring
signatures from older doctor-evidence.json files.
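The allowed-root rule can be pictured as a containment check performed on resolved paths. The sketch below is an assumption about the general technique, not Doctor's code; the class AllowedRoots and the method isAllowed are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

/** Illustrative only: containment check on symlink-resolved paths. */
final class AllowedRoots {
    private AllowedRoots() {}

    /** Returns true when the resolved candidate lies under at least one resolved root. */
    static boolean isAllowed(Path candidate, List<Path> allowedRoots) {
        try {
            // toRealPath() follows symlinks, so a link that points outside a root is caught.
            Path real = candidate.toRealPath();
            for (Path root : allowedRoots) {
                // Path.startsWith compares whole name elements: /work/app2 is not under /work/app.
                if (real.startsWith(root.toRealPath())) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            // In this sketch a missing or unreadable path is simply not allowed.
            return false;
        }
    }
}
```

Resolving before comparing is the important design choice: a plain string-prefix test on the unresolved path would accept a symlink inside the project that targets a directory elsewhere on disk. An output directory that does not exist yet would need its nearest existing parent resolved instead, which this sketch does not attempt.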
Screenshots and page snapshots are excluded by default. Retain them only with explicit approval:
java -jar shaft-mcp/target/SHAFT_MCP-<version>.jar doctor analyze \
--input allure-results \
--allowed-root "$PWD" \
--output-dir target/shaft-doctor \
--include-screenshots \
--include-page-snapshots
Use --max-item-bytes and --max-bundle-bytes to lower the conservative
retention limits. Use --minimum-results when the expected run size is known;
an empty, malformed, truncated, or unexpectedly small Allure run is reported
as incomplete and is never interpreted as successful.
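The completeness rule above amounts to refusing to treat missing data as a pass. A minimal sketch follows, assuming three hypothetical counters (result files found, results parsed successfully, and the --minimum-results value); Doctor's real checks are not described at this level of detail.

```java
/** Illustrative only: decides whether an Allure run has enough readable results. */
final class RunCompleteness {
    private RunCompleteness() {}

    enum Status { COMPLETE, INCOMPLETE }

    static Status assess(int resultFilesFound, int resultsParsed, int minimumResults) {
        if (resultsParsed == 0) {
            return Status.INCOMPLETE; // empty run
        }
        if (resultsParsed < resultFilesFound) {
            return Status.INCOMPLETE; // some files were malformed or truncated
        }
        if (resultsParsed < minimumResults) {
            return Status.INCOMPLETE; // fewer results than the caller expected
        }
        return Status.COMPLETE;
    }
}
```

Note that COMPLETE here only means the evidence is sufficient to analyze; it says nothing about whether the tests passed.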
Optional provider advisory
Add shaft-ai when invoking DoctorAnalyzer.analyzeWithAi(...) directly. The
executable SHAFT_MCP JAR already packages the OpenAI, Anthropic, Gemini, and
Ollama adapters. CLI provider analysis also requires --ai; Pilot properties
must independently enable the provider, processing location, model, and every
submitted evidence category.
Local Ollama example:
java \
-Dpilot.ai.enabled=true \
-Dpilot.ai.provider=ollama \
-Dpilot.ai.consent.local=true \
-Dpilot.ai.allowedEvidenceCategories=TEXT,LOG,CONFIGURATION \
-Dpilot.ai.ollama.model=<local-model> \
-jar shaft-mcp/target/SHAFT_MCP-<version>.jar doctor analyze \
--input allure-results \
--allowed-root "$PWD" \
--output-dir target/shaft-doctor \
--ai
Ollama defaults to http://127.0.0.1:11434/api/chat. Changing the endpoint does
not weaken consent, redaction, minimization, schema validation, or evidence-ID
checks.
For OpenAI, Anthropic, or Gemini, select the provider and model, approve remote processing, and approve the same evidence categories. Credentials remain in the provider-specific environment variable documented in optional provider controls; they are never Doctor arguments or report fields.
Doctor submits the deterministic diagnosis, its explicit uncertainty, and only
the textual evidence cited by deterministic findings. Unknown cases may submit
the smallest available textual evidence set. Provider output must match the
versioned shaft-doctor-advisory-1.0 schema and may contain observations,
hypotheses with confidence, missing evidence, recommended actions, and
limitations. References outside the submitted evidence-ID allowlist reject the
entire advisory. Uncited claims and hypotheses that contradict the
deterministic primary cause are visibly marked.
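The evidence-ID allowlist rule can be sketched as an all-or-nothing check. The types below are hypothetical stand-ins: the real shaft-doctor-advisory-1.0 schema is not reproduced here, and the names AdvisoryGate and Hypothesis are assumptions.

```java
import java.util.List;
import java.util.Set;

/** Illustrative only: all-or-nothing validation of cited evidence IDs. */
final class AdvisoryGate {
    private AdvisoryGate() {}

    /** Hypothetical view of one provider hypothesis and the evidence IDs it cites. */
    record Hypothesis(String text, List<String> evidenceIds) {}

    /** Accepts the advisory only if every cited ID was actually submitted. */
    static boolean accept(Set<String> submittedIds, List<Hypothesis> hypotheses) {
        for (Hypothesis hypothesis : hypotheses) {
            for (String id : hypothesis.evidenceIds()) {
                if (!submittedIds.contains(id)) {
                    return false; // one invented ID rejects the entire advisory
                }
            }
        }
        return true;
    }

    /** Uncited hypotheses are not rejected, but a report should mark them visibly. */
    static boolean isUncited(Hypothesis hypothesis) {
        return hypothesis.evidenceIds().isEmpty();
    }
}
```

Rejecting the whole advisory, rather than dropping only the offending hypothesis, avoids presenting partially trusted output from a response that has already shown it can invent references.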
Timeout, rate limit, invalid credentials, unavailable provider, malformed JSON, schema violation, oversized output, invented evidence, and budget exhaustion produce an explicit fallback advisory while retaining the complete deterministic report. Reports contain provider/model/configuration identifiers, duration, usage when available, cache state, and a safe fallback reason. They never contain credentials, raw provider responses, or hidden reasoning.
Use --ai-cache to explicitly cache successful safe structured advisories
under the output directory. Cache keys include the evidence bundle checksum,
deterministic diagnosis checksum, and a non-secret provider/configuration
checksum. Failures and raw evidence are never cached.
MCP
doctor_analyze accepts explicit input paths, historical bundle paths,
allowed roots, an output directory, screenshot/page-snapshot approvals, and
the minimum expected Allure result count. The tool calls the same
DoctorAnalyzer used by the CLI. It remains deterministic when Pilot AI is
disabled; when the MCP server is explicitly started with an enabled provider,
the same separate advisory and fallback rules apply.
ChatGPT, Codex, Claude, Gemini, and GitHub Copilot can invoke
doctor_analyze as external MCP clients. Their model authentication stays in
the client and is not ingested by SHAFT. Copilot is MCP interoperability, not a
generic Copilot API-key adapter. Download the credential-free
representative invocations.
Reviewed repair proposals
Doctor repair is a separate, approval-gated workflow. propose-fix requires an
exact 40-character base commit SHA, explicit repository-relative file
allowlists, structured full-file patches, and tokenized Maven validation
commands. It creates codex/doctor-<issue-or-session>-<proposal> in a temporary
Git worktree, applies changes only there, and returns a persisted manifest with
the complete unified diff, patch checksums, diagnosis/evidence references,
exact validation commands, populated Allure counts, residual risk, rollback
guidance, and a one-proposal approval token.
Example repair-input.json:
{
"patches": [
{
"path": "src/test/java/example/CheckoutTest.java",
"operation": "REPLACE",
"content": "package example;\n\nfinal class CheckoutTest {}\n",
"rationale": "Apply the reviewed diagnosis.",
"evidenceIds": ["allure-result-1"]
}
],
"validationCommands": [
["mvn", "-pl", "shaft-engine", "-am", "test", "-Dtest=CheckoutTest"],
["mvn", "-pl", "shaft-engine", "-am", "compile", "-DskipTests"]
]
}
java -jar shaft-mcp/target/SHAFT_MCP-<version>.jar doctor propose-fix \
--repository "$PWD" \
--base-sha <full-approved-sha> \
--diagnosis target/shaft-doctor/doctor-report.json \
--evidence-bundle target/shaft-doctor/doctor-evidence.json \
--issue 2857 \
--allowed-path src/test/java/example/CheckoutTest.java \
--repair-input repair-input.json \
--output-dir target/shaft-doctor/repairs
Only Maven compile, test, package, install, verification, Surefire/Failsafe, and
JavaDoc goals are accepted. Commands are executed as argument arrays without a
shell. Release, deployment, SCM, and Versions Plugin goals, shell
metacharacters, and arbitrary executables are rejected. Test-running commands are forced to
include -DheadlessExecution=true; a zero process exit alone is insufficient
when populated passing Allure results are expected. Maven runs offline by
default. CLI users must add --approve-network-validation, or MCP clients must
set networkValidationApproved=true, before validation may access the network.
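The validation-command rules can be sketched as a gate over tokenized argument arrays. This is an illustration under stated assumptions, not Doctor's implementation: the exact goal allowlist, the set of value-taking options, the metacharacter pattern, and the class name MavenCommandGate are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Illustrative only: validates one tokenized Maven command. */
final class MavenCommandGate {
    private MavenCommandGate() {}

    // Hypothetical allowlist: the text names goal families, not this exact set.
    private static final Set<String> ALLOWED_GOALS =
            Set.of("compile", "test", "package", "install", "verify", "javadoc:javadoc");
    // Lifecycle phases from "test" onward run tests.
    private static final Set<String> TEST_RUNNING = Set.of("test", "package", "install", "verify");
    // Maven options whose next token is a value rather than a goal (assumed subset).
    private static final Set<String> VALUE_OPTIONS = Set.of("-pl", "-f", "-P");
    private static final Pattern SHELL_META = Pattern.compile("[;&|<>`$(){}\\n]");
    private static final String HEADLESS = "-DheadlessExecution=true";

    /** Returns the command to execute, or throws if any token is not acceptable. */
    static List<String> validate(List<String> command) {
        if (command.isEmpty() || !command.get(0).equals("mvn")) {
            throw new IllegalArgumentException("only the mvn executable is accepted");
        }
        // Check every token, including option values, for shell metacharacters.
        for (String token : command) {
            if (SHELL_META.matcher(token).find()) {
                throw new IllegalArgumentException("shell metacharacter in: " + token);
            }
        }
        boolean sawGoal = false;
        boolean runsTests = false;
        for (int i = 1; i < command.size(); i++) {
            String token = command.get(i);
            if (VALUE_OPTIONS.contains(token)) {
                i++; // the next token is this option's value, not a goal
            } else if (!token.startsWith("-")) {
                if (!ALLOWED_GOALS.contains(token)) {
                    throw new IllegalArgumentException("goal not allowed: " + token);
                }
                sawGoal = true;
                runsTests |= TEST_RUNNING.contains(token);
            }
        }
        if (!sawGoal) {
            throw new IllegalArgumentException("no goal given");
        }
        List<String> result = new ArrayList<>(command);
        if (runsTests && !result.contains(HEADLESS)) {
            result.add(HEADLESS); // test-running commands are forced to run headless
        }
        return result;
    }
}
```

A validated list can be handed to java.lang.ProcessBuilder, which takes the argument array directly and starts the process without a shell, so nothing in a token is ever interpreted as shell syntax.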
--ai can request an optional provider-generated patch. The provider receives
only the deterministic diagnosis and exact approved regular source files under
explicit TEXT and SOURCE consent. Output must match the versioned repair
patch schema. Invented paths, commands, symlinks, binary or oversized content,
protected workflows, generated paths, and secret-like material are rejected.
Provider, consent, timeout, or schema failure returns no patch and does not
create a worktree.
Publishing is always a later explicit action:
java -jar shaft-mcp/target/SHAFT_MCP-<version>.jar doctor publish-draft-pr \
--manifest target/shaft-doctor/repairs/repair-proposal-<id>.json \
--approval-token <token-from-reviewed-proposal> \
--approve
Failed validation blocks publication by default. An explicit
--override-failed-validation also requires --override-rationale, which is
recorded in the manifest and pull-request body. Publication stages only the
manifested files, creates a Doctor-identified commit, pushes the dedicated
branch, and creates or reuses an open draft pull request through authenticated
gh. It never marks a PR ready, merges, releases, deploys, resets, cleans, or
switches the user's current worktree. The temporary worktree is removed after
publication or explicit cancellation; the published branch remains.
The MCP equivalents are doctor_propose_fix and
doctor_publish_draft_pr. The latter requires the same separate approved
boolean and exact proposal token.
Evidence
The collector recognizes populated Allure *-result.json, normalized exception
chains, SHAFT logs/action history, environment metadata, dependency/build
metadata, configuration summaries, screenshots, and page snapshots. Text and
structured JSON are redacted before retention or hashing. Password fields,
authorization and cookie headers, tokens, private keys, common credential
fields, and configured sensitive names are replaced without retaining their
original values.
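Name-based redaction of structured fields can be sketched as follows. The sensitive-name fragments, the "[REDACTED]" placeholder, and the class name Redactor are assumptions for illustration; Doctor's actual patterns and placeholder text are not specified here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative only: replaces values whose field name looks sensitive. */
final class Redactor {
    private Redactor() {}

    // Hypothetical name fragments; a real list would also include configured names.
    private static final Set<String> SENSITIVE_FRAGMENTS =
            Set.of("password", "authorization", "cookie", "token", "secret", "apikey");
    static final String PLACEHOLDER = "[REDACTED]";

    /** Returns a copy in field order with sensitive values replaced by the placeholder. */
    static Map<String, String> redact(Map<String, String> fields) {
        Map<String, String> redacted = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            String name = entry.getKey().toLowerCase(Locale.ROOT);
            boolean sensitive = SENSITIVE_FRAGMENTS.stream().anyMatch(name::contains);
            // The original value is never copied when the name is sensitive.
            redacted.put(entry.getKey(), sensitive ? PLACEHOLDER : entry.getValue());
        }
        return redacted;
    }
}
```

Redacting before hashing matters because a checksum of the unredacted text would still act as a fingerprint of the secret, allowing anyone who can guess the value to confirm it.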
Allure attempts are grouped by history ID and ordered by their recorded start
time. Non-final failed, broken, and skipped attempts remain visible even when
the final attempt passes. Historical bundles are optional and can be copied
with their relative artifacts/ directory for offline analysis.
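The grouping of attempts can be sketched with the historyId, start, and status fields found in Allure result files. The Attempt record and the helper names are hypothetical, and the sketch is one plausible reading of the behavior described above rather than Doctor's code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Illustrative only: groups Allure attempts and detects failures hidden by a retry. */
final class AttemptGrouping {
    private AttemptGrouping() {}

    /** Minimal projection of one Allure result. */
    record Attempt(String historyId, long start, String status) {}

    /** Groups attempts by history ID; each group is ordered by recorded start time. */
    static Map<String, List<Attempt>> groupByHistory(List<Attempt> attempts) {
        // Sorting first means each group's list keeps ascending start order.
        return attempts.stream()
                .sorted(Comparator.comparingLong(Attempt::start))
                .collect(Collectors.groupingBy(Attempt::historyId, TreeMap::new, Collectors.toList()));
    }

    /** True when the final attempt passed but an earlier attempt did not. */
    static boolean hasRetryHiddenFailure(List<Attempt> orderedAttempts) {
        if (orderedAttempts.isEmpty()) {
            return false;
        }
        int lastIndex = orderedAttempts.size() - 1;
        if (!orderedAttempts.get(lastIndex).status().equals("passed")) {
            return false; // the failure is not hidden; the final attempt shows it
        }
        return orderedAttempts.subList(0, lastIndex).stream()
                .anyMatch(attempt -> !attempt.status().equals("passed"));
    }
}
```

Keeping the earlier attempts in the group, instead of collapsing to the final status, is what allows a flaky test that eventually passed to remain visible.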
Diagnosis
The ordered rule engine classifies primary and contributing causes as:
- PRODUCT
- TEST
- LOCATOR
- DATA
- TIMING_SYNCHRONIZATION
- ENVIRONMENT_CONFIGURATION
- INFRASTRUCTURE
- UNKNOWN
Rules cover locator-not-found, duplicate, stale, hidden/covered/interactable, frame/window context, assertion and test-data mismatches, timeout symptoms, driver/browser startup, Grid/Appium/network/filesystem/resource failures, setup/cleanup failures, parallel shared-state symptoms, retry-hidden failures, and recurring historical signatures. Every inference cites evidence IDs and is kept separate from observations. Unknown and contradictory cases remain unknown and list the missing evidence needed to narrow them.
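An ordered rule engine of this kind can be sketched as a list evaluated top to bottom, where the first matching rule wins and no match yields UNKNOWN. The four rules below are hypothetical examples keyed on Selenium exception names; Doctor's real rules, their order, and their matching logic are not shown in this document.

```java
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only: first-match-wins classification over an ordered rule list. */
final class OrderedRules {
    private OrderedRules() {}

    enum Category {
        PRODUCT, TEST, LOCATOR, DATA, TIMING_SYNCHRONIZATION,
        ENVIRONMENT_CONFIGURATION, INFRASTRUCTURE, UNKNOWN
    }

    /** A named rule with the category it assigns when its predicate matches. */
    record Rule(String name, Category category, Predicate<String> matches) {}

    // Hypothetical rules; list order is the evaluation order.
    private static final List<Rule> RULES = List.of(
            new Rule("stale-element", Category.LOCATOR,
                    message -> message.contains("StaleElementReferenceException")),
            new Rule("locator-not-found", Category.LOCATOR,
                    message -> message.contains("NoSuchElementException")),
            new Rule("timeout", Category.TIMING_SYNCHRONIZATION,
                    message -> message.contains("TimeoutException")),
            new Rule("driver-startup", Category.INFRASTRUCTURE,
                    message -> message.contains("SessionNotCreatedException")));

    /** Returns the category of the first matching rule, or UNKNOWN when none match. */
    static Category classify(String exceptionText) {
        for (Rule rule : RULES) {
            if (rule.matches().test(exceptionText)) {
                return rule.category();
            }
        }
        return Category.UNKNOWN;
    }
}
```

A fixed order makes the outcome deterministic when several symptoms appear together: a timeout caused by a missing element is classified by whichever rule is listed first, and the same input always produces the same category.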
Sharing
Review doctor-evidence.json, doctor-report.json, and any approved
artifacts/ before sharing. Screenshots and page source can contain personal
or confidential data even after deterministic redaction and therefore remain
opt-in. Doctor never uploads evidence automatically.
Validation
Run from the repository root:
mvn -pl shaft-doctor,shaft-ai,shaft-mcp -am test
mvn -pl shaft-doctor -am javadoc:javadoc