Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions openspec/changes/add-qa-smoke-harness/.openspec.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-02-20
45 changes: 45 additions & 0 deletions openspec/changes/add-qa-smoke-harness/proposal.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
## Why

We need a faster, more reliable way to manually validate CLI behavior changes like profile/delivery sync, migration behavior, and tool-detection UX.

Today, manual review is mostly ad hoc: each developer sets up state differently, runs a different command order, and checks outputs informally. This makes regressions easy to miss and slows iteration on CLI UX work.

An 80/20 solution is to add a lightweight smoke harness for deterministic non-interactive flows, plus a short manual checklist for interactive prompt behavior.

## What Changes

- Add a lightweight QA smoke harness for OpenSpec CLI behavior with isolated per-run sandbox state
- Use `Makefile` targets as the primary entrypoint:
- `make qa` (default local QA entrypoint)
- `make qa-smoke` (deterministic non-interactive suite)
- `make qa-interactive` (prints/opens manual interactive checklist)
- Implement smoke logic in a script (invoked by Make targets), not in Make itself
- Ensure each scenario runs in an isolated sandbox with temporary `HOME`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, and `CODEX_HOME`
- Capture scenario artifacts for inspection (command output, exit code, and before/after filesystem state)
- Add a focused scenario set for high-risk behavior:
- init core output generation
- non-interactive detected-tool behavior
- migration when profile is unset
- delivery cleanup (`both -> skills`, `both -> commands`)
- commands-only update detection
- new tool directory detection messaging
- invalid profile override validation
- Add a short interactive checklist for keypress/prompt UX verification (Space toggle, Enter confirm, detected pre-selection)
- Wire CI to run the smoke suite on Linux as a fast regression gate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Linux-only CI gate leaves Windows behavior unvalidated.

The proposal scopes CI wiring to Linux only. The PR objectives explicitly flag "Windows CI verification" as pending. Given that the smoke harness uses shell scripts (scripts/qa-smoke.sh), path separators, and env var conventions that differ on Windows, a Linux-only gate is a reasonable starting point — but the proposal should document the Windows gap explicitly (e.g., as a known limitation or a follow-up item) so it is not silently dropped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/add-qa-smoke-harness/proposal.md` at line 28, The proposal
currently states "Wire CI to run the smoke suite on Linux as a fast regression
gate" but does not document the unvalidated Windows behavior; update proposal.md
to explicitly note that the CI gate is Linux-only, call out that
scripts/qa-smoke.sh and shell/env/path differences may break on Windows, and add
a concrete follow-up action item (e.g., "Add Windows CI verification" or "Port
scripts/qa-smoke.sh for Windows") with owner and priority so the Windows gap
isn't silently dropped.


## Capabilities

### New Capabilities

- `qa-smoke-harness`: Deterministic, sandboxed CLI smoke validation with a single developer entrypoint

### Modified Capabilities

- `developer-qa-workflow`: Standardized local/CI QA flow for CLI behavior and migration-sensitive scenarios

## Impact

- `Makefile` - Add `qa`, `qa-smoke`, and `qa-interactive` targets
- `scripts/qa-smoke.sh` (or equivalent) - Implement sandbox setup, scenario execution, and assertions
- `docs/` - Add/update contributor-facing QA instructions and interactive checklist usage
- CI workflow - Add smoke target execution as a lightweight regression gate
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
## ADDED Requirements

### Requirement: Makefile QA Entry Point

The repository SHALL provide Makefile targets as the primary developer entrypoint for CLI QA flows.

#### Scenario: Default QA target runs smoke suite

- **WHEN** a developer runs `make qa`
- **THEN** the command SHALL execute the non-interactive smoke suite
- **AND** exit with status code 0 only when all smoke scenarios pass

#### Scenario: Smoke suite target is directly invokable

- **WHEN** a developer runs `make qa-smoke`
- **THEN** the command SHALL execute the same smoke suite used by `make qa`
- **AND** return a non-zero exit code on assertion failure

#### Scenario: Interactive checklist target exists

- **WHEN** a developer runs `make qa-interactive`
- **THEN** the command SHALL provide the manual interactive verification checklist
- **AND** SHALL NOT run interactive prompt automation by default

### Requirement: Sandboxed Smoke Scenario Runner

The smoke suite SHALL run CLI scenarios in isolated sandboxes so tests are repeatable and do not depend on machine-global state.

#### Scenario: Scenario execution is environment-isolated

- **WHEN** a smoke scenario runs
- **THEN** it SHALL use temporary values for `HOME`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, and `CODEX_HOME`
- **AND** global config from the host machine SHALL NOT affect scenario outcomes

#### Scenario: Scenario artifacts are captured for review

- **WHEN** a smoke scenario completes
- **THEN** the runner SHALL capture command output and exit status
- **AND** SHALL capture enough filesystem state to inspect before/after behavior
Comment on lines +37 to +39
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

"Enough filesystem state" is a non-testable normative requirement.

A SHALL clause must be verifiable. "Enough" is subjective and gives implementors no concrete signal for what to capture or assert. Consider specifying the artifact surface explicitly (e.g., global-config file, per-project .openspec/ directory, or a list of paths relative to the sandbox root).

📝 Example tightening
-- **AND** SHALL capture enough filesystem state to inspect before/after behavior
+- **AND** SHALL capture the sandbox global-config file, the project `.openspec/` directory tree,
+  and any generated workflow files, preserving before/after snapshots for post-run inspection
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/add-qa-smoke-harness/specs/developer-qa-workflow/spec.md`
around lines 37 - 39, Replace the subjective clause "**AND** SHALL capture
enough filesystem state to inspect before/after behavior" with a verifiable,
concrete list of artifacts to capture (or a clear, configurable rule for which
paths to snapshot); for example update the clause to state the runner SHALL
capture command stdout/stderr and exit status, SHALL record a snapshot of
specified files and directories (e.g., global config file, per-project
`.openspec/` directory, and a configurable list of paths relative to the sandbox
root), and SHALL persist metadata identifying each captured path and its
pre/post state to make the requirement testable.


#### Scenario: High-risk workflow coverage exists

- **WHEN** the smoke suite executes
- **THEN** it SHALL include scenarios covering profile/delivery behavior and migration-sensitive flows
- **AND** include at least:
- non-interactive tool detection
- migration when profile is unset
- delivery cleanup (`both -> skills`, `both -> commands`)
- commands-only update detection
Comment on lines +25 to +49
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

No spec requirement for CI wiring.

proposal.md (line 28 and line 45) lists CI wiring as an explicit deliverable and names it as an Impact item. The spec currently has no corresponding SHALL requirement covering when and how the smoke suite must run in CI. Without this, the CI gate is optional from a spec-compliance perspective and risks being dropped during implementation.

Consider adding a requirement such as:

Scenario: Smoke suite runs in CI on Linux

  • WHEN a pull request targets the main branch
  • THEN the CI pipeline SHALL execute make qa-smoke
  • AND SHALL fail the pipeline on a non-zero exit code
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/add-qa-smoke-harness/specs/developer-qa-workflow/spec.md`
around lines 25 - 49, The spec is missing a SHALL-level requirement to ensure
the smoke suite runs in CI; add a new Scenario titled "Smoke suite runs in CI on
Linux" that states WHEN a pull request targets the main branch THEN the CI
pipeline SHALL execute `make qa-smoke` and SHALL fail the pipeline on any
non-zero exit code; reference `proposal.md`'s CI deliverable to justify
inclusion and place this new scenario alongside the existing
sandboxed-smoke-runner requirements so CI wiring is a normative requirement
rather than optional.

Comment on lines +41 to +49
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Spec is missing three high-risk scenarios listed in the proposal.

proposal.md defines seven mandatory high-risk scenarios; this spec only enforces four. The three omitted ones are:

Proposal item Why it matters
init core output generation Validates the happy-path baseline for the new 4-workflow core profile
new tool directory detection messaging Covers the UX path for newly-detected AI tool dirs — a regression surface from the smart-init changes
invalid profile override validation Validates an error path; without a spec requirement it is likely to be skipped in the smoke suite

The spec and proposal should align on the minimum required coverage. If any scenarios are intentionally deferred, the proposal should note that explicitly.

📝 Proposed addition to the spec
   - commands-only update detection
+  - init core output generation (core profile, 4 workflows)
+  - new tool directory detection messaging
+  - invalid profile override validation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/add-qa-smoke-harness/specs/developer-qa-workflow/spec.md`
around lines 41 - 49, The spec section "High-risk workflow coverage exists"
currently enforces only four of the seven required high-risk scenarios from the
proposal; add explicit requirements for the three missing scenarios — "init core
output generation", "new tool directory detection messaging", and "invalid
profile override validation" — as mandatory smoke-suite checks (e.g., extend the
THEN/AND list under that Scenario to include these three bullets and brief
acceptance criteria), and if any of these are intentionally deferred, update
proposal.md or this spec to state which are deferred and why so the two
documents remain aligned.

95 changes: 95 additions & 0 deletions openspec/changes/simplify-skill-installation/design.md
Original file line number Diff line number Diff line change
Expand Up @@ -132,10 +132,76 @@ What's installed in `.claude/skills/` (etc.) is the source of truth, not config.
- Extra workflows (not in profile) are preserved
- Delivery changes are applied: switching to `skills` removes commands, switching to `commands` removes skills

**Why not a separate tool manifest?**

Tool selection (which assistants a project uses) is per-user AND per-project, but the two config locations are per-user-only (global config) or per-project-shared (checked-in project config). A separate manifest was explored and rejected:

- *Path-keyed global config* (`projects: { "/path": { tools: [...] } }`): Fragile on directory move/rename/delete, symlink ambiguity, and project behavior depends on invisible external state.
- *Gitignored local file* (`.openspec.local`): Lost on fresh clone, adds file management overhead.
- *Checked-in project config* (`openspec/config.yaml` with `tools` field): Forces tool choices on the whole team — Alice uses Claude Code, Bob uses Cursor, neither wants the other's tools mandated.

The filesystem approach avoids all three problems. For teams, it's actually beneficial: checked-in skill files mean `openspec update` from any team member refreshes skills for all tools the project supports. The generated files serve as both the deliverable and the implicit tool manifest.

Known gap: a tool that stores config outside the project tree (no local directory to scan) would need tool-specific handling, since there's nothing in the project to scan. Address if/when such a tool is supported.

**When to use init vs update:**
- `init`: First time setup, or when you want to change which tools are configured
- `update`: After changing config, or to refresh templates to latest version

### 8. Existing User Migration

When `openspec init` or `openspec update` encounters a project with existing workflows but no `profile` field in global config, it performs a one-time migration to preserve the user's current setup.

**Rationale:** Without migration, existing users would default to `core` profile, causing `propose` to be added on top of their 10 workflows — making things worse, not better. Migration ensures existing users keep exactly what they have.

**Triggered by:** Both `init` (re-init on existing project) and `update`. The migration check is a shared function called early in both commands, before profile resolution.

**Detection logic:**
```typescript
// Shared migration check, called by both init and update:
function migrateIfNeeded(projectPath: string, tools: AiTool[]): void {
const globalConfig = readGlobalConfig();
if (globalConfig.profile) return; // already migrated or explicitly set

const installedWorkflows = scanInstalledWorkflows(projectPath, tools);
if (installedWorkflows.length === 0) return; // new user, use core defaults

// Existing user — migrate to custom profile
writeGlobalConfig({
...globalConfig,
profile: 'custom',
delivery: 'both',
workflows: installedWorkflows,
});
}
```

**Scanning logic:**
- Scan all tool directories (`.claude/skills/`, `.cursor/skills/`, etc.) for workflow directories/files
- Match only against `ALL_WORKFLOWS` constant — ignore user-created custom skills/commands
- Map directory names back to workflow IDs (e.g., `openspec-explore/` → `explore`, `opsx-explore.md` → `explore`)
- Take the union of detected workflow names across all tools

**Edge cases:**
- **User manually deleted some workflows:** Migration scans what's actually installed, respecting their choices
- **Multiple projects with different workflow sets:** First project to trigger migration sets global config; subsequent projects use it
- **User has custom (non-OpenSpec) skills in the directory:** Ignored — scanner only matches known workflow IDs from `ALL_WORKFLOWS`
- **Migration is idempotent:** If `profile` is already set in config, no re-migration occurs
- **Non-interactive (CI):** Same migration logic, no confirmation needed — it's preserving existing state

**Alternatives considered:**
- Migrate during `init` instead of `update`: Init already has its own flow (tool selection, etc.). Mixing migration with init creates confusing UX
- Don't migrate, just default to core: Breaks existing users by adding `propose` and showing "extra workflows" warnings
- Migrate at global config read time: Too implicit, hard to show feedback to user

### 9. Generic Next-Step Guidance in Templates

Workflow templates use generic, concept-based next-step guidance rather than referencing specific workflow commands. For example, instead of "run `/opsx:propose`", templates say "create a change proposal".

**Rationale:** Conditional cross-referencing (where each template checks which other workflows are installed and renders different command names) adds significant complexity to template generation, testing, and maintenance. Generic guidance avoids this entirely while still being useful — users already know their installed workflows.

**Note:** If we find that users consistently struggle to map concepts to commands, we can revisit this with conditional cross-references. For now, simplicity wins.

### 7. Fix Multi-Select Keybindings
Comment on lines +151 to 205
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Section numbering is out of order — sections 8 and 9 precede section 7 in the file.

Sections 8 ("Existing User Migration") and 9 ("Generic Next-Step Guidance") were inserted before the existing section 7 ("Fix Multi-Select Keybindings") rather than after it. Consider renumbering or reordering to keep the document sequential.

🧰 Tools
🪛 LanguageTool

[style] ~183-~183: To elevate your writing, try using a synonym here.
Context: ... global config read time: Too implicit, hard to show feedback to user ### 9. Generi...

(HARD_TO)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/simplify-skill-installation/design.md` around lines 139 -
193, The section numbering is out of order: "### 8. Existing User Migration" and
"### 9. Generic Next-Step Guidance" appear before "### 7. Fix Multi-Select
Keybindings"; renumber or reorder these headings so the document flows
sequentially (e.g., move "Existing User Migration" and "Generic Next-Step
Guidance" after "Fix Multi-Select Keybindings" or renumber them to 7/8/9
consistently), and update any in-file references or cross-links that mention the
old section numbers to the new numbers; locate headings by the exact strings
"### 8. Existing User Migration", "### 9. Generic Next-Step Guidance", and "###
7. Fix Multi-Select Keybindings" to apply the fix.


Change from tab-to-confirm to industry-standard space/enter.
Expand All @@ -144,6 +210,35 @@ Change from tab-to-confirm to industry-standard space/enter.

**Implementation:** Modify `src/prompts/searchable-multi-select.ts` keybinding configuration.

### 10. Update Sync Must Consider Config Drift, Not Just Version Drift

`openspec update` cannot rely only on `generatedBy` version checks for deciding whether work is needed.

**Rationale:** profile and delivery changes can require file add/remove operations even when existing skill templates are current. If we only check template versions, update may incorrectly return "up to date" and skip required sync.

**Implementation:**
- Keep version checks for template refresh decisions
- Add file-state drift checks for profile/delivery (missing expected files or stale files from removed delivery mode)
- Treat either version drift OR config drift as update-required

### 11. Tool Configuration Detection Includes Commands-Only Installs

Configured-tool detection for update must include command files, not only skill files.

**Rationale:** with `delivery: commands`, a project can be fully configured without skill files. Skill-only detection incorrectly reports "No configured tools found."

**Implementation:**
- For update flows, treat a tool as configured if it has either generated skills or generated commands
- Keep migration workflow scanning behavior unchanged (skills remain the migration source of truth)

### 12. Init Profile Override Is Strictly Validated

`openspec init --profile` must validate allowed values before proceeding.

**Rationale:** silently accepting unknown profile values hides user errors and produces implicit fallback behavior.

**Implementation:** accept only `core` and `custom`; throw a clear CLI error for invalid values.

## Risks / Trade-offs

**Risk: Breaking existing user workflows**
Expand Down
41 changes: 27 additions & 14 deletions openspec/changes/simplify-skill-installation/proposal.md
Original file line number Diff line number Diff line change
Expand Up @@ -127,30 +127,43 @@ Workflows: (space to toggle, enter to save)
[ ] onboard
```

### 8. Backwards Compatibility

- Existing users with all workflows keep them (extra workflows not in profile are preserved)
- `openspec init` sets up new projects using current profile config
- `openspec update` applies config changes to existing projects (adds missing workflows, refreshes templates)
### 8. Backwards Compatibility & Migration

**Existing users keep their current setup.** When `openspec update` runs on a project with existing workflows and no `profile` in global config, it performs a one-time migration:

1. Scans installed workflow files across all tool directories in the project
2. Writes `profile: "custom"`, `delivery: "both"`, `workflows: [<detected>]` to global config
3. Refreshes templates but does NOT add or remove any workflows
4. Displays: "Migrated: custom profile with N existing workflows"

After migration, subsequent `init` and `update` commands respect the migrated config.

**Key behaviors:**
- Existing users' workflows are preserved exactly as-is (no `propose` added automatically)
- Both `init` (re-init) and `update` trigger migration on existing projects if no profile is set
- `openspec init` on a **new** project (no existing workflows) uses global config, defaulting to `core`
- `init` with a custom profile shows what will be installed and prompts to proceed or reconfigure
- `init` validates `--profile` values (`core` or `custom`) and errors on invalid input
- Migration message mentions `propose` and suggests `openspec config profile core` to opt in
- After migration, users can opt into `core` profile via `openspec config profile core`
- Workflow templates conditionally reference only installed workflows in "next steps" guidance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Documentation inconsistency with design.md section 9.

Line 149 states templates "conditionally reference only installed workflows" in next-step guidance, but design.md section 9 explicitly documents the opposite decision — generic, concept-based guidance was chosen to avoid conditional cross-referencing complexity. Consider aligning this bullet with the design decision.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/simplify-skill-installation/proposal.md` at line 149, The
proposal's line "Workflow templates conditionally reference only installed
workflows in 'next steps' guidance" conflicts with the design decision in
design.md section 9 which specifies using generic, concept-based guidance;
update the bullet in proposal.md (the line containing that quoted phrase) to
match design.md section 9 by replacing the conditional/installed-workflows
wording with language that states templates provide generic, concept-based
next-step guidance rather than conditionally referencing installed workflows,
and ensure the phrasing mirrors the terminology used in design.md for
consistency.

- Delivery changes are applied: switching to `skills` removes command files, switching to `commands` removes skill files
- Re-running `init` applies delivery cleanup on existing projects (removes files that no longer match delivery)
- `update` treats profile/delivery drift as update-required even when template versions are already current
- `update` treats command-only installs as configured tools
- All workflows remain available via custom profile

## Capabilities

### New Capabilities

- `profiles`: Support for workflow profiles (core, custom) with interactive configuration
- `delivery-config`: User preference for delivery method (skills, commands, both)
- `profiles`: Workflow profiles (core, custom), delivery preferences, global config storage, interactive picker
- `propose-workflow`: Combined workflow that creates change + generates all artifacts
- `user-config`: Extend existing global config with profile/delivery settings
- `available-tools`: Detect what AI tools the user has from existing directories

### Modified Capabilities

- `cli-init`: Smart defaults with auto-detection and confirmation
- `tool-selection-ux`: Space to select, Enter to confirm
- `skill-generation`: Conditional based on profile and delivery settings
- `command-generation`: Conditional based on profile and delivery settings
- `cli-init`: Smart defaults with tool auto-detection, profile-based skill/command generation
- `cli-update`: Profile support, delivery changes, one-time migration for existing users

## Impact

Expand All @@ -166,7 +179,7 @@ Workflows: (space to toggle, enter to save)
- `src/core/shared/skill-generation.ts` - Filter by profile, respect delivery
- `src/core/shared/tool-detection.ts` - Update SKILL_NAMES and COMMAND_IDS to include propose
- `src/commands/config.ts` - Add `profile` subcommand with interactive picker
- `src/commands/update.ts` - Add profile/delivery support, file deletion for delivery changes
- `src/core/update.ts` - Add profile/delivery support, file deletion for delivery changes
- `src/prompts/searchable-multi-select.ts` - Fix keybindings (space/enter)

### Global Config Schema Extension
Expand Down

This file was deleted.

Loading
Loading