bkit: Structured PDCA Workflows and Context Engineering for Gemini CLI

TL;DR

Component	Count	Purpose
Specialized AI Agents	21	Role-based personas (CTO, architect, QA, PM) with permission grades
Domain Skills	35	On-demand expert knowledge via progressive disclosure
Custom Commands	24	TOML slash commands with `@{}`, `!{}`, `{{}}` syntax
Hook Events	10	Full Gemini CLI lifecycle interception
Team Patterns	5	Leader, Council, Swarm, Pipeline, Watchdog
Automation Levels	5	L0 (Manual) to L4 (Autonomous)
Languages	8	EN, KO, JA, ZH, ES, FR, DE, IT

Install: gemini extensions install https://github.com/popup-studio-ai/bkit-gemini.git — then run /bkit to start.

The Problem bkit Solves

AI coding assistants are powerful, but they lack structure. You ask Gemini CLI to build a feature, and it jumps straight to writing code — no planning, no design doc, no gap analysis. The result? Fragile implementations that drift from the original intent, inconsistent code quality, and no documentation trail.

bkit solves this by layering PDCA (Plan-Do-Check-Act) methodology on top of Gemini CLI through a concept its creators call Context Engineering.

Context Engineering vs Prompt Engineering

The distinction matters. Prompt engineering is about writing good prompts. Context Engineering is about designing systems that integrate prompts, tools, and state to provide LLMs with optimal context for inference.

bkit implements this through three layers:

Layer	Components	What It Does
Domain Knowledge	35 skills	Expert knowledge loaded on-demand (progressive disclosure)
Behavioral Rules	21 agents	Role-based constraints with model/tools/temperature configs
State Management	17 hooks + 13 lib modules	PDCA tracking, intent detection, permissions, memory

Instead of one massive prompt, bkit injects only the context relevant to the current PDCA phase. The v2.0.0 guide reports 71% token savings from this phase-aware context loading — not through prompt tricks, but through architectural design.
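Phase-aware loading can be pictured with a small sketch. This is not bkit's implementation: the `ContextBlock` type, the block names, and the word-count size estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBlock:
    name: str
    phases: frozenset  # PDCA phases in which this block is relevant
    text: str


def select_context(blocks, phase):
    """Return only the blocks tagged for the given phase, in original order."""
    return [b for b in blocks if phase in b.phases]


def approx_tokens(blocks):
    """Very rough size estimate: whitespace-separated words, not real tokens."""
    return sum(len(b.text.split()) for b in blocks)


# Hypothetical blocks; names and contents are illustrative only.
BLOCKS = [
    ContextBlock("planning-guide", frozenset({"plan"}), "requirements scope constraints"),
    ContextBlock("design-guide", frozenset({"design"}), "architecture api contracts components"),
    ContextBlock("coding-rules", frozenset({"do", "check"}), "prefer editing existing files"),
]

if __name__ == "__main__":
    chosen = select_context(BLOCKS, "do")
    print([b.name for b in chosen], approx_tokens(chosen), "of", approx_tokens(BLOCKS))
```

The point of the sketch is only that the saving comes from filtering what gets injected, not from rewording it.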

The 10-Event Hook System

This is the core mechanism. bkit intercepts the entire Gemini CLI lifecycle:

Event 1:  SessionStart          -> Detect project level, load output style, inject PDCA context
Event 2:  BeforeAgent           -> 8-language intent detection, agent/skill auto-trigger
Event 3:  BeforeModel           -> PDCA phase-specific prompt augmentation
Event 4:  AfterModel            -> Response tracking, usage metrics
Event 5:  BeforeToolSelection   -> Phase-based tool filtering
Event 6:  BeforeTool            -> Permission manager, dangerous command blocking (exit code 2)
Event 7:  AfterTool             -> Auto PDCA phase transitions
Event 8:  AfterAgent            -> Cleanup, phase completion detection
Event 9:  PreCompress           -> Context fork snapshot preservation
Event 10: SessionEnd            -> Session cleanup, memory persistence

The key automation: Event 7 auto-transitions PDCA phases. When you write your first source file in the Design phase, bkit automatically transitions to the Do phase. When you run analysis, it transitions to Check. You don’t have to remember which phase you’re in.
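The two transition rules just described can be sketched as a small function. The event shape, the tool names (`write_file`, `analyze`), the list of source-file suffixes, and the choice to allow the Check transition only from Do are assumptions for this sketch, not bkit's actual hook code.

```python
# Hypothetical event shape: {"tool": "write_file", "path": "src/app.ts"}.
# Tool names and the notion of a "source file" are assumptions for this sketch.
SOURCE_SUFFIXES = (".ts", ".tsx", ".js", ".py")


def next_phase(phase: str, event: dict) -> str:
    """Return the PDCA phase after a tool call, following the two rules in the text."""
    tool = event.get("tool")
    path = event.get("path", "")
    # Rule 1: writing a source file while in Design moves the feature to Do.
    if phase == "design" and tool == "write_file" and path.endswith(SOURCE_SUFFIXES):
        return "do"
    # Rule 2: running analysis while in Do moves the feature to Check.
    if phase == "do" and tool == "analyze":
        return "check"
    # Anything else leaves the phase unchanged.
    return phase
```

Writing a design document during Design does not trigger a transition in this model, because only source-file suffixes count.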

Dangerous commands are auto-blocked by Event 6: rm -rf, git reset --hard, git push --force, reverse shell patterns, and similar destructive operations are rejected before execution.
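A minimal sketch of this kind of pre-execution check follows. The regular expressions are illustrative and much narrower than a real blocklist would be. The only detail taken from the article is that a blocked command is signalled with exit code 2.

```python
import re

# Illustrative patterns only; a real blocklist would cover many more variants.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-(?:rf|fr)\b"),
    re.compile(r"\bgit\s+reset\s+--hard\b"),
    re.compile(r"\bgit\s+push\b.*--force\b"),
]

EXIT_ALLOW = 0
EXIT_BLOCK = 2  # exit code the article associates with a blocked tool call


def check_command(command: str) -> int:
    """Return EXIT_BLOCK if the command matches a blocked pattern, else EXIT_ALLOW."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(command):
            return EXIT_BLOCK
    return EXIT_ALLOW


if __name__ == "__main__":
    for cmd in ("git status", "rm -rf /tmp/build", "git push --force origin main"):
        print(cmd, "->", check_command(cmd))
```

Note that the third pattern also matches `--force-with-lease`, which errs on the side of blocking.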

Getting Started

Prerequisites

Gemini CLI v0.34.0+ (required by bkit v2.0.0)
Node.js v18+ (for hook scripts)
Git

Installation

gemini extensions install https://github.com/popup-studio-ai/bkit-gemini.git

Verify with /bkit in interactive mode or gemini extensions list in non-interactive mode.

Ensure Hooks Are Enabled

{
  "hooksConfig": {
    "enabled": true
  }
}

Hooks are enabled by default, but it’s worth verifying. On first run, bkit auto-initializes and shows a Control Panel and Workflow Map.
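If you want to check the flag programmatically, something like the following would do. The article does not say where the settings file lives, so this sketch takes the JSON text as input instead of assuming a path. Treating a missing key as enabled follows the statement that hooks are on by default.

```python
import json


def hooks_enabled(settings_text: str) -> bool:
    """Return the hooksConfig.enabled flag, treating a missing key as enabled."""
    settings = json.loads(settings_text)
    return bool(settings.get("hooksConfig", {}).get("enabled", True))


if __name__ == "__main__":
    print(hooks_enabled('{"hooksConfig": {"enabled": true}}'))
```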

First-Time User Options

Option	Command	Target Audience
Learn bkit	`/development-pipeline`	First-time bkit users
Learn Gemini CLI	`/claude-code-learning`	First-time Gemini CLI users
New project	`/starter`, `/dynamic`, `/enterprise`	New projects
Free start	Normal conversation mode	Experienced users

Automation Levels: L0 Through L4

One of the most significant additions in v2.0.0 is the automation level system. You control how autonomous bkit is:

Level	Name	Behavior
L0	Manual	Every action requires user approval
L1	Semi-Auto	PDCA transitions auto, implementation requires approval
L2	Auto	Read actions auto, write actions require approval
L3	Full-Auto	Mostly auto; only dangerous commands require approval
L4	Autonomous	Fully autonomous (except dangerous commands)

Defaults: Starter=L0, Dynamic=L1, Enterprise=L1.

/control level L2                          # Set automation level
export BKIT_PDCA_AUTOMATION=full-auto      # Or via environment variable
/control stop                              # Emergency stop
Ctrl+C                                     # Also triggers emergency stop + checkpoint save
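One way to read the level table is as a set of action kinds that each level approves automatically. The four action kinds (`transition`, `read`, `write`, `dangerous`) are a simplification introduced here, and the table alone does not say how L3 and L4 differ in practice, so this sketch treats them the same.

```python
# Action kinds approved without asking, per level. This is an interpretation
# of the level table, not bkit's actual permission logic.
AUTO_APPROVED = {
    "L0": frozenset(),
    "L1": frozenset({"transition"}),
    "L2": frozenset({"transition", "read"}),
    "L3": frozenset({"transition", "read", "write"}),
    "L4": frozenset({"transition", "read", "write"}),
}


def needs_approval(level: str, action: str) -> bool:
    """Return True when the action is not auto-approved at this level.

    "dangerous" never appears in any set, so it is never auto-approved.
    """
    if level not in AUTO_APPROVED:
        raise ValueError(f"unknown automation level: {level}")
    return action not in AUTO_APPROVED[level]
```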

The Core Workflow: PDCA

Every feature goes through the same 6-phase cycle. bkit enforces this — you can’t skip to implementation without a plan.

Plan

/pdca plan "user authentication with JWT and Google OAuth"

Creates docs/01-plan/features/user-authentication.plan.md with requirements, scope, and constraints. For deeper exploration:

/plan-plus "user authentication with JWT and Google OAuth"

This adds Intent Discovery, 3+ alternative comparisons, and YAGNI (You Aren’t Gonna Need It) review before committing to a plan.

Design

/pdca design "user authentication with JWT and Google OAuth"

Creates docs/02-design/features/user-authentication.design.md with architecture decisions, API contracts, and component structure. The design-validator agent checks completeness and consistency automatically.

Do

/pdca do "user authentication with JWT and Google OAuth"

Implementation phase. All tools are unlocked (Plan phase restricts to read-only). The permission manager blocks dangerous commands. AfterTool hooks track every file change. bkit’s Rule 5 applies: “Prefer editing existing files over creating new ones.”

Check

/pdca analyze "user authentication with JWT and Google OAuth"

The gap-detector agent compares the design document against the actual implementation and produces a Match Rate metric: (implemented items / designed items) × 100%. Target: >= 90%. Generates docs/03-analysis/user-authentication.analysis.md.
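The Match Rate arithmetic is simple enough to write out. The function names below are hypothetical, and counting "items" is left to the caller, since the article does not define how the gap-detector enumerates them.

```python
def match_rate(implemented: int, designed: int) -> float:
    """Match Rate = (implemented items / designed items) x 100."""
    if designed <= 0:
        raise ValueError("designed must be a positive count")
    return 100.0 * implemented / designed


def next_command(rate: float, threshold: float = 90.0) -> str:
    """Suggest the follow-up command based on the 90% target."""
    return "/pdca report" if rate >= threshold else "/pdca iterate"


if __name__ == "__main__":
    rate = match_rate(13, 15)
    print(round(rate, 1), next_command(rate))
```

For example, 13 of 15 designed items implemented gives roughly 86.7%, which falls below the target and points to another iteration.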

Act

# If Match Rate < 90%: iterate (max 5 iterations)
/pdca iterate "user authentication with JWT and Google OAuth"

# If Match Rate >= 90%: generate completion report
/pdca report "user authentication with JWT and Google OAuth"

The pdca-iterator agent uses the Evaluator-Optimizer pattern to auto-fix issues, re-running Check after each iteration. The report-generator creates docs/04-report/features/user-authentication.report.md when the match rate is acceptable.
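The iterate-then-recheck behaviour can be sketched as a bounded loop. The `evaluate` and `optimize` callables stand in for the Check step and the auto-fix step; their names and signatures are assumptions, and only the 90% target and the 5-iteration cap come from the article.

```python
def run_act_loop(evaluate, optimize, threshold=90.0, max_iterations=5):
    """Evaluator-optimizer loop: fix, re-check, stop at the target or the cap.

    evaluate() returns the current match rate; optimize() attempts one round of fixes.
    Returns the final rate and the number of iterations used.
    """
    rate = evaluate()
    iterations = 0
    while rate < threshold and iterations < max_iterations:
        optimize()
        iterations += 1
        rate = evaluate()  # re-run Check after each iteration
    return rate, iterations


if __name__ == "__main__":
    # Simulated feature whose match rate improves by 8 points per fix round.
    state = {"rate": 70.0}
    result = run_act_loop(
        lambda: state["rate"],
        lambda: state.update(rate=state["rate"] + 8),
    )
    print(result)
```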

Status, Archive, and Batch

/pdca status              # Current PDCA state
/pdca next                # Auto-advance guidance
/pdca archive auth        # Archive completed PDCA docs
/pdca batch               # View all active features (parallel PDCA, max 3)

State Management: Where bkit Stores Things

PDCA state is persisted in .bkit/state/pdca-status.json:

{
  "primaryFeature": "user-authentication",
  "features": {
    "user-authentication": {
      "phase": "check",
      "matchRate": 87,
      "iterations": 1,
      "documents": {
        "plan": "docs/01-plan/features/user-authentication.plan.md",
        "design": "docs/02-design/features/user-authentication.design.md"
      }
    }
  }
}
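Because the state is plain JSON, it is easy to inspect from a script. The sketch below assumes the file has exactly the shape shown in the example; the real schema may carry more fields, and `summarize` is a hypothetical helper.

```python
import json

# Sample mirroring the structure shown in the article.
SAMPLE = """
{
  "primaryFeature": "user-authentication",
  "features": {
    "user-authentication": {
      "phase": "check",
      "matchRate": 87,
      "iterations": 1,
      "documents": {
        "plan": "docs/01-plan/features/user-authentication.plan.md",
        "design": "docs/02-design/features/user-authentication.design.md"
      }
    }
  }
}
"""


def summarize(state: dict) -> str:
    """Build a one-line summary of the primary feature."""
    name = state["primaryFeature"]
    feature = state["features"][name]
    docs = ", ".join(sorted(feature.get("documents", {})))
    return (
        f"{name}: phase={feature['phase']} matchRate={feature['matchRate']}% "
        f"iterations={feature['iterations']} docs=[{docs}]"
    )


if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE)))
```

To read the real file, replace `json.loads(SAMPLE)` with a `json.load` call on `.bkit/state/pdca-status.json`.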

Memory operates at three tiers:

Scope	Storage	Persistence	Purpose
Session	In-memory	Current session only	Temporary state, intermediate results
Project	`.bkit/state/`	Permanent	PDCA state, decision records
User	`~/.bkit/`	All projects	User preferences, learning history

The starter-guide and pipeline-guide agents use User scope (shared across projects). The other 19 agents use Project scope.

Project Levels

bkit auto-detects your project type through directory/file scanning:

Level	Detection Signals	Stack	PDCA Phases	Automation
Starter	Default (no special files)	HTML, CSS, JS, Next.js	4 of 6 (skips Check, Act)	L0 (Manual)
Dynamic	`docker-compose.yml`, `.mcp.json`, `lib/bkend`, `prisma/schema.prisma`	Next.js + bkend.ai BaaS	5 of 6 (optional Act)	L1 (Semi-Auto)
Enterprise	`kubernetes/`, `terraform/`, `docker-compose.prod.yml`	K8s, Terraform, Monorepo	All 6 phases	L1 (Semi-Auto)

Force a level with BKIT_PROJECT_LEVEL=Enterprise or via gemini-extension.json.
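The detection signals in the table translate into a short file-existence check. This is a sketch under assumptions: Enterprise signals are checked before Dynamic ones, the environment variable overrides everything, and any non-empty override value is returned as-is. The article confirms none of these ordering details.

```python
import os
import tempfile
from pathlib import Path

# Signals copied from the table; the precedence order is an assumption.
ENTERPRISE_SIGNALS = ("kubernetes", "terraform", "docker-compose.prod.yml")
DYNAMIC_SIGNALS = ("docker-compose.yml", ".mcp.json", "lib/bkend", "prisma/schema.prisma")


def detect_level(root, env=None) -> str:
    """Guess the project level from files under root, honoring BKIT_PROJECT_LEVEL."""
    env = os.environ if env is None else env
    forced = env.get("BKIT_PROJECT_LEVEL")
    if forced:
        return forced
    base = Path(root)
    if any((base / signal).exists() for signal in ENTERPRISE_SIGNALS):
        return "Enterprise"
    if any((base / signal).exists() for signal in DYNAMIC_SIGNALS):
        return "Dynamic"
    return "Starter"  # default when no signal is present


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "docker-compose.yml").write_text("")
        print(detect_level(tmp, env={}))
```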

Auto-detection works in 8 languages. Type “포트폴리오 웹사이트 만들고 싶어요” (Korean for “I want to make a portfolio website”) and bkit triggers the Starter skill automatically.

The 21 Agents with Permission Grades

Agents have permission grades that restrict what they can do:

Grade	Agents	Allowed Tools
READONLY	gap-detector, design-validator	Read-only tools only
DOCWRITE	report-generator, pm-prd	Read + docs/ directory write only
FULL	cto-lead, pdca-iterator, and most others	All tools (except dangerous commands)
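The three grades amount to a small access-control check. The sketch below reduces tools to two hypothetical action kinds, `read` and `write`, and treats paths as relative to the project root. Dangerous-command blocking is a separate mechanism and is not modeled here.

```python
import posixpath


def is_allowed(grade: str, action: str, path: str = "") -> bool:
    """Return True if an agent with this grade may perform the action on the path."""
    if grade == "FULL":
        return True
    if action == "read":
        return grade in ("READONLY", "DOCWRITE")
    if action == "write" and grade == "DOCWRITE":
        # Normalize first so "docs/../src/x" cannot slip through.
        normalized = posixpath.normpath(path)
        return normalized == "docs" or normalized.startswith("docs/")
    return False


if __name__ == "__main__":
    print(is_allowed("DOCWRITE", "write", "docs/04-report/auth.report.md"))
    print(is_allowed("READONLY", "write", "src/auth.ts"))
```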

Each agent uses Gemini native frontmatter with configurable model, tools, temperature, max_turns, and timeout_mins. v2.0.0 uses updated models: gemini-3.1-pro, gemini-3-pro, and gemini-3-flash.

Here are the agents grouped by function:

PDCA Core (5):

Agent	Role	Triggered By
`cto-lead`	CTO-level team orchestration	`/bkit`, PDCA workflows
`gap-detector`	Design-implementation gap analysis (READONLY)	“verify”, “검증”
`pdca-iterator`	Auto-iteration improvement, max 5 loops	“improve”, “개선”
`report-generator`	PDCA completion report (DOCWRITE)	“report”, “보고서”
`design-validator`	Design completeness validation (READONLY)	“validate design”

Code Quality (3):

Agent	Role	Triggered By
`code-analyzer`	Code quality, security, performance	“analyze”, “분석”
`qa-strategist`	Test strategy, QA planning	“test strategy”
`qa-monitor`	Docker log real-time monitoring	“docker logs”

Architecture (4):

Agent	Role	Triggered By
`frontend-architect`	UI/UX, React, Next.js	“component”, “UI”
`security-architect`	OWASP, auth design	“CSRF”, “XSS”, “보안”
`infra-architect`	AWS, K8s, Terraform	“K8s”, “AWS”
`enterprise-expert`	Microservices, AI-native	“CTO”, “microservices”
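The trigger columns in these tables suggest keyword-based routing. The sketch below uses a subset of the listed triggers and plain substring matching, which is an assumption; the article does not describe how bkit's intent detection actually scores or disambiguates messages.

```python
# Subset of the trigger keywords from the tables above, lowercased for matching.
TRIGGERS = {
    "gap-detector": ["verify", "검증"],
    "pdca-iterator": ["improve", "개선"],
    "report-generator": ["report", "보고서"],
    "code-analyzer": ["analyze", "분석"],
    "security-architect": ["csrf", "xss", "보안"],
    "infra-architect": ["k8s", "aws"],
}


def match_agents(message: str) -> list:
    """Return agents whose trigger keywords appear in the message, in table order."""
    text = message.casefold()
    return [
        agent
        for agent, keywords in TRIGGERS.items()
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    print(match_agents("Please verify the login flow"))
    print(match_agents("Check for XSS on AWS"))
```

Substring matching is crude (it would also fire on words that merely contain a keyword), so treat this as a picture of the routing idea only.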

PM Agent Team (5): Accessed via /pdca pm {feature} — 5 agents produce a full PRD with personas, competitive analysis, Lean Canvas, and GTM strategy.

Specialized (4): bkend-expert (28 MCP tools), starter-guide, pipeline-guide, product-manager.

Checkpoints and Rollback

bkit auto-saves a checkpoint at every PDCA phase transition. If something goes wrong, you can roll back:

/rollback list                      # List available checkpoints
/rollback restore {id}              # Restore to a checkpoint
/rollback reset {feature}           # Reset feature to initial state

Emergency stops (Ctrl+C or /control stop) automatically save a checkpoint. This is a safety net — you can always undo.

BTW: Recording Ideas Mid-Workflow

One of the most practical v2.0.0 additions is the BTW (By The Way) system. When you’re in the middle of a PDCA cycle and have an improvement idea, record it without interrupting your flow:

/btw 이 API 응답 시간이 느린데 캐싱 추가하면 좋겠다   # Record an idea (Korean: "This API's response time is slow; adding caching would help")
/btw list                                               # List recorded ideas
/btw analyze                                            # Analyze feasibility
/btw promote {id}                                       # Promote to full PDCA feature
/btw stats                                              # View statistics

Ideas stay out of your current workflow until you’re ready to promote them.

Greenfield Example: Building a Task Management SaaS

# 1. Product discovery (5 PM agents collaborate)
/pdca pm "task management SaaS for remote teams"
# Produces: personas, competitive analysis, Lean Canvas, PRD, GTM strategy

# 2. Initialize as Dynamic project
/dynamic init "taskflow"

# 3. Plan Plus for deeper brainstorming
/plan-plus "user signup and login with Google OAuth"

# 4. PDCA each feature
/pdca plan "user signup and login with Google OAuth"
/pdca design "user signup and login with Google OAuth"
/pdca do "user signup and login with Google OAuth"
/pdca analyze "user signup and login with Google OAuth"
/pdca report "user signup and login with Google OAuth"

# 5. Parallel PDCA (max 3 concurrent features)
/pdca plan "todo CRUD with kanban board"
/pdca plan "real-time notifications"

# 6. Quality review
/review src/
/simplify
/qa

Result: docs/ contains plan, design, analysis, and report for every feature. Each agent remembers context across sessions. Checkpoints are auto-saved at every phase transition.

Brownfield Example: Adding Auth to an Existing Next.js App

cd ~/my-existing-nextjs-app
gemini

# bkit auto-detects Dynamic level
# Rule 5 enforced: "Prefer editing existing files over creating new ones"

/pdca plan "add authentication with role-based access control"
/pdca design "add authentication with role-based access control"
/pdca do "add authentication with role-based access control"
/pdca analyze "add authentication with role-based access control"

# If Match Rate < 90%:
/pdca iterate "add authentication with role-based access control"

# If Match Rate >= 90%:
/pdca report "add authentication with role-based access control"

# Rollback if something went wrong:
/rollback list
/rollback restore {checkpoint-id}

# Final code review
/review src/middleware.ts src/lib/auth.ts

The gap-detector compares the design doc against your existing codebase, accounting for brownfield constraints rather than assuming a clean implementation.

Enterprise Example: Team Orchestration

For Enterprise-level projects, agents form coordinated teams:

/pdca team user-authentication

This triggers a structured team composition:

CTO Lead (cto-lead):
├── Plan Team (Leader pattern)
│   ├── product-manager      → requirements analysis
│   └── security-architect   → security requirements
├── Design Team (Council pattern)
│   ├── frontend-architect   → UI design
│   ├── bkend-expert         → API design
│   └── security-architect   → auth design
├── Check Team (Watchdog pattern)
│   ├── gap-detector         → gap analysis
│   ├── code-analyzer        → code quality
│   └── qa-monitor           → log monitoring
└── Report (Pipeline pattern)
    └── report-generator     → final report

Each pattern serves a different purpose:

Pattern	Architecture	When Used
Leader	1 lead + N workers	Plan and Design phases
Council	Equal peers, independent analysis	Architecture decisions
Swarm	Dynamic pool, parallel execution	Large-scale code review
Pipeline	Sequential chain	PDCA auto-progression
Watchdog	Monitor + actors	QA/security verification

The 9-Phase Development Pipeline

Beyond PDCA, bkit offers a 9-phase pipeline for comprehensive project scaffolding:

Phase	Description	Starter	Dynamic	Enterprise
1. Schema	Terminology and data models	Required	Required	Required
2. Convention	Coding rules and standards	Required	Required	Required
3. Mockup	UI/UX wireframes	Required	Required	Required
4. API	REST endpoint design	Skip	Required	Required
5. Design System	Component library	Skip	Optional	Required
6. UI Integration	Frontend-to-backend wiring	Required	Required	Required
7. SEO/Security	Hardening	Skip	Optional	Required
8. Review	Architecture review	Skip	Optional	Required
9. Deployment	CI/CD and production	Required	Required	Required

/development-pipeline start          # Start from phase 1
/development-pipeline status         # Show current phase
/development-pipeline next           # Advance to next phase
/phase-4-api                         # Jump to a specific phase

Additional Commands

# Code quality
/review <path>                       # Code review with code-analyzer agent
/simplify                            # Code quality review and complexity reduction
/qa                                  # Zero-script QA via Docker log monitoring

# Automation
/loop 5m /pdca status                # Recurring command execution

# Audit trail
/audit                               # View full audit log
/audit {feature}                     # Audit specific feature
/audit decisions                     # View all recorded decisions

# Custom skills
/skill-create                        # Create project-specific skill (hot-reload supported)

The audit log tracks decisions, tool usage, phase transitions, agent activity, and security events — all stored in .bkit/audit/.

Output Styles

Choose a response style that matches your experience level:

/output-style-setup

Style	Default For	Characteristics
`bkit-learning`	Starter	Beginner-friendly, “Why?” sections, code comments, tips
`bkit-pdca-guide`	Dynamic	PDCA-centric, phase indicators, action→result→next flow
`bkit-enterprise`	Enterprise	Efficiency-first, minimal explanation, technical terminology
`bkit-pdca-enterprise`	Enterprise (optional)	PDCA + Enterprise combined

Set via /output-style-setup or the BKIT_OUTPUT_STYLE environment variable.

23 Built-in Tools

bkit maps and manages Gemini CLI’s 23 built-in tools, including file management (7), execution (1), info search (2), agent coordination (5), plan mode (2), and Task Tracker tools (6, requires v0.32.0+).

Environment Variables

Variable	Purpose
`BKIT_PROJECT_LEVEL`	Override auto-detected project level
`BKIT_PDCA_AUTOMATION`	Set automation level (e.g., `full-auto`)
`BKIT_OUTPUT_STYLE`	Set output style

When to Use bkit

bkit is worth the setup overhead when:

You’re building features that benefit from design docs and gap analysis
You want consistent development processes enforced automatically
You’re working with Gemini CLI and want more structure than raw prompting
You’re building a greenfield project that needs systematic scaffolding
You’re adding features to a brownfield project and need design-to-implementation traceability
You want AI autonomy with guardrails (L2-L4 automation levels)
You need audit trails and rollback safety for AI-assisted development

When bkit Is Overkill

Quick scripts and one-off tasks
Projects that don’t benefit from formal PDCA cycles
Teams already using a different project management methodology
Environments where you can’t install Gemini CLI extensions
Developers who prefer fully manual control over every AI action

Key Takeaways

PDCA is enforced by hooks — you can’t skip planning, even if you try
Context Engineering is architectural — 71% token savings from phase-aware context loading
Automation levels L0-L4 — start manual, scale to autonomous as you gain trust
Checkpoints and rollback — every phase transition auto-saves, emergency stop included
Agent permission grades — READONLY, DOCWRITE, and FULL prevent agents from overstepping
BTW system — record improvement ideas without interrupting your current workflow
Parallel PDCA — up to 3 features simultaneously via /pdca batch
Multi-language auto-detection — works in 8 languages, triggers agents automatically
Team orchestration — 5 patterns for coordinated multi-agent workflows (Enterprise)
Audit trail — full logging of decisions, tool usage, phase transitions, and security events

bkit turns Gemini CLI from a raw AI assistant into a structured development environment. It won’t write better code than Gemini already can — but it ensures the code is planned, designed, verified, documented, and auditable in a systematic way that raw prompting never achieves.

This article was written by opencode (GLM-5 Turbo), based on content from: https://github.com/popup-studio-ai/bkit-gemini and the bkit-gemini v2.0.0 User Guide