# Kanji Review — MCQ Redesign Plan (Single-Reading Options)

> **Status:** Proposed design & implementation plan  
> **Date:** 2026-06-22  
> **Scope:** `kanji` review flow, SRS integration, question generation, UI, data model, rollout.  
> **Related:** current flip-card review (`frontend/app/kanji/review/page.tsx`), `KanjiController@queue` + `@store`, `SrsService`, `AssessmentService` (existing kanji MCQ patterns), `GameContentService`.

---

## 1. Current State (Summary)

**Existing Kanji Review (SRS flip-card):**
- Queue: due `UserKanji` + new kanji backfill (`/api/v1/kanji/queue`).
- UI: front = character (+ radical hint); back = full `onyomi` + full `kunyomi` lists, `explanation`, `sample_words`, SRS interval.
- Grading: pure self-report (`remember` / `forgot`) → `SrsService::processReview` + `KanjiReview` row.
- Problem: learner sees the *entire list* at once → recognition of the string, not precise retrieval of individual readings. No objective correctness signal.

**Existing MCQ patterns elsewhere:**
- `AssessmentService::kanjiQuestions`: character → one `firstReading()` (prefers onyomi). Distractors from same-level kanji readings via `mcqChoices`.
- `GameContentService::kanjiQuestions`: character → `meaning`; distractors by radical / jlpt / global pool.
- `firstReading()` splits on `・,、\s` and strips okurigana dots.

**Data reality:**
- `onyomi` / `kunyomi` are multi-value strings with inconsistent separators (e.g. `"した、しも、もと、さ.げる,..."`, or `·`-separated after sanitization).
- `meaning` column exists (nullable) + richer `explanation`.
- Many kanji have zero or one kunyomi or onyomi reading; a few have five or more.

---

## 2. Evaluation of the Proposed Concept

**Core idea:** replace (or augment) flip-card with 4-option MCQ. For each kanji present three question types:
- Onyomi (pick the correct single reading)
- Kunyomi (pick the correct single reading)
- Meaning

**Rule:** every option contains **exactly one clean reading** (no full lists).

### 2.1 Strengths
- **Precision training**: forces discrimination of one specific reading instead of pattern-matching a comma list.
- **Objective signal**: MCQ yields true/false before any self-grade.
- **Coverage**: explicitly practices On / Kun / Meaning triad.
- **Fits existing infrastructure**: distractor logic and `mcqChoices` already exist.

### 2.2 Risks & Weaknesses
| Risk | Impact | Why |
|------|--------|-----|
| Context-free single readings | High | Japanese readings are context-dependent. Teaching "the" onyomi for a kanji can create brittle memory. |
| Distractor quality | High | Trivial or phonetically implausible distractors → shallow learning. |
| Recognition vs production | Medium-High | MCQ is recognition-heavy; long-term retention benefits from recall/production. |
| SRS semantics change | Medium | Current SRS is "know this kanji holistically". Per-aspect questions need clear mapping to `UserKanji` state. |
| Data sparsity | Medium | Kanji with only one reading or no `meaning` need graceful fallbacks. |
| Over-testing rare readings | Medium | Learners may be drilled on low-frequency readings while core ones are neglected. |
| Cognitive load / fatigue | Medium | 3 questions per kanji can feel repetitive if not interleaved well. |

**Verdict:** The single-reading MCQ rule is pedagogically sound **as one tool**, not the only tool. A systematic design must combine it with contextual, production, and spaced elaboration elements.

---

## 3. Proposed Design (Systematic & Retention-Focused)

### 3.1 Question Taxonomy (not just 3 fixed types)

| Type | Prompt | Answer format | When to use | Pedagogy goal |
|------|--------|---------------|-------------|---------------|
| **On-Discriminate** | Kanji → "Onyomi mana yang benar?" ("Which onyomi is correct?") | 4 single katakana strings (1 correct) | Always available when onyomi exists | Precise onyomi retrieval |
| **Kun-Discriminate** | Kanji → "Kunyomi mana yang benar?" ("Which kunyomi is correct?") | 4 single hiragana strings (1 correct) | When kunyomi exists | Precise kunyomi retrieval |
| **Meaning-Discriminate** | Kanji → "Arti yang tepat?" ("Which meaning is correct?") | 4 short Indonesian/English glosses | Always | Semantic anchoring |
| **Contextual-Reading** (high value) | Word containing kanji (e.g. 生きる) → "Pembacaan 生 di sini?" ("Reading of 生 here?") | 4 single readings | When sample_words or linked cards exist | Ecological validity + context |
| **Okurigana-Choice** | Stem + blank (e.g. 生___) → choose correct ending | Mixed reading+okurigana | Kunyomi with okurigana | Morphological precision |
| **Production (typed)** | Kanji shown → type one valid kunyomi (or onyomi) | Free text (fuzzy match) | Advanced / after MCQ success | Recall over recognition |

**Interleaving rule (recommended):**
- In a session of N kanji, generate 1–2 questions per kanji.
- Prefer **variety**: never ask On + Kun + Meaning for the *same* kanji back-to-back.
- Use a small state machine per session: track which facets have been practiced for the current kanji.
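
The interleaving rule above can be sketched as a small session planner. This is a minimal TypeScript sketch under stated assumptions, not the project's implementation: `Facet`, `SessionKanji`, `PlannedQuestion`, and `planSession` are hypothetical names, and the round-based ordering is only one way to keep two questions for the same kanji apart.

```typescript
type Facet = "on" | "kun" | "meaning";

interface SessionKanji {
  id: number;
  facets: Facet[]; // facets this kanji has data for
}

interface PlannedQuestion {
  kanjiId: number;
  facet: Facet;
}

// Plans up to `perKanji` questions per kanji. Each round visits every kanji once,
// so a kanji's second question normally lands a full round after its first.
function planSession(kanji: SessionKanji[], perKanji: number): PlannedQuestion[] {
  const practised = new Map<number, Set<Facet>>(); // per-session facet state
  const plan: PlannedQuestion[] = [];

  for (let round = 0; round < perKanji; round++) {
    kanji.forEach((k, index) => {
      if (k.facets.length === 0) return;
      const done = practised.get(k.id) ?? new Set<Facet>();

      // Rotate the starting facet by position so one round mixes on/kun/meaning.
      const offset = index % k.facets.length;
      const rotated = [...k.facets.slice(offset), ...k.facets.slice(0, offset)];
      const next = rotated.find((f) => !done.has(f));
      if (next === undefined) return; // every facet already practised

      // Drop the question rather than ask the same kanji twice in a row.
      const last = plan[plan.length - 1];
      if (last?.kanjiId === k.id) return;

      done.add(next);
      practised.set(k.id, done);
      plan.push({ kanjiId: k.id, facet: next });
    });
  }
  return plan;
}
```

With `perKanji = 2`, a kanji that would otherwise be asked twice in a row simply gets one question, which stays within the 1–2 questions per kanji range.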

### 3.2 Distractor Generation (quality first)

Single-reading extraction helper (backend):
```php
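// Returns cleaned individual readings, e.g. ["した", "しも", "もと", "さげる", ...].
// Splits on every separator seen in the data (・ · , 、 whitespace) and removes okurigana dots.
$readings = collect(preg_split('/[・·,、\s]+/u', $raw, -1, PREG_SPLIT_NO_EMPTY))
    ->map(fn ($r) => str_replace('.', '', $r))->filter()->unique()->values();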
```

Distractor pools (priority order):
1. Same JLPT level + same radical (hardest & most useful confusions).
2. Same JLPT level.
3. Phonetic neighbors (for readings): e.g. similar mora onsets, long/short vowel, voiced/unvoiced.
4. Global same-type pool (onyomi pool vs kunyomi pool) with frequency or "common kanji" bias.
5. Semantic neighbors for meaning questions (from `explanation` or `meaning`).

**Invariants:**
- Exactly 3 distractors + 1 correct.
- All 4 options must be **distinct**.
- Never leak full multi-reading strings.
- For contextual questions, distractors should be plausible in that word's domain when possible.
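
To make the invariants concrete for reading questions, here is a hedged TypeScript sketch of option assembly from prioritized pools. `buildOptions` and its parameters are hypothetical; the pools are assumed to arrive already cleaned and in the priority order listed above, and the `alsoValid` exclusion (other readings of the same kanji) is an added assumption so that no distractor is accidentally correct.

```typescript
// Builds the 4 options for one question, or returns null when the pools cannot
// supply 3 valid distractors (the caller would then fall back to another type).
// `pools` are given in priority order; `alsoValid` lists the target kanji's other
// readings, which are excluded so no distractor is secretly correct (assumption).
function buildOptions(
  correct: string,
  pools: string[][],
  alsoValid: string[] = [],
  rng: () => number = Math.random,
): string[] | null {
  const multiReading = /[、,・·\s]/; // separators that signal an uncleaned list
  const used = new Set<string>([correct, ...alsoValid]);
  const distractors: string[] = [];

  for (const pool of pools) {
    for (const candidate of pool) {
      if (distractors.length === 3) break;
      if (candidate === "" || used.has(candidate)) continue; // keep options distinct
      if (multiReading.test(candidate)) continue; // never leak a full list
      used.add(candidate);
      distractors.push(candidate);
    }
  }
  if (distractors.length < 3) return null;

  // Decorate-sort shuffle so the correct answer has no fixed position.
  return [correct, ...distractors]
    .map((value) => ({ value, key: rng() }))
    .sort((a, b) => a.key - b.key)
    .map((entry) => entry.value);
}
```

Returning `null` instead of padding with weak candidates keeps the "exactly 3 distractors" invariant explicit and leaves the fallback decision to the caller. The separator check suits readings only; meaning glosses containing spaces would need a different guard.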

### 3.3 Grading & SRS Integration (critical)

Current flow (self-grade only) is insufficient for MCQ.

**New contract for kanji MCQ review:**
1. User answers MCQ → client records `isCorrect` + `responseTimeMs`.
2. POST to backend includes both:
   ```json
   { "question_type": "on|kun|meaning|contextual", "selected": "...", "is_correct": true, "rt_ms": 2340 }
   ```
3. Backend decides grade (SrsService):
   - `is_correct === false` → cannot be better than `hard`.
   - `is_correct && rt_ms > 8000` → `hard` (slow).
   - `is_correct && rt_ms <= 3000` → `good` or `easy` based on prior history.
   - `is_correct && 3000 < rt_ms <= 8000` → grade not specified in this plan; `good` is a plausible default, to be confirmed before implementation.
   - First exposure: graduated intra-day interval (as already in `handleGood`).
4. Still allow a light self-grade override in some modes (for metacognition), but **clamp** it.
5. Persist a `KanjiReview` row with `was_correct`, `question_type`, `selected_answer`, `correct_answer`.
6. Update `UserKanji` via `SrsService::processReview` (reuse existing columns; no schema change required initially).
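
The grading contract can be sketched as a pure function. This TypeScript sketch is illustrative only (the real logic would live in `SrsService` on the backend). Several details are assumptions not fixed by this plan: the four-grade scale including `again`, the default grade for a miss, the `good` grade for the 3000–8000 ms band, and the use of a `priorCorrectStreak` count as the "prior history" signal.

```typescript
type McqGrade = "again" | "hard" | "good" | "easy";

interface McqOutcome {
  isCorrect: boolean;
  rtMs: number;
  priorCorrectStreak: number; // stand-in for "prior history" (assumption)
}

const GRADE_ORDER: McqGrade[] = ["again", "hard", "good", "easy"];

// Grade derived from the objective signal alone.
function objectiveGrade(o: McqOutcome): McqGrade {
  if (!o.isCorrect) return "again"; // assumption: a miss defaults to the lowest grade
  if (o.rtMs > 8000) return "hard"; // correct but slow
  if (o.rtMs <= 3000) return o.priorCorrectStreak >= 3 ? "easy" : "good";
  return "good"; // assumption: the plan leaves the 3000-8000 ms band open
}

// Highest grade a self-report may reach: a wrong answer is capped at "hard".
function gradeCeiling(o: McqOutcome): McqGrade {
  return o.isCorrect ? "easy" : "hard";
}

// Applies the optional self-grade override, clamped to the ceiling.
function finalGrade(o: McqOutcome, selfGrade?: McqGrade): McqGrade {
  if (selfGrade === undefined) return objectiveGrade(o);
  const ceiling = gradeCeiling(o);
  const overCeiling = GRADE_ORDER.indexOf(selfGrade) > GRADE_ORDER.indexOf(ceiling);
  return overCeiling ? ceiling : selfGrade;
}
```

For example, a learner who answers wrongly and then self-grades `easy` ends up at `hard`, which is the clamp described in step 4.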

**Per-facet mastery (future, optional):**
- Add nullable columns or JSON `facet_state` on `user_kanjis`:
  ```json
  { "on": { "repetition": 3, "interval_days": 4, ... }, "kun": {...}, "meaning": {...} }
  ```
- Or keep holistic `UserKanji` and derive "weak facet" signals from `KanjiReview` aggregates for queue prioritization.
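
The second option (holistic `UserKanji` plus derived signals) could look roughly like the sketch below. It is written in TypeScript for illustration and assumes review rows expose the proposed `question_type` and `was_correct` fields. `ReviewRow`, `facetStats`, `weakestFacet`, and the `minAttempts` threshold are hypothetical.

```typescript
type ReviewFacet = "on" | "kun" | "meaning" | "contextual";

interface ReviewRow {
  questionType: ReviewFacet; // mirrors the proposed `question_type` column
  wasCorrect: boolean; // mirrors the proposed `was_correct` column
}

interface FacetStat {
  correct: number;
  total: number;
  accuracy: number; // correct / total
}

// Aggregates review rows into per-facet accuracy.
function facetStats(rows: ReviewRow[]): Map<ReviewFacet, FacetStat> {
  const stats = new Map<ReviewFacet, FacetStat>();
  for (const row of rows) {
    const stat = stats.get(row.questionType) ?? { correct: 0, total: 0, accuracy: 0 };
    stat.total += 1;
    if (row.wasCorrect) stat.correct += 1;
    stat.accuracy = stat.correct / stat.total;
    stats.set(row.questionType, stat);
  }
  return stats;
}

// Returns the facet with the lowest accuracy, ignoring facets with too few
// attempts to be meaningful. Ties keep the facet seen first.
function weakestFacet(rows: ReviewRow[], minAttempts = 3): ReviewFacet | null {
  let weakest: ReviewFacet | null = null;
  let lowest = Number.POSITIVE_INFINITY;
  facetStats(rows).forEach((stat, facet) => {
    if (stat.total >= minAttempts && stat.accuracy < lowest) {
      lowest = stat.accuracy;
      weakest = facet;
    }
  });
  return weakest;
}
```

The queue could then prefer the weakest facet when choosing the next question type for that kanji, without any schema change.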

### 3.4 Session Structure & UI

**Modes:**
- **Classic SRS** (keep): flip-card for users who want pure recognition + elaboration.
- **MCQ Drill** (new): primary active-recall mode.
- **Mixed** (recommended default): 60% MCQ questions + 40% flip or production.

**UI flow (MCQ page):**
1. Session setup (same 5/10/20/100 limits).
2. One question at a time.
3. Clear prompt: "Pilih onyomi yang benar untuk 生" ("Choose the correct onyomi for 生"), or the equivalent for "Kunyomi", "Arti" (meaning), or a contextual sentence.
4. 4 large tappable options (single reading / short meaning). No lists.
5. Immediate feedback:
   - Correct → green + 1-line usage example from `sample_words`.
   - Wrong → show correct + why (e.g. "この文脈では訓読み「い.きる」", i.e. "in this context it takes the kun reading い.きる").
6. Optional "Show full readings" disclosure (after answering) — prevents list memorization during the question.
7. Show SRS self-grade buttons only if we want the metacognitive layer, with the self-grade clamped as described in 3.3.
8. Keyboard: 1-4 for choices, Space for next.
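
The keyboard shortcuts in step 8 can be kept testable by mapping key presses to actions in a pure function. This is a small TypeScript sketch: `ReviewAction` and `keyToAction` are hypothetical names, and the rules that choices lock after answering and that Space only advances after an answer are assumptions.

```typescript
type ReviewAction =
  | { kind: "choose"; index: number }
  | { kind: "next" }
  | { kind: "none" };

// Maps a KeyboardEvent.key value to a review action.
// `answered` is true once the current question has been answered.
function keyToAction(key: string, answered: boolean, optionCount = 4): ReviewAction {
  if (!answered && /^[1-9]$/.test(key)) {
    const index = Number(key) - 1; // "1" selects the first option
    if (index < optionCount) return { kind: "choose", index };
  }
  if (answered && key === " ") return { kind: "next" }; // Space reports key === " "
  return { kind: "none" };
}
```

A React key handler would call `keyToAction(event.key, answered)` and dispatch on `kind`.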

**Progress:**
- Track facets completed per kanji in the session.
- Completion screen: breakdown by facet accuracy + weakest facet callout.

### 3.5 Data & Backend Changes (minimal at first)

**No breaking schema changes required for Phase 1.**
- Reuse `KanjiReview` (add `question_type`, `selected_answer`, `correct_answer` as nullable — or store in JSON `meta`).
- Enhance `KanjiController` or create `KanjiReviewService`:
  - `generateQuestionsForKanji(Kanji $k, int $count = 1): array`
  - Support filtering by `question_type` preference.
- Extract reusable `KanjiReadingExtractor` (single readings, primary vs variants).
- Distractor pool service (can initially be simple queries, later cached + frequency-weighted).

**Migration (Phase 2 if per-facet SRS desired):**
- Add `facet_mastery` JSON to `user_kanjis` (or three separate tiny tables).

### 3.6 Pedagogical Safeguards for Long-Term Retention

1. **Elaboration after every answer** (example + short rule).
2. **Spaced interleaving** across kanji and facets (avoid blocking).
3. **Production ramp**: after 3–4 successful MCQ on a reading, insert a typed-production item.
4. **Error-driven review**: wrong answers get the kanji re-queued sooner with a targeted facet.
5. **Context injection**: at least 30% of questions should be contextual (word-level) once sample data is rich.
6. **Avoid list exposure during testing**: full on/kun lists shown only post-answer or on demand.
7. **Metacognition**: occasional "How confident are you?" (1–5) before reveal — improves calibration.
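
For the production ramp (item 3), a typed answer needs a tolerant comparison. The TypeScript sketch below assumes that "fuzzy match" means kana normalization (katakana folded to hiragana, okurigana dots and whitespace ignored) rather than edit distance, and that any valid reading of the requested type is accepted. All function names and the default threshold of 3 are hypothetical.

```typescript
// Folds katakana (U+30A1..U+30F6) onto the matching hiragana code points.
function toHiragana(text: string): string {
  return text.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60),
  );
}

// Canonical form used for comparison: NFKC, hiragana, no dots or whitespace.
function normalizeReading(text: string): string {
  return toHiragana(text.normalize("NFKC").trim()).replace(/[.\s]/g, "");
}

// True when the typed answer matches any valid reading after normalization.
function isValidProduction(typed: string, validReadings: string[]): boolean {
  const answer = normalizeReading(typed);
  if (answer.length === 0) return false;
  return validReadings.some((reading) => normalizeReading(reading) === answer);
}

// Production ramp: insert a typed item once the MCQ streak reaches the threshold.
function shouldInsertProduction(mcqStreak: number, threshold = 3): boolean {
  return mcqStreak >= threshold;
}
```

Accepting either script avoids penalizing learners for the input mode of their keyboard. Whether onyomi typed in hiragana should count is a product decision that this sketch does not settle.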

---

## 4. Phased Implementation Plan

### Phase 0 — Foundations (1–2 days)
- Implement `KanjiReadingExtractor::singleReadings(string $raw, string $type): array`, where `$type` is `'on'` or `'kun'`.
- Port/adapt `firstReading` + `mcqChoices` into a dedicated `KanjiQuestionBank` service.
- Add unit tests for extraction + distractor uniqueness.

### Phase 1 — Core MCQ Review (backend + simple UI)
- New endpoint: `GET /api/v1/kanji/mcq-queue?limit=10&types=on,kun,meaning`.
- `POST /api/v1/kanji/{kanji}/mcq-submit` with correctness (`is_correct`) and response time (`rt_ms`).
- Map to `SrsService` + `KanjiReview`.
- Minimal React page: `/kanji/review/mcq` (copy structure from current review, replace flip with 4 buttons).
- Keep existing flip-card review untouched.
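
On the client side, the two Phase 1 endpoints could be wrapped in small typed helpers. The TypeScript sketch below reuses the payload fields from section 3.3. The helper names are hypothetical, and treating `{kanji}` as a numeric id and measuring response time from two client timestamps are assumptions.

```typescript
type McqQuestionType = "on" | "kun" | "meaning" | "contextual";

interface McqSubmitPayload {
  question_type: McqQuestionType;
  selected: string;
  is_correct: boolean;
  rt_ms: number;
}

// Path for the queue request. Commas stay unencoded to match the
// `types=on,kun,meaning` form used in this plan.
function mcqQueuePath(limit: number, types: McqQuestionType[]): string {
  return `/api/v1/kanji/mcq-queue?limit=${limit}&types=${types.join(",")}`;
}

// Path for submitting one answer (assumption: {kanji} is a numeric id).
function mcqSubmitPath(kanjiId: number): string {
  return `/api/v1/kanji/${kanjiId}/mcq-submit`;
}

// Builds the submit body from what the client observed.
function buildSubmitPayload(
  questionType: McqQuestionType,
  selected: string,
  correctAnswer: string,
  shownAtMs: number,
  answeredAtMs: number,
): McqSubmitPayload {
  return {
    question_type: questionType,
    selected,
    is_correct: selected === correctAnswer,
    rt_ms: Math.max(0, Math.round(answeredAtMs - shownAtMs)),
  };
}
```

The page would send `buildSubmitPayload(...)` as the JSON body of a POST to `mcqSubmitPath(id)`, using whatever HTTP client the existing review page already uses.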

### Phase 2 — Quality & Context
- Add contextual questions using `sample_words` and linked cards.
- Improve distractors (radical + phonetic similarity).
- Rich feedback panel (example sentence + reading rule).
- Session stats by facet.

### Phase 3 — SRS & Analytics
- Derive "weak facet" signals from review history.
- Optional per-facet state on `UserKanji`.
- Show in kanji detail modal: "Onyomi strength: 82%".
- Blend MCQ kanji items into Daily Mix.

### Phase 4 — Production & Polish
- Typed production items.
- A/B test MCQ vs classic (retention after 7/30 days via review accuracy & due counts).
- Admin tools to curate "primary" readings or ban bad distractors.

---

## 5. Risks & Mitigations

- **Brittle single-reading memory**: Mitigate with contextual questions + post-answer full-list disclosure + production items.
- **Bad distractors making questions trivial**: Invest in multi-signal distractor selection; log low-discrimination items.
- **SRS pollution**: Always send objective `is_correct`; clamp grades.
- **Performance on queue generation**: Pre-compute or cache common distractor pools per level. No AI calls are needed on this path, so there is no AI cost to bound.
- **User resistance to change**: Keep classic flip mode; default to mixed; provide "Why this format?" tooltip.

---

## 6. Success Metrics

- **Objective accuracy** on MCQ questions rises over time (per user + global).
- **Transfer to production**: % of correct typed-production items after MCQ training.
- **Retention signal**: lower re-due rate (fewer "forgot" / lapses) 7–30 days later vs classic-only cohort.
- **Facet balance**: users show a gap of less than 15 percentage points between their strongest and weakest facet accuracy.
- **Engagement**: session completion rate and time per question (very fast answers suggest guessing, very slow ones suggest frustration).

---

## 7. Open Questions

1. Should we allow "any valid reading" as correct (multiple correct answers), or always designate one primary per question?
2. Do we want per-facet SRS state, or keep holistic `UserKanji` and only use facet data for prioritization?
3. How aggressively to push production (typing) vs pure MCQ?
4. Should "meaning" questions use Indonesian, English, or both?
5. Frequency weighting for distractors: do we prefer common kanji readings or level-matched?

---

## 8. References (code & docs)

- Current review UI: `frontend/app/kanji/review/page.tsx`
- Queue & review persistence: `backend/app/Http/Controllers/KanjiController.php` (queue, store)
- SRS: `backend/app/Services/SrsService.php`
- Existing MCQ helpers: `backend/app/Services/AssessmentService.php` (`kanjiQuestions`, `firstReading`, `mcqChoices`)
- Game kanji MCQ (meaning-focused): `backend/app/Services/GameContentService.php`
- Data shape: `backend/database/data/kanji.json`, `kanjis` table + `user_kanjis`, `kanji_reviews`
- Similar plans: `daily-mix-feature-analysis.md`, `assessment-item-bank-plan.md`, `onboarding-revamp-plan.md`

---

**Next step recommendation:** implement Phase 0 + Phase 1 behind a feature flag (`kanji_mcq_review`). Run a small internal pilot (10 users) before wider rollout.