# Daily Mix — Feature Analysis & Improvement Plan

_Last reviewed: 2026-06-12_

## 1. What Daily Mix is

Daily Mix is a **cross-domain interleaved review session**. Instead of practicing
one skill at a time, it blends three item types into a single shuffled queue:

| Type        | Source                                                 | Interaction in UI                               |
|-------------|--------------------------------------------------------|-------------------------------------------------|
| `flashcard` | SRS-due `UserCard`s + new `Card`s                      | 4-option MCQ (kana / meaning) → SRS self-grade  |
| `sentence`  | SRS-due `UserSentence`s + new `Sentence`s              | Type the Japanese translation → SRS self-grade  |
| `particle`  | Generated via `ParticlePracticeService` (AI fallback)  | Cloze MCQ / scramble / error-correction         |

### Request flow

- **Frontend:** [frontend/app/daily-mix/page.tsx](../../frontend/app/daily-mix/page.tsx)
  fetches `GET /api/v1/daily-mix/queue?focus=<mode>&t=<ts>` once on mount, renders
  items one at a time, and posts each answer to `POST /api/v1/daily-mix/submit`.
- **Backend:** [DailyMixController](../app/Http/Controllers/DailyMixController.php)
  builds the queue (`queue()`) and routes grading per type (`submit()`).
- **Routes:** [routes/v1/dashboard.php](../routes/v1/dashboard.php) lines 13–15.

### Queue composition

`calculateDynamicLimits()` decides how many of each type to include:

- **Explicit focus** (`flashcard` / `sentence` / `particle`): that type gets 20 slots,
  the others 5 each.
- **Balanced (adaptive):** inspects the user's struggle signals; limits are listed as
  flashcard / sentence / particle:
  - `> 3` particles with `incorrect_count > 5` → particle-heavy (10 / 5 / 15).
  - sentence lapses `>` 2× vocab lapses → sentence-heavy (8 / 17 / 5).
  - vocab lapses `> 20` → vocab-heavy (20 / 6 / 4).
  - otherwise default 15 / 10 / 5.
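
The branching above can be read as a small decision function. The following is a TypeScript sketch of that logic, not the PHP implementation in `calculateDynamicLimits()`. The `StruggleSignals` field names are hypothetical, and it assumes the balanced checks are evaluated in the order listed, with the first match winning.

```typescript
type Focus = 'balanced' | 'flashcard' | 'sentence' | 'particle';

interface Limits {
  flashcard: number;
  sentence: number;
  particle: number;
}

// Hypothetical input shape; the real controller derives these from the database.
interface StruggleSignals {
  strugglingParticles: number; // particles with incorrect_count > 5
  sentenceLapses: number;
  vocabLapses: number;
}

function calculateLimits(focus: Focus, s: StruggleSignals): Limits {
  // Explicit focus: the chosen type gets 20 slots, the others 5 each.
  if (focus !== 'balanced') {
    const limits: Limits = { flashcard: 5, sentence: 5, particle: 5 };
    limits[focus] = 20;
    return limits;
  }
  // Balanced: first matching struggle signal wins (assumed evaluation order).
  if (s.strugglingParticles > 3) return { flashcard: 10, sentence: 5, particle: 15 };
  if (s.sentenceLapses > 2 * s.vocabLapses) return { flashcard: 8, sentence: 17, particle: 5 };
  if (s.vocabLapses > 20) return { flashcard: 20, sentence: 6, particle: 4 };
  return { flashcard: 15, sentence: 10, particle: 5 };
}
```

Note that with zero vocab lapses, a single sentence lapse already satisfies the "2× vocab lapses" condition, which may or may not be the intended behavior.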

Flashcards reuse `ReviewQueueService::buildQueue` (overdue-first, active-pool cap,
new-card backfill by JLPT composition order). MCQ distractors are drawn from a random
50-card window. Sentences are due-first, then backfilled with the user's never-seen
sentences. Particle questions are generated one at a time in a loop.

### Grading

- `flashcard` → `SrsService::processReview` + a `Review` row; snoozes until tomorrow on
  `good`/`easy` or after 2 reviews the same day. Dashboard cache invalidated.
- `sentence` → `SrsService::processReview` + a `SentenceReview` row; snoozes on
  `good`/`easy`. Dashboard cache invalidated.
- `particle` → `ParticlePracticeService::updateSRS` + optional `recordAnswerOnQuestion`.

---

## 2. Strengths

- Clean separation: the controller orchestrates, business logic stays in services.
- Interleaving across domains is good pedagogy (mixed practice beats blocked practice).
- Adaptive focus and an explicit focus selector on the completion screen are nice touches.
- Solid empty / loading / complete states; cache-busting timestamp on fetch.
- Reuses the canonical `ReviewQueueService`, so flashcard selection stays consistent
  with the rest of the app (active-pool cap, mastery exclusion, JLPT ordering).

---

## 3. Issues & Bugs

### 3.1 High — MCQ / typed answers are cosmetic; grade is fully self-reported
For **flashcards** and **sentences**, the user picks an MCQ option (or types a
translation), but the recorded grade comes only from the subsequent
`again/hard/good/easy` self-grade buttons. The objectively-known correctness
(`isCorrect` at [page.tsx:305](../../frontend/app/daily-mix/page.tsx#L305),
`handleSentenceCheck` at [page.tsx:226](../../frontend/app/daily-mix/page.tsx#L226))
is computed and shown but **never sent to the server**. A user who picks the wrong
MCQ answer can still tap "Easy". This undercuts SRS accuracy and the adaptive
lapse-based focus logic.

> **Fix:** send the measured correctness alongside the self-grade, and either
> (a) clamp the grade (a wrong MCQ cannot be graded better than `hard`), or
> (b) for MCQ-style flashcards drop the self-grade entirely and derive the grade
> from correctness + response time, like a typical quiz. At minimum, persist the
> objective result so analytics aren't blind to it.
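
Option (a) can be expressed as a pure function. This is a minimal sketch, assuming the four grade names used by the self-grade buttons; `clampGrade` is a hypothetical helper name, and where the clamp runs (client, server, or both) is left open.

```typescript
type Grade = 'again' | 'hard' | 'good' | 'easy';

// Option (a): a wrong objective answer caps the self-grade at `hard`.
// A correct answer leaves the self-grade untouched.
function clampGrade(selfGrade: Grade, wasCorrect: boolean): Grade {
  if (wasCorrect) return selfGrade;
  return selfGrade === 'again' ? 'again' : 'hard';
}
```

Applying the same clamp on the server keeps a modified client from bypassing it.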

### 3.2 High — Particle generation is synchronous and serial (latency / cost)
`queue()` calls `particleService->generateQuestion()` in a `for` loop up to the
particle limit — **up to 20 iterations on particle-focus**, each potentially an
OpenAI call (per `ParticlePracticeService`'s AI fallback). These run sequentially
inside the request, so the user can wait many seconds and the queue endpoint risks
timeouts. There's no batching, no cap on AI fallbacks per request, and no caching.

> **Fix:** (a) prefer DB-verified questions and bound AI fallbacks to a small number
> (e.g. ≤ 2) per request; (b) generate particle questions in a queued job / pre-warm a
> per-user pool so the request only reads from the DB; (c) add a hard time budget.
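
The combination of (a) and (c) might look like the sketch below: read from the pool first, then allow a bounded number of fallback generations while a time budget holds. All names here are hypothetical, and the generator is synchronous only to keep the example short; the real fallback is a network call.

```typescript
interface ParticleQuestion {
  id: number;
  source: 'pool' | 'ai';
}

interface FillOptions {
  limit: number;          // particle slots requested for this queue
  maxAiFallbacks: number; // e.g. 2, per the fix above
  budgetMs: number;       // hard time budget for the whole fill
  now: () => number;      // injected clock, e.g. Date.now
}

// Fills particle slots from a pre-warmed pool first, then falls back to a
// bounded number of generator calls while the time budget allows.
function fillParticleSlots(
  pool: ParticleQuestion[],
  generate: () => ParticleQuestion | null,
  opts: FillOptions,
): ParticleQuestion[] {
  const startedAt = opts.now();
  const picked = pool.slice(0, opts.limit);
  let fallbacks = 0;
  while (
    picked.length < opts.limit &&
    fallbacks < opts.maxAiFallbacks &&
    opts.now() - startedAt < opts.budgetMs
  ) {
    fallbacks++; // failed generations also count against the cap
    const q = generate();
    if (q !== null) picked.push(q);
  }
  return picked; // may be shorter than `limit`; the caller handles the shortfall
}
```

With a pool of 3 and a particle-focus limit of 20, this returns at most 5 items instead of making 17 generation calls.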

### 3.3 Medium — No server-side session; progress lost on reload
The queue lives only in React state. A refresh, navigation, or crash mid-session
loses all progress and re-fetches a brand-new shuffled queue. Submitted answers
are persisted individually, so SRS state is safe, but the *session* is not resumable.

> **Fix:** persist a lightweight session (id + remaining item ids + cursor) or at least
> store the queue in `sessionStorage` keyed by a session id so reloads resume.
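
A client-side version of this fix could be as small as the sketch below. It assumes a single fixed storage key that holds the session id alongside the item ids and cursor (a simplification of "keyed by a session id"); the key name and the `MixSession` shape are hypothetical. The storage contract is declared locally so the code also runs outside a browser.

```typescript
// Minimal storage contract; the browser's `sessionStorage` satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface MixSession {
  sessionId: string;
  itemIds: string[]; // queue order as first served
  cursor: number;    // index of the next unanswered item
}

const STORAGE_KEY = 'daily-mix-session'; // hypothetical key name

function saveSession(store: KeyValueStore, session: MixSession): void {
  store.setItem(STORAGE_KEY, JSON.stringify(session));
}

function loadSession(store: KeyValueStore): MixSession | null {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw) as MixSession;
    // Discard finished or malformed sessions so the page fetches a fresh queue.
    if (!Array.isArray(parsed.itemIds) || parsed.cursor >= parsed.itemIds.length) {
      store.removeItem(STORAGE_KEY);
      return null;
    }
    return parsed;
  } catch {
    store.removeItem(STORAGE_KEY);
    return null;
  }
}
```

The page would call `saveSession` after each answer and `loadSession` on mount, fetching a new queue only when it returns `null`.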

### 3.4 Medium — MCQ distractors can be too few / duplicated / trivially wrong
`flashQueueWithDistractors` samples a random 50-card window and filters by
meaning/kana. If the window yields `< 3` valid distractors, the card shows fewer than
4 options. Distractors are random rather than confusable (same JLPT level / semantic
tag / similar length), making many questions trivially guessable. No dedup against
multiple cards sharing a `meaning_id`.

> **Fix:** draw distractors scoped to the same JLPT level / tag, guarantee exactly 3
> by widening the pool on shortfall, and dedup by normalized meaning.
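
One way to sketch the selection rule: prefer same-level candidates, fall back to the wider pool on shortfall, and skip any candidate whose normalized meaning has already been used. The `CardLite` shape and function names are hypothetical; tag-based scoping is omitted for brevity, and the input order is preserved, so the caller is assumed to shuffle candidates first.

```typescript
interface CardLite {
  id: number;
  meaning: string;
  jlptLevel: number; // e.g. 5 for N5
}

// Lowercase and collapse whitespace so "To eat" and "to  eat" count as one meaning.
function normalizeMeaning(meaning: string): string {
  return meaning.trim().toLowerCase().replace(/\s+/g, ' ');
}

// Picks up to `count` distractors: same-JLPT candidates first, then the wider
// pool on shortfall. Skips the target itself and any duplicate meanings.
function pickDistractors(target: CardLite, candidates: CardLite[], count = 3): CardLite[] {
  const seen: string[] = [normalizeMeaning(target.meaning)];
  const picked: CardLite[] = [];

  const sameLevel = candidates.filter((c) => c.jlptLevel === target.jlptLevel);
  const otherLevels = candidates.filter((c) => c.jlptLevel !== target.jlptLevel);

  for (const c of sameLevel.concat(otherLevels)) {
    if (picked.length === count) break;
    const key = normalizeMeaning(c.meaning);
    if (c.id === target.id || seen.indexOf(key) !== -1) continue;
    seen.push(key);
    picked.push(c);
  }
  return picked;
}
```

The result can still be shorter than `count` when the whole candidate set is too small, so the caller needs a policy for that case (for example, fetching a larger window).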

### 3.5 Medium — Adaptive "particle focus" amplifies the AI cost problem
The balanced branch can auto-route to **15 particle questions** when the user has been
struggling. Combined with 3.2, the users who struggle most get the slowest, most
expensive sessions, which is the opposite of the intended effect.

> **Fix:** couple this with the pre-warmed pool from 3.2; cap AI-generated particles
> regardless of focus.

### 3.6 Low — `submit` validation allows a null grade for flashcard/sentence
The validator marks `grade` as `nullable`, but `submitFlashcard` / `submitSentence`
pass it straight into `SrsService::processReview`. A malformed client (or future bug)
sending `type=flashcard` without `grade` would hit an error / undefined behavior.

> **Fix:** make `grade` required when `type` is `flashcard`/`sentence`, and `is_correct`
> required when `type` is `particle` (conditional validation rules).
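
The authoritative check belongs in the server-side validator. As an illustration of the conditional rule itself, here is a TypeScript mirror that a client could run before posting; the function name and `SubmitBody` shape are hypothetical.

```typescript
type ItemType = 'flashcard' | 'sentence' | 'particle';

interface SubmitBody {
  type: ItemType;
  grade?: 'again' | 'hard' | 'good' | 'easy' | null;
  is_correct?: boolean | null;
}

// Returns the names of fields that are required for this `type` but missing.
function missingRequiredFields(body: SubmitBody): string[] {
  const missing: string[] = [];
  const needsGrade = body.type === 'flashcard' || body.type === 'sentence';
  if (needsGrade && (body.grade === undefined || body.grade === null)) {
    missing.push('grade');
  }
  if (body.type === 'particle' && (body.is_correct === undefined || body.is_correct === null)) {
    missing.push('is_correct');
  }
  return missing;
}
```

An `is_correct` of `false` counts as present, which is why the check compares against `undefined` and `null` rather than testing truthiness.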

### 3.7 Low — Dead / unused code
- `generateFlashcardOptions()` ([DailyMixController.php:391](../app/Http/Controllers/DailyMixController.php#L391))
  is never called — superseded by `flashQueueWithDistractors`.
- The dead branch comment block at lines 363–368 (`if (!$userCard && $item instanceof Card)`)
  does nothing.
- Frontend `_mixStats` is fetched and immediately `void`-ed — the stats payload
  (`flashcard_count` / `sentence_count` / `particle_count`) is computed server-side but
  never shown to the user.

> **Fix:** delete the dead method/branch; either surface the mix breakdown in the UI
> (e.g. on the loading or completion screen) or stop sending it.

### 3.8 Low — `fetchQueue` effect coupling is fragile
`fetchQueue` is a `useCallback` depending on `items.length`, and the mount effect depends
on `fetchQueue`. Each `setItems` recreates the callback and re-runs the effect, saved only
by the `!force && items.length > 0` early-return guard and the `fetchingRef` latch. It works
but is brittle.

> **Fix:** split fetching from the dependency on `items.length` (pass force explicitly,
> drop `items.length` from deps), or move fetch into an effect with a stable trigger.

### 3.9 Low — Response-time signal is noisy for flashcards
`response_time_ms` is measured from card start to the *grade* tap, which for flashcards
happens after the answer is revealed and read. It conflates thinking time with
reading/grading time, weakening any latency-based difficulty signal.

> **Fix:** capture the time to first answer (MCQ tap / Enter) separately from grade time.
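
A per-card timer that records both moments separately could look like this. It is a sketch with hypothetical names; the clock is injectable so the behavior can be checked without real delays.

```typescript
interface CardTiming {
  answerTimeMs: number | null; // card shown -> first MCQ tap / Enter
  gradeTimeMs: number | null;  // card shown -> self-grade tap
}

// Tracks both timestamps for one card; `now` is injectable for tests.
function createCardTimer(now: () => number = Date.now) {
  const shownAt = now();
  let answerTimeMs: number | null = null;
  let gradeTimeMs: number | null = null;
  return {
    // Only the first answer counts; later taps do not overwrite it.
    markAnswered(): void {
      if (answerTimeMs === null) answerTimeMs = now() - shownAt;
    },
    markGraded(): void {
      gradeTimeMs = now() - shownAt;
    },
    snapshot(): CardTiming {
      return { answerTimeMs, gradeTimeMs };
    },
  };
}
```

The host page would create one timer per card, call `markAnswered()` on the MCQ tap or Enter, and send both values with the submit.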

### 3.10 Low — Accuracy stats are client-only
The completion screen's correct/incorrect/accuracy come from local `stats`, derived from
the self-grade (`grade !== 'again'`). Given 3.1, these can diverge from objective truth and
are never reconciled server-side.

> **Fix:** once 3.1 lands, reconcile the completion-screen stats with the objective
> correctness recorded server-side.

---

## 4. Feature Opportunities

1. **Surface the mix breakdown** — use the already-computed `stats` payload to show
   "5 vocab · 3 sentences · 2 particles" on the loading/summary screens.
2. **Daily goal & streak integration** — Daily Mix is the natural "one session a day"
   ritual; tie completion into the existing streak/quest system.
3. **Resumable sessions** (see 3.3) with a "continue where you left off" entry point.
4. **Configurable session length** — let the user pick a short (10) / normal / long mix
   instead of fixed limits.
5. **Grammar item type** — fold JLPT grammar practice into the mix (see §6); the
   backend service and SRS table already exist and mirror particles exactly.
6. **Kanji / kana item types** — the platform already has these domains (gaming hub);
   adding them to the mix would make it a true full-coverage daily review.
7. **Post-session insight** — "your weakest area today was particles" using per-type
   accuracy, linking into the focus selector that already exists.
8. **Audio / TTS for sentences** — leverage the existing TTS pipeline for a listening
   variant of sentence items.

---

## 5. Reusing the Particle & Grammar Practice Components

Daily Mix's inline particle rendering (`renderParticle`,
[page.tsx:411](../../frontend/app/daily-mix/page.tsx#L411)) is a **second,
independent implementation** of what the standalone particle practice page already
does ([particles/practice/page.tsx](../../frontend/app/particles/practice/page.tsx)).
Both render the same three question types (`cloze`, `scramble`, `error_correction`)
from the **same generator** (`ParticlePracticeService::generateQuestion(mode:'mixed')`).
The content is identical; only the presentation is duplicated — and Daily Mix's copy
is the weaker one (a fragile diff/heuristic to locate the wrong particle, custom
`(( ))` / `[[ ]]` markers, no TTS/romaji/tip/streak/timer).

**Grammar is the same story.** The grammar practice page
([grammar/practice/page.tsx](../../frontend/app/grammar/practice/page.tsx)) is a
near-clone of the particle one: same `Question` shape, same options/feedback/streak/
timer machinery, same generate→submit backend pattern
(`GrammarPracticeService::generateQuestion / updateSRS / recordAnswerOnQuestion`,
`UserGrammarStat`). It differs only in surface details:

| Aspect | Particle | Grammar |
|---|---|---|
| question types | `cloze`, `scramble`, `error_correction` | `meaning_match`, `pattern_recognition` (MCQ only) |
| reading field | `romaji` | `furigana` |
| id field | `particle_id` | `grammar_id` |
| timer | 10s | 15s |
| accent color | violet | orange |

**Decision: extract ONE shared question component covering both domains, and reuse it
verbatim — including the timer.** No `variant` flag. The standalone pages' behavior
(timer, TTS, reading, tip, streak, prefetch) becomes the single behavior everywhere.
The handful of differences above collapse into a small per-domain config object, not
divergent code paths. Scramble is simply a question type the grammar domain never emits.

### 5.1 The one real blocker — normalize the data contract
The surfaces disagree on field names/markers and must be unified onto one canonical
`Question` shape first:

| Concept | Practice pages (target shape) | Daily Mix today |
|---|---|---|
| correct answer | `correct_answer` | remapped to `correct_particle` |
| cloze blank | `[ ]` placeholder | `(( ))` / `[[ ]]` markers + heuristic |
| reading | `romaji` / `furigana` | omitted |
| id | `particle_id` / `grammar_id` | `particle_id` only |

Canonical shape: keep `correct_answer`, the `[ ]` blank marker, and a single
`reading` field (populated from `romaji` or `furigana`), plus a `domain` discriminator
(`'particle' | 'grammar'`).
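
As a sketch of what the canonical shape and a normalizer might look like in TypeScript: only the fields named in this section are modeled, and `RawPracticeQuestion` and `toCanonical` are hypothetical names. The real payload carries more fields than shown.

```typescript
type Domain = 'particle' | 'grammar';

// What the practice services emit today (subset; hypothetical type name).
interface RawPracticeQuestion {
  type: string; // e.g. 'cloze', 'scramble', 'meaning_match'
  correct_answer: string;
  romaji?: string;
  furigana?: string;
  particle_id?: number;
  grammar_id?: number;
}

// Canonical shape: one `reading` field plus a `domain` discriminator.
interface Question {
  domain: Domain;
  type: string;
  correct_answer: string;
  reading: string | null;
  particle_id?: number;
  grammar_id?: number;
}

function toCanonical(raw: RawPracticeQuestion, domain: Domain): Question {
  // `reading` comes from `romaji` for particles and `furigana` for grammar.
  const reading = domain === 'particle' ? raw.romaji : raw.furigana;
  const question: Question = {
    domain,
    type: raw.type,
    correct_answer: raw.correct_answer,
    reading: reading === undefined ? null : reading,
  };
  if (domain === 'particle') question.particle_id = raw.particle_id;
  else question.grammar_id = raw.grammar_id;
  return question;
}
```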

> **Action:** make the Daily Mix backend emit this shape for both particle and grammar
> items instead of remapping in
> [DailyMixController.php:124-127](../app/Http/Controllers/DailyMixController.php#L124).
> This deletes the `(( ))`/`[[ ]]` parsing and the error-correction highlight heuristic
> from the frontend entirely.

### 5.2 Plan
1. **Create** `frontend/components/practice/practice-question.tsx` — a domain-agnostic
   presentational component. Move the `renderQuestion()` branches (all five types
   across both domains), scramble handlers, timer, and feedback block into it:
   ```tsx
   <PracticeQuestion
     question={question}                      // single canonical Question shape (+ domain field)
     config={DOMAIN_CONFIG[question.domain]}  // accent, reading label, timer seconds
     onResolved={(isCorrect: boolean) => { /* host submits/advances */ }}
   />
   ```
   `DOMAIN_CONFIG` holds the per-domain bits from the table above (color, timer,
   reading label). Everything else is shared.
2. **Refactor** both standalone pages (`particles/practice`, `grammar/practice`) to
   consume `<PracticeQuestion>` — pure extraction, no behavior change; they already
   own this code.
3. **Normalize** the Daily Mix queue payload (5.1) so particle *and* grammar items
   match `Question`.
4. **Replace** Daily Mix's `renderParticle` with `<PracticeQuestion>`; wire `onResolved`
   to the existing `submitAnswer` + `advanceCard` flow. Delete `highlightParticle`,
   the `(( ))`/`[[ ]]` parsing, and the duplicated scramble handlers.
5. **Keep host-specific orchestration** (queue building, focus limits, per-type submit,
   session stats) in each page — only the single-question UI is shared.
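
The `DOMAIN_CONFIG` object from step 1 could be sketched as follows, using the values from the particle/grammar comparison table. The property names and the `buildSubmitBody` helper are hypothetical; they only illustrate how per-domain differences stay in data rather than in branches.

```typescript
type Domain = 'particle' | 'grammar';

interface DomainConfig {
  accent: 'violet' | 'orange';
  readingLabel: 'romaji' | 'furigana';
  timerSeconds: number;
  idField: 'particle_id' | 'grammar_id';
}

const DOMAIN_CONFIG: Record<Domain, DomainConfig> = {
  particle: { accent: 'violet', readingLabel: 'romaji', timerSeconds: 10, idField: 'particle_id' },
  grammar: { accent: 'orange', readingLabel: 'furigana', timerSeconds: 15, idField: 'grammar_id' },
};

// Builds the submit payload for either domain without a per-domain branch.
function buildSubmitBody(
  domain: Domain,
  id: number,
  questionId: number,
  isCorrect: boolean,
): Record<string, unknown> {
  const body: Record<string, unknown> = {
    type: domain,
    question_id: questionId,
    is_correct: isCorrect,
  };
  body[DOMAIN_CONFIG[domain].idField] = id;
  return body;
}
```

A host page would call `buildSubmitBody(question.domain, id, questionId, isCorrect)` from its `onResolved` handler and post the result.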

### 5.3 Payoff
- Removes the brittle error-correction highlight heuristic (one source of truth).
- Daily Mix particle items gain TTS, reading, tips, streak, and the timer for free.
- Grammar drops into Daily Mix (§6) with zero new rendering code — it reuses the same
  component.
- Particle/grammar UI bugs get fixed once instead of three times.
- Does **not** address 3.2 latency/cost (that's server-side generation volume) — track
  separately.

---

## 6. Adding Grammar to the Daily Mix

Grammar becomes a fourth item type (`grammar`) alongside `flashcard`, `sentence`, and
`particle`. The backend already has everything needed — this is wiring, not new systems.

### 6.1 Backend (`DailyMixController`)
1. **Inject** `GrammarPracticeService` into the constructor (mirrors
   `ParticlePracticeService`).
2. **Generate** grammar questions in `queue()` exactly like the particle loop, capped by
   a new `limits['grammar']` slot, and tag them `'type' => 'grammar'` in the canonical
   `Question` shape from §5.1 (`correct_answer`, `[ ]` marker, `reading` from `furigana`,
   `grammar_id`, `domain => 'grammar'`). Apply the same ≤ 2 AI-fallback cap from 3.2 —
   grammar inherits the latency/cost concern, so it must read from a pre-warmed pool too.
3. **Route** grading in `submit()`: add a `case 'grammar'` that calls
   `grammarService->updateSRS($user->id, $grammar_id, $is_correct)` and
   `recordAnswerOnQuestion($question_id, $is_correct)` — identical to `submitParticle`.
   Extend the validator to accept `grammar_id` and `in:flashcard,sentence,particle,grammar`.

### 6.2 Limits & adaptive focus (`calculateDynamicLimits`)
- Add a `grammar` slot to every returned limit array and a `grammar` explicit-focus branch.
- Rebalance the **balanced** default so four domains share the session, e.g.
  `flashcard 12 / sentence 8 / particle 5 / grammar 5`.
- Add an adaptive trigger using `UserGrammarStat` (e.g. low `mastery_level` / recent
  errors), parallel to the existing particle-struggle branch.

### 6.3 Frontend (`daily-mix/page.tsx`)
- Add `'grammar'` to `ItemType` and a `TYPE_CONFIG.grammar` entry (orange accent, label
  e.g. "Tata Bahasa" / 文法, `BookText` icon).
- Render grammar items with the **shared `<PracticeQuestion>`** from §5 — no bespoke
  rendering. Map the submit body (`grammar_id`, `question_id`, `is_correct`).
- Add a Grammar option to the completion-screen focus selector (`FOCUS_OPTIONS`).

### 6.4 Notes
- Grammar has **no scramble type**, so nothing extra is needed — `<PracticeQuestion>`
  just never receives that type for grammar items.
- The mix-breakdown stats payload (3.7 / opportunity #1) should add `grammar_count`.
- Sequencing: do §5 (shared component) **first**, then §6 — grammar then drops in with
  no new rendering code.

---

## 7. Suggested Priority

| Priority | Item | Why |
|----------|------|-----|
| P0 | 3.2 / 3.5 particle AI latency & cost | User-facing slowness + real API spend |
| P0 | 3.1 objective correctness not recorded | Corrupts SRS scheduling & analytics |
| P1 | 3.4 distractor quality | Directly affects learning value |
| P1 | 3.3 resumable session | Common real-world frustration |
| P1 | §5 particle + grammar component reuse | Removes brittle heuristic, frees UX wins |
| P2 | §6 grammar item type | New content domain; cheap once §5 lands |
| P2 | 3.6 conditional validation | Robustness / safety |
| P2 | 3.7 dead code + 3.8 effect cleanup | Maintainability |
| P3 | 3.9 / 3.10 timing & stats fidelity | Nice-to-have signal quality |
| P3 | Section 4 features | Roadmap |

---

## 8. Concrete next steps (smallest valuable slice)

1. **Pre-warm particle pool:** add a job/command that keeps N verified particle questions
   per user; change `queue()` to read from that pool with a tiny (≤ 2) AI fallback cap.
2. **Record objective correctness:** extend `submit` to accept and store `was_correct`
   from the MCQ/typed answer; clamp self-grades that contradict a wrong objective answer.
3. **Tighten validation** (3.6) and **delete dead code** (3.7) in the same PR — low risk.
4. **Scope distractors** to JLPT level/tag and guarantee 3 (3.4).
5. **Extract `<PracticeQuestion>`** and normalize the Daily Mix particle payload (§5) —
   pure refactor, deletes the highlight heuristic, reuses the timer as-is.
6. **Add the `grammar` item type** (§6) — wire `GrammarPracticeService` into the queue
   and submit routing; render via the shared `<PracticeQuestion>` from step 5.

Steps 1–5 are independently shippable; step 6 depends on the shared component from
step 5. Each step is individually testable against the existing backend test suite.

---

## 9. Phased Implementation Roadmap

Each phase is self-contained, independently shippable, and ends in a green test suite.
Phases are ordered so earlier work unblocks later work (notably: the shared component in
Phase 4 must land before grammar in Phase 5).

> **Status (2026-06-12): Phases 1–8 all implemented.** Daily Mix submit/queue,
> the shared `<PracticeQuestion>` component, grammar item type, distractor/timing
> improvements, resumable sessions, and the mix-breakdown UI are in. Backend Daily
> Mix tests (`DailyMixSubmitTest`, `ParticlePoolTest`) pass; touched frontend files
> are eslint/tsc clean. (Pre-existing, unrelated failures remain in
> Invitation/Milestone/NextAction/AdminNotification suites — not part of this work.)

### Phase 1 — Quick wins & hardening (low risk, no behavior change) ✅ DONE
_Foundation cleanup so later phases build on solid ground._
- **3.6** ✅ Conditional validation in `submit()`: `required_if` makes `grade` required
  for flashcard/sentence and `is_correct` required for particle.
- **3.7** ✅ Deleted dead code: `generateFlashcardOptions()`, the no-op `if (!$userCard …)`
  branch, and the unused `_mixStats` frontend state (backend `stats` payload retained
  for the future `grammar_count` breakdown).
- **3.8** ✅ Untangled the `fetchQueue` effect: mount-once via `hasFetchedRef`, dropped
  `items.length` from deps, removed the `force` arg (callers pass `focusOverride` only).
- **Done:** Pint clean, PHP lint clean, daily-mix `tsc`/eslint clean (pre-existing
  unrelated `kanji_alt` errors remain), no UX change.

### Phase 2 — Objective correctness in SRS (P0 correctness) ✅ DONE
_Stop letting self-grades override what we objectively know._
- **3.1** Send measured correctness (MCQ pick / typed answer) with every submit.
- Extend `submit` to persist `was_correct`; clamp self-grades that contradict a wrong
  objective answer (a wrong MCQ can't be graded better than `hard`).
- **3.10** Reconcile completion-screen stats with server truth.
- **Done when:** a wrong answer can no longer be scheduled as `easy`; tests cover the
  clamp; `Review`/`SentenceReview` rows carry objective correctness.

### Phase 3 — Particle AI latency & cost (P0 performance) ✅ DONE
_Make the queue endpoint fast and cheap regardless of focus._
- **3.2 / 3.5** Add a job/command that keeps a pre-warmed pool of N verified particle
  questions per user; `queue()` reads from the pool with a hard ≤ 2 AI-fallback cap and
  a time budget.
- **Done when:** particle-focus queue builds with ≤ 2 synchronous AI calls; load test
  shows bounded latency; pool-refill job covered by a test.

### Phase 4 — Shared `<PracticeQuestion>` component (P1 refactor) ✅ DONE
_One rendering path for particle + grammar; deletes the brittle heuristic._
- **§5.1** Normalize the canonical `Question` shape (`correct_answer`, `[ ]` marker,
  `reading`, `domain`); make the Daily Mix backend emit it for particle items.
- **§5.2** Create `frontend/components/practice/practice-question.tsx` with
  `DOMAIN_CONFIG`; refactor `particles/practice` to consume it (pure extraction).
- Replace Daily Mix `renderParticle` with `<PracticeQuestion>`; delete
  `highlightParticle`, the `(( ))`/`[[ ]]` parsing, and duplicated scramble handlers.
- **Done when:** both particle surfaces render from one component; heuristic code gone;
  visual parity verified; tests/eslint green.

### Phase 5 — Add the `grammar` item type (P2 feature) — _depends on Phase 4_ ✅ DONE
_Grammar drops into the mix with zero new rendering code._
- **§6.1** Inject `GrammarPracticeService`; generate `grammar` items in `queue()` using
  the Phase 3 pre-warmed-pool pattern + ≤ 2 fallback cap; emit the canonical shape.
- **§6.2** Add a `grammar` slot to all limit arrays, a `grammar` explicit-focus branch,
  a rebalanced balanced default (~12/8/5/5), and a `UserGrammarStat` adaptive trigger.
- **§6.3** Refactor the standalone `grammar/practice` page onto `<PracticeQuestion>`;
  add `'grammar'` to `ItemType`, `TYPE_CONFIG`, the focus selector, and submit routing.
- **§6.4** Add `grammar_count` to the mix-breakdown stats.
- **Done when:** grammar items appear and grade correctly in Daily Mix; all four
  practice surfaces (both standalone pages and both Daily Mix item types) share one
  component; tests green.

### Phase 6 — Distractor quality & timing fidelity (P1/P3 polish) ✅ DONE
- **3.4** Scope flashcard distractors to JLPT level / tag, dedup by normalized meaning,
  guarantee exactly 3 by widening the pool on shortfall.
- **3.9** Capture time-to-first-answer separately from grade time.
- **Done when:** every MCQ shows 4 plausible options; response-time signal isolated.

### Phase 7 — Resumable sessions (P1 UX) ✅ DONE
- **3.3** Persist a lightweight session (id + remaining item ids + cursor) so reloads
  resume; add a "continue where you left off" entry point.
- **Done when:** a mid-session reload restores position instead of re-shuffling.

### Phase 8 — Roadmap features (P3, optional / incremental) ✅ DONE
- Section 4 opportunities: mix-breakdown UI, streak/quest integration, configurable
  session length, kanji/kana item types, post-session insight, sentence TTS.
- **Done when:** picked items shipped behind their own small PRs.

> **Dependency summary:** Phase 4 → Phase 5 (grammar needs the shared component).
> Phase 3's pre-warmed-pool pattern is reused by Phase 5's grammar generation. All
> other phases are independent and may be reordered to taste.