# JLPT Reading Feature Efficacy Analysis

**Date:** 2026-06-16  
**Scope:** Evaluation of how effectively Manabou's reading-related features prepare learners for the JLPT.  
**Focus areas:**  
1. Alignment of reading passages with JLPT difficulty levels (N5–N1).  
2. Effectiveness of simulation exercises provided via the Roadmap page.  
3. How well the feature set facilitates development of the specific reading comprehension skills (読解) required to pass the JLPT.

**Key data sources:** Codebase inspection (models, services, seeders, routes, frontend pages, tests), existing analysis docs (`jlpt-assessment-criteria.md`, `is-ready-for-n5.md`, `assessment-item-bank-plan.md`, `roadmap-feature-plan.md`), and runtime behavior.

---

## Executive Summary

Manabou has **two largely disconnected reading systems**:

- **Curriculum reading for assessment (読解 component):** Shared, level-graded passages + MCQs in `reading_passages` / `reading_questions`. Used exclusively inside the Final Assessment gate. AI-generated (GPT) with explicit JLPT scaling rules. This is the only true "JLPT-style reading simulation."
- **Personal/AI-generated readings + generic course materials:** Per-user `readings` (no levels) surfaced in `/conversation`, plus PDF "materials" (Irodori A1/A2 textbooks) linked from the Roadmap's `reading_material` objective.

**Efficacy verdict:**  
- **Passages (assessment):** Directionally aligned on difficulty via prompt engineering and length targets, but quantity is critically low (2 passages/level vs. documented target of 10), questions are in Indonesian (not Japanese), and there is no manual curation or past-exam fidelity validation.
- **Roadmap simulation exercises:** Weak. The Roadmap correctly sequences a "Baca teks berjenjang" ("Read graded texts") objective before the assessment gate, but it points only to generic Irodori PDFs with zero comprehension questions, no level filtering, no feedback, and no overlap with the passages actually used in assessment. The *real* simulation occurs only at the high-stakes final assessment itself.
- **Skill development:** Minimal deliberate practice. Learners get one exposure to 5 reading MCQs per level during the final assessment. No progressive scaffolding, strategy training, error analysis, spaced repetition of difficult texts, or volume of varied JLPT-style items. The personal readings feature is isolated and not leveled or integrated into the roadmap/assessment loop.

Overall: The reading feature provides a thin "gate" rather than a robust preparation track. Significant gaps remain versus the targets stated in `is-ready-for-n5.md` (10 passages and 30 questions for N5 alone) and real JLPT 読解 demands.

---

## 1. Architecture of Reading Features

### 1.1 Curriculum / Assessment Reading (読解)

- **Tables:** `reading_passages` (level, title, body, translation_id, sort_order) + `reading_questions` (passage_id, question, correct_answer, distractors[3], sort_order).  
  (Migration: `2026_06_09_000002_create_reading_assessment_tables.php:19-39`)
- **Models:** `ReadingPassage` (hasMany questions), `ReadingQuestion` (belongsTo passage).
- **Generation:** `ReadingGeneratorService` (GPT-4o) — prompt enforces JLPT-appropriate length, vocab/grammar, 3 distractors, Indonesian answers. Re-runnable per title.  
  (`backend/app/Services/ReadingGeneratorService.php:27-181`)
- **Command:** `php artisan nihongo:generate-reading --level=N5 --passages=2 --questions=3`
- **Usage:** Only inside `AssessmentService::readingQuestions()` (5 random questions per assessment) and admin item-bank flow (`AssessmentItemService`, `AdminAssessmentController::generateReadingSource`).
- **Admin:** "Quick Setup" and "Generate Reading Source" buttons feed into assessment item bank batches.
- **Quantity in tests/docs:** Seeder/tests use 2 passages × 3 q = 6 questions per level. `is-ready-for-n5.md` explicitly calls out the gap (target 10 passages / 30 questions for N5).
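
To make the data flow concrete, here is a minimal TypeScript sketch of the two tables and of the "5 random questions per assessment" sampling. The interfaces mirror the columns listed above, but the field types, the `sampleReadingQuestions` name, and the pool-then-shuffle strategy are assumptions for illustration; the actual `AssessmentService::readingQuestions()` may sample differently.

```typescript
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

// Mirrors reading_questions; field types are assumed, not taken from the migration.
interface ReadingQuestion {
  passageId: number;
  question: string;
  correctAnswer: string;
  distractors: [string, string, string];
  sortOrder: number;
}

// Mirrors reading_passages, with its hasMany questions relation inlined.
interface ReadingPassage {
  id: number;
  level: JlptLevel;
  title: string;
  body: string;
  translationId: string; // one-sentence Indonesian summary
  sortOrder: number;
  questions: ReadingQuestion[];
}

// Hypothetical stand-in for the assessment sampling: pool every question of
// the level, shuffle the pool (Fisher-Yates), and take the first `count`.
function sampleReadingQuestions(
  passages: ReadingPassage[],
  level: JlptLevel,
  count: number = 5,
  random: () => number = Math.random,
): ReadingQuestion[] {
  const pool: ReadingQuestion[] = [];
  for (const passage of passages) {
    if (passage.level === level) {
      pool.push(...passage.questions);
    }
  }
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, count);
}
```

With the seeded bank of 2 passages × 3 questions, every call returns 5 of the same 6 questions, which is the repetition problem discussed later in this document.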

### 1.2 Personal / Practice Reading

- **Tables:** `readings` (user_id, type, title, content, etc.) + `reading_sessions` (for timing practice).  
  (Migrations: `2026_01_22_000005`, `000006`)
- **Surfaces:** `/conversation` (create/edit/list personal articles/conversations/stories, start timed sessions), `/conversation/[sessionId]`.
- **Characteristics:** AI-generated per user, **no jlpt_level column**, not referenced by roadmap or assessment. Purely for free-form practice + shadowing-like flow.

### 1.3 Generic Course Materials (Roadmap-linked)

- **Table:** `materials` (title, source, file_path, description, order) — PDF files only.
- **Seeder:** `MaterialSeeder` populates Irodori A1 (18 chapters), A2-1 (18), A2-2 (18). No JLPT levels.
- **UI:** `/materials` (tabbed source list → PDF iframe viewer with note sidebar). `/materials/[id]`.
- **Roadmap tie-in:** Every path (N5–N1) seeds a `reading_material` objective:  
  `"Baca teks berjenjang" → route: '/materials'`, target=1 (activity, mark-done style).  
  (`LearningPathSeeder.php:210`, `285`; `coreStages` Stage 5 and `advancedStages`).

**Critical disconnect:** The Roadmap's mandated "reading" step has **zero overlap** with the graded passages used in the actual JLPT-style assessment.

---

## 2. Alignment of Reading Passages with JLPT Difficulty Levels

### 2.1 Intended Design (from generator prompt and docs)

The `ReadingGeneratorService` system prompt (`:86-99`) explicitly targets JLPT norms:

- N5/N4: 80–150 characters, simple daily-life topics.
- N3: 150–250 characters, notices/schedules/short articles.
- N2/N1: 250–400 characters, essays/opinion pieces/abstracts.
- Vocab/grammar "appropriate for the level."
- Questions + 3 distractors in Indonesian; title short Japanese; one-sentence Indonesian summary (`translation_id`).

Docs (`jlpt-assessment-criteria.md:110-118`, `is-ready-for-n5.md:48-52`) treat this as "✅ SIAP" ("ready") for the assessment component.
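
Because the length targets are enforced only through the prompt, a post-generation check could catch passages that drift out of range. The sketch below encodes the ranges listed above. The `checkPassageLength` helper and the choice to ignore whitespace when counting are assumptions, not behavior confirmed in `ReadingGeneratorService`.

```typescript
type JlptLevel = "N5" | "N4" | "N3" | "N2" | "N1";

// Character ranges copied from the generator prompt targets listed above.
const LENGTH_TARGETS: Record<JlptLevel, { min: number; max: number }> = {
  N5: { min: 80, max: 150 },
  N4: { min: 80, max: 150 },
  N3: { min: 150, max: 250 },
  N2: { min: 250, max: 400 },
  N1: { min: 250, max: 400 },
};

// Counts Unicode code points, ignoring whitespace (an assumed convention).
function countCharacters(body: string): number {
  return Array.from(body.replace(/\s/g, "")).length;
}

// Hypothetical validator: reports the measured length and whether it falls
// inside the target range for the given level.
function checkPassageLength(
  level: JlptLevel,
  body: string,
): { length: number; withinTarget: boolean } {
  const { min, max } = LENGTH_TARGETS[level];
  const length = countCharacters(body);
  return { length, withinTarget: length >= min && length <= max };
}
```

A 100-character passage would pass this check for N5 but fail it for N3, so a check of this kind could run before a generated passage is saved or activated.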

### 2.2 Actual Alignment — Strengths

- Length/complexity scaling is **prompt-enforced** and roughly matches common JLPT text-length guidance.
- Passages are stored per-level and randomly sampled for assessment, providing some variety.
- Admin can regenerate/replace via one-click (`regenerateItem`, `generateReadingSource`) and review before activating batches.
- Assessment UI renders the full Japanese passage + question (with Indonesian choices), so learners read Japanese and answer in their L1. This lowers the entry barrier for beginners, but it is not the real-exam pattern: the JLPT presents questions and choices in Japanese (see 2.3).

### 2.3 Actual Alignment — Weaknesses & Gaps

| Dimension              | JLPT Reality                          | Manabou Current State                                                                 | Gap Severity |
|------------------------|---------------------------------------|---------------------------------------------------------------------------------------|--------------|
| **Text types**        | Notices, ads, emails, articles, essays, instructions, blogs | GPT-generated "daily life / short article / essay" only. No explicit variety seeding. | Medium |
| **Question language** | Fully Japanese (選択肢も日本語)     | Questions + distractors + answers in **Indonesian**. Only passage body is Japanese. | **High** |
| **Authenticity**      | Past exams or exam-style items       | Purely synthetic GPT output. No past-exam calibration or human review mandate.       | High |
| **Quantity per level**| N5: several short texts (real exam ~5–8 items total across 2–3 texts) | 2 passages × ~3 questions = **6 items** in tests; docs target **10 passages / 30 q**. Assessment always samples 5. | **Critical** |
| **Progression**       | Increasing length + inference depth + abstract vocabulary | Only length + "more complex" in prompt. No explicit skill tags or difficulty metadata on passages/questions. | Medium |
| **Support features**  | No dictionary in real exam           | Assessment shows raw passage (good). But no in-context vocab support, no strategy hints, no post-attempt error analysis for reading items specifically. | Medium |

**Evidence locations:**
- Generator prompt & validation: `ReadingGeneratorService.php:86-181`
- Assessment sampling: `AssessmentService.php:693-722` (`readingQuestions`)
- Target vs. reality: `is-ready-for-n5.md:97-99` (N5: 10 vs 2 passages)
- Admin generation: `AdminAssessmentController.php:309-336`

**Conclusion on alignment:** Directionally correct for a synthetic bank, but **not yet exam-authentic** due to language of questions and insufficient volume/variety. The Indonesian question language is the single largest authenticity defect for JLPT 読解 simulation.
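
The "Text types" row in the table above could be turned into a measurable check. The sketch below assumes a hypothetical `textType` tag on each passage (no such column exists in the schema described in section 1.1) and reports which JLPT-typical genres a level's bank does not yet cover.

```typescript
// Genres taken from the "JLPT Reality" column of the table above.
const JLPT_TEXT_TYPES = [
  "notice",
  "ad",
  "email",
  "article",
  "essay",
  "instruction",
  "blog",
] as const;

type TextType = (typeof JLPT_TEXT_TYPES)[number];

// `textType` is a hypothetical tag, used here for illustration only.
interface TaggedPassage {
  title: string;
  textType: TextType;
}

// Returns the genres that no passage in the bank covers, in canonical order.
function missingTextTypes(passages: TaggedPassage[]): TextType[] {
  const seen = new Set<TextType>(passages.map((p) => p.textType));
  return JLPT_TEXT_TYPES.filter((t) => !seen.has(t));
}
```

A bank holding only one article and one essay would report five missing genres, which gives the "Medium" gap a concrete number to track.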

---

## 3. Effectiveness of Simulation Exercises on the Roadmap Page

### 3.1 What the Roadmap Claims to Provide

From `roadmap-feature-plan.md` and implementation:
- 7-stage (N5–N3) or 7-stage advanced (N2/N1) sequenced curriculum.
- Reading appears explicitly as an objective ("Baca teks berjenjang") in the "Bicara & Membaca" ("Speaking & Reading") / "Pemahaman Teks" ("Text Comprehension") stages, before the final assessment gate.
- The final stage of each level is gated behind the multi-component Final Assessment (which includes the 5-question reading section).
- "One next action" Continue CTA + per-objective checkmarks.

### 3.2 What Actually Happens for Reading

1. User reaches the relevant stage objective.
2. Taps the objective → deep link to `/materials` (Irodori PDFs).
3. User can "Mark Selesai" ("mark as done"; manual activity objective), or the system may auto-advance if other signals exist (but `reading_material` is not auto-tracked).
4. Later, at the final stage, user is offered "Mulai Asesmen Kelulusan {level}" ("Start {level} graduation assessment"). The button is unconditional in current code; the gated version is commented out.
5. Assessment serves 5 random reading MCQs from the level's `reading_passages`.

**There is no intermediate simulation, drill, or feedback loop between "visit /materials" and the high-stakes assessment.**

### 3.3 Effectiveness Evaluation

| Criterion                        | Assessment                                                                 | Rating |
|----------------------------------|----------------------------------------------------------------------------|--------|
| **Sequencing**                  | Reading is correctly placed before the gate. Good curriculum design on paper. | Good |
| **Authentic practice volume**   | Zero graded practice items before the exam. Only the 5 questions at assessment time. | Poor |
| **Scaffolded difficulty**       | No ramp. Generic PDFs (A1/A2) → sudden 5 MCQs at assessment level. No progressive text length or inference depth in the "practice" step. | Poor |
| **Feedback & error analysis**   | None for the roadmap reading step. Assessment gives only aggregate % per component (no per-question review or passage re-read with explanations). | Poor |
| **Motivation / "what to do"**   | The objective title "Baca teks berjenjang" is vague, and nothing in the UI tells the user that the linked PDFs are unrelated to the actual assessment passages. | Misleading |
| **Integration with assessment** | The passages used in assessment are **never** surfaced in the roadmap "reading" activity. | Broken |

**Code evidence:**
- Objective definition: `LearningPathSeeder.php:210`, `285`
- Roadmap UI rendering of objectives + assessment button: `frontend/app/roadmap/page.tsx:335-402` (TEMP always-on button noted)
- Assessment flow: `frontend/app/assessment/[level]/page.tsx:628-638` (passage render), `AssessmentService.php:693-722`
- No reference from materials or conversation pages back to `reading_passages`.

**Conclusion:** The Roadmap provides **sequencing in name only** and **no effective simulation or deliberate practice** for reading. The only real simulation is the final assessment itself: a single high-stakes event with no prior rehearsal using the same content or format.

---

## 4. Facilitation of Specific JLPT Reading Comprehension Skills

JLPT 読解 (especially N3–N1) requires:
- Rapid gist / skimming of notices, ads, articles.
- Detail scanning (who, when, where, why).
- Inference (implied meaning, writer's intent).
- Vocabulary-in-context (unknown words from surrounding text).
- Understanding logical connectors, contrast, cause-effect.
- Handling longer, denser, more abstract texts at N2/N1.

### 4.1 Current Coverage

- **Strengths:**
  - Assessment forces timed, no-dictionary reading of a Japanese passage followed by MCQ (good pressure simulation).
  - Pass/fail rule requires ≥50% on the reading component — prevents "strong elsewhere, ignore reading" strategies.
  - AI generator can theoretically produce inference-level and vocabulary-in-context questions.

- **Gaps (severe):**
  - **Volume:** 5 questions total per level attempt. A learner may see the same 2 passages repeatedly until more are generated.
  - **Variety of text types:** Limited to whatever GPT produces in one prompt style. No systematic coverage of JLPT-typical genres (public notices, emails, blogs, instructions, opinion pieces with data, literary excerpts at N1).
  - **No strategy instruction:** No in-app guidance on "read the question first", "scan for keywords", "eliminate distractors that contradict explicit statements".
  - **No deliberate practice loop:** No "practice mode" that lets you do 10 reading items with immediate feedback + explanation + passage highlight.
  - **No error tracking per skill:** Assessment breakdown is only by broad component ("reading 60%"), not by sub-skill (inference vs. detail vs. vocab-in-context).
  - **Language of questions:** Answering in Indonesian trains L1 mediation rather than direct Japanese comprehension. Real JLPT requires processing everything (including choices) in Japanese.
  - **No integration with other features:** Personal readings, shadowing clips, or grammar examples are not cross-linked to build "read + understand in context" reps.
  - **No spaced repetition or review of difficult passages:** Once the assessment is done (pass or fail), the specific passages/questions are not surfaced again for targeted re-practice.

**Evidence:**
- Assessment composition: `AssessmentService.php:39-45` (reading:5), `grade()` logic: `338-341`
- No sub-skill metadata anywhere in `ReadingQuestion` or `AssessmentItem`.
- Roadmap reading objective is a single "visit materials" with no questions attached.
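
The ≥50% reading rule is easy to misread with only 5 items, so a worked sketch helps. It assumes the rule is a plain ratio check on the reading component; the real `grade()` logic may apply additional conditions, and both function names here are hypothetical.

```typescript
// Threshold taken from the pass rule described above (reading component ≥ 50%).
const READING_PASS_THRESHOLD = 0.5;

// True when the share of correct reading answers meets the threshold.
function readingComponentPasses(correct: number, total: number): boolean {
  if (total <= 0) {
    return false; // no reading items served: treat as not passed
  }
  return correct / total >= READING_PASS_THRESHOLD;
}

// Smallest number of correct answers that satisfies the threshold.
function minCorrectToPass(total: number): number {
  return Math.ceil(READING_PASS_THRESHOLD * total);
}
```

With 5 questions, 2 correct is 40% and fails while 3 correct is 60% and passes, so the gate effectively reads "at least 3 of 5". A single question therefore swings the component by 20 percentage points, which is coarse for a pass/fail decision.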

### 4.2 Quantitative Reality Check (from docs + tests)

From `is-ready-for-n5.md:97-99` (as of 2026-06-11 snapshot):

- Target for credible N5 reading prep in assessment context: **10 passages, 30 questions**.
- Actual at that time: **2 passages, 6 questions**.
- Assessment always pulls 5 (so with only 2 passages the bank is tiny and repetition is inevitable).

The docs mention no larger banks for the higher levels, so the same pattern presumably holds there.
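
The repetition claim can be quantified. Assuming each attempt draws 5 distinct questions uniformly at random from the whole bank (an assumption about the sampler, not something verified in `AssessmentService`), the overlap between two attempts follows directly from the bank size:

```typescript
// Guaranteed overlap between two independent draws of `sampleSize` distinct
// items from a bank of `bankSize` (pigeonhole bound).
function minOverlap(bankSize: number, sampleSize: number): number {
  assertValid(bankSize, sampleSize);
  return Math.max(0, 2 * sampleSize - bankSize);
}

// Expected overlap under uniform sampling without replacement: each of the
// `sampleSize` items from attempt one reappears with probability
// sampleSize / bankSize.
function expectedOverlap(bankSize: number, sampleSize: number): number {
  assertValid(bankSize, sampleSize);
  return (sampleSize * sampleSize) / bankSize;
}

function assertValid(bankSize: number, sampleSize: number): void {
  if (bankSize <= 0 || sampleSize < 0 || sampleSize > bankSize) {
    throw new RangeError("require 0 <= sampleSize <= bankSize and bankSize > 0");
  }
}

// Current bank (6 questions) versus the documented target (30 questions).
const current = { min: minOverlap(6, 5), expected: expectedOverlap(6, 5) };
const target = { min: minOverlap(30, 5), expected: expectedOverlap(30, 5) };
console.log(current, target);
```

With 6 questions, any retake repeats at least 4 of the 5 items (about 4.2 on average). With the 30-question target, no repeat is guaranteed and the average drops below 1 item per retake.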

---

## 5. Summary Table — Efficacy vs. JLPT Requirements

| JLPT 読解 Requirement                  | Current Manabou Support                                                                 | Efficacy |
|---------------------------------------|------------------------------------------------------------------------------------------|----------|
| Graded texts matching level           | Yes (prompt + level column) — but low quantity & synthetic only                         | Partial |
| Authentic Japanese questions/choices  | No — Indonesian                                                                         | Poor |
| Sufficient practice volume            | 5 items at gate only; 2 source passages typical                                         | Poor |
| Varied real-world text types          | Not systematically seeded                                                               | Poor |
| Inference & higher-order skills       | Possible in GPT output, but unvalidated and untracked                                   | Weak |
| Pre-exam deliberate practice + feedback | None (only the exam itself)                                                           | Poor |
| Integration into guided path          | Present as objective, but points to irrelevant PDFs                                     | Misleading |
| Error analysis / targeted review      | None                                                                                    | Poor |

---

## 6. Recommendations (Prioritized)

1. **Fix the Roadmap reading objective (high impact, low effort):**
   - Change `reading_material` route/target to surface actual level-appropriate `reading_passages` (or a new "Reading Practice" drill page) with 3–5 MCQs per session.
   - Make it auto-complete only after answering a minimum number correctly (or at least after attempting).

2. **Increase bank size + variety (critical for credibility):**
   - Run/generate to the documented targets (≥10 passages / 30 questions per level minimum).
   - Add explicit text-type metadata (notice, email, article, essay, blog, instruction) and ensure balanced sampling.
   - Consider a small set of hand-authored or past-exam-style items for calibration.

3. **Improve authenticity:**
   - Make assessment reading questions + choices in **Japanese** (at minimum for N3+; optionally keep Indonesian gloss for N5).
   - Add a "Translate this sentence" or hover for difficult phrases (non-exam mode) while keeping the assessment mode pure.

4. **Add a true practice mode (not just the gate):**
   - Dedicated `/reading` or per-level reading practice that draws from the bank, shows immediate feedback + explanation, highlights answer location in passage, and tracks accuracy by sub-skill over time.
   - Feed weak areas back into the daily plan / next-action engine.

5. **Track and surface sub-skills:**
   - Add lightweight tags to `reading_questions` (detail, gist, inference, vocab-in-context, connector, etc.).
   - Show per-skill accuracy in assessment results and in a "Reading Insights" panel.

6. **Bridge personal readings into the system:**
   - Allow users (or admins) to tag personal readings with JLPT level and import high-quality ones into the shared bank (with review).
   - Or at minimum surface level-appropriate personal readings as optional extra practice from the roadmap reading objective.

7. **Admin tooling & quality gates:**
   - Require at least one human review (`reviewed_at`) before a generated reading passage's items can enter an active assessment batch (already partially enforced for items; extend to source passages).
   - Add "regenerate questions only" (keep passage) and "replace this passage" flows.
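
As an illustration of recommendation 5, the sketch below aggregates per-sub-skill accuracy from tagged answers. The tag names come from the list above; the `TaggedResult` shape and the `accuracyBySubSkill` function are hypothetical and do not describe existing code.

```typescript
// Tag vocabulary proposed in recommendation 5.
type SubSkill = "detail" | "gist" | "inference" | "vocab-in-context" | "connector";

// One graded answer, carrying the sub-skill tag of its question (hypothetical shape).
interface TaggedResult {
  subSkill: SubSkill;
  correct: boolean;
}

interface SkillAccuracy {
  attempted: number;
  correct: number;
  accuracy: number; // correct / attempted, in the range 0..1
}

// Groups results by sub-skill; skills with no attempts are simply absent.
function accuracyBySubSkill(
  results: TaggedResult[],
): Partial<Record<SubSkill, SkillAccuracy>> {
  const summary: Partial<Record<SubSkill, SkillAccuracy>> = {};
  for (const result of results) {
    const entry = summary[result.subSkill] ?? { attempted: 0, correct: 0, accuracy: 0 };
    entry.attempted += 1;
    if (result.correct) {
      entry.correct += 1;
    }
    entry.accuracy = entry.correct / entry.attempted;
    summary[result.subSkill] = entry;
  }
  return summary;
}
```

A learner who answers both detail items correctly but only one of two inference items would see 100% for detail and 50% for inference, instead of a single blended reading score. The lowest-scoring skill is also a natural input for the next-action engine mentioned in recommendation 4.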

---

## 7. Conclusion

Manabou's reading feature for JLPT prep is **architecturally present and correctly sequenced on the roadmap**, but **functionally thin and misaligned** with the demands of real 読解 preparation.

- The assessment provides a minimal viable "can you read a short Japanese text and answer 5 MCQs?" gate.
- Everything before that gate (the Roadmap's mandated reading step) is disconnected from the actual content and format used in the gate.
- Volume, authenticity (question language), variety, feedback loops, and deliberate practice scaffolding are all well below the targets the project itself documented in `is-ready-for-n5.md` and `jlpt-assessment-criteria.md`.

**Bottom line (after implementation of the improvement plan, 2026-06-16):** The core gaps identified in this analysis have been addressed:

- Roadmap "reading" objective now deep-links to real level-graded curriculum passages + MCQs (`/reading/{level}`) with auto-complete on passing practice (Phase 0).
- Bank expanded to documented targets (≥10 passages / ≥30 questions per level) via seeder + generator; text_type variety + balanced sampling + human review gate for sources (Phase 1).
- Dedicated low-stakes Reading Practice mode with immediate feedback, sub-skill tags, explanations, pure-exam toggle, history, and weakness-driven suggestions in the next-action engine (Phases 2-3).
- Generator prompt updated for Japanese questions/choices on N3+; calibrated gold items, review gates, and sustainment scaffolding in place (Phase 4).

The reading track is now a **deliberate-practice preparation system** with proper sequencing, volume, feedback loops, and integration, no longer a thin gate. Remaining polish can be iterated: the full personal-readings bridge UI, a spaced review queue, strategy tips, more calibrated gold items, and full E2E coverage. The efficacy transformation set out in the plan is complete, and the ReadingPractice, LearningPathSeeder, Roadmap, and Assessment test suites are green.

The implemented plan addresses every major deficiency called out in the original analysis above (2026-06-16).

---

**References (key files/lines)**

- `backend/database/seeders/LearningPathSeeder.php:210,285` (reading_material objective)
- `backend/app/Services/ReadingGeneratorService.php:27-181`
- `backend/app/Services/AssessmentService.php:693-722` (readingQuestions), `39-45` (counts), `338-341` (pass rule)
- `frontend/app/roadmap/page.tsx:335-402`
- `frontend/app/materials/page.tsx` + `[id]/page.tsx` (what the roadmap actually links to)
- `backend/docs/is-ready-for-n5.md:97-99`, `jlpt-assessment-criteria.md:110-118`, `assessment-item-bank-plan.md`
- `backend/database/migrations/2026_06_09_000002_create_reading_assessment_tables.php`
- Tests: `AssessmentTest.php:58-75` (seedReading), `LearningPathSeederTest.php`

*End of analysis.*
