Every record in the dataset has a speaker field that identifies who is delivering the line. There are three categories of speaker tags.

Speaker categories

| Tag | Category | Usage |
| --- | --- | --- |
| Character names (e.g., Susie, Toriel) | Named character | Spoken dialogue |
| Narrator | Narrator | Game text, item descriptions, visual-to-text descriptions, stage directions |
| Player | Player choice | Menu choice options presented to the player |
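
The three categories can be derived from the speaker field alone. The following is a minimal sketch, assuming that any tag other than Narrator or Player is a character name; the function name `speaker_category` and the labels it returns are illustrative and not part of the dataset.

```python
def speaker_category(tag: str) -> str:
    """Map a speaker tag to one of the three categories listed above."""
    if tag == "Narrator":
        return "Narrator"
    if tag == "Player":
        return "Player choice"
    # Assumption: every other tag is an in-game character display name.
    return "Named character"


# Records copied from the examples on this page.
records = [
    {"context": "Scene: Obj Krisroom", "speaker": "Toriel", "text": "Wake up!"},
    {"context": "Scene: Device Contact", "speaker": "Narrator", "text": "ARE YOU THERE?"},
    {"context": "Scene: Device Contact", "speaker": "Player", "text": "YES"},
]

categories = [speaker_category(r["speaker"]) for r in records]
print(categories)
```

The comparison is case-sensitive on purpose, so a tag such as NARRATOR (if one existed) would be treated as a character name; adjust if your copy of the data differs.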

Character names

Character name tags correspond directly to in-game display names. Common characters across the dataset include:
| Character | Appears in |
| --- | --- |
| Kris | All chapters (protagonist, usually silent; choices appear as Player) |
| Susie | Chapters 1–4 |
| Ralsei | Chapters 1–4 |
| Toriel | Chapters 1–4 |
| Alphys | Chapters 1–4 |
| Noelle | Chapters 1–4 |
| Berdly | Chapters 1–3 |
| Lancer | Chapters 1–2 |
| Rouxls Kaard | Chapters 1–2 |
Example character dialogue:
{"context": "Scene: Obj Krisroom", "speaker": "Toriel", "text": "Kris...!"}
{"context": "Scene: Obj Krisroom", "speaker": "Toriel", "text": "Wake up!"}
{"context": "Scene: Obj Classscene", "speaker": "Susie", "text": "... am I late?"}
{"context": "Scene: Obj Schoollobbycutscene", "speaker": "Susie", "text": "Quiet people piss me off."}
{"context": "Scene: Obj Classscene", "speaker": "Alphys", "text": "So, does everyone have a..."}
Character tags reproduce the in-game display names exactly. Capitalisation follows game conventions rather than being normalised (e.g., RALSEI may appear in all-caps in certain scenes).
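
Because capitalisation can vary between scenes, an exact string comparison may miss some of a character's lines. One possible workaround is a case-insensitive comparison, sketched below. The helper `same_speaker` is hypothetical, and the sample records are placeholders rather than real dataset lines.

```python
def same_speaker(tag: str, name: str) -> bool:
    """Compare a speaker tag with a character name, ignoring case."""
    return tag.casefold() == name.casefold()


# Placeholder records for illustration only (not taken from the dataset).
records = [
    {"speaker": "Ralsei", "text": "placeholder line 1"},
    {"speaker": "RALSEI", "text": "placeholder line 2"},
    {"speaker": "Susie", "text": "placeholder line 3"},
]

ralsei_lines = [r for r in records if same_speaker(r["speaker"], "Ralsei")]
print(len(ralsei_lines))  # 2
```

This keeps the original tags untouched, which matters if the all-caps form carries meaning in a given scene.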

Narrator

The Narrator tag covers text that is neither spoken by a named character nor offered as a player choice, including:
  • Game text — Flavour text, environmental descriptions
  • Item descriptions — Text shown when examining objects
  • Visual-to-text (vid2text) — Descriptions of on-screen actions, animations, and stage directions converted from video
  • System messages — In-game prompts and interface text
{"context": "Scene: Device Contact", "speaker": "Narrator", "text": "ARE YOU THERE?"}
{"context": "Scene: Device Contact", "speaker": "Narrator", "text": "ARE WE CONNECTED?"}
{"context": "Scene: Device Contact", "speaker": "Narrator", "text": "EXCELLENT."}
{"context": "Scene: Device Contact", "speaker": "Narrator", "text": "YOU MUST CREATE A VESSEL."}
Chapter 1 was transcribed before the vid2text pass was applied, so its Narrator lines contain no visual or stage-direction descriptions. Chapters 2 and 3 each still have approximately 15 key scenes where visual descriptions are pending. See Known gaps for details.

Player

The Player tag marks player-facing choice options — the selectable menu items shown during dialogue branches and the character creation sequence.
{"context": "Scene: Device Contact", "speaker": "Player", "text": "YES"}
{"context": "Scene: Device Contact", "speaker": "Player", "text": "SWEETS"}
{"context": "Scene: Device Contact", "speaker": "Player", "text": "B"}
{"context": "Scene: Device Contact", "speaker": "Player", "text": "GREEN"}
{"context": "Scene: Device Contact", "speaker": "Player", "text": "VOICE"}
{"context": "Scene: Device Contact", "speaker": "Player", "text": "HOPE"}
Player lines record every option available at a choice, as it appears in the game, including options that were not selected in the transcribed playthrough.
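
Since every option is recorded, it can be useful to collect the options that belong to one prompt. The sketch below groups runs of consecutive Player records into choice sets. It rests on an assumption that this page does not confirm: that the options for a single prompt are stored as adjacent records. The function name `choice_sets` and the sample records are placeholders.

```python
from itertools import groupby


def choice_sets(records):
    """Return one list of option texts per run of consecutive Player records."""
    sets = []
    # groupby splits the sequence into runs that share the same key value.
    for is_player, run in groupby(records, key=lambda r: r["speaker"] == "Player"):
        if is_player:
            sets.append([r["text"] for r in run])
    return sets


# Placeholder records for illustration only (not taken from the dataset).
records = [
    {"speaker": "Narrator", "text": "prompt 1"},
    {"speaker": "Player", "text": "option A"},
    {"speaker": "Player", "text": "option B"},
    {"speaker": "Narrator", "text": "prompt 2"},
    {"speaker": "Player", "text": "option C"},
]

print(choice_sets(records))
```

If two prompts follow each other with no intervening Narrator or character line, this approach would merge their options, so check the grouping against the context field before relying on it.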

Filtering by speaker

To work with a specific speaker type in Python:
import pandas as pd

df = pd.read_json('data/chap2_dataset.jsonl', lines=True)

# All Susie lines
susie = df[df['speaker'] == 'Susie']

# All narration
narration = df[df['speaker'] == 'Narrator']

# All player choices
choices = df[df['speaker'] == 'Player']

# Count lines per speaker
print(df['speaker'].value_counts())
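
The same filters can be written with the standard library alone, which may be convenient where pandas is not installed. This is a sketch under the assumption that each line of the file is one JSON object with a speaker key, as in the examples on this page; `load_jsonl` is a hypothetical helper, and an in-memory sample stands in for the real file.

```python
import io
import json
from collections import Counter


def load_jsonl(handle):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


# Stand-in for open('data/chap2_dataset.jsonl', encoding='utf-8').
# The records are copied from the examples earlier on this page.
sample = io.StringIO(
    '{"context": "Scene: Obj Krisroom", "speaker": "Toriel", "text": "Kris...!"}\n'
    '{"context": "Scene: Obj Krisroom", "speaker": "Toriel", "text": "Wake up!"}\n'
    '{"context": "Scene: Device Contact", "speaker": "Narrator", "text": "ARE YOU THERE?"}\n'
    '{"context": "Scene: Device Contact", "speaker": "Player", "text": "YES"}\n'
)
records = load_jsonl(sample)

# Character dialogue only: everything that is not Narrator or Player.
dialogue = [r for r in records if r["speaker"] not in ("Narrator", "Player")]

# Count lines per speaker.
counts = Counter(r["speaker"] for r in records)
print(counts.most_common())
```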