Jerrie Pelser's Blog

Learning Thai with AI: Preparing the AI prompts


Introduction

I have been learning Thai for many years, and over that time I have built up an Anki deck with the vocabulary I want to learn. At some point, I started using AI to help me translate words, describe their meaning and usage, and so on. This was all done manually, and over time I used various models with varying degrees of success.

This has left my Anki deck in a bit of a mess. For some words, I have AI-assisted translations and descriptions. For others, I don’t. And even for the ones where I do, the format differs because I experimented with different things over time.

So, I decided it was time to do something about this. I want to clean up all my cards and get them into a consistent state across the board. I also want to automate the creation of cards for new vocabulary. The current process is manual and takes a lot of time, and that friction keeps me from adding new cards.

Determining which model to use

The first step was to determine which model to use. I use Grok for my day-to-day AI tasks, so I asked it which model would best suit my needs.

[Image: Asking AI which model to use]

It suggested that the best model of the current crop is the Claude Sonnet/Opus family.

[Image: AI response on which model to use]

Determining the Anki deck structure

So, I opened up Claude and got started. Since I wanted to do a proper cleanup, I decided to revisit my Anki card structure and see whether I could improve that as well.

[Image: Asking about the structure of the Anki card]

It came back with something very similar to what I already had.

[Image: Answering my question about the structure of the Anki card]

I experimented with this structure by generating cards (more on that below) and refined it to the following:

**Card Structure:**
**Thai:** [the word in Thai script]
**Romanization:** [phonetic pronunciation with tone marks]
**English Translation:** [List each distinct meaning within THIS SPECIFIC grammatical function as separate bullet points]
- [meaning 1]
- [meaning 2]
- [meaning 3, if applicable]
**Context/Usage:** [Detailed explanation of when/how to use this word for EACH meaning within THIS grammatical function. Structure with clear labels:]
- **Meaning 1 ([brief descriptor]):** [Context and usage explanation]
- **Meaning 2 ([brief descriptor]):** [Context and usage explanation]
- **Meaning 3 ([brief descriptor]):** [Context and usage explanation, if applicable]
**Near-Synonyms/Contrasts:** [If applicable - list Thai near-synonyms for any of the meanings within this grammatical function and explicitly explain how this word differs from them in meaning, usage, context, scope, or formality. If no relevant near-synonyms, note "N/A"]
**Part of Speech:** [The specific grammatical function provided - must match exactly what was requested]
**Classifier:** [If applicable and the grammatical function is noun - the classifier used with this noun, otherwise note "N/A"]
**Example Sentences (Thai):** [2-3 bullet points PER MEANING showing the word used in context in THIS grammatical function, clearly labeled by meaning]
- [sentence 1 - Meaning 1]
- [sentence 2 - Meaning 1]
- [sentence 3 - Meaning 2]
- [sentence 4 - Meaning 2]
- [etc.]
**Example Sentences (English):** [Translations of the above, in the same order]
- [translation 1 - Meaning 1]
- [translation 2 - Meaning 1]
- [translation 3 - Meaning 2]
- [translation 4 - Meaning 2]
- [etc.]
**Notes:** [Any additional memory aids, common collocations, or formality markers SPECIFIC TO THIS GRAMMATICAL FUNCTION]

A few things to note:

  1. I decided to drop the audio because (a) I am not sure how to automate generating it (yet) and (b) my level of Thai is at a point where I don’t really need it any more. For more difficult words, I always have the Romanization to fall back on.
  2. Thai words often have synonyms which are very closely related and there are some nuances as to when to use which. This is one thing that often trips me up, so I decided to explicitly list the synonyms and contrast their use.

Generating the prompt

Next up, I wanted to generate the prompt. I decided to use Claude to generate a prompt for itself. Initially, it seemed to do OK, but a number of things tripped it up. Let’s look at some of those.

Distinguishing between part of speech

My first iteration created a single card covering all parts of speech, i.e. one card for the same word used as a verb, noun, adjective, etc.

[Image: Splitting parts of speech into separate cards]

So, I changed the prompt to return different parts of speech as separate cards.

Erroneously splitting different meanings

Splitting different parts of speech worked well, but I noticed that it also split different meanings of the same word (functioning as the same part of speech) into different cards. I felt this would be problematic, as I pointed out to Claude.

[Image: Claude erroneously splitting different meanings]

Frustration sets in

Claude updated the prompt to split different parts of speech into separate cards, but handle different meanings as a single card. At first, it seemed to work, but as I continued testing the prompt it became clear that it produced very unpredictable results.

Sometimes it ignored certain parts of speech.

[Image: Claude ignoring parts of speech]

Sometimes, it would split different meanings into separate cards again.

[Image: Splitting meanings by mistake]

No matter how much I refined the prompt and how clear I made it that it MUST split different parts of speech and NEVER split different meanings within the same part of speech, it kept messing up. This led me to decide that I might have better luck with two separate prompts: one to determine the parts of speech, and one to create a card for a single part of speech.

[Image: Asking Claude about creating two separate prompts]

More edge cases

Even with two prompts, I still ran into weird edge cases related to the Thai language.

[Image: Questioning Claude about stative verbs]

To fix this, I had to update the first prompt to make the distinction between stative verbs, action verbs, and adjectives.

The final prompts

After a little bit more iteration, I got to a point where I felt I had fairly consistent results. Here are my final prompts:

Prompt 1: Determining the parts of speech

System prompt

You are a Thai language expert. Your task is to analyze Thai input (which may be a single word, phrase, or grammar pattern) and determine ALL grammatical functions or categories it represents.
Research thoroughly and identify every grammatical role or category. Use these Thai grammatical categories:
**For single words:**
- **stative verb** (กริยาบอกสภาพ) - verbs that describe states or conditions (e.g., สงบ meaning "to be calm")
- **action verb** (กริยาแสดงอาการ) - verbs that describe actions (e.g., วิ่ง meaning "to run")
- **noun** (คำนาม) - names of people, places, things, concepts
- **adjective** (คุณศัพท์) - true adjectives that modify nouns (rare in Thai)
- **adverb** (กริยาวิเศษณ์) - words that modify verbs or adjectives
- **preposition** (บุพบท) - words showing relationships
- **classifier** (ลักษณนาม) - words used for counting
- **particle** (อนุภาค) - grammatical particles
- **other** - specify if it's something else
**For multi-word units:**
- **phrase** - multi-word units that function together (e.g., อย่าทำอย่างนั้น)
- **expression** - idiomatic expressions or fixed sayings
- **grammar pattern** - structural templates with variables (e.g., ไม่ + adj + เท่าไร)
**IMPORTANT:**
- Most words that English speakers think of as "adjectives" are actually **stative verbs** in Thai
- If the input is a phrase, expression, or grammar pattern, identify it as such rather than trying to classify it as a single word
- Be precise about the Thai grammatical category
**OUTPUT FORMAT:**
Return your response as valid JSON ONLY. Do not include any preamble, explanatory text, or markdown code fences (no ```json).
Use this EXACT schema:
{
"thai_input": "string",
"parts_of_speech": ["string", "string"]
}
The "parts_of_speech" array must contain at least one grammatical category. List all grammatical roles or categories that apply using the categories above.

User prompt

Analyze the Thai word: <THAI WORD>
NEVER, EVER include anything other than the JSON response

Note that I still had to add "NEVER, EVER include anything other than the JSON response" to the user prompt because, despite the system prompt making this clear, the model would still sometimes ignore it and return something other than JSON 🤷‍♂️
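Because the model does not always stick to JSON-only output, it may help to parse the reply defensively on the calling side as well. The following is a minimal sketch, not part of my actual setup: the function name `parse_parts_of_speech` is hypothetical, and it assumes the reply contains exactly one JSON object, possibly wrapped in a code fence or surrounded by stray text.

```python
import json
import re


def parse_parts_of_speech(raw: str) -> list[str]:
    """Extract the "parts_of_speech" list from a model reply.

    Tolerates a reply wrapped in a markdown code fence or surrounded by
    stray text, since the model does not always return bare JSON.
    """
    # Take everything from the first "{" to the last "}" so that code
    # fences and any preamble or trailing text are ignored.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")

    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both "no JSON" and "bad JSON".
    data = json.loads(match.group(0))

    parts = data.get("parts_of_speech")
    if not isinstance(parts, list) or not parts:
        raise ValueError("'parts_of_speech' must be a non-empty list")
    if not all(isinstance(p, str) and p.strip() for p in parts):
        raise ValueError("'parts_of_speech' must contain only non-empty strings")
    return [p.strip() for p in parts]
```

A caller could catch `ValueError` and simply retry the request, which is one way to cope with the occasional non-JSON reply.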

Prompt 2: Creating the Anki card

System prompt

You are a Thai language expert. Your task is to create a detailed Anki flashcard for a Thai word when it functions in a SPECIFIC grammatical role.
**CRITICAL INSTRUCTIONS:**
- You will be given a Thai word AND a specific grammatical function (e.g., "stative verb", "action verb", "noun")
- Create ONLY ONE card for that word functioning ONLY in that specific grammatical role
- If the word has multiple meanings within that grammatical function, include ALL meanings on the SAME card
- Do NOT include information about how the word functions in other grammatical roles
**Understanding Thai Grammatical Categories:**
- **stative verb** - describes states/conditions, not actions (what English speakers often call adjectives)
- **action verb** - describes actions, events, or processes
- **noun** - names of entities, concepts
- Other categories as applicable
When creating the card, focus ONLY on how the word functions in the specified grammatical role.
**Card Structure:**
**Thai:** [the word in Thai script]
**Romanization:** [phonetic pronunciation with tone marks]
**English Translation:** [List each distinct meaning within THIS SPECIFIC grammatical function as separate bullet points]
- [meaning 1]
- [meaning 2]
- [meaning 3, if applicable]
**Context/Usage:** [Detailed explanation of when/how to use this word for EACH meaning within THIS grammatical function. Structure with clear labels:]
- **Meaning 1 ([brief descriptor]):** [Context and usage explanation]
- **Meaning 2 ([brief descriptor]):** [Context and usage explanation]
- **Meaning 3 ([brief descriptor]):** [Context and usage explanation, if applicable]
**Near-Synonyms/Contrasts:** [If applicable - list Thai near-synonyms for any of the meanings within this grammatical function and explicitly explain how this word differs from them in meaning, usage, context, scope, or formality. If no relevant near-synonyms, note "N/A"]
**Part of Speech:** [The specific grammatical function provided - must match exactly what was requested]
**Classifier:** [If applicable and the grammatical function is noun - the classifier used with this noun, otherwise note "N/A"]
**Example Sentences (Thai):** [2-3 bullet points PER MEANING showing the word used in context in THIS grammatical function, clearly labeled by meaning]
- [sentence 1 - Meaning 1]
- [sentence 2 - Meaning 1]
- [sentence 3 - Meaning 2]
- [sentence 4 - Meaning 2]
- [etc.]
**Example Sentences (English):** [Translations of the above, in the same order]
- [translation 1 - Meaning 1]
- [translation 2 - Meaning 1]
- [translation 3 - Meaning 2]
- [translation 4 - Meaning 2]
- [etc.]
**Notes:** [Any additional memory aids, common collocations, or formality markers SPECIFIC TO THIS GRAMMATICAL FUNCTION]
---
**OUTPUT FORMAT:**
Return your response as valid JSON ONLY. Do not include any preamble, explanatory text, or markdown code fences (no ```json).
Use this EXACT schema:
{
"card": {
"thai": "string",
"romanization": "string",
"english_translation": ["meaning 1", "meaning 2"],
"context_usage": "Formatted with labeled meanings as described above",
"near_synonyms": "string or N/A",
"part_of_speech": "string (must match the provided grammatical function)",
"classifier": "string or N/A",
"example_sentences_thai": ["sentence 1", "sentence 2", "..."],
"example_sentences_english": ["translation 1", "translation 2", "..."],
"notes": "string"
}
}
All fields are required. Return exactly ONE card object that focuses ONLY on the specified grammatical function.

User prompt

Create an Anki card for the Thai word '<THAI WORD>' when it functions as a '<PART OF SPEECH>'. Focus ONLY on this grammatical function and do not include information about other grammatical roles.
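Since the system prompt states that all fields are required and that the part of speech must match the request, a parsed reply can be checked against those rules before it goes anywhere near Anki. This is only a sketch under my own assumptions: `REQUIRED_FIELDS` and `validate_card` are hypothetical names, and the type checks simply mirror the schema in the prompt above.

```python
# Field names and types taken from the JSON schema in the system prompt.
REQUIRED_FIELDS = {
    "thai": str,
    "romanization": str,
    "english_translation": list,
    "context_usage": str,
    "near_synonyms": str,
    "part_of_speech": str,
    "classifier": str,
    "example_sentences_thai": list,
    "example_sentences_english": list,
    "notes": str,
}


def validate_card(reply: dict, expected_pos: str) -> list[str]:
    """Return a list of problems found in a parsed card reply (empty if none)."""
    card = reply.get("card")
    if not isinstance(card, dict):
        return ["missing 'card' object"]

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in card:
            problems.append(f"missing field: {field}")
        elif not isinstance(card[field], expected_type):
            problems.append(f"wrong type for field: {field}")

    # The prompt requires the card to echo the requested grammatical function.
    if card.get("part_of_speech") != expected_pos:
        problems.append("part_of_speech does not match the request")

    # Translations are listed "in the same order", so the two lists should pair up.
    thai = card.get("example_sentences_thai")
    english = card.get("example_sentences_english")
    if isinstance(thai, list) and isinstance(english, list) and len(thai) != len(english):
        problems.append("Thai and English example sentences differ in count")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to see everything that is wrong with a card in a single pass.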

Example

Let’s look at an example. Here is a user prompt for determining the parts of speech:

Analyze the Thai word: ฟกช้ำ
NEVER, EVER include anything other than the JSON response

Which returns the following result:

{
  "thai_input": "ฟกช้ำ",
  "parts_of_speech": ["stative verb", "noun"]
}

For each of the parts of speech, I can then call the second prompt, for example:

Create an Anki card for the Thai word 'ฟกช้ำ' when it functions as a 'noun'. Focus ONLY on this grammatical function and do not include information about other grammatical roles.

Which would return the following result:

{
  "card": {
    "thai": "ฟกช้ำ",
    "romanization": "fók cháam",
    "english_translation": ["bruise", "contusion"],
    "context_usage": "**Meaning 1 (bruise/contusion):** Used to refer to the discolored mark on the skin caused by broken blood vessels under the surface, typically from impact or injury. This is the medical/physical condition itself as a noun. Commonly used in everyday conversation when discussing injuries, medical conditions, or examining wounds. Can refer to bruises of varying severity, from minor to serious.",
    "near_synonyms": "รอยช้ำ (rɔɔi cháam) - literally 'bruise mark,' emphasizes the visible mark/trace left by the bruising. ฟกช้ำ is more clinical and refers to the actual injury/condition itself, while รอยช้ำ focuses on the visible evidence. Both are commonly used interchangeably in everyday speech, though ฟกช้ำ is slightly more formal/medical.",
    "part_of_speech": "noun",
    "classifier": "รอย (rɔɔi) - used for marks, spots, or traces",
    "example_sentences_thai": [
      "เธอมีฟกช้ำที่แขนจากการล้ม",
      "ฟกช้ำบริเวณขาจะค่อยๆ หายไปเองภายในสองสามวัน",
      "หมอตรวจพบฟกช้ำหลายรอยบนร่างกายของผู้ป่วย",
      "ฟกช้ำนี้เจ็บมากเลย"
    ],
    "example_sentences_english": [
      "She has a bruise on her arm from falling.",
      "The bruise on the leg will gradually heal by itself within a few days.",
      "The doctor found several bruises on the patient's body.",
      "This bruise hurts a lot."
    ],
    "notes": "The word ฟกช้ำ combines ฟก (fók - bruised/swollen) and ช้ำ (cháam - bruised/beaten). When used as a noun, it's neutral in formality and appropriate for both casual and medical contexts. Commonly appears in medical settings, first aid discussions, and everyday conversations about injuries. The classifier รอย is used when counting: ฟกช้ำสามรอย (three bruises)."
  }
}
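The two-step flow in this example (one call to get the parts of speech, then one call per part of speech to get a card) could be wired together roughly as follows. This is a sketch rather than my actual code: `generate_cards` and the `ask` callable are hypothetical, with `ask` standing in for whatever client sends a system prompt and a user prompt to the model and returns the raw reply text. The two user prompt templates are the ones shown earlier.

```python
import json
from typing import Callable

POS_USER_PROMPT = (
    "Analyze the Thai word: {word}\n"
    "NEVER, EVER include anything other than the JSON response"
)
CARD_USER_PROMPT = (
    "Create an Anki card for the Thai word '{word}' when it functions as a "
    "'{pos}'. Focus ONLY on this grammatical function and do not include "
    "information about other grammatical roles."
)

# Hypothetical signature: (system_prompt, user_prompt) -> raw reply text.
AskModel = Callable[[str, str], str]


def generate_cards(word: str, ask: AskModel, pos_system: str, card_system: str) -> list[dict]:
    """Run prompt 1 once, then prompt 2 once per part of speech."""
    pos_reply = json.loads(ask(pos_system, POS_USER_PROMPT.format(word=word)))

    cards = []
    for pos in pos_reply["parts_of_speech"]:
        user_prompt = CARD_USER_PROMPT.format(word=word, pos=pos)
        card_reply = json.loads(ask(card_system, user_prompt))
        cards.append(card_reply["card"])
    return cards
```

Passing the model call in as a parameter keeps the sketch independent of any particular API client and makes it easy to test with canned replies. The plain `json.loads` calls assume the reply is bare JSON; given that the model sometimes returns more than that, a real run would likely need more forgiving parsing and a retry.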

Conclusion

In this blog post, I described my process for generating the prompts that help me create Anki cards for learning Thai. In the next blog post, we’ll turn our attention to Anki and see how we can export and import the data.