In Quranic linguistics, a lemma is the canonical dictionary form of a word, the form you would typically find in a lexicon. It serves as the base reference for all grammatical variations of that word, such as conjugated verbs or inflected nouns.
ℹ️ Note:
Lemmas are distinct from roots. While a root captures the core triliteral structure (e.g., ر-ح-م), a lemma represents the normalized lexical form (e.g., رَحِمَ) from which conjugations are derived.
Example:
- Lemma: رَحِمَ – to have mercy
- Derived forms:
  - يَرْحَمُ – he has mercy
  - نَرْحَمُ – we have mercy
  - ارْحَمْ – have mercy! (imperative)
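The relationship in the example above can be sketched as a small lookup structure. This is only an illustration of the lemma-to-forms idea, not how QUL stores the data; the names `DERIVED_FORMS`, `FORM_TO_LEMMA`, and `lemma_of` are hypothetical.

```python
# Illustrative only: one lemma and the derived forms listed in the example above.
LEMMA = "رَحِمَ"
DERIVED_FORMS = {
    LEMMA: ["يَرْحَمُ", "نَرْحَمُ", "ارْحَمْ"],
}

# Invert the mapping so that any derived form points back to its lemma.
FORM_TO_LEMMA = {
    form: lemma
    for lemma, forms in DERIVED_FORMS.items()
    for form in forms
}


def lemma_of(form):
    """Return the lemma for a derived form, or None if the form is unknown."""
    return FORM_TO_LEMMA.get(form)


for form in DERIVED_FORMS[LEMMA]:
    print(form, "->", lemma_of(form))
```

Every conjugated form resolves to the same dictionary entry, which is the role the lemma plays in the exported data.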
QUL exports lemma data as a SQLite database with two tables: lemmas and word_lemmas.
Table: lemmas
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Unique identifier for the lemma |
| text | TEXT | Canonical form in Arabic with tashkeel |
| text_clean | TEXT | Canonical form in Arabic without tashkeel |
| words_count | INTEGER | Total occurrences of all words derived from this lemma |
| uniq_words_count | INTEGER | Count of unique word forms derived from this lemma |
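A minimal sketch of looking up a lemma by its `text_clean` value with Python's built-in `sqlite3` module. It builds an in-memory table from the columns listed above so that it runs without the export file. The `PRIMARY KEY` constraint, the helper names, and the counts in the sample row are assumptions for illustration, not values from the real export.

```python
import sqlite3


def create_lemmas_table(conn):
    """Create a lemmas table with the documented columns (constraints assumed)."""
    conn.execute(
        """
        CREATE TABLE lemmas (
            id INTEGER PRIMARY KEY,
            text TEXT,
            text_clean TEXT,
            words_count INTEGER,
            uniq_words_count INTEGER
        )
        """
    )


def find_lemma(conn, text_clean):
    """Return the first lemma row matching text_clean, or None if absent."""
    return conn.execute(
        "SELECT id, text, text_clean, words_count, uniq_words_count "
        "FROM lemmas WHERE text_clean = ?",
        (text_clean,),
    ).fetchone()


# In-memory stand-in for the exported database file.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access by column name
create_lemmas_table(conn)

# Placeholder row: the counts are made up for this sketch.
conn.execute(
    "INSERT INTO lemmas VALUES (?, ?, ?, ?, ?)",
    (1, "رَحِمَ", "رحم", 3, 3),
)

lemma = find_lemma(conn, "رحم")
print(lemma["id"], lemma["text"], lemma["words_count"])
```

Against a real export, `sqlite3.connect(":memory:")` would be replaced by a connection to the downloaded file, and the `CREATE TABLE` and `INSERT` steps would be dropped.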
🔗 Table: word_lemmas
| Column | Type | Description |
|---|---|---|
| lemma_id | INTEGER | Foreign key to the lemmas.id field |
| word_location | TEXT | Word identifier (e.g. 2:3:5) |
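The two tables can be combined to list where a lemma occurs. The sketch below assumes that `word_location` follows a `surah:ayah:word` layout, which the example `2:3:5` suggests but the schema does not state. The helper names and the sample rows are hypothetical placeholders, not data from the export.

```python
import sqlite3
from typing import NamedTuple


class WordLocation(NamedTuple):
    surah: int
    ayah: int
    word: int


def parse_word_location(location):
    """Split a 'surah:ayah:word' string (assumed layout) into integers."""
    parts = location.split(":")
    if len(parts) != 3:
        raise ValueError(f"unexpected word_location: {location!r}")
    surah, ayah, word = (int(part) for part in parts)
    return WordLocation(surah, ayah, word)


def locations_for_lemma(conn, lemma_id):
    """Return every location linked to a lemma, sorted numerically."""
    rows = conn.execute(
        "SELECT word_location FROM word_lemmas WHERE lemma_id = ?",
        (lemma_id,),
    ).fetchall()
    return sorted(parse_word_location(row[0]) for row in rows)


# In-memory stand-in for the exported database; constraints are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lemmas (
        id INTEGER PRIMARY KEY,
        text TEXT,
        text_clean TEXT,
        words_count INTEGER,
        uniq_words_count INTEGER
    );
    CREATE TABLE word_lemmas (
        lemma_id INTEGER REFERENCES lemmas(id),
        word_location TEXT
    );
    """
)

# Placeholder rows: locations are illustrative, not real occurrences.
conn.execute("INSERT INTO lemmas VALUES (1, 'رَحِمَ', 'رحم', 2, 2)")
conn.executemany(
    "INSERT INTO word_lemmas VALUES (?, ?)",
    [(1, "10:2:1"), (1, "2:3:5")],
)

locations = locations_for_lemma(conn, 1)
for loc in locations:
    print(f"{loc.surah}:{loc.ayah}:{loc.word}")
```

Parsing the location into integers before sorting matters because `word_location` is stored as TEXT: a plain string sort would place `10:2:1` before `2:3:5`.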