In Quranic linguistics, a lemma is the canonical dictionary form of a word, the form you'll typically find in a lexicon. It acts as the base reference for all grammatical variations of that word, such as conjugated verbs or inflected nouns.
ℹ️ Note:
Lemmas are distinct from roots. While a root captures the core triliteral structure (e.g., ر-ح-م), a lemma represents the normalized lexical form (e.g., رَحِمَ) from which conjugations are derived.
QUL exports lemma data as a SQLite database with two tables: lemmas and word_lemmas.
lemmas

| Column | Type | Description |
|---|---|---|
| id | INTEGER | Unique identifier for the lemma |
| text | TEXT | Canonical form in Arabic with tashkeel |
| text_clean | TEXT | Canonical form in Arabic without tashkeel |
| words_count | INTEGER | Total number of occurrences of all words derived from this lemma |
| uniq_words_count | INTEGER | Count of unique word forms derived from this lemma |
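
Because text_clean stores the form without tashkeel, a lookup usually needs the search term stripped of diacritics first. The following is a minimal sketch of that idea, not QUL's own normalization: it assumes that dropping Unicode nonspacing marks is enough to match text_clean, it uses an in-memory database built from the columns listed above in place of the exported file, and the sample row's counts are made up. The helper names `strip_tashkeel` and `find_lemma` are hypothetical.

```python
import sqlite3
import unicodedata


def strip_tashkeel(text: str) -> str:
    """Drop Unicode nonspacing marks (category Mn), which include the Arabic harakat."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# In-memory stand-in for the exported file; the schema follows the table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lemmas ("
    "id INTEGER PRIMARY KEY, text TEXT, text_clean TEXT, "
    "words_count INTEGER, uniq_words_count INTEGER)"
)
# Hypothetical sample row: the counts are illustrative, not real data.
conn.execute("INSERT INTO lemmas VALUES (1, 'رَحِمَ', 'رحم', 10, 4)")


def find_lemma(conn: sqlite3.Connection, query: str):
    """Return (id, text, words_count) for the lemma matching the query, or None."""
    return conn.execute(
        "SELECT id, text, words_count FROM lemmas WHERE text_clean = ?",
        (strip_tashkeel(query),),
    ).fetchone()


match = find_lemma(conn, "رَحِمَ")
```

If the export normalizes other characters as well (for example alef variants), the helper would need the same rules for lookups to match.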

word_lemmas

| Column | Type | Description |
|---|---|---|
| lemma_id | INTEGER | Foreign key to the lemmas.id field |
| word_location | TEXT | Word identifier (e.g. 2:3:5) |
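
The two tables can be joined on lemma_id to move between a word and its lemma. The sketch below is illustrative only: it assumes word_location follows a surah:ayah:word pattern (inferred from the 2:3:5 example), it builds an in-memory database from the documented columns rather than opening the real export, and every row in it is made up. The helper names `parse_location`, `lemma_at`, and `locations_for_lemma` are hypothetical.

```python
import sqlite3

# In-memory stand-in for the exported file; columns follow the tables above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lemmas (
        id INTEGER PRIMARY KEY, text TEXT, text_clean TEXT,
        words_count INTEGER, uniq_words_count INTEGER
    );
    CREATE TABLE word_lemmas (
        lemma_id INTEGER REFERENCES lemmas(id),
        word_location TEXT
    );
    """
)
# Hypothetical sample rows: locations and counts are illustrative, not real data.
conn.execute("INSERT INTO lemmas VALUES (1, 'رَحِمَ', 'رحم', 2, 2)")
conn.executemany(
    "INSERT INTO word_lemmas VALUES (?, ?)",
    [(1, "2:3:5"), (1, "1:1:3")],
)


def parse_location(location: str) -> tuple[int, int, int]:
    """Split 'surah:ayah:word' into integers (assumed format)."""
    surah, ayah, word = (int(part) for part in location.split(":"))
    return surah, ayah, word


def lemma_at(conn: sqlite3.Connection, location: str):
    """Return text_clean of the lemma linked to a word location, or None."""
    row = conn.execute(
        "SELECT l.text_clean FROM word_lemmas AS w "
        "JOIN lemmas AS l ON l.id = w.lemma_id "
        "WHERE w.word_location = ?",
        (location,),
    ).fetchone()
    return row[0] if row else None


def locations_for_lemma(conn: sqlite3.Connection, lemma_id: int):
    """Return every location of a lemma as sorted (surah, ayah, word) tuples."""
    rows = conn.execute(
        "SELECT word_location FROM word_lemmas WHERE lemma_id = ?",
        (lemma_id,),
    )
    return sorted(parse_location(loc) for (loc,) in rows)


locations = locations_for_lemma(conn, 1)
```

Parsing the location into integers before sorting avoids the string ordering problem where "10:1:1" would sort before "2:1:1".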