File: language_model.md

# Language model

```{versionadded} 14.1.0

```

```{note}
This is an experimental feature. It is not yet stable.
```

## Summary

Language models are also useful for full text search. Groonga can
integrate with language models.

Groonga runs language models locally. We plan to provide a tool for
managing local language models in the future, but it doesn't exist
yet. For now, you need to download one or more language models
manually.

This feature uses [llama.cpp](https://github.com/ggerganov/llama.cpp)
internally, so only GGUF formatted language models can be used. See
also the "supported models" section in the llama.cpp README.

## How to manage language models

You need to put GGUF formatted language models into
`${PREFIX}/share/groonga/language_models/`.

For example:
`/usr/local/share/groonga/language_models/mistral-7b-v0.1.Q4_K_M.gguf`
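
For instance, here is a minimal Python sketch that copies a manually
downloaded GGUF file into that directory. The `/usr/local` prefix and
the download location are assumptions; adjust them to your
installation.

```python
import shutil
from pathlib import Path

# Assumption: Groonga was installed with the /usr/local prefix.
model_dir = Path("/usr/local/share/groonga/language_models")
model_dir.mkdir(parents=True, exist_ok=True)

# Assumption: the GGUF file was downloaded manually beforehand.
downloaded = Path.home() / "Downloads" / "mistral-7b-v0.1.Q4_K_M.gguf"
shutil.copy(downloaded, model_dir / downloaded.name)
```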

You can download GGUF formatted language models from [Hugging
Face](https://huggingface.co/). Some official language models also
provide GGUF formatted versions, but most of them don't.
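
If the model you want is already published in GGUF format on Hugging
Face, a hedged sketch like the following can download it directly into
the directory above using the `huggingface_hub` library. The
repository ID, file name, and `/usr/local` prefix are illustrative
assumptions, not part of Groonga.

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Assumptions: the repository ID and file name below are only an example
# of a GGUF build published on Hugging Face, and Groonga is installed
# under the /usr/local prefix. Adjust both to your environment.
hf_hub_download(
    repo_id="TheBloke/Mistral-7B-v0.1-GGUF",
    filename="mistral-7b-v0.1.Q4_K_M.gguf",
    local_dir="/usr/local/share/groonga/language_models",
)
```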

You can convert existing language models on Hugging Face to GGUF
format with
[GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo).

## Functions

- {doc}`functions/language_model_vectorize`