---
id: cheatsheet
title: Cheatsheet
---

## Word representation learning

To learn word vectors, run:

```bash
$ ./fasttext skipgram -input data.txt -output model
```

## Obtaining word vectors

To print the vectors of the words listed in a text file `queries.txt`, run:

```bash
$ ./fasttext print-word-vectors model.bin < queries.txt
```

## Text classification

To train a text classifier, run:

```bash
$ ./fasttext supervised -input train.txt -output model
```
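The training file is expected to hold one example per line, with each label marked by the `__label__` prefix (the default, which can be changed with `-label`). A minimal sketch that writes a toy `train.txt` and checks that every line carries at least one label; the sentences and label names are made up for illustration:

```bash
# Write a toy training file: one example per line, labels first.
# The labels and sentences are hypothetical placeholders.
cat > train.txt <<'EOF'
__label__positive great film with a strong cast
__label__negative dull plot and flat dialogue
__label__positive __label__comedy light and funny from start to finish
__label__negative the ending made no sense
EOF

# Count lines that contain no __label__ token; a non-zero count
# usually means the file is not in the expected format.
missing=$(grep -vc '__label__' train.txt || true)
echo "lines without a label: $missing"
```

A line may carry several labels, as the third example shows.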

Once the model is trained, you can evaluate it by computing the precision and recall at k (P@k and R@k) on a test set. The last argument is k, here 1:

```bash
$ ./fasttext test model.bin test.txt 1
```
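As a reminder of what these numbers mean, and assuming the usual micro-averaged definitions over the whole test set, precision and recall at $k$ can be written as follows. Here $Y_i$ is the set of gold labels of example $i$ and $\hat{Y}_i^{(k)}$ is the set of its top-$k$ predicted labels; both symbols are introduced only for this illustration:

```latex
P@k = \frac{\sum_i |Y_i \cap \hat{Y}_i^{(k)}|}{\sum_i |\hat{Y}_i^{(k)}|},
\qquad
R@k = \frac{\sum_i |Y_i \cap \hat{Y}_i^{(k)}|}{\sum_i |Y_i|}
```

With exactly one gold label per example and $k = 1$, both denominators equal the number of examples, so P@1 and R@1 coincide with accuracy.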

In order to obtain the k most likely labels for a piece of text, use:

```bash
$ ./fasttext predict model.bin test.txt k
```

In order to obtain the k most likely labels and their associated probabilities for a piece of text, use:

```bash
$ ./fasttext predict-prob model.bin test.txt k
```
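The output of `predict-prob` can be filtered with standard tools. The sketch below assumes each output line has the form `label probability label probability ...`, and uses a hypothetical `predictions.txt` in place of real output; the 0.5 threshold is arbitrary:

```bash
# Stand-in for: ./fasttext predict-prob model.bin test.txt 2 > predictions.txt
# The labels and probabilities below are made up.
cat > predictions.txt <<'EOF'
__label__positive 0.91 __label__comedy 0.06
__label__negative 0.48 __label__positive 0.41
EOF

# For each input line, keep only the labels whose probability is >= 0.5.
# Fields alternate: label, probability, label, probability, ...
awk -v t=0.5 '{
  out = ""
  for (i = 1; i < NF; i += 2)
    if ($(i + 1) + 0 >= t) out = out (out == "" ? "" : " ") $i
  print NR ": " (out == "" ? "(none)" : out)
}' predictions.txt > confident.txt

cat confident.txt
```

Each line of `confident.txt` is prefixed with the line number of the corresponding input, so low-confidence inputs are easy to find.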

If you want to compute vector representations of sentences or paragraphs, please use:

```bash
$ ./fasttext print-sentence-vectors model.bin < text.txt
```

## Quantization

To create a `.ftz` file with a smaller memory footprint from a trained `model.bin`, run:

```bash
$ ./fasttext quantize -output model
```

All other commands, such as `test`, also work with this model:

```bash
$ ./fasttext test model.ftz test.txt
```

## Autotune

Activate hyperparameter optimization with the `-autotune-validation` argument:

```bash
$ ./fasttext supervised -input train.txt -output model -autotune-validation valid.txt
```

Set a timeout (in seconds):

```bash
$ ./fasttext supervised -input train.txt -output model -autotune-validation valid.txt -autotune-duration 600
```

Constrain the final model size:

```bash
$ ./fasttext supervised -input train.txt -output model -autotune-validation valid.txt -autotune-modelsize 2M
```
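
Autotune needs a validation file that is separate from the training data. One way to produce it, sketched here under the assumption that all labeled examples start out in a single hypothetical file `all.txt`, is to send every fifth line to `valid.txt` and the rest to `train.txt`:

```bash
# Toy labeled data standing in for a real dataset (hypothetical contents).
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "__label__class$((i % 2)) example sentence number $i"
done > all.txt

# Deterministic 80/20 split: every fifth line becomes validation data.
# Shuffle all.txt first if its lines are ordered by label.
awk 'NR % 5 == 0 { print > "valid.txt"; next } { print > "train.txt" }' all.txt

wc -l train.txt valid.txt
```

The 80/20 ratio is only a common starting point; any split works as long as the two files do not overlap.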