### Introduction

Compact Encoding Detection (CED for short) is a C++ library that scans
raw bytes and detects the most likely text encoding.

Basic usage:

```
#include <cstring>  // strlen

#include "compact_enc_det/compact_enc_det.h"

const char* text = "Input text";
bool is_reliable;
int bytes_consumed;

Encoding encoding = CompactEncDet::DetectEncoding(
        text, strlen(text),
        nullptr, nullptr, nullptr,  // URL, HTTP charset, and meta charset hints
        UNKNOWN_ENCODING,           // encoding hint
        UNKNOWN_LANGUAGE,           // language hint
        CompactEncDet::WEB_CORPUS,  // corpus type
        false,                      // do not ignore 7-bit mail encodings
        &bytes_consumed,
        &is_reliable);
```
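
To give a feel for what "detecting an encoding from raw bytes" involves, here is a minimal, self-contained sketch. It is not CED's algorithm and does not use the CED API: `ToyEncoding`, `IsValidUtf8`, and `GuessEncoding` are hypothetical names invented for this illustration. The sketch only checks for a byte order mark, pure 7-bit input, and well-formed UTF-8, and gives up on everything else, which is exactly the kind of input a real detector such as CED is meant to handle.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; CED returns its own `Encoding`
// type, which is not used here.
enum class ToyEncoding { kAscii, kUtf8, kUtf16LE, kUtf16BE, kUnknown };

// Returns true if `s` is well-formed UTF-8. Rejects truncated sequences,
// overlong forms, surrogates, and code points above U+10FFFF.
bool IsValidUtf8(const std::string& s) {
  static const unsigned long kMinCodePoint[] = {0x0, 0x80, 0x800, 0x10000};
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    if (lead < 0x80) {  // single-byte (ASCII) character
      ++i;
      continue;
    }
    std::size_t extra = 0;  // number of continuation bytes expected
    unsigned long cp = 0;   // decoded code point
    if ((lead & 0xE0) == 0xC0) {
      extra = 1;
      cp = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      extra = 2;
      cp = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      extra = 3;
      cp = lead & 0x07;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (i + extra >= s.size()) return false;  // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char tail = static_cast<unsigned char>(s[i + k]);
      if ((tail & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (tail & 0x3F);
    }
    if (cp < kMinCodePoint[extra]) return false;     // overlong form
    if (cp > 0x10FFFF) return false;                 // outside Unicode
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
    i += extra + 1;
  }
  return true;
}

// Toy heuristic: byte order mark first, then 7-bit check, then UTF-8 validity.
ToyEncoding GuessEncoding(const std::string& bytes) {
  if (bytes.compare(0, 3, "\xEF\xBB\xBF") == 0) return ToyEncoding::kUtf8;
  if (bytes.compare(0, 2, "\xFF\xFE") == 0) return ToyEncoding::kUtf16LE;
  if (bytes.compare(0, 2, "\xFE\xFF") == 0) return ToyEncoding::kUtf16BE;

  bool all_seven_bit = true;
  for (const char ch : bytes) {
    if (static_cast<unsigned char>(ch) >= 0x80) {
      all_seven_bit = false;
      break;
    }
  }
  if (all_seven_bit) return ToyEncoding::kAscii;
  if (IsValidUtf8(bytes)) return ToyEncoding::kUtf8;
  return ToyEncoding::kUnknown;  // a legacy 8-bit encoding, or binary data
}
```

With this sketch, `GuessEncoding("caf\xE9")` (Latin-1 bytes) returns `ToyEncoding::kUnknown`: the bytes are neither 7-bit nor valid UTF-8, and telling legacy encodings apart requires the kind of analysis the library provides.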

### How to build

You need [CMake](https://cmake.org/) to build the package. After unzipping
the source code, run `autogen.sh` to build everything automatically.
The script also downloads the [Google Test](https://github.com/google/googletest)
framework needed to build the unit test.

```
$ cd compact_enc_det
$ ./autogen.sh
...
$ bin/ced_unittest
```

On Windows, run `cmake .` to download the test framework and generate
project files for Visual Studio.

```
D:\packages\compact_enc_det> cmake .
```