File: Train-network-with-your-dataset.md

package info (click to toggle)
tiny-dnn 1.0.0a3%2Bds-5
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 4,784 kB
  • sloc: cpp: 16,471; ansic: 11,829; lisp: 3,682; python: 3,422; makefile: 208
file content (87 lines) | stat: -rw-r--r-- 2,836 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
# Train network with your original dataset
Here are some examples.

### 1. using opencv (image file => vec_t)

```cpp
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <boost/foreach.hpp>
#include <boost/filesystem.hpp>
using namespace boost::filesystem;

// convert image to vec_t
void convert_image(const std::string& imagefilename,
                   double scale,
                   int w,
                   int h,
                   std::vector<vec_t>& data)
{
    auto img = cv::imread(imagefilename, cv::IMREAD_GRAYSCALE);
    if (img.data == nullptr) return; // cannot open, or it's not an image

    cv::Mat_<uint8_t> resized;
    cv::resize(img, resized, cv::Size(w, h));
    vec_t d;

    std::transform(resized.begin(), resized.end(), std::back_inserter(d),
                   [=](uint8_t c) { return c * scale; });
    data.push_back(d);
}

// convert all images found in directory to vec_t
void convert_images(const std::string& directory,
                    double scale,
                    int w,
                    int h,
                    std::vector<vec_t>& data)
{
    path dpath(directory);

    BOOST_FOREACH(const path& p, 
                  std::make_pair(directory_iterator(dpath), directory_iterator())) {
        if (is_directory(p)) continue;
        convert_image(p.string(), scale, w, h, data);
    }
}
```

Another example can be found in [issue#16](https://github.com/tiny-dnn/tiny-dnn/issues/16), which can treat color channels.

### 2. using [mnisten](https://github.com/nyanp/mnisten) (image file => idx format)
mnisten is a library to convert image files to idx format.
```
mnisten -d my_image_files_directory_name -o my_prefix -s 32x32
```
After generating idx files, you can use parse_mnist_images / parse_mnist_labels utilities in mnist_parser.h

### 3. from levelDB (caffe style => [vec_t, label_t])
[Caffe](https://github.com/BVLC/caffe/) supports levelDB data format. Following code can convert levelDB created by Caffe into data/label arrays.

```cpp
#include "leveldb/db.h"

void convert_leveldb(const std::string& dbname,
                     double scale,
                     std::vector<vec_t>& data,
                     std::vector<label_t>& label)
{
    leveldb::DB *db;
    leveldb::Options options;
    options.create_if_missing = false;
    auto status = leveldb::DB::Open(options, dbname, &db);

    leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
    for (it->SeekToFirst(); it->Valid(); it->Next()) {
        const char* src = it->value().data();
        size_t sz = it->value().size();
        vec_t d;
        std::transform(src, src + sz - 1, std::back_inserter(d),
                       [=](char c){ return c * scale; });
        data.push_back(d);
        label.push_back(src[sz - 1]);
    }
    delete it;
    delete db;
}
```