File: Distributed-Storage-Engine.txt

Three types of objects are distributed:

- Input sequence reads. The key is a ReadHandle, which is a global integer.
- Graph vertices. The key is the DNA sequence (2-bit encoding). Hashing is
  used to find the owner.
- Paths. The key is a PathHandle, which contains a rank and a local
  identifier.
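
The keying scheme above can be illustrated with a minimal sketch. It is not
Ray's code: the function names, the hash function (BLAKE2b) and the way a path
handle packs its rank and local identifier are all assumptions chosen for the
example. The source only states that vertices are keyed by a 2-bit DNA
sequence, that hashing selects the owner, and that a path handle contains a
rank and a local identifier.

```python
# Hypothetical sketch: names, hash function and handle layout are assumptions,
# not taken from Ray or RayPlatform.
import hashlib

CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_2bit(sequence: str) -> int:
    """Pack a DNA string into an integer, 2 bits per nucleotide."""
    value = 0
    for base in sequence:
        value = (value << 2) | CODES[base]
    return value


def vertex_owner(sequence: str, number_of_ranks: int) -> int:
    """Choose the rank that stores a vertex by hashing its 2-bit key."""
    key = encode_2bit(sequence)
    n_bytes = max(1, (2 * len(sequence) + 7) // 8)
    digest = hashlib.blake2b(key.to_bytes(n_bytes, "little"),
                             digest_size=8).digest()
    return int.from_bytes(digest, "little") % number_of_ranks


def make_path_handle(rank: int, local_id: int, number_of_ranks: int) -> int:
    """Combine a rank and a local identifier into one integer (assumed layout)."""
    return local_id * number_of_ranks + rank


def path_handle_rank(handle: int, number_of_ranks: int) -> int:
    """Recover the owning rank from a path handle."""
    return handle % number_of_ranks


def path_handle_local_id(handle: int, number_of_ranks: int) -> int:
    """Recover the local identifier from a path handle."""
    return handle // number_of_ranks
```

Because the owner is a pure function of the key, any rank can compute where a
vertex lives without consulting a directory.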



The distributed storage engine used by Ray is a distributed sparse hash table
(from RayPlatform) with these features:

- incremental resizing
- double hashing
- buckets organized in groups (sparse storage)
- distribution across ranks
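
As an illustration of double hashing, here is a minimal sketch of a probe
sequence over a table whose bucket count is a power of 2. The function name
and the choice to force an odd stride are assumptions for the example;
RayPlatform's actual probing code may differ. With an odd stride and a
power-of-2 table size, the stride and the size share no common factor, so the
probe sequence visits every bucket exactly once.

```python
# Hypothetical sketch of double hashing; not RayPlatform's implementation.
from typing import Iterator


def probe_sequence(hash1: int, hash2: int, buckets: int) -> Iterator[int]:
    """Yield the buckets probed for a key given its two hash values."""
    if buckets < 1 or buckets & (buckets - 1):
        raise ValueError("buckets must be a power of 2")
    stride = hash2 | 1      # assumption: force an odd stride
    mask = buckets - 1      # modulo a power of 2 is a bit mask
    for i in range(buckets):
        yield (hash1 + i * stride) & mask
```

A lookup would walk this sequence until it finds the key or an empty bucket.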


The run-time options:


  Distributed storage engine

       -hash-table-buckets buckets
              Sets the initial number of buckets. Must be a power of 2.
              Default value: 262144

       -hash-table-buckets-per-group buckets
              Sets the number of buckets per group for sparse storage.
              Default value: 64. Must be >= 1 and <= 64.

       -hash-table-load-factor-threshold threshold
              Sets the load factor threshold for real-time resizing.
              Default value: 0.6. Must be >= 0.5 and < 1.

       -hash-table-verbosity
              Activates verbosity for the distributed storage engine
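
The constraints on these options can be summarized in a short sketch. The
function names are hypothetical and this is not how Ray parses its command
line. The defaults and limits are the ones listed above. The rule that
resizing starts once the load factor reaches the threshold (rather than
strictly exceeds it) is an assumption.

```python
# Hypothetical sketch: checks the documented constraints on the options.
def validate_options(buckets: int = 262144,
                     buckets_per_group: int = 64,
                     load_factor_threshold: float = 0.6) -> None:
    """Raise ValueError if a value violates its documented constraint."""
    if buckets < 1 or buckets & (buckets - 1):
        raise ValueError("-hash-table-buckets must be a power of 2")
    if not 1 <= buckets_per_group <= 64:
        raise ValueError("-hash-table-buckets-per-group must be >= 1 and <= 64")
    if not 0.5 <= load_factor_threshold < 1:
        raise ValueError(
            "-hash-table-load-factor-threshold must be >= 0.5 and < 1")


def needs_resize(stored_items: int, buckets: int, threshold: float) -> bool:
    """Assumed rule: resize when the load factor reaches the threshold."""
    return stored_items / buckets >= threshold
```

With the default values, a table of 262144 buckets would begin resizing under
this rule once it holds 157287 items, since 0.6 * 262144 = 157286.4.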