1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
|
3 types of objects are distributed:
- input sequence reads, the key is the ReadHandle. A read handle is a global
integer
- graph vertices, the key is the DNA sequence (2-bit). hashing is used to find
the owner
- paths. the key is a PathHandle. a path handle contains a rank and a local
identifier
The distributed storage engine used by Ray is a distributed sparse hash table
(from RayPlatform) that uses these features:
- incremental resizing
- double hashing
- buckets are in groups
- distributed
The run-time options:
Distributed storage engine
-hash-table-buckets buckets
Sets the initial number of buckets. Must be a power of 2 !
Default value: 262144
-hash-table-buckets-per-group buckets
Sets the number of buckets per group for sparse storage
Default value: 64, Must be between >=1 and <= 64
-hash-table-load-factor-threshold threshold
Sets the load factor threshold for real-time resizing
Default value: 0.6, must be >= 0.5 and < 1
-hash-table-verbosity
Activates verbosity for the distributed storage engine
|