File: roadmap.md

package info (click to toggle)
boost1.83 1.83.0-5
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 545,632 kB
  • sloc: cpp: 3,857,086; xml: 125,552; ansic: 34,414; python: 25,887; asm: 5,276; sh: 4,799; ada: 1,681; makefile: 1,629; perl: 1,212; pascal: 1,139; sql: 810; yacc: 478; ruby: 102; lisp: 24; csh: 6
file content (188 lines) | stat: -rw-r--r-- 6,731 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
# Refactoring Roadmap

[Proof of concept](https://github.com/joaquintides/fca_unordered) implementation for a fast closed-addressing implementation.

## Plan of Refactoring

* remove `ptr_node` and `ptr_bucket`
* see if the code can survive a lack of the `extra_node` or maybe we hard-code it in
* implement bucket groups as they are in `fca` but don't use them directly yet, add alongside the `buckets_` data member in `struct table`
* try to remove `bucket_info_` from the node structure (breaks all call-sites that use `get_bucket()` and dependents)
* make sure `fca` can successfully handle multi-variants at this stage + supports mutable iterators for `map`/`multimap`
* do a hard-break:
    * update code to no longer use one single linked list across all buckets (each bucket contains its own unique list)
    * integrate the `bucket_group<Node>` structure into the `table` (update iterator call-sites to include `bucket_iterator`s)

Blockers:
* how to handle `multi` variants with new `fca` prototype

## Implementation Differences

### Unordered

### Node Type

Bullet Points:
* reify node type into a single one
* come up with implementation for multi- variants
* code that touches `get_bucket()` and `*_in_group()` member functions may need updating

There are two node types in Unordered, `struct node` and `struct ptr_node`, and the node type is selected conditionally based on the Allocator's pointer type:
```c++
template <typename A, typename T, typename NodePtr, typename BucketPtr>
struct pick_node2
{
  typedef boost::unordered::detail::node<A, T> node;
  // ...
};

template <typename A, typename T>
struct pick_node2<A, T, boost::unordered::detail::ptr_node<T>*,
  boost::unordered::detail::ptr_bucket*>
{ 
  typedef boost::unordered::detail::ptr_node<T> node;
  // ...
};

template <typename A, typename T> struct pick_node
{
  typedef typename boost::remove_const<T>::type nonconst;

  typedef boost::unordered::detail::allocator_traits<
    typename boost::unordered::detail::rebind_wrap<A,
      boost::unordered::detail::ptr_node<nonconst> >::type>
    tentative_node_traits;

  typedef boost::unordered::detail::allocator_traits<
    typename boost::unordered::detail::rebind_wrap<A,
      boost::unordered::detail::ptr_bucket>::type>
    tentative_bucket_traits;

  typedef pick_node2<A, nonconst, typename tentative_node_traits::pointer,
    typename tentative_bucket_traits::pointer>
    pick;

  typedef typename pick::node node;
  typedef typename pick::bucket bucket;
  typedef typename pick::link_pointer link_pointer;
};
```

The node types are identical in terms of interface and the only difference is that `node` is chosen when the Allocator uses fancy pointers and `ptr_node` is chosen when the Allocator's pointer type is `T*`.

Nodes in Unorderd store `bucket_info_`:
```cpp
template <typename A, typename T>
struct node : boost::unordered::detail::value_base<T>
{
  link_pointer next_;
  std::size_t bucket_info_;
  node() : next_(), bucket_info_(0) {}
  // ...
};
```

`bucket_info_` maps each node back to its corresponding bucket via the member function:
```cpp
std::size_t get_bucket() const
{
  return bucket_info_ & ((std::size_t)-1 >> 1);
}
```

`bucket_info_` is also used to demarcate the start of equivalent nodes in the containers via:
```cpp
// Note that nodes start out as the first in their group, as `bucket_info_` defaults to 0.
std::size_t is_first_in_group() const
{ return !(bucket_info_ & ~((std::size_t)-1 >> 1)); }

void set_first_in_group()
{ bucket_info_ = bucket_info_ & ((std::size_t)-1 >> 1); }

void reset_first_in_group()
{ bucket_info_ = bucket_info_ | ~((std::size_t)-1 >> 1); }
```

A goal of refactoring is to simply have one node type:
```cpp
template<class T>
struct node {
  node *next;
  T    value;
};
```
that is used unconditionally. This also requires updating the code that touches the `bucket_info_` along with the code that that touches the `*_in_group()` member functions.

### Bucket Type

Bullet points:
* reify bucket structure into a single one
* figure out how to add `bucket_group`s to the table struct

Buckets are similar to nodes in that there are two variations: `template<class NodePointer> struct bucket` and `struct ptr_bucket`.

The buckets exist to contain a pointer to a node, however they contain an `enum { extra_node = true };` or `enum { extra_node = false }` to determine whether or not the code should explicitly allocate a default constructed node whose address assigned as the dummy node at the end of the bucket array.

`extra_node` is used in the creation and deletion of the bucket array but it is not inherently clear what its intended purpose is.

### Iterators

Iterators are currently templated on the type of Node they store. Because `fca` constructs iterators with two arguments, all the call-sites that instantiate iterators will need to be updated but this a straight-forward mechanical change.

Iterators are selected, as of now, via the `detail::map` and `detail::set` class templates.

For example, for `unordered_map`, `iterator` is defined as:
```cpp
typedef boost::unordered::detail::map<A, K, T, H, P> types;
typedef typename types::table table;
typedef typename table::iterator iterator;
```

The iterator is a member typedef of the `table` which is `types::table`. Examining `types` (aka `detail::map<...>`), we see:
```cpp
template <typename A, typename K, typename M, typename H, typename P>
struct map {
  // ...
  typedef boost::unordered::detail::table<types> table;
  // ...
};
```

Examining the `detail::table<types>` struct, we see:
```cpp
template <typename Types>
struct table {
  // ...
  typedef typename Types::iterator iterator;
  // ...
}
```

Collapsing all of this, we see that our iterator types are defined here:
```cpp
template <typename A, typename K, typename M, typename H, typename P>
struct map
{
  // ...
  typedef boost::unordered::detail::pick_node<A, value_type> pick;
  typedef typename pick::node node;

  typedef boost::unordered::iterator_detail::iterator<node> iterator;
  typedef boost::unordered::iterator_detail::c_iterator<node> c_iterator;
  typedef boost::unordered::iterator_detail::l_iterator<node> l_iterator;
  typedef boost::unordered::iterator_detail::cl_iterator<node>
    cl_iterator;
  // ...
};
```

This is similarly designed for `detail::set`:
```cpp
typedef boost::unordered::iterator_detail::c_iterator<node> iterator;
typedef boost::unordered::iterator_detail::c_iterator<node> c_iterator;
typedef boost::unordered::iterator_detail::cl_iterator<node> l_iterator;
typedef boost::unordered::iterator_detail::cl_iterator<node>
  cl_iterator;
```

The only difference here is that `set::iterator` is always a `c_iterator`, a `const_iterator` type.