Layout of std::unordered_set (libstdc++)

Layout

  • The meta element on the stack consumes 7 words (56 Byte on x64, 28 Byte on x86).
  • Each node on the heap has a size of 1 word (pointer to next node) and sizeof(val).
  • The number and size of the buckets depends on the amount of values and collisions.

Rehashing

Rehashing happens when, before adding one item, the following condition is true:

(_M_element_count + 1) > (_M_bucket_count * (double)_M_max_load_factor)

Source: gcc/libstdc++-v3/src/c++11/hashtable_c++0x.cc


Example

#include <unordered_set>

int main()
{
    std::unordered_set<int> uset = { 1, 2, 3, 4, 5 };
    return 0;
}
$ g++ --version
g++ (Ubuntu 13.2.0-4ubuntu3) 13.2.0
$ g++ main.cpp -g
$ gdb a.out
(gdb) b 6
(gdb) r
(gdb) disable pretty-printer
(gdb) p /x uset
$1 = {
  _M_h = {...
          _M_buckets = 0x55555556e2d0,
          _M_bucket_count = 0xd, 
          _M_before_begin = {_M_nxt = 0x55555556e3a0},
          _M_element_count = 0x5,
          _M_rehash_policy = {static _S_growth_factor = 0x2,
                              _M_max_load_factor = 0x3f800000,
                              _M_next_resize = 0xd}, 
          _M_single_bucket = 0x0}}
(gdb) p sizeof(uset)
$2 = 56
(gdb) x/7xg &uset
0x7fffffffddf0:	0x000055555556e2d0 0x000000000000000d
0x7fffffffde00:	0x000055555556e3a0 0x0000000000000005
0x7fffffffde10:	0x000000003f800000 0x000000000000000d
0x7fffffffde20:	0x0000000000000000
(gdb) x/13xg uset._M_h._M_buckets
0x55555556e2d0:	0x0000000000000000 0x000055555556e340
0x55555556e2e0:	0x000055555556e360 0x000055555556e380
0x55555556e2f0:	0x000055555556e3a0 0x00007fffffffde00
0x55555556e300:	0x0000000000000000 0x0000000000000000
0x55555556e310:	0x0000000000000000 0x0000000000000000
0x55555556e320:	0x0000000000000000 0x0000000000000000
0x55555556e330:	0x0000000000000000
(gdb) x/2xg 0x000055555556e3a0
0x55555556e3a0:	0x000055555556e380 0x0000000000000005
(gdb) x/2xg 0x000055555556e380
0x55555556e380:	0x000055555556e360 0x0000000000000004
(gdb) x/2xg 0x000055555556e360
0x55555556e360:	0x000055555556e340 0x0000000000000003
(gdb) x/2xg 0x000055555556e340
0x55555556e340:	0x000055555556e2b0 0x0000000000000002
(gdb) x/2xg 0x000055555556e2b0
0x55555556e2b0:	0x0000000000000000 0x0000000000000001