[GSoC 02] Deployment and Configuration

Hardware Environment

  1. My personal laptop, a MacBook Pro 15’ inch (2015).
  2. A Dell PowerEdge Server runs Ubuntu 14.04 and equips with dual 8-core CPUs (Intel Xeon E5-2640 v2, 2.0 GHz), 96 GB memory space, and a Mellanox ConnectX-3 InfiniBand NIC (MT27500, 40 Gbps).

Software Environment

  1. Boost Library 1.60
    • I compile the binary and install this in MacOS, and directly install libboost via apt-get tool in Ubuntu.
  2. Hwloc Library (Hierarchical view of the machine)
  3. GCC-5.4
    • The server has a old version 4.8 of GCC, so C++17 cannot be used in the environment.
    • I upgrade gcc and use update-alternatives to switch new version on.
  4. CMake
    • Update my cmake from a private source list.

HPX deployment

  1. Add my_test/,ctags,’CMakeFiles/‘ to .gitignore for neglecting some unnecessary files.
  2. Make a new directory named my_test and run cmake in this directory and indicate the path of hwloc and libboost.
  3. Run make -j16 to build up and generate the static library.

Libcds

  1. make a test directory and run CMake (add -DCDS_DISABLE_128BIT_ATOMIC flag for 64-bit building).
  2. make and make install libcds

Libcds overview

The core functions are all in the cds directory.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
cds/
├── algo
│   └── flat_combining
├── compiler
│   ├── clang
│   ├── gcc
│   ├── icl
│   └── vc
├── container
│   ├── details
│   ├── impl
│   ├── striped_map
│   └── striped_set
├── details
├── gc
│   └── details
├── intrusive
│   ├── details
│   ├── impl
│   └── striped_set
├── lock
├── memory
├── opt
├── os
├── sync
├── threading
├── urcu
└── user_setup

In algo, there are some basic algorithm used in data structures, like flat combining (an OCC algorithm)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
cds/algo
├── atomic.h
├── backoff_strategy.h
├── base.h
├── bitop.h
├── bit_reversal.h
├── elimination.h
├── elimination_opt.h
├── elimination_tls.h
├── flat_combining
│   ├── defs.h
│   ├── kernel.h
│   └── wait_strategy.h
├── flat_combining.h
├── int_algo.h
└── split_bitstring.h

In most cases, Libcds uses RCU (read-copy-update) or HP (hazard pointer) to handle concurrent. In conclusion, the main types of container including:

  1. Queue
  2. Stack
  3. HashMap/HashSet
  4. List/KVList
  5. Binary Tree/AVL Tree
    There are some special algorithms for different data structure to provide lock-free, like MichaelList, Feldman Method and so on.

Test Libcds

  1. Make a new directory, and type make test to show the benchmarks.
  2. Some of the test case cannot pass, so I post an issue in libcds github homepage. (https://github.com/khizmax/libcds/issues/121/ )
  3. Firstly, I try to change my allocator (default is Glibc allocator) to TCMalloc. Then I try to build libcds with asan instrumentation and run the test.
  4. On the other hand, I also run some examples to test the correctness of these algorithms.