Immutable serialisable data structures in C++11

immutablecppcaption

As C++ is very flexible language there are a lot of ways to construct immutable-like data structures from functional programming style. For example, this nice presentation by Kelvin Henney contains one of the ways. Here I will present my way which is slightly different.

At the same time my approach also solves another problem of serialisation of such immutable structures. I use JSON format at the moment, but the approach is not limited to this format only.

Unlike my previous posts I will start from the results this time.

I suppose you are familiar with general concept of immutability. If you are not – please read some minimal introduction to immutable data concept in functional programming.

FIRST PART – USAGE

ATM I have structure declarations simular to this:

Data fields of class are declared as const. And that’s what should be done to be sure that data will not be changed. Note, that I don’t change fields to functions, I don’t hide them below some getters or doing some other tricky stuff. I just mark fields const.

Next – SERIALIZE_JSON. This is macro which contains all the magic. Unfortunately we can’t at the moment achieve introspection without macro declaration. So this is black magic again. I promise next post will not contain macros 🙂

The last step is sharing immutable data through smart pointer to decrease data copy overhead introduced by functional-style data processing. And I use such pointers implicitly. For example: plain data object is named as EventData and pointer to such data is named just Event. This is arguable moment – it’s not necessary to follow this notation.

About the price of using shared_ptr there is nice video from NDC conference – The real price of Shared Pointers in C++ by Nicolai M. Josuttis.

Before presenting some usage examples let’s make data a bit more realistic:

This is description of some event which has id, title, rating, some schedule as pairs of start/finish linux-times, and vector of integer tags. All this just to show nested immutable structures and const vectors as parts of serialisable data.

Note that you can still add some methods into this class. Marking them const will be good idea.

IMMUTABILITY

Ok, it’s time for action! Structure creation is simple:

Using new C++ initialisation syntax we can not only form immutable structure the simple way, but also declare all nested collections. All constructors are generated automatically by same black-magic macro.

Important: immutable data does not have empty constructor! You can only create ‘filled’ data state. This is good feature as now it’s very problematic to get corrupted unfully-constructed state. It’s all or nothing. And of course you still could have empty shared_ptr which could contain no immutable data at all.

When you add new field into data structure all places where data was created explicitly will stop to compile. It might seem as bad, but actually this is very good restriction. Now you can’t forget to modify your object construction according new design.

As you can guess all fields could be accessed general way. But any modification will be prevented by compiler.

immutability

To modify immutable data we need to create new copy of data with modified field. So it’s not modification, but the construction of new object. Same macro is generating all such constructors:

Screenshot 2015-06-27 23.09.39

Note: type is auto-derived using decltype from C++11.

To get the idea how to use such immutable data in functional way using C++11 you can read several posts: post1 post2 post3 (and, probably, more are coming).

SERIALIZATION

Just two methods: toJSON() / fromJSON() are enough to handle all serialisation/deserialisation needs:

Output:

Sweet and simple. And note that we can (de)serialise nested structures / arrays.

Serialisation part is optional – if you only need immutability you don’t have to include this part. Implementation of immutability is not using any serialisation implicitly.

SECOND PART – IMPLEMENTATION

Under the hood there is fusion of macro-magic and C++11 features like decltype. I’ll try to express the main ideas how it was implemented instead of just copying of whole code. If you not trying to implement the given approach yourself you could even skip this and believe me that it works.

The next part will be a bit ugly. Be sure you are 16+ before reading this.

caution

Also I had some additional limitation which absence could make life a bit more easy – i could not use constexpr. I use immutable data inside cross-platform solutions and one of my targets is Windows desktop. I don’t know why Microsoft Visual Studio 2015 still does not like constexpr but it’s support is not yet completed. So don’t be surprised why i’m using couple of old-school macro technics to make functional stuff work in more comfort way.

At first we need macro which applies macro to each of macro parameter.

This is very ugly, but it works (clang, gcc, VS). As you might guess this specific version could handle up to 20 arguments only.

I will split the whole macro to blocks and to improve readability I will skip ‘\’ sign here which is at the end of each line in multiline macro. Also in each block ‘helper’ macros will be placed below to show content together.

Main constructor:

NAME – is name of class. VA_ARGS is the list of all fields. So macro is expanded in something like this – NAME(decltype(a) a, decltype(b) b,int finisher=0) : a(a), b(b), IImmutable(){} . Here IImmutable is some base class (optional), which could be empty. Also I use finisher to solve last comma problem – may be there is some more clean solution for this.

So other part of magic here is using decltype from C++11.

Additional copy constructors:

Note that we can’t move immutable data. Move semantics modify the source object and immutable objects can’t be modified.

The last constructor is from std::tuple. We need it for modification / serialisation methods.

The tricky part here is that to construct from tuple we need to know index of each field inside tuple. To get such index we need C++14’s constexpr with no C++11 limitations (increment inside constexpr). Or we can use __COUNTER__ macro to create additional static fields which contain indexes.

Note that __COUNTER__ macro is global so we need to generate additional offset field to store initial value and solve the difference for each index.

Generation of ‘set_’ methods:

Unfortunately tuple also has finisher as int type to solve ‘comma’ problem. May be i’ll find a way to make it possible without it.

To make shorter declaration of holding pointers:

Additional overload of compare operator:

This overload could be changed according to business logic. For example you could compare only id fields.

Whole serialisation part of macro is quite short:

Here I use my old lib for parsing JSON (described here). But you can use any JSON decoder you like. The only restriction is that you have to write wrappers so your lib could read/write values through single entry point. Here I will show an example implementation how to distinguish vector and non vector types.

JSON writer is using simple overloading / template specialisation:

Couple of implementations (for example):

So all type variations are hidden inside JSON writing/reading section.

Reader has a bit more complicated form of overloading because we have to overload only output type. To make this possible we have some dummy parameter, std::enable_if and simple type trait for vector detection.

Some overloads:

So you need to specify overloads for all your trivial types and containers. In practice this is not so compact but anyway you should do it only once.

CONCLUSION

Once again, all this ugly implementation detail is written only once and is located on low level of architecture. Upper business layer just uses this utility as black box and should not be aware of inner implementation.

Anyway the main idea of this post was to show how compact declaration of immutable data could be in C++11, and that you don’t have to write more boilerplate code for each business class declaration.

Any comments are welcome.

UPDATE: I put some source code on github as 2 gists:

My old JSON lib – https://gist.github.com/VictorLaskin/1fb078d7f4ac78857f48

Declaration – https://gist.github.com/VictorLaskin/48d1336e8b6eea16414b

Please, treat this code like just an example.