Because C++ is a very flexible language, there are many ways to build immutable-like data structures in the functional programming style. For example, this nice presentation by Kevlin Henney shows one of them. Here I will present my own way, which is slightly different.
At the same time, my approach solves another problem: serialisation of such immutable structures. I use the JSON format at the moment, but the approach is not limited to it.
Unlike in my previous posts, I will start from the results this time.
I assume you are familiar with the general concept of immutability. If you are not, please read a minimal introduction to immutable data in functional programming first.
FIRST PART – USAGE
At the moment I have structure declarations similar to this:
    class EventData : public IImmutable {
    public:
        const int id;
        const string title;
        const double rating;

        SERIALIZE_JSON(EventData, id, title, rating);
    };

    using Event = EventData::Ptr;
The data fields of the class are declared const, and that is exactly what is needed to be sure the data will not change. Note that I don't turn fields into functions, hide them behind getters or do any other tricky stuff. I just mark the fields const.
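To make the effect of const fields concrete, here is a minimal hand-written sketch without any macro. The type PointData and the function constFieldsExample are made up for illustration and are not part of the code from this post.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a macro-generated class (hand-written here).
struct PointData {
    const int id;
    const std::string title;

    PointData(int id, std::string title) : id(id), title(std::move(title)) {}
};

// Const members delete the implicit assignment operators, so an existing
// object cannot be overwritten as a whole either.
static_assert(!std::is_copy_assignable<PointData>::value,
              "objects with const fields are not assignable");

inline void constFieldsExample() {
    PointData p(7, "Nice event");
    // p.id = 8;        // does not compile: assignment of read-only member
    // p.title += "!";  // does not compile either
    assert(p.id == 7);  // reading is a plain field access
    assert(p.title == "Nice event");
}
```

The commented-out lines are the interesting part: uncommenting either of them should turn into a compile error rather than a runtime surprise.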
Next comes SERIALIZE_JSON. This is the macro which contains all the magic. Unfortunately, at the moment we can't achieve introspection without a macro declaration, so this is black magic again. I promise the next post will not contain macros 🙂
The last step is sharing immutable data through a smart pointer, to reduce the copying overhead introduced by functional-style data processing. I use such pointers implicitly: the plain data object is named EventData, and the pointer to such data is named just Event. This naming is debatable, and you don't have to follow it.
On the cost of using shared_ptr there is a nice video from the NDC conference: The Real Price of Shared Pointers in C++ by Nicolai M. Josuttis.
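The sharing idea can be sketched in a few lines. This is a simplified, hand-written stand-in (the struct below is not the macro-generated EventData, and sharingExample is a made-up name): copying the pointer only bumps a reference count, while the immutable object itself exists once.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the generated class, written by hand.
struct EventData {
    const int id;
    const std::string title;

    EventData(int id, std::string title) : id(id), title(std::move(title)) {}
};
using Event = std::shared_ptr<EventData>;

inline void sharingExample() {
    Event a = std::make_shared<EventData>(1, "Nice event");

    // Three more handles, still one EventData object in memory.
    std::vector<Event> list{a, a, a};

    assert(list[0].get() == a.get());  // same object, not a copy
    assert(a.use_count() == 4);        // 'a' plus three vector elements
}
```

Because nobody can modify the shared object, handing out extra pointers like this is safe without any defensive copying.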
Before presenting some usage examples, let's make the data a bit more realistic:
    class ScheduleItemData : public IImmutable {
    public:
        const time_t start;
        const time_t finish;
        SERIALIZE_JSON(ScheduleItemData, start, finish);
    };

    using ScheduleItem = ScheduleItemData::Ptr;

    class EventData : public IImmutable {
    public:
        const int id;
        const string title;
        const double rating;
        const vector<ScheduleItem> schedule;
        const vector<int> tags;

        SERIALIZE_JSON(EventData, id, title, rating, schedule, tags);
    };

    using Event = EventData::Ptr;
This describes an event which has an id, a title, a rating, a schedule as pairs of start/finish Unix timestamps, and a vector of integer tags. All this is just to show nested immutable structures and const vectors as parts of serialisable data.
Note that you can still add methods to this class. Marking them const is a good idea.
IMMUTABILITY
OK, it's time for action! Creating a structure is simple:
    Event event = EventData(136, "Nice event", 4.88, {ScheduleItemData(1111,2222), ScheduleItemData(3333,4444)}, {45,323,55});
Using the C++11 initialisation syntax we can not only build an immutable structure in a simple way, but also declare all nested collections in place. All constructors are generated automatically by the same black-magic macro.
Important: immutable data does not have a default (empty) constructor! You can only create a fully 'filled' data state. This is a good feature, because it becomes very hard to end up with a corrupted, partially constructed state. It's all or nothing. Of course, you can still have an empty shared_ptr which holds no immutable data at all.
When you add a new field to a data structure, every place where the data is created explicitly will stop compiling. It might seem bad, but it is actually a very good restriction: you can't forget to update your object construction to match the new design.
As you can guess, all fields can be accessed in the usual way, but any modification will be rejected by the compiler.
To modify immutable data we need to create a new copy of the data with the modified field. So it's not a modification, but the construction of a new object. The same macro generates a set_ method for every field, and each of them returns a pointer to such a new object.
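As an illustration of how such a call behaves, here is a hand-written sketch of a two-field class with one set_ method. It only imitates what the macro is described as generating later in this post (a set_ method per field, returning a Ptr); the body of set_rating and the name setExample are my own assumptions.

```cpp
#include <cassert>
#include <memory>

// Hand-written imitation of a generated class with two fields.
struct EventData {
    const int id;
    const double rating;

    using Ptr = std::shared_ptr<EventData>;

    EventData(int id, double rating) : id(id), rating(rating) {}

    // Returns a new object that differs from *this in one field only.
    Ptr set_rating(double newRating) const {
        return std::make_shared<EventData>(id, newRating);
    }
};
using Event = EventData::Ptr;

inline void setExample() {
    Event event = std::make_shared<EventData>(136, 4.5);
    Event updated = event->set_rating(5.0);

    assert(event->rating == 4.5);    // the original is untouched
    assert(updated->rating == 5.0);  // the copy carries the new value
    assert(updated->id == 136);      // all other fields are carried over
    assert(updated.get() != event.get());
}
```

Anyone still holding the old pointer keeps seeing the old state, which is exactly the property that makes functional-style processing easy to reason about.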
Note: the type is derived automatically from the field declaration using decltype from C++11.
To get an idea of how to use such immutable data in a functional way with C++11, you can read several posts: post1 post2 post3 (and, probably, more are coming).
SERIALIZATION
Just two methods, toJSON() and fromJSON(), are enough to handle all serialisation/deserialisation needs:
    // serialisation
    string json = event->toJSON();

    // deserialisation
    Event eventCopy = EventData::fromJSON(json);
Output:
    {"id":136,"title":"Nice event","rating":4.880000,"schedule":[{"start":1111,"finish":2222},{"start":3333,"finish":4444}],"tags":[45,323,55]}
Sweet and simple. Note that we can (de)serialise nested structures and arrays.
The serialisation part is optional: if you only need immutability, you don't have to include it. The immutability implementation does not depend on serialisation.
SECOND PART – IMPLEMENTATION
Under the hood there is a fusion of macro magic and C++11 features like decltype. I'll try to explain the main ideas of the implementation instead of just copying the whole code. If you are not going to implement this approach yourself, you can skip this part and take my word that it works.
The next part will be a bit ugly. Be sure you are 16+ before reading this.
I also had an additional limitation whose absence would have made life a bit easier: I could not use constexpr. I use immutable data inside cross-platform solutions, and one of my targets is Windows desktop. I don't know why Microsoft Visual Studio 2015 still does not like constexpr, but its support is not yet complete. So don't be surprised that I'm using a couple of old-school macro techniques to make the functional stuff more comfortable to work with.
First we need a macro which applies a given macro to each of its arguments.
    #define SERIALIZE_PRIVATE_DUP1(M,NAME,A) M(NAME,A)
    #define SERIALIZE_PRIVATE_DUP2(M,NAME,A,B) M(NAME,A) M(NAME,B)
    #define SERIALIZE_PRIVATE_DUP3(M,NAME,A,B,C) M(NAME,A) SERIALIZE_PRIVATE_DUP2(M,NAME,B,C)
    #define SERIALIZE_PRIVATE_DUP4(M,NAME,A,B,C,D) M(NAME,A) SERIALIZE_PRIVATE_DUP3(M,NAME,B,C,D)
    #define SERIALIZE_PRIVATE_DUP5(M,NAME,A,B,C,D,E) M(NAME,A) SERIALIZE_PRIVATE_DUP4(M,NAME,B,C,D,E)
    #define SERIALIZE_PRIVATE_DUP6(M,NAME,A,B,C,D,E,F) M(NAME,A) SERIALIZE_PRIVATE_DUP5(M,NAME,B,C,D,E,F)
    #define SERIALIZE_PRIVATE_DUP7(M,NAME,A,B,C,D,E,F,G) M(NAME,A) SERIALIZE_PRIVATE_DUP6(M,NAME,B,C,D,E,F,G)
    #define SERIALIZE_PRIVATE_DUP8(M,NAME,A,B,C,D,E,F,G,H) M(NAME,A) SERIALIZE_PRIVATE_DUP7(M,NAME,B,C,D,E,F,G,H)
    #define SERIALIZE_PRIVATE_DUP9(M,NAME,A,B,C,D,E,F,G,H,I) M(NAME,A) SERIALIZE_PRIVATE_DUP8(M,NAME,B,C,D,E,F,G,H,I)
    #define SERIALIZE_PRIVATE_DUP10(ME,NAME,A,B,C,D,E,F,G,H,I,K) ME(NAME,A) SERIALIZE_PRIVATE_DUP9(ME,NAME,B,C,D,E,F,G,H,I,K)
    #define SERIALIZE_PRIVATE_DUP11(ME,NAME,A,B,C,D,E,F,G,H,I,K,L) ME(NAME,A) SERIALIZE_PRIVATE_DUP10(ME,NAME,B,C,D,E,F,G,H,I,K,L)
    #define SERIALIZE_PRIVATE_DUP12(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M) ME(NAME,A) SERIALIZE_PRIVATE_DUP11(ME,NAME,B,C,D,E,F,G,H,I,K,L,M)
    #define SERIALIZE_PRIVATE_DUP13(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N) ME(NAME,A) SERIALIZE_PRIVATE_DUP12(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N)
    #define SERIALIZE_PRIVATE_DUP14(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O) ME(NAME,A) SERIALIZE_PRIVATE_DUP13(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O)
    #define SERIALIZE_PRIVATE_DUP15(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P) ME(NAME,A) SERIALIZE_PRIVATE_DUP14(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P)
    #define SERIALIZE_PRIVATE_DUP16(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R) ME(NAME,A) SERIALIZE_PRIVATE_DUP15(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R)
    #define SERIALIZE_PRIVATE_DUP17(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S) ME(NAME,A) SERIALIZE_PRIVATE_DUP16(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S)
    #define SERIALIZE_PRIVATE_DUP18(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T) ME(NAME,A) SERIALIZE_PRIVATE_DUP17(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T)
    #define SERIALIZE_PRIVATE_DUP19(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T,Q) ME(NAME,A) SERIALIZE_PRIVATE_DUP18(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T,Q)
    #define SERIALIZE_PRIVATE_DUP20(ME,NAME,A,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T,Q,Y) ME(NAME,A) SERIALIZE_PRIVATE_DUP19(ME,NAME,B,C,D,E,F,G,H,I,K,L,M,N,O,P,R,S,T,Q,Y)

    #define SERIALIZE_PRIVATE_EXPAND(x) x
    #define SERIALIZE_PRIVATE_DUPCALL(N,M,NAME,...) SERIALIZE_PRIVATE_DUP ## N (M,NAME,__VA_ARGS__)

    // counter of macro arguments + actual call
    #define SERIALIZE_PRIVATE_VA_NARGS_IMPL(_1,_2,_3,_4,_5,_6,_7,_8,_9,_10, _11,_12,_13,_14,_15,_16,_17,_18,_19,_20, N, ...) N
    #define SERIALIZE_PRIVATE_VA_NARGS(...) SERIALIZE_PRIVATE_EXPAND(SERIALIZE_PRIVATE_VA_NARGS_IMPL(__VA_ARGS__, 20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1))
    #define SERIALIZE_PRIVATE_VARARG_IMPL2(M,NAME,base, count, ...) SERIALIZE_PRIVATE_EXPAND(base##count(M,NAME,__VA_ARGS__))
    #define SERIALIZE_PRIVATE_VARARG_IMPL(M,NAME,base, count, ...) SERIALIZE_PRIVATE_EXPAND(SERIALIZE_PRIVATE_VARARG_IMPL2(M,NAME,base, count, __VA_ARGS__))
    #define SERIALIZE_PRIVATE_VARARG(M,NAME,base, ...) SERIALIZE_PRIVATE_EXPAND(SERIALIZE_PRIVATE_VARARG_IMPL(M,NAME, base, SERIALIZE_PRIVATE_VA_NARGS(__VA_ARGS__), __VA_ARGS__))
    #define SERIALIZE_PRIVATE_DUPAUTO(M,NAME,...) SERIALIZE_PRIVATE_EXPAND(SERIALIZE_PRIVATE_VARARG(M,NAME,SERIALIZE_PRIVATE_DUP, __VA_ARGS__))
This is very ugly, but it works (clang, gcc, VS). As you might guess, this specific version can handle up to 20 arguments only.
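The same mechanism is easier to follow in a cut-down form. The sketch below supports at most three arguments and uses made-up DEMO_ names. It also leaves out the extra EXPAND layer, so I would only expect it to work with preprocessors that handle __VA_ARGS__ in the standard way (gcc and clang do).

```cpp
#include <cassert>
#include <string>

// Apply macro M to each argument (hand-unrolled for 1..3 arguments).
#define DEMO_FOR_EACH_1(M, A)       M(A)
#define DEMO_FOR_EACH_2(M, A, B)    M(A) DEMO_FOR_EACH_1(M, B)
#define DEMO_FOR_EACH_3(M, A, B, C) M(A) DEMO_FOR_EACH_2(M, B, C)

// Count the arguments: the real arguments push the descending list to the
// right, so the matching count lands in the N slot.
#define DEMO_NARGS_IMPL(_1, _2, _3, N, ...) N
#define DEMO_NARGS(...) DEMO_NARGS_IMPL(__VA_ARGS__, 3, 2, 1, 0)

// Paste the count onto the base name. The extra level makes sure
// DEMO_NARGS(...) is expanded to a number before the paste happens.
#define DEMO_CONCAT_IMPL(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_IMPL(a, b)
#define DEMO_FOR_EACH(M, ...) \
    DEMO_CONCAT(DEMO_FOR_EACH_, DEMO_NARGS(__VA_ARGS__))(M, __VA_ARGS__)

static_assert(DEMO_NARGS(a, b) == 2, "two arguments are counted as 2");

// Example payload macro: append the stringified field name.
#define DEMO_APPEND_NAME(X) out += #X; out += ';';

inline std::string fieldNames() {
    std::string out;
    DEMO_FOR_EACH(DEMO_APPEND_NAME, id, title, rating)
    return out;  // "id;title;rating;"
}
```

Every block of the big macro later in the post is this pattern with a different payload macro in place of DEMO_APPEND_NAME.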
I will split the whole macro into blocks. To improve readability, I will omit the '\' sign which ends each line of a multiline macro. Also, in each block the 'helper' macros are placed below the code that uses them, to show the content together.
Main constructor:
    // Main constructor
    NAME(SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLEDECL,NAME,__VA_ARGS__) int finisher = 0) :
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLEPARAM,NAME,__VA_ARGS__)IImmutable(){}

    #define SERIALIZE_PRIVATE_CTORIMMUTABLEDECL(NAME,VAL) decltype(VAL) VAL,
    #define SERIALIZE_PRIVATE_CTORIMMUTABLEPARAM(NAME,VAL) VAL(VAL),
NAME is the name of the class, and __VA_ARGS__ is the list of all fields. So the macro expands into something like this: NAME(decltype(a) a, decltype(b) b, int finisher = 0) : a(a), b(b), IImmutable() {}. Here IImmutable is an (optional) base class, which can be empty. I also use finisher to solve the trailing comma problem; maybe there is a cleaner solution for this.
So the other part of the magic here is decltype from C++11.
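Written out by hand for a two-field class, the expansion looks roughly like the sketch below. PairData is a made-up name, the IImmutable base is left out, and the finisher parameter is kept unnamed to avoid an unused-parameter warning.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct PairData {
    const int a;
    const std::string b;

    // Hand-expanded form of the generated constructor: every parameter
    // type is taken from the field declaration via decltype. The trailing
    // int plays the role of 'finisher' and absorbs the last comma.
    PairData(decltype(a) a, decltype(b) b, int /*finisher*/ = 0) : a(a), b(b) {}
};

// decltype of a field declared 'const int' is 'const int'.
static_assert(std::is_same<decltype(PairData::a), const int>::value,
              "parameter type is derived from the field");

// No default constructor exists: only a fully filled object can be created.
static_assert(!std::is_default_constructible<PairData>::value,
              "immutable data has no empty constructor");

inline void constructorExample() {
    PairData p(1, "one");
    assert(p.a == 1);
    assert(p.b == "one");
}
```

The price of the finisher trick is visible here too: the constructor silently accepts one extra int argument.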
Additional copy constructors:
    // copy constructors
    NAME(NAME&& other) noexcept :
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLECOPY,NAME,__VA_ARGS__)IImmutable()
    {}

    NAME(const NAME& other) noexcept :
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLECOPY,NAME,__VA_ARGS__)IImmutable(){}

    #define SERIALIZE_PRIVATE_CTORIMMUTABLECOPY(NAME,VAL) VAL(other.VAL),
Note that we can't really move immutable data: move semantics modify the source object, and immutable objects can't be modified. That is why the constructor taking NAME&& above copies each field as well.
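The same effect can be observed with a compiler-generated move constructor. In this hand-written sketch (TitleData and moveExample are made-up names) std::move on an object with a const std::string field ends up calling the string's copy constructor, so the source stays intact.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct TitleData {
    const std::string title;

    explicit TitleData(std::string t) : title(std::move(t)) {}
    // The implicit move constructor sees 'const std::string&&' for the
    // field, which can only bind to std::string's copy constructor.
};

// The class still counts as move constructible, the "move" is just a copy.
static_assert(std::is_move_constructible<TitleData>::value,
              "moving falls back to copying");

inline void moveExample() {
    TitleData a("Nice event");
    TitleData b(std::move(a));

    assert(b.title == "Nice event");
    assert(a.title == "Nice event");  // the source was not emptied
}
```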
The last constructor takes a std::tuple. We need it for the modification and serialisation methods.
    // Constructor from tuple
    NAME(std::tuple< SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLEDECLTYPENONCONST,NAME,__VA_ARGS__) int> vars) :
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORFROMTUPLE,NAME,__VA_ARGS__)IImmutable() {}

    #define SERIALIZE_PRIVATE_CTORIMMUTABLEDECLTYPENONCONST(NAME,VAL) typename std::remove_const<decltype(VAL)>::type,
    #define SERIALIZE_PRIVATE_CTORFROMTUPLE(NAME,VAL) VAL(std::get<SERIALIZE_PRIVATE_GETINDEX(NAME,VAL)>(vars)),
The tricky part here is that, to construct from a tuple, we need to know the index of each field inside the tuple. To get such an index we need either C++14's relaxed constexpr, without the C++11 limitations (for example, an increment inside a constexpr function), or the __COUNTER__ macro, which lets us create additional static fields that hold the indexes.
    // Index of each field inside class:
    static const int _index_offset = __COUNTER__;
    SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_FIELDINDEX,NAME,__VA_ARGS__)

    #define SERIALIZE_PRIVATE_FIELDINDEX(NAME,VAL) static const int _index_of_##VAL = __COUNTER__ - _index_offset - 1;
Note that the __COUNTER__ macro is global (one counter is shared by the whole translation unit), so we need an additional offset field to store its initial value and subtract it to get each index.
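Here is the index trick written out by hand for three fields. __COUNTER__ is a compiler extension rather than standard C++, although gcc, clang and Visual Studio all provide it. FieldIndexDemo and titleFromTuple are made-up names for this sketch.

```cpp
#include <cassert>
#include <string>
#include <tuple>

struct FieldIndexDemo {
    // Whatever value the counter has at this point becomes the offset.
    static const int _index_offset = __COUNTER__;

    // Hand-expanded SERIALIZE_PRIVATE_FIELDINDEX for id, title, rating.
    static const int _index_of_id     = __COUNTER__ - _index_offset - 1;
    static const int _index_of_title  = __COUNTER__ - _index_offset - 1;
    static const int _index_of_rating = __COUNTER__ - _index_offset - 1;
};

// The indexes are compile-time constants 0, 1, 2 no matter how often
// __COUNTER__ was used earlier in the translation unit.
static_assert(FieldIndexDemo::_index_of_id == 0, "first field");
static_assert(FieldIndexDemo::_index_of_title == 1, "second field");
static_assert(FieldIndexDemo::_index_of_rating == 2, "third field");

// Being constants, they can be used as template arguments of std::get.
inline std::string titleFromTuple(const std::tuple<int, std::string, double>& t) {
    return std::get<FieldIndexDemo::_index_of_title>(t);
}
```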
Generation of ‘set_’ methods:
    // Generation of set_ methods
    SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CLONEANDSET2,NAME,__VA_ARGS__)

    // Convert data into std::tuple
    std::tuple< SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLEDECLTYPENONCONST,NAME,__VA_ARGS__) int> toTuple() const noexcept {
        return make_tuple(SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_CTORIMMUTABLEVAL,NAME,__VA_ARGS__) 0);
    }

    #define SERIALIZE_PRIVATE_CLONEANDSET2(NAME,VAL) NAME::Ptr set_##VAL(decltype(VAL) VAL) const noexcept { auto t = toTuple(); std::get<SERIALIZE_PRIVATE_GETINDEX(NAME,VAL)>(t) = VAL; return std::make_shared<NAME>(NAME(t)); }
Unfortunately, the tuple also has a finisher of type int to solve the 'comma' problem. Maybe I'll find a way to get rid of it.
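The clone-and-set round trip through a tuple can be sketched by hand like this. It is a simplified imitation: the field indexes are hard-coded instead of coming from generated index constants, the int finisher is left out of the tuple, and the names are chosen for this example only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <tuple>
#include <utility>

struct EventData {
    const int id;
    const std::string title;

    using Ptr = std::shared_ptr<EventData>;
    using Tuple = std::tuple<int, std::string>;  // field types without const

    EventData(int id, std::string title) : id(id), title(std::move(title)) {}

    // Constructor from tuple: each field is read from its own index.
    explicit EventData(const Tuple& vars)
        : id(std::get<0>(vars)), title(std::get<1>(vars)) {}

    // Copy all fields into a tuple, which is mutable.
    Tuple toTuple() const { return std::make_tuple(id, title); }

    // Clone into a tuple, overwrite one slot, build a new object from it.
    Ptr set_title(std::string newTitle) const {
        Tuple t = toTuple();
        std::get<1>(t) = std::move(newTitle);
        return std::make_shared<EventData>(t);
    }
};
```

The tuple acts as a temporary mutable copy of the object, which is why the generated code needs both toTuple() and the constructor from a tuple.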
To make declarations of the holding pointers shorter:
    // Short smart-pointer declaration
    typedef std::shared_ptr<NAME> Ptr;
Additional overload of the comparison operator:
    // compare operator overload
    bool operator== (const NAME& other) const noexcept {
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_COMPAREIMMUTABLE,NAME,__VA_ARGS__) return true;
        return false;
    }

    #define SERIALIZE_PRIVATE_COMPAREIMMUTABLE(NAME,VAL) if (other.VAL==VAL)
This overload can be changed according to your business logic. For example, you could compare only the id fields.
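For two fields the generated operator expands into a chain of nested ifs, shown below in a hand-written sketch. The sameIdentity method is a made-up example of an id-only comparison kept next to the full one.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct EventData {
    const int id;
    const std::string title;

    EventData(int id, std::string title) : id(id), title(std::move(title)) {}

    // Hand-expanded form of the generated operator: one nested 'if' per
    // field, so 'return true' is reached only when every field matches.
    bool operator==(const EventData& other) const noexcept {
        if (other.id == id)
            if (other.title == title)
                return true;
        return false;
    }

    // Hypothetical business-logic variant: same id means same event.
    bool sameIdentity(const EventData& other) const noexcept {
        return id == other.id;
    }
};
```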
The whole serialisation part of the macro is quite short:
    // JSON serialisation is done using my own old lib
    string toJSON() const noexcept {
        JSON::MVJSONWriter w;
        w.begin();
        SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_APPENDTOJSON,NAME,__VA_ARGS__)
        w.end();
        return w.result;
    }

    static NAME fromJSON(string json) {
        JSON::MVJSONReader reader(json);
        return fromJSON(reader.root);
    }

    static NAME fromJSON(JSON::MVJSONNode* node) {
        return NAME( make_tuple(SERIALIZE_PRIVATE_DUPAUTO(SERIALIZE_PRIVATE_FROMJSON,NAME,__VA_ARGS__) 0));
    }

    #define SERIALIZE_PRIVATE_APPENDTOJSON(NAME,VAL) w.add(#VAL,VAL);
    #define SERIALIZE_PRIVATE_FROMJSON(NAME,VAL) node->getValue<decltype(VAL)>(#VAL),
Here I use my old lib for parsing JSON (described here), but you can use any JSON decoder you like. The only restriction is that you have to write wrappers so that your lib can read/write values through a single entry point. Here I will show an example implementation of how to distinguish vector and non-vector types.
The JSON writer uses simple overloading / template specialisation:
    template< typename T >
    inline string toString(const T& value);

    template< typename T >
    inline string toString(const vector<T>& value);
A couple of implementations (for example):
    template< typename T >
    inline string MVJSONWriter::toString(const T& value)
    {
        // default overload is for string-like types!
        return "\"" + value + "\"";
    }

    template< typename T >
    inline string MVJSONWriter::toString(const vector<T>& value)
    {
        string result = "[";
        for (auto item : value)
            result += ((result != "[") ? "," : "") + item->toJSON();
        result += "]";
        return result;
    }

    template<>
    inline string MVJSONWriter::toString(const vector<int>& value)
    {
        string result = "[";
        for (auto item : value)
            result += ((result != "[") ? "," : "") + std::to_string(item);
        result += "]";
        return result;
    }
So all type variations are hidden inside JSON writing/reading section.
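The single entry point idea can be shown with a standalone sketch that uses free functions instead of the MVJSONWriter class. The overload set below is my own simplification: it covers only strings, ints and vectors of those, and it does not escape quotes inside strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One name, several overloads: the caller never has to care about the type.
inline std::string toString(const std::string& value) {
    return "\"" + value + "\"";  // no escaping in this sketch
}

inline std::string toString(int value) {
    return std::to_string(value);
}

// Vectors are written element by element through the same entry point,
// so vectors of vectors work as well.
template <typename T>
std::string toString(const std::vector<T>& value) {
    std::string result = "[";
    for (std::size_t i = 0; i < value.size(); ++i) {
        if (i != 0) result += ",";
        result += toString(value[i]);
    }
    result += "]";
    return result;
}
```

With such an overload set a macro-generated call like w.add(#VAL, VAL) can stay identical for every field, whatever its type.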
The reader has a slightly more complicated form of overloading, because we have to overload on the return type only. To make this possible we use a dummy parameter, std::enable_if and a simple type trait for vector detection.
    template <typename T> struct is_vector { static const bool value = false; };
    template <typename T> struct is_vector< std::vector<T> > { static const bool value = true; };
    template <typename T> struct is_vector< const std::vector<T> > { static const bool value = true; };

    // inside reader class:

    template<class T>
    inline T getValue(const string& name, typename enable_if<!is_vector<T>::value, T>::type* = nullptr);

    template<class T>
    inline T getValue(const string& name, typename enable_if<is_vector<T>::value, T>::type* = nullptr);
Some overloads:
    template<class T>
    inline T
    MVJSONNode::getValue(const string& name, typename enable_if<!is_vector<T>::value, T>::type*)
    {
        MVJSONValue* value = getField(name);
        if (value == NULL) return "";
        return value->stringValue;
    }

    template<class T>
    inline T
    MVJSONNode::getValue(const string& name, typename enable_if<is_vector<T>::value, T>::type*)
    {
        typename std::remove_const<T>::type result;
        MVJSONValue* value = getField(name);
        if (value == NULL) return result;
        for (auto item : value->arrayValue)
        {
            result.push_back(remove_pointer<decltype(std::declval<T>().at(0).get())>::type::fromJSON(item->objValue));
        }
        return result;
    }

    template<>
    inline const vector<int>
    MVJSONNode::getValue(const string& name, typename enable_if<is_vector<const vector<int>>::value, const vector<int>>::type*)
    {
        vector<int> result;
        MVJSONValue* value = getField(name);
        if (value == NULL) return result;
        for (auto item : value->arrayValue)
        {
            result.push_back((int)item->intValue);
        }
        return result;
    }
So you need to provide overloads for all your trivial types and containers. In practice this is not so compact, but you only have to do it once.
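The return-type dispatch can be tried out in isolation. The sketch below keeps the is_vector trait and the dummy enable_if parameter, but as a loud simplification it reads from a plain string instead of a JSON node: a scalar is parsed with std::stoi, and a vector is parsed from a comma-separated list of ints.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

template <typename T> struct is_vector : std::false_type {};
template <typename T> struct is_vector<std::vector<T>> : std::true_type {};
template <typename T> struct is_vector<const std::vector<T>> : std::true_type {};

// Chosen when T is NOT a vector: the dummy pointer parameter only exists
// so that enable_if can remove this overload for vector types.
template <class T>
T getValue(const std::string& raw,
           typename std::enable_if<!is_vector<T>::value>::type* = nullptr) {
    return static_cast<T>(std::stoi(raw));
}

// Chosen when T is a (possibly const) vector.
template <class T>
T getValue(const std::string& raw,
           typename std::enable_if<is_vector<T>::value>::type* = nullptr) {
    typename std::remove_const<T>::type result;
    std::size_t pos = 0;
    while (pos < raw.size()) {
        std::size_t comma = raw.find(',', pos);
        if (comma == std::string::npos) comma = raw.size();
        result.push_back(std::stoi(raw.substr(pos, comma - pos)));
        pos = comma + 1;
    }
    return result;
}
```

The caller picks the overload purely through the template argument, for example getValue<int>("42") or getValue<const std::vector<int>>("45,323,55"), which is what lets the macro write getValue<decltype(VAL)>(...) for every field.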
CONCLUSION
Once again, all these ugly implementation details are written only once and live at a low level of the architecture. The upper business layer just uses this utility as a black box and does not need to be aware of its inner implementation.
Anyway, the main idea of this post was to show how compact the declaration of immutable data can be in C++11, and that you don't have to write more boilerplate code for each business class declaration.
Any comments are welcome.
UPDATE: I put some source code on GitHub as two gists:
My old JSON lib – https://gist.github.com/VictorLaskin/1fb078d7f4ac78857f48
Declaration – https://gist.github.com/VictorLaskin/48d1336e8b6eea16414b
Please treat this code as just an example.