Chapter 11. Serializing user defined types

Table of Contents

11.1. Serializing user defined types - intrusive non-split
11.2. Serializing user defined types - non-intrusive non-split
11.3. Serializing user defined types - non-intrusive split

To serialize a user-defined type (UDT), such as an instance of a class or struct, you serialize the members that hold the information that must be made persistent. Other members, such as calculated values or caches stored in maps, can be regenerated from the persistent data; they do not need to be serialized, and probably should not be unless there is a compelling reason to do so.

UDTs that are to be serialized are required to be default constructible. That is, for a UDT of type T, this must be allowed:

T  t;  // default construct an instance of T

This requirement exists because the library sometimes needs to construct a temporary instance. For example, when it loads the contents of an STL vector, it must construct each element, load that element's data, and then add the element to the vector. Also, most of the time, your loader code will look something like this:

void load_T(ccs::serialization::storage<>& s) {
  T  t;
  s >> t;
}

and clearly, 't' must be constructible even though it won't hold useful data until after the "s >> t;" statement executes.
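
As a rough illustration of why default construction is needed, the following sketch loads a vector one element at a time. It does not use the library: a std::istream stands in for the storage object, and 'point', its operator>>, 'load_vector' and 'load_vector_demo' are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical element type; any default-constructible type would do.
struct point {
  int x = 0;
  int y = 0;
};

// Stand-in for "s >> t": reads one point from a text stream.
std::istream& operator>>(std::istream& in, point& p) {
  return in >> p.x >> p.y;
}

// Reads an element count, then that many elements.
template <typename T>
std::vector<T> load_vector(std::istream& in) {
  std::size_t count = 0;
  in >> count;
  std::vector<T> result;
  result.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    T element;                  // requires T to be default constructible
    in >> element;              // only now does 'element' hold useful data
    result.push_back(element);  // add the loaded element to the vector
  }
  return result;
}

// Small usage check: two points encoded as "count x y x y".
void load_vector_demo() {
  std::istringstream in("2 1 2 3 4");
  std::vector<point> points = load_vector<point>(in);
  assert(points.size() == 2);
  assert(points[1].x == 3 && points[1].y == 4);
}
```

If 'point' had no default constructor, the line "T element;" would not compile, which is the situation the requirement guards against.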

Several techniques are available for serializing UDTs. They are classified according to whether they require changes to the UDT. Techniques that do are termed "intrusive" because they intrude upon the UDT's definition. If the UDT's data can be serialized without changing the UDT's definition, the technique is said to be "non-intrusive". The non-intrusive techniques are further divided into two kinds: (1) those that need no special processing for loading or saving, so a single function performs both loading and saving, and (2) those that do need special processing for loading or saving, so two functions must be written: one to perform loading, the other to perform saving. Techniques of type (1) are termed "non-intrusive non-split" serializers, and those of type (2) are termed "non-intrusive split" serializers.

For UDTs that hold plain data and have no unserialized members that depend on serialized ones, the simplest technique is appropriate: intrusive non-split serialization. This technique requires you to modify the class/struct by adding a member function. An example of a UDT suited to this technique is a bank account class: because its members are private (to prevent other classes from stealing money), only a member function can read and write the member data, so the intrusive non-split serialization technique is required.

If the class needs special processing for loading or saving, it must use the non-intrusive split technique, and the UDT may also need changes so that it can be serialized correctly. For example, the interest rate for a bank account may depend on its balance, so after loading the balance, the loader code must then set the interest rate member. No such processing applies when saving the data, so a split serialization technique is needed. Intrusive split serialization is not supported, so the only way to serialize the UDT is by non-intrusive split serialization.
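
The asymmetry between saving and loading can be sketched without the library. In the example below, standard streams stand in for the storage object, and 'account', 'save_account', 'load_account' and the interest tiers are all hypothetical; the library's own split interface is not shown here. Saving writes only the balance, while loading must also re-derive the interest rate.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical account whose interest rate is derived from its balance.
class account {
public:
  account() = default;
  explicit account(float balance) { set_balance(balance); }

  float balance() const { return balance_; }
  float interest_rate() const { return interest_rate_; }

  // Public setter added so that a non-member loader can restore the state.
  void set_balance(float balance) {
    balance_ = balance;
    // Illustrative tiers only: larger balances earn a higher rate.
    interest_rate_ = balance_ >= 1000.0f ? 0.02f : 0.01f;
  }

private:
  float balance_ = 0.0f;
  float interest_rate_ = 0.01f;  // derived data, never written out
};

// Saving needs no special processing: only the persistent member is written.
void save_account(std::ostream& out, const account& a) {
  out << a.balance();
}

// Loading needs an extra step: the derived member is rebuilt from the balance.
void load_account(std::istream& in, account& a) {
  float balance = 0.0f;
  in >> balance;
  a.set_balance(balance);
}

// Small usage check: the rate is recomputed on load, not read from the stream.
void account_round_trip_demo() {
  std::stringstream buffer;
  save_account(buffer, account(2500.0f));
  account restored;
  load_account(buffer, restored);
  assert(restored.balance() == 2500.0f);
  assert(restored.interest_rate() == 0.02f);
}
```

The public accessor and setter are the kind of change to the UDT that may be needed before a non-member loader can do its work.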

If you don't have source access, or you don't need or want to modify the UDT's definition, and the UDT's members are publicly accessible (for example, a C struct or a class with members marked "public:"), then non-intrusive serialization is the way to go. If saving and loading are simple and just involve serializing individual members, then non-intrusive non-split serialization is suitable; otherwise, non-intrusive split serialization should be used. An example of a UDT that is suitable for the non-intrusive non-split technique is std::pair<U, V>. For non-intrusive split serialization, an example is a struct with const members: because the members can only be initialized during construction, a special loading function is needed to read in the individual member values and then construct the UDT instance, which must then be copied to the correct object.
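
The const-member case can be sketched as follows. This is only an illustration under stated assumptions: standard streams stand in for the storage object, and 'range', 'save_range' and 'load_range' are hypothetical names, not part of the library. Because const members cannot be assigned after construction, the loader reads the values into temporaries, constructs a new instance from them, and hands it back by copy.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical struct with const members. The defaulted arguments keep it
// default constructible, as the chapter requires of serialized UDTs.
struct range {
  range(int low_value = 0, int high_value = 0)
      : low(low_value), high(high_value) {}
  const int low;
  const int high;
};

// Saving is straightforward: the public members are simply written out.
void save_range(std::ostream& out, const range& r) {
  out << r.low << ' ' << r.high;
}

// Loading cannot write into an existing range, so it reads into temporaries
// and constructs a fresh instance from them.
range load_range(std::istream& in) {
  int low = 0;
  int high = 0;
  in >> low >> high;
  return range(low, high);
}

// Small usage check: the loaded copy matches what was saved.
void range_round_trip_demo() {
  std::stringstream buffer;
  save_range(buffer, range(3, 9));
  const range restored = load_range(buffer);  // copy-constructed from the result
  assert(restored.low == 3 && restored.high == 9);
}
```

Note that the sketch copy-constructs the result rather than assigning it, since a struct with const members has no usable copy assignment operator.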

11.1. Serializing user defined types - intrusive non-split

The general technique is to define a member function named "serialize" that takes a "serializer object", which can be of any type. The function then passes the members that must be serialized to the serializer object. For example, a banking program manages instances of this bank_account class:

class bank_account {
public:
  bank_account(int id, float balance) : id_(id), balance_(balance) {}

private:
  int id_;
  float balance_;
};

If instances of the bank_account class are to be serialized, the following changes must be made:

class bank_account {
public:
  // 1 - made this default constructible so that loading can be done:
  bank_account(int id = 0, float balance = 0) : id_(id), balance_(balance) {}

private:
  // 2 - the 'access' class must be able to call serialize() so give it access:
  friend class ccs::serialization::access;

  // 3 - this saves or loads the data according to whether the library is called
  //     to load or save:
  template <typename S>
  void serialize(S& s) {
    // 4 - tell the library what data to load/save. If a field/member does not
    //     need to be serialized, then don't give it to the library.
    s & id_ & balance_;
  }

  int id_;
  float balance_;
};
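
To see how a single serialize() function can serve both directions, here is a self-contained sketch that replaces the library with toy stand-ins. Everything named 'toy_saver', 'toy_loader' and 'toy_access' is hypothetical and only mimics the roles of the serializer object and of ccs::serialization::access; the real classes may work differently. The id() and balance() accessors are also additions, included only so the result can be checked.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Toy serializer object for saving: operator& writes each value as text.
class toy_saver {
public:
  explicit toy_saver(std::ostream& out) : out_(out) {}
  template <typename T>
  toy_saver& operator&(const T& value) {
    out_ << value << ' ';
    return *this;
  }
private:
  std::ostream& out_;
};

// Toy serializer object for loading: operator& reads each value back.
class toy_loader {
public:
  explicit toy_loader(std::istream& in) : in_(in) {}
  template <typename T>
  toy_loader& operator&(T& value) {
    in_ >> value;
    return *this;
  }
private:
  std::istream& in_;
};

// Toy stand-in for the 'access' class: forwards to the private serialize().
class toy_access {
public:
  template <typename S, typename T>
  static void serialize(S& s, T& t) { t.serialize(s); }
};

class bank_account {
public:
  bank_account(int id = 0, float balance = 0) : id_(id), balance_(balance) {}
  int id() const { return id_; }            // accessor added for the demo
  float balance() const { return balance_; }  // accessor added for the demo

private:
  friend class toy_access;

  // One function: what "&" does depends on the serializer type S.
  template <typename S>
  void serialize(S& s) { s & id_ & balance_; }

  int id_;
  float balance_;
};

// Small usage check: save one account, then load it into a default-constructed one.
void bank_account_round_trip_demo() {
  std::stringstream buffer;
  bank_account original(42, 150.5f);
  toy_saver saver(buffer);
  toy_access::serialize(saver, original);

  bank_account restored;  // default constructed, filled in by loading
  toy_loader loader(buffer);
  toy_access::serialize(loader, restored);
  assert(restored.id() == 42);
  assert(restored.balance() == 150.5f);
}
```

The member order in the "s & id_ & balance_;" statement fixes the order of the stored data, so the same statement keeps saving and loading in step.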