Tuesday, 4 March 2008

Serializing Boost.Tuple using Boost.Serialization

Note: there is an update here. This post is entertainment only!

If you are using Boost Serialization, at some point, you will have to serialize instances of a class that you cannot alter. Perhaps it is from a third-party library like Boost. One example of a class you might want to serialize is Boost Tuple.

The way one normally serializes instances of a class using Boost Serialization is by inserting a template member function:

#include <boost/archive/text_oarchive.hpp>
#include <iostream>

struct foo
{
    foo(int bar) : bar(bar) {}

    // Intrusive serialization: the class itself knows how to archive its members.
    template <typename Archive>
    void serialize(Archive & ar, const unsigned int)
    {
        ar & bar;
    }

private:
    int bar;
};

int main()
{
    foo f(5);

    boost::archive::text_oarchive ar(std::cout);
    ar & f;
}



But if you want to serialize something like Boost Tuple, you can't really go and modify the code to add the serialize member. Technically you could, but it would be a maintenance headache. Instead, you would use non-intrusive serialization, which looks something like this:
#include <boost/tuple/tuple.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

#include <sstream>
#include <iostream>
#include <boost/tuple/tuple_comparison.hpp>
#include <boost/tuple/tuple_io.hpp>

namespace boost { namespace serialization {

// Non-intrusive serialization, for one-element tuples only.
template <typename Archive, typename T1>
void serialize(Archive & ar, boost::tuple<T1> & t, const unsigned int)
{
    // t has a dependent type, so "template" is needed to name the member template.
    ar & t.template get<0>();
}

}}

// Round-trips a tuple through a text archive and compares the result.
template <typename TupleType>
void test(TupleType t)
{
    std::ostringstream os;
    {
        boost::archive::text_oarchive ar(os);
        ar & boost::serialization::make_nvp("tuple", t);
    }

    TupleType t2;
    {
        std::istringstream is(os.str());
        boost::archive::text_iarchive ar(is);
        ar & boost::serialization::make_nvp("tuple", t2);
    }

    std::cout << t << " == " << t2 << " ? " << std::boolalpha << (t == t2) << std::endl;
}

int main()
{
    test(boost::make_tuple(1.0));
}


The code above works for a one-element tuple but fails to compile for any tuple with more than one element, because the only serialize overload it provides takes boost::tuple<T1>.

Under the hood, boost::tuple is a class template with a fixed number of template parameters, all defaulting to boost::tuples::null_type. So one way to serialize all tuples with at most five elements is to write the above free serialize function five times over. That is really as much fun as it sounds. So let's make it a bit more fun!

In a language like Lisp, you would generate the code by writing a macro. However, C++ macros are nowhere near as easy to use as Lisp macros. But the good/insane guys at Boost have pushed the limits and have developed a set of preprocessor macros that add the control flow constructs you have come to love, like loops and if statements. That's right, looping!

So to make it more fun, we must abuse the C++ preprocessor and have it generate the code we need.

There are multiple ways to iterate using Boost.Preprocessor. The one I will use (for no particular reason) is BOOST_PP_REPEAT_FROM_TO(first,last,macro,data). It expands macro once per iteration, last-first times in total.

The expected signature for this macro is macro(z,iteration,data). z is simply the next available repetition dimension and data is any other stuff you want to pass in. For the vast majority of stuff, I think you ignore these two parameters. iteration will take on the values first, first+1,...,last-1.

In our case, we would like to call BOOST_PP_REPEAT_FROM_TO(1,6,GENERATE_TUPLE_SERIALIZE,~), which covers tuples of one to five elements. (Starting at 0 would generate an empty, and therefore invalid, template parameter list after Archive.) Even though we do not make any use of the last parameter, we must still provide a value. Now the rest is fairly mechanical. We need to write GENERATE_TUPLE_SERIALIZE. From the above, we can see the signature should look something like #define GENERATE_TUPLE_SERIALIZE(z,n,unused). So let's take a stab at it:
#define GENERATE_TUPLE_SERIALIZE(z,nargs,unused) \
    template< typename Archive, BOOST_PP_ENUM_PARAMS(nargs,typename T) > \
    void serialize(Archive & ar, \
                   boost::tuple< BOOST_PP_ENUM_PARAMS(nargs,T) > & t, \
                   const unsigned int version) \
    { \
        ar & boost::serialization::make_nvp("head",t.head); \
        ar & boost::serialization::make_nvp("tail",t.tail); \
    }


Too easy :-)

Peeking under the hood (but still public API) for Boost Tuple, we see that each tuple is really a cons, albeit a compile-time one. This is done through inheritance. So for example, boost::tuple<int> is really boost::tuples::cons<int,boost::tuples::null_type>.

When we access t.tail in the above definition of serialize, we are accessing one of these cons structures. So if we compiled the code as is, that is:
#include <boost/tuple/tuple.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

#include <boost/preprocessor/repetition.hpp>

#include <sstream>
#include <iostream>
#include <boost/tuple/tuple_comparison.hpp>
#include <boost/tuple/tuple_io.hpp>


namespace boost { namespace serialization {

#define GENERATE_TUPLE_SERIALIZE(z,nargs,unused) \
    template< typename Archive, BOOST_PP_ENUM_PARAMS(nargs,typename T) > \
    void serialize(Archive & ar, \
                   boost::tuple< BOOST_PP_ENUM_PARAMS(nargs,T) > & t, \
                   const unsigned int version) \
    { \
        ar & boost::serialization::make_nvp("head",t.head); \
        ar & boost::serialization::make_nvp("tail",t.tail); \
    }

BOOST_PP_REPEAT_FROM_TO(1,6,GENERATE_TUPLE_SERIALIZE,~);

}}

template <typename TupleType>
void test(TupleType t)
{
    std::ostringstream os;
    {
        boost::archive::text_oarchive ar(os);
        ar & boost::serialization::make_nvp("tuple",t);
    }

    TupleType t2;
    {
        std::istringstream is(os.str());
        boost::archive::text_iarchive ar(is);
        ar & boost::serialization::make_nvp("tuple",t2);
    }

    std::cout << t << " == " << t2 << " ? " << std::boolalpha << (t==t2) << std::endl;
}

int main()
{
    test(boost::make_tuple(1.0));
}



We would get a compile error:

serialize3.cpp:23: error: ‘class boost::tuples::tuple<double, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>’ has no member named ‘tail’

The reason for this is that a one-element tuple has no tail. There are multiple ways to solve this problem, but I think the neatest is to treat 1-element tuples as a special case (ignoring 0-element tuples) and treat the rest generically. The new serialization code looks like:

namespace boost { namespace serialization {

template <typename Archive, typename T1>
void serialize(Archive & ar,
               boost::tuple<T1> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
}

#define GENERATE_TUPLE_SERIALIZE(z,nargs,unused) \
    template< typename Archive, BOOST_PP_ENUM_PARAMS(nargs,typename T) > \
    void serialize(Archive & ar, \
                   boost::tuple< BOOST_PP_ENUM_PARAMS(nargs,T) > & t, \
                   const unsigned int version) \
    { \
        ar & boost::serialization::make_nvp("head",t.head); \
        ar & boost::serialization::make_nvp("tail",t.tail); \
    }

BOOST_PP_REPEAT_FROM_TO(2,6,GENERATE_TUPLE_SERIALIZE,~);

}}


If we compile this, we get yet another error:

/usr/include/boost/serialization/access.hpp:109: error: ‘struct boost::tuples::cons<int, boost::tuples::cons<char, boost::tuples::cons<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::tuples::null_type> > >’ has no member named ‘serialize’

Almost home free! The reason we get this error is that when we reference the tail member of the tuple, that is a cons object, not a tuple. So we need two more functions:
template <typename Archive, typename T1>
void serialize(Archive & ar,
               boost::tuples::cons<T1,boost::tuples::null_type> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
}

template <typename Archive, typename T1, typename T2>
void serialize(Archive & ar,
               boost::tuples::cons<T1,T2> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
    ar & boost::serialization::make_nvp("tail",t.tail);
}
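The head/tail recursion these two overloads set up can be seen in isolation, without Boost. The following is only a sketch under simplifying assumptions: null_type and cons here are cut-down stand-ins for the Boost.Tuple types, and write and to_text are hypothetical helpers that print to a stream instead of serializing to an archive.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-ins for boost::tuples::null_type and boost::tuples::cons.
struct null_type {};

template <typename Head, typename Tail>
struct cons
{
    Head head;
    Tail tail;
};

// The last cell of the list: it has a head but no tail member.
template <typename Head>
struct cons<Head, null_type>
{
    Head head;
};

// Base case: one remaining element, so only the head is written.
template <typename Head>
void write(std::ostream & os, const cons<Head, null_type> & c)
{
    os << c.head;
}

// Recursive case: write the head, then recurse into the tail.
template <typename Head, typename Tail>
void write(std::ostream & os, const cons<Head, Tail> & c)
{
    os << c.head << ' ';
    write(os, c.tail);
}

// Hypothetical helper that collects the output in a string.
template <typename List>
std::string to_text(const List & list)
{
    std::ostringstream os;
    write(os, list);
    return os.str();
}
```

For a three-cell list holding 1, 'c' and "FOO!", to_text walks head, tail, head, tail, head and stops at the cell whose tail type is null_type. Without the base-case overload, the recursive one would try to touch a tail member that does not exist, which is the same failure the compiler reported for the one-element tuple.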



With that, the final code looks like:
#include <boost/tuple/tuple.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

#include <boost/preprocessor/repetition.hpp>

#include <sstream>
#include <iostream>
#include <string>
#include <boost/tuple/tuple_comparison.hpp>
#include <boost/tuple/tuple_io.hpp>


namespace boost { namespace serialization {

// Last cons cell: only a head.
template <typename Archive, typename T1>
void serialize(Archive & ar,
               boost::tuples::cons<T1,boost::tuples::null_type> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
}

// Any other cons cell: a head and a tail.
template <typename Archive, typename T1, typename T2>
void serialize(Archive & ar,
               boost::tuples::cons<T1,T2> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
    ar & boost::serialization::make_nvp("tail",t.tail);
}

// One-element tuple: the special case with no tail.
template <typename Archive, typename T1>
void serialize(Archive & ar,
               boost::tuple<T1> & t,
               const unsigned int)
{
    ar & boost::serialization::make_nvp("head",t.head);
}

// Tuples with two to five elements.
#define GENERATE_TUPLE_SERIALIZE(z,nargs,unused) \
    template< typename Archive, BOOST_PP_ENUM_PARAMS(nargs,typename T) > \
    void serialize(Archive & ar, \
                   boost::tuple< BOOST_PP_ENUM_PARAMS(nargs,T) > & t, \
                   const unsigned int version) \
    { \
        ar & boost::serialization::make_nvp("head",t.head); \
        ar & boost::serialization::make_nvp("tail",t.tail); \
    }

BOOST_PP_REPEAT_FROM_TO(2,6,GENERATE_TUPLE_SERIALIZE,~);

}}

template <typename TupleType>
void test(TupleType t)
{
    std::ostringstream os;
    {
        boost::archive::text_oarchive ar(os);
        ar & boost::serialization::make_nvp("tuple",t);
    }

    TupleType t2;
    {
        std::istringstream is(os.str());
        boost::archive::text_iarchive ar(is);
        ar & boost::serialization::make_nvp("tuple",t2);
    }

    std::cout << t << " == " << t2 << " ? " << std::boolalpha << (t==t2) << std::endl;
}

int main()
{
    test(boost::make_tuple(1.0));
    test(boost::make_tuple(1.0,1));
    test(boost::make_tuple(1.0,1,'c'));
    test(boost::make_tuple(1.0,1,'c',std::string("FOO!")));
}



Compiling and running the code, we get the expected output:

/tmp$ g++ serialize6.cpp -lboost_serialization && ./a.out
(1) == (1) ? true
(1 1) == (1 1) ? true
(1 1 c) == (1 1 c) ? true
(1 1 c FOO!) == (1 1 c FOO!) ? true


Why did I do this? Someone asked and I was waiting for a process to finish running. That and I tend to do bursts of random C++ code when I am sick (which I am today!)

1 comment:

Michael said...

I love it! I'm googling for how I might serialize a boost::any and I find a blog by Sohail!

Perhaps you can work on the serialization of boost::any during your next long build.