Object Lifetime and Persistence



class ID
{
public:
    string Name;
    int Score;
};


Dbstl is an interface to Berkeley DB, so it is used to store data persistently, which is a different purpose from that of the regular C++ STL. This difference in aims has an implication for object lifetime. In the standard STL, when you store an object A of type ID into a C++ STL vector V via V.push_back(A), and A's class provides a proper copy constructor, a copy of A (call it B) is stored in V together with everything B owns, such as another object C pointed to by B's data member B.c_ptr. B and what it owns live as long as B remains in V and V is alive; B is destroyed when V is destroyed or when B is erased from V.
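The standard STL behavior described above can be illustrated with a small sketch. The class Owner and the function value_kept_by_vector are hypothetical names introduced only for this illustration; Owner stands in for a type with a pointer member c_ptr and a deep-copying copy constructor.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type for illustration: owns a heap-allocated int via c_ptr.
class Owner
{
public:
    explicit Owner(int v) : c_ptr(new int(v)) {}
    // "Proper" copy constructor: the copy gets its own pointee.
    Owner(const Owner &other) : c_ptr(new int(*other.c_ptr)) {}
    Owner &operator=(const Owner &other)
    {
        if (this != &other)
            *c_ptr = *other.c_ptr;
        return *this;
    }
    ~Owner() { delete c_ptr; }

    int *c_ptr;
};

// Returns the value seen through the vector's copy after the original
// object has been modified and destroyed.
int value_kept_by_vector()
{
    std::vector<Owner> v;
    {
        Owner a(42);
        v.push_back(a);   // v now holds B, a deep copy of a
        *a.c_ptr = 7;     // changing a's pointee does not affect B
    }                     // a is destroyed here; B and its pointee live on in v
    assert(v.size() == 1);
    return *v[0].c_ptr;   // still the value copied at push_back time
}
```

The copy stored in the vector keeps its own pointee, so it stays usable after the original goes out of scope, but only for as long as the vector itself exists.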

This is not true for dbstl, which copies A's data and stores it in the underlying database. By default the copy is a shallow copy, but users can register their own object marshalling and unmarshalling functions via the DbstlElemTraits class template. If A is passed to a db_vector container dv via dv.push_back(A), dbstl copies A's data using the registered functions and stores the resulting chunk of bytes in the underlying database. The stored data therefore remains valid even after the container is destroyed, because it lives in the database. If the copy is only a shallow copy and A is later destroyed, however, the pointer stored in the database becomes invalid, and the next use of the retrieved object dereferences an invalid pointer, which leads to errors. The way to avoid this is to store the referred-to object C rather than the pointer member A.c_ptr itself, by registering the right marshalling and unmarshalling functions with DbstlElemTraits.
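To make the shallow-copy hazard concrete, the following sketch copies an object's raw bytes into a buffer and back, without involving Berkeley DB at all. The struct Rec and the helpers shallow_bytes, from_bytes and shares_pointee are hypothetical names for this illustration, and the plain byte copy is only an assumed stand-in for a default shallow copy.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical record with a pointer member, standing in for A and A.c_ptr.
struct Rec
{
    int score;
    const char *c_ptr;
};

// Shallow "marshalling": copy the object's raw bytes, pointer value included.
std::vector<char> shallow_bytes(const Rec &r)
{
    std::vector<char> buf(sizeof(Rec));
    std::memcpy(buf.data(), &r, sizeof(Rec));
    return buf;
}

// Rebuild a Rec from bytes produced by shallow_bytes.
Rec from_bytes(const std::vector<char> &buf)
{
    assert(buf.size() == sizeof(Rec));
    Rec r;
    std::memcpy(&r, buf.data(), sizeof(Rec));
    return r;
}

// True if the restored record points at the very same memory as the
// original, i.e. only the address was stored, not the characters.
bool shares_pointee(const Rec &original)
{
    Rec restored = from_bytes(shallow_bytes(original));
    return restored.c_ptr == original.c_ptr;
}
```

Because the stored bytes hold only the address, the restored record is usable just as long as the original pointee is still alive; once that memory is released, the restored pointer dangles.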

In the above example, the class ID has a data member Name, which refers to the memory address of the actual characters of the string. If we simply shallow copy an object id of class ID to store it, the stored data idd becomes invalid when id is destroyed. This is because idd and id refer to a common memory address, the base address of the memory holding the string's characters, and that memory is released when id is destroyed. idd then refers to an invalid address, and the next time we retrieve idd and use it, memory corruption results.

The right way to store id is to write marshalling and unmarshalling functions, together with a size function, like this:


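// Marshal elem into dest: the Score bytes first, then the
// NUL-terminated characters of Name.
void copy_id(void *dest, const ID &elem)
{
	memcpy(dest, &elem.Score, sizeof(elem.Score));
	char *p = ((char *)dest) + sizeof(elem.Score);
	strcpy(p, elem.Name.c_str());
}

// Unmarshal srcdata, laid out as written by copy_id, into dest.
void restore_id(ID &dest, const void *srcdata)
{
	memcpy(&dest.Score, srcdata, sizeof(dest.Score));
	const char *p = ((const char *)srcdata) + sizeof(dest.Score);
	dest.Name = p;
}

// Number of bytes copy_id writes for elem.
size_t size_id(const ID &elem)
{
	// The extra byte stores the terminating '\0' char.
	return sizeof(elem.Score) + elem.Name.size() + 1;
}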

Then register the above functions before storing any instance of ID:


DbstlElemTraits<ID>::instance()->set_copy_function(copy_id);
DbstlElemTraits<ID>::instance()->set_size_function(size_id);
DbstlElemTraits<ID>::instance()->set_restore_function(restore_id);


This way, the actual data of ID instances is stored, so the data persists even if the container itself is destroyed.
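The marshalling functions can be checked on their own, without Berkeley DB, by round-tripping an ID through a plain byte buffer. In the sketch below, copy_id, restore_id and size_id follow the layout described above (Score bytes, then the NUL-terminated Name), while round_trip is a hypothetical helper added only for this check; it is not part of dbstl.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

class ID
{
public:
    std::string Name;
    int Score;
};

// Marshal: Score bytes first, then the NUL-terminated characters of Name.
void copy_id(void *dest, const ID &elem)
{
    std::memcpy(dest, &elem.Score, sizeof(elem.Score));
    char *p = static_cast<char *>(dest) + sizeof(elem.Score);
    std::strcpy(p, elem.Name.c_str());
}

// Unmarshal data laid out as written by copy_id.
void restore_id(ID &dest, const void *srcdata)
{
    std::memcpy(&dest.Score, srcdata, sizeof(dest.Score));
    const char *p = static_cast<const char *>(srcdata) + sizeof(dest.Score);
    dest.Name = p;
}

// Number of bytes copy_id writes, including the terminating '\0'.
std::size_t size_id(const ID &elem)
{
    return sizeof(elem.Score) + elem.Name.size() + 1;
}

// Hypothetical helper: marshal src into a buffer sized by size_id, then
// rebuild a fresh ID from that buffer alone.
ID round_trip(const ID &src)
{
    std::vector<char> buf(size_id(src));
    assert(!buf.empty());
    copy_id(buf.data(), src);

    ID out;
    restore_id(out, buf.data());
    return out;
}
```

The rebuilt object owns its own copy of the characters, so it does not depend on the lifetime of the source object. Note that this layout truncates a Name containing an embedded '\0', because the string is restored up to the first terminator.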