Data Persistence Pitfalls

Today I ran into a bit of a snag while working on a generic approach to data persistence. I wanted to be able to do the following:

this.Worlds = this.StorageSource.Load<IWorld>().ToList();

The call above would go to the data source, find all of the objects that implement IWorld, and restore them. The problem, however, is that there might be multiple implementations of IWorld. The engine ships with DefaultWorld, and someone might come along and write a SullyWorld; both can be used simultaneously in the game. That raises issues for data persistence: how do you persist these objects to the data store and maintain data integrity? The StorageSource must work out how to save an object to its data store, and it must know what that object is during restore. This would be easy if the game were going to be built by developers working from the source. You would just specify the type manually:

this.StorageSource.Save<DefaultWorld>(this.Worlds.First());

However, the goal is to have the entire game created within an editor. On top of that, I want to be able to download custom IWorld implementations from other users, plug them straight into the engine via the editor, and have the persistence code continue to work. This puts a bit of a damper on generics, because you can't do the following:

this.StorageSource.Save<typeof(this.Worlds.First())>(this.Worlds.First());

A generic type argument must be a type name known at compile time; it can't be an expression evaluated at runtime. This presents a tricky issue. When saving, the persistent store knows that the IWorld object it is saving is a DefaultWorld that implements IWorld. However, once the objects are saved and I ask the IPersistanceStorage object to re-fetch the IWorld objects from the data store, it doesn't know which objects belong to which Types. It only knows that they are all IWorld. How do we fix this? One way is to store some metadata that links everything together. I'm a bit apprehensive about that because, in my opinion, it makes the persistence store too brittle.
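To make the compile-time restriction concrete, here is a minimal sketch. It is not the approach settled on in this post; it only shows that the runtime type is available through GetType(), and that reflection can close a generic method over it. RecordingStore and SaveByRuntimeType are hypothetical names made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface IWorld { string Name { get; set; } }
public class DefaultWorld : IWorld { public string Name { get; set; } }

// Hypothetical store; it exists only so there is a generic Save<T> to call.
public class RecordingStore
{
    public List<string> SavedTypeNames { get; } = new List<string>();

    public void Save<T>(T item)
    {
        // typeof(T) is the compile-time type argument, not the runtime type of item.
        SavedTypeNames.Add(typeof(T).Name);
    }
}

public static class RuntimeGenericCall
{
    // Closes Save<T> over the object's runtime type and invokes it via reflection.
    public static void SaveByRuntimeType(RecordingStore store, object item)
    {
        MethodInfo open = typeof(RecordingStore).GetMethod("Save");
        MethodInfo closed = open.MakeGenericMethod(item.GetType());
        closed.Invoke(store, new[] { item });
    }
}
```

Calling store.Save(world) on a variable declared as IWorld infers T as IWorld, whereas SaveByRuntimeType records DefaultWorld. Reflection solves the save side only; the restore side still needs the concrete type name to be stored somewhere.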

Since anyone can create an IPersistedStorage-based object, it becomes even more difficult for the engine to track this. So I think the best choice is to have each individual implementation track it by itself. Forcing every IPersistedStorage implementation to handle type inference itself won't break the ability to hot-swap data stores at runtime. As long as the data is loaded in memory (which it will be), you can re-persist to any storage provider you want.
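As a rough sketch of what "each implementation tracks this by itself" could look like, the store below tags every record with the concrete type's name and uses that tag to rebuild the right class on load. The interface shape shown here (a non-generic Save plus a generic Load) is an assumption, not the engine's actual IPersistedStorage contract, and TypeTaggedStorage is a made-up in-memory stand-in for a real file or database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IWorld { string Name { get; set; } }
public class DefaultWorld : IWorld { public string Name { get; set; } }
public class SullyWorld : IWorld { public string Name { get; set; } }

// Assumed contract for illustration; the real interface may differ.
public interface IPersistedStorage
{
    void Save(object item);
    IEnumerable<T> Load<T>();
}

public class TypeTaggedStorage : IPersistedStorage
{
    // Key: concrete type name. Value: property name -> property value.
    private readonly List<KeyValuePair<string, Dictionary<string, object>>> records =
        new List<KeyValuePair<string, Dictionary<string, object>>>();

    public void Save(object item)
    {
        Type type = item.GetType(); // runtime type, e.g. DefaultWorld
        var values = type.GetProperties()
            .Where(p => p.CanRead && p.CanWrite)
            .ToDictionary(p => p.Name, p => p.GetValue(item));
        records.Add(new KeyValuePair<string, Dictionary<string, object>>(type.FullName, values));
    }

    public IEnumerable<T> Load<T>()
    {
        foreach (var record in records)
        {
            Type type = Resolve(record.Key);
            // Skip records whose type is unknown or does not implement T.
            if (type == null || !typeof(T).IsAssignableFrom(type)) continue;

            // Requires a public parameterless constructor on the concrete type.
            object instance = Activator.CreateInstance(type);
            foreach (var pair in record.Value)
                type.GetProperty(pair.Key).SetValue(instance, pair.Value);
            yield return (T)instance;
        }
    }

    // Looks the type up by full name in every loaded assembly, so plugged-in
    // implementations are found as long as their assembly has been loaded.
    private static Type Resolve(string fullName)
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetType(fullName))
            .FirstOrDefault(t => t != null);
    }
}
```

With this in place, Load<IWorld>() hands back a mix of DefaultWorld and SullyWorld instances without the caller ever naming the concrete types.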

By default, I want to ship with both XML serialization support and SQLite support. I can solve this problem for XML serialization because I can load the XML before deserializing, determine the Type specified in it, instantiate that Type, and then perform the deserialization. For the database, I will store all of the Types in a table that matches their interface implementation. Each table will need a column that specifies what the original Type actually was, so that I can re-instantiate it. The downside is that I will not be able to use any existing ORMs and will have to create my own wrapper to read in the data and instantiate objects based on the column data. Not optimal, so I will continue to explore it.
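The XML half of that plan might look something like the sketch below: the concrete type name is written into a wrapper element, read back before deserializing, and used to pick the right XmlSerializer. The "Persisted" element, its "type" attribute, and the XmlWorldStore class are assumptions made for this example, not an existing format.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

public interface IWorld { string Name { get; set; } }
public class DefaultWorld : IWorld { public string Name { get; set; } }

public static class XmlWorldStore
{
    public static string Save(object item)
    {
        Type type = item.GetType();

        // Serialize with the concrete type; XmlSerializer cannot serialize an interface.
        var payload = new XDocument();
        using (var writer = payload.CreateWriter())
        {
            new XmlSerializer(type).Serialize(writer, item);
        }

        // Hypothetical envelope that records which Type the payload came from.
        var envelope = new XElement("Persisted",
            new XAttribute("type", type.FullName),
            payload.Root);
        return envelope.ToString();
    }

    public static T Load<T>(string xml)
    {
        // Load the XML first and read the type name before deserializing.
        XElement envelope = XElement.Parse(xml);
        string typeName = (string)envelope.Attribute("type");

        Type type = AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetType(typeName))
            .FirstOrDefault(t => t != null);
        if (type == null || !typeof(T).IsAssignableFrom(type))
            throw new InvalidOperationException("Unknown or incompatible type: " + typeName);

        // Deserialize the inner element with a serializer for the concrete type.
        XElement payload = envelope.Elements().First();
        using (var reader = payload.CreateReader())
        {
            return (T)new XmlSerializer(type).Deserialize(reader);
        }
    }
}
```

The same idea carries over to the database side: the "type" attribute plays the role of the column that records the original Type.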