Let’s first briefly examine what makes Transacted Persistence such a complex topic. (For more detailed information, see the spec “NGWS Runtime Services: Object Persistence and Serialization”.)
Consider a very simple (and imaginary) Sales Order Processing System. A Salesman gathers orders at a Customer site. Later, these orders are fed into the company computer, which creates objects in memory representing the situation, as shown below:
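By way of illustration only, such a graph might be shaped along the following lines (sketched here in Java, purely for concreteness; all class and field names are invented for this example, and nothing in this spec prescribes them):

```java
// Hypothetical shape of the in-memory object graph. All names are
// illustrative only.
import java.util.ArrayList;
import java.util.List;

class Salesman {
    String name;
    double totalOrderValue;          // updated as orders are processed
}

class Order {
    String stockItem;
    int quantity;
    double price;
}

class Customer {
    String billingAddress;
    String deliveryAddress;
    double creditLimit;
    double discountLevel;
    Salesman salesman;               // link to the Salesman who took the orders
    List<Order> orders = new ArrayList<>();
}

class GraphDemo {
    // One Customer, linked to a Salesman and to three Orders.
    static Customer sampleGraph() {
        Customer cust = new Customer();
        cust.salesman = new Salesman();
        cust.orders.add(new Order());
        cust.orders.add(new Order());
        cust.orders.add(new Order());
        return cust;
    }
}
```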
This shows a graph of objects representing a Customer, linked to a Salesman and to three Orders. The objects have been created by processing the Salesman’s orders: connecting with company databases to check Customer details (billing address, delivery address, credit limit, discount levels, etc), and similarly checking stock details (number of items in stock, price, weights for transportation, etc). The result is to dispatch these orders to the Shipping department, send billing information to the Accounts department, etc.
All of this processing has been done under transactional control, with all that entails – locks held on database records to ensure isolation, two-phase commit protocol across multiple resource managers, write-ahead logging to ensure atomicity and recoverability, etc, etc.
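A minimal sketch of that transactional bracketing, assuming a JTA-style coordinator stands in for whatever the runtime actually provides (the locks, the two-phase commit, and the write-ahead log all live beneath these three calls, inside the participating resource managers):

```java
import javax.transaction.UserTransaction;

class OrderProcessing {
    // Run the order-processing work under transactional control. Resource
    // managers enlist with the transaction as they are touched; commit()
    // drives the two-phase commit protocol across all of them.
    static void processOrders(UserTransaction tx, Runnable work) throws Exception {
        tx.begin();
        try {
            work.run();          // read/update Customer, Salesman, Order records
            tx.commit();
        } catch (Exception e) {
            tx.rollback();       // write-ahead logging makes the undo atomic
            throw e;
        }
    }
}
```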
The fields of the objects have been populated with values held in various company databases. For example, all fields of the Salesman object may come from one record in the company’s Salesman database. Of course, as the orders are processed, the system may update a TotalOrderValue field of that object, to reflect the Salesman’s monthly commission payment – and this information must be saved when the transaction completes. On the other hand, the Customer object may derive from several database tables – one for Customers, plus another for Addresses. So, populating the Customer object may involve extracting various columns from a Customer row, plus various columns from an Address row, into a buffer, which is then deserialized into the object.
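Populating that Customer might look something like the sketch below, assuming a hypothetical schema (a Customers table joined to an Addresses table) and using the Customer class from the earlier sketch. The point is simply that one object’s fields arrive from more than one row:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class CustomerLoader {
    // Populate one Customer from two joined rows (hypothetical schema). In the
    // layered design, the fetched columns would form a buffer that is then
    // deserialized into the object, rather than being assigned directly.
    static Customer load(Connection con, int customerId) throws SQLException {
        String sql = "SELECT c.credit_limit, c.discount_level, " +
                     "       a.billing_address, a.delivery_address " +
                     "FROM Customers c JOIN Addresses a ON a.customer_id = c.id " +
                     "WHERE c.id = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setInt(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) throw new SQLException("No such customer: " + customerId);
                Customer cust = new Customer();
                cust.creditLimit     = rs.getDouble("credit_limit");
                cust.discountLevel   = rs.getDouble("discount_level");
                cust.billingAddress  = rs.getString("billing_address");
                cust.deliveryAddress = rs.getString("delivery_address");
                return cust;
            }
        }
    }
}
```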
Conversely, when the transaction completes, serialization of the object graph may be a complex process – selecting some fields of an object to update certain columns of a database table row, with other fields sent to a different destination. Some fields may even be copied twice, if the database schema has been denormalized for performance reasons. Moreover, the backing databases may not all be from the same vendor – for example, a mixture of SQL Server, Oracle, and legacy C-ISAM.
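The reverse direction might be sketched as follows (again with invented table names): fields of one object fan out to different tables, and with a denormalized schema the same value is written twice:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class OrderSaver {
    // At commit, an Order's fields update the Orders table, while the
    // Customer's delivery address is also copied onto the shipping row,
    // a denormalized duplicate kept so Shipping can avoid a join.
    static void save(Connection con, int orderId, Order order, Customer cust)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE Orders SET quantity = ?, price = ? WHERE id = ?")) {
            ps.setInt(1, order.quantity);
            ps.setDouble(2, order.price);
            ps.setInt(3, orderId);
            ps.executeUpdate();
        }
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE Shipping SET delivery_address = ? WHERE order_id = ?")) {
            ps.setString(1, cust.deliveryAddress);
            ps.setInt(2, orderId);
            ps.executeUpdate();
        }
    }
}
```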
The system must also be sure not to construct two instances of the same Customer in memory – that’s to say, two copies of the object representing that particular customer. Allowing this to happen blows apart the guarantees required for Enterprise transactions, opening the field to lost updates, database inconsistencies, etc, etc. Similarly, when a transaction commits, the object graph must be either destroyed, or at least marked ‘stale’ to safeguard against multiple conflicting versions of the truth.
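The usual safeguard here is an identity map: at most one live instance per key within a transaction, with the map invalidated at commit. A minimal sketch, building on the loader above:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

class CustomerIdentityMap {
    private final Map<Integer, Customer> live = new HashMap<>();

    // Hand out at most one in-memory instance per customer key.
    Customer get(Connection con, int customerId) throws SQLException {
        Customer cust = live.get(customerId);
        if (cust == null) {                      // first touch: load and remember
            cust = CustomerLoader.load(con, customerId);
            live.put(customerId, cust);
        }
        return cust;                             // never a second copy
    }

    // At commit, discard the graph so stale instances cannot be reused.
    void onCommit() {
        live.clear();
    }
}
```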
The situation becomes even more involved once we consider offline order processing, where disconnected object graphs are operated upon by a remote computer, and the changes are only eventually brought back into a transaction and reconciled.
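One common way to handle that reconciliation (not the only one, and not mandated here) is optimistic concurrency: record a version when the graph is detached, and check it on the way back in. A sketch, assuming a hypothetical version column on the Customers table:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class Reconciler {
    // Apply an offline change only if the row is unchanged since the graph
    // was detached; otherwise surface a conflict instead of losing an update.
    static void reconcileCreditLimit(Connection con, int customerId,
                                     double newCreditLimit, int versionSeen)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE Customers SET credit_limit = ?, version = version + 1 " +
                "WHERE id = ? AND version = ?")) {
            ps.setDouble(1, newCreditLimit);
            ps.setInt(2, customerId);
            ps.setInt(3, versionSeen);
            if (ps.executeUpdate() == 0) {
                throw new SQLException("Conflict reconciling customer " + customerId);
            }
        }
    }
}
```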
It should be clear from this much-simplified description of transacted persistence that the requirements it imposes upon the runtime’s simple Serialization service are profound. The solution proposed is to layer the two – so that transacted persistence achieves what it needs to do by providing its own Formatters and Serializers, whose structure and interfaces comply with the model followed by other, simpler, consumers of the Serialization service. Of course, transacted persistence may freely use some base services provided by the layer beneath (eg controlled and efficient access to the private fields of an object), but attempts to date suggest it is impractical to provide a general-purpose Serialization service that fully meets the requirements of all conceivable scenarios for customer transacted persistence.
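To make the proposed layering concrete, the skeleton below shows the intended relationship in miniature. The interface here is a placeholder (the real Formatter and Serializer interfaces are defined in the spec cited above); the point is only that a database-aware formatter plugs into the same model that simpler consumers of Serialization use:

```java
import java.io.InputStream;
import java.io.OutputStream;

// Placeholder for the common model that all consumers of the Serialization
// service follow; the real interfaces are defined elsewhere.
interface Formatter {
    void serialize(OutputStream out, Object graph);
    Object deserialize(InputStream in);
}

// Transacted persistence supplies its own implementation, doing database-
// aware work inside while presenting the same shape as any other formatter.
class TransactedFormatter implements Formatter {
    @Override
    public void serialize(OutputStream out, Object graph) {
        // Walk the graph (using the base layer's controlled access to private
        // fields) and route each field to its database destination(s),
        // rather than to a flat byte stream.
    }

    @Override
    public Object deserialize(InputStream in) {
        // Rebuild objects from buffered database rows, consulting the
        // identity map so each entity has at most one live instance.
        return null;    // skeleton only
    }
}
```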