Data Storages

Over the last several years I’ve seen a number of projects suffer from performance problems caused by one bad decision: an inadequate choice of storage.

As Greg Young noted, developers often stick to an RDBMS without even considering the alternatives. I’m not going to discuss SQL vs. NoSQL, but rather share some ideas on what might affect the decision.

Data life time

In one of the projects I’ve been involved in, there was an incoming stream of real-time geolocation data from hundreds of thousands of users. While the system as a whole generally worked well, the performance of the geolocation data storage was very poor.

As one might guess, the dev team had decided to store the data in an RDBMS; I believe the vendor doesn’t really matter.

There was no ORM layer on top, just a simple structure with several tables.

When load testing was performed, it revealed that the system couldn’t serve even a fraction of the load it was expected to handle.

The problem with real-time geolocation info is that there is no reason to store it durably, other than for historical purposes. If we are talking about the latest coordinates, those change often and the best storage for them is… an in-memory cache.

After some rework we moved the writes to a queue in the backend, which dumped the coordinates to the database, and the system was able to pass the load tests.
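To make the idea concrete, here is a minimal sketch of how such a split might look. It is not the code from that project: LocationStore, GeoPoint and DrainBatch are hypothetical names, and the sketch assumes a background worker that periodically drains the queue and writes the batch to the database.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical types for illustration; none of these names come from the original project.
public struct GeoPoint
{
    public double Latitude;
    public double Longitude;
    public DateTime RecordedAtUtc;
}

public class LocationStore
{
    // Latest position per user: overwritten on every update, served from memory.
    private readonly ConcurrentDictionary<string, GeoPoint> _latest =
        new ConcurrentDictionary<string, GeoPoint>();

    // Updates waiting to be persisted as history by a background writer.
    private readonly ConcurrentQueue<KeyValuePair<string, GeoPoint>> _pending =
        new ConcurrentQueue<KeyValuePair<string, GeoPoint>>();

    public void Update(string userId, GeoPoint point)
    {
        _latest[userId] = point;
        _pending.Enqueue(new KeyValuePair<string, GeoPoint>(userId, point));
    }

    public bool TryGetLatest(string userId, out GeoPoint point)
    {
        return _latest.TryGetValue(userId, out point);
    }

    // Called periodically by a background worker (assumed); returns up to maxBatch
    // queued updates so they can be written to the database in one batch.
    public List<KeyValuePair<string, GeoPoint>> DrainBatch(int maxBatch)
    {
        var batch = new List<KeyValuePair<string, GeoPoint>>();
        KeyValuePair<string, GeoPoint> item;
        while (batch.Count < maxBatch && _pending.TryDequeue(out item))
            batch.Add(item);
        return batch;
    }
}
```

The point of the design is that request threads only touch memory, while the database sees a few large writes instead of one write per coordinate.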

Model Responsibility

Putting too much responsibility on the model narrows the storage options. For years applications have employed the same model for both reporting and transaction processing.

In a recent talk on CQRS I gave, there was a slide with the following comparison table:

             Commands            Queries
Processing   Synch/Asynch        Synch
Data         Mostly Normalized   Highly De-normalized
Form         Objects             Data Sets/Tables
Freshness    Very High           Very Low-High
Target       Isolated object     Any subset
Security     Reject command      Pre-filter results
Logic        Highly involved     Little to no

As one can see, the ideal storage for the underlying data differs greatly between the query and the command point of view. But because commands contain most of the logic that developers have to implement, developers usually concentrate on the command side, and the query side becomes a second-class citizen developed on top of the command model.

And here is where an RDBMS really shines. Joins and aggregation are the magic wands that turn a normalized model into a de-normalized one. That’s what makes an RDBMS cool: you mostly concentrate on one side of the system and get the other one very cheaply in terms of development effort. After all, that’s what most systems serving several hundred users need.

When the load increases (as in the example above), the cost of joins, aggregation and full ACID cannot be ignored anymore. Even systems built on top of a monolithic database usually have a second storage for report generation (be it a copy of the OLTP database, a de-normalized store, OLAP or whatever).

When applying CQRS, you logically decouple the command and query models, opening a wide set of options for stores on both sides (RDBMS, key-value storage, in-memory cache, event storage, etc.).

Model Complexity

Model complexity may affect your storage options as well. Imagine the model behind a Twitter-like service, which is very simple to express in terms of objects. Most of the objects are very small and can be stored in isolation, and can therefore reside in a variety of stores.

If the model requires a wide set of transactional operations and the cost of failure is high, you will likely need a more sophisticated storage than Simple DB to guard yourself from nasty errors.

In Twitter the cost of an error is rather low; usually the outcome is an undelivered tweet. Imagine the same thing in medical billing or accounting software.

Individual Objects Value

Sometimes individual objects are useless when observed on their own. Imagine a paragraph of a text document stored in a separate table in a relational database. Sounds ridiculous, doesn’t it?

Yet the idea of a “document” is too often overlooked by developers who stick to ORM + SQL DB.

I once saw developers implement the whole thing relationally and then figure out that they needed to implement their own versioning and conversion to a portable format, both of which they could have had for free using XML.

The following signs can indicate a place for “Document” model:

  1. Individual objects are of low value (e.g. they would never be queried on their own outside of the “Document” scope)
  2. The user is usually given a single “Save” button for the whole “Document” (possibly with auto-save functionality implemented)
  3. The “Document”-like model will be transmitted for use elsewhere

Conclusion

Finally, you can get the best of both worlds by breaking the system into different services. The storage decision is then made per service, so you can evaluate storage requirements one service at a time.

Evgeny Shapiro | posted @ Sunday, March 28, 2010 2:04 PM | Feedback (0)

NServiceBus introductory web-cast

My introductory web-cast that briefly covers NServiceBus capabilities was recently published on TechDays.ru. The web-cast is in Russian and I hope it will be useful for the Russian NServiceBus community.

You can find it here. If you liked it, don’t hesitate to vote for it (requires Live ID)!

Evgeny Shapiro | posted @ Thursday, February 18, 2010 11:46 PM | Feedback (0)

Virtualized Bugs - Answer

Ok. So we found a problem in a VM that doesn’t do any network stuff: it failed on our side, but it certainly worked on the other side. How could that be?

The software we worked on contained a bunch of Windows services set to start automatically on Windows startup. Almost all of them initialized a heavy infrastructure that was absolutely critical for the whole application. When we started the client, it tried to connect to a web service hosted in one of the Windows services and failed to do so.

After some time spent debugging, it turned out that the Windows services depend on each other through RPC calls.

We looked through the whole list and found that one of the services hadn’t started at all. We checked its startup settings: it was set to Automatic, but it was stopped. We tried to start it manually and it failed to start.

When we looked into the code, we found that it did most of the heavy lifting right in the OnStart method, which Windows expects to complete quickly. The code inside OnStart used to run quickly… until it was executed on a VM server under the heavy load of several VMs.

The standard behavior for Windows is to kill the service process if the service doesn’t respond in a timely fashion. In other words, the difference in server load revealed the instability of code that used to work fine on clean hardware.
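One common way to avoid this class of failure is to return from the start callback immediately and do the heavy initialization on a background thread. The sketch below shows only the pattern and is not the code of the application described here: FastStartingService and WaitUntilReady are hypothetical names, and in a real Windows service the Start method would be the OnStart override of System.ServiceProcess.ServiceBase.

```csharp
using System;
using System.Threading;

// Sketch of the pattern only. In a real Windows service this class would derive from
// System.ServiceProcess.ServiceBase and Start would be the OnStart override.
public class FastStartingService
{
    private readonly Action _heavyInitialization;
    private readonly ManualResetEvent _initialized = new ManualResetEvent(false);
    private Thread _worker;

    public FastStartingService(Action heavyInitialization)
    {
        _heavyInitialization = heavyInitialization;
    }

    // Returns immediately: the heavy lifting runs on a separate thread,
    // so the start timeout no longer depends on server load.
    public void Start()
    {
        _worker = new Thread(() =>
        {
            // Production code would also catch and log exceptions here.
            _heavyInitialization();
            _initialized.Set();
        });
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Callers that depend on the infrastructure wait here instead of
    // assuming that a started service is a ready service.
    public bool WaitUntilReady(TimeSpan timeout)
    {
        return _initialized.WaitOne(timeout);
    }
}
```

The trade-off is that “started” no longer means “ready”, so dependent services need an explicit readiness check such as WaitUntilReady.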

I find this quite interesting: it seemed safe to hand over a preconfigured VM to show a preview, but it turned out that even if the VM works well on your side, it can be broken on the other side.

Evgeny Shapiro | posted @ Thursday, February 18, 2010 11:41 PM | Feedback (0)

Virtualized bugs?

How many times have you sent Virtual Machines with a preconfigured environment to ensure that everything is set up correctly and will work 99.99% of the time on the other side?

Well, I’ve done this numerous times. However, today I dealt with a 100% correctly configured environment on a VM which does nothing with the network, only a bunch of services doing inter-process communication; it failed to work correctly on my end, but worked correctly on the VM creator’s end.

It took me some time to get from the cryptic “Could not logon error (An exception was thrown by the target of invocation)” to the real issue.

Can anyone guess what the issue was?

Evgeny Shapiro | posted @ Monday, January 04, 2010 11:33 PM | Feedback (0)

Is it time for “send” in C#?

I often catch myself thinking that I miss something like the following in C#:

send new MyMessage();

The line should work just like “throw”, but instead of stopping the execution of the code and looking for a handling catch block up the stack, I would like a pluggable architecture for processing messages, so that you can mark a method as the handler of the message. “Send” could be used to pass messages to the upper levels, without events if those are unwanted.

It would also be very cool if there were a way to tell the framework how to call the handler: synchronously or asynchronously.

Well, yes, I know about events, but those are noisy for passing simple messages.

I remember Jeremy D. Miller asking whether it’s time to put DI into the BCL. So it seems there is some need for putting the “decoupled” way of doing things right into the platform.

It’s just a random thought that crossed my mind :). And no, I will not go so far as to say that I want to see macro support in C#.

As for now - the simplest thing that actually works:

DomainEvents.Raise(new MyMessage());
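For readers who haven’t seen this helper before, a class like that could look roughly as follows. This is a minimal single-threaded sketch under my own assumptions: the Register and ClearHandlers methods are illustrative, and a real DomainEvents helper may well differ (for example by resolving handlers from a container).

```csharp
using System;
using System.Collections.Generic;

// Minimal, single-threaded sketch; a real DomainEvents helper may differ.
public static class DomainEvents
{
    private static readonly List<Delegate> Handlers = new List<Delegate>();

    // Subscribes a callback for messages of type T.
    public static void Register<T>(Action<T> handler)
    {
        Handlers.Add(handler);
    }

    // Synchronously calls every registered callback that accepts T.
    public static void Raise<T>(T message)
    {
        foreach (var handler in Handlers)
        {
            var typed = handler as Action<T>;
            if (typed != null)
                typed(message);
        }
    }

    // Mostly useful in tests, to reset the static state.
    public static void ClearHandlers()
    {
        Handlers.Clear();
    }
}

public class MyMessage { }
```

The sender only knows the message type, which is about as close to “send” as the language gets without new syntax.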

Evgeny Shapiro | posted @ Wednesday, December 09, 2009 12:01 AM | Feedback (0)

Saga for DSL with NServiceBus

When dealing with processes that can potentially span hours, days and even months, going with full ACID transactions is impractical because of internal locks.

To model such interactions the saga pattern is used. A saga breaks the interaction with the system into a series of small transactions and manages consistency through timeouts and compensations. A saga relaxes the Atomicity and Isolation requirements to achieve greater scalability.

Service bus frameworks (such as NServiceBus) have saga support out of the box. Most of the frameworks (NServiceBus, MassTransit and Rhino Service Bus) infer which messages a saga orchestrates by static analysis of the saga types. For example, the following is a saga implementation in NServiceBus:

    public class OrderSaga : Saga<OrderSagaData>,
        IAmStartedByMessages<OrderMessage>,
        IHandleMessages<OrderAuthorizationResponseMessage>,
        IHandleMessages<CancelOrderMessage>
    {
        public void Handle(OrderMessage message){...}
        public void Handle(OrderAuthorizationResponseMessage message){...}
        public void Handle(CancelOrderMessage message){...}
    }

When the bus starts, the framework finds all saga classes and uses reflection to determine which messages they process.
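The reflection step can be illustrated with a few lines of code. This is a sketch of the general technique, not the NServiceBus implementation: IHandle<TMessage> is a stand-in for the framework’s own marker interface, and HandlerScanner, SampleSaga and the message classes are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in marker interface; NServiceBus ships its own IHandleMessages<T>.
public interface IHandle<TMessage>
{
    void Handle(TMessage message);
}

public static class HandlerScanner
{
    // Returns the message types a class handles, read from the
    // closed IHandle<T> interfaces it implements.
    public static List<Type> GetHandledMessageTypes(Type handlerType)
    {
        return handlerType.GetInterfaces()
            .Where(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IHandle<>))
            .Select(i => i.GetGenericArguments()[0])
            .ToList();
    }
}

// Hypothetical messages and saga used only to exercise the scanner.
public class OrderPlaced { }
public class OrderCancelled { }

public class SampleSaga : IHandle<OrderPlaced>, IHandle<OrderCancelled>
{
    public void Handle(OrderPlaced message) { }
    public void Handle(OrderCancelled message) { }
}
```

Because the set of message types is read from the class declaration, it is fixed at compile time, which is exactly the limitation discussed next.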

This greatly simplifies the whole story and generally works very well, but recently I encountered a need to implement a DSL like the following:

   1: Expect.MessageSequence("BatchProcessVerifier1")
   2:     .StartsWith<BatchProcessStart>()
   3:     .ForEach(x => x.Node)
   4:     .Then<BatchProcessEnd>()
   5:     .In(TimeSpan.FromSeconds(300));

This definitely requires a saga, but NServiceBus sagas aren’t designed for handling such cases. The main reason is that NSB decides which messages are orchestrated by a saga using reflection (it infers them from the static structure of the classes), while the DSL requires dynamic registration of the saga type and of the set of messages it orchestrates.

But it turned out that it’s not so hard to implement Dynamic Sagas with NServiceBus.

Dynamic Saga

First of all, let’s define the IDynamicSaga interface:

    public interface IDynamicSaga
    {
        void Handle(IMessage message);
        bool Completed { get; }
        IDynamicSagaEntity Data { get; set; }
        IDynamicSagaDescription SagaDescription { get; set; }
    }

    public interface IDynamicSaga<TSagaEntity> : IDynamicSaga
        where TSagaEntity : IDynamicSagaEntity
    {
        new TSagaEntity Data { get; set; }
    }

 

There is a mandatory Handle(IMessage message) method that will process all incoming messages regardless of their type.

IDynamicSagaEntity is the saga state bag that will be persisted while waiting for messages to arrive (this is similar to NServiceBus’ ISagaEntity).

Completed is just a way to ask the saga whether it has been completed.

There is also SagaDescription, which contains runtime information for the saga. Basically, all the data set up in the DSL above goes into SagaDescription.

Saga Finder

The purpose of Saga Finder is to find saga entities using the message being processed.

This part becomes a bit tricky because of the third line of the DSL:

.ForEach(x => x.Node)

The line states that there should be one saga per Node, and the finder should know about this as well. For this reason IDynamicSagaFinder is asked to find the saga not only by the message being processed, but also by the saga description:

    public interface IDynamicSagaFinder
    {
        IDynamicSagaEntity FindSagas(IMessage message, IDynamicSagaDescription sagaDescription);
    }

And the IDynamicSagaDescription interface looks like the following:

    public interface IDynamicSagaDescription
    {
        string GetCorrelationKey(IMessage message);
    }

For the saga backing the DSL above, GetCorrelationKey just concatenates the name of the message sequence and the values of the properties defined in the ForEach expression. As a result the correlation key looks something like “BatchProcessVerifier1 : Node1”. The only thing left for the finder to do is to look up the entity by its CorrelationKey.
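A finder along these lines could be as small as the sketch below. It is an in-memory illustration rather than the persistent finder used with NServiceBus: the interfaces are redeclared as stand-ins so the sample compiles on its own, the CorrelationKey property on the entity is an assumption, and InMemoryDynamicSagaFinder, NodeMessage, SampleSagaEntity and PerNodeDescription are hypothetical names.

```csharp
using System.Collections.Generic;

// Stand-ins so the sketch compiles on its own; the real IMessage comes from NServiceBus.
public interface IMessage { }

public interface IDynamicSagaDescription
{
    string GetCorrelationKey(IMessage message);
}

public interface IDynamicSagaEntity
{
    // Assumed property: the members of the entity are not shown in this post.
    string CorrelationKey { get; set; }
}

public interface IDynamicSagaFinder
{
    IDynamicSagaEntity FindSagas(IMessage message, IDynamicSagaDescription sagaDescription);
}

public class InMemoryDynamicSagaFinder : IDynamicSagaFinder
{
    private readonly Dictionary<string, IDynamicSagaEntity> _byKey =
        new Dictionary<string, IDynamicSagaEntity>();

    public void Save(IDynamicSagaEntity entity)
    {
        _byKey[entity.CorrelationKey] = entity;
    }

    // The description computes the key; the finder only performs the lookup.
    // Returns null when no saga exists for the key yet.
    public IDynamicSagaEntity FindSagas(IMessage message, IDynamicSagaDescription sagaDescription)
    {
        var key = sagaDescription.GetCorrelationKey(message);
        IDynamicSagaEntity entity;
        return _byKey.TryGetValue(key, out entity) ? entity : null;
    }
}

// Sample types used only to exercise the finder.
public class NodeMessage : IMessage
{
    public string Node { get; set; }
}

public class SampleSagaEntity : IDynamicSagaEntity
{
    public string CorrelationKey { get; set; }
}

public class PerNodeDescription : IDynamicSagaDescription
{
    public string GetCorrelationKey(IMessage message)
    {
        return "BatchProcessVerifier1 : " + ((NodeMessage)message).Node;
    }
}
```

A null result would tell the caller that no saga exists for this key yet, which is the moment to start a new one if the message is the first in the sequence.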

Saga Example

The following saga can be used for processing the DSL statement above:

    public class MessageSequenceDynamicSaga : DynamicSaga<MessageSequenceDynamicSagaEntity, MessageSequenceDescription>
    {
        #region Overrides of DynamicSaga<MessageSequenceDynamicSagaEntity>

        public override void Handle(IMessage message)
        {
            var messageType = message.GetType();
            var messageTypesSequence = SagaDescription.GetMessageTypesSequence().ToList();
            var messageTypeIndex = messageTypesSequence.FindIndex(x => x == messageType);
            var expectedIndex = Data.ExpectedMessageIndex;

            // Unknown message type, or a message from a step that has already passed: ignore it.
            if (messageTypeIndex == -1 || messageTypeIndex < expectedIndex)
                return;

            // The message belongs to a later step: retry it once the earlier steps have arrived.
            if (messageTypeIndex > expectedIndex)
            {
                Bus.HandleCurrentMessageLater();
                return;
            }

            Data.ExpectedMessageIndex++;

            if (Data.ExpectedMessageIndex == messageTypesSequence.Count)
                MarkAsCompleted();
            else
                RequestTimeout(SagaDescription.GetTimeoutForNextMessageType(messageType), Data.ExpectedMessageIndex);
        }

        #endregion

        public override void Timeout(NServiceBus.Saga.TimeoutMessage message)
        {
            // The timeout is only relevant if the saga is still waiting for the same step.
            var index = Convert.ToInt32(message.State);
            if (index == Data.ExpectedMessageIndex)
            {
                Bus.Publish(new ExpectationViolated
                            {
                                ExpectationName = SagaDescription.Name,
                                ViolationRegistered = DateTime.UtcNow,
                            });
                MarkAsCompleted();
            }
        }
    }

MarkAsCompleted and RequestTimeout are helper methods defined in the abstract DynamicSaga class to mimic the original NServiceBus saga experience.

And the description for it:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Linq.Expressions;

    public class MessageSequenceDescription : IDynamicSagaDescription
    {
        private readonly string _name;
        private readonly Type _firstMessageType;
        private readonly List<Expression> _groupByExpressions;
        private readonly List<MessageExpectation> _expectations;

        public MessageSequenceDescription(string name, Type firstMessageType, IEnumerable<Expression> groupByExpressions, params MessageExpectation[] expectations)
        {
            _name = name;
            _firstMessageType = firstMessageType;
            _groupByExpressions = new List<Expression>(groupByExpressions);
            _expectations = new List<MessageExpectation>(expectations);
        }

        public List<Expression> GroupByExpressions
        {
            get { return _groupByExpressions; }
        }

        public Type FirstMessageType
        {
            get { return _firstMessageType; }
        }

        public List<MessageExpectation> Expectations
        {
            get { return _expectations; }
        }

        public string Name
        {
            get { return _name; }
        }

        // The full sequence: the starting message followed by every expected message.
        public IEnumerable<Type> GetMessageTypesSequence()
        {
            yield return _firstMessageType;
            foreach (var expectation in Expectations)
                yield return expectation.MessageType;
        }

        // Timeout for the message that should follow messageType.
        // Must not be called for the last message type in the sequence.
        public TimeSpan GetTimeoutForNextMessageType(Type messageType)
        {
            if (messageType == _firstMessageType)
                return _expectations[0].RelativeTime;
            var index = _expectations.FindIndex(x => x.MessageType == messageType);
            return _expectations[index + 1].RelativeTime;
        }

        #region Implementation of IDynamicSagaDescription

        public string GetCorrelationKey(IMessage message)
        {
            var groupMembers = new List<KeyValuePair<string, object>>();
            foreach (var groupByExpression in GroupByExpressions)
            {
                var lambdaExpression = (LambdaExpression)groupByExpression;

                // Only simple property access (x => x.Node) is supported here.
                var memberExpression = (MemberExpression)lambdaExpression.Body;
                var member = memberExpression.Member;
                var memberName = member.Name;

                var property = message.GetType().GetProperty(memberName);
                var memberValue = property.GetValue(message, null);

                if (memberValue != null)
                {
                    var memberStringValue = memberValue.ToString();
                    groupMembers.Add(new KeyValuePair<string, object>(memberName, memberStringValue));
                }
            }

            // Sort by member name so the key does not depend on the order of the ForEach calls.
            return groupMembers.OrderBy(x => x.Key).Aggregate(_name, (c, e) => c + " : " + e.Value);
        }

        #endregion
    }

Sagas can be registered at runtime with the following static method of the DynamicSagas class:

void RegisterSaga(Type sagaType, IDescriptionProvider descriptionProvider, Type[] sagaMessageTypes, Type finder)

Conclusion

It was quite a bit of code which still needs some testing and polishing, and it heavily replicates NServiceBus’ saga facility, but it works.

It’s also worth mentioning that NServiceBus’ sagas could be built on top of the Dynamic Sagas engine, but not vice versa.

Evgeny Shapiro | posted @ Monday, November 16, 2009 8:38 PM | Feedback (0)

Don’t: open session in view

I’m starting a series of “Don’t” posts about patterns and design decisions I consider bad. Some of them were learned the hard way, through bad judgment and painful experience. While in certain circumstances some of the patterns described can be useful, they mostly don’t work well in the general case.

Open session in view

“Open session in view” (OSV) is a web-oriented (anti)pattern that opens an ORM session at the beginning of the request and closes it when view rendering is done (the session is maintained by a servlet filter in Java or an HTTP module in ASP.NET).

Why might one need a session open for the duration of the whole request?

The answer is really simple: this is the first thing that comes to mind when the rendering fails with LazyLoadingException, or LazyInitializationException, or whatever the ORM throws. This usually happens when the view is rendered directly from persistent entities with uninitialized lazy associations/collections.

Example

Consider the following menu in a classic e-shop scenario:

  • Category1
    • Product1
    • Product2
  • Category2
    • Product3
    • Product4

And the following code passes the entities to the view layer:

    using (var session = ORM.GetSession()){
        return session.GetAll<ProductCategory>();
    }

Assuming that lazy initialization is enabled for the Products collection of the Category entity, the view will fail to render the second level of the menu, because the session the entity was associated with is already closed.

To address such issues one may easily jump to OSV, but this decision has some strong drawbacks and generally hides a flaw in the design.

Drawbacks

The OSV (anti)pattern has several drawbacks that limit its use and heavily influence design and performance:

  1. It works only in a 2-tier environment, which is acceptable in some, but not all cases
  2. The entity model is influenced by views: some of the associations needed by views are meaningless from the entities’ point of view. A Category may have no Product collection at all; rather, a Product refers to a certain Category
  3. It can result in a large number of database queries, because of intensive use of lazy loading

Generally, ORM sessions aren’t designed to survive failures in applying changes, so exception handling can become quite tricky. To some extent this can be addressed by the Get after Post pattern.

Also, while OSV can solve the problem in the short term, it hides the real problem in the design: rendering views from entities.

What to use instead?

Well, there is really no need to render the view from the entity model. The entity model’s purpose is to encapsulate behavior; displaying state is a query which has no behavior, and therefore the entity model is of very limited value here.

A better choice is to access the database directly; after all, querying is the strong side of SQL.
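For the menu example this could mean running one flat join query and shaping the rows into simple DTOs for the view. The sketch below shows only the shaping step and assumes the rows have already been read from the database by a single query; MenuRow, MenuCategoryDto and MenuQuery are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// One row per product, as a single JOIN of categories and products would return it.
public class MenuRow
{
    public string CategoryName { get; set; }
    public string ProductName { get; set; }
}

// Plain data for the view: no session, no lazy loading, no behavior.
public class MenuCategoryDto
{
    public string Name { get; set; }
    public List<string> Products { get; set; }
}

public static class MenuQuery
{
    // Groups the flat rows into the two-level menu structure.
    public static List<MenuCategoryDto> BuildMenu(IEnumerable<MenuRow> rows)
    {
        return rows
            .GroupBy(r => r.CategoryName)
            .Select(g => new MenuCategoryDto
            {
                Name = g.Key,
                Products = g.Select(r => r.ProductName).ToList()
            })
            .ToList();
    }
}
```

The view receives fully populated objects, so it no longer matters whether any session is still open while it renders.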

It’s also worth looking at CQRS as a pattern that generalizes this separation.

Evgeny Shapiro | posted @ Sunday, November 15, 2009 3:20 AM | Feedback (0)

Going One-way Asynchronous

The alternative to the Request/Response communication style is one-way asynchronous messaging. While at first it seems much harder, the difficulty has much more to do with the way you think about processes than with technical issues.

Once you get used to modeling processes in an asynchronous way, you will discover that many of the processes happening in the enterprise or around you are naturally asynchronous.

Example

Let’s analyze something that is easy to relate to: the project development process itself (heavily simplified). Let’s model the process of releasing a build for testing in both ways (sync and async).

When the dev team has happily finished developing the features, they try to pass the build to the QA team (for now, assume no internal process automation is in place).

Synchronous
The synchronous way is for the dev team lead to find the QA team lead and tell him about the new build. If the QA lead is not at his desk, the dev lead has to wait for him, or… throw an exception (to a higher level, which is probably management :)). The QA team lead acknowledges that he is now aware of the build, and only then is the dev team lead able to proceed with his work.

Asynchronous
Send an email.

As you can see, the asynchronous way is easier to model and develop. Preserving this asynchronous nature in the application tends to greatly simplify the overall interaction. So Project Builds Tracking (PBT) doesn’t need to create a QA task when a new build is submitted. It just sends a message (BuildSubmitted; pay attention to the past tense, as PBT has already committed its transaction) to QA Task Tracking (QATT), so that QATT can make its own decision on how to handle it.

Pub/Sub

Well, at some point you may find that not only QATT needs the information about new builds, but the Bug Tracking system should also create the build and target versions. Going further, you will discover that adding new systems requires PBT to send more and more messages. At this point it’s useful to stop thinking of them as targeted messages and to generalize this interaction model instead.

One well-known pattern for this is Publish/Subscribe (or Pub/Sub). There are a number of systems that want to listen to events happening in PBT; all of them Subscribe to certain PBT events. When an event occurs, PBT Publishes it in the form of a message sent to all subscribers. Pub/Sub logically decouples the systems, as the publisher’s code doesn’t need to be modified to add new listeners.
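The mechanics can be shown with a tiny in-process sketch. It is deliberately simplified: a real service bus delivers messages asynchronously and durably between processes, while this InProcessBus (a hypothetical class, as is the Version property on BuildSubmitted) calls the subscribers synchronously in the same process.

```csharp
using System;
using System.Collections.Generic;

// The event PBT publishes; the Version property is an assumption for the example.
public class BuildSubmitted
{
    public string Version { get; set; }
}

// In-process, synchronous sketch of Pub/Sub; not a replacement for a real bus.
public class InProcessBus
{
    private readonly Dictionary<Type, List<Action<object>>> _subscribers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<TEvent>(Action<TEvent> handler)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(typeof(TEvent), out handlers))
        {
            handlers = new List<Action<object>>();
            _subscribers[typeof(TEvent)] = handlers;
        }
        handlers.Add(e => handler((TEvent)e));
    }

    // The publisher does not know who listens, or whether anyone does.
    public void Publish<TEvent>(TEvent message)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(typeof(TEvent), out handlers))
            return;
        foreach (var handler in handlers)
            handler(message);
    }
}
```

Adding the Bug Tracking system as a listener is then one more Subscribe call on its side; the publishing code in PBT stays untouched.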

Tools

Now, suppose you have shifted your mind toward the world of asynchronous processes. One thing you may find you are lacking is an appropriate tool that lets you concentrate on the processes, not the infrastructure.

There are plenty of tools that may serve your needs; I encourage you to look at which one fits you best:

You may also try to build your own framework on top of WCF with the MSMQ binding to understand the concept better.

I also encourage you to read the blog of Udi Dahan, the author of NServiceBus and an SOA guru.

Evgeny Shapiro | posted @ Tuesday, November 03, 2009 10:23 PM | Feedback (2)

Google Reader

I just noticed that Google Reader doesn't show my previous post. I don't know the cause of the issue, but other readers have no problem showing the post (checked with Bloglines).

Evgeny Shapiro | posted @ Wednesday, October 28, 2009 5:14 PM | Feedback (0)

Security of Q in CQS

One of the questions that frequently arises on the Domain Driven Design mailing list is how to build the query side of CQS. The most popular approaches are as follows:

  • Go the 2-tier way and query the store directly from the client
  • Implement some sort of thin DTO layer on top of the query store

If the Query side in question is not some sort of BI support store, I personally prefer the second approach because of security. Putting security checks on the client is something I can’t agree with, because the server side should never trust the clients (unless other requirements explicitly state so).

The question is how to enable security rules such as “Account manager can only see the payment history of his own customers”. In this post I will go through the steps of implementing the Query side with security support for such rules.

The implementation below uses LINQ to SQL and WCF, plus a way to inject security rules into the queries. While LINQ to SQL and WCF do quite a good job here, the same approach can be used with any ORM that supports query objects (say, NHibernate and its Criteria API).

Note: because the Command side and the Query side are completely separated, they can be implemented with different sets of technologies (say, Q on top of WCF + L2S, and C on top of NHibernate + NServiceBus).

Implementing the Query Side

The first step is to generate LINQ to SQL objects from the view-specific tables/views in the query store (it seems like the first time the designer does a good job). Note that because most of the queries return just flat data (like grids), the generated objects will have no associations between them.

The next part is implementing a WCF service which returns the results to be displayed by the application. The operations in the service return LINQ to SQL objects, which are used as DTOs here.

Injecting Security Rules

Protecting data from being viewed is mostly about filtering it, so security rules for the Q side are easily described as data filters. The following defines how security filters look with LINQ:

public interface ISecurityFilter<T>
{
    IQueryable<T> Apply(IQueryable<T> query);
}

The following class can express the rule above (“Account manager can only see the payment history of his own customers”) in code:

public class AccountManagerPaymentHistoryAccessFilter : ISecurityFilter<AccountManagerPaymentHistory>
{
    public IQueryable<AccountManagerPaymentHistory> Apply(IQueryable<AccountManagerPaymentHistory> query)
    {
        var userId = ...; //current user id
        return from p in query
               where p.AccountManagerId == userId
               select p;
    }
}

Rules are injected using the following wrapper over the DataContext object generated by LINQ to SQL:

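public class QueryContext
{
    private readonly QueryDataContext _context = new QueryDataContext();

    // GetTable<T> only accepts reference types, hence the constraint.
    public IQueryable<T> Get<T>() where T : class
    {
        var filters = IoCContainer.ResolveAll<ISecurityFilter<T>>();
        // Declared as IQueryable<T> (not Table<T>) so the filtered query can be reassigned.
        IQueryable<T> query = _context.GetTable<T>();
        foreach (var filter in filters)
            query = filter.Apply(query);
        return query;
    }
}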

The wrapper uses IoC container to get all security filters applicable and apply them one by one. As a result any code in the service layer that queries against QueryContext.Get<T>() operates on prefiltered data.
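To see the filtering in action without a database, the same idea can be run against an in-memory collection. This is a sketch under stated assumptions, not the service code: PaymentRecord, OwnCustomersOnlyFilter and FilteredQuery are hypothetical names, the filters are passed in explicitly instead of being resolved from an IoC container, and AsQueryable() stands in for the LINQ to SQL table.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat DTO standing in for a LINQ to SQL generated class.
public class PaymentRecord
{
    public int AccountManagerId { get; set; }
    public decimal Amount { get; set; }
}

public interface ISecurityFilter<T>
{
    IQueryable<T> Apply(IQueryable<T> query);
}

// "Account manager can only see the payment history of his own customers."
public class OwnCustomersOnlyFilter : ISecurityFilter<PaymentRecord>
{
    private readonly int _currentUserId;

    public OwnCustomersOnlyFilter(int currentUserId)
    {
        _currentUserId = currentUserId;
    }

    public IQueryable<PaymentRecord> Apply(IQueryable<PaymentRecord> query)
    {
        return query.Where(p => p.AccountManagerId == _currentUserId);
    }
}

public static class FilteredQuery
{
    // Applies every filter in turn; callers only ever see the prefiltered query.
    public static IQueryable<T> Get<T>(IEnumerable<T> source, IEnumerable<ISecurityFilter<T>> filters)
    {
        IQueryable<T> query = source.AsQueryable();
        foreach (var filter in filters)
            query = filter.Apply(query);
        return query;
    }
}
```

Because each filter only composes another Where onto the IQueryable, a real LINQ provider should be able to translate the combined query into a single SQL statement instead of filtering in memory.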

Versioning

When using DTOs with a strict schema, versioning can become a hassle. The rule of thumb here is to avoid schema-breaking changes, such as renaming or removing fields/properties, and rather go with a soft process of deprecating fields/properties in favor of new ones.

DataContractSerializer has built-in support for this through the IExtensibleDataObject interface. Fields/properties that the receiving side doesn’t know about are preserved in the ExtensionData property (of type ExtensionDataObject) that is part of WCF autogenerated classes.

Another possible approach here is to return some generic DTO, such as a DataSet, from the service, so that the view is completely driven by the service replies.

Conclusion

With the help of an IoC container and security filters we can easily implement security requirements on the query side. Also, because the Query side is free of heavy logic, the implementation can be done on top of any ORM technology that supports query objects.

P.S.: this is one of the examples where DnDDD (Drag’n’Drop Driven Development) and code gen do the job :)

Evgeny Shapiro | posted @ Monday, October 26, 2009 9:31 PM | Feedback (2)