Code Design
Courses and Workshops on Effective Code Design

Who Disposes Your Repository

Recently, I’ve went again through the discussion of how the Repository Pattern works with Dependency Injection (DI) in one of the projects I’m involved in. Even if these patterns are around for a while and there are many examples on how to use them together, discussing some particular implementation aspects is still interesting. So, I think it might help to go once again through it, maybe from a different angle.

The case I want to focus on, is when the repository implementation is abstracted through a generic interface, and its implementation is injected through DI to the classes that need to read or store data.

Using DI in an application it is often a good idea. It is an easy way to follow the Separate Construction and Configuration from Use principle (Uncle Bob explains it in his Clean Code book and also Martin Fowler does it here). Abstracting the repository through an interface is also good. It doesn’t only benefit you in writing isolated Unit Tests, but it brings a good separation of the data access concerns from the business logic. It prevents the data access specifics to leak into the upper layers, if you encapsulate them well into the implementation of the interface. Using them together is also good, but you need to consider the stateful nature of a repository and the deferred execution of an IQueryable.

The question I want to address is who disposes my repository implementation and when. The repository will use some connection to the storage, which should be released when is no longer needed. And that should be in a deterministic way, not when the garbage collector kicks in. If the implementation of my repository is through the Entity Framework (EF), it means that I should dispose the DbContext when I no longer need it. For the EF case this may not be too problematic since it does a clever connection management. Here is a response of the EF team on this. However, it may not be the same for other ORMs. Even more I think that IDisposable pattern should be correctly implemented and followed in all cases. I read IDisposable as a clear statement that says that the instances of its implementation must be cleaned in a deterministic way.  I don’t think it’s a good idea to just leave instances undisposed, because I may risk nasty leaks.

I prefer to have an open interface for my repository by taking the benefits the IQueryable brings. This means that the user of my repository can write or alter the query by itself, and I will not need to add a new methods to my repository each time a new use case needs a different kind of filtering or a different data projection. This is easily achieved with an interface like this:

interface IRepository  
{  
	Queryable<T> GetEntities<T>();    
	…  
}  

The clients of my repository may add additional filtering or data selection before the query is actually executed on the database server. Combining it with DI we may have class services like this using it:

class SalesOrdersService : ISalesOrdersService  
{  
 // The concern of calling repository.Dispose()  
 // is not in this service  
 private readonly IRepository repository;  
 …

public SalesOrdersService(IRepository repository, …)  
{  
   this.repository = repository;  
   …  
}

public decimal GetHighRiskOrdersAmount(int year)  
{  
    IQueryable<Order> orders = GetHighValueOrders()  
    		                    .Where(o => o.Year == year);
   
    decimal amount = 0;  
    foreach (var order in orders)  
    {  
         if (IsHighRisk(order))  
         amount += order.OrderLines.Sum(ol => ol.Amount);  
         // only read DB the order lines of high risk orders.  
         // important if expect that only 10% of orders are high risk  
    }

    return amount;  
 }

public int CountHighRiskOrders(int startWithYear, int endWithYear)  
{  
    IQueryable<Order> orders = GetHighValueOrders()  
 								.Where(o =>  o.Year >= startWithYear &&  
 											 o.Year <= endWithYear);

    // lets say I’m keen on performance and I only want to iterate  
    //    once through the resultset.  
    // If I would use the return order.ToArray().Count(IsHighRisk),  
    //    instead of the foreach, there is one iteration for ToArray 
	//    and one for the Count()  
    int count = 0;  
    foreach (var o in orders)  
    {  
        if (IsHighRisk(o))  
        count++;  
    }
   
    return count;  
 }

 public IEnumerable<Order> GetOrders(int startingWith, int endingWith)  
 {  
    return GetHighValueOrders()  
              .Where(order => order.Year >= startingWith &&  
              order.Year <= endingWith);

  // I have the flexibility to:  
  // -return IEnumerable, so the query executes when iterated first  
  // -return IQueryable, so the query can be enriched by the client before execution  
  // -return IEnumerable, but with an List underneath so the query executes here (call .ToList())  
 }

 // This functions makes reuseable the is HighValue evaluation.  
 // The evaluation of the condition can be translated through  
 // LINQ to SQL into a WHERE filtering which runs in the database  
 private IQueryable<Order> GetHighValueOrders()  
 {
       var orders = from o in repository.GetEntities<Order>()  
                          join ol in repository.GetEntities<OrderLine>() on o.Id equals ol.OrderId  
                    where ol.Amount > 100  
                    select o;

       return orders;  
  }  
 …  
}  

With this approach, I also get the benefit that for a new use case, I could maybe only wire-up in a different way some of my existent services, reusing them, and all will get through DI the same instance of the repository. Therefore, the underlying implementation has the opportunity to send all the queries from one session / request on the same storage connection. Even more, if my services return the IQueryable as IEnumerable, the query will be executed only when the client needs it.

Coming back to disposing the underling DbContext, one option is to make the repository implementation to also implement IDisposable, and to call DbContext.Dispose() when it is disposed:

class Repository : IRepository, IDisposable  
{  
   private MyDbContext context = new MyDbContext();

   public IQueryable<T> GetEntities<T>()  
   {  
      return context.Set<T>().AsNoTracking();  
   }

   public void Dispose()  
   {  
      context.Dispose();  
   }  
 …  
 }  

Now, I have all the benefits I described above, but who is going to call the Dispose() of my repository implementation. The services should not do it. They do not know that the implementation they got injected is an IDisposable and they shouldn’t know it. Making them also to implement IDisposable, and to check on Dispose() if some of their members are IDisposable, is not a solution either. My rule of thumb is that the code that creates an IDisposable instance, should also call dispose on it. It is also according with the Code Analyses Rules. In our case this is the DI Container. Our code does not create, nor explicitly asks for the instance, it is just injected into it. By using DI I am inverting the control of creating instances from my own code to the framework. The framework should also clean them up when no longer needed.

For this, we need to make sure that two things happen:

  1. the DIC will call Dispose() on all the disposable instances it created, and
  2. it will do it in a deterministic way (when no longer needed, meaning when the request  / session ends)

If you’re using Unity Container you have to take care of both. When an instance of the UnityContainer is disposed, it will call Dispose() on all the lifetime managers which are IDisposable. However, the built-in lifetime managers Unity provides for short lived instances, are not disposable, so you need to write your own. Here are some examples on how to do it. Other DIC, like MEF, have this built in.  The other thing to care of is: when the Dispose() call chain will be kicked off. For this you need to use Scoped Containers (aka Container Hierarchies). In short, this means that you will need to associate a request / session with a child container instance, and dispose that child container when the request ends. The dispose of the child container will trigger the dispose of all IDisposable instances it created. Here is a simple example on how to do this for an ASP.NET MVC application, where a child container is associated with each HTTP Request.

Even if this approach gives a lot of flexibility and advantages, is not easy to setup. It requires some non-trivial code to be written. In some applications the added complexity may not be justified. Let’s explore other ideas, too.

We could make the IRepository interface to be IDisposable and not use DI to get the repository implementation, but use Service Locator instead. The main difference from the above is that we are no longer inverting the control. Our code is now in charge of explicitly ask for a repository, so it should also take care of cleaning it up when it is no longer needed. Now, we don’t need to go through all the trouble of making the DI to call Dispose(), because our code will do it. The services now can use a using statement for each repository instance, like:

class SalesOrdersService : ISalesOrdersService  
{  
  private readonly IServiceLocator sl;

  public SalesOrdersService(IServiceLocator serviceLocator)  
  {  
      this.sl = serviceLocator;  
  }

  public decimal GetHighRiskOrdersAmount(int year)  
  {  
     using (IRepository repository = sl.GetInstance<IRepository>())  
     {  
        IQueryable<Order> orders = GetHighValueOrders(repository)  
                 .Where(o => o.Year == year);

        decimal amount = 0;  
        foreach (var order in orders)  
 	    {  
 			if (IsHighRisk(order))  
 				ammount += order.OrderLines.Sum(ol => ol.Ammount);  
 	    }

 	    return ammount;  
 	 }  
  }

  public IEnumerable<Order> GetOrders(int startingWith, int endingWith)  
  {  
      using (IRepository rep = sl.GetInstance<IRepository>())  
      {  
      		return GetHighValueOrders(rep)  
      					.Where( order => order.Year >= startingWith &  
      							order.Year <= endingWith);  
      }  
  // Is the above code correct? Have another look
 
  // the returned IEnumerable will be enumerated by the caller  
  // code after the rep.Dispose() –> error.
 
  // To avoid this more (uncontrollable) approaches may be taken  
  // –> error prone –> low maintainability
 
  // -receive the repository instance created by the caller code.  
  // The caller code creates it, so it needs to dispose it, not me
 
  // -call .ToList() on this, but still return it as IEnumerable  
  // What happens, when the caller iterates the  
  // order.OrderLines? –> error
 
  // -this function is refactored to only send a query,  
  // which should be wrapped and executed by the caller  
}  

We are losing some of the flexibility and advantages of the above. The main one is that now an IQueryable issued by the repository is scoped to the using statement, not to the request / session. If it is passed as return value to another class, and it gets executed after the using statement ends, an error will occur because the underlying connection is closed. Therefore, I need to be careful that the return values of the services that use a repository should be executed. This means that my clients cannot add to the query additional filters, nor they can decide when / if to execute it. This reduces the flexibility and reusability of my code. This may not be that bad in some applications. One of the biggest risks I see with this is that by trying to workaround these limitations, there may be code that passes around an undisposed repository to different classes or functions (like GetHighValuesOrders() function has). Without discipline, this may get out of hand.

Another different approach is not to make the repository implementation disposable, and to dispose the underlying connection (DbContext), immediately after it is no longer needed. This implies that the IQueryable (better said LINQ to SQL), will not leave the repository scope. This makes the repositories to be use case specific, with a close interface.

class OrdersRepository : IOrdersRepository  
{  
 	public IEnumerable<Order> GetOrders(int startingWith,  
 	int endingWith,  
 	decimal highValueAmount)  
 	{  
 		using (var db = new MyDbContext())  
 		{  
 			var orders = from o in db.Orders  
 				join ol in db.GetEntities<OrderLine>() on o.Id equals ol.OrderId  
 				where 	o.Year >= startingWith && 
				 		o.Year <= endingWith &&	
						ol.Amount > highValueAmount  
 				select o;
		 
 			return orders.ToList();  
 		}  
 	}  
 	…  
}  

This approach is the most rigid and it makes me add new functions and new repositories each time a functionality is added or changed. I usually do not use it, unless I am dealing with a storage which doesn’t support LINQ.

I prefer to implement the first approach described above, even more in large applications, with many use cases. It is a better choice when you are dealing with more types of implementations which use expensive external resources, which need to be released as soon as possible in deterministic way. In other words, I am addressing with this all the IDisposable implementations (not only the repository), which may be injected through DI.

The first approach and the second one are not exclusive. I may get through DI the repository for some cases and use the service locator where I explicitly want to be more closed to my clients and I would need to dispose the repository even sooner than the request / session would end. I very often use them together: the repository through DI for read only cases, where I would want to more flexibility (more reads on same repository instance), and the repository through Service Locator for read write cases, where I want a smaller scope for my unit of work.


many design discussions like the above are included in my Code Design Training

You've successfully subscribed to Code Design
Great! Next, complete checkout to get full access to all premium content.
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.
Success! Your billing info is updated.
Error! Billing info update failed.