LINQ to SQL Use static CompiledQuery.Compile and static DataContext for BIG Performance Improvements


Introduction

This blog post is describes some of the steps I took to boost the performance of LINQ to SQL an applications (very significantly).

This “train of thought” was prompted by a blog post here on WordPress “LINQ To SQL Very Slow Performance Without Compile (CompileQuery)”. Which “got me thinking” about the data manipulation program I’ve been developing which transforms unstructured text (following a sort of pattern) into structured data, which I’m rendering using DGML (see: “Introduction to Directed Graph Markup Language ( DGML )” for a brief introduction to DGML). I probably post some of the details for generating DGML graphs on this blog (later). This program is using LINQ to SQL to do a bunch of lookups a various points. It had become a bit of a slow process, so was a perfect candidate for a “bit of a tune up” performance wise, and for the applications of some CompiledQuery.Compile enhancements.

The Implementation

There were a couple of steps in the implementation:

  • Move the declaration, and initialisation of the LINQ to SQL DataContext object (which is derived from the System.Data.Linq.DataContext Class ) from the method to a private static field on the class.
    private class DataElementsProcessor
    {
        private static DataElementsDataContext DEctx = new DataElementsDataContext();
  • Next create the declaration of the compiled query. For those reading this who are not familiar with parameterised classes and lambda expressions, I’ll try and explain what is happening in the declaration following the code snippets.
    • What is being replaced (the original LINQ to SQL query statement):
      var found = from de in DEctx.Data_Elements
                  where de.DE_Name == name
                  select de;
    • The replacement declaration:
    • private static Func<DataElementsDataContext, string, IEnumerable>
          GetDeByName = CompiledQuery.Compile((DataElementsDataContext DEctx, string name) =>
                          from de in DEctx.Data_Elements
                          where de.DE_Name == name
                          select de);
    • To explain the declaration:
      • private static: private because only this class should use this, static because I only want one of these available to all instances of the class.
      • Func> : This declares the type which the compiled LINQ to SQL Expression is. It encapusulates a delegate or lambda expression which takes the first two types as parameters and returns the sequence of DataELement objects.
      • GetDeByName: is the name of the delegate being created.
      • CompiledQuery.Compile: the method on which compiles the LINQ to SQL query (which takes the types identified in the Func< part of the declaration) and returned the compiled delegate.
      • (DataElementsDataContext DeCtx, string name) => : Identifies the types and names of the arguments to lambda expression.
      • from .. : the LINQ to SQL query which will be compiled.

Calling the Compiled Version

This is where it gets “tricky” (I’ll cover the “tricky” bits after the simple cases). The simple use cases are straight replacements:

  • Replace the query.
  • //var found = from de in DEctx.Data_Elements
    //            where de.DE_Name == name
    //            select de;
    var found = GetDeByName(DEctx, name);
  • Replace a bit more:
    • Which I think “works” a bit better. There is now no need for the variable “found” in the segment of code, and just testing the Any() on the sequence is all I wanted to do anyway.
  • //var found = GetDeByName(DEctx, name);
    // if(!found.Any())
    if (!GetDeByName(DEctx, name).Any())
    {
    
    }
  • The “tricky” bits:
    • The Sequence you get at runtime is a “iterate once only” sequence. So, the following code will throw an “System.InvalidOperationException” in System.Data.Linq.dll, with the message “<Message>The query results cannot be enumerated more than once.</Message>. In my code the situations where I’d do things which caused this type of exception was like following:
    • if (!found.Any())
      {
      }
      else
      {
          Debug.WriteLine("Rejected update already existed");
          // need to add this when I added changed to a compiled qu
          //found = GetDEByName(DECtx, DE);
          found.ToList().ForEach(A => Debug.WriteLine(String.Format("{0} {1}", A.DE_Name, A.DE_Classification)));
      }
    • This would break with the exception. The “remedy” I’ve applied has been to use a “ToList()“ to pull the results into a collection in one iteration, and then have the flexibility to iterate over the collection as much as I like. This approach cost a bit of memory, but these have been single row result sets, so there is not too much of a penalty.
      • There are probably other, more elegant, ways to “solve” this problem. But, as part of a quick upgrade to get a BIG performance improvement, they would have been more expensive.
    • // Need to pull the result set into a local list
      var found = GetDEByName(DECtx, DE).ToList();
      if (!found.Any())
      {
      }

The Results

On the program which I was working, the following table is the before, and after results.

Run 1 Run 2 Run 3 Run 4
Base Line No Compiles 114451 115143
All Queries Compile and Static Data Contexts 56906 58120 57177 56515
Percentage Improvement (start – end)/start 50.19077

This is a significant improvement in performance for what amounted to “not too much work”. The hard part was getting “my head around” the Func<> template type declaration. The numbers are from the Stopwatch Class and are ElapsedMilliseconds. The stopwatch object is the only way (superior to all other methods) to get elapsed times for executing objects in .Net.

Conclusion:

  • A vote of thanks to author of “Er. alokpandey’s Blog” for writing the article: “LINQ To SQL Very Slow Performance Without Compile (CompileQuery)”, without which I would not been prompted to implement this way of improving the performance of the program (an many more in the future I expect).
  • The conversion is not painless, if you do not understand what the Func<> is doing. Once you see what that piece of syntax is doing, then things start to fall into place.
  • I did not expect such a big gain in performance from the implementation of this change and I achieved. But, I’d not probed the code to discover where the time was being expended.
  • Apologies if you do not like the code snippet approach this post has taken. Most of the time I try and post code segments which can be used without too much modification. This time the topic is really about statements (declarations of things in the code and the syntax), rather being about “a way of doing something” (which a method can illustrate). So, I’ve used snippets of code to demonstrate the syntax used.
  • Again I yearn for the flexibility which Windows Live Spaces gave, where I could put file of code up as well as a blog post. Something which WordPress does not offer, so code snippets ibid is the solution which I’ve adopted.
Advertisements

, , , , , , , , , , , ,

  1. Yield Return: Uses, Abuses and “Rules of Thumb” « Craig's Eclectic Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: