Archive for November 14th, 2010

Debug.WriteLine in C# 4.0 and .Net Framework 4.0


Introduction

This one is a very short blog post. There is one improvement which has made its way into C# 4.0 and .Net Framework 4.0, which I wish to share.

This improvement is a very simple one. If I had a paranoid streak, I would say Microsoft must have been watching the code I’ve been writing. Why? Because the code I write is littered with the following:

int i = 0;
// The old way
Debug.WriteLine(String.Format("What is i {0} - old way", i));

I use the debug output window in Visual Studio heavily. Shoving formatted strings into the debug output window is one of my standard ways of producing debug diagnostics from programs in development.

The Enhancement

[ConditionalAttribute("DEBUG")]
public static void WriteLine(
    string format,
    params Object[] args
);

The above is the prototype of the addition to the Debug class, another overload of the WriteLine method. This WriteLine method is documented here: Debug.WriteLine.

Simply put, this is a WriteLine method which encapsulates the String.Format call, which results in code like:

int i = 0;
// The old way
Debug.WriteLine(String.Format("What is i {0} - old way", i));
// New in Version 4 of C# and .Net Framework
Debug.WriteLine("What is i {0} - New C# 4.0 way", i);
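
One thing worth checking when adopting the new overload is which WriteLine actually gets called. The sketch below is not from the Debug documentation; it is a small experiment, and the class name, the method names and the Capture helper are all made up for illustration. It assumes a build where Debug output is routed through the shared Trace.Listeners collection (as it is on the .Net Framework). When the only argument after the format string is itself a string, the compiler prefers the older WriteLine(string message, string category) overload, so the placeholder is not filled in.

```csharp
#define DEBUG // Debug.WriteLine calls are compiled away unless DEBUG is defined

using System;
using System.Diagnostics;
using System.IO;

public static class DebugWriteLineSketch
{
    // Hypothetical helper: returns whatever Debug writes while 'action' runs.
    public static string Capture(Action action)
    {
        var writer = new StringWriter();
        var listener = new TextWriterTraceListener(writer);
        Trace.Listeners.Add(listener); // assumption: Debug and Trace share listeners
        try
        {
            action();
            listener.Flush();
        }
        finally
        {
            Trace.Listeners.Remove(listener);
        }
        return writer.ToString().Trim();
    }

    public static string WithInt()
    {
        int i = 0;
        // int argument: binds to WriteLine(string format, params object[] args)
        return Capture(() => Debug.WriteLine("What is i {0}", i));
    }

    public static string WithString()
    {
        string name = "i";
        // string argument: binds to WriteLine(string message, string category),
        // so "{0}" is left as it is and 'name' is treated as the category
        return Capture(() => Debug.WriteLine("Variable is {0}", name));
    }

    public static string WithStringAsObject()
    {
        string name = "i";
        // Casting to object forces the format overload
        return Capture(() => Debug.WriteLine("Variable is {0}", (object)name));
    }

    public static void Main()
    {
        Console.WriteLine(WithInt());            // What is i 0
        Console.WriteLine(WithString());         // i: Variable is {0}
        Console.WriteLine(WithStringAsObject()); // Variable is i
    }
}
```

So the new overload is a straight swap for String.Format in most calls, but a single string argument needs the cast (or the old String.Format form) to be formatted.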

Conclusion

It’s a great improvement. It will save me a lot of keystrokes when I’m developing. But this enhancement has a downside: now I need to remember to use it!


LINQ Performance Tuning: Using the LookUp Class


Introduction

In this blog post I hope to share with you the benefits of using the Lookup Class and the IEnumerable.ToLookup method for creating the Lookup Object.

The benefit of using this .Net Framework object can be a significant reduction in the time taken to execute LINQ statements. By significant, I mean that I reduced the elapsed execution time of a program which makes heavy use of LINQ to Objects and LINQ to XML from 15+ minutes to seconds.

If inspiration grabs me I may include some benchmarking tests in this post, or in a subsequent post. I’m always a bit cautious of creating artificial test cases for things like this. Artificial test cases can be very misleading, as they really demonstrate the technique in the best light. The real test of any technique is not in a “test tube”, but in how it helps resolve real programming and performance (in this case) problems.

The Lookup Class – Overview

This is a very interesting animal in the .Net Framework. It has the following interesting features:

  • It is only created through the application of the Factory Method, ToLookup, which is attached to IEnumerable<T> as a LINQ extension method.
  • There is no public constructor for the class.
  • The Lookup Class is somewhere in between the Dictionary and the List, in that it possesses a mixture of the properties of both classes. These properties include:
    • Like a Dictionary, it supports keyed access to the data. To use an analogy with SQL, the Lookup is like an indexed table.
    • Like a List, it supports storing as many of a “type” in it as possible (available memory or the CPU architecture which the process is running under being the limiting factor).
    • Unlike either List<T> or Dictionary<TKey, TValue>, the one key can have multiple objects stored under it. So, it is very like a Dictionary<T1, List<T2>>. The implementation uses an IGrouping<T1, T2>, but the analogy holds true.
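
The features above can be seen in a few lines. This is only a minimal sketch with made-up sample data (words keyed by their first letter); the class and method names are hypothetical and are not taken from my application.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupOverviewSketch
{
    // Hypothetical sample data: words keyed by their first letter.
    public static ILookup<char, string> BuildLookup()
    {
        var words = new List<string> { "apple", "avocado", "banana", "cherry", "apricot" };
        // No constructor call: ToLookup is the only way in
        return words.ToLookup(word => word[0]);
    }

    public static void Main()
    {
        ILookup<char, string> byInitial = BuildLookup();

        Console.WriteLine(byInitial.Count);         // 3 (distinct keys: a, b, c)
        Console.WriteLine(byInitial['a'].Count());  // 3 (several objects under one key)
        Console.WriteLine(byInitial.Contains('b')); // True
        // Unlike a Dictionary, indexing a missing key returns an empty sequence
        Console.WriteLine(byInitial['z'].Any());    // False
    }
}
```

The empty sequence for a missing key is one place where the Dictionary<T1, List<T2>> analogy does not quite hold, since a Dictionary indexer would throw a KeyNotFoundException instead.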

The Lookup Class – The Details

Internal Storage

The Lookup class is an interesting storage mechanism. The key features are:

  • The storage is by the key, which is the important aspect which is being leveraged when improving the performance of LINQ operations.
  • Unlike the Dictionary Class, the Lookup Class stores more than one object against the key.
  • Like the List Class, the set of objects stored against a key is not sorted.
  • The opposite diagram is one way to visualise the way the Lookup Class stores the data. The important points to note are:
    • The Keys must be of the same type.
    • The Objects stored must be of the same type.
    • The objects within a key are not sorted, so it is safest to treat their order as undetermined. This is very much the same as SQL tables, where the order of the rows is not guaranteed, unless you use an Order By clause (which imposes an order).
    • There can be any number of objects stored under each key.
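
The same structure can also be seen by walking it in code. The following is a rough sketch rather than anything from the framework documentation: Describe is a hypothetical helper, and the odd/even data is made up. Enumerating a Lookup yields one IGrouping per key, and each IGrouping is itself a sequence of the objects stored under that key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupStructureSketch
{
    // Hypothetical helper: one line per key, with the number of objects under it.
    public static List<string> Describe<TKey, TElement>(ILookup<TKey, TElement> lookup)
    {
        var lines = new List<string>();
        foreach (IGrouping<TKey, TElement> grouping in lookup)
        {
            // grouping.Key is the key; the grouping itself enumerates the stored objects
            lines.Add(grouping.Key + " -> " + grouping.Count() + " item(s)");
        }
        return lines;
    }

    public static void Main()
    {
        // Made-up data: the numbers 1 to 5 keyed as "odd" or "even"
        ILookup<string, int> byParity =
            Enumerable.Range(1, 5).ToLookup(n => n % 2 == 0 ? "even" : "odd");

        foreach (string line in Describe(byParity))
        {
            Console.WriteLine(line);
        }
    }
}
```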

Creating An Instance of the Lookup Class

There is not too much in the creation of a Lookup Object, apart from the caveats which are:

  • It can only be created as the output of a LINQ operation.
  • It is an immutable object structure. The Lookup object is effectively “read only”; there are no ways to mutate the content (add or remove elements, or change the key set).

The following example C# code shows two of the ways of creating a Lookup Object.

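// Requires: using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
List<XElement> seed = new List<XElement>();
// Example 1: key each element by its XName
var example1 = (from element in seed
                select element)
               .ToLookup(A => A.Name);
// Example 2: key each element by the values of its "Test1" and "Test2" attributes
// (assumes every element has both attributes; Attribute() returns null otherwise)
var example2 = (from element in seed
                select new { key = element.Attribute("Test1").Value, value = element })
                    .Union(from element in seed
                           select new { key = element.Attribute("Test2").Value, value = element })
                    .ToLookup(A => A.key, A => A.value);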

Example1 will result in a lookup indexed by the XName, containing a set of IGrouping of XElement.

Example2 will result in a lookup indexed by the string value from the Attributes “Test1” and “Test2”, with the same XElement potentially falling under different key values.

There are variants of the ToLookup method which take an IEqualityComparer for the type of the Key. These, I must admit, I’ve not needed to use. My use cases for the Lookup Object have not included object types in the Key other than string, and the framework-provided equality comparer for that has suited me fine (thus far).
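
For completeness, here is what one of those comparer variants might look like with string keys. This is a sketch only: I have not needed it in my own code, the names are made up, and StringComparer.OrdinalIgnoreCase is just one of the comparers the framework supplies.

```csharp
using System;
using System.Linq;

public static class LookupComparerSketch
{
    public static ILookup<string, string> BuildCaseInsensitive(string[] names)
    {
        // The second argument is the IEqualityComparer<string> used to compare keys
        return names.ToLookup(name => name, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        var names = new[] { "Node", "node", "NODE", "Link" };

        var defaultLookup = names.ToLookup(name => name);
        var relaxedLookup = BuildCaseInsensitive(names);

        Console.WriteLine(defaultLookup.Count);           // 4 (every spelling is its own key)
        Console.WriteLine(relaxedLookup.Count);           // 2 (the "node" spellings share a key)
        Console.WriteLine(relaxedLookup["nOdE"].Count()); // 3
    }
}
```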

Limitations Of The Lookup Class

There are a number of limitations which the Lookup Class comes with. Many of these limitations have been mentioned in this blog post already, but I’ll put a list of them in here (just for completeness). The limitations include (and these are the ones I’ve found, or found written about):

  • No public constructor. The only way to create a Lookup Object is to use the ToLookup factory method on the IEnumerable interface. Or, if you prefer, the Lookup Object can only be created as the output from LINQ operations.
  • There are no mutators for the Lookup Object. As an output from LINQ, this output sequence is effectively read only. There are no Add, or Remove, methods available for the Lookup object. If you need a different set of content, you will need to generate that set through LINQ, and create another Lookup Object.
  • The IGrouping structure of the storage which the Lookup Class uses is a bit of a thing to deal with. There are a couple of ways to unravel this structure (I’ve found two thus far, but that’s not to say this is an exhaustive list):
      List<XElement> seed = new List<XElement>();
      var example1 = (from element in seed
                      select element)
                     .ToLookup(A => A.Name);
      // One way to unwrap the IGrouping
      var unwrap1 = example1.SelectMany(A=>A);
      // Another way to unwrap the IGrouping
      var unwrap2 = from step1 in example1
                    from unwrapped in step1
                    select unwrapped;
    • Unwrap1 uses the SelectMany method (see: LINQ SelectMany and IGrouping for some more on the use of SelectMany). The “A=>A” is required for the method, and simply says flatten the multiple input sequences (one for each key value) into one output sequence.
    • Unwrap2 uses a nested from clause in the LINQ syntax. If you’ve done anything with LINQ to XML you would be familiar with using intermediate sequences in LINQ syntax.
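
Since there are no mutators, “adding” to a Lookup means building a new one. One way of doing that (a sketch with hypothetical names and made-up data, not the only approach) is to flatten the existing Lookup with SelectMany, append the new item, and group again. Note that the key selector has to be supplied again, because the Lookup does not remember how its keys were produced.

```csharp
using System;
using System.Linq;

public static class LookupRebuildSketch
{
    // Hypothetical helper: returns a new lookup containing the old content plus 'extra'.
    public static ILookup<int, string> WithExtra(ILookup<int, string> original, string extra)
    {
        return original
            .SelectMany(grouping => grouping)  // flatten the IGroupings
            .Concat(new[] { extra })           // append the new item
            .ToLookup(word => word.Length);    // re-apply the key selector
    }

    public static void Main()
    {
        // Made-up data: words keyed by their length
        ILookup<int, string> original =
            new[] { "xml", "linq", "dgml" }.ToLookup(word => word.Length);
        ILookup<int, string> extended = WithExtra(original, "join");

        Console.WriteLine(original[4].Count()); // 2 (the original is unchanged)
        Console.WriteLine(extended[4].Count()); // 3
    }
}
```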

Using An Instance Of The Lookup Class In LINQ

This is where the “rubber hits the road”, or more to the point where big performance improvements can be made in the execution of LINQ statements.  As with anything performance related please test these techniques in the context of your application. The following is what I have observed in the context of my application. I’ve been using equal measures of the Stopwatch Class and the Visual Studio Profiler to measure the impacts of my application of the Lookup Class.

The following is the fastest way in LINQ to work with the Lookup object (the best way I’ve found thus far).

List<XElement> seed = new List<XElement>();
var example1 = (from element in seed
                select element)
               .ToLookup(A => A.Name);
var dict1 = new Dictionary<XName, XElement>();
var join_sample = from lut in example1
                  join dict in dict1 on lut.Key equals dict.Key
                  select lut;

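The LINQ join clause seems (this is by observation of the impact in the Visual Studio Profiler) to resolve through hashed keys rather than by scanning, which is analogous to the way a SQL execution plan will resolve a join through the indexes of the two tables being joined. Strictly speaking, the LINQ to Objects join builds its own internal lookup over the inner sequence (dict1 here) and probes it once for each element of the outer sequence, so the work done stays close to a single pass over each input.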
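
Outside of a join, the gain comes from replacing repeated scans with keyed access. The sketch below is an illustration with hypothetical method names and made-up elements, not code lifted from my application. Both methods return the same answer, but the first walks the whole list once for every name it is asked about, while the second builds the Lookup once and then indexes into it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LookupQuerySketch
{
    // Scans the whole list once for each wanted name
    public static int CountByScanning(List<XElement> elements, IEnumerable<XName> wanted)
    {
        return wanted.Sum(name => elements.Count(e => e.Name == name));
    }

    // Builds the Lookup once, then each wanted name is a keyed access
    public static int CountByLookup(List<XElement> elements, IEnumerable<XName> wanted)
    {
        ILookup<XName, XElement> byName = elements.ToLookup(e => e.Name);
        return wanted.Sum(name => byName[name].Count());
    }

    public static void Main()
    {
        var elements = new List<XElement>
        {
            new XElement("Node"), new XElement("Link"), new XElement("Node")
        };
        var wanted = new XName[] { "Node", "Link", "Missing" };

        Console.WriteLine(CountByScanning(elements, wanted)); // 3
        Console.WriteLine(CountByLookup(elements, wanted));   // 3
    }
}
```

How much faster the second form is will depend on the size of the list and the number of keys being asked for, so measure it (Stopwatch or the profiler) in your own application before relying on it.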

Conclusion

The performance boost that LINQ operations can gain through the judicious application of the Lookup Object makes learning to master the use of the object well worthwhile (if you want fast code).

The DGML generating and pagination process I’m currently building has benefitted significantly from the judicious use of these objects. I’ve taken the process of generating multiple files of DGML from minutes (around the 10 to 15 minute mark) to seconds (20 to 30 seconds).
