LINQ Extension Method To Dump any IEnumerable


Introduction

I have been doing some development with LINQ recently, and will present some of the generally useful LINQ extension methods in this and some forthcoming blog posts.

This post will focus on the most generally useful LINQ extension methods I have developed. These methods produce a formatted dump of the contents of a sequence (an IEnumerable<T>, to be precise).

These methods have evolved to their current form through the application of the DRY Principle (Don’t Repeat Yourself). I found that I was repeatedly writing very similar code to dump the contents of LINQ result sequences. The repetition of very similar code throughout the project led me to develop these extension methods.

The Class Defining the Extension Methods

The C# compiler is very pedantic about the rules for defining extension methods. The containing class must be marked as static, and it must be non-generic. The requirements for the implementation of extension methods are described in Extension Methods (C# Programming Guide). The following is the class definition which I have been using to contain the LINQ extension methods.

    public static class LINQ_Extensions 

The full version of the class definition, with the comments that result in IntelliSense context-sensitive help being generated, is as follows:

    /// <summary>
    /// Class which supplies LINQ extension Methods.
    /// Extension methods are:
    /// <see cref="Window"/>,
    /// <see cref="AllValuesDistinct"/>,
    /// <see cref="ToPrintString"/>,
    /// <see cref="ToIntegralValue"/>,
    /// <see cref="ToBigIntValue"/>.
    /// </summary>
    public static class LINQ_Extensions

Failure to mark the class as static will result in the compiler error CS1106.

error CS1106: Extension method must be defined in a non-generic static class
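
As a minimal sketch of these rules, the following shows the smallest shape an extension method can take. The class name SketchExtensions and the method CountItems are hypothetical names used only for illustration; they are not part of my LINQ_Extensions class.

```csharp
using System.Collections.Generic;

// Hypothetical illustration only; not part of LINQ_Extensions.
// The class is static and non-generic, as the compiler requires.
public static class SketchExtensions
{
    // The 'this' modifier on the first parameter is what makes
    // this static method an extension method on IEnumerable<TSource>.
    public static int CountItems<TSource>(this IEnumerable<TSource> source)
    {
        int count = 0;
        foreach (TSource item in source)
            count++;
        return count;
    }
}
```

With this in scope, a call such as new[] { 1, 2, 3 }.CountItems() reads as if CountItems were an instance method of the array. Removing the static modifier from the class declaration should reproduce the CS1106 error shown above.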

Introducing The ToPrintString LINQ Extension Method

This is the first of a pair of LINQ extension methods I will present. This method is the one I first implemented. The second extension method, which I will present in a later post, is very similar to this method, but addresses some of the limitations that this method contains. I will detail the most significant limitation of this implementation further on in this blog post.

ToPrintString – Design Decisions

The design of this extension method needed to enable a number of features. These design features included:

1) I wanted external control over a couple of points in the processing of this extension method. These points of control accept Lambda Expressions, enabling the caller to supply the functionality that is required.

2) I wanted the extension method to work with any type of object. Hence, the use of a generic type parameter to the implementation (see: Introduction to Generics (C# Programming Guide) for further information).

3) The type parameter for this implementation should not be constrained. Hence, it will work with declared structs, declared classes, and anonymous types (see Anonymous Types (C# Programming Guide) for further information).

4) I wanted a simple signature for the extension method. This desire led me to implement the method using Optional Parameters (see: Named and Optional Arguments (C# Programming Guide)) with Default Values that provide a useful (in my opinion) implementation. This desire has resulted in an implementation that can be invoked with no arguments.

5) I wanted the flexibility to select which elements of the sequence are dumped. To achieve this I wanted to have a where predicate, like the LINQ Where method, as part of the implementation.

6) I wanted to have all of the information which LINQ can supply available. The particular piece of information that I wanted available was the position in the sequence each object occupies.

ToPrintString – Implementation Decisions

The decisions made in the implementation fall into two groups. These groups are the implementation decisions that support, or implement, the design goals, and those which support an efficient and effective implementation. The following is the signature for the implementation of the extension method.

public static string ToPrintString<TSource>(
    this IEnumerable<TSource> InputSequence,
    Func<TSource, int, bool> WherePredicate = null,
    Func<TSource, int, string> FormatFunction = null,
    Func<StringBuilder, string, StringBuilder> ConcatenateFunction = null)

Implementation Features Supporting The Design Goals

The signature of the extension method implements a number of the design goals for the method. These design goals and implementations include:

1) The three Func<> arguments expose the points in the method where the caller can supply custom functionality. This satisfies the goal of allowing the caller to supply the functionality that is required.

2) The parts of the function’s signature that utilise the <TSource> type parameter provide the flexibility to apply the extension method to any type of object contained in a sequence.

3) The function signature does not contain any type parameter constraints (see: Constraints on Type Parameters (C# Programming Guide)). This further enables the flexibility of the method, allowing application to any type.

4) The three Func<> arguments to the method are declared with a default value of null. This allows the method to be invoked using a plain .ToPrintString() call. I will write more about the use of null as a default value further on in this blog post.

5) The method argument Func<TSource, int, bool> WherePredicate = null enables the capability to apply a logical expression to select objects from the sequence. The signature that is utilised for the WherePredicate is the same signature as the predicate of the indexed overload of the LINQ Where extension method.

6) The int arguments to the Func<TSource, int, bool> WherePredicate and Func<TSource, int, string> FormatFunction are the position in the sequence that the object occupies. This is a base zero number.
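
One point about the position argument is easy to miss, so here is a small sketch of it. The class IndexSketch and the method OddPositions are hypothetical names for illustration only. The index passed to Where is the position in the original sequence, while the index passed to a Select that follows a Where is the position in the filtered sequence.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not part of LINQ_Extensions.
public static class IndexSketch
{
    // Keeps the items at odd positions, then labels each survivor
    // with the position that Select reports for it.
    public static List<string> OddPositions(IEnumerable<string> items)
    {
        return items
            // 'position' here counts within the original sequence.
            .Where((item, position) => position % 2 == 1)
            // 'position' here restarts at zero within the filtered sequence.
            .Select((item, position) => string.Format("[{0}] {1}", position, item))
            .ToList();
    }
}
```

For the input "a", "b", "c", "d" this returns "[0] b" and "[1] d": the items at original positions 1 and 3 are kept, but they are numbered 0 and 1 by Select.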

Implementation Decisions Supporting An Effective Implementation

There are a couple of implementation choices within the implementation. These choices attempt to achieve the most efficient, and effective, implementation. These choices include:

· The FormatFunction and ConcatenateFunction provide usable (and, for me, useful) default values if the argument is null. See below (The Implementation of ToPrintString) for the default values implemented.

· The extension method uses the LINQ Extension Method Aggregate to perform the output, or resulting, string concatenation. The Aggregate extension method seems to be the clearest expression of intent in forming the output from the extension method.

· The Aggregate method uses a StringBuilder object to assemble the result. The use of the StringBuilder object results in a more efficient implementation when compared with just concatenating String Objects (in general, and when the size of the output string can get large).
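
The combination of Aggregate and StringBuilder described in the last two points can be sketched in isolation as follows. The class AggregateSketch and the method JoinWithSpaces are hypothetical names for illustration only. The StringBuilder is passed as the seed, and each step appends to it and returns the same instance.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical illustration only; not part of LINQ_Extensions.
public static class AggregateSketch
{
    public static string JoinWithSpaces(IEnumerable<string> values)
    {
        // The seed is an empty StringBuilder; AppendFormat returns the
        // same StringBuilder, which becomes the accumulator for the next item.
        StringBuilder result = values.Aggregate(
            new StringBuilder(),
            (builder, value) => builder.AppendFormat(" {0}", value));
        return result.ToString();
    }
}
```

Calling JoinWithSpaces on "a", "b" gives " a b". Note the leading space, which comes from the format string being applied to every item, including the first.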

The Implementation of ToPrintString

The following is the implementation of the ToPrintString extension method.

/// <summary>
/// Builds a printable string from the enumerable.
/// </summary>
/// <typeparam name="TSource">Type of the source object contained in the enumerable.</typeparam>
/// <param name="InputSequence">The enumerable which is formatted for printing.</param>
/// <param name="WherePredicate">[Optional] A where predicate used to select the objects to be output.<br/>
/// Predicate signature:
/// <code>Func&lt;TSource InputObject, int PositionInSequence, bool ReturnValue&gt;</code>
/// Predicate Arguments:
/// <list type="number">
/// <item><description>[Input Parameter] <br/>
/// The object from the input sequence. <br/>
/// The type of the object is the same as the declaration of the sequence.<br/>
/// For compound sequences like Dictionary the object is a KeyValuePair.
/// </description></item>
/// <item><description>[Input Parameter] <br/>
/// The position in the input sequence which the object occupies.
/// This is a base zero number.</description></item>
/// <item><description>[Return Parameter] <br/>
/// Indicates if the object should be selected (true case) or excluded (false case).
/// </description></item>
/// </list>
/// </param>
/// <param name="FormatFunction">[Optional] A formatting function which will convert
/// the position in the sequence and the object into a string value.<br/>
/// Function signature:
/// <code>Func&lt;TSource InputObject, int PositionInSequence, string ReturnValue&gt;</code>
/// Function Arguments:
/// <list type="number">
/// <item>[Input Parameter] The object from the input sequence. <br/>
/// The type of the object is the same as the declaration of the sequence.<br/>
/// For compound sequences like Dictionary the object is a KeyValuePair.</item>
/// <item>[Input Parameter] The position in the input sequence which the object occupies.
/// This is a base zero number.<br/>
/// If a where clause is used and rejects objects,
/// then this number is the position in the result of the where clause sequence.
/// </item>
/// <item>[Return Parameter] The required string representation of the object.</item>
/// <item>[Default Value]<br/>The following is used if the FormatFunction argument is null.
/// <code>(Source, Position) => string.Format("[{0}] {1}", Position, Source);</code>
/// </item>
/// </list>
/// </param>
/// <param name="ConcatenateFunction">[Optional]<br/>
/// A function which concatenates the string versions of the objects into one string.<br/>
/// Function signature:
/// <code>Func&lt;StringBuilder ResultString, string ObjectStringValue, StringBuilder ReturnValue&gt;</code>
/// Arguments:
/// <list type="number">
/// <item>[Input Parameter]<br/>
/// The StringBuilder object which is used to collect the input object formatted strings.</item>
/// <item>[Input Parameter]<br/>
/// The string representation of the object, generated by the FormatFunction.</item>
/// <item>[Return Parameter]<br/>StringBuilder result from the ConcatenateFunction.</item>
/// <item>[Default Value]<br/>This is used if the ConcatenateFunction argument is null.
/// <code>ConcatenateFunction = (Result, Value) => Result.AppendFormat(" {0}", Value);</code>
/// </item>
/// </list>
/// </param>
/// <returns>String of formatted and concatenated values.</returns>
/// <remarks>
/// This extension method can potentially exceed the maximum capacity of the string object.
/// </remarks>
/// <example>
/// <code>// Simple test case - no function arguments
/// string output = "Test1".ToPrintString();
/// Debug.WriteLine(
///     string.Format("Characters in Test1 = {0}", output));
/// </code>
/// </example>
public static string ToPrintString<TSource>(
    this IEnumerable<TSource> InputSequence,
    Func<TSource, int, bool> WherePredicate = null,
    Func<TSource, int, string> FormatFunction = null,
    Func<StringBuilder, string, StringBuilder> ConcatenateFunction = null)
{
    if (FormatFunction == null)
        FormatFunction = (Source, Position) => string.Format("[{0}] {1}", Position, Source);
    if (ConcatenateFunction == null)
        ConcatenateFunction = (Result, Value) => Result.AppendFormat(" {0}", Value);
    StringBuilder retVal;
    if (WherePredicate == null)
    {
        retVal = InputSequence
            .Select((a, pos) => FormatFunction(a, pos))
            .Aggregate(new StringBuilder(),
            (ReturnString, Value) => ConcatenateFunction(ReturnString, Value));
    }
    else {
        retVal = InputSequence
            .Where((InputObject, Position) => WherePredicate(InputObject, Position))
            .Select((InputObject, Position) => FormatFunction(InputObject, Position))
            .Aggregate(new StringBuilder(),
            (ReturnString, Value) => ConcatenateFunction(ReturnString, Value));
    }
    return retVal.ToString();
}
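
The two branches above differ only in whether Where is applied. As a sketch of one possible variation (not the implementation I use, and under the hypothetical names PrintStringSketch and ToPrintStringSketch), the filter can be applied conditionally so that the Select and Aggregate steps are written once:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical variant for illustration; not the implementation listed above.
public static class PrintStringSketch
{
    public static string ToPrintStringSketch<TSource>(
        this IEnumerable<TSource> InputSequence,
        Func<TSource, int, bool> WherePredicate = null,
        Func<TSource, int, string> FormatFunction = null,
        Func<StringBuilder, string, StringBuilder> ConcatenateFunction = null)
    {
        // Same defaults as the original method.
        if (FormatFunction == null)
            FormatFunction = (Source, Position) => string.Format("[{0}] {1}", Position, Source);
        if (ConcatenateFunction == null)
            ConcatenateFunction = (Result, Value) => Result.AppendFormat(" {0}", Value);

        // Apply the filter only when the caller supplied one.
        IEnumerable<TSource> selected = WherePredicate == null
            ? InputSequence
            : InputSequence.Where(WherePredicate);

        // The delegates are passed directly; the indexed overloads of
        // Select and Where are chosen because the delegates take an int.
        return selected
            .Select(FormatFunction)
            .Aggregate(new StringBuilder(), ConcatenateFunction)
            .ToString();
    }
}
```

With the default arguments, new[] { 1, 2, 3, 4 }.ToPrintStringSketch() produces " [0] 1 [1] 2 [2] 3 [3] 4". Supplying (val, pos) => val % 2 == 0 as the where predicate produces " [0] 2 [1] 4", because the position passed to the format function is the position after filtering.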

Examples of code calling the ToPrintString Extension Method

The following is a method that demonstrates a number of invocations of the extension method ToPrintString. I hope that it shows the main ways that this method could be invoked.

private void SimpleToPrintStringTests()
{
    // Simple test case - no function arguments
    string output = "Test1".ToPrintString();
    Debug.WriteLine(
        string.Format("Characters in Test1 = {0}", output));

    // Declaring a simple array
    int[] SimpleArray = new int[] { 1, 2, 3, 4 };
    // Test against an array with no arguments.
    output = SimpleArray.ToPrintString();
    Debug.WriteLine(
        string.Format("Simple 4 Element int array {0}", output));
    // Supplying a where clause
    output = SimpleArray.ToPrintString((val, pos) => val % 2 == 0);
    Debug.WriteLine(
        string.Format("Simple 4 Element int array, with a where clause {0}", output));
    // Supplying an optional argument for the Format Function
    output = SimpleArray.ToPrintString(FormatFunction: (val, pos) => val.ToString());
    Debug.WriteLine(
        string.Format(
        "Simple 4 Element int array, with an optional Format Function argument\n{0}", output));
    // Supplying an optional argument for the Concatenation Function
    output = SimpleArray.ToPrintString(
        ConcatenateFunction: (carry, val) => carry.AppendFormat("{0}, ", val));
    Debug.WriteLine(
        string.Format(
        "Simple 4 Element int array, with an optional Concatenation Function argument\n{0}", output));
    // Declaring a where predicate
    Func<int, int, bool> WherePredicate =
        (InputObject, Position) =>
        {
            if (Position % 2 == 0)
                return false;
            return true;
        };
    // Supplying a where predicate as an externally declared function
    output = SimpleArray.ToPrintString(WherePredicate);
    Debug.WriteLine(
        string.Format(
        "Simple 4 Element int array, with a Where Predicate argument declared externally\n{0}", output));

    // Declaring a list of objects
    List<Tuple<int, string>> SimpleObjects = new List<Tuple<int, string>>()
    {
        Tuple.Create(1, "Test"),        Tuple.Create(200, "String Test"),
        Tuple.Create(-21, "Testing"),   Tuple.Create(0, "the quick brown")
    };
    // External Where Predicate
    Func<Tuple<int, string>, int, bool> Where1 =
        (obj, pos) =>
        {
            if (obj.Item1 >= 0) return true;
            else return false;
        };
    // External Format Predicate
    Func<Tuple<int, string>, int, string> Format1 =
        (obj, pos) => string.Format(
            "[{0}] int value={1} string value ={2}\n", pos, obj.Item1, obj.Item2);
    // Calling using externally declared predicates
    output = SimpleObjects.ToPrintString(Where1, Format1);
    Debug.WriteLine(
        string.Format("Processing a list of object with external predicates\n{0}", output));
    // Another where predicate
    Func<Tuple<int, string>, int, bool> Where2 =
        (obj, pos) =>
        {
            if (obj.Item1 < 0) return true;
            else return false;
        };
    // Calling using externally declared predicates
    output = SimpleObjects.ToPrintString(Where2, Format1);
    Debug.WriteLine(
        string.Format("Processing a list of object with external predicates 2\n{0}", output));

    // A more complex object collection to test against
    Dictionary<long, int?> DictTest = new Dictionary<long, int?>()
    {
        { 234L, null },         {-44345L, 65742 },
        { -5644, null },        {6799032L, 8765464 }
    };
    // Supplying where and format as more complex inline lambda expressions
    output = DictTest.ToPrintString((dictObj, pos) => dictObj.Value.HasValue,
        (dictObj, pos) =>
            string.Format("Key={0} Value={1}\n", dictObj.Key, dictObj.Value.Value));
    Debug.WriteLine(
        string.Format("Processing a dictionary with inline lambda expressions\n{0}", output));
    // A format function for a dictionary.
    // NB: You need to unwrap the Dictionary into passed KeyValuePair objects.
    // Also, lambda capture of the Dictionary Object to supply the Count property.
    Func<KeyValuePair<long, int?>, int, string> Format2 =
        (dictObj, position) =>
        {
            if (dictObj.Value.HasValue)
                return string.Format("Object {0} of {1} Key = {2} Value = {3}\n",
                    position, DictTest.Count, dictObj.Key, dictObj.Value);
            else return string.Format("Object {0} of {1} Key = {2} Value = null\n",
                    position, DictTest.Count, dictObj.Key);
        };
    output = DictTest.ToPrintString(FormatFunction: Format2);
    Debug.WriteLine(
        string.Format(
        "Processing the KeyValPair objects with named external function\n{0}", output));
    return;
}

Limitations Of The Implementation

There is one significant limitation of this implementation: the use of a string as the return type. The string object has a finite (but quite large) limit on the length of the string that can be stored in the object. A sequence with many elements, and/or a large amount of information formatted per object, could exceed the maximum size of a string. The System.String documentation says that the maximum size of a String object in memory is 2GB, or about 1 billion characters; the character count is roughly half the byte count because the characters of the string are stored as UTF-16 code units of two bytes each.

Mechanically, or within the string class, the Length property and many methods of System.String use Int32 for lengths and positions, so the theoretical upper bound on the character count is Int32.MaxValue (or 2,147,483,647). In practice the 2GB memory limit is reached first, at about half that number of characters, and the memory available to the process may impose a lower limit still.

In a subsequent blog post, probably the next blog post, I will detail another extension method that addresses this limitation.

Conclusions

There are a number of points that I think are worth noting. These concluding remarks include:

· Building this extension method was not particularly difficult.

· The use of the Func<> object takes a bit of getting used to. There are plenty of examples showing how to use it in the .NET framework library.

· The Func<> object does not allow for more information in the method signature than is shown above. The use and meaning of the arguments has to come from the supporting documentation for the method.

· The use of a default value of null for the optional Func<> arguments is all C# seems to allow. The default value of an optional parameter must be a compile-time constant, and a lambda expression is not one, so null is the only default available for a delegate parameter.
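
The last point can be sketched as follows. The class DefaultValueSketch and the method Describe are hypothetical names for illustration only. The commented-out declaration is the form the compiler rejects; the working form accepts null and substitutes the real default inside the method body.

```csharp
using System;

// Hypothetical illustration only; not part of LINQ_Extensions.
public static class DefaultValueSketch
{
    // Does not compile (error CS1736): a lambda is not a compile-time constant.
    // public static string Describe(int value, Func<int, string> format = v => v.ToString())

    // Compiles: null is a valid default, and the real default is supplied in the body.
    public static string Describe(int value, Func<int, string> format = null)
    {
        format = format ?? (v => string.Format("value={0}", v));
        return format(value);
    }
}
```

Here Describe(5) returns "value=5", while Describe(5, v => (v * 2).ToString()) returns "10". The cost of this pattern is that the default behaviour is invisible in the signature, which is why it has to be described in the documentation comments.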
