Yield Return: Uses, Abuses and “Rules of Thumb”


Introduction

I’ve been exploring (and exploiting) the possibilities which the “yield return” statement presents in the C# language. Together, “yield return” and LINQ make a very powerful combination. The power of yield return is something I had not “leant” upon until this week. This week I’ve been doing some “fancy dancing” in the program I’m writing, and yield return started being the way to solve some tricky logic problems. The program generates multiple DGML graphs (files) from one input file, and I’ve been using yield return to paginate the input data into those multiple output DGML graphs.

What does yield return do?

My explanation of what yield return does is quite simple. It is a statement which emits one element into the sequence (the IEnumerable) at a time. The timing of each emission is “governed” by the iteration over the sequence: the next element is only produced when the consumer asks for it.
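To make the timing concrete, here is a minimal sketch. The names (YieldTimingDemo, Numbers, and the log list) are made up for illustration and are not from the program described in this post. It records the order in which the iterator body and the consuming loop actually run.

```csharp
using System;
using System.Collections.Generic;

public static class YieldTimingDemo
{
    // Hypothetical iterator: notes in the log each time it is about to emit a value.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("emitting 1");
        yield return 1;
        log.Add("emitting 2");
        yield return 2;
        log.Add("finished");
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling the method only creates the sequence; none of the body has run yet.
        IEnumerable<int> sequence = Numbers(log);
        log.Add("before loop");

        // Each pass of the loop resumes the iterator body up to its next yield return.
        foreach (int n in sequence)
        {
            log.Add("consumed " + n);
        }

        foreach (string line in log)
        {
            Console.WriteLine(line);
        }
    }
}
```

Running Main prints "before loop", "emitting 1", "consumed 1", "emitting 2", "consumed 2", "finished", in that order. The producer and the consumer take turns, which is the whole point of the construct.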

Taking the previous statements at face value, why does yield return have constraints on its use? If you read the references below, or the MSDN pages on yield return, you will see that there are a number of constraints on the placement of yield return statements. These constraints are listed below (a straight copy and paste from the MSDN online documentation):

The yield statement can only appear inside an iterator block, which can be implemented as the body of a method, operator, or accessor. The body of such methods, operators, or accessors is controlled by the following restrictions:

  • Unsafe blocks are not allowed.
  • Parameters to the method, operator, or accessor cannot be ref or out.
  • A yield return statement cannot be located anywhere inside a try-catch block. It can be located in a try block if the try block is followed by a finally block.
  • A yield break statement may be located in a try block or a catch block but not a finally block.

A yield statement cannot appear in an anonymous method.

So what do the constraints mean for a programmer using yield return? In my opinion the constraints are not too hard to live with. The constraint on out or ref parameters makes sense when one considers what is calling the method: the body is run piece by piece by the iteration infrastructure “baked into” the C# language, not in one go by the original caller. If you’re interested in the design decisions which went into the implementation of this feature of C#, have a read of the Eric Lippert links at the end of this post. For an explanation of how C# implements this feature, read the Jon Skeet article at the end of this post (the explanation covers all of the gory details of the implementation).
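As a rough sketch of what the try/finally rules allow, the following hypothetical helper (ReadUntilBlank, not part of the program discussed here) places both a yield return and a yield break inside a try block that has only a finally. The log parameter exists purely so the cleanup can be observed.

```csharp
using System;
using System.Collections.Generic;

public static class YieldConstraintsDemo
{
    // Hypothetical helper: emits lines until it meets an empty one.
    public static IEnumerable<string> ReadUntilBlank(IEnumerable<string> lines, List<string> log)
    {
        try
        {
            foreach (string line in lines)
            {
                if (line.Length == 0)
                    yield break;      // permitted inside a try block
                yield return line;    // permitted because the try has a finally and no catch
            }
        }
        finally
        {
            // Runs when the sequence ends, or when the consumer disposes the enumerator early.
            log.Add("cleanup");
        }
        // Adding a catch clause to the try above would make the yield return illegal,
        // as described in the quoted restrictions.
    }

    public static void Main()
    {
        var log = new List<string>();
        var input = new[] { "a", "b", "", "c" };
        foreach (string s in ReadUntilBlank(input, log))
        {
            Console.WriteLine(s);
        }
        Console.WriteLine(string.Join(",", log));
    }
}
```

Main prints "a", "b" and then "cleanup". The element after the blank line is never reached, and the finally block still runs exactly once.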

What the following examples try to demonstrate

The following examples are just a couple of ways that the “yield return” can be used. There are a couple of features which are worthwhile noting:

  • These examples use multiple “yield return” statements in the one method. This is something which the example code I’ve looked at does not include, and it is a feature which I’ve found very useful.
  • The methods all return IEnumerable<T>, which is the “standard” type that LINQ is defined over the top of (as extension methods on IEnumerable<T>).
  • There are local variables included in the methods. These work just as one would expect.
  • The methods can result in more than one output value being emitted into the resulting sequence. The ability to generate more than one output value for one input value is one of the big benefits of this approach. The comparable approach in LINQ becomes very complex very quickly, with multiple joins and/or multiple unions.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    
    namespace YieldReturn
    {
        /// <summary>
        /// Stub class used in the examples
        /// </summary>
        class EntityDetails
        {
            public string EntityType, Name, SubSystem;
        }
        /// <summary>
        /// Stub class used in the examples
        /// </summary>
        class DGML_XML_Assembler
        {
        }
        /// <summary>
        /// Dummy class which is used in the demonstration of yield return
        /// </summary>
        class BulkPagenation
        {
            private DGML_XML_Assembler writer;
            internal bool Matches(string Name, string SubSystem, string Type)
            {
                throw new NotImplementedException();
            }
            public DGML_XML_Assembler Writer
            {
                get { return this.writer; }
            }
        }
        /// <summary>
        /// Class which contains some examples of yield return
        /// </summary>
        class Program
        {
            private List<BulkPagenation> BulkPages;
            static void Main(string[] args)
            {
            }
            /// <summary>
            /// Example of a more complex lambda expression
            /// </summary>
            private void experiments()
            {
                List<String> strList = new List<string>();
                var a = strList.Where(A =>
                {
                    switch (complex1(A))
                    {
                        case 1: return true;
                        default: return false;
                    }
                }
                );
    
            }
            /// <summary>
            /// function called by the above example
            /// </summary>
            /// <param name="value"></param>
            /// <returns></returns>
            private int complex1(string value)
            {
                return 1;
            }
            /// <summary>
            /// Example which shows using a Union to form a complex result
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample1a(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails xx in ToEntities.Where(A => A.EntityType == "InternalSystem")
                    .Union(FromEntities.Where(A => A.EntityType == "InternalSystem")))
                {
                }
            }
            /// <summary>
            /// Another way of expressing the Union of two result sets
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample1b(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                var a = ToEntities.Where(A => A.EntityType == "InternalSystem");
                var b = FromEntities.Where(A => A.EntityType == "InternalSystem");
                foreach (EntityDetails xx in a.Union(b))
                {
                }
            }
    
            /// <summary>
            /// Example consuming a yield return implementation
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample2(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails Entity in FindTheInternalSystems(ToEntities, FromEntities))
                {
                }
            }
            /// <summary>
            /// Implementation which generates a sequence using yield return.
            /// Unlike Union in Sample1a and Sample1b, this does not remove duplicates.
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            /// <returns></returns>
            private IEnumerable<EntityDetails> FindTheInternalSystems(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails entity in ToEntities)
                {
                    if (entity.EntityType == "InternalSystem")
                        yield return entity;
                }
                foreach (EntityDetails entity in FromEntities)
                {
                    if (entity.EntityType == "InternalSystem")
                        yield return entity;
                }
            }
    
            /// <summary>
            /// A more complex example where the generated sequence has more than one matching input
            /// </summary>
            /// <param name="Name"></param>
            /// <param name="SubSystem"></param>
            /// <param name="Type"></param>
            /// <returns></returns>
            private IEnumerable<DGML_XML_Assembler> GetAssemblersFor(string Name, string SubSystem, string Type)
            {
                bool used = false;
                foreach (BulkPagenation page in this.BulkPages)
                {
                    if (page.Matches(Name, SubSystem, Type))
                    {
                        used = true;
                        yield return page.Writer;
                    }
                }
                if (!used)
                {
                    var bal = this.BulkPages.Where(A => A.Matches("Balance", String.Empty, "SystemTriggers"));
                    if (bal.Any())
                    {
                        yield return bal.First().Writer;
                    }
                }
                var all = this.BulkPages.Where(A => A.Matches("All", String.Empty, "SystemTriggers"));
                if (all.Any())
                {
                    yield return all.First().Writer;
                }
            }
    
        }
    }

Multiple outputs for one input in LINQ

LINQ has a pedigree which shares some of the same root concepts as SQL. Those roots give SQL great power for sub-setting data, and LINQ inherits that power. But, apart from joining two tables or using Union, SQL gives you no easy way to generate multiple output rows from one input row. LINQ has a similar relational-algebra underpinning, and inherits much the same constraint: SelectMany can flatten several outputs per input, but the projection tends to become awkward as soon as the number of outputs depends on conditional logic.
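A small sketch of the difference, under made-up rules: every name produces a "node:" entry, and names containing a dot also produce a "link:" entry. The class, the methods and the rule itself are hypothetical. The first method uses yield return; the second uses SelectMany, the LINQ operator that comes closest, for comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleOutputsDemo
{
    // yield return version: one input element may emit one or two output elements.
    public static IEnumerable<string> Expand(IEnumerable<string> names)
    {
        foreach (string name in names)
        {
            yield return "node:" + name;
            if (name.Contains("."))
                yield return "link:" + name;
        }
    }

    // LINQ version: each input is projected to a small array, and SelectMany flattens them.
    public static IEnumerable<string> ExpandWithSelectMany(IEnumerable<string> names)
    {
        return names.SelectMany(name => name.Contains(".")
            ? new[] { "node:" + name, "link:" + name }
            : new[] { "node:" + name });
    }

    public static void Main()
    {
        var names = new[] { "A", "B.C" };
        Console.WriteLine(string.Join(", ", Expand(names)));
        Console.WriteLine(string.Join(", ", ExpandWithSelectMany(names)));
    }
}
```

Both lines print "node:A, node:B.C, link:B.C". With one condition the two forms are about equally readable; each extra conditional output adds a line to the iterator but another branch of array-building to the lambda.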

Rules of Thumb

There are some things which you cannot easily do with straight LINQ that can be done with the “yield return” statement.

Having a method which generates the sequence on demand, one element at a time, is another feature which makes the construct more flexible than chaining LINQ operations alone.
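One consequence of generating on demand is that the sequence does not need an end. The sketch below uses a hypothetical PageNumbers generator (not from the DGML program) and lets the consumer decide how much of it to take.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OnDemandDemo
{
    // Hypothetical generator: an endless run of page numbers starting at 1.
    public static IEnumerable<int> PageNumbers()
    {
        int page = 1;
        while (true)
        {
            yield return page;   // execution pauses here until the next element is requested
            page++;
        }
    }

    public static void Main()
    {
        // Take(3) stops asking after three elements, so the loop above never runs away.
        foreach (int page in PageNumbers().Take(3))
        {
            Console.WriteLine(page);
        }
    }
}
```

Main prints 1, 2 and 3. The while (true) loop is safe only because nothing is computed until the consumer asks for it.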

My rules of thumb (the rough and ready guidance I’ve gleaned to date) for when you should consider using the “yield return” statement:

  • If your selection logic is complex. This is a bit arbitrary, but when it “feels like it won’t fit” in a lambda expression then consider using the “yield return”.
  • If you want to generate more than one output element from one input element. Sure, the Union LINQ extension method can be used to stitch together two sequences. This works for a small number of sequences, but as the number of sequences climbs it starts to get unwieldy.
  • If you have complex join conditions. Some of the above examples could be expressed as joins between the two sequences, with a conditional join condition. A “conditional join condition” is a bit of a mouthful; really it is a multiple-case join condition (if condition a, join one way, else join another way).

Conclusions

The wise application of the “yield return” in C# developments is something which can improve the clarity of an implementation. It is well worth a C# developer’s time to master.

This construct “hurt my brain” when I came to it. That is probably because I had read a “bit too much” about the inner mechanics of the implementation in the C# language. Having too much of a focus on the inner workings, rather than focusing on what the technique could offer to my C# developments, was a mistake which probably made coming to terms with the technique harder than it needed to be.

References

Previous LINQ Blog Posts

Using the ForEach Method


Introduction

I’ve found a new “toy” in the C# language this week, the ForEach method on List<T>. (Strictly it is an instance method of List<T> rather than a LINQ extension method, though it reads like one.) It seems to me to be a very effective way of expressing some of those short “iterate over an x and do something to the members” operations.

I’ve always been a bit “frustrated” with C# when it comes to iterating over collections. The foreach syntax is fine when you’re doing a loop which has some complex logic. But, for short sharp things, it has always seemed to be a few too many keystrokes for the results. Sure, there is the code snippet for foreach which cuts down the keystrokes, but you still have the minimum of 2 lines of code, or 4 if you use { and }. Also my “inbuilt style guide” baulks at putting multiple statements on one line.

foreach (string part in TransactionTriggers) Debug.WriteLine(String.Format(" [{0}] ", part));

Or

foreach (string Name in AllNames.Distinct().OrderBy(A => A))
{
    Debug.WriteLine(Name);
}

Oops, the above includes some of my abuses of LINQ. Doing a Distinct and then sorting the list is probably stretching the friendship with LINQ. But, what the heck, it works 😉 !

Using the ForEach Method

TokenList.ForEach(a => Debug.WriteLine(a));

The above is a quick dump of the contents of a List<string> to the debug console.
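One thing to be aware of: ForEach is defined on List<T> itself, so it is not available on the IEnumerable<T> that LINQ operators such as Distinct or OrderBy return. A possible way around that, sketched here as a hypothetical helper that is not part of the .NET class library, is a small extension method of your own (calling ToList() first is the other option).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableForEachExtensions
{
    // Hypothetical helper: gives any IEnumerable<T> a ForEach in the style of List<T>.ForEach.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        foreach (T item in source)
        {
            action(item);
        }
    }
}

public static class ForEachDemo
{
    public static void Main()
    {
        var names = new List<string> { "b", "a", "b" };

        // Resolves to the built-in List<T>.ForEach instance method.
        names.ForEach(n => Console.WriteLine(n));

        // Distinct().OrderBy() returns an IEnumerable<string>, so this uses the helper above.
        names.Distinct().OrderBy(n => n).ForEach(n => Console.WriteLine(n));
    }
}
```

The first call prints b, a, b; the second prints a, b. Whether such a helper is a good idea is a matter of taste, since it exists only to run side effects.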

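// Assumes the input elements are strings; LookupType(deDC, A) is the single-item overload.
private List<string> LookupType(DataElementsDataContext deDC, List<string> InformationDataElements)
{
    List<string> result = new List<string>();
    InformationDataElements.ForEach(A => result.Add(LookupType(deDC, A)));
    return result;
}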

An example of method invocation and adding to a list. I’ve also used LINQ to achieve the same sort of result:

var log1 = from line in ParsedLines
           select new LogMessage(line);

StringBuilder result = new StringBuilder();
bool First = true;
InformationNameDetails.ForEach(A =>
{
    if (First)
        First = false;
    else
        result.Append(" and ");
    result.Append(A);
});

It is probably an abuse of lambda expressions to include chunks of procedural logic. But, again it works for me, so what the heck!
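For this particular case (a separator between items) the procedural lambda can probably be avoided altogether. The sketch below assumes the list holds strings and uses string.Join; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class JoinDemo
{
    // Builds "x and y and z" without tracking a "first item" flag by hand.
    public static string JoinWithAnd(IEnumerable<string> names)
    {
        return string.Join(" and ", names);
    }

    public static void Main()
    {
        var informationNameDetails = new List<string> { "Name", "Address", "Phone" };
        Console.WriteLine(JoinWithAnd(informationNameDetails));
    }
}
```

Main prints "Name and Address and Phone". The ForEach version is still the one to reach for when the body does more than glue strings together.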

Conclusions

There is more than one observation which should be made at this point. The major ones are:

  • The use of the ForEach method certainly seems to me to be an effective way of “cleaning up” C# code, and removing some of the padding which is necessary for the foreach statement.
  • The ForEach method is something which could be overused. Well, more to the point, the complexity of the lambda expression could get greater than the language designers intended.
  • Moving to the method syntax for some foreach loops opens the door to using the Parallel.ForEach method from the System.Threading.Tasks namespace. You’ve done most of the hard work in the writing of the lambda expression. What is left in the move to a parallel ForEach is a lot of work making sure that what you’re invoking can be done in a parallel manner safely.
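As a hedged illustration of that last point, the sketch below moves a ForEach-style lambda to Parallel.ForEach. The Lengths method is hypothetical; the important assumption is that the lambda only touches a thread-safe collection (a ConcurrentBag here), because adding to a plain List<T> from several threads at once would not be safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForEachDemo
{
    // Hypothetical example: computes the length of each item in parallel.
    public static List<int> Lengths(IEnumerable<string> items)
    {
        // ConcurrentBag tolerates concurrent Add calls; List<T> does not.
        var results = new ConcurrentBag<int>();

        Parallel.ForEach(items, item => results.Add(item.Length));

        // Parallel execution gives no ordering guarantee, so sort before returning.
        return results.OrderBy(n => n).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Lengths(new[] { "a", "bbb", "cc" })));
    }
}
```

Main prints "1,2,3". The lambda itself barely changed from the sequential form; the work went into the choice of collection and into not relying on order.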
