Yield Return: Uses, Abuses and “Rules of Thumb”


Introduction

I’ve been exploring (and exploiting) the possibilities which the “yield return” statement presents in the C# language. Together, “yield return” and LINQ make a very powerful combination. The power of yield return is something I had not “leant” upon until this week. This week I’ve been doing some “fancy dancing” in the program I’m writing, and yield return started being the way to solve some tricky logic problems. The program generates multiple DGML graphs (files) from a single input file, and I’ve been using yield return to paginate the input data into those multiple output DGML graphs.

What does yield return do?

My explanation of what yield return does is quite simple. It is a statement which emits one element at a time into a sequence (the IEnumerable). The timing of each emission is “governed” by the iteration over the sequence: the next element is only produced when the consumer asks for it.
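To make the “one element at a time” behaviour concrete, here is a minimal sketch. The class name LazyDemo and the Log list are my own illustrative names, not part of the program described in this post. The log records the order in which the iterator and its consumer actually run.

```csharp
using System;
using System.Collections.Generic;

static class LazyDemo
{
    // Records what happens, so the interleaving can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    // Iterator block: none of this body runs until the caller starts iterating.
    public static IEnumerable<int> Numbers()
    {
        Log.Add("emit 1");
        yield return 1;
        Log.Add("emit 2");
        yield return 2;
        Log.Add("done");
    }

    static void Main()
    {
        IEnumerable<int> sequence = Numbers();   // no iterator code has run yet
        Log.Add("before loop");
        foreach (int n in sequence)
        {
            Log.Add("got " + n);
        }
        Console.WriteLine(string.Join(", ", Log));
        // prints: before loop, emit 1, got 1, emit 2, got 2, done
    }
}
```

The iterator and the foreach loop take turns: each pass of the loop pulls exactly one more element out of the method.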

Taking the previous statements at face value, why does yield return have constraints on its use? If you read the references below, or the MSDN pages on yield return, you will see that there are a number of constraints on the placement of yield return statements. These constraints are listed below (a straight copy and paste from the MSDN online documentation):

The yield statement can only appear inside an iterator block, which can be implemented as the body of a method, operator, or accessor. The body of such methods, operators, or accessors is controlled by the following restrictions:

  • Unsafe blocks are not allowed.
  • Parameters to the method, operator, or accessor cannot be ref or out.
  • A yield return statement cannot be located anywhere inside a try-catch block. It can be located in a try block if the try block is followed by a finally block.
  • A yield break statement may be located in a try block or a catch block but not a finally block.

A yield statement cannot appear in an anonymous method.

So what do the constraints mean for a programmer using yield return? In my opinion the constraints are not too hard to live with. The constraint on out or ref parameters makes sense when one considers what is calling the method, which is the iteration infrastructure “baked into” the C# language. If you’re interested in the design decisions which went into this feature of C#, have a read of the Eric Lippert links at the end of this post. For an explanation of how C# implements this feature, read the Jon Skeet article at the end of this post (it covers all of the gory details of the implementation).
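As a rough illustration of living with the try/catch constraint, the sketch below shows the permitted try/finally form, and one possible workaround for try/catch: do the risky work inside the try/catch and yield outside it. The names ConstraintDemo, CountTo and ParseAll are hypothetical, and the workaround is my own suggestion rather than something taken from the MSDN text.

```csharp
using System;
using System.Collections.Generic;

static class ConstraintDemo
{
    public static readonly List<string> Log = new List<string>();

    // Allowed: yield return inside a try block which has only a finally block.
    public static IEnumerable<int> CountTo(int limit)
    {
        try
        {
            for (int i = 1; i <= limit; i++)
            {
                yield return i;
            }
        }
        finally
        {
            // Runs when the sequence ends, or when the consumer stops early
            // and the enumerator is disposed (foreach does this for you).
            Log.Add("cleanup");
        }
    }

    // Not allowed: a yield return inside try/catch is a compile error.
    // Workaround sketch: keep the risky call inside try/catch, yield afterwards.
    public static IEnumerable<int> ParseAll(IEnumerable<string> inputs)
    {
        foreach (string text in inputs)
        {
            int value = 0;
            bool ok = false;
            try
            {
                value = int.Parse(text);
                ok = true;
            }
            catch (FormatException)
            {
                // Skip input which is not a number.
            }
            if (ok)
                yield return value;   // the yield sits outside the try/catch
        }
    }

    static void Main()
    {
        foreach (int n in CountTo(5))
        {
            if (n == 2) break;        // stop early; the finally block still runs
        }
        Console.WriteLine(string.Join(", ", Log));                               // prints: cleanup
        Console.WriteLine(string.Join(", ", ParseAll(new[] { "1", "x", "3" }))); // prints: 1, 3
    }
}
```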

What the following examples try to demonstrate

The following examples are just a couple of ways that “yield return” can be used. There are a few features which are worth noting:

  • These examples use multiple “yield return” statements in the one method. This is something which the example code I’ve looked at elsewhere does not include, and a feature which I’ve found very useful.
  • The methods all return IEnumerable<T>, the “standard” type which LINQ is defined over (as extension methods on IEnumerable<T>).
  • There are local variables included in the methods. These work just as one would expect.
  • The methods can result in more than one output value being emitted into the resulting sequence. The ability to generate more than one output value for one input value is one of the big benefits of this approach. The comparable approach in pure LINQ becomes very complex very quickly, with multiple joins and/or multiple unions.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    
    namespace YieldReturn
    {
        /// <summary>
        /// Stub class used in the examples
        /// </summary>
        class EntityDetails
        {
            public string EntityType, Name, SubSystem;
        }
        /// <summary>
        /// Stub class used in the examples
        /// </summary>
        class DGML_XML_Assembler
        {
        }
        /// <summary>
        /// Dummy class which is used in the demonstration of yield return
        /// </summary>
        class BulkPagenation
        {
            private DGML_XML_Assembler writer;
            internal bool Matches(string Name, string SubSystem, string Type)
            {
                throw new NotImplementedException();
            }
            public DGML_XML_Assembler Writer
            {
                get { return this.writer; }
            }
        }
        /// <summary>
        /// Class which contains some examples of yield return
        /// </summary>
        class Program
        {
            // Initialised so that GetAssemblersFor does not hit a null reference
            private List<BulkPagenation> BulkPages = new List<BulkPagenation>();
            static void Main(string[] args)
            {
            }
            /// <summary>
            /// Example of a more complex lambda expression
            /// </summary>
            private void experiments()
            {
                List<String> strList = new List<string>();
                var a = strList.Where(A =>
                {
                    switch (complex1(A))
                    {
                        case 1: return true;
                        default: return false;
                    }
                }
                );
    
            }
            /// <summary>
            /// function called by the above example
            /// </summary>
            /// <param name="value"></param>
            /// <returns></returns>
            private int complex1(string value)
            {
                return 1;
            }
            /// <summary>
            /// Example which shows using a Union to form a complex result
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample1a(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails xx in ToEntities.Where(A => A.EntityType == "InternalSystem")
                    .Union(FromEntities.Where(A => A.EntityType == "InternalSystem")))
                {
                }
            }
            /// <summary>
            /// Another way of expressing the Union of two result sets
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample1b(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                var a = ToEntities.Where(A => A.EntityType == "InternalSystem");
                var b = FromEntities.Where(A => A.EntityType == "InternalSystem");
                foreach (EntityDetails xx in a.Union(b))
                {
                }
            }
    
            /// <summary>
            /// Example consuming a yield return implementation
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            private void Sample2(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails Entity in FindTheInternalSystems(ToEntities, FromEntities))
                {
                }
            }
            /// <summary>
            /// Implementation which generates a sequence using yield return
            /// </summary>
            /// <param name="ToEntities"></param>
            /// <param name="FromEntities"></param>
            /// <returns></returns>
            private IEnumerable<EntityDetails> FindTheInternalSystems(List<EntityDetails> ToEntities, List<EntityDetails> FromEntities)
            {
                foreach (EntityDetails entity in ToEntities)
                {
                    if (entity.EntityType == "InternalSystem")
                        yield return entity;
                }
                foreach (EntityDetails entity in FromEntities)
                {
                    if (entity.EntityType == "InternalSystem")
                        yield return entity;
                }
            }
    
            /// <summary>
            /// A more complex example where the generated sequence has more than one matching input
            /// </summary>
            /// <param name="Name"></param>
            /// <param name="SubSystem"></param>
            /// <param name="Type"></param>
            /// <returns></returns>
            private IEnumerable<DGML_XML_Assembler> GetAssemblersFor(string Name, string SubSystem, string Type)
            {
                bool used = false;
                foreach (BulkPagenation page in this.BulkPages)
                {
                    if (page.Matches(Name, SubSystem, Type))
                    {
                        used = true;
                        yield return page.Writer;
                    }
                }
                if (!used)
                {
                    var bal = this.BulkPages.Where(A => A.Matches("Balance", String.Empty, "SystemTriggers"));
                    if (bal.Any())
                    {
                        yield return bal.First().Writer;
                    }
                }
                var all = this.BulkPages.Where(A => A.Matches("All", String.Empty, "SystemTriggers"));
                if (all.Any())
                {
                    yield return all.First().Writer;
                }
            }
    
        }
    }

Multiple outputs for one input in LINQ

LINQ’s pedigree comes from some of the same root concepts as SQL. Those roots give SQL great power for sub-setting data, and LINQ inherits that power. But, apart from joining two tables or using Union, SQL cannot generate multiple output rows from one input row. LINQ has a similar relational algebra underpinning, and largely inherits the same sort of constraint: short of a join, a Union or a SelectMany projection, each input element produces at most one output element.
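Here is a small sketch of “more than one output per input” using yield return, with a SelectMany version alongside for comparison. The FanOutDemo class, the Expand method and the dotted names are hypothetical illustrations, not taken from my DGML program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FanOutDemo
{
    // For each name, emit the name itself and, when it contains a dot,
    // also emit the part before the dot (its "group").
    public static IEnumerable<string> Expand(IEnumerable<string> names)
    {
        foreach (string name in names)
        {
            yield return name;
            int dot = name.IndexOf('.');
            if (dot > 0)
                yield return name.Substring(0, dot);
        }
    }

    static void Main()
    {
        string[] names = { "Billing.Invoices", "Payroll" };

        Console.WriteLine(string.Join(", ", Expand(names)));
        // prints: Billing.Invoices, Billing, Payroll

        // The same result in pure LINQ needs SelectMany and a small array per input.
        IEnumerable<string> viaLinq = names.SelectMany(n =>
            n.IndexOf('.') > 0
                ? new[] { n, n.Substring(0, n.IndexOf('.')) }
                : new[] { n });
        Console.WriteLine(string.Join(", ", viaLinq));
        // prints: Billing.Invoices, Billing, Payroll
    }
}
```

With one extra output the two forms are about equally readable. My experience is that the iterator stays readable for longer as more cases are added.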

Rules of Thumb

There are some things which you cannot easily do with straight LINQ that can be done with the “yield return” statement.

Having a method which generates the sequence on demand is another feature which makes the construct more effective and flexible than LINQ operations alone.
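A quick sketch of what “on demand” buys you: the iterator below never ends, yet it is safe to use because the consumer decides how many elements to pull. OnDemandDemo and PageNumbers are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OnDemandDemo
{
    // An endless sequence of page numbers: 1, 2, 3, ...
    // Each number is only computed when the consumer asks for the next element.
    public static IEnumerable<int> PageNumbers()
    {
        int page = 1;
        while (true)
        {
            yield return page++;
        }
    }

    static void Main()
    {
        // Take(3) stops asking after three elements, so the loop above does not run away.
        List<int> firstThree = PageNumbers().Take(3).ToList();
        Console.WriteLine(string.Join(", ", firstThree));   // prints: 1, 2, 3
    }
}
```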

My rules of thumb (the rough and ready guidance I’ve gleaned to date) for when you should consider using the “yield return” statement:

  • If your selection logic is complex. This is a bit arbitrary, but when it “feels like it won’t fit” in a lambda expression, consider using “yield return”.
  • If you want to generate more than one output element from one input element. Sure, the Union LINQ extension method can be used to stitch together two sequences. This works for a small number of sequences, but as the number of sequences climbs it starts to get unwieldy.
  • If you have complex join conditions. Some of the above examples could be expressed as joins between the two sequences, with a conditional join condition. A “conditional join condition” is a bit of a mouthful; really it is a multiple case join condition (if condition a, join one way, else join another way).
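The rules above can be sketched in one small iterator. It is a cut-down, string-based variation on the GetAssemblersFor example earlier in this post: emit every match, fall back to a default when nothing matched, and always finish with a catch-all. RoutingDemo, PagesFor and the prefix-matching rule are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

static class RoutingDemo
{
    // Route an item to every page whose name it starts with, fall back to
    // "Balance" when nothing matched, and always finish with "All".
    public static IEnumerable<string> PagesFor(string item, IEnumerable<string> pages)
    {
        bool used = false;
        foreach (string page in pages)
        {
            if (item.StartsWith(page, StringComparison.Ordinal))
            {
                used = true;
                yield return page;
            }
        }
        if (!used)
            yield return "Balance";   // the "else" case of the conditional join
        yield return "All";           // emitted for every item
    }

    static void Main()
    {
        string[] pages = { "Billing", "Payroll" };
        Console.WriteLine(string.Join(", ", PagesFor("Billing.Invoices", pages)));   // prints: Billing, All
        Console.WriteLine(string.Join(", ", PagesFor("Audit", pages)));              // prints: Balance, All
    }
}
```

The “did anything match?” flag is the part which I find awkward to express in a single LINQ query, and it is just an ordinary local variable here.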

Conclusions

The wise application of “yield return” in C# developments is something which can improve the clarity of an implementation. It is well worth a C# developer’s time to master.

This construct “hurt my brain” when I first came to it. That is probably because I had read a “bit too much” about the inner mechanics of the implementation in the C# language. Having too much of a focus on the inner workings, rather than on what the technique could offer my C# developments, was a mistake which probably made coming to terms with the technique harder than it needed to be.

References

Previous LINQ Blog Posts
