LINQ deferred loading using C# yield-return
The yield keyword has been part of the C# language since version 2.0, which shipped with Visual Studio 2005, yet developers rarely use it. Recently I read the article Demystifying the C# Yield-Return Mechanism, which covers yield-return usage. It gives the three most common usage scenarios, but the author (James McCaffrey) didn’t mention LINQ. I think LINQ is where yield-return becomes most useful, since it provides a mechanism for deferred loading: items are produced only when the consumer asks for them. Here is a quick example.
Let’s say we have a bunch of unit tests to run. Running a single test takes some time, let’s say 1 second. For simplicity, let’s assume that tests with an even Id pass and tests with an odd Id fail:
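// Namespaces used by the snippets in this article.
using System;
using System.Collections.Generic;

public class Test
{
    public int Id { get; set; }

    // Simulates a slow test: waits one second, logs, then reports the result.
    public bool Assert()
    {
        System.Threading.Thread.Sleep(1000);
        Console.WriteLine(string.Format("Test {0} was processed.", Id));
        return Id % 2 == 0; // even Ids pass, odd Ids fail
    }
}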
Let’s write a couple of LINQ-like extension methods for Test collections. Process1 doesn’t use yield return; it collects the passing tests into a list and returns that list. Process2 uses yield return. PrintPass outputs a “Test # is passed.” message for each test it receives.
public static class Extensions
{
    // Eager version: runs every test and builds the full list before returning.
    public static IEnumerable<Test> Process1(this IEnumerable<Test> tests)
    {
        var result = new List<Test>();
        foreach (var test in tests)
        {
            if (test.Assert())
            {
                result.Add(test);
            }
        }
        return result;
    }

    // Lazy version: hands each passing test to the caller as soon as it is found.
    public static IEnumerable<Test> Process2(this IEnumerable<Test> tests)
    {
        foreach (var test in tests)
        {
            if (test.Assert())
            {
                yield return test;
            }
        }
    }

    // Consumer: prints a line for every test in the sequence it is given.
    public static void PrintPass(this IEnumerable<Test> tests)
    {
        foreach (var t in tests)
        {
            Console.WriteLine(string.Format("Test {0} is passed.", t.Id));
        }
    }
}
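One property of an iterator method like Process2 is easy to miss: calling it runs none of its body. The work starts only when the returned sequence is enumerated. Here is a minimal, self-contained sketch of that behaviour. The names `DeferredDemo`, `EvenNumbers` and `Log` are hypothetical helpers made up for this illustration and are not part of the example above.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Hypothetical log that records the order in which things happen.
    public static readonly List<string> Log = new List<string>();

    // Iterator method: its body does not start running until the
    // returned sequence is enumerated.
    public static IEnumerable<int> EvenNumbers(IEnumerable<int> source)
    {
        Log.Add("iterator started");
        foreach (var n in source)
        {
            if (n % 2 == 0)
            {
                yield return n;
            }
        }
    }

    public static void Main()
    {
        // Only creates the iterator object; nothing inside EvenNumbers runs yet.
        var query = EvenNumbers(new[] { 1, 2, 3, 4, 5 });
        Log.Add("query created");

        // Enumeration drives the iterator body one item at a time.
        foreach (var n in query)
        {
            Log.Add("got " + n);
        }

        // Prints: query created, iterator started, got 2, got 4
        Console.WriteLine(string.Join(", ", Log));
    }
}
```

Note that "query created" is logged before "iterator started", even though EvenNumbers was called first.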
Let’s fill a test collection and compare the output of the Process1 and Process2 methods:
class Program
{
    static void Main()
    {
        var tests = new List<Test>(new []
        {
            new Test {Id = 1},
            new Test {Id = 2},
            new Test {Id = 3},
            new Test {Id = 4},
            new Test {Id = 5},
        });

        Console.WriteLine("**** No yield return *******");
        tests.Process1().PrintPass();

        Console.WriteLine("**** Using yield return *******");
        tests.Process2().PrintPass();

        Console.ReadKey();
    }
}
Output:
**** No yield return *******
Test 1 was processed.
Test 2 was processed.
Test 3 was processed.
Test 4 was processed.
Test 5 was processed.
Test 2 is passed.
Test 4 is passed.
**** Using yield return *******
Test 1 was processed.
Test 2 was processed.
Test 2 is passed.
Test 3 was processed.
Test 4 was processed.
Test 4 is passed.
Test 5 was processed.
As you can see, in the first case the whole collection was built before the PrintPass method output anything. In the second case, PrintPass prints “Test 2 is passed.” right after the Assert method for that test has finished its processing.
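The same laziness is what lets a yield-return method compose with LINQ operators that stop early. The sketch below is a standalone file that reuses the shape of Test and Process2, with two changes for illustration: the one-second sleep is dropped, and a hypothetical `Processed` counter is added to count how many tests actually ran. Asking for only the first passing test runs just enough tests to find it, while enumerating the same lazy sequence twice runs every test twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    // Hypothetical counter, added only to observe how many tests ran.
    public static int Processed;

    public int Id { get; set; }

    public bool Assert()
    {
        Processed++;            // the one-second sleep is omitted in this sketch
        return Id % 2 == 0;     // even Ids pass, odd Ids fail
    }
}

public static class Extensions
{
    // Same shape as Process2 above.
    public static IEnumerable<Test> Process2(this IEnumerable<Test> tests)
    {
        foreach (var test in tests)
        {
            if (test.Assert())
            {
                yield return test;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var tests = Enumerable.Range(1, 5).Select(i => new Test { Id = i }).ToList();

        // First() stops pulling from the iterator once one item arrives,
        // so only tests 1 and 2 are run.
        var firstPass = tests.Process2().First();
        Console.WriteLine("First pass: {0}, tests run: {1}", firstPass.Id, Test.Processed);
        // Prints: First pass: 2, tests run: 2

        // Each enumeration restarts the iterator, so the tests run again.
        Test.Processed = 0;
        var passed = tests.Process2();
        var firstCount = passed.Count();
        var secondCount = passed.Count();
        Console.WriteLine("Passed: {0} and {1}, tests run: {2}", firstCount, secondCount, Test.Processed);
        // Prints: Passed: 2 and 2, tests run: 10
    }
}
```

With an eager method like Process1, First() would still run all five tests before returning. On the other hand, if the lazy result is needed more than once, calling ToList() on it once and reusing the list avoids re-running the tests.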