Posts Tagged Stack Overflow

LINQ over FileInfo And Using Let for Clarity


Introduction

A couple of days ago I was reading a question on Stack Overflow about using the FileInfo class with LINQ. As it happened, I had recently delved into this “neck of the woods” myself. I wanted to calculate some metrics on the relative merits of various compression formats for graphics files: the minimum size achieved when the same image is stored in a number of image formats (bmp, jpeg, png, tiff and wdp). This led me to use LINQ over the directory structure, with the FileInfo class supplying the file sizes.

The Solution

The following is my way of sorting through a directory containing a large number of files which share the same name, apart from the extension, and determining which is the smallest. The results are then summarised into a one-line average over all the files processed.

There are a couple of features worth noting:

  • The use of a function in the first LINQ statement. This makes pulling out the name part of the file so much easier. While building the solution, I put a breakpoint in the function to make sure all of the manipulations were happening correctly. This was before I added the part of the LINQ statement which creates the FileInfo objects for each of the files.
  • The use of the let clause in LINQ. For me the presence of the let clause is a real sanity saver. I find it so much simpler to work my way through building the LINQ statement when I have some local variables to work with, and it really helps me keep my logic clear.
  • The use of the Tuple class. This is a new class in .NET Framework 4.0. I’m finding the Tuple class a real boon. It is particularly useful for those occasions where I need to tie a couple of values (or instances of classes) together and treat them as one for some process. Sure, you could create a class which does the same thing. But for internal processes, the “just for a moment while I do something” case, the Tuple class is really handy.
  • The use of the 1.0 * in the BestRatio function. I’m showing my old-fashioned roots, which come from working with systems based on FORTRAN years ago. The 1.0 * “kicks” the calculation out of long/integer arithmetic and into floating point.
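using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

class Results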
{
    private string _dir;

    public string Dir
    {
        get { return _dir; }
        set { _dir = value; }
    }
    private string _namePart;

    public string NamePart
    {
        get { return _namePart; }
        set { _namePart = value; }
    }
    private long _bmpSize;

    public long BmpSize
    {
        get { return _bmpSize; }
        set { _bmpSize = value; }
    }
    private long _jpegSize;

    public long JpegSize
    {
        get { return _jpegSize; }
        set { _jpegSize = value; }
    }
    private long _tiffSize;

    public long TiffSize
    {
        get { return _tiffSize; }
        set { _tiffSize = value; }
    }
    private long _gifSize;

    public long GifSize
    {
        get { return _gifSize; }
        set { _gifSize = value; }
    }
    private long _wdpSize;

    public long WdpSize
    {
        get { return _wdpSize; }
        set { _wdpSize = value; }
    }
    private string _bestType;

    public string BestType
    {
        get { return _bestType; }
        set { _bestType = value; }
    }
    private double _bestRatio;

    public double BestRatio
    {
        get { return _bestRatio; }
        set { _bestRatio = value; }
    }
    public Results()
    {

    }
    public Results(string Dir, string NamePart,
        long BmpSize, long JpegSize, long TiffSize, long GifSize, long WdpSize,
        string BestType, double BestRatio)
    {
        this._dir = Dir;
        this._namePart = NamePart;
        this._bmpSize = BmpSize;
        this._jpegSize = JpegSize;
        this._tiffSize = TiffSize;
        this._gifSize = GifSize;
        this._wdpSize = WdpSize;
        this._bestType = BestType;
        this._bestRatio = BestRatio;
    }

}

class CheckFilesSizes
{
    public CheckFilesSizes()
    {

    }

    internal void Analysis(int AnalysisVersion, string DirectoryRoot)
    {
        List<Results> results = new List<Results>();
        foreach (var dir in Directory.EnumerateDirectories(DirectoryRoot))
        {
            var analysis1 = from names in Directory.EnumerateFiles(dir, "*.bmp")
                            let namePart = ExtractName(names)
                            let fiBmp = new FileInfo(dir + '\\' + namePart + ".bmp")
                            let fiJpeg = new FileInfo(dir + "\\" + namePart + ".jpg")
                            let fiTiff = new FileInfo(dir + "\\" + namePart + ".tiff")
                            let fiGif = new FileInfo(dir + "\\" + namePart + ".gif")
                            let fiWdp = new FileInfo(dir + "\\" + namePart + ".wdp")
                            select new Results(dir, namePart,
                                fiBmp.Length, fiJpeg.Length, fiTiff.Length, fiGif.Length, fiWdp.Length,
                                PickBestType(fiBmp.Length, fiJpeg.Length, fiTiff.Length, fiGif.Length, fiWdp.Length),
                                BestRatio(fiBmp.Length, fiJpeg.Length, fiTiff.Length, fiGif.Length, fiWdp.Length));
            Debug.WriteLine(analysis1.Count());
            results.AddRange(analysis1);
        }
        // Group by winning format, ordered so the most frequent winner comes first.
        var bestTypes = (from result in results
                         group result by result.BestType into groups
                         select new {
                             Key = groups.Key,
                             Freq = groups.Count(),
                             Average = groups.Average(A => A.BestRatio)
                         }).OrderByDescending(A => A.Freq);
        var best = bestTypes.First();
        Debug.WriteLine("The answer is {0} {1} times,  by {2}%",
            best.Key, best.Freq, best.Average);
    }

    private double BestRatio(long bmp, long jpeg, long tiff, long gif, long wdp)
    {
        List<long> sizes = new List<long>() { bmp, jpeg, tiff, gif, wdp };
        long best = sizes.Min();
        long worst = sizes.Max();
        double percentageChange = ((1.0 * (worst - best)) / (1.0 * worst)) * 100.0;
        return percentageChange;
    }

    private string PickBestType(long bmp, long jpeg, long tiff, long gif, long wdp)
    {
        List<Tuple<long, string>> test = new List<Tuple<long, string>>()
            {
                Tuple.Create(bmp, "Bmp"), Tuple.Create(jpeg, "Jpeg"), Tuple.Create(tiff, "Tiff"),
                Tuple.Create(gif, "Gif"), Tuple.Create(wdp, "Wdp")
            };
        return test.Where(A => A.Item1 == test.Min(B => B.Item1)).Select(A => A.Item2).First();
    }

    private string ExtractName(string fileName)
    {
        int iStart = fileName.LastIndexOf('\\');
        int iDot = fileName.LastIndexOf('.');
        string res = fileName.Substring(iStart + 1, iDot - iStart - 1);
        return res;
    }
}
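
The hand-written ExtractName helper above could arguably be replaced by methods that System.IO.Path already provides. The following is a minimal sketch rather than part of the original solution; the class name PathSketch, its method names, and the sample directory and file names are all made up for illustration.

```csharp
using System;
using System.IO;

static class PathSketch
{
    // Returns the file name without directory or extension,
    // which is what ExtractName computes by hand.
    public static string NamePart(string fileName)
    {
        return Path.GetFileNameWithoutExtension(fileName);
    }

    // Builds the path of a sibling file with a different extension.
    public static string Sibling(string dir, string namePart, string extension)
    {
        return Path.Combine(dir, namePart + extension);
    }

    // FileInfo.Length throws FileNotFoundException when the file is missing,
    // so this hypothetical helper returns null in that case.
    public static long? SizeOrNull(string path)
    {
        FileInfo fi = new FileInfo(path);
        return fi.Exists ? (long?)fi.Length : null;
    }

    static void Main()
    {
        // "images" and "photo" are invented sample names.
        string bmp = Path.Combine("images", "photo.bmp");
        string name = NamePart(bmp);
        Console.WriteLine(name);
        Console.WriteLine(Sibling("images", name, ".jpg"));
        Console.WriteLine(SizeOrNull(Sibling("images", name, ".wdp")).HasValue);
    }
}
```

A null size would have to be filtered out before looking for the smallest file, otherwise a missing format could not be told apart from a real result.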

Conclusions

There is a lot one can do with LINQ. Frequently I find that things start to “look tacky”, and coming at the problem with a fresh approach can be very beneficial. There are two main points I wish to make here:

  • Use the let clause in LINQ statements. The judicious use of the let clause can make complex LINQ far more readable, and can improve the performance of LINQ statements (see my blog post on let: Craig’s Eclectic Blog » LINQ to XML: using let, yield return and SelectMany).
  • The Tuple class is invaluable. In those cases where you need to hold a couple of properties together and use them like a class in a list (List<class x>), a tuple could prove handy. The syntax List<Tuple<type1, type2>> is very convenient, compared to creating a class which has a very limited life expectancy (being used in one place, for one case).
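
As a small illustration of these two points, here is a sketch that works on in-memory data instead of real files. It is not the code from the solution above: the class name LetSketch, the method names, the two-format comparison and the sizes are all invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LetSketch
{
    // Percentage saved by the smallest size relative to the largest.
    // The 1.0 * forces floating point arithmetic; with plain longs,
    // (worst - best) / worst truncates to 0 whenever 0 < best <= worst.
    public static double BestRatio(long best, long worst)
    {
        return (1.0 * (worst - best)) / worst * 100.0;
    }

    // Each input tuple is (name, bmp size, jpeg size).
    // Each output tuple is (name, best type, percentage saved).
    public static List<Tuple<string, string, double>> Analyse(
        IEnumerable<Tuple<string, long, long>> files)
    {
        var query = from f in files
                    // let gives each intermediate result a name.
                    let sizes = new[] { Tuple.Create(f.Item2, "Bmp"), Tuple.Create(f.Item3, "Jpeg") }
                    let best = sizes.OrderBy(s => s.Item1).First()
                    let worst = sizes.Max(s => s.Item1)
                    select Tuple.Create(f.Item1, best.Item2, BestRatio(best.Item1, worst));
        return query.ToList();
    }

    static void Main()
    {
        var files = new List<Tuple<string, long, long>>
        {
            Tuple.Create("photo", 1000L, 250L),
            Tuple.Create("icon", 200L, 400L)
        };
        foreach (var r in Analyse(files))
            Console.WriteLine("{0}: {1} wins by {2}%", r.Item1, r.Item2, r.Item3);
    }
}
```

Without the let clauses, the ordering and the maximum would have to be repeated inside the select, which is where queries start to “look tacky”.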

LINQ SelectMany and IGrouping


Introduction

This post is prompted by the following:

Some Examples of SelectMany

The following C# code demonstrates SelectMany being used in a couple of different ways. I’ve collated examples from a couple of sources here (Nick’s Blog, MSDN, and StackOverflow). They demonstrate a couple of different ways to use the SelectMany LINQ extension method.

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    namespace SelectMany1
    {
        class PetOwner
        {
            public string Name { get; set; }
            public List<String> Pets { get; set; }
        }
    
        class Program
        {
            static void Main(string[] args)
            {
                Sample1();  // Taken from Nick Olsen's blog  http://nickstips.wordpress.com/2010/07/26/linq-flatten-a-list-of-lists/
                Sample2();  // Taken from 
                Sample3();  // Taken from MSDN http://msdn.microsoft.com/en-us/library/bb534336.aspx
            }
    
            private static void Sample3()
            {
                PetOwner[] petOwners = 
                        { new PetOwner { Name="Higa, Sidney", 
                              Pets = new List<string>{ "Scruffy", "Sam" } },
                          new PetOwner { Name="Ashkenazi, Ronen", 
                              Pets = new List<string>{ "Walker", "Sugar" } },
                          new PetOwner { Name="Price, Vernette", 
                              Pets = new List<string>{ "Scratches", "Diesel" } } };
    
                // Query using SelectMany().
                IEnumerable<string> query1 = petOwners.SelectMany(petOwner => petOwner.Pets);
    
                Console.WriteLine("Using SelectMany():");
    
                // Only one foreach loop is required to iterate 
                // through the results since it is a
                // one-dimensional collection.
                foreach (string pet in query1)
                {
                    Debug.WriteLine(pet);
                }
            }
    
    
            private static void Sample1()
            {
                List<List<string>> listOfLists = new List<List<string>>();
                listOfLists.Add(new List<string>() { "a", "b", "c" });
                listOfLists.Add(new List<string>() { "d", "e", "f" });
                listOfLists.Add(new List<string>() { "g", "h", "i" });
    
                var flattenedList = listOfLists.SelectMany(x => x);
                Debug.WriteLine("Sample 1");
                foreach (string s in flattenedList)
                    Debug.Write(s + " ");
                Debug.WriteLine(" ");
                //Sample 1
                //a b c d e f g h i  
            }
    
            private static void Sample2()
            {
                // Finding duplicates in a list of string
                List<String> list = new List<String> { "6", "1", "2", "4", "6", "5", "1" };
                var duplicates = list.GroupBy(s => s).SelectMany(grp => grp.Skip(1));
                Debug.WriteLine("Sample 2");
                foreach (string s in duplicates)
                    Debug.Write(String.Format("{0} ", s));
                Debug.WriteLine(" ");
                //Sample 2
                //6 1 
            }
    
        }
    }
    
  • Sample 1: shows the way that SelectMany can unwrap a list of lists into a single list.
  • Sample 2: shows an interesting way of detecting duplicates. Because of the Skip(1), a string which occurs n times (n > 1) is returned n-1 times, and a string which occurs only once is not returned at all. So if you just want each duplicated string once, put a Distinct() after the SelectMany. This example also relies on an interesting property of GroupBy and of the IGrouping interface it returns; I’ll comment on that below.
  • Sample 3: demonstrates a lambda expression which selects a list held inside each object, so that SelectMany flattens the lists of all the objects into one sequence.
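
The Distinct() variation mentioned for Sample 2 can be sketched as follows. This is an illustration only; the class name DuplicateSketch, the method names and the sample data are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DuplicateSketch
{
    // n - 1 copies of every string that occurs n times (n > 1).
    public static List<string> ExtraCopies(IEnumerable<string> items)
    {
        return items.GroupBy(s => s).SelectMany(grp => grp.Skip(1)).ToList();
    }

    // Each duplicated string exactly once.
    public static List<string> DuplicatedValues(IEnumerable<string> items)
    {
        return items.GroupBy(s => s).SelectMany(grp => grp.Skip(1)).Distinct().ToList();
    }

    static void Main()
    {
        var list = new List<string> { "6", "1", "6", "2", "6", "1" };
        // "6" occurs three times, so it appears twice here; "1" appears once.
        Console.WriteLine(string.Join(" ", ExtraCopies(list)));
        // With Distinct(), "6" and "1" each appear once.
        Console.WriteLine(string.Join(" ", DuplicatedValues(list)));
    }
}
```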

IGrouping – The output from LINQ GroupBy

IGrouping is the type of each item returned by the LINQ GroupBy extension method. Sample 2 led me to ponder “How does it work?”. Having worked with (around) the LINQ GroupBy extension method previously, I knew that there was a Key property, and a list of contributors to the group. So, how did the Skip(1) work?

It turns out that IGrouping is a bit of an “interesting animal”. The description of the interface is as follows:

public interface IGrouping<out TKey, out TElement> : IEnumerable<TElement>, IEnumerable

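The IEnumerable<TElement> is the key to how the Skip(1) works. GroupBy returns a sequence of groups, one for each key, and each group is itself a sequence of the elements which share that key. So Skip(1) drops the first element of a group and passes on the rest, and SelectMany flattens what is left of all the groups into a single sequence.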
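
To see both faces of IGrouping, the Key property and the sequence of members, here is a short sketch. The class name GroupingSketch, the method names and the list of words are invented for the example; the words are grouped by their first letter so that the key differs from the elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupingSketch
{
    // Describes each group as "key: member member ...".
    public static List<string> Describe(IEnumerable<string> words)
    {
        var lines = new List<string>();
        foreach (IGrouping<char, string> grp in words.GroupBy(w => w[0]))
        {
            // grp.Key is the grouping key; enumerating grp yields its members.
            lines.Add(grp.Key + ": " + string.Join(" ", grp));
        }
        return lines;
    }

    // Every member of every group except the first one in each group.
    public static List<string> AllButFirst(IEnumerable<string> words)
    {
        return words.GroupBy(w => w[0]).SelectMany(grp => grp.Skip(1)).ToList();
    }

    static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "apricot", "blueberry", "cherry" };
        foreach (string line in Describe(words))
            Console.WriteLine(line);
        Console.WriteLine(string.Join(" ", AllButFirst(words)));
    }
}
```

The "c" group has a single member, so Skip(1) leaves nothing of it, which is exactly why Sample 2 reports only the strings that occur more than once.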

 


A Little Bit of LINQ


The following was posted as a response to a question on Stack Overflow, which is an interesting site I frequent.

Stack Overflow is a community web site of programmers helping programmers. It is a site which allows programmers to post questions, and other programmers post answers. It may sound like a site which would not work because: “Why would you do it for free?”.

The answer to that question probably has many facets. Some of them would include:

  • Altruism. Programming is an altruistic profession. As a programmer you develop “stuff” for people you may, or may not, know, to help them do something.
  • Self Improvement. Programming is a profession in which one needs to be learning all the time. There are always new technologies, or technologies which you’ve not worked in before.
  • Didactic. For programmers, teaching becomes another “string to the bow” in the profession. There is always something which needs to be explained, or taught, to another developer.
  • Self Interest. This is a bit of a follow-on from the didactic point, which could be summarised as “If I improve the quality of programmers in general, then there may be one less mess I have to clean up, or a piece of software which has fewer bugs in it”.

The Question

In LINQ, can I select multiple items?

In Summary

Given a sequence like:

string[] foos = { "abc", "def", "ghi" };

Produce a collection which looks like:

string[] result = {"abc", "cba", "def", "fed", "ghi", "ihg"};

The Answer (my answer at least)

// Requires: using System.Diagnostics; using System.Linq;
static void Main(string[] args)
{
    string[] foos = { "abc", "def", "ghi" };
    // Just to test how to reverse strings
    string[] reversed = (from strings in foos
                         select new string(strings.ToCharArray().Reverse().ToArray())).ToArray();
    // the solution
    string[] result = foos.Union(foos.Select(A => new string(A.ToCharArray().Reverse().ToArray()))).ToArray();
    // output the result to the debug console
    foreach (string a in result) Debug.WriteLine(a);
}

Key Points:

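  • The extension method Union is used to combine the original sequence with the reversed sequence. Note that Union is a set operation: it yields the original strings first, then the reversed strings, and drops any duplicates (a palindrome such as "aba" would appear only once). Concat would keep every element.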
  • The strings are reversed by converting them to a char[] (using the string ToCharArray method) and then using the Reverse extension method to reverse that array’s order.
  • The new strings for the result sequence are created by calling the constructor of the string object which accepts a char[] (new string(char[])).
  • The results are dumped to the debug console (just to check we got what we were after).
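
The question asks for each string to be followed directly by its reverse, while Union puts all the originals first. One way to get the interleaved order, assuming that order matters, is SelectMany. The following is a sketch alongside my answer, not a replacement for it; the class name ReverseSketch and its method names are made up for illustration.

```csharp
using System;
using System.Linq;

static class ReverseSketch
{
    public static string Reverse(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Originals first, then the reversed copies; duplicates are removed.
    public static string[] WithUnion(string[] foos)
    {
        return foos.Union(foos.Select(s => Reverse(s))).ToArray();
    }

    // Each string followed immediately by its reverse; duplicates are kept.
    public static string[] Interleaved(string[] foos)
    {
        return foos.SelectMany(s => new[] { s, Reverse(s) }).ToArray();
    }

    static void Main()
    {
        string[] foos = { "abc", "def", "ghi" };
        Console.WriteLine(string.Join(" ", WithUnion(foos)));
        Console.WriteLine(string.Join(" ", Interleaved(foos)));
    }
}
```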
