CSV, ExpandoObject and LINQ


Introduction

I have been spurred to write this post for a number of reasons, which include:

  • As a response to JP’s Convert a CSV to XML using C# and the idea that converting CSV to XML is a good way to handle (not so) arbitrary CSV data. I have to agree it is a good way to handle the use case, but I had an inkling that using a .Net Framework object was another way to tackle it.
  • As a personal follow-up on the MSDN Magazine article Expando Objects in C# 4 by Dino Esposito, which I read a while ago and had filed away, in the back of my mind, as an article I should get back to and read again.
  • In part, because I wanted to have a “play” with the ExpandoObject Class in C#, and
  • In part, because the idea of dynamic objects within the .Net Framework was intriguing.

That’s where this post started. So, as you will see below, I’ve got a quick bit of demonstration code which does some interesting things with the ExpandoObject Class.

CSV File Reading Detour

Along the way I was reminded of an important lesson about reading CSV files with .Net programs. That lesson is that, regardless of your preferred approach to parsing the lines of a CSV file (String.Split, Text.RegularExpressions.Regex, or hand-coded parsing), the best way to read a CSV file is to use the built-in functionality in the .Net Framework.

So, what in .Net reads CSV files? There are two broad-brush areas in the .Net Framework which can read CSV files: the ODBC and OLEDB data access classes.

Technology   .Net Namespace       Classes
ODBC         System.Data.Odbc     OdbcDataAdapter Class, OdbcDataReader Class
OLEDB        System.Data.OleDb    OleDbDataAdapter Class, OleDbDataReader Class

I’ve included the classes that implement the IDataReader Interface as well, because they can be very useful.
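To make the OLEDB route concrete, here is a minimal sketch of reading a CSV file with the OleDbDataReader Class. It is only an illustration, not the code this post is really about, and it assumes the Jet text driver is installed and that the file sits in a hypothetical C:\Data folder (on a 64-bit machine the Microsoft.ACE.OLEDB.12.0 provider would be the one to try).

using System;
using System.Data.OleDb;

class OleDbCsvSketch
{
    static void Main()
    {
        // The Data Source is the folder; the file name becomes the "table" in the SELECT.
        string connectionString =
            @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Data;" +
            @"Extended Properties=""text;HDR=Yes;FMT=Delimited""";
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        {
            connection.Open();
            using (OleDbCommand command = new OleDbCommand("SELECT * FROM Acronyms1.csv", connection))
            using (OleDbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0} = {1}", reader[0], reader[1]);
                }
            }
        }
    }
}

The appeal of this route is that the reader takes care of the quoting and delimiter rules that home-grown parsers tend to get wrong.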

One of the uses of the IDataReader Interface is as an input into the SqlBulkCopy Class. The SqlBulkCopy Class is the way to load data into SQL Server with .Net. Why? It is a bulk loader, which is far more efficient than using SQL Insert statements to load data into SQL Server. (I’ll put posting an example of using the SqlBulkCopy Class on my to-do list; I’ve one I wrote a while ago for a work project.)
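Until that post appears, here is roughly the shape of it: a minimal sketch, assuming a hypothetical local Scratch database and a dbo.Acronyms target table, that bulk loads the rows from any IDataReader (such as the OleDbDataReader above) into SQL Server.

using System.Data;
using System.Data.SqlClient;

internal static class BulkLoadSketch
{
    internal static void Load(IDataReader source)
    {
        using (SqlConnection connection =
            new SqlConnection(@"Data Source=.;Initial Catalog=Scratch;Integrated Security=SSPI"))
        {
            connection.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.Acronyms"; // hypothetical target table
                bulkCopy.WriteToServer(source);                 // bulk loads the rows from the IDataReader
            }
        }
    }
}

Because WriteToServer consumes the reader row by row, the data never has to be materialised in memory first.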

Back To Reading CSV into an ExpandoObject

Back to the acronyms. The following demonstrates using the ExpandoObject as the container for the fields of a CSV file.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Dynamic;      // Needed for the ExpandoObject
using System.IO;           // Needed for the StreamReader
using System.Diagnostics;  // Needed for the Debug object, my standard dump location

namespace CSV_With_ExpandoObjects
{
    /// <summary>
    /// Default class generated for Console applications
    /// </summary>
    class Program
    {
        /// <summary>
        /// Main method, invoked by the execution framework.
        /// </summary>
        /// <param name="args">unused</param>
        static void Main(string[] args)
        {
            // Test file name to be loaded
            string FileName = @"..\..\..\Acronyms1.csv";
            // My CSV reader and load into ExpandoObject
            Acronyms acronyms = new Acronyms();
            List<ExpandoObject> acroynmList = acronyms.LoadCSV(FileName);
            // LINQ using ExpandoObjects, with some "fancy dancing" to cast the dynamic object
            // in such a way that the properties can be used.
            // Not a very robust way of dealing with things; the cast expression could break if the property is not there.
            var a = from acroynm in acroynmList
                    where String.IsNullOrEmpty((string)((IDictionary<string, object>)acroynm)["Meaning"])
                    select (string)((IDictionary<string, object>)acroynm)["Acrynom"];
            // Just to check the LINQ statement worked
            Debug.WriteLine(String.Format("Acronyms without meanings {0}", a.Count()));
        }
    }
    /// <summary>
    /// Class which is used to read the CSV file of acronyms
    /// </summary>
    internal class Acronyms
    {
        /// <summary>
        /// CSV delimiters - Yes I know I'm using a very simple CSV form
        /// </summary>
        private char[] delimiters = { ',' };
        /// <summary>
        /// List of expando objects created
        /// </summary>
        private List<ExpandoObject> loadedAcronyms;
        /// <summary>
        /// Default constructor
        /// </summary>
        public Acronyms()
        {
        }
        /// <summary>
        /// Method reads the filename as a CSV file, and creates the List of ExpandoObjects
        /// </summary>
        /// <param name="fileName">File of CSV to be read</param>
        /// <returns>List of ExpandoObject, one object for each line in the CSV file</returns>
        internal List<ExpandoObject> LoadCSV(string fileName)
        {
            List<string> Fields = ReadHeader(fileName);
            loadedAcronyms = ReadFileData(fileName, Fields);
            Debug.WriteLine(loadedAcronyms.Count);
            return loadedAcronyms;
        }
        /// <summary>
        /// Reads the data file of CSV.
        /// Skips the first line, as that has the header row.
        /// Very simple file read and parse the CSV.
        /// </summary>
        /// <param name="fileName">File of CSV to be read</param>
        /// <param name="Fields">The Header Row parsed to identify the columns</param>
        /// <returns>List of ExpandoObject, one object for each line in the CSV file</returns>
        private List<ExpandoObject> ReadFileData(string fileName, List<string> Fields)
        {
            List<ExpandoObject> result = new List<ExpandoObject>();
            using (StreamReader rdr = new StreamReader(fileName))
            {
                string line = rdr.ReadLine();
                while (!rdr.EndOfStream)
                {
                    line = rdr.ReadLine();
                    string[] tokens = line.Split(delimiters);
                    dynamic add = new ExpandoObject();
                    for (int i = 0; i < tokens.Length; i++)
                    {
                        if (i >= Fields.Count())
                            continue;
                        if (Fields[i] == "Acrynom")
                        {
                            add.Acrynom = tokens[i].Trim();
                            continue;
                        }
                        if (Fields[i] == "Meaning")
                        {
                            add.Meaning = tokens[i].Trim();
                            continue;
                        }
                        ((IDictionary<string, object>)add).Add(new KeyValuePair<string, object>(Fields[i], tokens[i]));
                        // Demonstrates, in a limited context, that a dynamic object
                        // "plays" like a real object with dynamically added properties.
                        if (add.Acrynom == String.Empty)
                        {
                        }
                    }
                    // Make sure the Meaning property always exists, so the LINQ query in Main
                    // does not fail when a line has no Meaning value.
                    if (!((IDictionary<string, object>)add).ContainsKey("Meaning"))
                    {
                        add.Meaning = string.Empty;
                    }
                    result.Add(add);
                }
            }
            return result;
        }
        /// <summary>
        /// Reads the header line from a CSV file, and uses that line as the field names for reading that file.
        /// </summary>
        /// <param name="fileName">Filename of the CSV to be read</param>
        /// <returns>List of strings, one string for each column in the header record</returns>
        private List<string> ReadHeader(string fileName)
        {
            string fieldList;
            using (StreamReader rdr = new StreamReader(fileName))
            {
                fieldList = rdr.ReadLine();
            }
            return fieldList.Split(delimiters, StringSplitOptions.RemoveEmptyEntries).Select(A=>A.Trim()).ToList();
        }
    }
}
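As the comment in Main admits, the cast-and-index approach to the ExpandoObject is not very robust; it breaks if a property is missing. A small helper, my own addition rather than part of the listing above, which goes through IDictionary<string, object>.TryGetValue, would tighten that up:

using System.Collections.Generic;
using System.Dynamic;

internal static class ExpandoHelper
{
    /// <summary>
    /// Reads a property from an ExpandoObject, returning an empty string when it is missing.
    /// </summary>
    internal static string GetString(ExpandoObject item, string propertyName)
    {
        object value;
        // ExpandoObject implements IDictionary<string, object>, so a missing property can be
        // handled without a KeyNotFoundException or a runtime binder exception.
        return ((IDictionary<string, object>)item).TryGetValue(propertyName, out value)
            ? (string)value
            : string.Empty;
    }
}

The LINQ query in Main could then be written as:

var a = from acroynm in acroynmList
        where String.IsNullOrEmpty(ExpandoHelper.GetString(acroynm, "Meaning"))
        select ExpandoHelper.GetString(acroynm, "Acrynom");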

Conclusion

An interesting way to read a CSV file. The ExpandoObject is probably not demonstrated in its best light here (there are bound to be better ways to use it).

Reading CSV, as I commented above, should be done with the .Net Framework technology when robust reading is required. For quick or one-off bits of code with simple parsing requirements, the do-it-yourself approach is OK.
