Posts Tagged Analysis Services

Build Pivot Tables from Analysis Services Cube in C#


Introduction

This post details some of the sorting out I’ve done to automatically generate (with C# code) a Pivot Table from an Analysis Services Cube.

Why?

Well, I have 500+ Analysis Services cubes I want to make multiple Excel sheets for. Each Excel workbook (one workbook for each cube) will have multiple sheets. These Excel workbooks will be the primary display for the data.

“School of hard knocks” lessons

Make sure that you’re using the right interop for Excel:

using Excel = Microsoft.Office.Interop.Excel;

The above is a C# using alias directive (not a statement), which is a bit of a sanity saver: all of the rest of the code can identify the Excel objects succinctly.

  • Make sure that you’re using the correct interop assembly. There are a number of these installed on my work machine, and Visual Studio 2010 is not exactly helpful with the way it loads up all of the available assemblies.
    The key point in the screen grab is that Version 14.0.0.0 is the one you want.

 


  • Make use of the Object Browser in Visual Studio. This has been invaluable in sorting out how to do some of the things I needed to implement. The Excel library is well documented; the information on the arguments and return types is invaluable when working with the Excel Automation Layer.
  • This one is a “style” lesson (probably learned working with COM automation in C++ many years ago, but still relevant in C# today). I always use this sort of pattern; the key points follow:
    Excel.Application oXL = null;
    Excel.Workbook oWB;
    Excel.Worksheet oSheet;
    try
    {
        // Start Excel Automation Layer
        oXL = new Excel.Application();
        // Let me see what's happening
        oXL.Visible = true;
        // Need a workbook
        oWB = oXL.Workbooks.Add();
        // Select the active sheet - by default you get 3 sheets
        oSheet = oWB.ActiveSheet;
        // ... work with the workbook ...
    }
    finally
    {
        // Always shut Excel down, even if an exception is thrown
        if (oXL != null)
            oXL.Quit();
    }

    • Minimise the casting. This lets the compiler and IntelliSense help as much as
      possible when you’re writing the code.
    • Always declare typed intermediate variables where possible. This again makes the compiler work for you.
    • Always use “early-bound” calls (this is a COM automation concept: basically, early-bound calls are resolved at compile time, late-bound at run time). The above two points keep things in the “early-bound” space, which performs better (at runtime), and is easier to code (IMHO).
  • Make use of the new C# 4.0 Named Parameters feature. The following is a fragment from the functioning code in which using the feature adds to the readability of the code.
    Excel.PivotTable tab = cache2.CreatePivotTable(
        TableDestination: oTargetSheet.Range["A3"], 
        TableName:"Guess What", 
        ReadData:false);
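The earlier points about minimising casting and staying early-bound can be contrasted in a short sketch (illustrative only, not from the original code):

```csharp
// Late-bound style: member lookups are resolved at run time, so a
// typo compiles cleanly and only fails when the code executes.
dynamic wbLate = oXL.Workbooks.Add();
wbLate.ActiveShet.Name = "Oops";      // misspelling: run-time failure

// Early-bound style: typed intermediate variables let the compiler
// and IntelliSense verify every member access at compile time.
Excel.Workbook wb = oXL.Workbooks.Add();
Excel.Worksheet ws = wb.ActiveSheet;  // typed intermediate
ws.Name = "Summary";                  // checked at compile time
```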
    

 

The Main Method 

        static void Main(string[] args)
        {
            List<String> Dimensions = new List<String>();
            List<String> SheetNames = new List<string>();
            string odc_Sample = @"C:\Users\Craig\Documents\My Data Sources\localhost SimpleCube Samples 2.odc";
            Excel.Application oXL = null;
            Excel.Workbook oWB;
            Excel.Worksheet oSheet;
            try
            {
                // Start Excel Automation Layer
                oXL = new Excel.Application();
                // Let me see what's happening
                oXL.Visible = true;
                // Need a workbook
                oWB = oXL.Workbooks.Add();
                // Select the active sheet - default you get 3 sheets
                oSheet = oWB.ActiveSheet;
                // Open the ODC file 
                Excel.WorkbookConnection wbCon = oWB.Connections.AddFromFile(odc_Sample);
                // Get hold of the Pivot Cache
                Excel.PivotCaches pch = oWB.PivotCaches();
                // Create the target sheet
                Excel.Worksheet oTargetSheet = oWB.Worksheets.Add(Type: Excel.XlSheetType.xlWorksheet);
                // Create the Pivot table data relationship
                Excel.PivotCache cache2 = pch.Create(
                        SourceType: Excel.XlPivotTableSourceType.xlExternal, 
                        SourceData: wbCon); 
                // Create the Pivot Table
                Excel.PivotTable tab = cache2.CreatePivotTable(
                    TableDestination: oTargetSheet.Range["A3"], 
                    TableName:"Guess What", 
                    ReadData:false);
                // This finds the dimensions, 
                // but does not work for dimensions with Hierarchies 
                foreach(Excel.CubeField field in tab.CubeFields)
                {
                    if (field.Name.Contains("[Measures]"))
                        continue;
                    Dimensions.Add(field.Name);
                }
                // Layout a sheet with each of the dimensions as the row labels.
                for (int i = 0; i < Dimensions.Count; i++)
                {
                    // Pulls apart the ODC file description into a "user friendly" form
                    string Title = ParseDescription(wbCon.Description);               
                    // First layout has the Pivot Table already created
                    if (tab != null)
                    {
                        // Make a sheet name which is descriptive of the content
                        SheetNames.Add(MakeSheetName(Dimensions[i]));
                        oTargetSheet.Name = SheetNames[i];
                        // Layout the Analysis Cube into a Pivot table
                        LayoutCube(tab, Dimensions[i], SheetNames[i], Title, oTargetSheet);
                        tab = null;
                    }
                    else
                    {
                        // Make a sheet name which is descriptive of the content
                        SheetNames.Add(MakeSheetName(Dimensions[i]));
                        oTargetSheet = oWB.Worksheets.Add(Type: Excel.XlSheetType.xlWorksheet);
                        oTargetSheet.Name = SheetNames[i];
                        // Create the data linkage for the new pivot table
                        Excel.PivotCache pivCache = pch.Create(
                                SourceType: Excel.XlPivotTableSourceType.xlExternal,
                                SourceData: wbCon);
                        // Create the pivot table
                        tab = pivCache.CreatePivotTable(
                            TableDestination: oTargetSheet.Range["A3"],
                            TableName: SheetNames[i], 
                            ReadData:false);
                        // call the layout routine
                        LayoutCube(tab, Dimensions[i], SheetNames[i], Title, oTargetSheet);
                        tab = null;
                    }
                }               
            }
            catch (System.Runtime.InteropServices.COMException ex)
            {
                Debug.WriteLine(ex);
                throw;
            }
            catch (Exception ex)
            {
                Debug.WriteLine(ex);
                throw;
            }
            finally
            {
                if(oXL != null)
                    oXL.Quit();
            }
        }

Key Features of the Main:

  • This version is specifically for SQL Server Analysis Services Cubes. The cubes which this version will lay out are only ones with:
    • Simple Dimensions. That is, no hierarchies in the dimensions; the logic for the identification of dimensions needs significant modification to cater for hierarchies.
    • More than one Measure. There is a “bug” (not sure where) in that the layout logic (which follows) does not “like” a single Measure: PivotTable.DataLabelRange does not work for a single Measure (code following).
  • Pivot tables are built from the PivotCache object, which uses a WorkbookConnection object as its data source.
  • The WorkbookConnection (in this example) is coming from an ODC file on the file system. The ODC file can also come from a SharePoint 2010 Data Connections library; if you want this, simply substitute the URL of the connection for the file name.
  • This version uses ODC (Office Data Connection) files. I’ve made this choice because all of the generated Excel workbooks will end up in SharePoint, and I don’t want to encode the connection strings into the workbooks; rather, I’ll be using the SharePoint Data Connections Library to hold them. This leaves one place to change connection details for the whole BI application.
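The Main method above calls two helpers, MakeSheetName and ParseDescription, which are not listed in this post. A plausible sketch of MakeSheetName (my assumption of its shape, not the original code) would strip the MDX brackets and respect Excel’s sheet naming rules:

```csharp
// Hypothetical sketch only: the real helper is not shown in the post.
// Excel worksheet names are limited to 31 characters and may not
// contain  : \ / ? * [ ]
private static string MakeSheetName(string dimensionName)
{
    // A cube field name arrives in MDX form, e.g. "[Product].[Product]"
    var sb = new System.Text.StringBuilder();
    foreach (char c in dimensionName)
    {
        if (c == '[' || c == ']')
            continue;                               // drop MDX brackets
        sb.Append(@":\/?*".IndexOf(c) >= 0 ? ' ' : c); // replace invalid chars
    }
    string name = sb.ToString().Trim();
    return name.Length > 31 ? name.Substring(0, 31) : name;
}
```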

The Layout Method

 

        /// <summary>
        /// Generates the layout of a pivot table
        /// </summary>
        /// <param name="tab">The excel pivot table object</param>
        /// <param name="stubField">The dimension which will be put in the row labels zone</param>
        /// <param name="SheetName">The descriptive label for the sheet</param>
        /// <param name="Title">The title for the Pivot table; not part of the pivot table, but a cell on the sheet providing information to the user</param>
        /// <param name="sheet">The sheet which contains the Pivot Table</param>
        private static void LayoutCube(Excel.PivotTable tab, string stubField, string SheetName, string Title, Excel.Worksheet sheet)
        {
            // Set some of the Pivot table options
            tab.DisplayEmptyRow = true;
            tab.DisplayEmptyColumn = true;
            // "Style" the table
            tab.ShowTableStyleColumnStripes = true;
            tab.ShowTableStyleColumnHeaders = true;
            tab.TableStyle2 = "PivotStyleMedium23";
            // Generate the layout
            foreach (Excel.CubeField field in tab.CubeFields)
            {
                if (field.Name.Contains("[Measures]"))
                {
                    // Measures are the data fields
                    field.Orientation = Excel.XlPivotFieldOrientation.xlDataField;
                }
                else
                {
                    if (field.Name == stubField)
                    {
                        // Make the selected dimension the row field
                        field.Orientation = Excel.XlPivotFieldOrientation.xlRowField;
                    }
                    else
                    {
                        // Page field = filters on the pivot table.
                        field.Orientation = Excel.XlPivotFieldOrientation.xlPageField;
                    }
                }
            }
            // Make the column headings (data labels) look "better"
            Excel.Range DataLabels = tab.DataLabelRange;
            DataLabels.ColumnWidth = 11;
            DataLabels.WrapText = true;
            DataLabels.HorizontalAlignment = Excel.XlHAlign.xlHAlignRight;
            // Drop the title into the sheet
            Excel.Range titleRange = sheet.Cells[2, 4];
            titleRange.Value2 = Title;
            titleRange.Font.Name = "Calibri";
            titleRange.Font.Size = 14;
            titleRange.Font.Bold = true;

        }

The Layout Method : Key Features

  • The XlPivotFieldOrientation enum does all of the “hard work” in generating the layout.
  • The ordering of the Data Columns is not “controlled” at present. Enhancing this will be another voyage of discovery.
  • PivotTable.DataLabelRange seems to fail when only one Measure is in the Pivot Table.
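One hedged workaround for the single-Measure case (my assumption; I haven’t isolated the underlying cause) is simply to catch the COM failure and treat the label styling as optional:

```csharp
// Workaround sketch: DataLabelRange appears to throw when the pivot
// table has only one Measure, so skip the styling when it fails.
try
{
    Excel.Range dataLabels = tab.DataLabelRange;
    dataLabels.ColumnWidth = 11;
    dataLabels.WrapText = true;
    dataLabels.HorizontalAlignment = Excel.XlHAlign.xlHAlignRight;
}
catch (System.Runtime.InteropServices.COMException)
{
    // Single-Measure table: leave the default label formatting.
}
```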

 


Microsoft Analysis Services, and more


Introduction

This post continues the series on the Data Warehouse, product retirement project I’m currently leading. The previous post in this series was Descending Into Microsoft Analysis Services.

Update – Where We Have Reached

  • The Analysis Services Object model traversing program has made a migration to a Windows Forms Tree Viewer. The program is a “lazy” evaluator, in that it only traverses an object (or collection) when the user selects the type. This makes the thing pretty quick to use. It is also all driven by reflection, so in theory it could be applied to any object structure. That is a feature which I’ve not implemented, but I may do some time in the future. I should probably refactor the code and make a subclass of the TreeView with the reflection code as a “custom control”.
  • The program which defines a “Cube” from the XML and SQL tables is working well. A couple of refinements have been made to it. These have included:
    • Getting the display order working better. It’s not perfect; to do that properly we would need to add a presentation order column to all of the SQL tables which are used to build the dimensions.
    • Reusing dimensions across “Cubes” which are built from the same SQL table. One little step forward for better data management. There still is a long way to go with improving the data management. But, almost all of those improvements will have to come in subsequent projects.
  • The data comparer has progressed, but taken a detour. I’ll cover that in the “dancing goal posts” section (next).
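The “lazy” evaluation described above can be sketched as a standard WinForms pattern (an assumed shape, since the viewer’s code isn’t shown here): each node carries its object in Tag and gets a placeholder child so the expand glyph appears; real children are built by reflection only on BeforeExpand.

```csharp
using System.Windows.Forms;

// Sketch only: node.Tag holds the object the node represents.
private void objectTree_BeforeExpand(object sender, TreeViewCancelEventArgs e)
{
    // The placeholder child marks a node that has not been expanded yet.
    if (e.Node.Nodes.Count != 1 || e.Node.Nodes[0].Text != "...")
        return;
    e.Node.Nodes.Clear();
    object target = e.Node.Tag;
    foreach (var prop in target.GetType().GetProperties())
    {
        object value;
        try { value = prop.GetValue(target, null); }
        catch { continue; }                   // some properties throw
        if (value == null) continue;
        var child = new TreeNode(prop.Name) { Tag = value };
        child.Nodes.Add("...");               // defer the next level
        e.Node.Nodes.Add(child);
    }
}
```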

Dancing Goal Posts

There are, or have been, a couple of “extra” requirements for the project. These have included:

  • A “data salvage” aspect to the project. There are a number of years’ worth of data which we are going to salvage from the existing system. Hence, the data compare function has taken on board an extra requirement to extract all of the base data from the existing data. This is a detour which may lead to a couple of choices needing to be made. These choices are:
    • To pour all of the data into one “Cube”. This exposes a number of issues:
      • The “cookie cutter” which builds the SSAS cubes would need to be enhanced to add a time dimension (to hang the different “cubes” of data from).
      • There could be revisions to the data, which have resulted in different “cubes” being created. Developing a dimensional model which caters for this as well could be a “bridge too far” for this project.
      • There have been changes to the dimensions over time. These changes have just resulted in different “cubes” with different label sets in the old system. Building up an approach which implements “Slowly Changing Dimensions”, again, could be a “bridge too far” for the project.
  • My Analysis Object Viewer may well need some enhancements, to get a better understanding of the scope of the dimensional issues mentioned above. These enhancements could include:
    • Creating a cube for each of the salvaged datasets. This would be done by the existing “cookie cutter” we have for creating “Cubes”
    • Building something visual to show the dimensionality of each of the cubes. This would need something which gets me a “table like” view of the dimensions, with hierarchy and code set information as well. Probably a big job. Or,
    • Build something which selects the set of cubes, then uses LINQ over the object model to determine the following:
      • Sets the “time 0” cube then runs time forward from there.
      • Identifies “New Dimensions”. These would be dimensions which did not exist in the “time 0” Cube.
      • Identifies Changed Dimensions. There are a number of classes of change which will need to be detected. These include:
        • New Codes, or Labels
        • New Levels
        • Different Hierarchy
        • Differences in the Key pointing to a Label.
  • The data presentation layer for the project needs to be sorted through. There are a number of alternatives here which include:
    • Excel, and the Pivot table. I’ve already pointed Excel at some of the Cubes we have built, and they did work OK. I’m not sure how well an Excel solution will work with some of the low bandwidth sites that this solution will need to support. Hence, the next couple of alternatives.
    • SharePoint, with the Excel parts. I’ve never explored how this works, so it will be a “voyage of discovery”. This solution has a number of benefits:
      • it may work better in the low bandwidth sites,
      • it may allow delegation of the management of the security of the cubes to clients,
      • it may provide a better environment for doing “big” analysis jobs.
    • SharePoint, with PowerPivot. A new thing from Microsoft. I’ve yet to see it in action. So, I’ve no idea what it gives above and beyond standard Excel functions.
    • Reporting Services. This may need to be part of the mix, just for the routine reporting requirements, or maybe more widespread. Again, sight unseen, I’m not sure what this will add to the mix.
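The “New Dimensions” check above could be a small LINQ query over the AMO object model (a sketch under assumptions: the cube parameter names are illustrative, and only dimension names are compared, not hierarchies or levels):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.AnalysisServices;   // AMO

// Dimensions present in a later cube but missing from the "time 0" cube.
static IEnumerable<string> NewDimensions(Cube timeZeroCube, Cube laterCube)
{
    var baseline = timeZeroCube.Dimensions
                               .Cast<CubeDimension>()
                               .Select(d => d.Name);
    return laterCube.Dimensions
                    .Cast<CubeDimension>()
                    .Select(d => d.Name)
                    .Except(baseline);
}
```

The same Except-over-names shape would extend to the “Changed Dimensions” classes by comparing codes, levels, and hierarchies instead of names.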

Conclusions

This is an ongoing “voyage of discovery”. I’m positive about the prospect of getting something which satisfies the “project brief”. Whatever cannot be done now will have to go on the “to do” list for sometime in the future.


Descending into Microsoft Analysis Services


Introduction

One of the projects which I’ve got on the go at the new job is migrating a collection of reporting functions into the Microsoft way of doing things. There is a tight “timebox” on this project, and a diversity of reporting “cubes” which need to be migrated into Analysis Services Cubes.

The other constraint on the project is that the new Cubes should be a direct replacement for the original reporting solution. There should be no “enhancement” of the solution (to start with at least). If we get the project to the finish line early, then there could be an opportunity to do some dimensional modelling on the implementations and “tidy up” the resulting Analysis Cubes.

The Strategy

My plan is a four pronged attack on the problem. Those prongs are:

  1. Use the XML file, which is used to load the data (and metadata) into the current system, to define the Analysis Cubes. The developer I’ve got working on this is making good progress building a program which defines a Cube programmatically. This uses the .NET object model for Analysis Services (Namespace: Microsoft.AnalysisServices; Assembly: Microsoft.AnalysisServices, in microsoft.analysisservices.dll). Defining the translation between the existing product’s conceptual model and the Analysis Services model will come next week. Then we will start to see how this could come together.
  2. A program which will traverse the object model and report all of the properties of a cube. Currently, I’ve a version of this which produces a text file. Moving it to a WinForms interface could be an option for next week, or I may “put it to bed” as it stands.
  3. A program which will programmatically compare the results of the same queries against the two products. I expect that the developer building this will make some good inroads into it over the coming week as well.
  4. “Sweeping up the loose ends”. I’m uncertain for all of the things which will end up in here. There is a bunch of documentation on the continued operation of the solution, already in this bucket.
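Prong 2 (traversing the object model) can be sketched with AMO roughly like this (the server and database names are placeholders, and only a couple of the collections are shown):

```csharp
using System;
using Microsoft.AnalysisServices;   // AMO

var server = new Server();
server.Connect("Data Source=localhost");
Database db = server.Databases["SampleDb"];        // placeholder name
foreach (Cube cube in db.Cubes)
{
    Console.WriteLine("Cube: " + cube.Name);
    foreach (CubeDimension dim in cube.Dimensions)
        Console.WriteLine("  Dimension: " + dim.Name);
    foreach (MeasureGroup mg in cube.MeasureGroups)
        Console.WriteLine("  Measure Group: " + mg.Name);
}
server.Disconnect();
```

The real version recurses further down the collections; redirecting Console output to a file gives the text report mentioned above.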

Conclusions

  1. The object model for Analysis Services looks good (so far). It would appear to allow these types of interactions with the product.
  2. There are a number of issues which the project will need to address as part of the migration which are not immediately part of the translation process, but part of the bigger “making it work well” process. These include:
    1. Storage management. I’ve not delved into this part of SQL Server and Analysis Services. These Cubes will consume space, and how that space is managed will be an issue.
    2. Performance tuning. Another part of SQL Server and Analysis Services Cubes which is unknown at present. Choosing an approach to generating the commonly used aggregates (or maybe all rollups), and managing their storage, will be an issue.
    3. Instrumentation. What we currently get out of the systems, and what we should be getting (to make things more robust) is another open question.
    4. Integration Services. It is unclear at present how much new ETL will need to be built to deliver an automated solution. There are some ETL processes in existence which are used in the current implementations. How much of this the migration will need to modify is yet to be seen.