This post continues the series on the Data Warehouse, product retirement project I’m currently leading. The previous post in this series was Descending Into Microsoft Analysis Services.
Update – Where We Have Reached
- The Analysis Services Object model traversing program has made a migration to a Windows Forms Tree Viewer. The program is a “lazy” evaluator, in that it only traverses an object (or collection) when the user selects the type. This makes the thing pretty quick to use. It is also all driven by reflection, so in theory could be applied to any object structure. That is a feature which I’ve not implemented, but I may do some time in the future. I should probably should refactor the code and make a subclass of the TreeView with the reflection code as a “custom control”.
- The define a “Cube” from the XML, and SQL Tables, is working well. A couple of refinements, have been made to it. These have included:
- Getting the display order working better. It’s not perfect, in that to do that we would need to add a presentation order column to all of the SQL tables which are used to build the dimensions.
- Reusing dimensions across “Cubes” which are built from the same SQL table. One little step forward for better data management. There still is a long way to go with improving the data management. But, almost all of those improvements will have to come in subsequent projects.
- The data comparer has progressed, but taken a detour. I’ll cover that in the “dancing goal posts” section (next).
Dancing Goal Posts
There are, or have been, a couple of “extra” requirements for the project. These have included:
- A “data salvage” aspect to the project. There are a number of years worth of data which we are going to salvage from the existing system. Hence, the data compare function, has taken on board an extra requirement to extract all of the base data from the existing data. This is a detour which may lead to a couple of choices needing to be made. These choices are:
- To pour all of the data into one “Cube”. This exposes a number of issues:
- The “cookie cutter” which builds the SSAS cubes would need to be enhanced to add a time dimensions (to hang the different “cubes” of data from).
- There could be revision to the data, which has resulted in different “cubes” being created. Developing a dimensional model which caters for this as well could be a “bridge too far” for this project.
- There have bee changes to the dimensions over time. These changes have just resulted in different “cubes” with different labels sets, in the old system. Building up an approach which implements “Slowly Changing Dimensions”, again could be a “bridge too far” for the project.
- My Analysis Object Viewer many well need some enhancements. To get a better understanding of the scope of the dimensional issues, which are mentioned above. These enhancements could include:
- Creating a cube for each of the salvaged datasets. This would be done by the existing “cookie cutter” we have for creating “Cubes”
- Building something visual to show the dimensionality of each of the cubes. Would need to get into something which gets me a “table like” view of the dimensions. Would need to have hierarchy, and code set information as well. Probably a big job. Or,
- Build something which selects the set of cubes, then uses LINQ over the object model to determine the following:
- Sets the “time 0” cube then runs time forward from there.
- Identifies “New Dimensions”. These would be dimensions which did not exists in the”time 0” Cube.
- Identifies Changed Dimensions. There are a number of classes of change which will need to be detected. These include:
- New Codes, or Labels
- New Levels
- Different Hierarchy
- Differences in the Key pointing to a Label.
- The data presentation layer for the project needs to be sorted through. There are a number of alternatives here which include:
- Excel, and the Pivot table. I’ve already pointed Excel at some of the Cubes we have built. They did work OK. I’m not sure how good an Excel solution will work with some of the low bandwidth sites that this solution will need to support. Hence, the next couple of alternatives.
- SharePoint, with the Excel parts. I’ve never explored how this works, so it will be a “voyage of discovery”. This solution has a number of the benefits:
- it may work better in the low bandwidth sites,
- it may allow delegation of the management of the security of the cubes to clients,
- it may provide a better environment for doing “big” analysis jobs.
- SharePoint, with PowerPivot. A new thing from Microsoft. I’ve yet to see it in action. So, I’ve no idea what it gives above and beyond standard Excel functions.
- Reporting Services. This may need to be part of the mix, just for the routine reporting requirements, or maybe more wide spread. Again, sight unseen, I’m not sure what this will give to the mix.
This is an ongoing “voyage of discover”. I’m positive about the prospect of getting something which satisfies the “project brief”. What will need to be done in the future is something which we will have to put on the “to do list” for sometime in the future.