Some new developments will have a significant influence on our (Microsoft) BI world. Some are being called revolutionary, others more evolutionary. There is a great temptation to embrace these developments as quickly as possible.
However, it is wise to observe these developments from a more conceptual point of view. It is important to ‘discover’ how the puzzle will or could fit together in the long run. Some of the developments will most certainly never reach true maturity and/or will be replaced by newer developments.
During the Microsoft BI Conference in Seattle last October, we attended a number of exciting presentations, but we increasingly missed conceptual considerations from a Microsoft point of view. The Macaw delegation returned home slightly disappointed on this specific issue. Nevertheless, we are currently investigating how to revise our Microsoft BI architecture (development) model and our BI Solution Factory, which we will use for customer implementations in the period to come: 2010-2015.
We are doing this by testing high-potential developments in small R&D programs, together with the integrated role they need to play in a 100% pure Microsoft BI architecture.
The model of a Data Warehouse
Kimball/Inmon and Data Vault
The Kimball versus Inmon battle has been extended with Data Vault. Data Vault addresses a number of shortcomings in the existing models: traceability, auditability and adaptability to changes in source systems (auditing and maintenance costs). However, how the presentation data layer should be shaped remains unmentioned, other than the advice to use a Star or Snowflake model. The advantage of a Data Vault Data Warehouse can therefore rapidly turn into a disadvantage (the complete environment/architecture becomes more complex!). Furthermore, Data Vault uses the concept of a source-data-driven Data Warehouse instead of a demand-driven Data Warehouse; this means that the Data Warehouse will become substantially larger than needed.
The Data Vault development is very interesting, but from a Microsoft point of view it is not yet mature.
First of all, there is a very simple reason from a practical point of view. Most organizations implementing a Data Warehouse are still in the ‘first time use’ phase and do not ‘feel’ the necessity of pre-investing in potential future savings on maintenance costs.
Furthermore, with the current state of technology, a Data Vault Data Warehouse cannot be built more cost-efficiently than a regular Kimball environment. The business case for a Data Vault Data Warehouse therefore applies specifically when heavy maintenance is expected (e.g. source system changes) or when audit trail management is required (e.g. regulations).
We have to wait for fully automated tools that generate a Data Vault Data Warehouse from a source system structure and that also generate a Star/Snowflake (cube) model from a Data Vault model. There are already some third-party tools (e.g. Rapidace) that cover part of the solution, but this is most certainly not the way to go for a pure Microsoft BI environment. Tooling has to embrace the UDM concept and has to anticipate the Master Data Management development. Current tools do neither.
Rumor has it that SSIS components that help to create a Data Vault Data Warehouse will become available next year. That would be a first step.
Master Data Management
Master Data Management (MDM) is meant to be integrated into all applications (line of business and business intelligence). At this moment (2008) it is basically just a concept, with tooling for maintaining the master data repository. Even when MDM tooling matures over the next few years, it will take organizations years to fully integrate MDM into an ICT (and business!) architecture.
The value of MDM for BI will grow substantially with the number of line-of-business applications that fully use MDM. For BI, ETL and modeling will become less complex, because business and transformation logic will be handled through MDM. MDM could decrease the business value of Data Vault modeling, but it will also change BI modeling techniques in general. Dimensions, for example, will no longer be part of a Data Warehouse, but will belong to the MDM environment (Dimension Management). The next generation of Data Warehouses could therefore consist of fact tables only.
The greatest risk to MDM becoming successful is its major impact on an ICT and business architecture. In my opinion it will have more impact than implementing, for example, a true SOA architecture or a large ERP system, and it will take a longer period of time. Hopefully not only tools will be developed, but also some good methods on ‘how to eat this elephant’.
The first MDM BI steps are to be expected by 2010/2011 (Microsoft plans to release its MOSS MDM tool somewhere in 2010/2011).
The architecture of a Data Warehouse
Master Data Management, with or without Data Vault modeling, will have an impact on the structure of a Data Warehouse. Currently known ‘default’ constructions like the ODS and Staging Areas, but also the full relational Data Warehouse, will have to be redefined; they may no longer be necessary.
For example, if a Data Vault Data Warehouse is in place, the only thing needed is a Data Source View algorithm to transform the Data Vault structure into the Star/Snowflake schema that a cube requires. On top of this, an MDM application could basically act as a kind of engine for Dimension Management, but also for the necessary Data Vault transformation.
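To make the transformation idea more concrete, here is a minimal sketch in Python. It is not a Data Source View algorithm and does not use any SSAS API; it only illustrates the flattening step, assuming the usual Data Vault split into hubs (business keys) and satellites (descriptive attributes over time). All names (`HubRow`, `SatelliteRow`, `build_dimension`, the customer data) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class HubRow:
    """One business key in a hub (hypothetical customer hub)."""
    hub_key: int
    business_key: str


@dataclass
class SatelliteRow:
    """Descriptive attributes for a hub key, as loaded on load_date."""
    hub_key: int
    load_date: date
    attributes: dict


def build_dimension(hubs, satellites):
    """Flatten hub and satellite rows into one current dimension row per hub key."""
    latest = {}
    for sat in satellites:
        current = latest.get(sat.hub_key)
        # Keep only the most recently loaded satellite row per hub key.
        if current is None or sat.load_date > current.load_date:
            latest[sat.hub_key] = sat

    dimension = []
    for hub in hubs:
        row = {"dim_key": hub.hub_key, "business_key": hub.business_key}
        sat = latest.get(hub.hub_key)
        # A hub without satellite data still yields a (bare) dimension row.
        if sat is not None:
            row.update(sat.attributes)
        dimension.append(row)
    return dimension


# Assumed sample data, for illustration only.
hubs = [HubRow(1, "CUST-001"), HubRow(2, "CUST-002")]
satellites = [
    SatelliteRow(1, date(2008, 1, 1), {"city": "Amsterdam"}),
    SatelliteRow(1, date(2008, 6, 1), {"city": "Rotterdam"}),
    SatelliteRow(2, date(2008, 3, 1), {"city": "Utrecht"}),
]
customer_dim = build_dimension(hubs, satellites)
```

The sketch produces only the current state of each dimension member. A real transformation would also have to deal with links (relationships between hubs) and with history, which is exactly the part that would need automated tooling.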
If we look at some of the emerging technology, the conclusion is that technology will be ready to support this kind of thinking. Project Gemini and Kilimanjaro are living proof. The first will be the self-service analysis engine (2010/2011). This can be an accelerator for a next step.
Self Service BI will probably be the first step on the way to a simplified Enterprise Data Warehouse (EDW) environment in the years to come. Self Service BI should, however, be carefully managed; it should not lead to the next version of ‘Excel hell’: the ‘Self service hell’. In our opinion the dataset on which Self Service BI can be performed should meet certain quality conditions (a controlled data environment). If an organization embraces the thought of an EDW (its maturity reaches the ‘single version of the truth’ thought), then there should be one common data source for Self Service BI.
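As an illustration of what ‘quality conditions’ could mean in practice, here is a minimal Python sketch of a gate that a dataset has to pass before it is offered for Self Service BI. The two checks (required columns filled, unique key) are our own assumption of what such conditions might be, and the function and column names are hypothetical.

```python
def quality_issues(rows, key, required):
    """Return a list of readable issues; an empty list means the dataset passes."""
    issues = []
    seen = set()
    for index, row in enumerate(rows):
        # Condition 1 (assumed): every required column must be filled.
        for column in required:
            if row.get(column) in (None, ""):
                issues.append(f"row {index}: missing {column}")
        # Condition 2 (assumed): the key column must be unique.
        value = row.get(key)
        if value is not None:
            if value in seen:
                issues.append(f"row {index}: duplicate {key} {value!r}")
            seen.add(value)
    return issues


# Assumed sample data, for illustration only.
sales = [
    {"order_id": 1, "customer": "CUST-001", "amount": 100.0},
    {"order_id": 2, "customer": "", "amount": 50.0},
    {"order_id": 2, "customer": "CUST-002", "amount": 75.0},
]
issues = quality_issues(sales, key="order_id", required=["order_id", "customer", "amount"])
```

Only a dataset for which the list of issues is empty would be published to the controlled data environment; anything else goes back for cleansing first.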
We think that controlled data environments will form within an EDW environment. This would be some kind of transactional (OLTP) data source: a new form that would combine or replace the ODS and Staging. Data Vault could be a very useful way to do this.
Self Service BI could be one of the applications on this data source. If Self Service leads to a structured form of analysis and reporting, the Self Service environment can then be migrated more or less automatically to the formalized EDW BI environment. In Microsoft language: the automatic generation of a DSV and a formalized SSAS cube, connected to PPS. The demos given in Seattle prove that this is no rocket science; let’s hope that ‘Donald Farmer & Friends’ will keep up their current development speed.
Project Gemini is a ‘small statement’ that SSAS technology is evolving quickly. This could also mean that the necessity of a full relational Data Warehouse or a multi-cube environment disappears. It looks like SSAS is heading for a ‘single EDW cube’ strategy for most organizations.
In our crystal ball the architecture of a future Data Warehouse will consist of only 2 components:
· A controlled data environment (driven by source system data), which will support self-service/sandbox approaches. ‘Controlled’ will mean some kind of cleansing (the same kind we see in Data Vault modeling).
We strongly support this thought. It will lower the initial costs of a Business Intelligence project.
· A formalized BI data environment (ultimately a single-cube SSAS environment)
The above is just one way the Microsoft BI future could look. Reality will most certainly look different. The general idea is that BI will change. A path like the one described could also easily be extended with the MDM environment Microsoft is now developing. We are excited about the years to come.