If you’ve been following Microsoft’s recent press releases, chances are you’ll have been exposed to the term “Common Data Service” (CDS). Read on as we shed light on the exact nature of CDS and what it can mean to your business.
Back in November 2016, Microsoft released their Common Data Service to general availability. In a nutshell, CDS is Microsoft’s attempt at providing a solution to counter the time and effort customers are spending to bring together disparate apps, services and solutions. At its most basic level, it provides a way to connect disparate systems around a focal point of your data. The intention is that Microsoft will provide the “heavy lifting” required to ensure the data flows back and forth as required.
To achieve this, Microsoft has defined a set of core business entities and then built them into what is known as the Common Data Model (CDM). For example, they have exposed entities for managing data around Accounts, Employees, Products, Opportunities and Sales Orders (for a full list see: https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/reference/about-entity-reference). Where there isn’t an existing entity to suit a business requirement, Microsoft has made the CDM extensible, which allows you to add to your organisation’s instance of the CDM to meet your needs. As your organisation adapts and changes your CDS instance, Microsoft will then monitor this and look for common patterns amongst the business community that it will use to modify and extend the standard CDS.
Microsoft is committed to making their applications CDS aware and is working with their partners to get third party applications to interact effectively with the CDS.
When establishing CDS integration from an organisational use perspective, it should ideally be a simple configuration of a connector from a source application to the CDS, aligning its data entities with the reciprocal entities within the CDS. This will ensure that as products are changed to meet business needs over time, the impact should be almost negligible to other systems. This negates the need for an organisation to spend an excessive amount of time ensuring the correct architecting of a solution in bringing together disparate apps and siloed information. This can now be handled through the CDS.
Since its release in 2016, CDS has evolved with Microsoft recently announcing the release of two new services; Common Data Service for Apps (CDS for Apps) and Common Data Service for Analytics (CDS for Analytics).
CDS for Apps was released in January 2018 with CDS for Analytics expected for release in second quarter 2018.As a snapshot of how the various “pieces” fit together, Figure 1 provides a logical view of how the services will interact.
Common Data Service for Apps
CDS for Apps was initially designed for businesses to engage with their data on a low-code/no-code basis through Microsoft’s PowerApps product. This allows a business to rapidly develop scalable, secure and feature-rich applications.
For organisations needing further enhancement, Microsoft offers developer extensions to engage with CDS for Apps.
Common Data Service for Analytics
CDS for Analytics was designed to function with Power BI as the visual reporting visual product. Similarly to the way CDS for Apps is extensible by developers, CDS for Analytics will also provide extensibility options.
Figure 2 below provides the current logic model for how CDS for Analytics will integrate.
Implementing the CDS for Apps and CDS for Analytics will enable you to be able to easily capture data and then accelerate your ability to provide insights into your business data.
To assist in this acceleration, Microsoft and expose data, as their partners, will be building industry specific apps that immediately surface deep insights to an organisation’s data. An initial example is currently being developed by Microsoft; Power BI for Sales Insights will address the maximisation of sales productivity by providing insights into which opportunities are at risk and where salespeople could be spending their time more efficiently.
The ease of development and portability of solutions aren’t possible, however, without having a standardised data model. By leveraging Microsoft’s new common data services and with the suite of Microsoft’s platform of products being CDS aware, utilisation of tools such as Azure Machine Learning and Azure Databricks for deeper analysis of your organisation’s data becomes transformational.
If you’d like to understand more about how to take advantage of the Common Data Service or for further discussion around how it can assist your business, please get in touch.
Exposé designed and developed a solution that saw an increasingly temperamental Networks Asset Analytical solution move to the Exposé developed Enterprise Analytics Platform.
The solution now:
• Allows staff to focus on business-critical tasks by utilising the data created by the system.
• Reduces support costs due to the improved system stability.
• Utilises the IT resources for other projects that improve business productivity.
An Advanced Analytics and Big Data solution allows for the acquisition, aggregation and blending of large volumes of data often derived from multiple disparate sources. Incorporating IoT, smart devices and predictive analytics into the solution.
Our ACH Group case study shows how a clever data platform architecture and design facilitates transformation into a data-centric organisation in response to comprehensive regulatory changes and to leverage opportunities presented by technology in order to create a better experience for customers and staff.
An Exposé case study around our advanced analytics solution for the ‘voice of business in South Australia’, Business SA. The solution was an important component of a large digital transformation program that saw Business SA transition to a modern, automated and simplified organization which was underpinned by the following technology changes;
• A cloud-first strategy which reduced Business SA’s dependence on resources that provided no market differentiation
• Simplified the technology landscape with a few core systems which performed specific functions
• Established a modular architecture which is more able to accommodate change
• Implemented a digital strategy to support an automated, self-service and 24/7 service delivery
• Improve data quality through simpler and more intuitive means of data entry and validation
• Utilise the latest desktop productivity tools providing instant mobility capabilities
I am increasingly asked by customers – Is the Data Warehouse dead?
In technology terms, 30 years is a long time. This is how old the Data Warehouse is – that makes the Data Warehouse an old timer. Can we consider it a mature yet productive worker, or is it a worker gearing up for a pension?
I come from the world of Data Warehouse architecture and in the mid to late naughties (2003 to 2010) whilst working for various high profile financial service institutions in the London market, Data Warehouses were considered all important and companies spent a lot of money on their design, development, and maintenance. The prevailing consensus was that you could not get meaningful, validated and trusted information to business users for decision support without a Data Warehouse (whether it followed an Inmon, or a Kimbal methodology – the pros and cons of which are not under the spotlight here). The only alternative for companies without the means to commit to the substantial investment typically associated with a Data Warehouse was to allow Report Writers to develop code against the source systems database (or a landed version thereof), but this, of course, leads to the proliferation of reports, and it caused a massive maintenance nightmare and it went against every notion of a single trusted source of the truth.
Jump ahead to 2011, and businesses started showing a reluctance to invest in Data Warehouses – a trend that accelerated from that point onward. My observations of the reasoning for this ranged from the cost involved, the lack of quick ROI, a low take-up rate, difficulty to align it to ongoing business change, and, more recently, a change in the variety, volume and velocity of data that businesses are interested in.
Does all of this mean the Data Warehouse is dead/ dying? Is it an old timer getting ready for pension, or does it still have years of productive contribution to a corporate data landscaper left?
My experience across the Business Intelligence and Data Analytics market, across multiple industries and technology taught me that:
A Data Warehouse is no longer a must-have for meaningful, validated and trusted information to the business users for decision support. As explained in the previous article the PaaS, SaaS and IaaS services that focus on Data Analytics (for example the Cortana Intelligence Suite in Azure (https://www.microsoft.com/en-au/cloud-platform/cortana-intelligence-suite), or the Amazon Analytics Products (https://aws.amazon.com/products/analytics/) allows for modular solutions that can be provisioned as required which collectively answers all the Data Analytics challenges and ensures data gets to users (no matter where it originates, its format, its velocity or its volume) fast, validated and in a business-friendly format.
But this does not mean that these modular Data Platforms that use a clever mix of PaaS, Saas, and IaaS services can easily provide some of the fundamental services provided by a Data Warehouse (or more accurately, components typically associated with a Data Warehouse), such as:
Where operational systems do not track history and the analytical requirements require such history to be tracked through (for example slowly changing dimensionality type 2).
Where business rules and transformations are so complex that it makes sense to define the rules and transformations by way of detailed analysis and for it to be hardcoded into the landscape through code and materialised data in structures that the business can understand and is often reused (for example dimensions and facts resulting from complex business rules and transformations).
Where complex hierarchies are required by the reporting and self-service layer.
To assist regulatory requirements such as proven data lineage, reconciliation, and retention by law (for example for Solvency II, Basel II and III and Sarbanes-Oxley).
Where these requirements exist, a Data Warehouse (or more accurately, components typically associated with a Data Warehouse) is required. But even in those cases, a Data Warehouse (or more accurately, components typically associated with a Data Warehouse) will merely form part of a much larger Data Analytics Data Landscape. It will perform the workloads described above, and there is a larger data story delivered by complimentary services.
In the past, Data Warehouses were key to delivering optimized analytical models that normally manifested themselves in materialized Data Mart Star Schemas (the end result of a series of layers such as ODS, staging, etc.) Such optimized analytical models are now instead handled by business-friendly metadata layers (e.g. Semantic Models) that source data from any appropriate source of information, bringing fragmented sources together in models that are quick to develop and easy for the business to consume. These sources include those objects typically associated with a Data Warehouse/ Mart (for example materialized SCD2 Dimensions, materialized facts resulting from complex business rules, entities created for regulatory purposes, etc.) and they are blended with data from a plethora of additional sources. The business user still experiences that clean and easy to consume Star Schema-like model. The business-friendly metadata layer becomes the Data Mart, but is easier to develop, provides a quicker ROI, is much more responsive to business change, etc.
The data warehouse is not dead but its primary role as we knew it is fading. It is becoming complementary to a larger Data Analytics Platforms we see evolving. Some of its components will continue to fulfil a central role, but it will be surrounded by all manner of services and collectively these will fulfil the organisation’s data needs.
In addition, we see the evolution of Data Warehouse as a Service (DWaaS). This is not a Data Warehouse in the typical sense of the word as spoken of in this article, but rather a service optimized for Analytical Workloads. Can it serve those requirements typically associated with a Data Warehouse such as SCD2, materialization due to complex rules, hierarchies or regulatory requirements? Absolutely. But its existence does not change the need for those modular targeted architectures and the need for a much larger Data Analytics Data Landscape using a variety of PaaS (including DWaaS), IaaS and SaaS. It merely makes the hosting of typical DW workloads much simpler, better performing and more cost-effective. Examples of DWaaS are Microsoft’s Azure SQL DW and Amazon’s Redshift.
For those who have been involved in conventional Business Intelligence projects, you will be all too familiar with the likely contiguous chain of events and the likely outcome. It typically goes something like this: The idea is incubated by someone (very often this would be within ICT), a business case for a project is written, someone holding the purse strings approves it, suitable business and functional requirements are gathered-, written- and approved, the solution is architected and developed-, tested- and signed off, the solution is implemented, and voila, the business MIGHT use the solution.
These conventional solutions are problematic for a number of reasons:
Requirements are lukewarm at best – If the idea was not incubated by someone experiencing a real business pain point, or wanting to exploit a real opportunity, then the “requirements” gathered are always going to be skewed towards “what do you think you need” rather than “what do you know you need right now”.
Suboptimal outcome – Lukewarm requirements will lead to solutions that do not necessarily add value to the business as they are not based on real pain points or specific opportunities that need to be exploited. So when the formal SDLC process conclude the business may, in the spirit of trying to contribute to successful project outcomes, try to use the solution as is, or send it back for rework so that the solution at least satisfies some of their needs.
Low take-up rate – accepting a suboptimal solution so as to be a good team player, over time, usage will drop off as it’s not really serving a real need.
Costly rework and an expensive project – these issues means that the business often gets to see a final product late in the project lifecycle and often only then start thinking about what it could do for them if changes are applied to meet their real needs. The solution is sent back for changes, and that is very costly.
So what has changed?
Technology has finally caught up with what business users have been doing for a long time.
For many years, given the costly and problematic outcomes of conventional BI, users have often preferred access to the data they need, rather than fancy reports or analytics. They would simply download the data into Excel and create their own really useful BI (see related blog here https://exposedata.wordpress.com/2016/07/02/power-bi-and-microsoft-azure-whats-all-the-fuss-about/ ). This and the advent of a plethora of services in the form of Platform-, Infrastructure- and Software as a Service (PaaS, IaaS and SaaS), and more recently BI specific services such as Data Warehouse as a Service (DWaaS), are proving to be highly disruptive in the Analytics market, and true game changers.
How have this changed things?
In the end, the basic outcome is still the same. Getting data from some kind of source to the business in a format that they can consume by converting it to useful information.
But technology and clever architectures now allow for fast response to red-hot requirements. This is game-changing as solutions are now nimble and responsive and can, therefore, respond to requirements often discussed informally when pain points and/ or opportunities are highlighted in the course of the normal business day such as in meetings, over drinks with colleagues and at the “watercooler”. The trick is to recognize these “requirements” and to relate them back to the opportunities that the new world of data and analytics provide. If that can be done, then the solutions typically respond much better to these organic requirements vs. solutions that respond to requirements incubated and elicited in a much more formal way.
What are water cooler discussions?
I use this term to describe informal discussions around the organization about pain points or opportunities in the business. These pain points or opportunities represent organic requirements that should be responded to fast if they hold real value. This DOES NOT mean that formal requirements are no longer valid, not at all, but it means that we need to recognize that real requirements manifest itself in informal ways too. Here is an example of how a requirement can originate in a formal and in an informal way:
City planning realize that there are issues with parking availability as they receive 100s of calls each month from irate commuters stuck driving around looking for parking. It seems as if commuters are abandoning the city in favor of suburban shopping centers where ample parking is provided. This is not good for businesses in the city center, and not good for the city council.
The head of planning realizes that information (in the form of data) will be key to any of his decisions to deal with the problem, so requests reports on traffic volumes, finances and works management planning. It seems as if this data is not in the data warehouse, so a business analyst is employed to elicit the requirements around the reports required, and so the long and costly process starts.
The Head of Planning tells a colleague that he wishes he could expedite getting his hands on the information he needs but he has a limited budget so he cannot employ more resources to move his reports along quicker.
This is overheard by the Data Analytics consultant who realizes that in order to maximize supporting such an important decision, the Head of Planning will have to look at the issue from multiple angles which will likely not be provided by such formal reports. The data he needs, I.e. traffic volumes, finances and works management planning must be blended with other contextual data such as weather, events, date and time of day.
The consultant knows that:
The city already holds traffic volumes, finance date and works management planning across fragmented source systems.
The city already collects millions of sensor data per day – parking, traffic flow, commuter flow.
There are heaps of contextual data out there which is easy to access – weather, events, city businesses financial results, employment figures, etc.
The city already has a cloud subscription where services such as IaaS, but especially PaaS and SaaS and DWaaS can quickly be added and configured so as to allow for the collection, blending, storage, processing of data at a fraction of the cost of achieving the same on-premise.
That the cloud subscription allows for data science and predictive analytical activities to complement the collection, blending, storage, processing of data.
He calls a meeting with the Head of Planning who is intrigued with the idea and the quick return on investment (ROI) at a fraction of the cost, and commissions the consultant to provide a proof of concept (POC) on the matter.
From water cooler discussion to the solution in record time
In my example (which is based one of our real-world examples using Microsoft Azure) the city has an existing investment in a cloud service. Also, note that I provide a high-level resource mapping of the POC and solution at the end of this blog for both Amazon Web Services (AWS)® and Microsoft Azure®.
The process starts with a POC. Either in a free trial subscription or in the customer’s existing cloud provider such as AWS or Azure.
The preference is to keep the solution server-less and only opt for IaaS where resources cannot be provided as PaaS or SaaS.
IoT parking data form the basis of the solution and both real-time flows plus history is required.
Weather data, traffic flows into and around the city, events, and time of day will help add important context when predicting times of parking through peaks and troughs.
Organisational works management planning data will further enhance better parking planning.
Whilst business financial results show the impact, and more importantly lost opportunity cost on businesses if people abandoned the city in favor of suburban shopping centers.
The processing and storage of the sheer volume of data are achieved at a fraction of the cost than previously envisaged by the business.
The resulting solution is not a replacement for any corporate data warehouse, but complementary to it. Any existing data repository can be viewed as additional and useful contextual data in this new data analytics landscape.
The POC architecture involves the following resources (both Azure and AWS are shown):
Please note that the diagram below by no means implies a detailed design, but is a true representation of the high-level architecture we used to achieve the specific solution.
Data flow patterns
A: IOT – Sensor originated real-time data;
o A1: Into storage;
o A2: Into predictive analytics where it is blended with B and C;
o A3: Directly into real-time visualizations;
B: Additional contextual data from publicly available sources such as weather, events, business financial performance, into storage;
o B1: Into predictive analytics where it is blended with A and C;
C: On-premise data such as works management, into storage;
o C1: Into predictive analytics where it is blended with A and B;
D: Predictive analytical results into real-time visualizations, and also to storage for historical reporting;
E: Massive parallel processing, scalability and on-demand compute where and when required and supporting visual reporting;
The result of such a real-world example POC was the realization by the business that very deep insights can be achieved by leveraging the appropriate data wherever it exists and by cleverly architecting solutions with components and services within easy reach, superior outcomes can be achieved fast.
The building blocks created in the POC was adopted and extended into a full production solution and it set the direction for future data analytical workloads.
SA - Lot Fourteen, Margaret Graham Building, Adelaide 5000