ETL as a Service – Lifting & Shifting your existing SSIS ETL to Azure

Why a trend towards PaaS over IaaS ETL and how this relates to SSIS

Businesses are moving more towards environments where infrastructure is invisible to the business; where Platforms as a Service (PaaS) are developed, deployed and managed in the cloud to reduce operational infrastructure overhead costs as well as increase the scalability and availability of services.

But why is PaaS more attractive compared to on-premise or Infrastructure as a Service (IaaS) solutions?

  • Savings – PaaS services often use a Pay as You Go (PAYG) billing model, meaning you only pay for what you use and when you use it (for example, there are no costs due to idle server times).
  • Scalability – PaaS provides huge scalability and availability on demand.
  • Less overhead – PaaS requires no patching or maintenance of application infrastructure – it’s all taken care of. Therefore, there is less staff overhead from a maintenance perspective which allows staff to take part in activities that deliver a definite and measurable business benefit. This may include new solution developments, improving existing solutions, or data analytics.

3 images

This did not apply to SSIS

Microsoft’s incumbent data integration technology that handles the extract-, transform and load (ETL) of large volumes of data, SQL Server Integration Services (SSIS), has played a key role in many organisations’ data platforms, but unfortunately has not been available as a Service (PaaS).

So for businesses with existing investments in SSIS, wishing to move their data analytics solutions into the cloud and leverage the advantages of PaaS over IaaS, would not have been able to do so with their existing SSIS workloads. They would have had to embark on a full ETL redevelopment using an alternative data integration technology such as Azure Data Factory, or they would have had to use Azure VMs running SSIS to host their SSIS packages.

The good news

The good news is that Microsoft (MS) recently announced Azure-SSIS Integration Runtime (SSIS-IR) public preview, as well as the public announcement Azure SQL Database Managed Instance (SQLDBMI). Together these two new PaaS Services allow businesses to leverage the advantages of PaaS for their ETL workloads, without a major redevelopment.

In this article

  • We discuss what this means for businesses.
  • We discuss some capabilities still missing in public preview but in the pipeline of work for General Availability of the product.
  • We discuss how the recent announcement of SQLDBMI goes hand in hand with shifting your SSIS solutions into PaaS.
  • We touch on some of the other benefits of SQLDBMI.
  • We walk the reader through creating an SSIS IR instance as well as deploying, running and monitoring solution execution.

What are the business benefits

Savings, Scalability and Less overhead

SSIS workloads can now leverage the attractive cost advantages through a PaaS PAYG model – this includes instantaneous scale up and down to meet varying workloads, paying only for what you use, and no hidden costs through servers being idle.

By moving your SSIS workloads to PaaS, you minimise overhead costs, time and skillsets required to manage its supporting infrastructure. Infrastructure often involves costly hardware- (including depreciation, physical space and upgrades), plus backup-, network-, redundancy- and failover costs. Most of these costs simply do not apply to PaaS solutions.

Your staff can also focus on where there is a maximum business benefit – i.e. the solutions themselves rather than infrastructure or the setup, configuration, maintenance and licensing.

In addition, use what you previously developed

SSIS-IR allows businesses to easily lift and shift existing ETL workloads into the cloud so there is no need for redevelopment in ADF.  Developers familiar with the extensive functionality and components available in SSIS can easily continue to develop using proven technology such as Visual Studio and then deploy to their SSIS-IR instance in the same fashion they would for on-premise solutions.

What does this mean in $ terms

piggybank

A cost analysis between the 3 options (running SSIS in an on-premise server environment, running it on a cloud-based VM or as a PaaS service) based on ETL processes that is required to run twice a day and takes around 1 hour per run (~60 hours a month) shows a clear cost advantage when using PaaS.

Please note that the table below is NOT an official quote and any cost estimates must be obtained from your licenced software provider. Costs taken from the following sources:

CostMatrix

scalability

This option also leaves the business with the flexibility to easily scale up or down they compute power as their needs change. Doubling the computing power would take the annual cost from $550 to $870.

people

Regarding FTE overhead and maintenance – even if a conservative 30% of DBA time is saved by adopting PaaS, this will have a major cost saving and productivity increase to the Business.

What is currently missing? And what is coming?

SSIS-IR is still in public preview.  Therefore, there are some features not yet available as at the time of authoring this article – January 2018. These features are all expected to be available upon General Availability of the product:

  • 3rd party assemblies and API’s are not supported in SSIS-IR in the preview, only the stock SSIS components are supported.
  • Only East US, East US2, West Europe & North Europe host SSIS-IR in the preview.
  • There is no portal GUI available for provisioning your SSIS-IR instance; PowerShell scripts must be used to provision, stop/start and scale up/down.
  • You cannot schedule the execution of the SSIS packages from within SSIS-IR in preview and packages must be run
    • This is where another upcoming release by Microsoft, Azure SQL Database Managed Instance (SQLDBMI) comes to play.

SQLDBMI

The PaaS database service, Azure SQL Database Managed Instance (SQLDBMI) was announced at the recent MS Ignite conference and is set to add some key features into a more powerful version of Azure SQL Database.  This will be similar to what you would normally expect to see in the on-premise version of SQL Server and makes a very compelling case for migrating existing databases to the cloud. One of the key features in SQLDBMI is the SQL Agent, this will enable the scheduling, automation and execution of SSIS-IR packages.

Also in SQLDBMI

Additional to SQL Agent in SQLDBMI there are many other great features which are on their way.  The public preview of SQLDBMI is set to be available around Q1 2018. So, stay tuned for the following:

  • DB Mail
    • An enterprise solution for sending email messages through the SQL Server database engine
  • R / Machine Learning Services
    • Develop and deploy data science solutions that uncover new insights as an in-database service
  • Service Broker
    • Native support for messaging and queuing applications.
  • CDC
    • Change Data Capture records data manipulation operations that are applied to SQL server tables in an easily consumed relational format
  • Replication
    • Enables the copying and distributing of data and database objects from one database to another and the synchronising between them to maintain consistency.
  • Resource Governor
    • Used to manage the SQL Server workload and system resource consumption by specifying limits that incoming application requests can use

Conclusion

Together SSIS-IR and SQLDBMI will make it much easier for businesses to lift and shift existing Business Intelligence architectures (notably data storage in a fully functional Database as a Service environment, and batch based Extract, Transform and Load solutions) straight into the cloud without the need for additional costs of redevelopment. The benefits of reduced operational overhead and running costs, and increased scalability and availability are hard to overlook when comparing on-premise or IaaS based solutions with PaaS based architectures.

For those interested in more of the technical aspects of SSIS-IR, the remainder of this article will walk through a set-up of SSIS-IR. Knowledge of SQL Databases and SQL Server Integration Services are required for the remainder of the article.

Walkthrough of setting up Azure-SSIS Integration Runtime

An existing Azure SQL Server is required – this is where SSISDB will reside.  We preloaded the server with an Azure SQLDB – Adventure works (not required) which were used for some SSIS tests.

SQLDB Provisioning Screen
Resource group with Azure SQL Server and SQLDB

Missing functionality in the preview – Next, because there is no GUI in private or public preview, SSIS instance must be provisioned using PowerShell scripts.  The parameters, variables and example scripts can be found here: https://docs.microsoft.com/en-us/azure/data-factory/tutorial-deploy-ssis-packages-azure

After provisioning, the following objects should appear in your resource group.

SQLDB Provisioning Screen 2
Resource group after provisioning showing ADFv2 and SSISDB

After connecting to your Azure SQL Server through SSMS, you can see that provisioning has created a SSISDB database. To see the Integration Services Catalog, you must connect to your Azure SQL Server and SSISDB database directly through the advanced options.

Integration Services Catalog
Integration Services Catalog menu

You can now go ahead and create folders and deploy packages to the Azure-SSIS Integration Runtime environment just as you would with a regular SSIS server.  We created a few simple test cases in visual studio and deployed the project to the newly created Azure-SSIS.

SSIS Packages
SSIS Project packages

These packages can be configured in the same way as standard integration services, including parameters, environment variables and connection managers.

SSIS Package Configuration
SSIS Package configuration

Running these packages and checking the integration services catalogues reports allow for all the logging associated with standard SSIS.

SSIS Package Execution
SSIS Package execution
SSIS Package Run Log
SSIS Package run log

Internet of Mice

Advanced analytics

The Internet of Mice – Our IoT and Advanced Analytics Solution

Understanding how animals involved in research move and eliminating as much human handling as possible makes for a much more humane environment for the animals. The outcome is more accurate results for the researchers. See how our IoT and Advanced Analytics solution developed for our customer strives towards a humane research environment and delivers more intelligent insights to researchers.

See more about IoT

Our YouTube channel

Our youtube Channel

We have a growing list of videos on our YouTube channel where you can find some selected case studies, test drives and solutions. Get an inside look at the world of Smart Analytics.

Topics include:  Advanced Analytics, Cognitive Intelligence, Artificial Intelligence, Augmented- and Virtual Reality, IOT and Business Intelligence

Feel free to subscribe as we are constantly adding new videos.

Our YouTube channel

 

Transforming the business into a data centric organisation through an Advanced Analytics and Big Data solution – our ACH Group case study

Big Data

An Advanced Analytics and Big Data solution allows for the acquisition, aggregation and blending of large volumes of data often derived from multiple disparate sources. Incorporating IoT, smart devices and predictive analytics into the solution.
Our ACH Group case study shows how a clever data platform architecture and design facilitates transformation into a data-centric organisation in response to comprehensive regulatory changes and to leverage opportunities presented by technology in order to create a better experience for customers and staff.

See the case study here: Exposé case study – ACH Group

See more about advanced analytics

Important Power BI news release

Power BI News

Starting June 1, 2017, Microsoft is making some changes to the way Power BI is licensed, there are also some important changes to the Power BI Service. So if you use Power BI, or intend to use Power BI, please be aware of these changes.

Exposé has been at the forefront of the Power BI revolution and we view these changes as even more positive steps towards a cost-effective, scalable and maturing BI and Analytics platform. We found that organizations really benefit from some guidance on the administration side of Power BI. If you’d like further advice on these changes or assistance with this transition and how it affects you and your organization, please don’t hesitate to get in touch.

Here are the changes:

Power BI free tier

Microsoft is now giving all free tier users the following capabilities:

  • ability to connect to all of the data sources that Pro users can connect to
  • a storage quota increase from 1GB to 10GB
  • data refresh maximum from once daily to once hourly
  • streaming data rates from ten thousand rows per hour to one million rows per hour

But in doing this they will be removing the following capabilities:

  • sharing reports and dashboards with other users
  • using group workspaces (now to be called app workspaces)
  • export to PowerPoint, CSV, Excel
  • analyze in Excel

This makes the licensing of the free tier truly for personal use only as all private sharing capabilities are now no longer available within the Power BI free license.

To help ease the transition to the new licensing model, Microsoft is allowing people who have had a license with the Power BI service on or before May 2, 2017 and have at least signed in once between May 2, 2016 and May 2, 2017 to apply for an extended trial of a Power BI Pro license. This license will enable the use of all Power BI Pro features until May 31, 2018. If you meet these requirements you will be sent an email from MS and will also have a notification appear when you log in to the service.

If you require an organizational use of Power BI, you will now either need to license all users for Power BI Pro or their new tier, Power BI Premium.

Power BI Premium

Power BI Premium is a new capacity-based licensing model coming late in the second quarter of 2017. It allows organizations to acquire only Power BI Pro licenses for report creators and the rest of the organization to consume these reports and dashboards without having to purchase a Pro license.

The charging model for this is based on a Premium node within the Azure environment that can be scaled according to an organizations performance requirements. Microsoft has provided a calculator service here to help estimate costs.

Power BI Report Server

Coming late in the second quarter of 2017, Microsoft will be offering the capability to publish Power BI reports on-premise using Power BI Report Server.

The on-premises server will allow the deployment and distribution of interactive Power BI reports and traditional paginated reports within the boundaries of an organization’s firewall.

To enable the use of Power BI Report Server, you will need to either be licensed under Power BI Premium or have a per-core license of SQL Server Enterprise Edition with Software Assurance.

Power BI Apps

Power BI content packs are changing to become known as Power BI apps.

At the moment, there won’t be a large difference between apps and content packs, mostly a change in interface and publishing process. But Microsoft has a roadmap for improvement under the new app model.

They are planning the following enhancements to app workspaces in the coming months:

  • Creating app workspaces will not create corresponding entities in O365 like group workspaces do. So you can create any number of app workspaces without worrying about different O365 groups being created behind the scene (you can still use an O365 group’s OneDrive for business to store your files).
  • Today you can add only individuals to the members and admin lists. In the next iteration, you will be able to add multiple AD security groups or modern groups to these lists to allow for easier management.

The impact, for now, is that Microsoft will rename all group workspaces to app workspaces and you can publish an app from any of these workspaces.

Power BI Embedded

Microsoft has also announced the convergence of the Power BI Embedded service with the Power BI service. This means that there will be one Power BI API that will have feature parity with the current Power Embedded service and so any existing apps built using Embedded today should continue to function but you will be required to prepare for migration over to the new service.

Power BI Service Interface

Finally, and for those who may not have been aware, Microsoft has been trialling a new interface for the Power BI service over the past few months. As of May, this interface will become the default. I’d recommend taking some time to understand what the new interface is like as there are some large changes to what you may be used to in relation to your workflow.

From operational challenges to a modern, automated and simplified organisation – Our Business SA case study

An Exposé case study around our advanced analytics solution for the ‘voice of business in South Australia’,  Business SA. The solution was an important component of a large digital transformation program that saw Business SA transition to a modern, automated and simplified organization which was underpinned by the following technology changes;

• A cloud-first strategy which reduced Business SA’s dependence on resources that provided no market differentiation
• Simplified the technology landscape with a few core systems which performed specific functions
• Established a modular architecture which is more able to accommodate change
• Implemented a digital strategy to support an automated, self-service and 24/7 service delivery
• Improve data quality through simpler and more intuitive means of data entry and validation
• Utilise the latest desktop productivity tools providing instant mobility capabilities

See the full case study here: Data and Analyticsexposé case study – Business-SA