07/10/2015

7 Considerations for Data Warehousing in the Cloud

We recently wrote a post about dashDB, IBM’s data warehouse in the cloud, and have since had a number of questions about when to use cloud as a deployment method and what to consider when making the decision. While there are many benefits to cloud data warehousing, cloud deployments may not be the best fit for every BI or analytics project. With this in mind, we highlight seven areas for consideration below. The list is based on a report by TDWI Research combined with our own customer experiences.

  1. Fit your data warehouse platform to the analytics purpose

At EBI, our framework always starts by establishing a specific use case, exploring the relevant data sources and then understanding the skill level of users, as these factors will affect the evaluation and selection of a platform. This is true for both on-premise and cloud deployments. Periodically published predefined reports may be suited to an established dimensional star-schema data mart, while predictive modelling applications would be better suited to a high-performance analytics appliance. In addition, some projects are particularly well suited to cloud deployments, for example short-lived projects, seasonal analyses and departmental self-service reporting. Increasingly, we don’t know whether data holds value until we are given the tools to analyse it. In these “analytical experiments”, the ability to start small with cloud deployments offers a huge opportunity, though it is important to understand the roadmap to enterprise-class solutions up front.

  2. Choose a cost model that suits

Cost is often cited as a barrier when it comes to data warehouse strategies, and cloud helps to overcome this with a utility cost model and no large capital investment. When selecting a solution, buyers should consider the Total Cost of Ownership, including “hidden” costs that range from the cost of acquisition to ongoing operations and maintenance. Different organisations can tolerate different levels of capex and opex, and need to balance these constraints against the impact on time to value.

  3. Cloud deployment can shorten time to value

By selecting a cloud-based solution, organisations are able to draw on vendors’ experience in assembling the tools to support the complete process and benefit from best-practice design. Combined with the offloading of infrastructure engineering tasks, this means that organisations can get up and running much more quickly. Particularly for the short-lived or experimental projects described above, this frees time for users to concentrate on information requirements and analysis.

  4. Understand the benefits of integrated analytics

As technology for BI and advanced analytics has matured, so too have the requirements of business data users. Although some cloud-based data warehouse vendors continue to focus on straightforward reporting, others, like IBM dashDB, have built-in analytic functions to enable statistical and predictive modelling. These include clustering, segmentation, classification, decision trees and association analysis, which are particularly useful for B2C organisations. The level of analytics can affect performance, so features such as in-memory and in-database processing become increasingly important.

  5. Be clear about performance requirements

Deploying in a multi-tenanted cloud environment can lead to organisations suffering from “noisy neighbours”: co-location with other tenants’ applications may affect your application’s performance. Performance criteria, acceptable performance levels and an understanding of how they are benchmarked should be made clear up front. If required, a bare-metal, single-tenant option should be explored.

  6. Proactively manage data integration and governance

In every organisation, new analytics projects should be underpinned by information governance and integration to ensure that trust and quality are maintained. As well as moving and loading data between on-premise and cloud architectures, the process includes understanding, cleansing, integrating and transforming that data.

  7. Satisfy security and data protection requirements

It is rare that a cloud conversation is had without security being raised as a concern. The reality is that vendors like IBM are so aware of clients’ concerns that cloud deployments are often more secure than organisations’ own data centres. That said, tools and enhanced security functions can be included where necessary, and bare-metal deployments should be considered as an option where highly sensitive data is involved.

At EBI, we follow a methodology which takes these considerations and others on board, and we can help you evaluate on-premise and cloud solutions against your required business outcome.
