move more parts of their IT environment to the cloud. Data warehousing, business intelligence and analytics capabilities, environments that corporate IT organizations have previously resisted moving to the cloud, are no exception. Cost-effectiveness is certainly a big draw for organizations: Amazon Web Services, Google Cloud Platform, IBM’s Bluemix and Microsoft Azure all offer introductory trial accounts that provide either a certain level of free services or account credit to motivate new users to try out the different IT services. But aside from the potential cost benefits of transitioning IT functions such as data warehousing, business intelligence and analytics environments to the cloud, there are four factors that make cloud an attractive option for CIOs:
Array of cloud resources
From the perspective of renovating and modernizing a data warehousing, analytics and reporting environment, the term cloud services actually covers a wide array of resources and capabilities. These include, but certainly are not limited to:
Data storage: There are several different, readily available, massively scalable storage options such as file storage (similar to traditional hierarchical file systems), block storage or object storage, in which data items are stored as objects with accompanying metadata describing them to simplify accessibility.
Computing platforms: Each of the cloud vendors allows the user to specify and spin up compute platforms in different CPU, memory and temporary storage configurations. Cost depends on the desired resources.
Database management systems: The options range from traditional relational database management systems to more sophisticated NoSQL databases, to column-oriented and in-memory databases optimized for performance.
Networking and load balancing: Intended to manage performance among the nodes of a configured compute environment.
Emerging services: This is the most interesting category, with options for machine learning, unstructured search, text analytics, speech and natural language APIs, data visualization, and other capabilities.
The decision to move to the cloud provides some flexibility to work with the cloud vendor to choose from a range of computing platform, storage, application, database and service options, along with management tools, developer tools, security management capabilities, system monitoring and other services. In many cases, the cloud provider will not only provide the platform and access to the services, but also work with your data consumers to understand their analytical needs. The provider will also help design, implement and manage your business intelligence platform, all “as a service.”
For the cost-conscious organization, it is important to develop a better understanding of cloud computing economics before completely committing to a cloud strategy. Hidden costs can come as a surprise and influence decisions about storage types, compute platforms, data access patterns and services, all of which depend heavily on overall spending expectations. The cost models can be somewhat confusing, especially when the units of measurement don’t necessarily match up, even across services from the same provider, let alone across providers when you are trying to get an apples-to-apples comparison.
Consider these examples:
Compute resources are generally configured, and therefore priced, based on the number of virtual CPUs or machine cores, the amount of memory and, in some cases, temporary storage associated with the virtual machine.
Storage can be priced based on the amount of storage requested, the number of objects stored, the number of requests and the bandwidth of data transferred.
Databases may be charged based on an hourly fee for use of the database on a specified configured virtual server. More sophisticated high-performance database and data warehousing services will incur higher charges.
Other services are charged as they are used. For example, consider a vendor that allows users to execute SQL queries against the persistent object storage and pay by the query. However, the cost of this service includes a per-query charge based on the amount of data accessed, as well as the cost of storing the query results back to the object store. With uncompressed data, executing many queries like this can lead to higher charges. But if the data is stored in a compressed format and aligned in a columnar data layout, each query will access less data and overall costs will be lower.
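To make the pay-per-query example concrete, here is a minimal sketch of the arithmetic. The prices, the function name and the parameters are all hypothetical placeholders chosen for illustration; they are not any vendor's actual rates or API, and a real bill would include other line items.

```python
# Hypothetical prices for illustration only; not any vendor's actual rates.
PRICE_PER_TB_ACCESSED = 5.00       # assumed charge per terabyte of data a query accesses
PRICE_PER_GB_STORED_MONTH = 0.02   # assumed monthly charge per gigabyte of stored results


def query_cost(table_gb, fraction_of_columns_read, compression_ratio, result_gb):
    """Estimate the cost of one query: data accessed plus storing the result set.

    fraction_of_columns_read: 1.0 for a row-oriented layout (every column is read),
        smaller for a columnar layout where only the needed columns are read.
    compression_ratio: uncompressed size divided by compressed size (1.0 = uncompressed).
    """
    accessed_gb = table_gb * fraction_of_columns_read / compression_ratio
    access_charge = accessed_gb / 1024 * PRICE_PER_TB_ACCESSED
    result_charge = result_gb * PRICE_PER_GB_STORED_MONTH
    return access_charge + result_charge


# A 1 TB table queried as uncompressed rows versus compressed columnar data
# (assuming the query needs a quarter of the columns and data compresses 4:1).
uncompressed = query_cost(1024, 1.0, 1.0, result_gb=1)
columnar = query_cost(1024, 0.25, 4.0, result_gb=1)
print(f"uncompressed rows: ${uncompressed:.4f} per query")
print(f"compressed columnar: ${columnar:.4f} per query")
```

Under these assumed numbers the per-query charge is driven almost entirely by the amount of data accessed, which is why layout and compression choices matter more as query volume grows.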
Renovating your analytics environment
Understanding the platforms and services that are available provides the starting point for developing a plan to renovate your reporting and analytics environment. Implementing a modernized environment will accomplish two goals. First, it will move your existing capabilities to a lower-cost, but higher-performing, platform. Second, and perhaps more important, it enables the adoption of innovative analytics capabilities, including ingesting multiple data streams in real time, algorithmic applications such as machine learning and artificial intelligence, and integrated real-time analytics.
The plethora of choices, however, complicates the ability to balance system availability, data usability, overall performance and ongoing costs. Optimizing your configuration for one of these variables may lead to lapses in one or more of the others. For example, a data layout that maps each transaction to its own data object may ease data accessibility, but as the number of transactions grows, so does the number of data objects. This, consequently, increases costs for object storage, because you are charged by the number of objects. On the other hand, collecting many records within a smaller number of objects will lower the per-object charges, but accessing a specific record then requires retrieving a greater amount of data, increasing the data transfer costs.
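The trade-off between object count and data transfer can be sketched with some simple arithmetic. The rates, the function name and the workload figures below are hypothetical and exist only to show the direction in which each charge moves; the sketch also assumes that reading one record means retrieving the whole object that contains it.

```python
import math

# Hypothetical rates for illustration only; not any vendor's actual prices.
PRICE_PER_1000_OBJECTS = 0.005    # assumed charge per 1,000 stored objects
PRICE_PER_GB_TRANSFERRED = 0.09   # assumed charge per gigabyte transferred


def layout_cost(num_records, record_kb, records_per_object, lookups):
    """Return (object_charge, transfer_charge) for a given packing of records.

    Assumes each single-record lookup retrieves the entire enclosing object.
    """
    num_objects = math.ceil(num_records / records_per_object)
    object_charge = num_objects / 1000 * PRICE_PER_1000_OBJECTS
    object_gb = records_per_object * record_kb / (1024 * 1024)
    transfer_charge = lookups * object_gb * PRICE_PER_GB_TRANSFERRED
    return object_charge, transfer_charge


# One million 1 KB records and 10,000 single-record lookups.
one_per_object = layout_cost(1_000_000, 1, records_per_object=1, lookups=10_000)
packed = layout_cost(1_000_000, 1, records_per_object=10_000, lookups=10_000)
print("one record per object  (objects, transfer):", one_per_object)
print("10,000 records per object (objects, transfer):", packed)
```

With these assumed figures, packing records cuts the object charge sharply while multiplying the transfer charge, so the cheaper layout depends on how often individual records are read.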
In other words, “cloud computing economics” becomes an important variable in numerous aspects of system design. When considering migrating your data warehouse, business intelligence, reporting and analytics functions to the cloud, it’s important to evaluate the data use patterns, determine the types of information models and assess the user community’s requirements. Work with the cloud vendor to figure out the best approach to data architecture, services and application design to optimize data availability, data utilization, system performance and expenditures.