David Menninger's Analyst Perspectives

Microsoft Azure: Cloud Computing for Data and Analytics

Written by David Menninger | Mar 17, 2021 10:00:00 AM

Organizations are increasingly using data as a strategic asset, which makes data services critical. Huge volumes of data need to be stored, managed, discovered and analyzed. Cloud computing and storage give enterprises a range of capabilities for storing and processing their data in third-party data centers. Data platforms, which I have previously discussed, are essential for organizations to manage their data assets effectively.

Many organizations are shifting their data and analytics workloads toward a consumption-based, hyper-scale public cloud model. The cloud offers organizations the flexibility to offload administration and management of some of their systems. Migrating data to the cloud lets organizations scale resources up and down on demand. Another major benefit of using cloud service providers is the reduction of upfront costs.

Ventana Research asserts that through 2022, more than one-half of organizations will migrate on-premises workloads to cloud data platforms, allowing them to focus more on business needs rather than maintaining systems.

Microsoft Azure is a scalable cloud platform with computing, networking, storage, database and management, along with advanced services such as analytics and machine learning (ML). Azure includes a flexible platform that helps developers build, deploy and manage enterprise, mobile, web and internet of things (IoT) applications, for any platform or device, without having to worry about the underlying infrastructure. As a cloud-based platform, Azure enables customers to devote more resources to development and use of applications that benefit their organizations, rather than managing on-premises hardware and software.

Featuring unique hybrid capabilities, Azure facilitates easy mobility and a reliable, consistent platform between on-premises and public cloud. Azure provides a broad range of hybrid connections including virtual private networks (VPNs), caches, content delivery networks (CDNs) and ExpressRoute connections to improve usability and performance.

Azure is a cloud platform that offers blockchain as a service (BaaS), ML, bots, and cognitive API capabilities. Azure Data Lake is built on Azure Blob Storage, which is the Microsoft object storage system for the cloud. The system features low-cost, tiered storage and high availability plus disaster-recovery capabilities. It integrates with other Azure services, including Azure Data Factory, a tool for creating and running extract, transform and load (ETL) and extract, load and transform (ELT) processes.
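The difference between ETL and ELT is only the order of the steps, which a minimal Python sketch can show. This is not Azure Data Factory code: the sample records and the functions `clean`, `run_etl` and `run_elt` are hypothetical, and plain lists stand in for the source and target stores.

```python
# Illustrative only: plain Python lists stand in for source and target stores.
# None of these names come from Azure Data Factory.

def clean(record):
    """Transform step: normalize the name and convert the amount to a number."""
    return {"name": record["name"].strip().title(), "amount": float(record["amount"])}

def run_etl(source):
    """ETL: transform in the pipeline, then load only the cleaned records."""
    target = []
    for record in source:
        target.append(clean(record))
    return target

def run_elt(source):
    """ELT: load raw records first, then transform inside the target store."""
    raw_zone = list(source)                   # load as-is
    curated = [clean(r) for r in raw_zone]    # transform after loading
    return raw_zone, curated

source = [
    {"name": " contoso ", "amount": "120.50"},
    {"name": "FABRIKAM", "amount": "80"},
]
etl_result = run_etl(source)
raw, elt_result = run_elt(source)
```

Both paths end with the same curated records. The practical difference is that ELT also keeps the raw copy in the target store, which suits a data lake where the transformation may be revisited later.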

Azure Data Lake is based on the Apache Hadoop YARN cluster-management platform. It can scale dynamically across SQL servers within the data lake, as well as servers in Azure SQL Database and Azure SQL Data Warehouse. Because the platform supports all types of data, it can be used to integrate enterprise data from multiple sources into a single data warehouse.

Azure Synapse Analytics was introduced last year to help organizations use their data and deploy all types of analysis including artificial intelligence (AI) and ML. Azure Synapse Analytics brings together data integration, enterprise data warehousing and big data analytics. It enables an organization to query both relational and nonrelational data at petabyte scale using languages such as SQL, Python, .NET, Java, Scala and R. This makes it highly suitable for different analysis workloads, data engineering and developer profiles.

The integration between Azure Synapse and Power BI makes it possible to analyze very large volumes of data. For example, an organization can implement composite models and aggregations to benefit from both the scale of Azure Synapse and the performance of VertiPaq, the Power BI in-memory columnar database engine. Microsoft claims these features can be used to enable interactive analysis over petabyte-scale datasets.

Ventana Research asserts that through 2023, the primary concern of three-quarters of chief data officers will be governing the privacy and security of their organization’s data.

Azure Purview is a new data-governance system that is integrated with the Microsoft Information Protection service. It enables enterprises to map, catalog, understand and manage operational and analytical data. Purview can automate data discovery and data cataloging while minimizing compliance risk. It can be used with Azure Synapse Analytics.

Azure Purview includes three main components:

  • Data discovery: Azure Purview can automatically find all of an organization’s data, whether on-premises or in the cloud and even when managed by other providers, and evaluate the characteristics and sensitivity of that data.
  • Data catalog: Azure Purview enables all users to search for data using a simple web-based experience. Visual graphs let users quickly see if data of interest is from a trusted source.
  • Data governance: Azure Purview provides an end-to-end view of a company’s data landscape so that data officers can govern data use efficiently. It surfaces governance insights such as the distribution of data across environments, how data is moving and where sensitive data is stored.
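To make the discovery and catalog components concrete, here is a minimal Python sketch of pattern-based classification feeding a simple catalog. It is an illustration under stated assumptions, not Purview's API or its actual classifiers, which this article does not describe. The patterns, the sample tables and the functions `classify_column` and `build_catalog` are all hypothetical.

```python
import re

# Hypothetical patterns; Purview's real classifiers are not described here.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_phone": re.compile(r"^\d{3}-\d{3}-\d{4}$"),
}

def classify_column(values):
    """Return the label whose pattern matches every non-empty value, else None."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for label, pattern in PATTERNS.items():
        if all(pattern.match(v) for v in non_empty):
            return label
    return None

def build_catalog(tables):
    """Map each table.column name to a classification and a sensitivity flag."""
    catalog = {}
    for table, columns in tables.items():
        for column, values in columns.items():
            label = classify_column(values)
            catalog[f"{table}.{column}"] = {
                "classification": label,
                "sensitive": label is not None,
            }
    return catalog

# Made-up sample data standing in for scanned sources.
tables = {
    "customers": {
        "contact": ["ana@example.com", "li@example.org"],
        "phone": ["555-010-1234", ""],
        "city": ["Lisbon", "Osaka"],
    }
}
catalog = build_catalog(tables)
```

A governance view such as "where sensitive data is stored" then becomes a filter over the catalog entries whose `sensitive` flag is set.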

When integrated with Microsoft’s BI tools, Purview can provide enhanced governance and cataloging capabilities. This integration will help with better discovery of hybrid data, facilitating a more complete understanding of data and analytics regardless of where they reside.

As Microsoft has built the Azure Synapse platform, it has migrated and integrated the on-premises SQL Server offerings. Azure Synapse integrates many pieces of software required in data and analytics processes, saving organizations the time and effort of combining offerings from different vendors. Organizations already working with Microsoft products, especially those using Microsoft’s Azure cloud services, will find Synapse fits easily into their environments. Those who are using other cloud services will want to compare their provider’s offerings and integrations with Microsoft’s. The Azure Synapse platform provides a comprehensive and competitive offering in the marketplace that builds on Microsoft’s widespread presence. It positions Microsoft to capture more of the emerging cloud-based data warehouse and cloud-based data lake markets.

Organizations can apply Azure Synapse to any of their data and analytics processes, and the inclusion of Purview will help them provide proper data governance, which is often an afterthought. Our Dynamic Insights research on data lakes shows that Marketing, Customer Service, Sales, and Finance are the most common application areas for data platforms. As a broad data platform, Synapse also includes Azure Stream Analytics for real-time processing and analysis of fast-moving streams of data. These are often new applications where organizations are evaluating and selecting technologies.

Clearly, organizations that are “Microsoft shops,” especially SQL Server and/or Power BI shops, should evaluate Synapse and Purview. However, organizations must also evaluate how well the offerings fit with their existing investments and skills in other technologies. While Synapse offers significantly expanded capabilities, it should be compared with other data-platform vendors. When evaluating Synapse, consider whether its hybrid and multi-cloud options meet your organization's needs. And, although Synapse represents an improvement in integration, there are still many parts to the solution. Further integration and ease of use would help organizations implement, and be more productive with, the platform.

Data and analytics are moving to the cloud, and vendors’ platforms are becoming more sophisticated and integrated. Azure Synapse represents a significant advance in Microsoft’s offering. With Purview, it brings to the fore a focus on governance that all organizations should have. We recommend organizations consider Synapse when evaluating their data platform requirements.

Regards,

David Menninger