David Menninger's Analyst Perspectives

Datawatch Gets Social with Data Preparation

Written by David Menninger | Aug 24, 2017 10:07:01 AM

Many organizations continue to struggle with preparing data for use in operational and analytical processes. We see these issues reported in our Data and Analytics in the Cloud benchmark research, where 55 percent of organizations identify data preparation as the most time-consuming task in their analytical processes. Similarly, in our Next-Generation Predictive Analytics research, 62 percent of companies report that they’re unsatisfied because data needed for access or integration is not readily available. In our Big Data Integration research, 52 percent report that, in working with big data integration processes, they spend the most time reviewing data for quality and consistency. And nearly half of companies (48%) report this same issue in our Internet of Things research. We are currently conducting further research into this critical issue with our Data Preparation benchmark research.

Datawatch Monarch Swarm, the latest addition to the Datawatch Monarch product line, takes a new approach to solving some of these challenges. My colleague has written about prior versions of Monarch, particularly Datawatch’s ability to identify data format and structure from PDF documents, HTML web pages and other loosely structured data sources. Monarch also supports access to big data sources such as Hadoop and MongoDB as well as a variety of relational database formats.

Swarm provides two significant enhancements to Monarch. First, Monarch functionality is available via a web browser and through cloud computing for as many people as needed to participate in the data preparation process. Second, the new release adds data socialization features to support collaboration throughout the data preparation process. Users can like and follow other users and can discuss and comment on the work they have done. The likes and follows are used as a ranking mechanism that, along with the discussions and comments, makes it easier for users to identify which data may be most useful to them, thereby shortening the time to prepare data. Swarm also incorporates machine learning techniques, which I have written about, to assess data quality, suggest data sources and recommend data preparation steps.
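
To make the ranking idea concrete, here is a minimal sketch of how likes, follows and comments could be combined into a single score for ordering data sets. It is not Datawatch's implementation, which is not described here. The DataSet fields, the social_score function and the weights are all hypothetical, chosen only to illustrate the general technique of a weighted social ranking.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    # All fields are hypothetical; they stand in for the social signals
    # described above, not for any actual Monarch Swarm data model.
    name: str
    likes: int               # likes on the data set itself
    author_followers: int    # followers of the user who prepared it
    comments: int            # discussion activity around it


def social_score(ds, w_likes=1.0, w_followers=0.5, w_comments=0.25):
    """Weighted sum of social signals; the weights are illustrative assumptions."""
    return (w_likes * ds.likes
            + w_followers * ds.author_followers
            + w_comments * ds.comments)


def rank(datasets):
    """Return data sets ordered from highest to lowest social score."""
    return sorted(datasets, key=social_score, reverse=True)


catalog = [
    DataSet("sales_raw", likes=3, author_followers=10, comments=2),
    DataSet("sales_curated", likes=12, author_followers=4, comments=8),
    DataSet("inventory", likes=1, author_followers=1, comments=0),
]

ranked_names = [ds.name for ds in rank(catalog)]
print(ranked_names)
```

In this sketch a heavily liked and discussed curated data set rises above a raw one from a widely followed author. A real product would presumably tune or learn such weights rather than fix them by hand.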

Data sources and workspaces are cataloged in a searchable information marketplace so users can find the right information more quickly. The marketplace also includes security and governance, enabling organizations to certify as authentic both raw and curated data sets.

The final step in preparing data is to deliver it in a format that can be consumed by the analytic tools that will be used by the line-of-business users. Monarch provides formats that support a variety of analytic tools, including IBM Watson and Cognos Analytics, Microsoft Excel, Microsoft Power BI, Qlik and Tableau.

Datawatch historically differentiated Monarch, a 2016 Ventana Research Technology Innovation Award winner, based on its ability to extract information from loosely structured data sources. The software vendor is still a leader in this area, but it has evolved the Monarch product line over time to support the full data preparation process. Now, with Swarm, Datawatch has differentiated its products by combining data preparation with a robust set of collaboration capabilities.

Think about your data preparation processes and the people involved in them. Consider whether your organization can shorten the time to value for the data that it collects and processes. Our research suggests that combining data preparation with socialization capabilities is important in helping people collaborate around data. If you agree, consider Datawatch Monarch Swarm as a way to introduce collaboration and socialization to the data preparation process and to make that process more readily available through cloud computing.

Regards,

David Menninger

SVP & Research Director
