Matillion Brings Cloud-Native ETL to Snowflake Data Platform on Google Cloud Platform

Matillion is bringing cloud-native ETL data transformation to the Snowflake data platform running on the Google Cloud Platform. IDN speaks with David Langton, vice president of product at Matillion.

Tags: data, cloud, ETL, GCP, Google Cloud, integration, Matillion, transformation, Snowflake

David Langton
vice president of product
Matillion


"We definitely see wholesale movement to the cloud, rather than hybrid approaches. So, it makes sense to put the ETL in the cloud."



 

Matillion ETL for Snowflake on Google Cloud Platform is purpose-built for Snowflake and supports enhanced access, sharing, and replication of data across multiple cloud environments, according to David Langton, vice president of product at Matillion.

 

Specifically, Matillion ETL for Snowflake on GCP lets customers use popular Snowflake features to reach value and insights faster, Langton noted. Matillion ETL’s benefits include:

  • Easy-to-use, code-optional, drag-and-drop transformation canvas to allow business users to transform data at scale quickly
  • More than 80 data sources to integrate with (on-prem databases, files, and SaaS apps)
  • User-configurable REST API connector for additional data sources without native integrations
  • Suspend, resume, and resize a virtual warehouse via deep integration with Snowflake features including Flatten Variant and Alter Warehouse
  • Data ingestion support into Snowflake from Google Cloud Storage
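As a rough illustration of the warehouse controls mentioned in the list above, the following Python sketch builds the kind of Snowflake SQL statements involved in suspending, resuming, and resizing a virtual warehouse. The helper names and the warehouse name `etl_wh` are hypothetical, and this is not a description of how Matillion ETL issues these commands internally.

```python
import re

# A subset of Snowflake warehouse sizes, kept short for illustration.
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def _check_name(name: str) -> str:
    """Accept only simple unquoted identifiers so the SQL stays safe to build."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_$]*", name):
        raise ValueError(f"unsupported warehouse name: {name!r}")
    return name


def suspend_warehouse(name: str) -> str:
    """Return the statement that suspends a warehouse."""
    return f"ALTER WAREHOUSE {_check_name(name)} SUSPEND"


def resume_warehouse(name: str) -> str:
    """Return the statement that resumes a suspended warehouse."""
    return f"ALTER WAREHOUSE {_check_name(name)} RESUME"


def resize_warehouse(name: str, size: str) -> str:
    """Return the statement that changes a warehouse's size."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {_check_name(name)} SET WAREHOUSE_SIZE = '{size}'"


if __name__ == "__main__":
    for statement in (
        resume_warehouse("etl_wh"),
        resize_warehouse("etl_wh", "large"),
        suspend_warehouse("etl_wh"),
    ):
        print(statement)
```

The statements are returned as strings rather than executed, so the sketch runs without a Snowflake account; in practice they would be passed to whatever database client is in use.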

Matillion ETL also provides a built-in scheduler, version control, a job documentation generator, data lineage, and Git integration.

In-Depth on Matillion's Purpose-Built Solutions for Cloud

Matillion’s ETL solution for Snowflake is the latest in a lengthy list of purpose-built solutions for cloud-based data.  The company’s portfolio supports what Langton called a “wholesale movement to the cloud” by large and mid-sized enterprises. 

 

“We definitely see wholesale movement to the cloud, rather than hybrid approaches. The very nature of ETL meant that, for a lot of organizations, their source data went beyond their ‘core network’ anyway,” Langton told IDN.

 

He noted that cloud technologies have matured over the last decade, and that the mindset is now shifting from on-prem to cloud-first -- even multi-cloud.

 

“Salesforce has been available for a long time now, and so there was always an element of bringing SaaS and on-premise data into one place for analytic use cases. As the balance has now tipped squarely in favor of a majority of organizations running SaaS and in-cloud applications to run their core businesses, it makes sense to also put the ETL, data warehouse, data lake, and analytics tools in the cloud,” Langton added.

 

Matillion shared this corporate view on the dynamics now underway in enterprise cloud adoption:

[A]s more companies adopt a multi-cloud strategy, they are retiring on-premises databases and replacing them with cloud alternatives. Using different cloud warehouses and platforms together increases flexibility and introduces cost-efficiencies into data workflows. Matillion ETL can be a critical component of a successful multi-cloud strategy.

That said, beyond cloud migration projects, Langton said he also sees a strong wave of greenfield, born-in-the-cloud projects. “We also see new projects, doing analytics use cases for the first time because doing it in the cloud makes previously infeasible projects feasible. That is really driven by cloud economics.”

 

The benefits from cloud economics are evident in the spike in adoption for cloud-native data projects, Langton noted. As a consequence, Matillion ETL is playing roles with just about every flavor and format of enterprise data, he added.

“We see different types of data, from relational databases, non-relational databases, flat file formats, SaaS applications, CRM Systems, social media platforms, and others being touched by cloud-native ETL,” Langton said.

 

This sets the stage for similar projects because "the majority of that data lives in the cloud already," Langton said. "So, moving and transforming cloud-native data into a cloud data warehouse gives great performance and scalability to help businesses move quickly.

 

"Essentially, a lot of that still comes down to rows and columns of strings, numbers, dates - although that’s increasingly delivered as nested arrays and structs rather than normalized, which tie directly into the Snowflake Variant type," he added.
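To make the point about nested arrays and structs concrete, here is a minimal Python sketch of what flattening such a record into rows and columns means. It is a plain-Python analogy to the idea behind Snowflake's FLATTEN function on Variant data, not Snowflake or Matillion code; the function name `flatten_array` and the sample order record are hypothetical.

```python
from typing import Any, Dict, Iterator


def flatten_array(record: Dict[str, Any], array_key: str) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per element of the nested array stored under array_key.

    Assumes each array element is itself a dict (a "struct").
    """
    # Columns shared by every output row: everything except the nested array.
    parent = {key: value for key, value in record.items() if key != array_key}
    for index, element in enumerate(record.get(array_key, [])):
        row = dict(parent)
        row["index"] = index   # position of the element within the array
        row.update(element)    # promote the struct's fields to columns
        yield row


if __name__ == "__main__":
    order = {
        "order_id": 17,
        "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-9", "qty": 1}],
    }
    for row in flatten_array(order, "items"):
        print(row)
```

One nested order record becomes two flat rows, each carrying the parent's `order_id` alongside the fields of a single item.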

Inside the Matillion ETL / Snowflake Operations

Under the covers, a lot is going on in the ‘purpose-built’ approach Matillion ETL brings to Snowflake.

 

Langton described to IDN many details of the architecture and how Matillion/Snowflake work together in operations.

 

“Snowflake’s rich set of functions and operators can then quickly and easily derive new datasets with clean, sanitized, fully-or-partially aggregated data joined from a wide range of data silos. Matillion ETL also ties into the Snowflake separated compute layer so ETL jobs can dynamically scale the processing power of the warehouse for different parts of the ETL pipeline, then scale it back down afterward. This helps keep our customers’ costs down,” he said.
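The scale-up-then-scale-down pattern Langton describes can be sketched in Python as a context manager that resizes a warehouse around a heavy step and restores the smaller size afterward, even if the step fails. This is an illustrative assumption about the general pattern, not Matillion's implementation; `scaled_warehouse`, the `execute` callback, and the warehouse name `etl_wh` are hypothetical, and statements are collected in a list instead of being sent to Snowflake.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def scaled_warehouse(
    execute: Callable[[str], None],
    name: str,
    work_size: str,
    idle_size: str,
) -> Iterator[None]:
    """Resize the warehouse for the enclosed work, then scale it back down."""
    execute(f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{work_size}'")
    try:
        yield
    finally:
        # Runs even if the enclosed step raises, so the larger size is not left on.
        execute(f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{idle_size}'")


def run_pipeline() -> List[str]:
    """Record the statements a pipeline with one heavy step would issue."""
    statements: List[str] = []
    with scaled_warehouse(statements.append, "etl_wh", "XLARGE", "SMALL"):
        statements.append("-- heavy transformation step runs here")
    return statements


if __name__ == "__main__":
    for statement in run_pipeline():
        print(statement)
```

Passing the executor in as a callback keeps the sketch testable without a database connection; a real job would hand in a function that runs the statement against Snowflake.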

 

Larger on-prem to cloud migrations require various skills to balance the needs for effectiveness and safety.  Such migrations can become a team sport for an enterprise. So Matillion ETL’s features are engineered for team collaboration, he said.

 

Langton also described in detail how Matillion ETL delivers such collaboration:

To facilitate collaboration among large teams, the Matillion UI is collaborative by design. If another user is in the same project, version and job, you’re seeing each other's changes in real time. If your business demands a different working style, you can use a Git repository. Now all work can be done in your own branches (using a Matillion version each to provide a more isolated environment) - then merged together later.

 

These teams are often made up of emerging job titles as well as some more well-known ones like data engineer, ETL engineer, or data scientist. However, it is usually someone with a data-specific use case or need first, and a background in information technology second. Since every business needs to compete with data, limiting access to data just inside IT does not help a company move quickly with insights.

 

So, this new persona, the Citizen Data Professional, would include many data integrator or data analysis type roles. A Citizen Data Professional is a data-savvy knowledge worker who sits outside the IT department and wants to tackle data problems using intuitive, simple tools to load and centralize data sources for analytics and innovation. These people could report to a lot of different executives and are no longer singularly under the purview of the CTO, as you might expect.

With such a wide range of options for an enterprise to use the cloud for new-gen data analytics, we asked Langton whether there are more popular patterns emerging for a typical project.

The typical use cases for Matillion ETL are data transformation for analytics and reporting, data warehouse modernization, machine learning modeling, and data centralization.

 

More broadly, Matillion ETL fits any use case that involves moving data from multiple, disparate sources, such as relational databases, document-style databases, SaaS applications like Salesforce, Marketo and Hubspot, or social media platforms like Twitter and Facebook, into Snowflake, with the ability to then transform that data into other shapes. More specifically, those “other data shapes” might be Star Schemas or Data Vaults in a data warehouse modernization use case. They might provide a centralized store of master data in a master data management scenario, or just some big flat tables to run queries against.
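As a small illustration of reshaping loaded data into a star schema, the Python sketch below splits flat sales rows into one dimension table and one fact table linked by a surrogate key. The column names, the sample rows, and the `to_star` helper are hypothetical; in a Snowflake-based pipeline this kind of reshaping would normally be expressed as SQL transformations and not in Python.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def to_star(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split flat rows into a customer dimension and a sales fact table."""
    surrogate_keys: Dict[str, int] = {}  # customer name -> surrogate key
    dimension: List[Row] = []
    facts: List[Row] = []
    for row in rows:
        customer = row["customer"]
        if customer not in surrogate_keys:
            # First time this customer appears: assign the next key
            # and store its descriptive attributes once.
            surrogate_keys[customer] = len(surrogate_keys) + 1
            dimension.append(
                {
                    "customer_key": surrogate_keys[customer],
                    "customer": customer,
                    "region": row["region"],
                }
            )
        # The fact row keeps only the key and the measure.
        facts.append(
            {"customer_key": surrogate_keys[customer], "amount": row["amount"]}
        )
    return dimension, facts


if __name__ == "__main__":
    flat = [
        {"customer": "Acme", "region": "EMEA", "amount": 120},
        {"customer": "Birch", "region": "APAC", "amount": 75},
        {"customer": "Acme", "region": "EMEA", "amount": 40},
    ]
    dim, fact = to_star(flat)
    print(dim)
    print(fact)
```

Three flat rows yield two dimension rows and three fact rows, so the repeated customer attributes are stored only once.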

Matillion also has DWaaS partnerships with Amazon Redshift, Snowflake and Google BigQuery. On AWS, the company supports Amazon Redshift and Snowflake. On GCP, it supports Google BigQuery and Snowflake. On Azure, Matillion supports Snowflake and will launch Matillion ETL for Azure Synapse this year, Langton added.

 

Matillion ETL for Snowflake on GCP is exclusively available on the Google Cloud Marketplace, allowing customers to leverage their existing billing relationship with Google.



