Talend Cloud Adds Pipeline Designer To Empower More Users To Easily Build Flexible, Fast End-to-End Data Pipelines

Talend is adding a new feature to its iPaaS to make it easier for more enterprise workers to create end-to-end data pipelines.  IDN talks about Talend Cloud’s new Pipeline Designer with Talend’s Ray Christopher.


Ray Christopher
product marketing manager

"More and more users need to work with a wider variety of data, but many don’t have the skills. We built Pipeline Designer so users don’t need deep technical skills."


Talend is adding features to its iPaaS cloud integration offering to make it faster and easier to create end-to-end data pipelines.  Talend Cloud is adding a flexible and intuitive Pipeline Designer that tackles complex integrations and makes it easier for more staffers to get data where it needs to be.


Talend Pipeline Designer is a web-based graphical designer that runs atop Talend Cloud. It removes the complexity of building data pipelines for multiple data types, across on-premises systems and multiple clouds, by providing a set of persona-specific apps, Talend’s product marketing manager Ray Christopher told IDN.


“As data has become more and more important, requirements have changed.  More types of people need to work with data. But today’s tools are either too complex or too expensive. So, our strategy is to offer a set of pre-built, persona-based apps,” Christopher said. The aim is to give everyone who needs to work with data their own way to do so.


Further, thanks to Talend’s personalized app approach, working collaboratively with data will become easier too. “We see data will become more of a team sport,” he added.

To be specific, Pipeline Designer includes custom-designed apps for data scientists, analysts, data engineers and data stewards.  Each set of users is given the ability to collect, govern, transform, and share multiple types of enterprise data, predominantly through a simple-to-navigate GUI, he added.


Pipeline Designer apps allow users to work with multiple formats – batch, structured, streaming, real-time, even JSON data files, Christopher said. They are also built to be self-serviceable – but also governable, he added.

Thanks to Talend’s Pipeline Designer, a wide array of users can use persona-based apps to:

  • Create end-to-end pipelines to preview live data and transform it in a web-based UI
  • Design and build resilient pipelines with schema capabilities
  • Develop and debug in real time with a live preview of data
  • Design with speed across batch and streaming data use cases (via a single interface)
  • Integrate any data, structured or unstructured

Talend included technology in its Pipeline Designer that aims to directly tackle several top challenges of working with data in 2019, Christopher said.


Talend’s Pipeline Designer Helps Enterprises Become More Data-Driven 

Talend’s Pipeline Designer arises as enterprises struggle to become more data-driven, Christopher told IDN. 


“More and more users need to work with a wider variety of data, but many simply don’t have all the technical skills required. We built Pipeline Designer so users don’t need deep technical skills.” 


Talend cited a survey of data scientists by Kaggle, which found that more than 30% of respondents reported the unavailability of data and the difficulty of accessing it as their top challenges.  The survey also found rocketing demand for better access to more data, indicated by a surge in job postings for data engineers.


To promote collaboration among multiple users, the Pipeline Designer’s persona-based apps are also built to work together. “This way, everyone can have their own specialized app, and working with data can also become a team sport,”  he added.


Talend also aims to promote pipeline portability, as well as to reduce the laborious steps often required to input or access data.


“To improve speed and portability, we’ve taken a ‘Design once. Deploy anywhere’ approach to the Pipeline Designer,” Christopher said.  The idea is to make pipelines less tightly coupled to platforms – and provide more flexibility and even a degree of future-proofing.   Talend allows pipelines to run with multiple clouds – or even with Amazon EMR, Snowflake, and other sources – without major re-programming, he added.


Much of this agility comes from the fact that Talend is leveraging Apache Beam. 


“We built Pipeline Designer on top of Apache Beam, which insulates you from deployment platform. So we make it easy for you to run on any cloud platform or EMR today, and we’re ready for future platforms. Apache Beam lets Talend support whatever comes down the road,” Christopher said. 
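
The "design once, deploy anywhere" idea can be illustrated with a minimal plain-Python sketch. This is not Talend's or Apache Beam's API; the names `apply_steps`, `run_sequential` and `run_threaded` are hypothetical, and the two "runners" simply stand in for different execution platforms. The point is only that the pipeline definition never refers to the engine that will run it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# A "pipeline" here is just an ordered list of per-record transforms.
# All names below are hypothetical, not Talend or Apache Beam APIs.
Step = Callable[[dict], dict]


def apply_steps(record: dict, steps: List[Step]) -> dict:
    """Run every step of the pipeline on a single record."""
    for step in steps:
        record = step(record)
    return record


def run_sequential(steps: List[Step], records: Iterable[dict]) -> List[dict]:
    """Hypothetical 'local' runner: processes records one at a time."""
    return [apply_steps(r, steps) for r in records]


def run_threaded(steps: List[Step], records: Iterable[dict]) -> List[dict]:
    """Hypothetical 'parallel' runner: same pipeline, different engine."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in input order.
        return list(pool.map(lambda r: apply_steps(r, steps), records))


# Design once: the pipeline definition never mentions a runner.
pipeline: List[Step] = [
    lambda r: {**r, "name": r["name"].strip().title()},
    lambda r: {**r, "amount_usd": round(r["amount_cents"] / 100, 2)},
]

orders = [
    {"name": "  ada lovelace ", "amount_cents": 1250},
    {"name": "alan turing", "amount_cents": 999},
]

# Deploy anywhere: pick the runner at execution time.
local_result = run_sequential(pipeline, orders)
parallel_result = run_threaded(pipeline, orders)
```

Both runners produce the same records from the same pipeline definition, which is the property Christopher describes; a real engine would of course differ in scale and fault tolerance.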


A Talend blog post describes the benefits of Apache Beam:

Talend has long been a leader in “future proofing” your development work. You model your pipeline and can then select the platform to run it on (on-prem, cloud or big data). And when your requirements change, you just select a different platform.

An example is when we turned our code generator from MapReduce to Spark, so you could turn your job to running optimized, native Spark in a few clicks.

But now, it’s even better. By building on top of the open source project Apache Beam, we are able to decouple design and runtime, allowing you to build pipelines without having to think about the processing engine you will run your pipeline on.

Even more, you are able to design both streaming and batch pipelines in the same palette.

So you could plug the same pipeline on a bounded source, like a SQL query, or an unbounded source, for example, a message queue, and it will work as a batch pipeline or a stream pipeline simply based on the source of data.

At runtime, [users] can choose to run natively in the cloud platform where your data resides, and [users] can even choose to run on EMR for ultimate scalability.

Christopher explained how Pipeline Designer’s architecture aims to deliver a  “design once and run anywhere” outcome and allows companies to run on multiple clouds in a scalable way.


“Pipeline Designer lets users design and run both streaming and batch pipelines in the same palette. So the same pipeline can support a bounded source, like a SQL query, or an unbounded source, for example, a message queue. It will work as a batch pipeline or a stream pipeline simply based on the source of data,” he told IDN. 
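
As a rough illustration of how one pipeline can serve both cases, the sketch below uses plain Python generators rather than Apache Beam itself. The function `clean_readings` and the sample data are hypothetical; a list stands in for a SQL query result (bounded) and an endless generator stands in for a message queue (unbounded).

```python
import itertools
from typing import Iterable, Iterator


def clean_readings(source: Iterable[dict]) -> Iterator[dict]:
    """One pipeline definition: drop invalid readings, convert C to F.

    Written as a generator, so it never needs to know whether the
    source will end.
    """
    for reading in source:
        if reading["celsius"] is None:
            continue
        yield {"sensor": reading["sensor"],
               "fahrenheit": reading["celsius"] * 9 / 5 + 32}


# Bounded source: a finite list, standing in for a SQL query result.
query_rows = [
    {"sensor": "a", "celsius": 20.0},
    {"sensor": "b", "celsius": None},
    {"sensor": "c", "celsius": 100.0},
]
batch_output = list(clean_readings(query_rows))


def message_queue() -> Iterator[dict]:
    """Unbounded source: an endless generator standing in for a queue."""
    for i in itertools.count():
        yield {"sensor": f"s{i}", "celsius": float(i)}


# The same function now behaves as a stream; take the first three results.
stream_output = list(itertools.islice(clean_readings(message_queue()), 3))
```

The transformation code is identical in both runs; only the source decides whether the work finishes (batch) or continues for as long as data arrives (stream).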


For ‘On-the-Fly’ Access To Data, Talend Sidesteps Use of ‘Rigid’ Metadata

Talend’s Pipeline Designer also is designed to promote speedy access to data – both to insert and extract from a pipeline.


Talend’s design gives users “immediate access” to the data they’re using, Christopher said. It does this because it works with a schema-less design using a method called schema-on-read.   This saves time by eliminating the requirement to map incoming data, he added.


“Because Pipeline Designer doesn’t use a rigid meta approach, this means users can work with their data immediately – without a ton of set-up work,” he said.  “Our dynamic capabilities mean users don’t have to manually map. And we constantly auto-adjust.  So users can add data sources ‘on-the-fly’ and get to access them because we dynamically discover schemas.”
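
Schema-on-read in general can be sketched in a few lines of Python. This is an illustration of the technique only, not Talend's implementation; the function `discover_schema` and the sample records are hypothetical. No schema is declared before the data arrives, and the inferred schema adjusts as new fields or types show up.

```python
import json
from typing import Dict, Iterable


def discover_schema(lines: Iterable[str]) -> Dict[str, str]:
    """Infer a field -> type-name mapping while reading JSON lines.

    No schema is declared up front; fields are added as they appear.
    A field seen with more than one type is recorded as 'mixed'.
    """
    schema: Dict[str, str] = {}
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            type_name = type(value).__name__
            if field not in schema:
                schema[field] = type_name
            elif schema[field] != type_name:
                schema[field] = "mixed"
    return schema


# Hypothetical incoming data: the second record adds a new field and
# the third changes the type of "id".
incoming = [
    '{"id": 1, "email": "a@example.com"}',
    '{"id": 2, "email": "b@example.com", "country": "FR"}',
    '{"id": "3", "email": "c@example.com"}',
]

schema = discover_schema(incoming)
```

Here the new `country` field is picked up without any manual mapping, and the type change on `id` is flagged rather than causing a failure at load time.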


Pipeline Designer also provides visibility into operations, so that even non-technical users can be alerted to problems.


“In that case, [users] would get a warning or an error in the bottom pane of the Pipeline Designer window that displays the preview of live data with a message that will help easily troubleshoot the root cause of the problem.  You would check the result of the transformation through the live data preview pane at the bottom of the screen,” Christopher told IDN.


“In the future, we are looking at bringing intelligent assistance in the design of the pipeline with interactive tips [to] raise warnings and trigger actions based on bad/invalid data flowing through the pipeline,” he added.


Pipeline Designer is part of the Talend Data Fabric platform. This component integration solves several complex aspects of managing a data value chain end-to-end. In conjunction with Talend Data Fabric, users of Pipeline Designer can collect data across systems, govern it to ensure proper use, transform it into new formats, improve its quality, and share it with internal and external stakeholders.


Further, Pipeline Designer is managed by Talend Management Console, the same application that manages the rest of Talend Cloud. This continuity ensures that IT has a full view of the Talend platform, providing the oversight and governance that can only come from a unified platform like Talend Cloud, Christopher said.  This also makes it easy to control data usage, as well as to audit and ensure privacy compliance, he added.


One analyst described how data-driven initiatives would benefit from Talend’s approach.


“A majority of organizations are integrating sources and targets across hybrid and multi-cloud environments [which] can result in an assembly of multiple tools and technologies for one end-to-end solution,” said Stewart Bond, director for IDC’s data integration and integrity software unit.  “Talend Cloud with Pipeline Designer is built for hybrid and multi-cloud data environments, providing data engineers with ease-of-use and elastic scalability on-demand while also enabling integration of streaming and at-rest data, all in one tool, simplifying integration solutions.”


Because Pipeline Designer is integrated into the Talend Cloud, it is instantly available to current Talend Cloud users.  For those not currently using Talend Cloud, a trial is available for the Talend Cloud Pipeline Designer.