Talend’s Big Data Sandbox Assembles Key Ingredients and ‘Recipes’ To Explore Ideas and Deliver Success

Talend is looking to make big data experiments easier, safer and more fruitful. Talend’s Big Data Sandbox provides an intuitive way for developers to test big data technologies -- before committing to the work of deploying them into existing infrastructure.

Tags: analytics, big data, Cloudera, data, Docker, ETL, Hadoop, Hortonworks, integration, MapR, real-time, Talend, virtual

Big data often entails experimentation. It aims to cast light into the undiscovered corners of the business, hoping to uncover insights that can suggest changes for the better – for both the company and its customers. With such goals in mind, very few organizations are in a position to start their first big data project with a clear view of the outcomes.


With this in mind, Talend is looking to make big data experiments easier, safer and more fruitful.


Talend’s Big Data Sandbox provides an intuitive way for developers to test big data technologies -- before committing to the work of deploying them into existing infrastructure, according to Talend’s chief marketing officer Ashley Stirrup.


Providing a full-featured virtual environment in a pre-deployment phase allows IT to experiment without disruption, and discover all the elements they need for a successful big data project -- budget, talent, roll-out time and more, he added.

Specifically, Talend’s Big Data Sandbox provides a pre-assembled, drag-and-drop visual design environment that makes it easy to build integration workflows. It comes pre-configured to eliminate much of the complex and time-consuming setup. To spur innovation, it even includes pre-built big data use cases, according to Stirrup.


The Talend Big Data Sandbox looks to speed and simplify big data projects by helping users:  

  • Design faster: with Talend Studio, users can design batch, real-time and streaming integration jobs in a drag-and-drop user interface.
  • Collaborate better: improve collaboration with a shared repository, continuous delivery methods, and metadata bridge sharing.
  • Cleanse earlier: use native Hadoop data profiling, data matching, and machine learning to better understand corporate data.
  • Manage more: leverage consoles to centrally manage and monitor big data projects.
  • Scale easier: achieve massive scale with built-in Lambda architecture and in-memory processing.
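For readers unfamiliar with the term, the Lambda architecture mentioned above pairs a batch layer (precomputed views over the full history) with a speed layer (views over recent events), merging the two at query time. The sketch below illustrates that general pattern only; all names in it are hypothetical and are not Talend APIs.

```python
from collections import Counter

# Illustrative Lambda-architecture sketch (hypothetical names, not Talend's):
# the batch layer recomputes views over the complete history, the speed layer
# covers events not yet captured by a batch run, and the serving layer merges
# both to answer queries.

def batch_view(historical_events):
    """Batch layer: recompute page-hit counts over the full history."""
    return Counter(e["page"] for e in historical_events)

def speed_view(recent_events):
    """Speed layer: incrementally count events since the last batch run."""
    return Counter(e["page"] for e in recent_events)

def serve(batch, speed):
    """Serving layer: merge batch and real-time views for queries."""
    return batch + speed

historical = [{"page": "/home"}, {"page": "/pricing"}, {"page": "/home"}]
recent = [{"page": "/home"}, {"page": "/docs"}]

merged = serve(batch_view(historical), speed_view(recent))
print(merged["/home"])  # 3 hits: two from the batch layer, one from the speed layer
```

Keeping the two layers separate is what lets such systems scale: the batch layer can be slow but complete, while the speed layer stays small and fast.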

Inside the Talend Big Data Sandbox: Ingredients and Recipes for Success

Under the covers, Talend’s Big Data Sandbox brings together the latest editions of the company’s Big Data Integration offering with top Apache projects (Spark, Kafka and Cassandra), plus Hadoop distributions from Cloudera and Hortonworks. (The Hadoop distro from MapR is optionally available.)


In addition to bringing together key ingredients for big data success, the Talend Big Data Sandbox also offers a step-by-step ‘cookbook’ with recipes for popular ready-to-run, real-world use case scenarios:

  • Real-time analytics of data from multiple streaming sources
  • Real-time, personalized offer recommendations based on customer behavior
  • Clickstream analysis with ability to visualize activity on a heat map so companies can more precisely track web traffic
  • Monitoring IT operations using Apache weblogs
  • Extract, Transform and Load (ETL) offloading to help accelerate complex workload processing
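To make the clickstream scenario above concrete, the heat-map step boils down to bucketing raw click coordinates into coarse grid cells and counting hits per cell -- the aggregation a heat-map visualization draws from. The sketch below illustrates that idea only; none of these names come from Talend's tooling.

```python
from collections import defaultdict

CELL = 100  # pixel size of one heat-map cell (hypothetical choice)

def heatmap(clicks):
    """Aggregate raw (x, y) click coordinates into per-cell hit counts."""
    grid = defaultdict(int)
    for x, y in clicks:
        # Integer division maps each pixel coordinate to its grid cell.
        grid[(x // CELL, y // CELL)] += 1
    return dict(grid)

clicks = [(10, 20), (90, 95), (150, 40), (820, 610)]
print(heatmap(clicks))  # {(0, 0): 2, (1, 0): 1, (8, 6): 1}
```

In a streaming setup the same aggregation would run continuously over events arriving from a source such as Kafka, with the counts feeding the visualization layer.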


The Talend Big Data Sandbox also uses Docker technology, which provides users several benefits, according to Mark Balkenende, a solution architect at Talend.


Docker lets users conduct side-by-side comparisons of Hadoop distribution platforms in real-time and determine which will work better for their needs. Docker also lets developers easily share use case prototypes or investigate NoSQL and other big data technologies.


Balkenende further detailed Docker’s benefits to Talend’s Big Data Sandbox in a recent blog post:

One of the most exciting changes, but possibly the least visible, is our use of Docker for containerization of many of the underlying components. With the explosion of enterprise movement into the DevOps space, Docker has become a powerful tool for rapid and reliable provisioning and deployment of services and applications. We at Talend are embracing this movement internally and this Sandbox represents our first comprehensive use of Docker to distribute our own evaluation software platform.

Docker technology offers developers a way to package their application into a standardized piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server. This allows developers to quickly evaluate a variety of ready-to-run big data scenarios, tools and platforms within a virtual environment so that they can better understand the end-to-end lifecycle of a big data project and how it is likely to perform in their current environment.

A report from Gartner noted several benefits to the pre-packaged approach adopted by Talend:

Starting with a predefined and comprehensive set of business cases and an elaborate strategy can easily take the discovery and experimentation out of the equation. This would be unfortunate, as big data approaches and technologies offer the opportunity to come up with different business questions, different insights and different business operations.

[The Gartner report, “Big Data Strategy: Get Inspired, Get Going, Get Organized,” was co-authored by Frank Buytendijk, Alexander Linden and Douglas Laney.]

Readers can learn more about a free trial of Talend’s Big Data Sandbox.