Tableau Delivers Simple and Fast ‘Ad-Hoc’ Visualizations for More Hadoop, NoSQL Platforms

Tableau Software is rolling out more simple and fast ‘ad-hoc’ visualizations for Hadoop, thanks to deep new direct connections for IBM’s InfoSphere BigInsights, Amazon Elastic MapReduce, Spark SQL and MarkLogic’s Enterprise NoSQL. Tableau already supports MapR, Cloudera, Hortonworks and Pivotal.  IDN talks with Tableau’s Jeff Feng.

Tags: Amazon, analytics, big data, BigInsights, Cloudera, cubes, Hadoop, Hive, Hortonworks, IBM, MarkLogic, MapR, NoSQL, Pivotal, report, SAML, Tableau, visualizations

Jeff Feng
product manager

"Tableau greatly simplifies the process of exploring and querying your data so that a user can focus on asking the questions."

Tableau Software is rolling out more simple, fast and ad-hoc visualizations for Hadoop, thanks to deep new direct connections for IBM’s InfoSphere BigInsights, Amazon Elastic MapReduce, Spark SQL and MarkLogic’s Enterprise NoSQL.



Tableau’s latest Hadoop / NoSQL connector initiatives are in various stages of public availability.

  • A connection for IBM’s InfoSphere BigInsights, Big Blue’s flagship Hadoop offering, is delivered through the BigSQL technology. It gives users drag-and-drop analytics.
  • A direct connector to MarkLogic’s Enterprise NoSQL database platform lets users augment their existing Hadoop deployment with MarkLogic Enterprise NoSQL. Tableau provides a direct path to that highly indexed un-structured data.
  • A beta version of a Spark SQL connector, which allows queries to run in-memory on Hadoop clusters up to 100x faster than Hadoop MapReduce.
  • A beta of a connector for the Amazon Elastic MapReduce service, which runs on Amazon Web Services. With this Tableau offering, users will be able to work directly with their Amazon Web Services hosted and managed Hadoop environments.

These latest additions come on top of Tableau’s Hadoop / NoSQL connector portfolio, which already supports MapR, Cloudera, Hortonworks and Pivotal.


Tableau’s expanding partnerships with Hadoop and NoSQL vendors aim to do more than simply add to Tableau’s list of connectors, Tableau product manager Jeff Feng told IDN. They are collaborative technology partnerships that produce tools that let non-technical users visually analyze their data. “[Tableau] greatly simplifies the process of exploring and querying your data so that a user can focus on asking the questions of their data rather than the processing of asking questions of their data,” Feng added.


He shared more about the depth of the technology cooperation between Tableau and the Hadoop / NoSQL platform providers.


“We collaborate closely with our technology partners so that we jointly leverage the best interface to the data. We also provide guidance on the functionality of connectors to our platform as well as how to optimize performance,” he said. “In most cases, the data technology [partner] provider provides Tableau a driver to work with. We take that driver and we optimize it. This means we consider the unique capabilities of each database, and fine-tune the queries that Tableau generates and translate it into the results a Tableau user expects to see,” Feng added.


Beyond data connectors, this edition of Tableau supports what the company calls “fast ad-hoc visualizations,” thanks to deeper integration with its Hadoop partners and direct connections to more Hadoop data stores. Tableau’s VizQL technology translates drag-and-drop gestures into queries that databases can understand, and then renders the result set as a visualization in Tableau.
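
To make the gesture-to-query idea concrete, here is a minimal sketch in Python. It is not Tableau’s VizQL. The `Shelf` class, the `to_sql` function and the `orders` table are hypothetical names invented for illustration. The sketch only shows how a description of what a user dragged onto the canvas could, in principle, be turned into a query a database understands.

```python
from dataclasses import dataclass, field


@dataclass
class Shelf:
    """Hypothetical record of what a user dragged onto the canvas."""
    table: str
    dimensions: list = field(default_factory=list)  # fields to group by
    measures: list = field(default_factory=list)    # (aggregate, field) pairs


def to_sql(shelf: Shelf) -> str:
    """Build one aggregate query from the dragged fields (illustrative only)."""
    select_parts = list(shelf.dimensions)
    select_parts += [f"{agg.upper()}({col}) AS {agg.lower()}_{col}"
                     for agg, col in shelf.measures]
    sql = f"SELECT {', '.join(select_parts)} FROM {shelf.table}"
    if shelf.dimensions:
        sql += f" GROUP BY {', '.join(shelf.dimensions)}"
    return sql


# Dragging "region" to rows and SUM(sales) to columns might yield:
query = to_sql(Shelf("orders", ["region"], [("sum", "sales")]))
print(query)
```

A real connector would also have to account for each database’s SQL dialect and capabilities, which is the per-driver tuning Feng describes.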


A Deeper Look at Tableau’s Data Visualization Architecture for Hadoop, NoSQL

Architecturally, Tableau’s latest Hadoop / NoSQL connectors underscore the company’s growing focus on integration and other techniques to let users quickly derive insights that combine structured, unstructured, streaming and other data types.


“Organizations have an ever-changing data environment and our goal is to bring any data, anywhere to Tableau for fast and easy analysis,” Feng noted. “Long gone are the days where data lived in a set place. Therefore we see ourselves as a flexible platform and strive to integrate with the data platforms our customers have invested in.”


So, when Tableau talks about integration, “we’re talking about working with their drivers and fine-tuning them so that Tableau is running optimized queries against each of these data sources,” Feng said. The goal isn’t just to help people build pretty ‘pixel-perfect’ dashboards, Feng said. “It’s to enable people to do ad-hoc exploration of their data.”


Tableau’s vice president of product management Dan Jewett put it this way. “Our integrations with technology partners in the Hadoop and NoSQL space . . . stem from our mission to put the rich visual analytics capabilities of Tableau into the hands of everyone, even those with billions of rows of data.”


Feng enumerated some of Tableau’s guiding principles and enabling technologies for visualizing insights from multiple data types using Hadoop.


Ease of access for data from across the enterprise -- and even cloud. “We give users the capability to connect their data assets wherever they are. We don’t require anyone to move their complete data set from their data store, warehouse and application into our system,” he said.


Flexible data architecture. This allows a user to connect to a big data platform either live or via an extract. When the data is very large, users will generally be better off leveraging the computational power of the big data platform, provided they have a fast platform and a large cluster. “If a user does not have a fast platform or if they are working with smaller data sets, a user can extract the data and move it in-memory for faster performance. Or they can use a combination of both,” Feng said. “Tableau does not require users to move data into proprietary data stores.”


Data blending. Because Tableau can blend data across these different sources, users are free to keep their data assets where they are. “This means faster time to insight without having to move the data,” Feng said. Tableau can access structured, semi-structured, and unstructured data from big data platforms through a supported interface such as Hive, Impala or HAWQ, Feng noted. “The advantage of [Hadoop] platforms is that all of the data can be kept in the same location without moving it in and out and the schema does not need to be defined beforehand, giving users flexibility in how they consume the data,” he added.
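
As a rough illustration of blending, the following Python sketch queries two independent sources, aggregates each one where it lives, and then joins only the small result sets on a shared field. It assumes two in-memory SQLite databases as stand-ins for the real platforms. The table names, sample rows and the `make_source` and `blend` helpers are hypothetical, and the article does not describe how Tableau implements blending internally.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database standing in for one source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two separate sources that share only a "region" field (hypothetical data).
sales_db = make_source("CREATE TABLE sales (region TEXT, amount REAL)",
                       "INSERT INTO sales VALUES (?, ?)",
                       [("east", 100.0), ("east", 50.0), ("west", 70.0)])
targets_db = make_source("CREATE TABLE targets (region TEXT, target REAL)",
                         "INSERT INTO targets VALUES (?, ?)",
                         [("east", 120.0)])


def blend(primary, secondary):
    """Aggregate each source where it lives, then join the small results."""
    actual = dict(primary.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall())
    target = dict(secondary.execute(
        "SELECT region, SUM(target) FROM targets GROUP BY region").fetchall())
    # Left join on the shared field: every primary region is kept.
    return {region: (total, target.get(region))
            for region, total in sorted(actual.items())}


blended = blend(sales_db, targets_db)
print(blended)
```

Neither source’s raw rows are copied into the other, which matches the point that data assets can stay where they are.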


Timely access to data. “Anyone can choose to connect ‘live’ to their data platform or bring their data in-memory (into Tableau’s system),” Feng said. “Connecting ‘live’ means that Tableau sends the query and the database processes the query directly only returning the minimum amount of data needed to visualize the result in Tableau,” he said. Tableau also has the ability to bring data in-memory if you want to look at a smaller subset of data, Feng added.
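
The difference between a live connection and an in-memory extract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Tableau code: an in-memory SQLite table stands in for the data platform, and the `events` table, its rows and both helper functions are hypothetical.

```python
import sqlite3

# Stand-in "data platform": an in-memory SQLite table (hypothetical data).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (region TEXT, clicks INTEGER)")
db.executemany("INSERT INTO events VALUES (?, ?)",
               [("east", 3), ("west", 5), ("east", 4), ("west", 1)])


def live_totals(conn):
    """'Live': the database aggregates; only one row per region comes back."""
    rows = conn.execute(
        "SELECT region, SUM(clicks) FROM events GROUP BY region ORDER BY region")
    return dict(rows.fetchall())


def extract_totals(conn):
    """'Extract': copy the raw rows into memory, then aggregate locally."""
    extract = conn.execute("SELECT region, clicks FROM events").fetchall()
    totals = {}
    for region, clicks in extract:
        totals[region] = totals.get(region, 0) + clicks
    return totals, len(extract)


live = live_totals(db)
extracted, rows_moved = extract_totals(db)
print(live, extracted, rows_moved)
```

Both paths produce the same totals. The live path returns two summary rows, while the extract path moves all four raw rows first, a gap that grows with data volume.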


Tighten the cycle of visual analysis. Feng calls this Tableau’s “secret sauce.” This capability has two parts: (1) providing an efficient means to access and query data from different databases and data processing engines; and (2) once a user connects to their data, enabling them to visually explore it following their natural train of thought.


User experience. Tableau partners with Hadoop vendors to deliver not simply data but also a great user experience, one that lets non-technical users combine all these data types without needing a lot of technical knowledge, Feng told IDN.


“We’ve made the experience as easy as importing a picture into a document,” he said. “[Users] can connect to any data source with one click. The experience is the same no matter what data source you connect to. Once you have connected to the data source and gotten authenticated, you can drag and drop any data field onto our canvas in Tableau, add additional data sources and blend them together. When you connect to your data in Tableau, you can preview your data, make sure it’s the right data and make some modifications to it, such as re-naming fields or hiding fields you don’t need.”


One major user of Tableau for big data is very happy. “Tableau’s solution for Hadoop is one of the most elegant solutions I’ve seen,” said Ravi Bandaru, Product Manager of Advanced Analytics & Data Visualization at Nokia. “This obviates any need for us to move huge log data into Relational store before analyzing it with Tableau.”


Tableau Also Ships Kerberos Security in Tableau 8.3 Update

Tableau also released its latest update to general availability. Tableau 8.3 delivers support for Kerberos for Microsoft SQL Server, Microsoft SQL Server Analysis Services and Cloudera Impala. Kerberos support provides single sign-on with strong cryptography from the desktop or browser all the way to the database. Further, Tableau Server now supports a built-in authentication system.


While Tableau already supported enterprise class security and authentication mechanisms, Tableau 8.3 extends security integration beyond native Active Directory or SAML-based IAM providers, officials said.


Here are the additional options:

  • Provides a seamless, single sign-on experience from your Windows login through the Tableau client to back-end data sources
  • Keeps sensitive data protected by integrating with existing IT investments in enterprise-grade authentication and data security
  • Supports smart card authentication

Current Tableau users welcomed the added security, especially as users demand more ability to do ad hoc or on-the-fly visualizations. David Purdy, an IT architect at Eastman Chemical Co., described the Tableau 8.3 benefits this way in a statement: “Authors can create visualizations against secured database views and/or OLAP cubes and publish them without worrying about security because now the backend data source can take care of that based on the viewer’s identity. As an added bonus, the guided configuration made set-up easy.”