Pentaho Ships BI & Analytics Tools for Hadoop, Cloud

Pentaho Corp., an open source BI company, is bringing business intelligence and analytics to “big data” with new features for Apache Hadoop. Pentaho’s support for Hadoop includes tools and wizards for BI design, deployment, staging and management – all to make it easier for designers, data architects and end users to leverage Hadoop for BI and analytics tasks.

Tags: business intelligence, analytics, Pentaho, Hadoop, Apache, data integration, EDS, cloud services, big data, metadata services

Pentaho, the open source BI company, is bringing business intelligence and analytics to “big data” with new support features for Apache Hadoop. Pentaho’s BI support for Hadoop includes tools and wizards for BI design, deployment, staging and management – all to make it easier for designers, data architects and end users to leverage Hadoop for BI and analytics.

 


Apache Hadoop is an open source software framework that allows data intensive applications and operations to work against massive amounts of data – petabytes and beyond.

“Our goal is Hadoop with practically zero programming, so we can simplify the use of Hadoop for analytics, including file input and output steps as well as managing Hadoop jobs,” Joe Nicholson, Pentaho’s vice president of product marketing, told Integration Developer News.

 



Nicholson and his team spoke with customers and prospects about Hadoop for BI and analytics projects, and found a major pro and a major con. “Many really like Hadoop’s ability to deliver cost-effectiveness for large data volumes,” Nicholson told IDN. “But, we also heard Hadoop’s command-line driven development environment makes it difficult and expensive for many customers. That’s a big barrier for many users.” In fact, he said, some customers feel held back because they don’t have the large Java development staffs that many big companies have.

In response, Pentaho’s announced Hadoop offerings include data integration and analytic data services, as well as a data access wizard that simplifies data preparation, loading and integration.

“Our combination [of tools] gives normal line of business managers the ability to bring through all this powerful technology at a single data source, and auto-magically create an MDM layer in a cube,” Nicholson told IDN.   

Pentaho’s data access wizard avoids the need to use the command-line interface and provides a drag-and-drop way to upload and stage data, as well as accelerate transformations. The wizard can auto-generate reporting and OLAP metadata – functions that Nicholson said will open up access to non-BI specialists and even enable “self-service” BI. The wizard is based on technology developed for Pentaho’s cloud-based, on-demand BI offering.


Hadoop’s Promise for BI, Analytics –
Today and for Future Use Cases

For today, Hadoop support represents two key market opportunities for Pentaho, Nicholson said. They are (1) existing accounts looking to expand their BI and analytics efforts across multiple, high-volume data sources and clickstream data (in areas such as CRM, marketing, finance and fraud detection), and (2) greenfield accounts looking to use Hadoop for complex BI and analytics across multiple or large data stores.

Nicholson said Pentaho also sees strong growth opportunities for Hadoop-driven BI and analytics in the future.

“Hadoop is not just about more large volumes,” Nicholson said. “Customers tell us they also want to relate their data from one area of the business to another, such as sales values, geographies, verticals, product sets, and so on, so they can really see where they are across the whole company.”  

Nicholson noted that Hadoop brings customers the ability to crunch data from many data sources, including unstructured data – capabilities that he said “are difficult for data marts to deliver without a lot of cost and complexity.”

Pentaho’s Hadoop interests will remain focused on BI enablement, Nicholson emphasized, and will steer clear of Hadoop’s back-end architecture or spinning up instances. “Our end game will be to make Hadoop easy for analytics processing, and get people over any technical barriers,” he added.

Nicholson, who worked at data integration firm Informatica before joining Pentaho, also sees significant value for customers in applying BI and analytics across ever-larger sets of mixed data.

“Data sources will get bigger. They are going to be huge – into the petabytes of data and even beyond,” Nicholson said. “So, we feel better access and manageability for growing sizes and diverse formats of data will be the next big driver of BI and analytics solutions.”


Pentaho Enterprise Data Services Suite
Ships New Technologies for BI, Analytics

Beyond Hadoop, Pentaho is also adding BI and analytics capabilities through its Enterprise Data Services Suite (EDS).

Pentaho’s EDS expands beyond Pentaho Data Integration’s core ETL, data integration and transformation offerings, and includes support for on-demand clouds and data wizard technologies aimed at letting users load data and immediately begin analysis.

“This sector is way bigger than ETL and data integration,” Nicholson said. “EDS creates for us a concept beyond PDI, where we can add pieces beyond ETL and integration to promote Agile BI and analytics down the road.”

Pentaho’s EDS offers users a set of tools to access a data source and subsequently prepare that data for end-user analysis. The software is available in on-premise or SaaS format. The range of tools includes enterprise-class data integration, data uploading, metadata and data optimization, and a tool for easing the way that data can be moved to the cloud.

Among specific features of Pentaho’s EDS are:

  • Pentaho Data Integration Enterprise Edition. PDI provides a scalable ETL development and management toolset for data integration, as well as for building and managing data mart and data warehouse projects. PDI includes a graphical, drag-and-drop environment and supports access to all common data sources (open source and commercial RDBMSs, packaged apps, SaaS such as salesforce.com, and many flat file formats).

    PDI also comes with many data integration capabilities, including 150+ pre-built mapping objects and support for “slowly changing dimensions” and enterprise information integration (EII).
  • Analytic Data Services. These services are designed to make it simpler for end users to conduct rich and complex BI and analytics tasks against multiple data sources. These analytic data services are powered by Pentaho's ROLAP engine and wizard-based aggregation tools.
  • Cloud Data Services. Pentaho’s cloud data services eliminate complexities of moving data to the cloud. Via EDS, users can easily upload data and immediately begin making sense of that data without having to purchase new software, hardware or IT infrastructure.
  • Metadata Services. These enable users to specify what they want to accomplish and allow the metadata to be created and carried through the entire process, from data access through transformation and preparation. Pentaho’s Metadata Services also include project and process management tasks such as versioning and team development.
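
For readers unfamiliar with the “slowly changing dimensions” mentioned in the PDI item above, the following is a minimal, tool-independent sketch of the common “Type 2” approach, in which a changed attribute closes the current dimension row and opens a new one so that history is preserved. It is not Pentaho's implementation; the `CustomerRow` record and the `apply_scd2` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    """One version of a customer in a dimension table (hypothetical schema)."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[CustomerRow], customer_id: int,
               new_city: str, as_of: date) -> List[CustomerRow]:
    """Return a new history list with a Type 2 change applied.

    If the customer's current row has a different city, that row is closed
    (valid_to = as_of) and a new current row is appended. If the city is
    unchanged, the history is returned as-is. Unknown customers get a first row.
    """
    updated: List[CustomerRow] = []
    changed = False
    found_current = False
    for row in history:
        is_current = row.customer_id == customer_id and row.valid_to is None
        if is_current:
            found_current = True
            if row.city != new_city:
                # Close the old version instead of overwriting it.
                updated.append(replace(row, valid_to=as_of))
                changed = True
                continue
        updated.append(row)
    if changed or not found_current:
        updated.append(CustomerRow(customer_id, new_city, as_of))
    return updated


# Example: a customer moves, so the old row is closed and a new one is opened.
history = [CustomerRow(1, "Boston", date(2009, 1, 1))]
history = apply_scd2(history, 1, "Denver", date(2010, 6, 1))
```

Because the old row is kept rather than overwritten, reports can still attribute earlier activity to the customer's earlier city, which is the usual reason for choosing this approach in data mart and data warehouse projects.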

 

