Hortonworks Acquires Onyara; Launches DataFlow To Tackle 'Internet of Anything' Solutions

Hadoop distribution vendor Hortonworks has acquired Onyara, the company behind the Apache NiFi open source project that aims to automate the flow of data between systems. Hortonworks aims to make it easier to automate and secure the data flows from ‘Internet of Anything’ projects. 


Hadoop distribution vendor Hortonworks has acquired Onyara, the creator of and key contributor to Apache NiFi, an open source project at the Apache Software Foundation that automates the flow of data between systems.


Hortonworks executives say the Onyara acquisition will make it easier for companies to design and launch ‘Internet of Anything’ (IoAT) projects. Specifically, customers will be able to automate and secure the data flows that collect, conduct and curate real-time business insights and actions derived from data in motion.


Thanks to the agreement, Hortonworks is shipping Hortonworks DataFlow powered by Apache NiFi. Hortonworks DataFlow, built on Apache NiFi, a top-level open source project, is complementary to the Hortonworks Data Platform powered by Apache Hadoop.


Hortonworks’ latest moves into IoAT come as big data developers are looking for easier and more reliable ways to create and capture waves of new data from machines, sensors, geo-location devices, social feeds, clickstreams, server logs and more.


Unlike data from conventional systems, this new data requires two-way connections and security from the edge to the datacenter, creating a need not only for specific security protocols but also for data protection, governance and provenance, according to Joe Witt, chief technology officer at Onyara.


“Nearly a decade ago when IoAT [Internet of Anything] began to emerge, we saw an opportunity to harness the massive new data types from people, places and things, and deliver it to businesses in a uniquely secure and simple way,” Witt said in a statement.


Rob Sader, co-founder of Onyara, explains the impact of the DataFlow technology in a recent blog post titled DataFlow - An Expert View:

We could give you all kinds of definitions from all over the place but, to keep it simple, Dataflow is connecting systems that produce data (Sensors), with systems that process data (Apache Spark), which are then connected to systems that store data (HDFS, MongoDB).


Pretty simple when you draw it out on paper or in a diagram, but in practice, connecting systems has always been one of the most significant challenges that organizations have faced for years.

“Onyara’s impressive work on security and simplicity in NiFi, combined with their commitment to open source makes for a perfect addition to our technology team,” Hortonworks CEO Rob Bearden said upon the acquisition.


Inside Apache NiFi: The Core of Hortonworks DataFlow

The DataFlow technology has its roots in both Apache open source and the U.S. government’s tech transfer program.


Apache NiFi is based on technology previously called “Niagara Files” that has been in development and use at scale within the U.S. National Security Agency (NSA) for the last eight years. The NSA Technology Transfer Program released Apache NiFi to the open source community in the fall of 2014.


It was designed and built to orchestrate dataflows from disparate data sources and can securely:

Collect - Aggregate any and all IoAT data from sensors, machines, geo-location devices, clickstreams, files, and social feeds via a highly secure lightweight agent.


Conduct - Mediate secure point-to-point and bidirectional data flows and deliver reliably to real-time analytic applications and full fidelity data systems such as HDP.


Curate - Allow parsing, filtering, joining, transforming, forking or cloning of data streams.

These operational capabilities, along with NiFi’s other features, enable data stewards to construct secure and reliable data grids as continuous dataflows for real-time processing - from anything, from anywhere - at scale.
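The three stages above can be pictured with a toy example. The following is only a minimal sketch in plain Python, not NiFi code and not how NiFi is implemented: the `Record` type, the function names and the sample readings are all hypothetical, chosen purely to illustrate the idea of collecting from several sources, curating the stream, and conducting it to more than one destination.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Record:
    """Hypothetical unit of data moving through the flow."""
    source: str    # name of the system that produced the reading
    payload: dict  # the reading itself


def collect(sources: Dict[str, Iterable[dict]]) -> Iterator[Record]:
    """Collect: merge readings from several named sources into one stream."""
    for name, readings in sources.items():
        for payload in readings:
            yield Record(name, payload)


def curate(
    records: Iterable[Record],
    keep: Callable[[Record], bool],
    transform: Callable[[Record], Record],
) -> Iterator[Record]:
    """Curate: filter out unwanted records and transform the rest."""
    for record in records:
        if keep(record):
            yield transform(record)


def conduct(
    records: Iterable[Record],
    destinations: List[Callable[[Record], None]],
) -> int:
    """Conduct: deliver each record to every destination; return the count."""
    delivered = 0
    for record in records:
        for deliver in destinations:
            deliver(record)
        delivered += 1
    return delivered


def run_demo():
    # Made-up sample data standing in for real sensor and machine feeds.
    sources = {
        "sensor": [{"temp_c": 21.5}, {"temp_c": None}],
        "machine": [{"temp_c": 30.0}],
    }
    analytics: List[Record] = []  # stands in for a real-time analytics app
    archive: List[Record] = []    # stands in for a full-fidelity data store
    count = conduct(
        curate(
            collect(sources),
            keep=lambda r: r.payload.get("temp_c") is not None,
            transform=lambda r: Record(
                r.source, {"temp_f": r.payload["temp_c"] * 9 / 5 + 32}
            ),
        ),
        destinations=[analytics.append, archive.append],
    )
    return count, analytics, archive
```

Calling `run_demo()` drops the empty sensor reading, converts the remaining two to Fahrenheit, and hands the same records to both destinations. A real deployment would add the security, reliability and scale that the article attributes to NiFi; none of that is modeled here.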


A description of DataFlow’s value to IoAT on the Hortonworks website puts it this way:


To derive value and real-time insights, data in motion from IoAT must be treated as dataflows - from source to destination - so that modern analytical applications can collect, conduct and curate the data in a secure, scalable and reliable manner.


Thanks to Hortonworks DataFlow (HDF), customers will be able to securely and easily collect, conduct and curate any type of data from any origin. Traditional data at rest as well as real-time data in motion can now be blended to provide historical and perishable insights.


The Apache NiFi project supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. Some of its high-level capabilities and objectives include:

  • Web-based user interface for seamless experience between design, control, feedback, and monitoring
  • Highly configurable: loss tolerant vs. guaranteed delivery, low latency vs. high throughput, dynamic prioritization, flow can be modified at runtime, back pressure
  • Data Provenance to track dataflow from beginning to end
  • Designed for extension, lets you build your own processors and enables rapid development and effective testing
  • Secure with support for SSL, SSH, HTTPS, encrypted content, etc. and pluggable role-based authentication/authorization
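Three of the ideas in this list (back pressure, dynamic prioritization and data provenance) can be illustrated together in a few lines. The sketch below is an assumption-laden toy in plain Python, not NiFi’s API or implementation: the class name `BoundedPriorityQueue`, its methods and the event labels are invented for illustration only. A bounded queue refuses new items when full (back pressure), releases the most urgent item first (prioritization), and logs every event so an item can be traced from start to finish (provenance).

```python
import heapq
from itertools import count
from typing import Any, List, Optional, Tuple


class BoundedPriorityQueue:
    """Hypothetical queue between two processing steps in a dataflow."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._heap: List[Tuple[int, int, Any]] = []
        self._seq = count()  # tie-breaker so equal priorities stay FIFO
        self.provenance: List[Tuple[str, Any]] = []  # ordered event log

    def offer(self, priority: int, item: Any) -> bool:
        """Try to enqueue; return False when full so the producer backs off."""
        if len(self._heap) >= self.capacity:
            self.provenance.append(("REJECTED", item))
            return False
        heapq.heappush(self._heap, (priority, next(self._seq), item))
        self.provenance.append(("QUEUED", item))
        return True

    def poll(self) -> Optional[Any]:
        """Remove and return the lowest-numbered (most urgent) item, if any."""
        if not self._heap:
            return None
        _, _, item = heapq.heappop(self._heap)
        self.provenance.append(("DELIVERED", item))
        return item
```

With a capacity of two, a third `offer` returns `False` until a `poll` frees a slot, and the `provenance` list records each queued, rejected and delivered item in order. Whether to drop, retry or block on rejection is the kind of loss-tolerant versus guaranteed-delivery choice the list above describes as configurable.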

Apache NiFi was made available through the NSA Technology Transfer Program in the fall of 2014. Over the preceding eight years, Onyara’s engineers had been key contributors to the U.S. government software project that evolved into Apache NiFi. Onyara was formed as a private company around Apache NiFi’s capabilities in December 2014. Subsequently, Apache NiFi became a top-level Apache project in July 2015, signifying that its community and technology have been successfully governed under the ASF.