MapR Brings Big Data To ‘BizOps’ – Real-Time ROI for Business Operations

Big Data deployments are poised to hit their stride in 2016. Evidence is growing that this wave of solutions is at last lowering the bar on complexity and delivering big business ROI. IDN kicks off a series on ‘Big Data for BizOps’ with a look at MapR Technologies’ Converged Data Platform.

Tags: analytics, big data, Hadoop, insights, MapR, NoSQL, real-time, streaming

Jack Norris
chief marketing officer

"The key is to collect lots of data and understand what it is telling you . . .fast enough to make a difference to the business."


Big Data deployments may be poised to hit their stride in 2016. Evidence is growing that the latest wave of big data options is lowering the bar on time and complexity – while raising the ROI for business operations.



MapR’s Converged Data Platform approach offers a win-win for big data adopters, thanks to a combination of converged architecture and developer training – both designed to lower the bar on IT. The aim is to make it easier for companies to launch big data projects, even without an expensive and hard-to-hire data scientist, MapR’s CMO Jack Norris told IDN.


MapR also offers patterns to help companies capture big ROI from big data through new revenue opportunities, cost reductions, streamlined operations and even happier (more loyal) customers.


“We’re looking to lower the bar on expertise. Instead of super high-end data scientists, we have programs that can train a traditional developer who knows SQL to understand and deliver on some important [big data and analytics] skills,” he said. Among the classes are pattern recognition, log analytics, machine learning and data warehouse augmentation.


This quick tour of MapR customer benefits shows how big data players across the industry are revving up BizOps benefits in 2016:


Insights for Real-Time Manufacturing

One chipmaker is using the MapR Converged Data Platform to collect reams of manufacturing data not just for historical analytics, but as a real-time feedback loop for the massive investments in its manufacturing line.


MapR lets the chipmaker collect and correlate data on heat, vibration and all sorts of other data from machine sensors – all of which is tied in with keeping tabs on chip quality during fabrication, Norris told IDN.  “The key is to be able to collect lots of data and try to understand what the data is telling you – and fast enough to make a real difference to the business,” Norris said.


He shared an example. “Let’s say your analytics show a strange vibration or a spike in the heat sensor in the eighth hour. You might want to shut down the line and check it. It could save a ton of money or avoid an unhappy customer.”


To deliver these real-time BizOps analytics, MapR’s Converged Data Platform is built to capture and process most of the key pieces.  “Such results require gathering many types of data in a timely, real-time fashion. And, then you need to be able to combine all that data together and use it to detect patterns [and] figure out what these patterns are saying -- Are things OK or not?” Norris said. “So, it’s not just real-time data. It’s coming up with real-time insights and patterns from the data you can use for business decisions.”
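The feedback loop Norris describes can be illustrated with a small sketch. This is not MapR code or the chipmaker's method; it is a generic rolling z-score check, and the function name, window size, threshold and sensor readings are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_spikes(readings, window=5, threshold=3.0):
    """Return the indexes of readings that deviate sharply from the recent window."""
    recent = deque(maxlen=window)  # sliding window of the latest normal-looking readings
    spikes = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag a reading that sits more than `threshold` standard
            # deviations away from the mean of the current window.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                spikes.append(i)
                continue  # keep the spike out of the baseline window
        recent.append(value)
    return spikes


# Hypothetical hourly heat-sensor readings; the eighth hour (index 7) jumps.
hourly_heat = [70.1, 70.4, 69.8, 70.2, 70.0, 70.3, 69.9, 85.0, 70.1]
print(detect_spikes(hourly_heat))
```

In a production line the same check would run on each reading as it arrives rather than on a finished list, which is the difference between real-time data and a real-time insight that Norris points to.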


Insights for Real-Time Customer Interaction & Satisfaction

This real-time lifecycle – from data capture and correlation to insights and action – also plays a role for one massively multiplayer online (MMO) gaming firm. This company, with some 200 million registered users, has adopted real-time big data to change the game for its customers and players. In so doing, it is also driving a new level of visibility and profit for its own BizOps.


Even prior to adopting MapR’s Converged Data Platform, this gaming company was a big believer in Hadoop. But its Hadoop platform was separate from its NoSQL database. This silo approach left gaps in knowledge and reaction time.


“The gaming data would be batch loaded to analytics and the company would extract the data and put it in NoSQL. So, they had to use special purpose silos for data flowing around back and forth,” Norris said. The data helped show trends in customer/player behavior, and would lead to worthwhile game updates – but only periodically, he added.


The data silos were also a drag on response time, as gaming developers could only run data and get analytics in batch mode – long after the player had left.  But, now armed with real-time (and deeper) analytics, the gaming firm can work in real-time – on a number of levels.

  • It can watch a player’s game activity in real-time and, based on analytics, customize the user experience to improve playability and upsell opportunities. “So, through the real-time analytics, it is now clear a player might be lost, and have no idea what to do next. Now, the game [developers] can open a door or do something to help the player,” Norris said.
  • For BizOps, the company can use this same real-time visibility into gamer behavior to experiment with new game features or even try out monetization strategies.  “They can now let the user purchase a sword or some other magic device.  They can even do A/B tests in real-time,” Norris noted.

“What’s also important about these benefits, is these tweaks [for players and BizOps] are driven by real analytics, not just guesses,” Norris added.
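As a rough illustration of the A/B testing Norris mentions, the sketch below assigns each player to an offer variant and compares purchase rates. It is a minimal example under stated assumptions, not the gaming firm's implementation: the function names, the experiment label and the event data are hypothetical.

```python
import hashlib


def assign_variant(player_id, experiment, variants=("A", "B")):
    """Deterministically map a player to a variant so repeat visits see the same offer."""
    digest = hashlib.sha256(f"{experiment}:{player_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rates(events):
    """Turn (variant, purchased) pairs into a {variant: purchase rate} mapping."""
    shown, bought = {}, {}
    for variant, purchased in events:
        shown[variant] = shown.get(variant, 0) + 1
        bought[variant] = bought.get(variant, 0) + int(purchased)
    return {variant: bought[variant] / shown[variant] for variant in shown}


# Hypothetical purchase events for a "magic sword" offer.
events = [("A", True), ("A", False), ("B", False), ("B", False)]
print(assign_variant("player-42", "sword-offer"))
print(conversion_rates(events))
```

Hashing the player ID together with the experiment name keeps assignments stable without storing them, so the rates can be updated as each event streams in.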

Insights for Real-Time Security – Killing the Hacker’s Profit Motive

Security software firm Terbium is looking to use real-time analytics to cut down on hackers’ incentive for stealing data. Specifically, Terbium combines its own data fingerprinting software with the MapR Converged Data Platform to search for stolen data on the Dark Web.


It works like this: By registering fingerprints of a company’s valuable data, Terbium can compare those to the ones it gathers from across the Internet, detect the unexpected appearance of sensitive information and alert customers immediately.


Terbium creates the data fingerprint without even knowing what the data says. It is based on a protocol derived from the idea of fuzzy hashing. (Terbium says it is a kind of one-way Private Set Intersection protocol, a cryptographic technique that allows two parties to measure the intersection of their sets without requiring that either party reveal the contents of its own set to the other.)
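The general idea of matching data through one-way fingerprints can be sketched in a few lines. To be clear, this is not Terbium's protocol and it does not offer the privacy guarantees of a real Private Set Intersection scheme; it is a simplified stand-in that hashes overlapping runs of words with salted SHA-256. The function names, the salt and the sample text are assumptions made for illustration.

```python
import hashlib


def fingerprints(text, shingle_size=4, salt=b"demo-salt"):
    """Hash every overlapping run of `shingle_size` words into a one-way fingerprint."""
    words = text.lower().split()
    prints = set()
    for i in range(max(len(words) - shingle_size + 1, 1)):
        shingle = " ".join(words[i:i + shingle_size]).encode("utf-8")
        prints.add(hashlib.sha256(salt + shingle).hexdigest())
    return prints


def overlap_ratio(registered, observed):
    """Fraction of the registered fingerprints that also appear in the observed set."""
    if not registered:
        return 0.0
    return len(registered & observed) / len(registered)


# The data owner computes fingerprints locally and shares only the hashes.
registered = fingerprints("account 4111 belongs to jane doe of springfield")
# The monitoring side fingerprints the text it crawls and compares hash sets.
leaked = fingerprints("for sale account 4111 belongs to jane doe of springfield cheap")
unrelated = fingerprints("weather today is sunny with a light breeze from the west")

print(overlap_ratio(registered, leaked))
print(overlap_ratio(registered, unrelated))
```

Because the comparison works on sets of hashes, a match can be detected even when the registered text appears inside a larger dump, and the side doing the matching never handles the original text.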


Terbium boasts there are 350 billion data fingerprints in its database, which continues to grow by ten to fifteen billion every day.   As one might expect, the technique requires massive amounts of storage and processing. Using MapR, Terbium registers digital fingerprints of data and searches for stolen data by comparing them to data gathered across the Internet. But beyond capacity, Terbium needs speed and smarts to make this whole BizOps work.


To understand why speed and smarts are so important to Terbium’s business model, consider this: It’s one thing to find stolen data on the Internet. It’s quite another to find it so rapidly – and inform customers about the hack – that the value of the stolen data actually goes down.


Norris put it this way: “Let’s say a stolen credit card number is worth 5 or 10 cents on the Dark Web. Drive down the time it takes to find out it’s stolen, and you can drive down that price. Speed it up fast enough and now there’s practically no reward.” As a BizOps bottom line, Norris adds: “Just ask ‘How fast can you identify the [fingerprinted] data that’s stolen and respond?’”


Insights for Streamlined Operations

At Valence Health, a provider of clinical integration services for value-based care solutions, data ingestion was becoming a real problem. Each day, Valence systems need to ingest massive data volumes on lab test results, patient records, prescriptions, pharmacy benefits, claims and payments to and from doctors and hospitals. The records are used to inform decisions about improving both healthcare outcomes and reimbursement.


All told, Valence ingests 3,000 inbound data feeds (representing 45 different types of data). To make the numbers even more mind-numbing, just a single feed can include 20 million lab records. Valence is using MapR to improve its processing time and capacity – and set the table for more real-time analytics.


“The MapR approach converges a lot of different things on one platform,” Norris told IDN. “You are pipelining the data and constantly getting analytics in real-time, as operations take place. This lets organizations dramatically squeeze down their cycle time and do things much faster.” A case in point: The large data feed of 20 million records mentioned earlier, which could have taken 22 hours to process, now takes 20 minutes, according to Valence.
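To put those figures in perspective, the short calculation below works out the implied speedup and per-second throughput. It uses only the numbers quoted above (20 million records, 22 hours, 20 minutes); the variable names are illustrative, and the results are back-of-the-envelope estimates rather than measured benchmarks.

```python
records = 20_000_000        # records in the single large feed
before_minutes = 22 * 60    # roughly 22 hours on the earlier system
after_minutes = 20          # reported time on MapR, per Valence

speedup = before_minutes / after_minutes
throughput_before = records / (before_minutes * 60)  # records per second
throughput_after = records / (after_minutes * 60)    # records per second

print(f"speedup: {speedup:.0f}x")
print(f"before: {throughput_before:,.0f} records/sec")
print(f"after:  {throughput_after:,.0f} records/sec")
```

That works out to a 66-fold reduction in processing time, or a move from roughly 250 records per second to roughly 16,700.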


This kind of performance is important to the bottom line. It lets Valence respond to intricate customer requests that used to be time-consuming and difficult, Norris said. For example, suppose a customer needs to update a record. A traditional database solution might take 3-4 weeks to get that data deleted. A feature in the MapR Converged Data Platform called ‘snapshots’ provides point-in-time recovery that enables Valence to just roll back and remove that file in minutes.


Such high levels of performance also hold promise for fraud detection (both batch and real-time). “Gartner came out with a report recently that found the cost of fraud, waste and abuse in the health care industry is estimated at $60 billion every year. So, health benefit providers are looking for more aggressive ways to identify and prevent fraud,” Norris noted.


Inside the MapR Converged Data Platform

The MapR Converged Data Platform integrates Hadoop and Spark, real-time database capabilities, and global event streaming with big data enterprise storage, for developing and running innovative data applications.


Specifically, the MapR platform brings together the following features and components:

  • Core Data Platform Modules: Hadoop, YARN, Spark, MapR-DB, MapR Streams
  • Batch: MapReduce, Hive, Pig, Spark
  • Interactive SQL: Drill, Impala, Spark SQL
  • NoSQL & Search: HBase, AsyncHBase, Solr
  • Graph: GraphX
  • Streaming: Spark Streaming, Storm
  • Data Tools: HttpFS, Sqoop, Flume
  • Coordination: Oozie, ZooKeeper
  • Provisioning, Configuration & Monitoring: Sentry, Hue, Sahara, Myriad


Just this month, MapR’s Converged Data Platform was awarded a patent from the US Patent & Trademark Office.


Among the notable features is MapR’s ability to eliminate data silos through the convergence of open source, enterprise storage, NoSQL, and event streams. Layered on this convergence are MapR’s performance, data protection, disaster recovery, and multi-tenancy features. The patent also recognized “an architecture based on data structures called ‘containers’ that safeguards against data loss with optimized replication techniques and tolerance for multiple node failures in a cluster.”


The MapR Converged Data Platform is available in both community (open source) and enterprise (commercial) editions.