MapR's Spyglass Initiative Will Capture Deeper-Dive Analytics, Views for Improving Big Data Operations

This month, MapR Technologies is set to shine a brighter light on big data operations and management by capturing and displaying yet-unseen analytics for MapR Converged Data Platform components for NoSQL, Hadoop, and Spark.  IDN looks at how MapR's Spyglass could power the next leg in big data adoption.




MapR Technologies is set to shine a brighter light on big data management and operations and, in doing so, power the next leg of big data adoption.

 

MapR’s Spyglass Initiative, announced in June and slated to roll out in August, will capture deep-dive analytics on big data operations from the MapR Converged Data Platform components for NoSQL, Hadoop, and Spark. It will gather diverse monitoring information, which is often hidden or hard to access, across containers, processes, components, and more, Dale Kim, MapR’s senior director of product marketing, told IDN.

 

Spyglass will capture that data and convert it into easy-to-consume, customizable views, Kim said. The result: more insight into, visibility of, and control over day-to-day big data operations to power new use cases for converged big data. Or as Kim put it: “In short, we’re giving users deep visibility into their clusters so they can do proper resource allocation.”

 

“With MapR’s Converged Data Platform, customers can run different workloads and use it in a variety of ways. This means they’re more interested in monitoring the apps they are running or seeing more about how [Apache] Drill and Streams work together,” Kim said. “With Spyglass, we expose the data they need to help monitor all these components and let them customize how they can view their [operational] data.” 

 

MapR Says Deeper, Centralized Monitoring Powers the Data-Driven Organization

“Smarter monitoring across converged data begins with data collection,” Kim told IDN. “This [release of Spyglass] is the first step in realizing deep analyses of all this data.”  He likened what MapR is doing to improve data visibility in big data to what happened with network monitoring, which led to valuable network security analytics. 

 

With Spyglass, MapR strives to make organizations more effective in their decision-making. It allows decisions to be driven by a centralized-yet-broad view of operational data that can easily be shared within and across organizations, Kim said. He added that MapR’s approach to centralized monitoring will open the doors to big data analyses for all data-driven organizations.

 

Even better, to keep monitoring from becoming “just another data silo,” MapR offers APIs that will let IT integrate the new tooling with other systems, he added.

 

“With this new release, we can help our customers in two distinct ways: customers now have a better view of cluster operations in MapR and they also have a better view of project interoperability,” added Anil Gadre, MapR’s senior vice president, product management, in a statement.

 

MapR’s Spyglass Initiative is born out of MapR’s view that a new virtuous cycle of benefits is on the horizon for companies using big data. As Hadoop, and now Spark, take hold in enterprises, the promise of a virtuous cycle for big data gets real, one where better visibility and control beget more big data impact and ROI, Kim said.

 

The missing ingredients to unlock big data’s virtuous cycle, Kim said, are visibility, management and predictability. “Customers continue to build out big data projects with new use-cases, more data, and more users. So, today, users [are] asking for ways to get more monitoring and management so they can provide big data users guaranteed SLAs and predictable performance,” he added.

 

Under the covers, MapR enables monitoring of nodes and infrastructure (such as read/write throughput and database operations), cluster space utilization, YARN/MapReduce applications, and service daemons. All of these help customers understand the state of their cluster so they can manage it more efficiently as they add more data and jobs.

 

Kim shared a couple of examples to illustrate the level of deep-dives MapR will be doing. “We have audit logs and they help make customers aware of how well cluster operations are performing, even down to the level of how many reads and writes,” he said. Spyglass will also be able to tell which user groups and what datasets are most popular during given periods. 

Among notable features of MapR’s Spyglass Initiative are:

  • Deep search across cluster-wide metrics and logs. The new functionality integrates powerful and popular tools for aggregating and storing metrics and log data from MapR, providing deep visibility into a big data cluster to help plan next steps.
  • Deep-dive customizable dashboards that can be shared. The Spyglass dashboards, designed to be mobile-ready and multi-tenant, provide a complete view of cluster operations in user-defined formats. This provides faster and deeper insights on cluster operations across an entire multi-tenant big data environment. Beyond customizing dashboards easily, MapR customers can share them with peers through MapR’s Exchange within the MapR Converge Community.
  • Extensible APIs for third-party tool integration. Based on open source tools with open APIs, organizations are also free to use other options for visualizing their data.

As for implementation, MapR Spyglass is actually a ‘multi-release’ initiative, Kim noted. In this first-phase release, MapR monitoring includes data capture capabilities for node and infrastructure, cluster space utilization, YARN/MapReduce applications, and even service daemon monitoring.

 

Kim shared details on the level of data capture MapR’s Spyglass Initiative will provide to support deeper monitoring of big data operations:

 

Node/Infrastructure Monitoring

  • Global Aggregates (Average, Min, Max) Charts (e.g. CPU, Disk utilization)
  • Per-Node Charts (e.g. I/O Throughput by disk)
  • MFS Read/Writes and Throughput
  • DB Puts, Gets, Scans and Cache Metrics
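As a rough illustration of what a "global aggregate" chart summarizes, the sketch below reduces per-node samples to an average, minimum, and maximum. It is not MapR's implementation or API; the node names, the sample values, and the `global_aggregates` helper are all hypothetical.

```python
from statistics import mean

# Hypothetical per-node CPU utilization samples, in percent.
# Node names and values are invented for illustration only.
cpu_by_node = {"node-a": 42.0, "node-b": 67.5, "node-c": 55.5}


def global_aggregates(samples):
    """Collapse per-node samples into cluster-wide avg/min/max."""
    values = list(samples.values())
    if not values:
        raise ValueError("no samples to aggregate")
    return {"avg": mean(values), "min": min(values), "max": max(values)}


agg = global_aggregates(cpu_by_node)
print(agg)
```

A per-node chart would plot each entry of `cpu_by_node` on its own, while the global chart plots only the three aggregated values over time.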


YARN/MR Application Monitoring

  • Global YARN Trend Graphs
  • Containers - Pending, Active
  • vCores & RAM - Allocated & Used
  • Per Queue Charts - Containers, vCores, RAM
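The "Allocated & Used" pairing above suggests comparing what a queue has reserved with what it actually consumes. The following is a minimal sketch of that comparison under assumed inputs; the `QueueSample` type, its field names, and the queue names are hypothetical and do not come from MapR or YARN.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """Hypothetical snapshot of one queue's resources."""
    queue: str
    vcores_allocated: int
    vcores_used: int
    ram_allocated_mb: int
    ram_used_mb: int


def usage_ratio(used, allocated):
    """Fraction of the allocation in use; 0.0 when nothing is allocated."""
    return used / allocated if allocated else 0.0


def summarize(samples):
    """Map each queue name to its vCore and RAM usage ratios."""
    return {
        s.queue: {
            "vcores": usage_ratio(s.vcores_used, s.vcores_allocated),
            "ram": usage_ratio(s.ram_used_mb, s.ram_allocated_mb),
        }
        for s in samples
    }


samples = [
    QueueSample("etl", 8, 4, 16384, 4096),
    QueueSample("idle", 0, 0, 0, 0),
]
summary = summarize(samples)
print(summary)
```

A queue whose ratios stay low over time is holding resources it does not need, which is the kind of signal that informs the resource allocation decisions Kim describes.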


Cluster Space Utilization Monitoring

  • Cluster Wide Storage Utilization
  • Storage Utilization Trend
  • Utilization per Volume and per Accountable Entity (data, volume, snapshot and total size)


Service Daemon Monitoring

  • Per-Service Charts (CPU Usage by type, Memory)
  • Centralized Searchable Logs
  • MapR Core and Ecosystem Services (includes YARN, Drill and Spark)


Following the initial August release, more Spyglass features are scheduled to roll out periodically through 2017.

 

At least one analyst likes how MapR’s approach is responding to the next generation of big data projects. MapR has recognized the need for “a unified enterprise-grade platform for files, database and streaming data,” said Robin Bloor, chief analyst at Bloor Group, in a statement. With Spyglass, “MapR introduces a single pane of glass to seamlessly monitor, search and analyze data across an organization.”

 

MapR To Reduce Chaos with Ecosystem for Open Source-Based Big Data

MapR is also looking to bring order to what some see as open source chaos, and to help assure IT of better out-of-the-box interoperability among a number of popular open source components that drive build-it-yourself big data implementations, Kim added.

 

The MapR Ecosystem Pack (MEP) provides full certification of cross-project interoperability across a selected subset of popular open source projects. MEP will give customers pre-packaged, pre-tested, certified packs of open source projects on a quarterly basis. Monthly updates will be available to deliver bug fixes and patches.



