Yellowbrick Data Brings Petabyte-Scale to Hybrid Cloud Data Warehouses

Yellowbrick Data is expanding the scale of its hybrid cloud data warehouse architecture with support for multi-petabyte capacity. One instance can support more than 3 petabytes of data across on-premises and cloud deployments.

Tags: cloud, data warehouse, ETL, Hadoop, hybrid, on-premises, SQL, Yellowbrick




Yellowbrick’s latest offering provides single-warehouse capacity of 3.6PB of user data in an 18U rack form factor. When fully populated, an instance holds a maximum of 45 nodes in 18U and supports 45 concurrent, single-worker queries on one system.


The petabyte-scale capability aims to extend enterprise infrastructure for growing workloads and is available on Yellowbrick’s new 3-chassis hybrid data warehouse configuration, according to Nick Cox, head of product at Yellowbrick.
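To put the quoted figures in perspective, the short Python sketch below derives per-node and per-chassis numbers from them. It assumes decimal units (1PB = 1,000TB) and an even split of capacity, nodes and rack space across the three chassis. Neither assumption is stated by Yellowbrick, and the function name is purely illustrative.

```python
# Figures quoted in the article.
TOTAL_CAPACITY_TB = 3600  # 3.6PB, assuming decimal units (1PB = 1,000TB)
RACK_UNITS = 18
NODES = 45
CHASSIS = 3


def density_breakdown(capacity_tb, rack_units, nodes, chassis):
    """Derive density figures, assuming resources are split evenly.

    The even split is an assumption made for illustration only; the
    article does not describe how capacity is distributed.
    """
    return {
        "tb_per_node": capacity_tb / nodes,
        "tb_per_rack_unit": capacity_tb / rack_units,
        "nodes_per_chassis": nodes / chassis,
        "rack_units_per_chassis": rack_units / chassis,
    }


breakdown = density_breakdown(TOTAL_CAPACITY_TB, RACK_UNITS, NODES, CHASSIS)
for name, value in breakdown.items():
    print(f"{name}: {value:g}")
```

Under those assumptions, the configuration works out to 80TB of user data per node, 200TB per rack unit, and 15 nodes in each 6U chassis.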


“With the 3-chassis configuration, we’re delivering dramatically more storage and performance in a very small form factor. It’s the most recent example of our relentless commitment to innovation—something that’s been part of the company culture and our hybrid data warehouse since day one,” Cox said in a statement. 


Among other key benefits, the 3-chassis product scales to meet future data demands, expands seamlessly to higher capacities of storage and performance, and optimizes data warehouse footprint dramatically for on-premises deployments.

Notable Features of the Yellowbrick Data Warehouse

Yellowbrick’s hybrid data warehouse sports other noteworthy features, including: 

Platform Scalability: The Yellowbrick Data Warehouse stack is engineered to perform at the speed of flash memory. Spinning disk has been replaced with an all-memory architecture, allowing data to move directly from flash memory to the CPU with virtually no latency.


Performance: The Yellowbrick Data Warehouse is architected to outperform systems that operate only in the public or private cloud. It uses a flash-based architecture built and optimized for tasks such as ETL pushdown, in-place ELT, data wrangling and de-normalization.


Compatibility: The Yellowbrick Data Warehouse is ANSI SQL and Postgres compatible and works with a broad range of hybrid ETL tools, including Informatica PowerCenter, SyncSort, Hadoop, Spark, Kafka, and ODBC/JDBC-based tools.


Data Reliability: For most public and private cloud data warehouses, disaster recovery and backup are either an expensive option or don’t exist at all. Disaster recovery for on-premises appliances requires buying a second appliance that sits idle most of the time, and disaster recovery for the cloud isn’t typically included in enterprise SLAs. 

Yellowbrick’s Cloud Disaster Recovery Service Goes GA

Yellowbrick is also releasing its Cloud Disaster Recovery service to general availability, alongside new database replication and enhanced backup/restore features.


These additions “fill a major gap,” Cox added, noting that existing business continuity tools can prove too slow and complicated to manage large and growing datasets across cloud, hybrid cloud and on-prem.


Cox said in a statement, “We’re complementing the existing business continuity functionality inside a single Yellowbrick Data Warehouse--including support for high availability, erasure coding, and fault tolerance--with new features that provide continuity across databases and locations in a low-cost, low-effort way using the power and flexibility of hybrid cloud architecture. That is essential for business-critical applications.”


These new features offer speed and granular control for backing up and restoring large datasets. Specifically, they support backups at near-line speed, allow incremental backups, and provide transactional consistency (ACID) for restored data.


Users can select one or more target databases to replicate to a second instance, either between cloud regions or from an on-premises instance to any cloud region, with full failover and fail-back capabilities.


The additions can also automate backup and restore operations without intermediate storage.
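For readers unfamiliar with incremental backups, the Python sketch below shows the general idea: only the chunks that changed since the previous backup are copied. It is a generic illustration, not Yellowbrick's implementation. The function names, the tiny chunk size and the use of SHA-256 digests are all assumptions chosen for readability.

```python
import hashlib


def chunk_digests(data: bytes, chunk_size: int = 4) -> list:
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def incremental_backup(data: bytes, previous_digests: list, chunk_size: int = 4) -> tuple:
    """Return (changed_chunks, new_digests).

    changed_chunks maps a chunk index to its bytes for every chunk that is
    new or differs from the previous backup. Handling of data that shrinks
    between backups is left out to keep the sketch short.
    """
    new_digests = chunk_digests(data, chunk_size)
    changed = {}
    for index, digest in enumerate(new_digests):
        is_new = index >= len(previous_digests)
        if is_new or previous_digests[index] != digest:
            start = index * chunk_size
            changed[index] = data[start:start + chunk_size]
    return changed, new_digests


# The first backup has no history, so every chunk is copied.
first_chunks, first_digests = incremental_backup(b"aaaabbbbcccc", [])

# The second backup copies only the chunk that changed.
second_chunks, second_digests = incremental_backup(b"aaaaXXXXcccc", first_digests)
print(sorted(first_chunks), sorted(second_chunks))
```

A production system would also need to record each backup's digest manifest durably and restore from a consistent snapshot; the sketch omits both.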


The Yellowbrick data warehouse platform is natively designed for hybrid cloud architecture. It offers the flexibility to run fast analytics at scale on any private cloud and on popular public clouds, according to Cox. The Yellowbrick solution offers a reserved, single-tenant, always-on instance that lets users tap into high performance through specialized hardware and software.