Full360’s ElasticBI Framework is our world-class DW PaaS, and it continues to evolve every day. As our most demanding customers generate new requirements, and as we look to the future of Data Warehousing and Business Intelligence at cloud scale, we incorporate more features and functions into ElasticBI.
This Blog Series will cover the various aspects of the ElasticBI Framework. Today we talk about the database engines and storage facilities we employ.
Full360 Elastic Data Platform
There are several primary database technologies that we use and several alternatives. We’ve seen multiple generations of database technologies come and go, so we pick the very best of the best as our tools of the trade. Since our customers demand a great deal of us, we have to keep multiple technologies in our toolkit, and we have battle-hardened them into weapons. The weapon metaphors below are a quick and dirty way of describing the various database technologies we use.
VoltDB - Fast Data
We use VoltDB because some of our customers require that we process data in near real-time. VoltDB is our submachine gun. Our primary application of VoltDB is for gaming and gamification, in use cases where we are capturing rapidly generated transactions. We can also use it for complex event processing. But primarily, think of a situation where you’re playing online poker and every time you get a card, the game sends an API call to a database. We capture that data as it happens and then create staging for analytics. In one case we used a VoltDB cluster to capture 30,000 transactions per minute with just two EC2 instances, and we never hit more than a 20% load. So obviously this technology scales massively, and we can capture as many transactions as you can throw at us. We are happy to be expanding our relationship with VoltDB as they release their version 5.
Vertica - Big Complex Data
Most of our current customer requirements call for complex data set manipulations for their business intelligence. Our mainstay is Hewlett-Packard’s Vertica database. We call Vertica our samurai sword, because it’s flexible, swift, and strong. It does exactly what we need for those customers. Its development interface is brilliant and contains all of the tools, buttons and levers we are accustomed to as mature database designers. As a columnar database, it suits a wide variety of applications, small, medium or large. We use it primarily when there are complex data models and we need to move chunks of data from place to place. When we do transforms inside the database, this is where Vertica shines. We run standard Vertica version 7.1 and we also use the Flex Table facility, which greatly extends our ability to deal with all sorts of data formats.
Redshift - Massive Dumb Data
Redshift is our big ogre club, and it’s our giant killer. We use it for what we call big dumb data. Sensor data, click stream data, or just massive amounts of data like billing records for telephone calls generally turn out to be fairly simple. Redshift scales to herculean proportions. Even though Redshift is capable of handling 4000 columns, we rarely have data models that wide. That’s because the massive data that we get from our customers is generally not that wide. We have had sensor data that gives us a lot of different metrics per record, but most of the time there are only a few metrics per record, just billions and billions of records. For that kind of data, complex set operations are not usually called for. We use Redshift to show off our capability of handling massive data, so we created our ‘Billion Row Demo’, which shows that a moderate sized cluster can aggregate 1 billion records in about 5 to 7 seconds. We love our partnership with Amazon. Redshift is a force to be reckoned with.
RDS - Cheap Data
Amazon also provides RDS to give plug compatibility with the current generation of Enterprise class database technologies. Whether you are an Oracle, IBM or Microsoft shop, we provide transitional service to get you out of expensive license arrangements and into the cloud with minimum headaches. Our experts know the best ways to implement this based upon a subtle understanding of what works best in the cloud. Performance is guaranteed, but you’ll really love the price. RDS is our table knife, and just as familiar.
S3 - Any Data
Most people don’t think of S3 as a database technology, but we do. Our applications query S3 every day. It is our Swiss Army knife. S3 is integral to the way that the ElasticBI data platform works because we use it as a reliable, low-cost, store-anything platform. Part of the way that we economize for our customers is by determining how much of the data set needs to be spinning, live and queryable, and how much of it can be offline or nearline. ElasticBI makes the most efficient use of the database technologies that need to be live and the storage technologies that feed those database stores. So we use S3 and Glacier as an integral part of any data warehouse we build. Virtually infinite storage means your take-out rights are always assured.
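To make the live versus nearline versus offline split concrete, here is a minimal sketch of how such a tiering decision might look. It is not how ElasticBI is actually implemented: the function name, the 90-day and 365-day thresholds, and the mapping of tiers to a live database, S3 and Glacier are all assumptions chosen purely for illustration.

```python
from datetime import date

# Hypothetical mapping of tiers to storage; assumed for illustration only.
TIER_TO_STORE = {
    "live": "database",    # spinning, queryable
    "nearline": "S3",      # cheap, quickly reloadable
    "offline": "Glacier",  # archival
}


def choose_tier(last_queried, today, live_days=90, nearline_days=365):
    """Pick a storage tier from how recently a data partition was queried.

    The thresholds are illustrative defaults, not recommended values.
    """
    age_in_days = (today - last_queried).days
    if age_in_days <= live_days:
        return "live"
    if age_in_days <= nearline_days:
        return "nearline"
    return "offline"


if __name__ == "__main__":
    today = date(2015, 2, 1)
    for last_queried in (date(2015, 1, 1), date(2014, 6, 1), date(2013, 1, 1)):
        tier = choose_tier(last_queried, today)
        print(last_queried, tier, TIER_TO_STORE[tier])
```

In practice the rule would probably weigh query frequency and reload cost as well as age, but even a simple age cutoff shows where the savings come from: only the first tier needs a running database.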
DynamoDB - Document Data
We use Amazon’s DynamoDB as an operational repository for our applications. Everything that we do on the ElasticBI framework will ultimately conform to an API. (Stay tuned.) As we build that out, we use DynamoDB to capture our command-and-control logic and the metadata associated with it. So obviously this is enormously flexible. We can change our design anytime we like. Of course, we can also build our UX for mobile applications with DynamoDB like normal web developers. This will give us the opportunity to take advantage of new compute services like AWS Lambda in the future.
Essbase - Tightly Secured Data
Oracle Essbase is the star of our ElasticPM Framework. But sometimes it plays a role in our DW practice. As such, Essbase is our scalpel, infinitely precise. What Essbase does better than any database in our universe is give unrivaled security down to the cell level. When you want lightning fast response time on critical performance management data (KPIs, Financials, Quality Controls) for certain eyes only among your legions of users, that’s where Essbase shines.
There are several other database technologies that we’re thinking about folding into the Data Platform. We haven’t had customer requests that require them so far. Our intention moving forward is to work with these technologies and vendors.
Snowflake - Next Gen DW Data
Snowflake is the new database technology that we are excited about because the folks at Snowflake think about data warehousing the way that we do. If Snowflake Computing were a service provider then they would be very much like Full360. A number of innovations we have designed into the ElasticBI platform with other database technologies are included in the way that Snowflake have designed their data warehouse technology. We are very excited to get our hands on it. Read Jeremy’s post on Snowflake here.
EMR - Massively Tangled Data
The real meat and potatoes of data warehousing as far as we’re concerned is the data transformation process. We call it ETL, but only because we’re not marketing people. What we mean by ETL is everything necessary, from validation to encryption to parsing, whether data is on its way into the DW or on its way out. Our ETL is our any-to-any data transformation service. We engage REST services, so we do a lot of XML translation to JSON. We do a lot of XML to CSV. We do a lot of key-value pair transformations. We do wicked parsing. And we generate streams of data that go beyond what normal structured data stores generally provide. At some point we expect that we will get something so huge and complex that our ETL can’t handle it. It hasn’t happened yet, but if it ever does, some fraction of the business may require us to do translations using Amazon’s Elastic MapReduce and/or MapR.
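For readers who want a feel for the XML to JSON and XML to CSV translations mentioned above, here is a small sketch using only the Python standard library. It is not our ETL code: the sample document, its element names and the helper functions are hypothetical, and it assumes a flat XML layout with one child element per field.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payload; the element and attribute names are assumed.
SAMPLE_XML = """<hands>
  <hand id="1"><player>alice</player><card>AS</card></hand>
  <hand id="2"><player>bob</player><card>KD</card></hand>
</hands>"""


def xml_to_records(xml_text):
    """Flatten each child of the root element into a dict of strings."""
    root = ET.fromstring(xml_text)
    records = []
    for elem in root:
        record = dict(elem.attrib)  # attributes become fields
        for child in elem:
            record[child.tag] = (child.text or "").strip()
        records.append(record)
    return records


def records_to_json(records):
    """Serialize the records as a JSON array with stable key order."""
    return json.dumps(records, sort_keys=True)


def records_to_csv(records):
    """Serialize the records as CSV, using the union of all keys as columns."""
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = xml_to_records(SAMPLE_XML)
    print(records_to_json(rows))
    print(records_to_csv(rows))
```

Going through a common list of key-value records in the middle is what makes the approach any-to-any: each new format needs only one reader or one writer rather than a converter for every pair.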
Riak - Website & Log Data
I’m personally excited about Riak because I’ve been following the Basho company for a time. I believe at some point in the future we will have requirements to upgrade a lot of CouchDB, MongoDB, Cassandra, Solr and other NoSQL databases to the winners in the space. So we’re keeping our eye on Riak. We’re also interested in finding customers that may be using NoSQL and trying to build data warehouses from that same single database technology, and we expect that we’ll be able to improve the efficiency of those installations based upon our experience across structured and unstructured data storage.
Keep in mind that Full360 has architected the ElasticBI Framework so that all of these database and storage technologies play nicely with each other. We are only too happy to build a hybrid data model across two or more of these within the ElasticBI Data Platform. We think this is a unique value proposition that is worth your consideration. Why compromise when you can have the best of all worlds? Your data is unique. The Full360 Data Platform lets your data work for you no matter what shape it’s in.