In this post we'll discuss the reasons for adopting our “different” pricing model and also try to make a fair cost comparison between SlicingDice and other well-known, amazing analytics data warehouse solutions, such as Amazon Redshift, Google BigQuery, ElasticSearch Cloud and Keen IO.
If you are looking for the cost comparison, here it is.
Scroll down the page to find all the comparison’s details and considerations.
SlicingDice Pricing Model
As discussed in the post about SlicingDice’s creation, we wanted to provide a platform that would be simple and easy for any company to use. We believe that price transparency, simplicity and predictability are really important factors when analyzing and choosing a cloud data warehouse supplier.
Why create a different pricing model?
Unlike most cloud-based data warehouse and analytics database providers, we don’t charge our customers based on infrastructure resource allocation, whether that is storage space or processing power.
We don’t believe that paying for infrastructure allocation is a good and fair model, so we thought a lot about what pricing model we could use to align SlicingDice’s business with our customers’ business models and future growth.
There are a lot of startups and businesses that must collect, process and query huge volumes of data simply to start offering their services, but this doesn’t mean these companies are proportionally making millions in revenue or can afford expensive infrastructure just because they have big data.
With that in mind, instead of charging by storage space, CPU usage, instance type or amount of data collected, we “invented” a new pricing model, based on the quantity of columns, that we believe to be more aligned with our customers’ business models and future growth.
Understanding SlicingDice’s pricing model
The only variable used in our pricing model is the number of columns, which simply means how many columns you need to store your data.
SlicingDice has two types of columns: attribute columns and event columns.
- Attribute Column
Column to store any data not associated with a date/time.
Example: storing a value in the Gender attribute column or in the Car Brand column. In this case, neither the columns nor their values are associated with a specific date/time.
- Event Column
Column to store data that is associated with a date/time.
Example: storing the value Add to Cart, together with the date/time it happened, in the Clicks event column. In this case it’s clearly important to know when the Add to Cart event happened.
While creating this pricing model, we noticed that event (time-series) data does not have the same value for all kinds of companies. Some of them just care about recently generated events, while others don’t even store them. But there are many that store events (time-series) data for months or years.
Knowing that, we aligned all interests, ours and our customers’, and decided to also use the data retention time for the event columns as a way to give customers pricing flexibility according to their data needs.
The concept is straightforward: the more columns you have, the more you pay, regardless of how many rows/values you store. This way you can pay just for what is important to your business and still store as much data as you want, without increasing your costs!
SlicingDice monthly price calculation example
Suppose you have the data shown on the left. As you can see, the Clicks column is storing event data, and the other columns are simply storing attributes.
To know how much your monthly bill is going to be, you simply need to count how many attribute columns (two in this case, one of them being State) and event columns (just one, Clicks) you need to store your data.
Remember: the amount of data (values/rows) you insert in each column (attribute or event) will NOT affect your cost. From 1 to 500 columns, you pay only US$ 2.50 per column, for both event and attribute columns.
At this price, you can store an UNLIMITED number of rows/values in each column.
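The calculation above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this post: a flat US$ 2.50 per column for 1 to 500 columns, with row counts playing no role. The function name is hypothetical, and the sketch neither models the data-retention option for event columns nor guesses at pricing beyond 500 columns, since neither is detailed here.

```python
from decimal import Decimal

# Price per column for 1 to 500 columns, as stated in this post.
PRICE_PER_COLUMN = Decimal("2.50")
MAX_COLUMNS_IN_TIER = 500


def monthly_cost(attribute_columns: int, event_columns: int) -> Decimal:
    """Return the monthly bill in US$ for the given number of columns.

    Row counts are deliberately not a parameter: they do not affect the price.
    """
    if attribute_columns < 0 or event_columns < 0:
        raise ValueError("column counts cannot be negative")
    total_columns = attribute_columns + event_columns
    if not 1 <= total_columns <= MAX_COLUMNS_IN_TIER:
        # Pricing outside the 1-500 tier is not described here, so we do not guess.
        raise ValueError("this sketch only covers 1 to 500 columns")
    return total_columns * PRICE_PER_COLUMN


# The example above: two attribute columns and one event column (Clicks).
print(monthly_cost(attribute_columns=2, event_columns=1))  # 7.50
```

Whether those three columns hold a thousand rows or a billion, the result is the same.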
Although our pricing model is different and innovation is not something people always want, we believe our offering is cheaper, simpler, more predictable and more transparent than our competitors’. More importantly, SlicingDice believes its pricing model is more aligned with its customers’ business models and future growth.
We would love to hear your opinion about our pricing model. Please, if you have anything to say or to protest, let us know.
Competitors Comparison — February 2019
Okay, let’s compare SlicingDice’s offering against well-known, amazing analytics data warehouse solutions, such as Amazon Redshift, Google BigQuery, ElasticSearch Cloud and Keen IO.
1. We tried our best to understand our competitors’ pricing structures and make a fair comparison across all of them. We must say it isn’t a trivial task, so keep in mind we probably made some mistakes here due to oversimplification.
2. We are not really comparing apples to apples here, as Amazon Redshift and ElasticSearch Cloud are not serverless solutions and Keen IO is not a database.
3. Except for Keen IO, all of these solutions have more features and capabilities than SlicingDice, such as SQL support on Amazon Redshift and Google BigQuery.
Comparison dataset and benchmarks
For this comparison we will be using the well-known 1.1 Billion Taxi Rides dataset, which includes rides made in New York City between 2009 and 2015 and is available on this GitHub repository.
This dataset has 1.1 billion rows and uses 48 columns to store the data.
Okay, let’s start.
$ 144.00 per month: total monthly cost to store the 1.1 Billion Taxi Rides dataset on SlicingDice, using 48 event columns with UNLIMITED data retention time, including built-in high availability and backup.
Details: $ 144.00 is equivalent to 48 event columns ($ 3.00 per column). That’s it, no other costs.
$ 612.00 per month: total monthly cost to store the 1.1 Billion Taxi Rides dataset on Redshift, making an unlimited number of queries and expecting them to complete in anywhere from a few seconds to 2 minutes, without any backup or high availability.
Details: 1 ds2.xlarge on-demand instance running for 720 hours (30 days). The cost above does not include any network transfer or storage (S3) cost.
If you want Redshift to have SlicingDice’s kind of speed (queries always completing within a few seconds), you can start 6 ds2.xlarge on-demand instances, paying $ 3,672.00 monthly.
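As a rough check of the Redshift figures, the sketch below rebuilds them from the numbers in this post. The hourly rate is not quoted anywhere above; it is inferred here by dividing the single-instance monthly cost by 720 hours, so treat it as an assumption rather than an official price. The function name is hypothetical.

```python
from decimal import Decimal

HOURS_PER_MONTH = 720                        # 30 days, as used in this comparison
ONE_INSTANCE_MONTHLY = Decimal("612.00")     # 1 ds2.xlarge on-demand, from this post

# Hourly rate implied by the figures above (an inference, not a quoted price).
IMPLIED_HOURLY_RATE = ONE_INSTANCE_MONTHLY / HOURS_PER_MONTH


def redshift_monthly_cost(instances: int) -> Decimal:
    """Monthly on-demand cost in US$, excluding network transfer and S3 storage."""
    return instances * IMPLIED_HOURLY_RATE * HOURS_PER_MONTH


print(IMPLIED_HOURLY_RATE)        # implied price per instance-hour
print(redshift_monthly_cost(1))   # single instance
print(redshift_monthly_cost(6))   # six instances, for faster queries
```

Because the cost scales linearly with the number of instances, faster queries on Redshift translate directly into a proportionally larger bill.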
$ 12.60 + $ 0.07 x # of queries = $ 12.60 + an unpredictable amount per month
Total monthly cost to store the 1.1 Billion Taxi Rides dataset on BigQuery, making an unknown volume of queries per day and expecting them to complete within seconds, with built-in high availability.
Details: Data is stored uncompressed on BigQuery and takes up about 500 GB of space, so the storage cost will be around $ 0.42 per day. BigQuery also charges for each query, which could translate to $ 0.07 for each query run.
So that is $ 12.60 per month for storage and $ 0.07 for each query. Running 500 queries per month would add $ 35.00 in query charges, while running 500 queries per day would add $ 1,050.00 per month.
This cost above does not include any network transfer or storage cost (Cloud Storage) you might need to use for temporarily storing the dataset.
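The BigQuery estimate can be expressed as a small function, sketched here using only the figures from this post: about $ 0.42 per day for storage, a 30-day month and roughly $ 0.07 per query. The per-query cost is this post's own approximation and would vary with how much data each query scans; the function name is hypothetical. Note that the function returns storage plus query charges, so its totals are $ 12.60 higher than the query-only figures of $ 35.00 and $ 1,050.00.

```python
from decimal import Decimal

STORAGE_PER_DAY = Decimal("0.42")   # ~500 GB uncompressed, as estimated above
COST_PER_QUERY = Decimal("0.07")    # this post's per-query approximation
DAYS_PER_MONTH = 30


def bigquery_monthly_cost(queries_per_month: int) -> Decimal:
    """Storage plus query charges in US$, excluding network and Cloud Storage."""
    storage = STORAGE_PER_DAY * DAYS_PER_MONTH
    queries = COST_PER_QUERY * queries_per_month
    return storage + queries


print(bigquery_monthly_cost(0))                      # storage only: 12.60
print(bigquery_monthly_cost(500))                    # 500 queries per month: 47.60
print(bigquery_monthly_cost(500 * DAYS_PER_MONTH))   # 500 queries per day: 1062.60
```

The storage part is fixed and small; the query part is what makes the monthly bill hard to predict.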
$1,999.00 per month: total monthly cost to store the 1.1 Billion Taxi Rides dataset on ElasticSearch, making an unlimited number of queries and expecting them to complete in anywhere from 10 seconds to a few minutes, with high availability across 3 different DCs (same as SlicingDice).
Details: As this dataset takes around 700 GB of disk space on ElasticSearch, we sized a cluster with 32 GB of memory and 768 GB of reserved storage on ElasticSearch Cloud US East (N. Virginia) with 3-data-center HA.
$11,000.00 just to start — This is the initial cost to load and store the 1.1 Billion Taxi Rides dataset on Keen IO, not making a single query. Keen IO charges $10.00 for each 1 million events streamed to their platform. We need to load 1.1 billion events, so that’s basically 1,100 x $10.00 just to start.
Keen IO also charges $1.00 for every 100 million properties (rows) scanned when making a query, so a single query can cost as much as $11.00.
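The Keen IO numbers follow from two rates quoted above: $10.00 per 1 million events streamed and $1.00 per 100 million properties scanned. The sketch below reproduces them, assuming charges scale proportionally with volume (whether partial units are rounded up is not covered here) and that a single query scans every row once. The function names are hypothetical.

```python
from decimal import Decimal

DATASET_ROWS = 1_100_000_000                       # 1.1 billion taxi rides

INGEST_PRICE_PER_MILLION_EVENTS = Decimal("10.00")
QUERY_PRICE_PER_100M_SCANNED = Decimal("1.00")


def keen_ingest_cost(events: int) -> Decimal:
    """One-time cost in US$ to stream the given number of events."""
    return Decimal(events) / Decimal(1_000_000) * INGEST_PRICE_PER_MILLION_EVENTS


def keen_query_cost(rows_scanned: int) -> Decimal:
    """Cost in US$ of a query that scans the given number of rows."""
    return Decimal(rows_scanned) / Decimal(100_000_000) * QUERY_PRICE_PER_100M_SCANNED


print(keen_ingest_cost(DATASET_ROWS))   # cost to load the whole dataset
print(keen_query_cost(DATASET_ROWS))    # cost of one query scanning every row
```

The loading cost is paid once, while the query cost is paid again for every full scan of the dataset.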
Putting it all together
The table below brings all the information above together to make the comparison easier to visualize.
Another good comparison
“On the surface, it might seem that Redshift is more expensive. Per GB, Redshift costs $0.08, per month ($1000/TB/Year), compared to BigQuery’s $0.02. However, the devil is in the details.
BigQuery’s cost of $0.02/GB only covers storage, not queries. You pay separately per query based on the amount of data processed at a $5/TB rate. Because BigQuery doesn’t provide any indexes, and many analytical queries cover entire database, we can assume that each query will need to scan a big chunk of the data. Say you have 1TB spread evenly across 50 columns (in several tables). A query that scans through 5 of these columns will process 100GB at a cost of $0.5. This means, that per GB you’ll pay an additional $0.005 per query. If you have 12 such queries per month it will actually cost you $0.08 (0.02 + 0.005 * 12). Which is the same as Redshift. Beyond that, BigQuery costs more.
The conclusion we came to is that cheap data storage is worthless disjointed from utilization. It’s arguably comparable to storing data on just Amazon S3. When you actually use the data, you’ll start paying big.”
- If you have a reasonable volume of data, say, dozens of terabytes, that you rarely query, and it’s acceptable for you to have query response times of up to a few minutes when you do, then Google BigQuery is an excellent candidate for your scenario.
- If you need to analyze a large amount of data (e.g., up to a few terabytes) by running many queries, each of which should be answered very quickly, and you don’t need to keep the data available once the analysis is done, then an on-demand cloud solution like Amazon Redshift is a great fit. But keep in mind that, unlike Google BigQuery, Redshift does need to be configured and tuned in order to perform well.
- Although ElasticSearch is very often used to store and query analytics-related data due to its great aggregation capabilities, managing and tuning an ElasticSearch cluster can be a real pain, even when using a cloud version.
- As we said before, Keen IO is not a database, nor does it have all the database capabilities of the other solutions, although it is focused on providing an API-based analytics platform to store and process event data.
- Although Amazon Redshift and ElasticSearch are currently used by thousands of companies as data warehouses, the only serverless data warehouse (real database) solution that competes against SlicingDice is Google BigQuery, as the other solutions are cloud versions of a server.
We are not looking to be a one-stop analytics database that supports all possible requirements. We simply want to be the simplest, fastest and cheapest solution for anyone that needs to store and query analytics-related (and time-series) data.
We don’t like to hide what we really are, nor do we try to. There are many things that we are really good at, but also things that we are not, and you should be aware of all that before deciding to use SlicingDice. That includes checking our current restrictions too.
Besides that, since we are a Serverless Data Warehouse and Analytics Database as a Service platform, we want to use economies of scale to your advantage and keep making our platform more robust and cheaper.
We would love to hear your opinion about this pricing comparison. Please, if you have anything to say or to protest, let us know.
Still not sure if SlicingDice is a good fit for you?
Click here and schedule a 15-minute talk with our developers, totally free of charge, so we can evaluate your case together.