MapD performance with high read requests per second


#1

Hi,

Does anyone have any tips on increasing the performance of MapD when handling multiple read requests per second?

We’ve been load testing various single-node MapD configurations and found that, with a fixed database (~10 GB, ~100M rows), scaling CPUs, GPUs or RAM does not significantly improve read requests per second for a complex select query. Even under full load, a significant amount of resources (GPU, CPU and RAM) remains available on the machine. Should we consider scaling horizontally for smaller read-only database sizes? Should we even be considering MapD for read-only databases of this size? Are there any recommendations for deploying MapD in environments with many read requests per second against a read-only database?
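
For context, here is a stripped-down sketch of the style of harness we use to drive the load (not our actual script; pymapd, the credentials, port, concurrency numbers and query below are placeholders for illustration):

```python
# Stripped-down sketch of the style of load test we run (NOT our actual script):
# N concurrent clients each issue the same read query in a loop and we report
# aggregate requests/sec. pymapd, the credentials, port and the query below are
# placeholders for illustration.
import time
from concurrent.futures import ThreadPoolExecutor

from pymapd import connect

QUERY = "SELECT COUNT(*) FROM variants"  # stand-in for our complex select
CLIENTS = 16
QUERIES_PER_CLIENT = 50

def run_client(_):
    con = connect(user='mapd', password='HyperInteractive',
                  host='localhost', port=9091, dbname='mapd')
    cur = con.cursor()
    for _ in range(QUERIES_PER_CLIENT):
        cur.execute(QUERY)
        cur.fetchall()
    con.close()

start = time.time()
with ThreadPoolExecutor(max_workers=CLIENTS) as pool:
    list(pool.map(run_client, range(CLIENTS)))

elapsed = time.time() - start
print("read requests/sec: %.1f" % (CLIENTS * QUERIES_PER_CLIENT / elapsed))
```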

Happy to provide more details if required.

Thanks,
Shane


#2

Hi

Thanks for the interest in MapD; it’s always nice to see people really putting it through its paces.

Before we head too far into the weeds, we need to get a better understanding of what your ultimate expectations are.

What kind of TPS rate are you looking for, and at what level of hardware ‘investment’?

I see your DB is quite small; are the queries you are running analytical in nature? It would be useful if you could share your schema and the types of queries you intend to run.

There is definitely scope to scale horizontally once we identify whether this is an appropriate ‘fit’.

regards


#3

Hi Dwayne,

Thanks for your response, I’m really excited about MapD!

See https://github.com/shusson/mapd-load-testing for our test script, schema and results.

Our use case is a publicly available genomic focused dashboard (using mapd-charting).

What kind of TPS rate are you looking for at what level of hardware ‘investment’

Since the dashboard will be public-facing, the TPS will depend on how many users we get. We would like to know the best way to scale and how much it will cost.

I see your DB is quite small, are the queries you are running analytical in nature?

The test script contains two realistic queries that we make as part of our dashboard. Once loaded, the DB will be read-only. Right now our DB has 10^8 rows, but in the future we’d like a dashboard that sits on top of 10^9 rows.

Is it possible to optimise MapD given that the DB will be read-only and the entire DB will fit in memory?

Thanks,
Shane


#4

@shusson, this is awesome to hear and seems like a great use case!

Can I ask what sort of hardware you are currently running on?

I ask because I noticed that your first query has AVG(AF), where AF is a float. Unfortunately, as of 3.0, grouped aggregates of floats fall back to CPU on the Nvidia Kepler architecture (e.g. K40, K80) because native double-precision atomics were not available until Maxwell. However, we have a workaround for this in master that will be released in a few days as part of 3.1, which should allow this query to run on GPU on Kepler.
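
To make the affected pattern concrete, it is a grouped aggregate over a FLOAT column, i.e. something of this shape (apart from AF, the table and column names are made up here, and the pymapd client is used purely for illustration):

```python
# Illustrative only: the query shape that falls back to CPU on Kepler in 3.0 is a
# GROUP BY with an aggregate over a FLOAT column (AVG(AF) in your case).
# Table name, grouping column, credentials and port are placeholders.
from pymapd import connect

con = connect(user='mapd', password='HyperInteractive',
              host='localhost', port=9091, dbname='mapd')
cur = con.cursor()
cur.execute("""
    SELECT chromosome, AVG(AF) AS mean_af
    FROM variants
    GROUP BY chromosome
""")
print(cur.fetchall())
```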


#5

Can I ask what sort of hardware you are currently running on?

We tested on various K80 and CPU-only setups, for example https://github.com/shusson/mapd-load-testing/tree/master/results/20gb

released in a few days as part of 3.1 which should make this query run on GPU on Kepler.

Awesome, sounds like it will give us a nice performance boost.

Is it possible the scaling issues we are seeing are due to how MapD processes concurrent requests? We are just guessing based on the limited resource usage we see under full load.


#6

Currently the parsing, optimization and parts of rendering can overlap between queries, but most of the execution happens single-file (serially). We may relax this in the future so that only the GPU portion of execution is serialized (allowing things like the CPU reduction to occur in parallel).

We’ve found that, in general, you get the most throughput on the GPU by giving each query all the resources, so it doesn’t have to contend for things like buffer or cache memory. And if individual queries finish very quickly, you get low latency even with many simultaneous queries.

I think that in your case the real issue may be that the queries are falling back to CPU for the reasons mentioned above (which should be fixed in 3.1, targeted for next week). This would explain why adding more GPUs is not giving you a performance boost.

However, for simple queries on relatively small datasets we are considering supporting execution on subsets of GPUs (smaller than the total number of GPUs) so that different GPU groups can execute at the same time. The expected gains from this configuration come from parallelizing the “fixed overheads” of each query across multiple MapD servers on the same node. Right now you can emulate this behavior by running multiple MapD servers on the same node, mapping each to a different set of GPUs with the --start-gpu and --num_gpus flags (see here for the relevant docs). Note that in this case each MapD server needs its own database, which, since your system is read-only, could be done by copying the directory for the original database n times, where n is the number of servers you want to run. Depending on the query workload, this may give you anything from no speedup to a decent speedup over running one monolithic server. A rough sketch of the setup follows below.
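
Here is a rough sketch of that replication setup. The data-directory argument, paths, ports and exact flag spellings vary by version, so treat them as assumptions and verify against mapd_server --help before using any of this:

```python
# Sketch: run N MapD servers on one node, each pinned to its own GPU subset and
# its own copy of the (read-only) database directory. Paths, ports and the exact
# mapd_server flag spellings are assumptions for illustration; check --help.
import shutil
import subprocess

N_SERVERS = 2           # e.g. two servers on a 4-GPU box
GPUS_PER_SERVER = 2
SOURCE_DATA = "/var/lib/mapd/data"   # original database directory
BASE_PORT = 9091

procs = []
for i in range(N_SERVERS):
    data_dir = "%s-copy%d" % (SOURCE_DATA, i)
    shutil.copytree(SOURCE_DATA, data_dir)   # each server gets its own copy
    procs.append(subprocess.Popen([
        "mapd_server", data_dir,                   # data directory
        "--port", str(BASE_PORT + i),              # unique port per server
        "--start-gpu", str(i * GPUS_PER_SERVER),   # first GPU for this server
        "--num_gpus", str(GPUS_PER_SERVER),        # GPUs assigned to it
    ]))

for p in procs:
    p.wait()
```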

Obviously we should eventually provide a more elegant solution to this, i.e. allowing the user to do it all from one server.

My recommendation would be to try the 3.1 release to see if it gives you acceptable latency with multiple queries and, if it’s not enough, try the replication technique described above. Let us know how it goes!


#7

My recommendation would be to try the 3.1 release to see if it gives you acceptable latency with multiple queries and, if it’s not enough, try the replication technique described above. Let us know how it goes!

thanks @todd, will do


#8

Just following up on this:

The 3.1 release did speed things up significantly, so we have now deployed a simple demo backed by AWS P2 instances.

We also tuned the fragment size to better fit our use case, which improved performance.
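
For anyone who finds this later: fragment size is set per table at creation time via the WITH clause. The schema and value below are only an illustration, not what we actually deployed:

```python
# Illustration of setting fragment_size at table creation (not our real schema
# or the value we settled on). The MapD default is 32,000,000 rows per fragment.
from pymapd import connect

con = connect(user='mapd', password='HyperInteractive',
              host='localhost', port=9091, dbname='mapd')
cur = con.cursor()
cur.execute("""
    CREATE TABLE variants_tuned (
        chromosome TEXT ENCODING DICT,
        position   BIGINT,
        AF         FLOAT
    ) WITH (fragment_size = 8000000)
""")
```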

We still have some concerns around concurrent use, for example what happens if one user starts issuing queries that take a long time (even with the watchdog). However, given our small user community and the fact that it is relatively easy to scale out on AWS, we don’t expect this to be a big issue for us.

Thanks for your support and looking forward to more updates!