Scaling and sizing MapD


#1

I am currently evaluating MapD with various sizes of dataset.

I am running MapD 3.61 Community Edition in AWS on a p2.8xlarge instance. The specs for this machine are as follows:-

Nvidia K80 - 8 GPUs, 32 vCPUs, 488 GiB RAM

When I run this with a 1.25 billion dataset Immerse works as expected, much like the demos on the MapD website.

However when I try a 5 billion dataset Immerse becomes unresponsive and in some cases returns errors. I am thinking that this is due to the dataset being too big for the box to handle, but I would like a better way of measuring and confirming this.

My first query is how to more accurately work out when the problems are due to the data set being too big for the hardware provisioned.

Will MapD log somewhere when the Engine or Immerse cannot cope?

Are there other/better ways to identify when a MapD installation needs scaling?

Does anybody have any thoughts on how the p2.8xlarge and p2.16xlarge instances would scale before needing to scale out rather than up?

How many billion rows could I expect the p2.8xlarge instance to scale to, and how many for p2.xlarge?

The p2.16xlarge instance is:-

Nvidia K80 - 16 GPUs, 64 vCPUs, 732 GiB RAM

The data set I am testing with is a variation of the taxi data, and mostly I am working with a point map in Immerse.

Any help or guidance would be greatly appreciated.

Thanks,

Andrew


#2

There isn’t an easy answer to the question “how many rows can my xyz hardware configuration handle?” because MapD is an in-memory columnar database, so it depends on how many columns and which datatypes you are using in your Immerse viz. As an example, if you use a single-precision datatype for the lon/lat columns, the memory footprint for 5B rows is 40GB (2 columns of 32 bits each), but if you define the same columns as DOUBLE or DECIMAL, which are 64-bit datatypes, you would need 80GB. You are also likely filtering on at least one other column, such as a date, and a date datatype could take up another 40GB if you are not using a fixed encoding.
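A minimal Python sketch of that arithmetic, assuming the per-value sizes stated in the post (4 bytes for FLOAT, 8 bytes for DOUBLE, DECIMAL and an unencoded DATE). The function name and the lookup table are only illustrative, and the result covers column data only, not rendering buffers or intermediate results.

```python
# Bytes per value for the column types named in the post above.
# The DATE size assumes no fixed encoding, as the post describes.
BYTES_PER_VALUE = {"FLOAT": 4, "DOUBLE": 8, "DECIMAL": 8, "DATE": 8}


def footprint_gb(column_types, rows):
    """Estimate the memory needed to hold the given columns, in GB (10**9 bytes)."""
    bytes_per_row = sum(BYTES_PER_VALUE[t] for t in column_types)
    return bytes_per_row * rows / 1e9


ROWS = 5_000_000_000
print(footprint_gb(["FLOAT", "FLOAT"], ROWS))          # lon/lat as FLOAT
print(footprint_gb(["DOUBLE", "DOUBLE"], ROWS))        # lon/lat as DOUBLE
print(footprint_gb(["FLOAT", "FLOAT", "DATE"], ROWS))  # FLOAT lon/lat plus a date filter
```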

The next release will have new geo datatypes with a smaller memory footprint and good precision, so it’s likely you will be able to run your viz on your current hardware (I just compiled the latest version, but I haven’t tried it yet).
EDIT: I did a brief test and the memory consumption for points is cut in half.

Could you share the DDL you are currently using? We can help you modify the datatypes to make your viz usable with the current version.


#3

Thanks, that is very helpful.

So basically you take the columns you need for the viz, work out the row byte size, and then multiply by the number of rows. If that exceeds the GPU memory size then things will be very slow?

In this case I am using Timestamp (8 bytes), and longitude/latitude which are each decimal (8 bytes).

24 bytes * 5 billion = 120 GB

Our p2.8xlarge instance has 488 GB of RAM but only 96 GB of that is GPU RAM.

I guess what you are recommending is we change to using TIMESTAMP ENCODING FIXED(32) (4 bytes) and then longitude/latitude as float instead of decimal?

This would result in halving the memory space required:-

12 bytes * 5 billion = 60 GB
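As a rough check of these two estimates against the 96 GB of GPU RAM, here is a small Python sketch. The helper names are made up for illustration, and the row limit is an upper bound that ignores any GPU memory needed for rendering or intermediate results.

```python
GPU_RAM_GB = 96  # GPU memory of the p2.8xlarge as quoted above


def required_gb(bytes_per_row, rows):
    """Column data size in GB (10**9 bytes) for the given row width."""
    return bytes_per_row * rows / 1e9


def max_rows(bytes_per_row, gpu_ram_gb=GPU_RAM_GB):
    """Largest row count whose columns fit in the given GPU memory."""
    return int(gpu_ram_gb * 1e9) // bytes_per_row


for label, width in (("TIMESTAMP + 2 x DECIMAL", 24), ("FIXED(32) + 2 x FLOAT", 12)):
    need = required_gb(width, 5_000_000_000)
    print(f"{label}: {need:.0f} GB for 5B rows, fits: {need <= GPU_RAM_GB}, "
          f"max rows: {max_rows(width):,}")
```

With these assumptions the 24-byte layout tops out at 4 billion rows and the 12-byte layout at 8 billion rows on 96 GB.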

Presumably this would mean the 5 billion dataset would then fit on a p2.16xlarge, but when we try the 10 billion dataset we would either need to scale up or out.

Is that right?

This really helps in terms of working out what hardware is required for different sizes of dataset.

I will need to explore what impact switching from decimal to float has for longitude and latitude, but presume it would be an acceptable optimisation when storing taxi trip locations?

It would be helpful to know how to monitor MapD via the system metrics logs to know when the memory has been exceeded. At the moment I am relying on MapD locking up and Immerse becoming unresponsive to hint that the data set has exceeded the GPU RAM.

Any thoughts on this?


#4

Yes, I am suggesting using float and a 32-bit fixed encoding for date/timestamp to reduce the memory footprint so that your data fits better in GPU memory. With the reduced memory consumption you should be able to run your viz on a p2.8xlarge (if I am not mistaken it has 8 GK210 GPUs with 12GB each, so 96GB in total).

On a GPU-enabled instance of MapD, if a single query needs more memory than the total available on the GPUs, it should either fail, for obvious reasons, or fall back to CPU execution (you have to set the allow_cpu_retry parameter in the mapd.conf file). I am not sure whether there is a batch execution mechanism (processing the query in batches, then doing a final reduction of the partial results), but this would lead to a big performance hit because of the limited bandwidth of the PCI Express bus the GPUs are connected to and the overhead of the memory management mechanism.

If several queries together need more memory than the total available, performance is reduced because a chunk of memory has to be swapped out between the execution of one query and the next; performance is not ideal, but the application would remain responsive.

So answering your original question requires an answer to another question: “what level of interactivity are you looking for?” :slight_smile:

As for monitoring, it’s quite difficult to find an automated way to detect failing queries from system metrics, because MapD uses system RAM and GPU RAM as caches, so you will likely find the GPU RAM filling up quickly (once allocated, the memory is never released; you can deallocate it explicitly with the \clear_gpu command). So I suggest looking at MapD’s logs.

The typical error is this one, and it can be found in the mapd_server.INFO log file:

E0610 19:11:10.686862 32743 BufferMgr.cpp:387] ALLOCATION failed to find 32000000B throwing out of memory GPU_MGR:0

followed by

E0610 19:11:10.690361 32152 MapDHandler.cpp:2606] Exception: Query couldn’t keep the entire working set of columns in GPU memory

If you use the allow_cpu_retry parameter you will get only the first message, because the query will be run in batches or on the CPU, so it is better to monitor the second message. (MapD’s logs reside in a subdirectory called mapd_logs under the directory where MapD stores its data.)
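A minimal Python sketch of how the two messages could be counted in a log. It matches on substrings copied from the log lines quoted above; whether the wording is identical in other MapD versions is an assumption, and the function name is made up for illustration.

```python
# Substrings taken from the two log lines quoted above.
ALLOCATION_FAILED = "ALLOCATION failed to find"
WORKING_SET_ERROR = "keep the entire working set of columns in GPU memory"


def count_gpu_memory_errors(lines):
    """Count occurrences of the two out-of-memory messages in an iterable of log lines."""
    counts = {"allocation_failed": 0, "working_set_exceeded": 0}
    for line in lines:
        if ALLOCATION_FAILED in line:
            counts["allocation_failed"] += 1
        if WORKING_SET_ERROR in line:
            counts["working_set_exceeded"] += 1
    return counts


# The two lines from the post, used here as sample input.
sample = [
    "E0610 19:11:10.686862 32743 BufferMgr.cpp:387] ALLOCATION failed to find "
    "32000000B throwing out of memory GPU_MGR:0",
    "E0610 19:11:10.690361 32152 MapDHandler.cpp:2606] Exception: Query couldn’t "
    "keep the entire working set of columns in GPU memory",
]
print(count_gpu_memory_errors(sample))
```

For a real installation you would pass an open file handle for mapd_server.INFO instead of the sample list; a non-zero working_set_exceeded count means queries actually failed rather than being retried.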

I hope this brief and slightly confused explanation helps.


#5

Along with the suggestions that @aznable has already made, there are a few other optimizations you can try:

  • Within mapdql, use \o <table name> to get the optimal DDL. This is calculated by taking the current table definition, then doing cardinality checks on the columns. Depending on the cardinality of the column, you may be able to use a dictionary encoding with a smaller range, allowing for more memory savings

  • Load data in timeseries order. This is a minor optimization, depending whether you need the entire timeperiod in memory. If you are using a crossfilter in your dashboard by time, having the partitions in time order will allow MapD to require fewer partitions in memory than if the records were randomly loaded

  • Shared dictionaries: in cases such as travel datasets where there are pickup/dropoff locations, or other situations with repeated values, making a shared dictionary can help: https://www.mapd.com/docs/latest/mapd-core-guide/fixed-encoding/#shared-dictionaries
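To illustrate the first point, here is a rough Python sketch of picking the narrowest dictionary width for a column from its cardinality. It assumes dictionary widths of 8, 16 and 32 bits and keeps one code free (for example for NULL); both are assumptions, and the output of \o remains the authoritative answer. The function names are made up for illustration.

```python
DICT_WIDTHS = (8, 16, 32)  # assumed available dictionary widths, in bits


def smallest_dict_bits(distinct_values):
    """Return the narrowest assumed dictionary width that can encode the column."""
    for bits in DICT_WIDTHS:
        # Assumption: keep one code free (e.g. for NULL), hence the strict comparison.
        if distinct_values < 2 ** bits:
            return bits
    raise ValueError("too many distinct values for a 32-bit dictionary")


def bytes_saved(distinct_values, rows, current_bits=32):
    """Bytes saved over all rows by narrowing from current_bits to the smallest width."""
    return (current_bits - smallest_dict_bits(distinct_values)) // 8 * rows


# A column with 6 distinct values over 5 billion rows, currently a 32-bit dictionary.
print(smallest_dict_bits(6), bytes_saved(6, 5_000_000_000))
```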


#6

Thanks a lot for your help.

I have adjusted the schema and re-imported the 5 billion data set. The import worked fine and was about 50% faster. However Immerse is still completely unresponsive, like before.

I added a point map to the dashboard, selecting the longitude and latitude fields. It took about 10-15 minutes to add the point map, as it was waiting on a spinner on the longitude and then the latitude fields once they had been selected. Then when I went to view the dashboard it showed a partly rendered screen for about 10 minutes before displaying the map. I then went to adjust the map to show New York and it said there had been an error. I went to refresh the browser, and now all I get is a Loading Dashboard spinner and it will no longer get further than this.

All of this functionality worked as expected when using the 1 billion dataset. The new dataset is identical apart from being 4x bigger.

Any ideas what the problem might be, and what sort of things you do to try to rectify the situation?


#7

When I run fiddler to see what error is being reported it is the following:-

Not enough OpenGL memory to render the query results


#8

Hi

You may need to increase the amount of buffer space for rendering. The option is render-mem-bytes; try setting it to 1000000000 and, if that doesn’t work, go to 2000000000.

Please review your log and see if you are thrashing the GPU memory, this will appear as EVICTION messages in the mapd_server.INFO log.

Your problem sounds like it could also be related to the time it takes to get the data off the disk and into GPU memory. What underlying filesystem are you using?

If you look in the log file mapd_server.INFO you will see information useful in identifying where your bottleneck is. Is it possible for you to share that log here so we can help you diagnose your perf issue?

The \memory_summary command available from mapdql is also useful for seeing the current memory usage of your system (both GPU and CPU) after a query. This will give you an idea of the memory footprint of your data once loaded.

regards


#9

As @dwayneberry correctly suggested, you should increase the memory reserved for rendering the point map (for example to 1610612736 bytes) by adding the render-mem-bytes parameter to your mapd.conf file; this will reduce the memory available for the data.

port = 9091
http-port = 9090
data = "/opt/mapd_storage/data"
null-div-by-zero = true
render-mem-bytes = 1610612736

Or you could try decreasing the number of points rendered in the Point Map viz.

It would be nice to know how a larger number of records impacts rendering memory; maybe it’s because a bigger bitmap is returned as a result of filtering a big recordset, and this bitmap is needed by the backend rendering engine.

As for data load times, they depend on how much bandwidth is available from your disk subsystem.

This query forces the reading of two float columns from a recordset of 516M rows:

mapdql> select count(*) from yellow_tripdata where dropoff_latitude < 80 and dropoff_longitude <120;
516815823
1 rows returned.
Execution time: 12668 ms, Total time: 12669 ms

So it’s more or less 330MB/sec.

You can restart the MapD server, run this query on your system, and find out how your disk subsystem performs and how much memory is needed to store the coords on your GPUs with the \memory_summary command of mapdql.

mapd@*******+:~$ mapdql -u mapd -p ****** --db mapd -q -t -n
mapdql> select count(*) from yellow_tripdata where dropoff_latitude < 80 and dropoff_longitude <120;
516815823
1 rows returned.
Execution time: 12624 ms, Total time: 12624 ms
[cut]
MapD Server GPU Memory Summary:
[GPU]            MAX            USE      ALLOCATED           FREE
  [0]    10564.41 MB     3943.10 MB     4096.00 MB      152.90 MB

Just do the math: 3943 MB / 12.6 s, which is roughly 313 MB/sec.
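The same arithmetic as a short Python sketch, using only the figures from the session above. The second calculation is a cross-check from the row count and assumes two 4-byte FLOAT columns and 10**6 bytes per MB; the function name is made up for illustration.

```python
def throughput_mb_per_s(megabytes, elapsed_ms):
    """Average read throughput in MB/s for a cold query."""
    return megabytes / (elapsed_ms / 1000.0)


# Figures from the session above: 3943.10 MB in use on the GPU after a 12624 ms query.
from_summary = throughput_mb_per_s(3943.10, 12624)

# Cross-check from the row count: two 4-byte FLOAT columns over 516,815,823 rows in 12668 ms.
column_mb = 516_815_823 * 2 * 4 / 1e6
from_rows = throughput_mb_per_s(column_mb, 12668)

print(f"{from_summary:.0f} MB/s from \\memory_summary, {from_rows:.0f} MB/s from the row count")
```

Both estimates land between 310 and 330 MB/sec, in line with the figure quoted earlier in the post.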

You should experiment with sharding (8 GPUs, 8 shards); it may help.


#10

Thanks a lot for all the suggestions. I am going through each of them now.

Running mapdql with the switches you provided didn’t return a GPU Memory Summary for me. I don’t know if this is because you’re on a more recent version than me or whether there is something else I need to do?

The results I got are as follows:-

mapdql> select count(*) from tripsfive where dropoff_latitude < 80 and dropoff_longitude <120;
4983981574
1 rows returned.
Execution time: 12598 ms, Total time: 12599 ms


#11

No, it’s because my post is wrong/incomplete, sorry.

After the query you have to use the \memory_summary command in order to get the total memory allocation of your query after a restart of the server (or a \clear_gpu command).


#12

Great.

mapdql> \memory_summary
MapD Server CPU Memory Summary:
           MAX            USE      ALLOCATED           FREE
  393444.84 MB    38025.18 MB    40960.00 MB     2934.82 MB


#13

This one is the CPU one; you have to check the GPUs section:

MapD Server GPU Memory Summary:
[GPU]            MAX            USE      ALLOCATED           FREE
  [0]    10564.41 MB     3943.10 MB     4096.00 MB      152.90 MB


#14

MapD Server GPU Memory Summary:
[GPU]            MAX            USE      ALLOCATED           FREE
  [0]     9405.84 MB     4765.49 MB     6144.00 MB     1378.51 MB
  [1]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [2]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [3]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [4]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [5]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [6]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
  [7]     9405.84 MB     4751.38 MB     6144.00 MB     1392.62 MB
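For anyone wanting to total these figures automatically, here is a rough Python sketch that parses rows in the layout shown above. The regular expression is based only on this output and may need adjusting for other MapD versions; the function name and the two-row sample are illustrative.

```python
import re

# One GPU row: "[0] 9405.84 MB 4765.49 MB 6144.00 MB 1378.51 MB" (any spacing).
ROW = re.compile(
    r"\[(\d+)\]\s+([\d.]+) MB\s+([\d.]+) MB\s+([\d.]+) MB\s+([\d.]+) MB"
)


def parse_gpu_summary(text):
    """Return one dict per GPU row found in \\memory_summary output."""
    gpus = []
    for match in ROW.finditer(text):
        gpu, max_mb, use_mb, alloc_mb, free_mb = match.groups()
        gpus.append({
            "gpu": int(gpu),
            "max": float(max_mb),
            "use": float(use_mb),
            "allocated": float(alloc_mb),
            "free": float(free_mb),
        })
    return gpus


# First two rows of the output above, used as sample input.
summary = """\
[GPU] MAX USE ALLOCATED FREE
[0] 9405.84 MB 4765.49 MB 6144.00 MB 1378.51 MB
[1] 9405.84 MB 4751.38 MB 6144.00 MB 1392.62 MB
"""
gpus = parse_gpu_summary(summary)
total_max = sum(g["max"] for g in gpus)
total_use = sum(g["use"] for g in gpus)
print(f"{len(gpus)} GPUs, {total_use:.2f} MB used of {total_max:.2f} MB")
```

For the full eight-GPU output above, the MAX column adds up to 75246.72 MB and the USE column to 38025.15 MB.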


#15

Thanks this has done the trick.


#16

I am getting more errors now my testing is continuing and it looks like it is due to running out of memory.

I have a dashboard with the following:-

  1. Pointmap (on float longitude and float latitude)
  2. Histogram (on TIMESTAMP ENCODING FIXED(32))
  3. Bar chart on passenger count SMALLINT
  4. Bar chart on vendor id (ENCODING DICT)
  5. Bar chart on payment type ENCODING DICT(8)

What calculations would I need to use to see if I have enough GPU memory?

It was kind of working when I first set up the dashboard, but over time (as I’ve been interacting with the dashboard) it has stopped working. Now when I load the dashboard I get an error and it exits.

I can see in the logs there are lots of evictions. I guess I have pushed it over the edge on GPU memory with the amount of charts I’ve added.

It would be good to confirm what calculations I can use, to work out how much memory I would require and then I can project what memory I need based on what datasets I am testing with.


#17

Basically 15 bytes (8 for lon/lat, 4 for timestamp and 3 bytes for the other 3 fields) * number of records, so with 5B records, more or less 75GB. You can check the memory consumption with the \memory_summary command.

Try setting allow_cpu_retry = true in mapd.conf and restart the server.

If you suspect that the volume of data is the cause of your dashboard malfunction, you can simply rename the table containing the 5B rows with “alter table [old_name] rename to [new_name]”, then create a table with old_name, load a reasonable amount of data into it, and open the dashboard.


#18

Hi

@ArnoldJ @aznable the size of the types are here https://www.mapd.com/docs/latest/mapd-core-guide/fixed-encoding/

So just redoing the calculations for an initial swag

Column            Type                     Size (bytes)   Running total (bytes)
lon               float                    4              4
lat               float                    4              8
Timestamp         encoding fixed(32)       4              12
Passenger Count   smallint                 2              14
Vendor Id         text encoded dict(32)    4              18
Payment Type      text encoded dict(8)     1              19

So 19 bytes per row, which means more like ~95GB of GPU memory is needed. With your 8 cards and the need to keep some space for rendering and temporary calculations on the GPU, you are tight with only ~75GB of total GPU space (9.4GB is currently the maximum usable per card by MapD, as reported in your \memory_summary output).
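A small Python sketch of this estimate. Summing the sizes listed in the table, with 4 bytes for the 32-bit vendor id dictionary, gives 19 bytes per row. The comparison treats the MB reported by \memory_summary as 10**6 bytes, which is an assumption made only to get a rough figure.

```python
# Per-column sizes in bytes, as listed in the table above.
COLUMN_BYTES = {
    "lon (float)": 4,
    "lat (float)": 4,
    "timestamp (fixed 32)": 4,
    "passenger_count (smallint)": 2,
    "vendor_id (dict 32)": 4,
    "payment_type (dict 8)": 1,
}

ROWS = 5_000_000_000
GPUS = 8
MAX_MB_PER_GPU = 9405.84  # MAX column reported per GPU in the memory summary

bytes_per_row = sum(COLUMN_BYTES.values())
needed_gb = bytes_per_row * ROWS / 1e9
# Assumption: treat the reported MB as 10**6 bytes for a rough comparison.
available_gb = GPUS * MAX_MB_PER_GPU / 1000

print(f"{bytes_per_row} bytes/row -> {needed_gb:.0f} GB needed, "
      f"{available_gb:.1f} GB available, shortfall {needed_gb - available_gb:.1f} GB")
```

Narrowing vendor id to an 8-bit dictionary would bring the row down to 16 bytes (80 GB), which is still above the roughly 75 GB available under the same assumptions.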

So currently your queries are going to be competing for GPU space, hence the evictions you are seeing within a single dashboard.

regards


#19

I missed the passenger_count field and assumed an 8-bit encoding of the vendor_id field because of its low cardinality, so the record is potentially larger than the 15 bytes I estimated.

It doesn’t fit in GPU RAM, but the dashboard would run with degraded performance because of the transfers from system to GPU memory between one query and the next (am I wrong?)


#20

Hi

So a bare dashboard with no cross filters may be able to run, but once you start adding cross filters from the different charts the individual queries are going to require more memory than is available, which depending on configuration will cause the query to fail with out of memory or try to fall back to cpu and run slower.

We would need to see details in the log to know exactly what is stopping the Immerse dashboards from refreshing, but the root cause is trying to squeeze a little bit too much into the GPU memory you have available.

regards