Questions about order by


#1

When I run the following SQL statement:
select total_amount from trips order by total_amount desc limit 100;
I get the following exception:
Sorting the result would be too slow
BTW, the trips table is from the New York City taxi data and has about 1.7 billion records.
The exception above occurs when the SQL runs in ‘\gpu’ mode, but when I switch to ‘\cpu’ mode, I get the result quickly.
So can't MapD switch the execution mode automatically from \gpu to \cpu?
I would also like to know whether turning off ‘enable-watchdog’ has any negative effects.
Thanks, I look forward to your reply.


#2

There are parameters, not enabled by default, that bypass the watchdog (the component that says the operation would be too slow) so that the statement falls back to the CPU automatically.

They are:

enable-watchdog, which you have to set to false
allow-cpu-retry, which you have to set to true

You can experiment with both of them by changing mapd.conf, which is located where you set the storage area for MapD, and then restarting the server.
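As a minimal sketch, assuming your mapd.conf uses the usual `key = value` option format, the two settings above would look something like this. Leave the other entries in your file as they are, and treat the comment line as illustrative only:

```ini
# mapd.conf (fragment): disable the watchdog and allow falling back to the CPU
enable-watchdog = false
allow-cpu-retry = true
```

The server has to be restarted before the changes take effect.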


#3

Thanks a lot. I hope the docs on the website can be updated one day; there seems to be no description of enable-watchdog or allow-cpu-retry right now.
I have also run into another problem with the following SQL:
a) create table fee_10 as select total_amount from trips order by total_amount desc limit 100000;
b) create table fee_100 as select total_amount from trips order by total_amount desc limit 1000000;
c) create table fee as select total_amount from trips order by total_amount desc;
a) executes correctly, but b) and c) still report the exception about the sort being too slow.
The memory_info output looks like this:
[image: memory_info output]


This is the log info; it says the query ran out of GPU memory, which is very strange…


#4

How can I sort all the data successfully? (It's about 1.7 billion records.)


#5

Hello,

Can you please tell us what version of MapD core you are running right now? How many GPUs do you have on your system?

Thanks.


#6

Hi,
Yesterday I updated MapD Core to v3.2.4; the previous version I used was v3.2.2.
My machine has 4 P40s installed.
Best regards


#7

Hello,

We try very hard to keep all query execution sub-second. To achieve that performance, we keep certain parts of queries restricted until we implement a better algorithm that ensures the entire query runs in under a second.

I am sorry to say that in the latest version of MapD we have kept the limit on sorting distinct elements at 100000. We are working on implementing a better sorting algorithm and raising the watchdog limit to a much higher value.

We believe in implementing the best execution strategy and keeping ourselves in check with self-imposed limits like static watchdogs.

Thank you for using MapD and thank you for your post.
Vraj


#8

My taxi dataset of 465M rows has well below 100000 distinct values for total fee, but the CTAS operation tries to allocate whopping quantities of memory; I think the problem isn’t the sort operation.


#9

Hi,

After trying out the MapD Core database, I think this work is really great, especially for data queries. I am wondering why MapD still does not support DELETE or UPDATE.

Best Regards


#10

Hello,

We are currently working on DELETE and UPDATE operations.

We hope to get them out soon.

The simple reason for not supporting DELETE yet: you have to handle dirty caches.

We want to deliver DELETE and UPDATE without compromising speed much.

Thank you for your kind words.
Vraj