Exception: Hash join failed, reason(s): Not enough memory for the columns involved in join


#21

Yeah, having an ID as a dictionary-encoded string is always a good idea, unless you plan on doing divisions on IDs :)

@sumit7986 approx_count_distinct should be better on memory as well.


#22

If you need to do something you can't do with text, just duplicate the field and use whichever copy best fits the operation you need.
Anyway, it works well only if you have a field with skewed values and large maximum values, like dates/timestamps.


#23

Hi @aznable,

I tried to disable the watchdog by updating mapd.conf:

port = 9091
http-port = 9090
data = "/var/lib/mapd/data"
null-div-by-zero = true

[web]
port = 9092
frontend = "/opt/mapd/frontend"

enable-watchdog = false

Am I configuring the mapd.conf file correctly, or is something wrong?

I am still getting the same exception.

Thanks
Sumit


#24

Try with:

enable-watchdog=false
allow-cpu-retry=true

You have to place those parameters in the first section of the mapd.conf file, like so:

port = 9091
http-port = 9090
data = "/var/lib/mapd/data"
null-div-by-zero = true
enable-watchdog=false
allow-cpu-retry=true

[web]
port = 9092
frontend = "/opt/mapd/frontend"

Which query have you tried?
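
The placement rule above (server options must come before the `[web]` header) can be checked mechanically. This is only a minimal sketch: `section_of` is a hypothetical helper, not part of MapD, and it assumes the file is simple `key = value` lines with `[section]` headers, as in the two layouts posted in this thread.

```python
def section_of(conf_text, key):
    """Return the section a key sits under ('' means the top-level,
    unnamed section), or None if the key is not present."""
    section = ""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()  # entering a new section
            continue
        name = line.split("=", 1)[0].strip()
        if name == key:
            return section
    return None


# Layout from post #23: the option ends up under [web]
WRONG = """
port = 9091
http-port = 9090
data = "/var/lib/mapd/data"
null-div-by-zero = true

[web]
port = 9092
frontend = "/opt/mapd/frontend"

enable-watchdog = false
"""

# Layout from post #24: the options sit in the first section
RIGHT = """
port = 9091
http-port = 9090
data = "/var/lib/mapd/data"
null-div-by-zero = true
enable-watchdog=false
allow-cpu-retry=true

[web]
port = 9092
frontend = "/opt/mapd/frontend"
"""

print(repr(section_of(WRONG, "enable-watchdog")))
print(repr(section_of(RIGHT, "enable-watchdog")))
```

Anything written after a `[web]` line belongs to that section until another header appears, which is why the first layout does not set the option where the server section would pick it up.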


#25

Hi @aznable,

Thanks for sharing the right configuration; I was configuring the mapd.conf file the wrong way. I think the previous issue, shown below, is resolved.
mapdql> select distinct(TransId) from PosB_all_data where Promotion is not null;
Exception: Query would use too much memory

But now I am getting the output below when running the same query.

mapdql> select distinct(TransId) from PosB_all_data where Promotion is not null;
Thrift: Tue Aug 28 11:51:29 2018 TSocket::write_partial() send() <Host: localhost Port: 9091>Broken pipe
Thrift: Tue Aug 28 11:51:37 2018 TSocket::write_partial() send() <Host: localhost Port: 9091>Broken pipe
Cannot connect to MapD Server.

Please help me resolve this issue as well.

Thanks-
Sumit


#26

Probably the query is consuming all the RAM on your server; I will try to reproduce it on a local system.

I am assuming TransId is a BIGINT and the cardinality is very high.


#27

I cannot reproduce the problem; is PosB_all_data a flat table or a view?


#28

It is a flat table with 1814733986 rows of data.

The datatype of the TransId column is TEXT ENCODING DICT.
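
For a rough sense of scale behind the out-of-memory guess in #26, here is a back-of-the-envelope sketch. It assumes a 4-byte dictionary index per row for TEXT ENCODING DICT (an assumption about this table; a smaller fixed-width encoding would shrink the figure), and it counts only the encoded TransId column, not the string dictionary itself or the working memory a DISTINCT needs.

```python
ROWS = 1_814_733_986   # row count reported in the thread
BYTES_PER_VALUE = 4    # assumption: 32-bit dictionary index per row

# Size of the encoded column alone, before any query working memory
column_bytes = ROWS * BYTES_PER_VALUE
column_gib = column_bytes / 2**30

print(f"{column_bytes:,} bytes (~{column_gib:.2f} GiB) for the encoded column alone")
```

If the number of distinct TransId values is also very large, the DISTINCT result and its intermediate state add to this, which would be consistent with the server running out of RAM.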