Bug with joins?


#1

Having an issue with a query doing a couple joins on the same table.

Tables are created like this:

CREATE TABLE IF NOT EXISTS analytics_actions ( 
 id bigint,
 source varchar(32) ENCODING DICT,
 page varchar(255) ENCODING DICT,
 deviceType varchar(255) ENCODING DICT,
 category varchar(32) ENCODING DICT,
 action varchar(64) ENCODING DICT,
 ipAddress varchar(255) ENCODING DICT,
 country varchar(15) ENCODING DICT,
 sessionId varchar(64) ENCODING DICT,
 clientId varchar(64) ENCODING DICT,
 createdAt timestamp,
 SHARD KEY (id) 
) 
WITH ( 
 shard_count = 1 
);

CREATE TABLE IF NOT EXISTS analytics_action_parameters ( 
 id bigint,
 analytics_action_id bigint,
 param varchar(32) ENCODING DICT,
 val varchar(255) ENCODING DICT,
 SHARD KEY (analytics_action_id) 
) 
WITH ( 
 shard_count = 1 
);

Loaded with 50M rows on first table and 100M rows on second.


This query where I only join once it works fine:

select count(*) as num
from 
analytics_actions aa 
join 
analytics_action_parameters aap on aap.analytics_action_id = aa.id and aap.param = 'percent' and aap.val = '0' 
where 
aa.action = 'progress' 

But when I join a second time on the same table:

select count(*) as num
from 
analytics_actions aa 
join 
analytics_action_parameters aap on aap.analytics_action_id = aa.id and aap.param = 'percent' and aap.val = '0' 
join 
analytics_action_parameters aap2 on aap2.analytics_action_id = aa.id and aap2.param = 'vid' and aap2.val = '2319'
where 
aa.action = 'progress'

The query fails with this error:

Exception: Hash join failed, reason(s): Equijoin types must be identical, found: TEXT, VARCHAR | Equijoin types must be identical, found: TEXT, VARCHAR | Equijoin types must be identical, found: TEXT, VARCHAR | Equijoin types must be identical, found: TEXT, VARCHAR


Is this a bug or is there some limitation that prevents the second query listed from working?


#2

@browarnik it’s possible that this is indeed a bug. Can you help us test further by modifying your query to:

select count(*) as num
from 
analytics_actions aa 
join 
analytics_action_parameters aap on aap.analytics_action_id = aa.id  
join 
analytics_action_parameters aap2 on aap2.analytics_action_id = aa.id
where 
aa.action = 'progress' and aap.param = 'percent' and aap.val = '0'  and aap2.param = 'vid' and aap2.val = '2319'

and let us know if that resolves it?

It’s possible we should be treating the literal and the dict encoded string as acceptable columns for comparison, and our engineers think it’s interesting that it fails only when the second join is included.

Thanks for you help!


#3

Thanks for the fast response. That query fails with the same error unfortunately.

If it helps here is the explain calcite for the working first query:

Explanation
LogicalAggregate(group=[{}], num=[COUNT()])
  LogicalProject($f0=[0])
    LogicalFilter(condition=[=($5, 'progress')])
      LogicalJoin(condition=[AND(=($13, $0), =($14, 'percent'), =($15, '0'))], joinType=[inner])
        EnumerableTableScan(table=[[mapd, analytics_actions]])
        EnumerableTableScan(table=[[mapd, analytics_action_parameters]])

My failing second query:

Explanation
LogicalAggregate(group=[{}], num=[COUNT()])
  LogicalProject($f0=[0])
    LogicalFilter(condition=[=($5, 'progress')])
      LogicalJoin(condition=[AND(=($18, $0), =($19, 'vid'), =($20, '2319'))], joinType=[inner])
        LogicalJoin(condition=[AND(=($13, $0), =($14, 'percent'), =($15, '0'))], joinType=[inner])
          EnumerableTableScan(table=[[mapd, analytics_actions]])
          EnumerableTableScan(table=[[mapd, analytics_action_parameters]])
        EnumerableTableScan(table=[[mapd, analytics_action_parameters]])

And your suggestion:

Explanation
LogicalAggregate(group=[{}], num=[COUNT()])
  LogicalProject($f0=[0])
    LogicalFilter(condition=[AND(=($5, 'progress'), =($14, 'percent'), =($15, '0'), =($19, 'vid'), =($20, '2319'))])
      LogicalJoin(condition=[=($18, $0)], joinType=[inner])
        LogicalJoin(condition=[=($13, $0)], joinType=[inner])
          EnumerableTableScan(table=[[mapd, analytics_actions]])
          EnumerableTableScan(table=[[mapd, analytics_action_parameters]])
        EnumerableTableScan(table=[[mapd, analytics_action_parameters]])

Also, the full explain for the first query:

Explanation
IR for the GPU:
===============

define void @query_template_0(i8** nocapture %byte_stream, i8* nocapture %literals, i64* nocapture %row_count_ptr, i64* nocapture %frag_row_off_ptr, i32* %max_matched_ptr, i64* %agg_init_val, i64** %out, i64** %unused, i32 %frag_idx, i64* %join_hash_tables, i32* %total_matched, i32* %error_code) {
.entry:
  %0 = getelementptr i8*, i8** %byte_stream, i32 0
  %1 = load i8*, i8** %0
  %2 = getelementptr i8*, i8** %byte_stream, i32 1
  %3 = load i8*, i8** %2
  %4 = getelementptr i8*, i8** %byte_stream, i32 2
  %5 = load i8*, i8** %4
  %6 = getelementptr i8*, i8** %byte_stream, i32 3
  %7 = load i8*, i8** %6
  %8 = getelementptr i8*, i8** %byte_stream, i32 4
  %9 = load i8*, i8** %8
  %result = alloca i64, align 8
  %row_count = load i64, i64* %row_count_ptr, align 8
  %10 = getelementptr inbounds i64, i64* %agg_init_val, i32 0
  %11 = load i64, i64* %10, align 8
  store i64 %11, i64* %result, align 8
  %pos_start = call i32 @pos_start_impl(i32* %error_code)
  %pos_step = call i32 @pos_step_impl()
  %group_buff_idx = call i32 @group_buff_idx_impl()
  %12 = sext i32 %pos_start to i64
  %13 = icmp slt i64 %12, %row_count
  br i1 %13, label %.loop.preheader, label %.exit

.loop.preheader:                                  ; preds = %.entry
  %14 = sext i32 %pos_step to i64
  br label %.for.body

.for.body:                                        ; preds = %.for.body, %.loop.preheader
  %pos = phi i64 [ %12, %.loop.preheader ], [ %16, %.for.body ]
  %15 = call i32 @row_func(i64* %result, i64* %agg_init_val, i64 %pos, i64* %frag_row_off_ptr, i64* %row_count_ptr, i8* %literals, i8* %1, i8* %3, i8* %5, i8* %7, i8* %9, i64* %join_hash_tables)
  %16 = add nsw i64 %pos, %14
  %17 = icmp slt i64 %16, %row_count
  br i1 %17, label %.for.body, label %._crit_edge

._crit_edge:                                      ; preds = %.for.body
  %.pre.result = load i64, i64* %result, align 8
  br label %.exit

.exit:                                            ; preds = %._crit_edge, %.entry
  %18 = phi i64 [ %.pre.result, %._crit_edge ], [ %11, %.entry ]
  %19 = getelementptr inbounds i64*, i64** %out, i32 0
  %20 = load i64*, i64** %19, align 8
  %21 = mul i32 %frag_idx, %pos_step
  %22 = add i32 %group_buff_idx, %21
  %23 = getelementptr inbounds i64, i64* %20, i32 %22
  store i64 %18, i64* %23, align 8
  ret void
}

; Function Attrs: alwaysinline
define i32 @row_func(i64* %out, i64* %agg_init_val, i64 %pos, i64* %frag_row_off, i64* %num_rows_per_scan, i8* %literals, i8* %col_buf0, i8* %col_buf1, i8* %col_buf2, i8* %col_buf3, i8* %col_buf4, i64* %join_hash_tables) #21 {
entry:
  %0 = load i64, i64* %frag_row_off
  %1 = getelementptr i64, i64* %frag_row_off, i32 1
  %2 = load i64, i64* %1
  br label %singleton_true_

exit:                                             ; preds = %singleton_true_, %filter_false
  ret i32 0

singleton_true_:                                  ; preds = %entry
  %3 = call i64 @fixed_width_int_decode(i8* %col_buf0, i32 8, i64 %pos)
  %4 = ptrtoint i64* %join_hash_tables to i64
  %5 = call i64 @hash_join_idx(i64 %4, i64 %3, i64 15, i64 52489999)
  %6 = icmp sge i64 %5, 0
  br i1 %6, label %loop_body, label %exit

loop_body:                                        ; preds = %singleton_true_
  %7 = call i64 @fixed_width_int_decode(i8* %col_buf1, i32 4, i64 %pos)
  %8 = trunc i64 %7 to i32
  %9 = getelementptr i8, i8* %literals, i16 0
  %10 = bitcast i8* %9 to i32*
  %11 = load i32, i32* %10
  %12 = call i8 @eq_int32_t_nullable_lhs(i32 %8, i32 %11, i64 -2147483648, i8 -128)
  %13 = icmp sgt i8 %12, 0
  %14 = and i1 true, %13
  %15 = call i64 @fixed_width_int_decode(i8* %col_buf2, i32 4, i64 %pos)
  %16 = trunc i64 %15 to i32
  %17 = getelementptr i8, i8* %literals, i16 4
  %18 = bitcast i8* %17 to i32*
  %19 = load i32, i32* %18
  %20 = call i8 @eq_int32_t_nullable_lhs(i32 %16, i32 %19, i64 -2147483648, i8 -128)
  %21 = icmp sgt i8 %20, 0
  %22 = and i1 %14, %21
  %23 = call i64 @fixed_width_int_decode(i8* %col_buf4, i32 4, i64 %5)
  %24 = trunc i64 %23 to i32
  %25 = getelementptr i8, i8* %literals, i16 8
  %26 = bitcast i8* %25 to i32*
  %27 = load i32, i32* %26
  %28 = call i8 @eq_int32_t_nullable_lhs(i32 %24, i32 %27, i64 -2147483648, i8 -128)
  %29 = icmp sgt i8 %28, 0
  %30 = and i1 %22, %29
  br i1 %30, label %filter_true, label %filter_false

filter_true:                                      ; preds = %loop_body
  %31 = call i64 @agg_count(i64* %out, i64 0)
  br label %filter_false

filter_false:                                     ; preds = %filter_true, %loop_body
  br label %exit
}

Couldn’t get the full explain for the second/third query since they throw an exception.

Let me know if there’s anything else I can do to help debug.


#4

Noted by @asuhan could be this is fixed by @vraj change https://github.com/mapd/mapd-core/commit/b51d58a5e2957b2917fe24d3154822781332ef80, as this no longer duplicates in master. Does your build include this fix?


#5

I launched from the MapD AMI on AWS with version 3.4.0, which I think includes that change?


#6

@browarnik we just released 3.5 this morning. Would you mind checking again?