Column is not being grouped


#1

Hi all,

I am new to the MapD community and my SQL is a bit rusty.

I am having trouble with a GROUP BY despite using a MIN aggregator.

create table sample (A INT, B INT, D INT);
insert into sample values (1, 42, 20);
insert into sample values (1, 43, 10);
insert into sample values (2, 44, 50);
insert into sample values (2, 45, 25);
select A, B, min(D) from sample group by A;

I was expecting:
1, 43, 10
2, 45, 25

But I get this error message instead:
Exception: Exception occurred: org.apache.calcite.runtime.CalciteContextException: From line 1, column 13 to line 1, column 15: Expression ‘B’ is not being grouped

I am using MapD Cloud btw.

Any help or pointer would be welcome.

Thank you,
Yacine


#2

Hi,

All columns in your query must either be aggregated (MIN, MAX, SUM, etc.) or be part of the GROUP BY clause.

So your options are either

select A, B, min(D) from sample group by A, B;

or

select A, min(B), min(D) from sample group by A;
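
For illustration, with your sample data the first option returns all four rows, since every (A, B) pair is distinct, while the second pairs the smallest B in each group with the smallest D, even when those values come from different rows. Roughly (row order may vary):

select A, B, min(D) from sample group by A, B;
1|42|20
1|43|10
2|44|50
2|45|25

select A, min(B), min(D) from sample group by A;
1|42|10
2|44|25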

Regards


#4

Many thanks for this, @dwayneberry!

I am actually trying to select, for each distinct A, the row with the minimum D (along with that row’s B).

The result I am seeking is:
1, 43, 10
2, 45, 25

Unfortunately “select A, B, min(D) from sample group by A, B” returns all four rows, since every (A, B) pair is distinct.
And “select A, min(B), min(D) from sample group by A” mixes values from different rows.

Maybe what I am trying to achieve is not possible with GROUP BY alone?

Thanks again for your help!
Yacine


#5

Hi,

Your expected result suggests you are looking for

select A, max(B), min(D) from sample group by A;

but I am basing that purely on the expected result, as I don’t know which aggregate you actually want on the B column.

eg

mapdql> create table sample (A INT, B INT, D INT);
mapdql> insert into sample values (1, 42, 20);
mapdql> insert into sample values (1, 43, 10);
mapdql> insert into sample values (2, 44, 50);
mapdql> insert into sample values (2, 45, 25);
mapdql> select A, max(B), min(D) from sample group by A;
A|EXPR$1|EXPR$2
1|43|10
2|45|25
mapdql> 
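
If you specifically need the B that sits on the same row as the per-group minimum of D (rather than an independent aggregate of B), the usual standard-SQL pattern is to join back against the per-group minimum. A sketch, assuming your MapD version supports subqueries in the FROM clause (note that ties on D would return multiple rows per group):

select s.A, s.B, s.D
from sample s
join (select A, min(D) as min_d from sample group by A) m
  on s.A = m.A and s.D = m.min_d;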

Regards


#6

Thank you @dwayneberry

I was trying to do something similar, but I now realise that GROUP BY is strict: every column must appear either in the GROUP BY clause or inside an aggregate function.

I will go back to the drawing board :)

Thanks again,
Yacine


#7

Hi @ysaidji,

Could you just take the max of B? We may have a SAMPLE aggregate soon that will simply take a random value from a column.
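
Once available, that would hypothetically let you write something like the following (illustrative syntax only; note that SAMPLE would return an arbitrary B from each group, not necessarily the one paired with the minimum D):

select A, SAMPLE(B), min(D) from sample group by A;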

Regards


#8

@ysaidji did that suggestion work out for you?


#9

Hi @ysaidji,

I’m happy to report that we now have a LAST_SAMPLE aggregate in the product that does just this. See the relevant docs page for more info!
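
Against the table above, the query would look something like this (a sketch; check the docs for the exact semantics of LAST_SAMPLE):

select A, LAST_SAMPLE(B), min(D) from sample group by A;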

Regards