Join + group by returns 1 line incorrect results [EDIT: results correct]


#1

Loaded the sample dataset and did this join.

mapdql> select a.carrier_name,count(*) from flights_2008_10k a, flights_2008_7M b where a.carrier_name=b.carrier_name group by a.carrier_name;
carrier_name|EXPR$1
Southwest Airlines|12017540000
mapdql>
mapdql> \version
MapD Server Version: 3.3.0-20171031-ed76a7a
mapdql>

mapdql> select a.carrier_name,count(*) from flights_2008_10k a, flights_2008_7M b where a.carrier_name=b.carrier_name and a.carrier_name not in (‘Southwest Airlines’) group by a.carrier_name;
No rows returned.
mapdql>


#4

Hi,

The flights_2008_10k dataset has 10K records ALL containing Southwest Airlines

The flights_2008_7M dataset has 7M records, of which 1201754 contain Southwest Airlines

mapdql> select count(*) from flights_2008_7m where carrier_name='Southwest Airlines';
EXPR$0
1201754
mapdql>

So the expected result for your query would be 12017540000 which appears to be what you are getting. What are you seeing as wrong here?

regards


#5

doh…I think I knew that…and forgot… ;-). Thank you for the quick reply.


#6

Thanks for the quick response…

I am playing with EE given to me by Mike .

He’s out of the country.

Do you know where I can get details questions answered on how to setup the Distributed environment?

The documentation was only 1 page.

Hung