Wrong result in 3.1.01


#1

Hi,

This is what happened. I had a table for which I generated all the data myself, and it worked fine. Then I hit the glibc SEGV issue and couldn’t start up MapD. I downloaded the new CE version 3.1.01 and connected to the same table. I saw immediately that the aggregation result is wrong.

Here is the evidence:

mapdql> select avg(amount), max(amount) from shopping;
EXPR$0|EXPR$1
821744868148.119629|500.000000
mapdql>

If the MAX(amount) is 500, the AVG(amount) can’t be 821,744,868,148.119629.

AVG() can’t be more than MAX().

Any ideas please?

Cheers,


#2

Hi,

I suspect you are running into an overflow issue here. We have been doing some work in that area to tighten things up.

We will need a little more info to fix it.

What is the type of amount?
How many rows in the table?
What is the min(amount)?

regards


#3

Hi,

The data type is defined as FLOAT, the min(amount)=1, and there are 124,999,990 rows.

It seems this was introduced in 3.1.01, the release that fixed the SEGV/glibc issue. I hope that helps you narrow down the area.
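
The figures reported so far can be checked against the basic bound that a true average must lie between the minimum and the maximum. Below is a minimal Python sketch that uses only the numbers quoted in this thread; the helper name `avg_is_plausible` is hypothetical and not part of MapD.

```python
# Figures quoted in this thread for the shopping table.
ROW_COUNT = 124_999_990
MIN_AMOUNT = 1.0
MAX_AMOUNT = 500.0
REPORTED_AVG = 821744868148.119629


def avg_is_plausible(avg, lo, hi):
    """Hypothetical helper: True if avg lies within [lo, hi], as any true mean must."""
    return lo <= avg <= hi


# The largest sum the column could hold if every row were MAX_AMOUNT.
max_possible_sum = MAX_AMOUNT * ROW_COUNT
# The sum that the reported average would imply for this row count.
implied_sum = REPORTED_AVG * ROW_COUNT

print("reported AVG is plausible:", avg_is_plausible(REPORTED_AVG, MIN_AMOUNT, MAX_AMOUNT))
print("implied sum exceeds largest possible sum:", implied_sum > max_possible_sum)
```

The same check applies to any table: if AVG falls outside the range from MIN to MAX, either the aggregate or the stored data is wrong.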

Cheers,


#4

Hi

I am going to need some more info, as everything seems to be fine in my testing.

Here is what I am doing.

Create a data set:

time awk 'BEGIN { for (j = 0; j <= 250000; j++) for (i = 1; i <= 500; ++i) print i }' > test1.csv

Create the table, load the data, and test:

bin/mapdql -p HyperInteractive
User mapd connected to database mapd
mapdql> create table test1 (f1 float);
mapdql> copy test1 from '/data/test1.csv' with (header='false');
Result
Loaded: 125000500 recs, Rejected: 0 recs in 10.535000 secs
mapdql> select max(f1) from test1;
EXPR$0
500.000000
mapdql> select min(f1) from test1;
EXPR$0
1.000000
mapdql> select avg(f1) from test1;
EXPR$0
250.500806
mapdql> \version
MapD Server Version: 3.1.01-20170622-69314c5
mapdql>
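
For reference, the expected row count and average of this generated data set can be worked out without MapD. The sketch below mirrors the loop bounds of the awk command above (j from 0 to 250000 inclusive, i from 1 to 500) and uses exact arithmetic; the variable names are illustrative only.

```python
from fractions import Fraction

# Loop bounds copied from the awk command: j = 0..250000, i = 1..500.
outer_iterations = 250000 + 1
values = range(1, 501)

row_count = outer_iterations * len(values)
total = outer_iterations * sum(values)

# Exact mean as a fraction, so no floating-point rounding is involved.
exact_mean = Fraction(total, row_count)

print(row_count)          # 125000500, matching the "Loaded" line
print(float(exact_mean))  # 250.5
```

The exact average of the data is therefore 250.5, and MIN and MAX are 1 and 500 by construction.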

Is it possible for you to run the above steps and show the output?

Is there anything obviously different in your data?

regards


#5

Hi,

Sorry, I don’t have a MapD at the moment.

To reproduce it, can you use version 3.0 to generate the data, then switch to 3.1 to run the query. This is how I got in trouble.

Cheers,


#6

Hi,

I tried this by creating the DB with v3.0.0 CE:

bin/mapdql -p HyperInteractive
User mapd connected to database mapd
mapdql> \version
MapD Server Version: 3.0.0-20170507-7626e30
mapdql> create table test1 (f1 float);
mapdql> copy test1 from '/data/test1.csv' with (header='false');
Result
Loaded: 125000500 recs, Rejected: 0 recs in 7.969000 secs
mapdql> select max(f1) from test1;
EXPR$0
500.000000
mapdql> select min(f1) from test1;
EXPR$0
1.000000
mapdql> select avg(f1) from test1;
EXPR$0
250.500000
mapdql> \version
MapD Server Version: 3.0.0-20170507-7626e30
mapdql>

Then I stopped v3.0.0.

Then I started v3.1.01 on the same data directory:

bin/mapdql -p HyperInteractive
User mapd connected to database mapd
mapdql> \version
MapD Server Version: 3.1.01-20170622-69314c5
mapdql> select max(f1) from test1;
EXPR$0
500.000000
mapdql> select min(f1) from test1;
EXPR$0
1.000000
mapdql> select avg(f1) from test1;
EXPR$0
250.498086
mapdql> \version
MapD Server Version: 3.1.01-20170622-69314c5
mapdql>
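
The AVG results shown in this thread for the test data (250.500806, 250.500000 and 250.498086) differ slightly from one another, and the thread does not explain why. One possible cause, offered purely as an assumption and not as a description of how MapD computes AVG, is that summing many FLOAT values in single precision drops low-order digits. A minimal Python sketch of that effect on a scaled-down data set follows; the helpers `to_f32`, `mean_f32`, `mean_f64` and `repeated_values` are hypothetical.

```python
import struct


def to_f32(x):
    """Hypothetical helper: round a Python float to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mean_f32(values):
    """Average values while keeping the running sum in single precision."""
    total = 0.0
    count = 0
    for v in values:
        total = to_f32(total + v)  # round the sum after every addition
        count += 1
    return total / count


def mean_f64(values):
    """Average values with a double-precision running sum."""
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
    return total / count


def repeated_values(repeats):
    """Yield 1..500 the given number of times, like the awk loop but smaller."""
    for _ in range(repeats):
        for i in range(1, 501):
            yield float(i)


# Assumption: scaled down from 250001 repeats so the pure-Python loop runs quickly.
REPEATS = 1000

print("double-precision sum:", mean_f64(repeated_values(REPEATS)))
print("single-precision sum:", mean_f32(repeated_values(REPEATS)))
```

Any difference between the two printed values comes only from rounding of the running sum. Deviations of this kind stay close to the true mean, so they are a different matter from an average that is billions of times larger than MAX.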

I could not reproduce your issue.

Any additional info would be useful.

regards


#7

Hi,

I guess the data was somehow damaged by the SEGV bug, so you can’t easily reproduce it. I’ll rebuild the table; it is a demo table.

Cheers,