How to read logged encoding error


#1

I’m seeing errors like the following:

785 FixedLengthEncoder.h:43] Fixed encoding failed, Unencoded: 253402214400 encoded: -856064

However, the unencoded value doesn’t exist in the data I’m importing.

Is there a way to determine which values are falling outside the fixed length or at least know which column has an encoding that’s too short?


#2

Hi,

Sorry about the issue. Yes, the message is not very helpful.

Is it possible for you to supply the schema and a sample file of the data that is causing this issue for us to investigate?

regards


#3

That will be difficult, as it’s a very wide table and a very large dataset. I’m not sure where in the CSV the offending value is located, so I’ll have to try various subsets of the file and see whether they trigger the same warning.
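
As an alternative to bisecting the file by hand, here is a rough Python sketch that scans a CSV for values that would not fit a given signed bit width. Everything in it is an assumption for illustration: the helper names (fits, date_to_epoch, find_overflows), the YYYY-MM-DD date format, the column-to-width mapping, and the sample rows are all made up, and it is not a MapD tool.

```python
import csv
import io
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def fits(value: int, bits: int) -> bool:
    """True if value fits in a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def date_to_epoch(text: str) -> int:
    """Parse a YYYY-MM-DD string (assumed format) into epoch seconds, UTC."""
    parsed = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int((parsed - EPOCH).total_seconds())

def find_overflows(lines, checks):
    """Yield (row_number, column_index, raw_value) for out-of-range fields.

    checks maps a column index to (converter, bits); both are guesses you
    supply from your own schema.
    """
    for row_number, row in enumerate(csv.reader(lines), start=1):
        for column, (convert, bits) in checks.items():
            try:
                value = convert(row[column])
            except (ValueError, IndexError):
                continue  # unparseable or missing fields are a separate problem
            if not fits(value, bits):
                yield row_number, column, row[column]

# Made-up sample: column 0 checked as 8-bit, column 1 as a date in 32 bits.
sample = io.StringIO(
    "300,2017-10-04\n"
    "30,9999-12-31\n"
    "3,2038-01-19\n"
    "3000,1999-01-01\n"
)
hits = list(find_overflows(sample, {0: (int, 8), 1: (date_to_epoch, 32)}))
for hit in hits:
    print(hit)
```

For a real file, pass the open file object instead of the StringIO sample and list only the columns you suspect.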


#4

Hi,

Just to try to clear up what it is saying.

Somewhere in your schema you have a column that uses fixed-length encoding. This can be declared explicitly in the schema, e.g.
S1 SMALLINT ENCODING FIXED(8)
or it might be applied automatically, as with a DATE or TIMESTAMP field.

The error message is saying that a value you are trying to store is too large for the field. Unfortunately, the value shown in the message may not be directly comparable to anything in your input file.

I could force a message like this:

Input file badi.csv contains:

300
30
3
3000

Setup:

mapdql> create table ii (i1 smallint encoding fixed(8));
mapdql> copy ii from '~/Datasets/epoch/badi.csv' with (header='false');
Result
Loaded: 4 recs, Rejected: 0 recs in 0.085000 secs
mapdql> select * from ii;
i1
44
30
3
-72
mapdql>

In the log I see:

E1004 15:49:13.584585   858 FixedLengthEncoder.h:43] Fixed encoding failed, Unencoded: 300 encoded: 44
E1004 15:49:13.584662   858 FixedLengthEncoder.h:43] Fixed encoding failed, Unencoded: 3000 encoded: -72
I1004 15:49:13.669850   858 ParserNode.cpp:2415] Loaded: 4 recs, Rejected: 0 recs in 0.085000 secs

The 300 and 3000 values are outside the range that can be stored in a SMALLINT ENCODING FIXED(8) column (-128 to 127), so they were stored as 44 and -72 instead.
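
To show where 44 and -72 come from, here is a minimal Python sketch. It assumes the encoder simply keeps the low 8 bits and reads them back as a signed integer. That assumption reproduces the values in the log above, but wrap_signed is a made-up helper, not the actual code in FixedLengthEncoder.h.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits of value and reinterpret them as signed."""
    low = value & ((1 << bits) - 1)   # drop everything above the low bits
    if low >= 1 << (bits - 1):        # top bit set means a negative number
        low -= 1 << bits
    return low

# The four values from the sample input file, stored in 8 bits.
for unencoded in (300, 30, 3, 3000):
    print(unencoded, "->", wrap_signed(unencoded, 8))
# 300 -> 44
# 30 -> 30
# 3 -> 3
# 3000 -> -72
```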

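In this case the unencoded value is readable, but for a DATE or TIMESTAMP field it would not be, since it is logged as a raw number (like the 253402214400 in your message) rather than as the date text in your file.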

regards


#5

You’re right, it was a date field.

The timestamp value 253402214400 is 12/31/9999 @ 12:00am (UTC)
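
For anyone else who lands here with the same log line, this short Python sketch shows how the number can be checked. It treats the logged "Unencoded" value as seconds since 1970-01-01 UTC. It also wraps that value to 32 signed bits, which happens to reproduce the "encoded" number from my log. The 32-bit width is inferred from that arithmetic, not confirmed from the MapD source, and the helper names are made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_to_utc(seconds: int) -> datetime:
    """Convert epoch seconds to a UTC datetime without platform time calls."""
    return EPOCH + timedelta(seconds=seconds)

def wrap_signed_32(value: int) -> int:
    """Keep the low 32 bits of value and reinterpret them as signed."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low

logged = 253402214400  # the "Unencoded" value from the log line
print(epoch_to_utc(logged).isoformat())  # 9999-12-31T00:00:00+00:00
print(wrap_signed_32(logged))            # -856064
```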

Thank you for the quick responses! You guys are super helpful!