Thrift error while copying data from a csv file


#1

Hi,

I am trying to copy data into MapD from a bunch of .csv files. I get the following error when I run the COPY command:

mapdql> COPY TABLE_NAME from '/opt/mapd/mydatafolder/part-r-1*';
Thrift: Wed Aug 23 19:42:34 2017 TSocket::write_partial() send() <Host: localhost Port: 9091>Broken pipe
Thrift: Wed Aug 23 19:42:34 2017 TSocket::open() connect() <Host: localhost Port: 9091>Connection refused
Thrift: Wed Aug 23 19:42:34 2017 TSocket::open() connect() <Host: localhost Port: 9091>Connection refused
Thrift error: connect() failed: Connection refused

This command successfully copied data for a previous batch, but it fails now. I would appreciate any help! Thanks.


#2

Hi,

The broken pipe means the connection to the server was lost. This normally indicates that the server itself has crashed, which would also explain the "Connection refused" errors that follow.

Please send the contents of <MAPDHOME>/data/mapd_log/mapd_server.INFO so we can review what the real error was.

Regards


#3

Thanks for your response! I have installed MapD on a GCP instance running CentOS 7, using the instructions on your site. Where is the /data folder located? I am in the /opt/mapd/ folder and do not see any logs.

Thanks and appreciate your patience!


#4

Hi

Where did you specify your MapD data directory to be?

How are you starting MapD?

regards.


#5

Hi,

I found the log folder. I had used '/var/lib/mapd' as my storage folder.

I see the following message in the mapd_server.FATAL log file:

F0823 19:42:31.279278 6804 StringDictionary.cpp:456] Maximum number (1073741824) of Dictionary encoded Strings reached for this column, offset path for column is /var/lib/mapd/data/mapd_data/DB_1_DICT_1/DictOffsets

Does this somehow crash the server?

Thanks.


#6

Hi,

There is a limit of roughly 1 billion unique strings per dictionary (1,073,741,824, i.e. 2^30, as shown in your log).

The server will stop if you try to load a new unique string once that limit has been reached.

All text fields are dictionary encoded by default. You must have a field that contains almost entirely unique text, i.e. each string occurs only once in your data. You should set the datatype for that field to TEXT ENCODING NONE so MapD does not attempt to dictionary encode that data.
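
If it is not obvious which field is the culprit, a rough check along the following lines may help. This is only a sketch, not a MapD tool: the function names, the 0.9 ratio and the comma delimiter are assumptions made for illustration. It keeps every distinct value in memory, so it should be run on a small sample of the files rather than the full data set.

```python
import csv
import glob
from collections import defaultdict


def column_cardinality(pattern, delimiter=","):
    """Return (row_count, {column_index: distinct_value_count}) for matching files.

    Columns are identified by position, so a header row (if any) is simply
    counted as one more data row.
    """
    seen = defaultdict(set)
    rows = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.reader(handle, delimiter=delimiter):
                rows += 1
                for index, value in enumerate(row):
                    seen[index].add(value)
    return rows, {index: len(values) for index, values in seen.items()}


def mostly_unique_columns(rows, counts, ratio=0.9):
    """List column indexes whose distinct count is at least `ratio` of the row count."""
    return [
        index
        for index, count in sorted(counts.items())
        if rows and count >= ratio * rows
    ]


if __name__ == "__main__":
    # Hypothetical sample path; point this at a small subset of your files.
    rows, counts = column_cardinality("/opt/mapd/mydatafolder/part-r-1*")
    print("rows:", rows)
    print("distinct values per column:", counts)
    print("candidates for TEXT ENCODING NONE:", mostly_unique_columns(rows, counts))
```

Any column index it reports is one where nearly every row holds a different string, which makes it a likely candidate for TEXT ENCODING NONE in the table definition.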

regards


#7

That worked perfectly! Many thanks for your help.