MapD Core taking a long time to get catalog data for delete and truncate operations


From the MapD logs, we can see that it is taking about 3-4 minutes to get the catalog data and several more minutes for other activities. We have a single node with 4 P100 GPUs and 60 CPU cores. The following is a snapshot of the log:

I1221 00:37:20.778688 45929 MapDHandler.cpp:568] sql_execute :A1o2n4PoHEkJ7ApmVgGWIA4Aqi2dHXUH:query_str:drop table store_scan_dly;
I1221 00:37:20.778934 45929 MapDHandler.cpp:3256] passing query to legacy processor
I1221 00:40:24.928164 45929 FileMgr.cpp:173] Completed Reading table's file metadata, Elasped time : 184129ms Epoch: 1035247 files read: 10130 table location: '/var/lib/mapd/data/mapd_data/table_3_12/'
I1221 00:42:32.940171 45929 Catalog.cpp:1525] Instantiating Fragmenter for table store_scan_dly_shard_#1 took 312141ms
I1221 00:52:47.649708 45929 Calcite.cpp:179] Time to updateMetadata 1 (ms)
I1221 00:53:10.694839 45929 FileMgr.cpp:173] Completed Reading table's file metadata, Elasped time : 23044ms Epoch: 1035813 files read: 10135 table location: '/var/lib/mapd/data/mapd_data/table_3_13/'
I1221 00:55:16.718389 45929 Catalog.cpp:1525] Instantiating Fragmenter for table store_scan_dly_shard_#2 took 149068ms

The drop table is still in progress as of this posting…

There is nothing suspicious in the ERROR/WARNING logs.

We also saw a similar issue when we tried to truncate the table.
Does anyone have insight into this issue/behaviour?



Your issue is the large number of epochs being created.

MapD stores metadata per epoch and records state for every epoch that has ever existed.

In your case, in the log above, I see:

I1221 00:40:24.928164 45929 FileMgr.cpp:173] Completed Reading table's file metadata, Elasped time : 184129ms Epoch: 1035247 files read: 10130 table location: '/var/lib/mapd/data/mapd_data/table_3_12/'

Over 1M epochs per shard file. So 2M epochs overall.

An epoch per shard is normally created once per COPY FROM or INSERT, so this is basically telling us that you have done over 1M appends to this table. Each of these epochs carries a significant overhead on the filesystem: in the log line above, reading the metadata for 10,130 files took 184,129 ms, roughly 18 ms per file, for a single shard. DROP and TRUNCATE then have to remove over 10,000 files per shard from the filesystem, so depending on what filesystem you are actually running on, this can take some time.

We do not recommend singleton INSERTs into tables and would normally expect a COPY FROM command to load a large number of records at once.
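To make that concrete, here is a minimal sketch (the file path and column values below are hypothetical, purely for illustration): each statement advances the table's epoch once per shard, so row-at-a-time INSERTs create one epoch per row, while a single COPY FROM creates one epoch for the entire file.

-- Singleton inserts: one epoch per statement, so millions of rows means millions of epochs
INSERT INTO store_scan_dly VALUES (1001, '2017-12-20', 42.0);
INSERT INTO store_scan_dly VALUES (1002, '2017-12-20', 17.5);

-- Bulk load: one epoch for the whole file, however many rows it contains
COPY store_scan_dly FROM '/staging/store_scan_dly_2017-12-20.csv';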

Can you describe a little about how you are loading the data?

It would be possible for us to 'clean up' the older epochs with a code change, and this is something on our roadmap, but it would be good to understand your use case to see if there is something we can streamline on your ingestion side to make all of this work a little more smoothly.



Thanks @dwayneberry, that makes sense. I have seen that tables I duplicated with COPY in MapD were deleted in a flash, but the ones populated with sqoop took forever. It looks like the issue is bigger than that: my session ended while the delete was in progress on the table, and since then every other query hangs on the MapD core. We had to restart the mapd service so that other queries could work; however, after the restart, if I query the table on which I attempted the delete, MapD hangs the rest of the queries again :frowning:

We use sqoop to transfer the data from Hadoop/Hive, and the following is the sqoop command I am running. We need to replace a few dim tables daily, each ~60M records in size. Loading from a CSV file is not an option, so please let me know if there is a way to tune this load.

sqoop export -Dsqoop.export.statements.per.transaction=100000 -Dsqoop.export.records.per.statement=100000 --connect "" \
--driver com.mapd.jdbc.MapDDriver --username xxxxxx --password xxxxxxx! \
--table store_scan_dly1 --export-dir /user/hive/warehouse/scan_dly/* \
--input-fields-terminated-by '\001' --lines-terminated-by '\n' \
--input-null-non-string '\N' --input-null-string '\N' \
--batch -m 10



Sqoop should be fine; it should use the bulk load path.

Can you describe your ingestion process in a little more depth? Do you truncate or drop the table each day, or just append another 60M records each day?
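For example, the two daily patterns I have in mind look roughly like this, with COPY FROM standing in for whatever load mechanism you actually use and the staging file paths purely hypothetical:

-- Pattern A: wipe the table each day, then reload the full set
TRUNCATE TABLE store_scan_dly;
COPY store_scan_dly FROM '/staging/store_scan_dly_full.csv';

-- Pattern B: keep existing data and append only the new day's records
COPY store_scan_dly FROM '/staging/store_scan_dly_delta.csv';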

If you could share a log from when the sqoop job is running, that would be useful, just to confirm it is not using INSERT.

What filesystem is your data on? What kind of devices? What kind of speed do you expect from the underlying filesystem? Could you share the output of sudo hdparm -tT /dev/<your dev>, just for a ballpark?