Writing from Databricks to MapD Community (AWS)


#1

Hi there, sorry for the newbie question! I’m trying to write a table from Databricks to MapD. I’m using MapD’s prebuilt platform (https://aws.amazon.com/marketplace/pp/B071H71L2Y?qid=1504819898310&sr=0-2&ref_=srh_res_product_title)

I’ve so far been uploading CSVs, but was hoping I could get my ETL to write directly to MapD’s AWS cluster. Is that possible? Are there any examples on how to set that up?

Thanks!!


#2

%python
uncomp_venues.write.format("jdbc") \
    .option("url", "jdbc:mapd:localhost:8443:mapd:http") \
    .option("driver", "com.mapd.jdbc.mapDDriver") \
    .option("dbtable", "venues") \
    .option("user", "mapd") \
    .option("password", "i-OBFUSCATED") \
    .mode("overwrite") \
    .save()

I tried the above code but got this error:

java.lang.ClassNotFoundException: com.mapd.jdbc.mapDDriver
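As an aside for readers: the connection string in the snippet above follows the `jdbc:mapd:<host>:<port>:<db>[:<protocol>]` shape. A small helper like the one below makes the pieces explicit; the function name and defaults are illustrative, not part of any MapD API.

```python
# Sketch (hypothetical helper): assemble a MapD JDBC URL in the
# jdbc:mapd:<host>:<port>:<db>[:<protocol>] shape used in the post above.
def mapd_jdbc_url(host, port, db, protocol=None):
    parts = ["jdbc:mapd", host, str(port), db]
    if protocol:  # e.g. "http" when connecting over the web port, as above
        parts.append(protocol)
    return ":".join(parts)

print(mapd_jdbc_url("localhost", 8443, "mapd", "http"))
```

Note that the URL alone is not enough: the driver class named in `.option("driver", ...)` must also be loadable, which is what the `ClassNotFoundException` below is about.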


#3

Hi,

I edited your post to obfuscate your password.

Do you have the JDBC driver on the classpath? You would normally see that error if it is missing.

regards


#4

Thanks!

Honestly, I’m not sure what that means :flushed:


#5

Hi,

As far as the obfuscation goes: You had your real password in the text of the post. I just removed it.

Classpath issue: you need to make sure the jar shipped in the delivered bin directory, mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar, is on the classpath of your job.
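For a plain `spark-submit` job (outside Databricks), getting the jar onto both the driver and executor classpaths would look roughly like this. This is a sketch: `<MapDHome>` and `my_job.py` are placeholders, not real paths.

```shell
# Hedged sketch: pass the MapD JDBC jar to driver and executors.
# <MapDHome> is wherever MapD is installed; my_job.py is your application.
JAR=<MapDHome>/bin/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar

spark-submit \
  --jars "$JAR" \
  --driver-class-path "$JAR" \
  --conf spark.executor.extraClassPath="$JAR" \
  my_job.py
```

On Databricks the equivalent is attaching the jar to the cluster as a Library rather than setting these flags by hand.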

regards


#6

We love newbie questions! --pd :slight_smile:


#7

Thank you so much, guys. I added this line to my Databricks notebook (and ran it) but still got the same error:

%python
spark.extraClassPath = 'com.mapd.jdbc.mapDDriver'

I then added a Library with the file "mapdjdbc_1_0_SNAPSHOT_jar_with_dependencies.jar" and set it to auto-attach to all clusters, but that still didn't work.

Any ideas? I feel like I’m missing something simple.

Thank you for hiding my password :flushed:

Maybe I shouldn’t be bothering you guys and should instead hit up the Databricks forums? I would just need to ask about adding a Jar to the classpath, correct?


#8

Hi,

spark.extraClassPath = 'com.mapd.jdbc.mapDDriver'

Should probably be something like

spark.extraClassPath = '<MapDHome>/bin/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar'

regards


#9

Thank you for the reply.

I tried running the Spark code in three ways (shown below) but am still getting the same error.

./bin/spark-submit --class Hive2Mapd hive_2.11-2.0.jar --conf "spark.driver.extraClassPath=XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar"

./bin/spark-submit --jars XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --driver-class-path XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --class Hive2Mapd XXX/MapD/aster2hive_2.11-2.0.jar

./bin/spark-submit \
  --driver-class-path XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --class XXX/aster2hive_2.11-2.0.jar

Please advise.
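Editor's note on the commands above: `spark-submit` treats everything after the application jar as arguments to the application, so in the first command the `--conf` flag never reaches Spark at all, and in the third command `--class` is given a jar path where a class name belongs. A corrected sketch, keeping the post's `XXX` placeholders:

```shell
# Sketch: all spark-submit flags must come BEFORE the application jar;
# anything after the jar is passed to the application itself.
# XXX paths are the placeholders from the post above.
./bin/spark-submit \
  --class Hive2Mapd \
  --jars XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --driver-class-path XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=XXX/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar \
  XXX/MapD/aster2hive_2.11-2.0.jar
```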


#10

Hi,

Please see the response to the same question, with more detail, here: ClassNotFoundException for JDBC driver

regards