IP Address/Blacklist Processing


#1

I’d like to check a steady stream of IP addresses against a table of “Blacklisted” IP sources. Would the Dictionary facility be a good approach for a task like this? Also, where should the procedural logic go, since there are no stored procedures in MapD? Perhaps just code it in Java?


#2

Hi @HokieKev -

Could you elaborate a bit more on the size of data that you are talking about? In general, using a hash table / dictionary is a good approach to a problem like this, but it doesn’t necessarily need MapD as part of the solution.
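To make the hash table idea concrete, here is a minimal Java sketch that needs no database at all. The class and method names (BlacklistChecker, toInt, isBlacklisted) are made up for illustration, it assumes IPv4 only, and the addresses come from the reserved documentation ranges. Each address is packed into a 32-bit int so that a blacklist of millions of rows stays compact in a HashSet.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: an in-memory blacklist lookup for IPv4 addresses.
final class BlacklistChecker {
    private final Set<Integer> blocked = new HashSet<>();

    BlacklistChecker(Iterable<String> blacklistedIps) {
        for (String ip : blacklistedIps) {
            blocked.add(toInt(ip));
        }
    }

    // Packs a dotted-quad IPv4 string into a 32-bit int (may be negative).
    static int toInt(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Average-case constant-time membership test.
    boolean isBlacklisted(String ip) {
        return blocked.contains(toInt(ip));
    }

    public static void main(String[] args) {
        BlacklistChecker checker =
            new BlacklistChecker(Arrays.asList("192.0.2.1", "198.51.100.7"));
        for (String ip : Arrays.asList("192.0.2.1", "203.0.113.9")) {
            System.out.println(ip + " blacklisted: " + checker.isBlacklisted(ip));
        }
    }
}
```

Whether this is enough on its own depends on the data volume and on whether you also want to query or visualize the matches afterwards.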

Is the idea that you would stream the data into MapD, then compare it against a table of known bad IPs? If so, you could do this as a capped table for the streaming data (to keep the data to a manageable, fast size in GPU RAM) and the bad IP addresses as another tiny table, then inner join.

This would let you build a real-time dashboard of requests from bad IP addresses.

Does this describe what you are looking to do?

Thanks,
Randy


#3

Randy,

The Blacklist could be millions of rows. The incoming stream of IPs I need to compare against is also very large - but I love the idea of the capped table. I was not familiar with that capability so I really appreciate you pointing it out to me.

Kevin


#4

The parameter is called max_rows, which you set in the WITH clause of your CREATE TABLE statement. Then, as you keep pumping data in, MapD drops the oldest table fragment once the limit is reached.

We just updated our docs for 4.0, so unfortunately I can’t link you to an example right this second, but we’re working on getting everything straightened out.
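In the meantime, here is a rough sketch of what the capped table, the blacklist table, and the inner join might look like, written as a small Java helper that builds the SQL strings, since Java was mentioned as the client language. The table and column names (ip_requests, ip_blacklist, src_ip, bad_ip) are invented for illustration, and storing the addresses as BIGINT values is an assumption, not something the thread specifies. Only max_rows comes from the discussion above.

```java
// Hypothetical helper that assembles the SQL for a capped streaming table,
// a blacklist table, and the join between them. All names are illustrative.
final class CappedTableSql {

    // Streaming table capped at maxRows; the oldest fragment is dropped
    // as new data arrives beyond the cap.
    static String createRequestsTable(long maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        return "CREATE TABLE ip_requests (\n"
             + "  request_time TIMESTAMP,\n"
             + "  src_ip BIGINT\n"
             + ") WITH (max_rows = " + maxRows + ");";
    }

    // Table of known bad addresses, stored in the same numeric form.
    static final String CREATE_BLACKLIST_TABLE =
        "CREATE TABLE ip_blacklist (\n"
      + "  bad_ip BIGINT\n"
      + ");";

    // Inner join: keeps only requests whose source address is blacklisted.
    static final String BAD_REQUESTS_QUERY =
        "SELECT r.request_time, r.src_ip\n"
      + "FROM ip_requests r\n"
      + "JOIN ip_blacklist b ON r.src_ip = b.bad_ip;";

    public static void main(String[] args) {
        System.out.println(createRequestsTable(100_000_000L));
        System.out.println(CREATE_BLACKLIST_TABLE);
        System.out.println(BAD_REQUESTS_QUERY);
    }
}
```

The cap of 100 million rows in the demo is an arbitrary placeholder; pick a value that fits your GPU RAM and the time window you care about.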


#5

Hello @HokieKev. The docs have been updated. Here’s a link to the TABLE section of the DDL.