Spatial intersections


#1

Hi, I’m looking to build a system that requires the intersection of arbitrary polygons with points (possibly other geometry types down the line). What is the recommended way to do this?

Thanks!


#2

Hi,

You may have to share a little more detail about what you are trying to do.

This forum post may address your question?

We are currently adding more geo functions to the product, so your use case and high-priority ‘wants’ would help us identify where best to focus our energies first.

regards


#3

Thanks for the quick response! I saw that one, but as mentioned it is a rather large and complex query so I was hoping to find something automated or at least a little easier to understand before diving in and doing it myself. The core of what I’d like to do is:

  1. Create a table with latitude/longitude points tied to an ID
  2. Run a query that filters down to rows that are within a dynamically defined polygon
  3. Join the selected rows with other tables.

In PostGIS this is st_intersects(). Should also mention that the polygons are small enough that projection doesn’t matter.
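The three steps above can be sketched in plain Python to make the intent concrete. This is only an illustration of the logic, not MapD or PostGIS code: the `points` and `names` tables, their contents, and the `point_in_polygon` helper are all made up for the example, and the test is a standard planar ray-casting check (reasonable here because the polygons are small enough that projection doesn’t matter).

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat), treated as planar coordinates


def point_in_polygon(pt: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting (even-odd) test. Points exactly on an edge are not treated specially."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Step 1: hypothetical points table, id -> (lon, lat)
points: Dict[int, Point] = {1: (0.5, 0.5), 2: (3.0, 3.0), 3: (1.5, 1.0)}
# Hypothetical second table, joined on id
names: Dict[int, str] = {1: "a", 2: "b", 3: "c"}

# Step 2: filter down to rows inside a dynamically defined polygon
polygon: List[Point] = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
selected = [pid for pid, pt in points.items() if point_in_polygon(pt, polygon)]

# Step 3: join the selected rows with the other table
joined = [(pid, names[pid]) for pid in selected]
```

In a database the filter and the join would of course be a single SQL statement; the point here is just what the filter has to compute per row.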


#4

Hi,

Yeah, it’s not pretty, but that is what that big ugly query does.

I suspect wrapping that ear-clipping query generation will be one of the first items we add. Keep an eye on the release notes of upcoming releases to see when it lands.
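For anyone wondering what ear-clipping query generation could look like, here is a rough Python sketch. It is not the query from the linked post and not anything MapD ships. It assumes a simple polygon (no holes, no self-intersections, no collinear consecutive vertices), and the function names and the `lon`/`lat` column names are placeholders chosen for the example. The idea is to cut the polygon into triangles, then emit one point-in-triangle test per triangle and OR them together.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies inside or on the counter-clockwise triangle abc."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0


def ear_clip(ring: Sequence[Point]) -> List[Triangle]:
    """Triangulate a simple polygon (open ring, no repeated last vertex)."""
    pts = list(ring)
    n = len(pts)
    # Twice the signed area (shoelace formula); negative means clockwise.
    area2 = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
                for i in range(n))
    if area2 < 0:
        pts.reverse()  # normalise to counter-clockwise
    triangles: List[Triangle] = []
    while len(pts) > 3:
        n = len(pts)
        for i in range(n):
            a, b, c = pts[i - 1], pts[i], pts[(i + 1) % n]
            if cross(a, b, c) <= 0:
                continue  # reflex or degenerate vertex, not an ear tip
            if any(in_triangle(p, a, b, c) for p in pts if p not in (a, b, c)):
                continue  # another vertex lies in the candidate ear
            triangles.append((a, b, c))
            del pts[i]  # clip the ear
            break
        else:
            raise ValueError("no ear found; is the polygon simple?")
    triangles.append((pts[0], pts[1], pts[2]))
    return triangles


def triangle_sql(tri: Triangle, x: str = "lon", y: str = "lat") -> str:
    """SQL predicate: the row's (x, y) is inside or on the CCW triangle."""
    def edge(p: Point, q: Point) -> str:
        dx, dy = q[0] - p[0], q[1] - p[1]
        # Same cross product as cross(p, q, row), with the row as columns.
        return f"(({dx}) * ({y} - ({p[1]})) - ({dy}) * ({x} - ({p[0]})) >= 0)"
    a, b, c = tri
    return "(" + " AND ".join([edge(a, b), edge(b, c), edge(c, a)]) + ")"


def polygon_sql(ring: Sequence[Point], x: str = "lon", y: str = "lat") -> str:
    """OR together one point-in-triangle predicate per clipped ear."""
    return "(" + " OR ".join(triangle_sql(t, x, y) for t in ear_clip(ring)) + ")"
```

The output of `polygon_sql(...)` is meant to be dropped into a WHERE clause. The predicate grows with the number of vertices, which is presumably why the hand-written version gets large for complex selections.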

For now, if you use mapd-charting, you get a selection tool that takes care of the big complex query, so you might want to take a look at that. See: https://mapd.github.io/mapd-charting/example/example5.html

regards


#5

I’ll take a look. Thanks!


#6

FYI, this area (support for geometry types and operators) is in active development. Any use case info you can share would be useful, such as the respective cardinalities of the geometry combinations you would pass to st_contains/st_intersects.


#7

Hi Bill, when are these features currently expected to be released?

As for our particular needs - any way I can reach you privately to discuss more?


#8

So, a query like this?

SELECT t1.resultcol
FROM t1 JOIN t2 ON t1.id = t2.id
WHERE ST_CONTAINS('POLYGON((2 0, 2 2, 0 2, 2 0))', t1.pointcol)

I’m guessing here that t1 is a large fact table, and that by “dynamic” polygon you mean the sort of thing you’d create with a selection tool rather than a spatial join against another table. If so, that’s the example5 above, and MapD is quite fast even for large numbers of points. Server-side support should also cover this, as well as the join cases. Oh, and I think contains is the same as intersects for point+poly.
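If the polygon comes from a selection tool as a list of vertices, the WKT literal for a query like the one above has to be built on the fly. Here is a small Python sketch of that step; `to_wkt_polygon` is a hypothetical helper, not part of any MapD API. WKT expects the ring in double parentheses with the first vertex repeated at the end, so the helper closes the ring if needed.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_wkt_polygon(ring: Sequence[Point]) -> str:
    """Format a single-ring polygon (no holes) as a WKT POLYGON string."""
    pts: List[Point] = list(ring)
    if len(pts) < 3:
        raise ValueError("a polygon ring needs at least three vertices")
    if pts[0] != pts[-1]:
        pts.append(pts[0])  # WKT rings must be explicitly closed
    coords = ", ".join(f"{x!r} {y!r}" for x, y in pts)
    return f"POLYGON(({coords}))"


wkt = to_wkt_polygon([(2, 0), (2, 2), (0, 2)])
```

Here `wkt` holds the same triangle used in the example query.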


#9

Bill, you’ve got it. I imagine MapD could blow nearly any other GIS database out of the water with a full suite of geometric and topological operations. Looking forward to seeing it in action!


#10

Is there any update on support for geometry types and operators? I have a potential use case with anywhere from 5 million to hundreds of millions of features, for a national cadastre time series analysis that could really benefit from multidimensional queries on polygons using near-real-time table scans!


#11

Hi, I couldn’t tell from the conversation, @billmaimone, @dwayneberry, is this query currently supported or not?

SELECT t1.resultcol
FROM t1 JOIN t2 ON t1.id = t2.id
WHERE ST_CONTAINS('POLYGON((2 0, 2 2, 0 2, 2 0))', t1.pointcol)

I ask because the GitHub ticket still seems to be open: https://github.com/mapd/mapd-core/issues/58

If not currently supported, is there any expected date here? Thanks!


#12

hi @dfernan, we’re working on a comprehensive geo roadmap across multiple releases this year, which covers the usual gamut of spatial operators and types, including those you’ve listed here. we’ll be reaching out to our interested community members when we get closer, so please stay tuned!


#13

Hi @niviksha, this is great news and I can’t wait for these features to be released. They will truly be a game changer for MapD’s capabilities when processing spatial data. Please keep us posted, thanks again!


#14

I’m also curious about these and other spatial functions on the roadmap for MapD.

Our team has been testing various in-memory, highly distributed and GPU based data solutions to support very large GIS datasets (of course MapD being one of them).

I’m curious what advantages executing these functions in MapD would offer as many of these functions are not very parallelizable (is that a word?).

Frameworks like GeoMesa and GeoSpark already offer moderate performance at scale. I believe Kinetica’s GPU database has been the most mature and performant for advanced GIS workloads, but of course they are not open source, and I think SQream is coming out with their own now as well?

Not trying to start a debate, but honestly curious what the thinking behind this is.


#15

Hi @geonix - thanks for your great question. Here’s a long answer :)

If you’ve seen our public demo of shipping traffic (unlike our competitors, all our demos are actually public) with 11.6 billion points of shipping traffic data, it should be clear that our performance focus does not just stop at high-performance SQL execution but also leverages our highly optimized server-side GPU-based rendering of large result sets.

As a general principle, it’s not ‘just’ SQL performance, but interactivity at scale that we’re going after - we’re focused on the end-to-end analytics experience for our users. That’s how/why our geospatial capabilities (both current and upcoming) and nearly anything we do, will be different from our competitors. It’s one thing to add support for geospatial operators like Kinetica, or support scale like GeoMesa/GeoSpark - it’s entirely another to make these capabilities visual+interactive. This will also extend to other product areas we’re going after, like Machine Learning and AI.

Also, like you’ve pointed out, our core engine is open source and we continue to generate great interest in multiple user communities/ecosystems as a result (for example, a lot of the work in pymapd, our Python API, was contributed by a combination of Wes McKinney, the creator of pandas, and Anaconda). We’re already seeing further interest from communities like Julia and R, primarily because we’re built on an open source foundation.

Finally, all of this has been, and continues to be, strongly validated by our customers in every industry vertical and use case area (including geospatial analytics). You can see this in the video testimonials and use case studies on our website.

Hope this answers your question - let me know if I can clarify further.