Point in Polygon – Vertica

We used AU nation-wide 2021 GNaf address points (n 15,150,683), and 2021 meshblock polygons (n 368,255) – attributing each point with a meshblock attribute. Runtime: 104 seconds

Steps:

Create spatial tables, e.g. with 
SELECT STV_ShpCreateTable ( USING PARAMETERS file = '/data_au/MB_2021_AUST_GDA94.shp') OVER() AS spatial_data;

copy data:
COPY data_au.mb_2021_aust_gda94 WITH SOURCE STV_ShpSource(file='/data_au/MB_2021_AUST_GDA94.shp', SRID=4283) PARSER STV_ShpParser(); EXCEPTIONS 'data1/vertica_data/exceptions_ralf'; commit;

We can pull the points file (with x/y coordinates):
copy data_au.aus_gnaf_aug21 from '/data_au/aus_gnaf_aug21_prem_2021.csv' PARSER fcsvparser(header=true);
and add geometry:
alter table data_au.aus_gnaf_aug21 add column geom geometry;
UPDATE data_au.aus_gnaf_aug21 SET geom = STV_GeometryPoint(lon, lat, 4283); commit;

Create spatial index for the polygon data; set ram-memory to 10 gig
(default 256 mb); we use the mb_code created as integer column:
SELECT STV_Create_Index(mb_code_int, geom USING PARAMETERS index='mb_index', overwrite=true, max_mem_mb = 10000) OVER() FROM data_au.mb_2021_aust_gda94;

Perform point in polygon selection: (point id has to be unique & integer):
SELECT STV_Intersect(id, geom USING PARAMETERS index='mb_index') OVER (PARTITION BEST) AS (point_id, polygon_gid) FROM data_au.aus_gnaf_aug21

We can re-run and join using gnaf_pid to get point_ids with mb_codes, this time we also account for invalid point data (where-clause):
Select aa.gnaf_pid, polygon_gid as mb_code_int from data_au.aus_gnaf_aug21 aa
join (
SELECT STV_Intersect(id, geom USING PARAMETERS index='mb_index') OVER (PARTITION BEST) AS (point_id, polygon_gid) FROM data_au.aus_gnaf_aug21
where not (geom is null or id is null)) bb
on aa.id=bb.point_id 

Results:

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.