I'm trying to compute some statistics for OSM data using PostgreSQL 9.3.5 and PostGIS 2.1.4. I started with a small bavaria extract which I downloaded from Geofabrik. The db schema is the normal API 0.6 schema, the data was imported via the dump approach into Postgres (using the pgsnapshot_schema_0.6*.sql scripts which come with osmosis). ANALYZE VACUUM was also performed. The only custom made thing I'm using is a polygon table which contains multipolygons for all administrative boundary relations. The polygon geometry was not simplified in any way.
What I'm now trying to achieve is counting all nodes which are inside of the admin=6 boundaries of bavaria. Here is my SQL query:
SELECT relpoly.id,count(node) FROM bavaria.relpolygons relpoly, bavaria.nodes node
WHERE relpoly.tags @> '"boundary"=>"administrative","admin_level"=>"6"'::hstore AND ST_Intersects(relpoly.geom, node.geom)
GROUP BY relpoly.id;
The runtime of this query is terrible because Postgres is doing a nested loop join and scans over all nodes for every admin=6 boundary. FYI, bavaria is divided into 98 admin=6 polygons and there are about 30 million nodes in the bavaria extract.
Is it possible to avoid this sub-optimal query execution and to tell Postgres that it should scan all nodes only once (e.g., by incrementing a counter for the corresponding polygon in the result set or by using hints)?
Aucun commentaire:
Enregistrer un commentaire