Load Geospatial Data in Data Pipelines
Data pipeline loading supports the load of geospatial data for use with functionality in the . Use geospatial functions to manipulate this data. For details, see Geospatial Functions.
These geospatial functions are supported for data pipeline loading:
The System automatically casts geospatial data from the CHAR type to POINT, LINESTRING, or POLYGON types using the ST_POINTFROMTEXT, ST_LINEFROMTEXT, or ST_POLYGONFROMTEXT functions, respectively. The data must be in WKT format for the casting to work. If your data is not in the WKT format, use one of the supported geospatial functions to transform your data to the required type.
During loading, the Ocient System automatically performs normalization of point data into a regular format used within Ocient. The Ocient System performs these operations on point data during the load:
- Constrain longitude to [-180, 180] and latitude to [-90, 90].
- Wrap around invalid coordinates using correct geographical handling.
- Snap points near the pole to the pole.
- Set the longitude of points on the pole to 0.
- Remove signed zeros from coordinates, so -0 becomes 0.
The Ocient System has a standardized polygon orientation convention. The standard convention has the polygon with a counterclockwise rotation of the outer polygon ring and a clockwise rotation of the inner polygon ring. Use the ST_FORCECCW function to convert polygons to the standardized convention from systems that interpret polygon rotation differently.
The maximum allowed size of geospatial data types is 512 MiB.
Use these examples to understand how to create a data pipeline to load geospatial data for different formats (point and WKT data).
Create the target table cities using the CREATE TABLE SQL statement with these columns:
- id — Integer
- name — Variable-length string
- location — Numeric point
Create the cities data pipeline using the CREATE PIPELINE SQL statement for an S3 source with bucket ocient-examples, filter for the folder metabase_samples/jsonl/cities.jsonl, and region us-east-1. Use the JSON format to load a JSON file into the cities table. Specify a numeric point using the ST_POINT function with the longitude and latitude coordinates.
Create the target table geospatial_table using the CREATE TABLE SQL statement with these non-nullable columns:
Create the geospatial_pipeline data pipeline using the CREATE PIPELINE SQL statement for an S3 source with bucket ocient-examples, filter for the folder gis_small/gis_types/*.csv, and region us-east-1. Use the CSV format to load a CSV file into the geospatial_table table. The file contains geospatial data in WKT format in the sixth, ninth, and twelfth columns with POLYGON, LINESTRING, and POINT data, respectively. Specify the polygon data in WKT format by using the ST_POLYGONFROMTEXT function and then reorient the resulting polygon counter-clockwise for the exterior and clockwise for the interior by using the ST_FORCECCW function.
During the load, the Ocient System automatically casts the WKT-formatted data in the three columns to the corresponding geospatial data types.