georeader.vectorize¶
This module provides functions to convert raster data (binary masks, segmentation outputs) into vector geometries (polygons).
Overview¶
Vectorization is essential for:
- Converting segmentation masks to GeoJSON/Shapefile
- Extracting object boundaries from classification results
- Creating vector features from raster analysis
Quick Start¶
from georeader import vectorize
from georeader.geotensor import GeoTensor
import numpy as np
import rasterio
# Example 1: Get polygons from GeoTensor (automatically in the GeoTensor's CRS)
mask_data = np.zeros((100, 100), dtype=np.uint8)
mask_data[20:80, 20:80] = 1 # A square region
transform = rasterio.Affine(10.0, 0, 500000, 0, -10.0, 4500000)
gt_mask = GeoTensor(mask_data, transform, crs="EPSG:32610")
# Polygons are automatically in the GeoTensor's CRS (EPSG:32610)
polygons = vectorize.get_polygons(gt_mask, min_area=100)
# For CRS reprojection, use window_utils.polygon_to_crs
from georeader import window_utils
polygon_wgs84 = window_utils.polygon_to_crs(polygons[0],
                                            crs_polygon="EPSG:32610",
                                            dst_crs="EPSG:4326")
Key Functions¶
| Function | Description |
|---|---|
| get_polygons | Extract polygons from binary mask with optional area filtering |
Parameters¶
get_polygons¶
- binary_mask: Input mask (GeoTensor or numpy array)
- min_area: Minimum polygon area in pixel units (default: 25.5), applied before affine transform
- Returns: List of shapely Polygon objects (in CRS coordinates if transform provided, else pixel coordinates)
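Because min_area is expressed in pixel units and applied before the affine transform, it can help to work out what the threshold means on the ground. The sketch below is plain arithmetic and does not call georeader; the coefficients are copied from the Quick Start transform, and the variable names are illustrative only.

```python
# Affine coefficients (a, b, c, d, e, f), in the order used by
# rasterio.Affine(a, b, c, d, e, f). Values copied from the Quick Start
# example: 10-unit pixels, north-up raster.
a, b, c, d, e, f = 10.0, 0.0, 500000.0, 0.0, -10.0, 4500000.0

# The area of one pixel in CRS units is the absolute determinant
# of the 2x2 linear part of the transform.
pixel_area_crs = abs(a * e - b * d)

min_area_px = 25.5                           # documented default, square pixels
min_area_crs = min_area_px * pixel_area_crs  # same threshold in square CRS units

print(pixel_area_crs)  # 100.0
print(min_area_crs)    # 2550.0
```

With 10 m pixels, the default threshold therefore corresponds to 2550 square metres.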
Vectorize Module: Convert raster masks to vector polygons.
This module provides functions to extract polygon geometries from binary raster masks. Essential for converting classification results, segmentation outputs, and masks into GIS-compatible vector formats.
Vectorization Process¶
Converting binary rasters to polygon geometries::
┌────────────────────────────────────────────────────────┐
│                 VECTORIZATION PROCESS                  │
├────────────────────────────────────────────────────────┤
│                                                        │
│  Raster (Binary Mask)             Vector (Polygons)    │
│  ────────────────────             ─────────────────    │
│                                                        │
│  ┌─┬─┬─┬─┬─┬─┬─┬─┐                  ┌───────────┐      │
│  │0│0│0│1│1│1│0│0│                 ┌┘           └┐     │
│  ├─┼─┼─┼─┼─┼─┼─┼─┤                ┌┘             └┐    │
│  │0│0│1│1│1│1│1│0│  ──────────►  ┌┘               └┐   │
│  ├─┼─┼─┼─┼─┼─┼─┼─┤   Vectorize   │    Polygon 1    │   │
│  │0│1│1│1│1│1│1│0│               └┐               ┌┘   │
│  ├─┼─┼─┼─┼─┼─┼─┼─┤                └┐             ┌┘    │
│  │0│0│1│1│1│1│0│0│                 └┐           ┌┘     │
│  └─┴─┴─┴─┴─┴─┴─┴─┘                  └───────────┘      │
│                                                        │
│  1 = foreground (vectorized)                           │
│  0 = background (ignored)                              │
└────────────────────────────────────────────────────────┘
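To make the "foreground is vectorized, background is ignored" idea concrete, here is a minimal pure-Python sketch that groups the foreground pixels of the mask in the diagram into connected regions. It is not the library's implementation (this page names rasterio.features.shapes as the underlying routine); the label_regions helper and the choice of 4-connectivity are assumptions made for illustration.

```python
from collections import deque

# The 4x8 mask drawn in the diagram above.
mask = [
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
]

def label_regions(mask):
    """Group nonzero pixels into regions (sets of (row, col)), 4-connectivity."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or (r, c) in seen:
                continue  # background, or already assigned to a region
            region, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.append(region)
    return regions

regions = label_regions(mask)
print(len(regions), [len(region) for region in regions])  # 1 [18]
```

Each region would become one polygon; the single 18-pixel region here corresponds to "Polygon 1" in the diagram.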
Polygon Simplification¶
Reducing vertex count while preserving shape::
┌──────────────────────────────────────────────────────────────┐
│         POLYGON SIMPLIFICATION (tolerance parameter)         │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Raw (pixelated)         Simplified (tolerance=1)            │
│  ───────────────         ────────────────────────            │
│                                                              │
│  ┌─┐                         ╭───────╮                       │
│  │ └─┐                      ╱         ╲                      │
│  │   └─┐                   ╱           ╲                     │
│  │     └─┐   ───────────►  │           │  Fewer vertices,    │
│  │       │     simplify    │           │  smoother edges     │
│  │     ┌─┘                 ╲           ╱                     │
│  │   ┌─┘                    ╲         ╱                      │
│  └───┘                       ╰───────╯                       │
│                                                              │
│  tolerance=0: Keep all vertices (staircase pattern)          │
│  tolerance=1: Simplify ~1 pixel tolerance (DEFAULT)          │
│  tolerance>1: More aggressive simplification                 │
└──────────────────────────────────────────────────────────────┘
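The effect of the tolerance can be reproduced with a small Douglas-Peucker routine, the algorithm family that the Shapely simplify documentation listed in the references describes. This is a sketch on an open polyline, not the code path get_polygons actually runs; douglas_peucker and point_segment_distance are hypothetical helper names.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__)
    if dists[idx] <= tolerance:
        return [points[0], points[-1]]  # every interior vertex is close enough
    split = idx + 1                     # index of the farthest vertex in `points`
    left = douglas_peucker(points[: split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right            # the split vertex appears only once

# Staircase edge as produced by pixel boundaries (pixel units).
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3), (4, 4)]

print(len(douglas_peucker(staircase, 0.0)))  # 9
print(douglas_peucker(staircase, 1.0))       # [(0, 0), (4, 4)]
```

Every stair corner lies about 0.71 pixels from the diagonal, so a tolerance of 1 collapses the staircase to a single segment while a tolerance of 0 keeps all nine vertices.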
Filtering Options¶
Removing small or unwanted polygons::
┌──────────────────────────────────────────────────────────────────────┐
│                          POLYGON FILTERING                           │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Parameters:                                                         │
│  ───────────                                                         │
│                                                                      │
│  min_area=25.5 (default)   Remove polygons smaller than ~5x5 pixels  │
│                            Helps filter noise and artifacts          │
│                                                                      │
│  polygon_buffer=0          Buffer/erode polygons by N pixels         │
│                            Positive: expand                          │
│                            Negative: shrink (erode)                  │
│                                                                      │
│  Before (min_area=0):              After (min_area=25):              │
│  ┌────────────────────┐            ┌────────────────────┐            │
│  │ ■    ┌───────┐     │            │      ┌───────┐     │            │
│  │   ■  │       │  ■  │  ──────►   │      │       │     │            │
│  │      │       │     │   Filter   │      │       │     │            │
│  │  ■   └───────┘     │            │      └───────┘     │            │
│  └────────────────────┘            └────────────────────┘            │
│     ↑ small polygons removed                                         │
└──────────────────────────────────────────────────────────────────────┘
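The area filter can be sketched with the shoelace formula on pixel-coordinate rings. This is an illustration only: shoelace_area and square are hypothetical helpers, and whether the library compares with >= or > at the threshold is an assumption of this sketch.

```python
def shoelace_area(ring):
    """Absolute area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def square(x0, y0, side):
    """Axis-aligned square ring in pixel coordinates."""
    return [(x0, y0), (x0 + side, y0), (x0 + side, y0 + side), (x0, y0 + side)]

candidates = {
    "speck_2x2": square(1, 1, 2),      # 4 square pixels
    "patch_5x5": square(10, 10, 5),    # 25 square pixels
    "blob_10x10": square(20, 20, 10),  # 100 square pixels
}

min_area = 25.5  # documented default, in square pixels
kept = [name for name, ring in candidates.items() if shoelace_area(ring) >= min_area]
print(kept)  # ['blob_10x10']
```

Note that an exact 5x5 square has an area of 25, which is below 25.5, so under this sketch's comparison it is removed along with the 2x2 speck.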
Module Functions Overview¶
Vectorization
- get_polygons: Extract polygons from binary mask
- vectorize_raster: Vectorize with geographic coordinates
Utilities
- Automatic CRS handling from GeoData inputs
- Integration with shapely for geometry operations
Quick Start¶
Extract polygons from a binary mask::
from georeader import vectorize
import numpy as np
# Binary mask (e.g., from classification).
# classified_image (2D array) and transform (Affine) are assumed to be defined already.
mask = (classified_image == 1).astype(np.uint8)
# Extract polygons
polygons = vectorize.get_polygons(
    mask,
    min_area=100,         # Minimum 10x10 pixel area
    tolerance=1.5,        # Simplification tolerance
    transform=transform   # Affine transform for georeferencing
)
# Polygons are in CRS units (georeferenced)
for poly in polygons:
    print(f"Area: {poly.area} sq. CRS units")
Vectorize a GeoTensor mask::
# GeoTensor carries its own transform
polygons = vectorize.get_polygons(
    water_mask_geotensor,  # GeoTensor with transform
    min_area=50,
    polygon_buffer=-1      # Erode by 1 pixel
)
See Also¶
- georeader.rasterize : Inverse operation (vector → raster)
- georeader.geotensor : Input format with transform
- rasterio.features.shapes : Underlying implementation
References¶
- Rasterio shapes: https://rasterio.readthedocs.io/en/latest/api/rasterio.features.html
- Shapely simplify: https://shapely.readthedocs.io/en/latest/manual.html#object.simplify
get_polygons(binary_mask, min_area=25.5, polygon_buffer=0, tolerance=1.0, transform=None)¶
Convert a binary raster mask to vector polygons.
Extracts connected regions of True/nonzero pixels as polygon geometries. Includes options for filtering small polygons, buffering/eroding boundaries, and simplifying vertex counts.
If binary_mask is a GeoTensor (has .transform attribute), polygons are
returned in geographic coordinates. If it's a numpy array, polygons are in
pixel coordinates unless a transform is provided.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| binary_mask | Union[ndarray, GeoData] | 2D binary mask where nonzero pixels represent features to vectorize. Accepts GeoTensor (uses its transform) or numpy array. | required |
| min_area | float | Minimum polygon area in square pixels to include. Polygons smaller than this are filtered out. Default 25.5 (~5x5 px). Set to 0 to keep all polygons. | 25.5 |
| polygon_buffer | int | Buffer distance in pixels to apply to polygons. Default 0 (no buffering). | 0 |
| tolerance | float | Simplification tolerance in pixels. Higher values produce simpler polygons with fewer vertices. Default 1.0. Set to 0 for no simplification (keeps staircase pixel edges). | 1.0 |
| transform | Optional[Affine] | Affine transform to convert pixel coordinates to geographic coordinates. Only used if binary_mask is a numpy array. If binary_mask is GeoTensor, uses its transform. | None |
Returns:
| Type | Description |
|---|---|
| List[Polygon] | List of shapely Polygon objects. Coordinates are in geographic CRS units if transform is provided or binary_mask is a GeoTensor, and in pixel coordinates otherwise. |
Examples:
Vectorize a classification result:
>>> import numpy as np
>>> from georeader import vectorize
>>>
>>> # Binary mask from classification
>>> water_mask = (classification == 1).astype(np.uint8)
>>>
>>> # Extract water body polygons
>>> polygons = vectorize.get_polygons(
... water_mask,
... min_area=100, # Filter out tiny artifacts
... tolerance=2.0 # Simplify boundaries
... )
>>> print(f"Found {len(polygons)} water bodies")
Vectorize with erosion to remove edge noise:
>>> polygons = vectorize.get_polygons(
... noisy_mask,
... min_area=50,
... polygon_buffer=-2 # Erode 2 pixels
... )
Vectorize GeoTensor (auto-uses transform):
>>> from georeader.geotensor import GeoTensor
>>> # mask_gt is a GeoTensor with transform
>>> polygons = vectorize.get_polygons(mask_gt, min_area=200)
>>> # Polygons are in geographic coordinates
>>> for poly in polygons:
...     print(f"Area: {poly.area:.2f} sq. CRS units")
Get pixel-coordinate polygons from numpy array:
>>> polygons_px = vectorize.get_polygons(
... mask_array,
... transform=None # No transform = pixel coordinates
... )
Note
- Buffering is applied before simplification: polygons are buffered first, then simplified
- For very large masks, consider processing in tiles
- MultiPolygon results are returned as separate Polygon objects
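For the tiling suggestion in the note above, one possible way to enumerate tile windows is sketched below. The iter_tiles helper and the 1024-pixel tile size are hypothetical, not part of georeader. Polygons that cross a tile edge come out split in two and would need to be merged afterwards, which this sketch does not attempt.

```python
def iter_tiles(height, width, tile_size):
    """Yield (row_off, col_off, n_rows, n_cols) windows that cover the raster."""
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            yield (
                row_off,
                col_off,
                min(tile_size, height - row_off),  # last row of tiles may be shorter
                min(tile_size, width - col_off),   # last column may be narrower
            )

tiles = list(iter_tiles(height=2500, width=4000, tile_size=1024))
print(len(tiles))  # 12
print(tiles[-1])   # (2048, 3072, 452, 928)
```

Each window can be used to slice the mask array before vectorizing that slice; the row and column offsets are then needed to shift the resulting pixel coordinates back into the full raster frame.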
See Also
- transform_polygon: Transform polygon between coordinate systems.
- georeader.rasterize: Inverse operation (vector → raster).
transform_polygon(polygon, transform, relative=False, shape_raster=None)¶
Transform polygon coordinates using an affine transformation.
Applies a rasterio Affine transform to all vertices of a polygon, converting between pixel and geographic coordinate systems. Handles both simple Polygons and MultiPolygons, including holes.
Common use cases:
- Pixel coordinates → Geographic coordinates (using raster transform)
- Geographic coordinates → Relative coordinates (0-1 range for ML)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| polygon | Union[Polygon, MultiPolygon] | Shapely geometry to transform. Coordinates can be any numeric type. | required |
| transform | Affine | 2D affine transformation matrix. Common sources: raster.transform, rasterio.Affine.scale(), etc. | required |
| relative | bool | If True, output normalized coordinates in [0, 1] range relative to the raster dimensions. Useful for ML model inputs. Requires shape_raster. Default False. | False |
| shape_raster | Optional[Tuple[int, int]] | Raster dimensions as (height, width). Required if relative=True. Ignored otherwise. | None |
|
Returns:
| Type | Description |
|---|---|
| Union[Polygon, MultiPolygon] | Transformed geometry with same type as input. All coordinates are transformed by the affine matrix. |
Raises:
| Type | Description |
|---|---|
| AssertionError | If relative=True but shape_raster not provided. |
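The relative option can be pictured as a division by the raster dimensions. The sketch below assumes that x is divided by the width and y by the height, with shape_raster given as (height, width) as documented; the to_relative helper is hypothetical and the exact normalization inside transform_polygon is not shown on this page.

```python
def to_relative(ring, shape_raster):
    """Scale pixel coordinates into the [0, 1] range of the raster."""
    height, width = shape_raster  # documented order: (height, width)
    return [(x / width, y / height) for x, y in ring]

ring_px = [(0, 0), (100, 0), (100, 100), (0, 100)]

# Non-square raster, to make the (height, width) order visible.
print(to_relative(ring_px, shape_raster=(500, 1000)))
# [(0.0, 0.0), (0.1, 0.0), (0.1, 0.2), (0.0, 0.2)]
```

The same 100-pixel extent maps to 0.1 along the 1000-pixel width and 0.2 along the 500-pixel height.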
Examples:
Convert pixel polygon to geographic coordinates:
>>> from shapely.geometry import Polygon
>>> import rasterio
>>> from georeader.vectorize import transform_polygon
>>>
>>> # Polygon in pixel coordinates
>>> poly_px = Polygon([(0, 0), (100, 0), (100, 100), (0, 100)])
>>>
>>> # Transform from a raster
>>> transform = rasterio.Affine(10.0, 0, 500000, 0, -10.0, 4500000)
>>>
>>> # Convert to UTM coordinates
>>> poly_geo = transform_polygon(poly_px, transform)
>>> print(poly_geo.bounds)
(500000.0, 4499000.0, 501000.0, 4500000.0)
Get relative coordinates for ML input:
>>> poly_rel = transform_polygon(
... poly_px,
... transform=rasterio.Affine.identity(),
... relative=True,
... shape_raster=(1000, 1000) # 1000x1000 image
... )
>>> print(poly_rel.bounds) # Values in [0, 1]
(0.0, 0.0, 0.1, 0.1)
Transform MultiPolygon:
>>> from shapely.geometry import MultiPolygon
>>> multi = MultiPolygon([poly_px, poly_px.buffer(10)])
>>> multi_geo = transform_polygon(multi, transform)
>>> print(type(multi_geo))
<class 'shapely.geometry.multipolygon.MultiPolygon'>
Note
- The transform is applied as: (x_out, y_out) = transform * (x_in, y_in)
- For pixel-to-geo conversion, use the raster's transform directly
- For geo-to-pixel conversion, use ~transform (inverse)
- Holes in polygons are preserved after transformation
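The forward and inverse mappings mentioned in the note can be written out by hand. This sketch uses plain tuples in the (a, b, c, d, e, f) coefficient order of rasterio.Affine instead of the Affine class itself; apply_affine and invert_affine are hypothetical helpers that mirror what `transform * (x, y)` and `~transform` compute.

```python
def apply_affine(coeffs, x, y):
    """(x_out, y_out) = transform * (x_in, y_in)."""
    a, b, c, d, e, f = coeffs
    return (a * x + b * y + c, d * x + e * y + f)

def invert_affine(coeffs):
    """Coefficients of the inverse transform (the role of ~transform)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    ia, ib, id_, ie = e / det, -b / det, -d / det, a / det
    # The inverse translation is minus the inverse linear part applied to (c, f).
    return (ia, ib, -(ia * c + ib * f), id_, ie, -(id_ * c + ie * f))

# Same transform as in the examples above: 10-unit pixels, north-up.
fwd = (10.0, 0.0, 500000.0, 0.0, -10.0, 4500000.0)
inv = invert_affine(fwd)

geo = apply_affine(fwd, 100, 100)  # pixel (col=100, row=100) to CRS
print(geo)                         # (501000.0, 4499000.0)

px = apply_affine(inv, *geo)       # CRS back to pixel
print(tuple(round(v, 6) for v in px))  # (100.0, 100.0)
```

The forward result matches the lower-right corner of the bounds printed in the first example, and the round trip returns the original pixel position up to floating-point rounding.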