Direct GeoPandas conversion (Legacy)

The API listed here was the initial non-Arrow-based STAC-GeoParquet implementation, converting between JSON and GeoPandas directly. For large collections of STAC items, using the new Arrow-based functionality (under the stac_geoparquet.arrow namespace) will be more performant.

Note that stac_geoparquet lifts the keys in the item properties up to the top level of the DataFrame, similar to geopandas.GeoDataFrame.from_features.
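To make the "lifting" concrete, here is a minimal plain-Python sketch of how one item dictionary maps onto a flat row. The helper name `lift_properties` and the sample item are hypothetical and only illustrate the resulting layout; this is not the stac_geoparquet implementation.

```python
from typing import Any


def lift_properties(item: dict[str, Any]) -> dict[str, Any]:
    """Return a flat row: top-level item keys plus the keys under "properties"."""
    # Copy everything except the nested "properties" mapping.
    row = {k: v for k, v in item.items() if k != "properties"}
    for key, value in item.get("properties", {}).items():
        # A property that shadows a top-level key would be ambiguous in a flat row.
        if key in row:
            raise ValueError(f"property {key!r} collides with a top-level key")
        row[key] = value
    return row


# Hypothetical, heavily trimmed STAC item.
item = {
    "type": "Feature",
    "id": "example-item",
    "geometry": None,
    "properties": {"datetime": "2021-01-01T00:00:00Z", "eo:cloud_cover": 12.5},
}
row = lift_properties(item)
print(sorted(row))
```

Each key of `row` corresponds to one column of the DataFrame, so "datetime" and "eo:cloud_cover" sit next to "id" and "geometry" instead of inside a nested "properties" column.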

>>> import requests
>>> import stac_geoparquet.arrow
>>> import pyarrow.parquet
>>> import pyarrow as pa

>>> # Fetch one page of STAC items as a list of dictionaries.
>>> items = requests.get(
...     "https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items"
... ).json()["features"]
>>> # Convert the items to an Arrow table and write it to Parquet.
>>> table = pa.Table.from_batches(stac_geoparquet.arrow.parse_stac_items_to_arrow(items))
>>> stac_geoparquet.arrow.to_parquet(table, "items.parquet")
>>> # Read the file back and convert the rows to STAC item dictionaries.
>>> table2 = pyarrow.parquet.read_table("items.parquet")
>>> items2 = list(stac_geoparquet.arrow.stac_table_to_items(table2))

stac_geoparquet.to_geodataframe

to_geodataframe(
    items: Sequence[dict[str, Any]],
    add_self_link: bool = False,
    dtype_backend: DTYPE_BACKEND | None = None,
    datetime_precision: str = "ns",
) -> GeoDataFrame

Convert a sequence of STAC items to a geopandas.GeoDataFrame.

The objects under properties are moved up to the top-level of the DataFrame, similar to geopandas.GeoDataFrame.from_features.

Parameters:

items (Sequence[dict[str, Any]]) –

A sequence of STAC items.
add_self_link (bool, default: False ) –

Add the absolute link (if available) to the source STAC Item as a separate column named "self_link".
dtype_backend (DTYPE_BACKEND | None, default: None ) –
The dtype backend to use for storing arrays, either 'pyarrow' or 'numpy_nullable'.

By default, this will use 'numpy_nullable' and emit a FutureWarning that the default will change to 'pyarrow' in the next release.

Set to 'numpy_nullable' to silence the warning and accept the old behavior.

Set to 'pyarrow' to silence the warning and accept the new behavior.

There are some differences in the output as well: with dtype_backend="pyarrow", struct-like fields will explicitly contain null values for fields that appear in only some of the records. For example, given an assets field like:
```
{
    "a": {
        "href": "a.tif",
    },
    "b": {
        "href": "b.tif",
        "title": "B",
    }
}
```
The assets field of the output for the first row with dtype_backend="numpy_nullable" will be a Python dictionary with just {"href": "a.tif"}.

With dtype_backend="pyarrow", this will be a pyarrow struct with fields {"href": "a.tif", "title": None}. pyarrow will infer that the struct field asset.title is nullable.
datetime_precision (str, default: 'ns' ) –

The precision to use for the datetime columns. For example, "us" is microsecond and "ns" is nanosecond.
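The difference between the two dtype_backend values for struct-like fields can be imitated in plain Python. The sketch below only reproduces the shape of the resulting values: with dtype_backend="pyarrow" the schema inference is done by pyarrow itself, and the helper `fill_missing_fields` is a hypothetical stand-in for it.

```python
from typing import Any


def fill_missing_fields(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Give every record the union of all keys, using None where a key is absent."""
    fields: list[str] = []
    for record in records:
        for key in record:
            if key not in fields:
                fields.append(key)  # keep first-seen order
    return [{key: record.get(key) for key in fields} for record in records]


# Two struct-like values: only the second one has a "title".
assets = [{"href": "a.tif"}, {"href": "b.tif", "title": "B"}]

numpy_nullable_like = assets                # dictionaries are kept as given
pyarrow_like = fill_missing_fields(assets)  # shared schema, explicit nulls

print(numpy_nullable_like[0])  # {'href': 'a.tif'}
print(pyarrow_like[0])         # {'href': 'a.tif', 'title': None}
```

Code that reads these values therefore needs to tolerate both a missing key and an explicit null, depending on which backend produced the DataFrame.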

Returns:

GeoDataFrame –

The converted GeoDataFrame.
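For the add_self_link option, the link in question is the entry of an item's "links" list whose "rel" is "self". A rough sketch of that lookup follows; the helper `find_self_href` and the example URLs are hypothetical, and returning None for an item without such a link is an assumption of this sketch rather than documented library behavior.

```python
from typing import Any, Optional


def find_self_href(item: dict[str, Any]) -> Optional[str]:
    """Return the href of the first link with rel == "self", or None."""
    for link in item.get("links", []):
        if link.get("rel") == "self":
            return link.get("href")
    return None  # assumption: items without a self link yield a missing value


# Hypothetical items: the first has a self link, the second does not.
items = [
    {"id": "a", "links": [{"rel": "self", "href": "https://example.com/items/a"}]},
    {"id": "b", "links": [{"rel": "parent", "href": "https://example.com/collection"}]},
]
self_link = [find_self_href(item) for item in items]
print(self_link)  # ['https://example.com/items/a', None]
```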

stac_geoparquet.to_item_collection

to_item_collection(df: GeoDataFrame) -> ItemCollection

Convert a GeoDataFrame of STAC items to a pystac.ItemCollection.

Parameters:

df (GeoDataFrame) –

A GeoDataFrame with a schema similar to that exported by stac-geoparquet.

Returns:

ItemCollection –

The converted ItemCollection. There will be one record / feature per row in the GeoDataFrame.

stac_geoparquet.to_dict

to_dict(record: dict) -> dict

Create a dictionary representing a STAC item from a row of the GeoDataFrame.

Parameters:

record (dict) –

A dictionary representing one row of the GeoDataFrame.
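Conceptually, this conversion reverses the flattening applied when the DataFrame was built: columns that are not top-level STAC item fields go back under "properties". The sketch below illustrates that idea only. The helper `row_to_item` is hypothetical, the set of top-level keys is taken from the STAC Item specification, and the actual to_dict may do additional cleanup that is not shown here.

```python
from typing import Any

# Top-level fields defined by the STAC Item specification; every other key
# of a flattened row is treated as a property in this sketch.
TOP_LEVEL_KEYS = {
    "type", "stac_version", "stac_extensions", "id",
    "geometry", "bbox", "links", "assets", "collection",
}


def row_to_item(row: dict[str, Any]) -> dict[str, Any]:
    """Rebuild a nested item dictionary from a flattened row."""
    item: dict[str, Any] = {"properties": {}}
    for key, value in row.items():
        if key in TOP_LEVEL_KEYS:
            item[key] = value
        else:
            item["properties"][key] = value
    return item


# Hypothetical flattened row, as one record of the DataFrame might look.
row = {
    "type": "Feature",
    "id": "example-item",
    "geometry": None,
    "datetime": "2021-01-01T00:00:00Z",
    "eo:cloud_cover": 12.5,
}
item = row_to_item(row)
print(item["properties"])
```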