Direct GeoPandas conversion (Legacy)

The API listed here was the initial non-Arrow-based STAC-GeoParquet implementation, converting between JSON and GeoPandas directly. For large collections of STAC items, using the new Arrow-based functionality (under the stac_geoparquet.arrow namespace) will be more performant.

Note that stac_geoparquet lifts the keys in the item properties up to the top level of the DataFrame, similar to geopandas.GeoDataFrame.from_features.
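As a rough illustration of the lifting described above, the sketch below flattens a single item in plain Python. The helper name lift_properties and the sample item are hypothetical and not part of stac_geoparquet; the library's real conversion also deals with geometry and dtypes, which this sketch ignores.

```python
from typing import Any


def lift_properties(item: dict[str, Any]) -> dict[str, Any]:
    """Return a flat record with the keys of item["properties"] at the top level.

    Hypothetical helper for illustration only; not part of stac_geoparquet.
    """
    # Copy every top-level key except "properties" itself.
    record = {key: value for key, value in item.items() if key != "properties"}
    # Promote each property to a top-level key (no collision handling here).
    record.update(item.get("properties", {}))
    return record


# Made-up sample item, used only to exercise the helper.
item = {
    "type": "Feature",
    "id": "example-item",
    "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
    "properties": {"datetime": "2021-01-01T00:00:00Z", "eo:cloud_cover": 12.5},
}
record = lift_properties(item)
print(sorted(record))
```

After the call, "datetime" and "eo:cloud_cover" sit next to "id" and "geometry", which is the column layout the note describes.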

>>> import requests
>>> import stac_geoparquet.arrow
>>> import pyarrow.parquet
>>> import pyarrow as pa

>>> items = requests.get(
...     "https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items"
... ).json()["features"]
>>> table = pa.Table.from_batches(stac_geoparquet.arrow.parse_stac_items_to_arrow(items))
>>> stac_geoparquet.arrow.to_parquet(table, "items.parquet")
>>> table2 = pyarrow.parquet.read_table("items.parquet")
>>> items2 = list(stac_geoparquet.arrow.stac_table_to_items(table2))

stac_geoparquet.to_geodataframe

to_geodataframe(
    items: Sequence[dict[str, Any]],
    add_self_link: bool = False,
    dtype_backend: DTYPE_BACKEND | None = None,
    datetime_precision: str = "ns",
) -> GeoDataFrame

Convert a sequence of STAC items to a geopandas.GeoDataFrame.

The objects under properties are moved up to the top-level of the DataFrame, similar to geopandas.GeoDataFrame.from_features.

Parameters:

  • items (Sequence[dict[str, Any]]) –

    A sequence of STAC items.

  • add_self_link (bool, default: False ) –

    Add the absolute link (if available) to the source STAC Item as a separate column named "self_link".

  • dtype_backend (DTYPE_BACKEND | None, default: None ) –

    The dtype backend to use for storing arrays: either 'pyarrow' or 'numpy_nullable'.

    By default, this will use 'numpy_nullable' and emit a FutureWarning that the default will change to 'pyarrow' in the next release.

    Set to 'numpy_nullable' to silence the warning and accept the old behavior.

    Set to 'pyarrow' to silence the warning and accept the new behavior.

    There are some differences in the output as well: with dtype_backend="pyarrow", struct-like fields will explicitly contain null values for fields that appear in only some of the records. For example, given an assets field like:

    {
        "a": {
            "href": "a.tif",
        },
        "b": {
            "href": "b.tif",
            "title": "B",
        }
    }
    

    The assets field of the output for the first row with dtype_backend="numpy_nullable" will be a Python dictionary with just {"href": "a.tif"}.

    With dtype_backend="pyarrow", this will be a pyarrow struct with fields {"href": "a.tif", "title": None}. pyarrow will infer that the struct field asset.title is nullable.

  • datetime_precision (str, default: 'ns' ) –

    The precision to use for the datetime columns. For example, "us" is microsecond and "ns" is nanosecond.
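The difference between the two dtype_backend settings can be pictured without any dependencies. The sketch below imitates, in plain Python, how a struct-like column ends up with explicit nulls: it takes the union of field names across records and fills the missing ones with None. The helper fill_missing_fields is hypothetical and only mimics the shape of the output; it does not call pyarrow or stac_geoparquet.

```python
from typing import Any


def fill_missing_fields(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Give every record the same keys, using None where a key was absent.

    Hypothetical helper: imitates the shape of a struct column, nothing more.
    """
    # Collect field names in first-seen order.
    fields: list[str] = []
    for rec in records:
        for key in rec:
            if key not in fields:
                fields.append(key)
    # Rebuild each record against the shared field list.
    return [{field: rec.get(field) for field in fields} for rec in records]


assets = {
    "a": {"href": "a.tif"},
    "b": {"href": "b.tif", "title": "B"},
}

# Roughly the "numpy_nullable" shape: dictionaries kept exactly as given.
as_dicts = list(assets.values())

# Roughly the "pyarrow" shape: one shared set of fields, with explicit nulls.
as_structs = fill_missing_fields(as_dicts)
print(as_structs[0])
```

Here the first record gains a "title" entry set to None, while the plain dictionaries are left unchanged.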

Returns:

  • GeoDataFrame

    The converted GeoDataFrame, with one row per STAC item.

stac_geoparquet.to_item_collection

to_item_collection(df: GeoDataFrame) -> ItemCollection

Convert a GeoDataFrame of STAC items to a pystac.ItemCollection.

Parameters:

  • df (GeoDataFrame) –

    A GeoDataFrame with a schema similar to that exported by stac-geoparquet.

Returns:

  • ItemCollection

    The converted ItemCollection. There will be one record / feature per row in the GeoDataFrame.

stac_geoparquet.to_dict

to_dict(record: dict) -> dict

Create a dictionary representing a STAC item from a row of the GeoDataFrame.

Parameters:

  • record (dict) –

    A dictionary holding the values of one row of the GeoDataFrame.
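Conceptually, this reverses the property lifting done by to_geodataframe. The sketch below is an assumption about the general shape of that reversal, not the library's implementation: the helper row_to_item and the constant TOP_LEVEL_KEYS are hypothetical names, and the real function handles further details that are left out here.

```python
from typing import Any

# Hypothetical constant: the top-level fields of a STAC Item other than
# "properties". Everything else in a row is assumed to be a lifted property.
TOP_LEVEL_KEYS = {
    "type", "stac_version", "stac_extensions", "id", "geometry",
    "bbox", "links", "assets", "collection",
}


def row_to_item(record: dict[str, Any]) -> dict[str, Any]:
    """Nest the non-top-level keys of a flat row back under "properties".

    Illustrative helper only; not part of stac_geoparquet.
    """
    item: dict[str, Any] = {"properties": {}}
    for key, value in record.items():
        if key in TOP_LEVEL_KEYS:
            item[key] = value
        else:
            item["properties"][key] = value
    return item


# Made-up flat row, as it might look after the properties were lifted.
row = {
    "type": "Feature",
    "id": "example-item",
    "geometry": None,
    "datetime": "2021-01-01T00:00:00Z",
    "eo:cloud_cover": 12.5,
}
rebuilt = row_to_item(row)
print(rebuilt["properties"])
```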