🤖 AI Summary
A University of Cambridge team has released GeoTessera, a Python CLI and library plus a dataset of precomputed geospatial embeddings from TESSERA, a foundation model that compresses a full year of Sentinel‑1 and Sentinel‑2 data into a 128‑dimensional vector for each 10 m pixel. The release includes tiled GeoTIFFs (128 float32 bands per tile) covering parts of 2017–2024, hosted on dl.geotessera.org. Tiles can be discovered and downloaded with the geotessera CLI, run via uvx, which reports coverage and downloads by GeoJSON, shapefile, or bounding box. TESSERA is trained only on public ESA data and targets downstream EO tasks such as crop type classification, forest canopy height and biomass estimation, wildfire detection, and other temporal‑spectral analyses. Users can plug dense, temporally aware embeddings directly into GIS and ML pipelines without retraining large models.
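As a rough illustration of what plugging embeddings into an ML pipeline can look like, the sketch below classifies every pixel by its nearest class centroid in embedding space. It is not the GeoTessera API: the function name `nearest_centroid_map`, the class names, and the synthetic array standing in for a 128-band tile are all assumptions made for this example. A real tile would first have to be read into a (height, width, 128) array with a raster library.

```python
import numpy as np

def nearest_centroid_map(embeddings, labeled_pixels):
    """Classify every pixel by its closest class centroid in embedding space.

    embeddings: (height, width, dims) array, one embedding vector per pixel.
    labeled_pixels: dict mapping class name -> list of (row, col) training pixels.
    Returns (class_names, label_map), where label_map[r, c] indexes class_names.
    """
    class_names = sorted(labeled_pixels)
    # One centroid per class: the mean embedding of its labeled pixels.
    centroids = np.stack([
        np.mean([embeddings[r, c] for r, c in labeled_pixels[name]], axis=0)
        for name in class_names
    ])  # shape (n_classes, dims)
    flat = embeddings.reshape(-1, embeddings.shape[-1])  # (n_pixels, dims)
    # Squared Euclidean distance from every pixel to every centroid.
    dists = ((flat[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    label_map = dists.argmin(axis=1).reshape(embeddings.shape[:2])
    return class_names, label_map

# Synthetic stand-in for a tile (hypothetical data, not a real download):
# the right half is shifted so it behaves like a different land cover.
rng = np.random.default_rng(0)
tile = rng.normal(0.0, 0.1, size=(20, 20, 128)).astype(np.float32)
tile[:, 10:, :] += 1.0

names, labels = nearest_centroid_map(
    tile,
    {"crop": [(2, 3), (15, 4)], "forest": [(5, 15), (12, 18)]},
)
```

The distance matrix holds one row per pixel, so a full-size tile would likely need to be processed in chunks, and a real task would normally use a stronger classifier than nearest centroid.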
Technically, tiles keep their local UTM coordinates, use Pooch manifests for efficient selective download, and work with GDAL and Leaflet for inspection and visualization (PCA false‑colour rendering through the CLI's visualize and serve commands). The dataset is petabyte‑scale, so downloading only a region of interest is critical; the project flags storage and egress as open challenges. The team also provides an interactive Jupyter notebook for manual labeling and prototyping, and is exploring an OCaml implementation and ML workflow modules. For researchers and practitioners, GeoTessera lowers the barrier to using spatio‑temporal representations, accelerates prototyping of EO models, and shifts effort from heavy satellite preprocessing to model and application development.
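The PCA false‑colour idea can be sketched in a few lines of NumPy: project each pixel's 128‑dimensional embedding onto the three directions of greatest variance and stretch those to an RGB image. This is a minimal sketch under stated assumptions, not the project's own rendering code. The function name `pca_false_colour` is hypothetical, and a random array stands in for a real tile.

```python
import numpy as np

def pca_false_colour(embeddings):
    """Project (height, width, dims) embeddings to an RGB uint8 image via PCA."""
    h, w, d = embeddings.shape
    flat = embeddings.reshape(-1, d).astype(np.float64)
    centred = flat - flat.mean(axis=0)
    # Eigendecomposition of the (dims x dims) covariance matrix.
    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the highest-variance directions first.
    cov = np.cov(centred, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    top3 = eigvecs[:, ::-1][:, :3]
    comps = centred @ top3  # (n_pixels, 3)
    # Stretch each component to 0..255 independently for display.
    lo = comps.min(axis=0)
    span = comps.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant components
    rgb = (255 * (comps - lo) / span).round().astype(np.uint8)
    return rgb.reshape(h, w, 3)

# Synthetic stand-in for a 128-band tile (hypothetical data).
rng = np.random.default_rng(0)
tile = rng.normal(size=(32, 32, 128)).astype(np.float32)
image = pca_false_colour(tile)
```

Working from the 128 × 128 covariance matrix keeps the decomposition cheap regardless of tile size. The colours are only meaningful relative to each other within one image, since each tile gets its own principal axes and stretch.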