Publishing Raster Data

İbrahim Sarıçiçek
Feb 2, 2023



I first came across the problem of displaying rasters on the web in 2010. Because Google had not established a legal office in Turkey and did not pay any tax there, the Turkish government banned access to all Google services. I was working in the vehicle tracking industry at the time; we used Google Maps as a base layer, so part of our map infrastructure became inaccessible too. Luckily, we had already developed our own tile maps from vector data and set them as the default.

But there was still a problem: some of our customers asked for satellite images as well, yet our map infrastructure only had vector data served as tiled images. So a question came up: “Can we enrich our own map with satellite images that we can obtain freely?”

We decided to use Landsat images, at least at the upper zoom levels. We downloaded Landsat (6 or 7) data and started seeding to create tiles. By the way, we were using MapServer and TileCache at that time.

MapServer is an open-source development environment for building spatially enabled internet applications, built in the C language, and is widely known as one of the fastest Web mapping engines available. It can run as a CGI program or via MapScript which supports several programming languages.

As the number of satellite images grew, we realized that images taken on different days and at different times of day have different color tones and different levels of cloud cover. Years later, I realized that this is a problem that takes a lot of work to solve. Take a look at how Mapbox dealt with this issue a long time ago -> https://www.wired.com/2013/05/a-cloudless-atlas/

While we were spending time processing satellite imagery and creating tiles, Google and the government reached an agreement: Google would open an office in Turkey and start paying taxes. Eventually all the services became accessible again.

The next time I ran into rasters was after I watched a documentary about Ankara, the capital of Turkiye: “Ankara’nın Altında Dereler Var” (“There Are Creeks Under Ankara”). It was about how all the rivers were connected to the sewer system in the name of creek improvement, and about the old rivers, still flowing underground, that still cause floods at many points in Ankara.

I thought it would be helpful to put together old plans, maps and aerial photographs to understand what happened to these creeks. I was really disappointed, because it was very hard to find an institution that holds and, of course, shares old plans, maps or aerial photographs. Some collectors, mainly lecturers from my university, did not even answer my messages or emails. It took a long time to find relevant maps on the internet by myself.

The second time-consuming phase was georeferencing the raster images. It took days and nights to find reference points, mainly old buildings or road junctions that still exist on current satellite images. There was one more time-consuming task, but I decided not to do it; I don’t know the exact term, but it may be background masking. After a raster is georeferenced, the part of the background exposed by the shift becomes black, white or some other specific color, depending on the application used. There are methods to mask it, but I really didn’t want to spend more time. If you have an easy method to convert the background color to a transparent background, please reach out to me.

After georeferencing the images and coding a web map, I used gdal2tiles to serve all the GeoTIFF images as tiles.

GDAL is a translator library for raster and vector geospatial data formats, released under an MIT-style open source license by the Open Source Geospatial Foundation. One of its utilities, gdal2tiles, generates a directory of small tiles and metadata following the OSGeo Tile Map Service specification. Simple web pages with viewers based on Google Maps, OpenLayers and Leaflet are generated as well, so anybody can comfortably explore the maps online. You do not need to install or configure any special software (such as MapServer or Mapnik), and the map displays very fast in the web browser.
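To make the tile directory layout a bit more concrete, here is a minimal sketch of the usual Web Mercator tile arithmetic. It assumes the common 256-pixel global mercator grid; the function names and the sample coordinate (roughly central Ankara) are mine, not something gdal2tiles exposes. The one detail worth knowing is that the Tile Map Service scheme counts rows from the bottom, while the XYZ scheme used by most web maps counts from the top.

```python
import math


def lonlat_to_xyz(lon, lat, zoom):
    """Return the (x, y) index of the tile containing lon/lat.

    Uses the XYZ ("slippy map") convention: y = 0 is the northernmost row.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_y(y, zoom):
    """Flip a row index between the XYZ and TMS conventions (the flip is symmetric)."""
    return (2 ** zoom - 1) - y


if __name__ == "__main__":
    zoom = 10
    x, y = lonlat_to_xyz(32.85, 39.93, zoom)  # illustrative point near Ankara
    print("XYZ tile:", zoom, x, y)
    print("TMS tile:", zoom, x, xyz_to_tms_y(y, zoom))
```

If a viewer shows the tiles upside down or in the wrong place, this row flip is usually the first thing to check. As far as I know, newer GDAL releases also offer an `--xyz` option in gdal2tiles to write XYZ rows directly.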

Across all relevant zoom levels, the pre-generated tiles took 5–10 times more disk space than the original images, which became a problem as the number of images increased. Georeferencing and pre-generating tiles for every image was also time-consuming. As a solution, since all the images I found were freely accessible on the internet and were not nationally sensitive data, I thought of using Mapbox as a cloud host and tile provider.
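A rough way to see why pre-generated tiles pile up is to count them. The sketch below is only an illustration under assumptions: the bounding box is a made-up extent around Ankara and the helper names are mine. It counts how many Web Mercator tiles cover that box at each zoom level. It does not reproduce the 5–10 times figure above, which also depends on image format and resolution, but it shows that every extra zoom level roughly quadruples the tile count.

```python
import math


def tile_index(lon, lat, zoom):
    """XYZ tile (x, y) containing the given lon/lat on the Web Mercator grid."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def count_tiles(west, south, east, north, zoom):
    """Number of tiles needed to cover a lon/lat bounding box at one zoom level."""
    x_min, y_min = tile_index(west, north, zoom)  # top-left corner
    x_max, y_max = tile_index(east, south, zoom)  # bottom-right corner
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    bbox = (32.6, 39.8, 33.0, 40.0)  # hypothetical extent, not a real dataset
    total = 0
    for zoom in range(10, 19):
        n = count_tiles(*bbox, zoom)
        total += n
        print(f"zoom {zoom}: {n} tiles")
    print("total:", total)
```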

Having uploaded all the GeoTIFFs to Mapbox, I used each image as a layer on my web map. Here is the source code: https://github.com/saricicekibrahim/haritaANKARA, and here is the web map: https://saricicekibrahim.github.io/haritaankara. Mapbox is free up to 200,000 requests per month for raster tiles. For me it has been a perfect solution for serving all the rasters without pre-tiling.

I could also have used OpenAerialMap (OAM). OpenAerialMap is an open service that provides access to a commons of openly licensed imagery and map layer services. But OAM is a service for recent or near-historic aerial data. Uploading old maps and plans there would have been off-topic, and the images could be deleted at any time for not matching the intended purpose.

What other solutions are there for serving rasters? MapServer with TileCache, or any other web server, can create tiles on demand or by seeding, and gdal2tiles creates pre-generated tiles. But is there any other option?

While taking part in “Guiding Redesign of OpenAerialMap with Product Mindset” for the Humanitarian OpenStreetMap Team at Kontur (blog post here -> https://www.kontur.io/blog/oam-redesign/), I realized that dynamic tiling has been used for many years to create maps from rasters without the need for a cache. In the simplest terms, dynamic tiling means creating small images from a larger image on demand. What kind of larger image? For performance reasons there is one more term to dig into: Cloud Optimized GeoTIFF (COG).

According to https://www.cogeo.org, a Cloud Optimized GeoTIFF (COG) is a regular GeoTIFF file, aimed at being hosted on an HTTP file server, with an internal organization that enables more efficient workflows in the cloud. It does this by leveraging the ability of clients to issue HTTP GET range requests that ask for just the parts of a file they need. In summary, if a GeoTIFF is not “cloud optimized” with overviews and tiles, remote operations on the data will still work, but they may download the whole file or large portions of it when only a very small part of the data is actually needed.
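The idea of asking for “just the parts of a file” can be sketched with a little arithmetic. The example below assumes a hypothetical 20000 x 20000 pixel image stored in 512-pixel internal tiles; the sizes, function names and byte offsets are all illustrative. It works out which internal tiles a small pixel window touches and what an HTTP Range header for one tile would look like. In a real file the offset and length of each tile come from the TIFF tile offset and byte count tags, which a COG reader fetches first.

```python
import math


def blocks_for_window(col_off, row_off, width, height, block_size=512):
    """Return (block_col, block_row) pairs of internal tiles that a pixel window touches."""
    first_col = col_off // block_size
    last_col = (col_off + width - 1) // block_size
    first_row = row_off // block_size
    last_row = (row_off + height - 1) // block_size
    return [
        (c, r)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


def total_blocks(image_width, image_height, block_size=512):
    """Total number of internal tiles in one resolution level of the image."""
    return math.ceil(image_width / block_size) * math.ceil(image_height / block_size)


def range_header(offset, length):
    """HTTP Range header for `length` bytes starting at `offset` (end byte is inclusive)."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


if __name__ == "__main__":
    needed = blocks_for_window(1000, 1000, 256, 256)
    print("tiles needed:", needed)
    print("tiles in the full image:", total_blocks(20000, 20000))
    print(range_header(1024, 512))  # made-up offset and length
```

Reading a 256-pixel window here touches a handful of internal tiles instead of the whole file, which is the saving the COG layout is designed for.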

There is a great article on how to make GeoTIFFs cloud optimized: https://blog.cleverelephant.ca/2015/02/geotiff-compression-for-dummies.html
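For reference, recent GDAL versions (3.1 and later) include a dedicated COG output driver, so one conversion step can be sketched as below. This is a minimal sketch, not the workflow from the linked article: the file names are hypothetical, the helper function is mine, and the compression choice is just an assumption that should be adapted to the imagery. The script only builds and prints the command; it could be handed to `subprocess.run` on a machine where GDAL is installed.

```python
def build_cog_command(src, dst, compress="DEFLATE"):
    """Build a gdal_translate call that writes `src` as a Cloud Optimized GeoTIFF.

    Relies on the GDAL COG driver (GDAL 3.1 or newer), which creates the
    internal tiling and overviews itself.
    """
    return [
        "gdal_translate",
        src,
        dst,
        "-of", "COG",                  # output format: Cloud Optimized GeoTIFF
        "-co", f"COMPRESS={compress}",  # creation option: compression method
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_cog_command("old_map_georeferenced.tif", "old_map_cog.tif")
    print(" ".join(cmd))
```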

In other words, instead of storing large GeoTIFF files plus the tile caches produced from them, it is enough to optimize the GeoTIFFs for the cloud and produce tiles only when necessary. I have tried to define static and dynamic tiling here, but there is a fuller explanation at https://developmentseed.org/titiler/dynamic_tiling/

But how do you do dynamic tiling? At the foundation of maybe everything in the GIS world there is GDAL. As part of GDAL, gdalwarp can crop images using several methods. There is also MapServer, which some call a cockroach because it has survived for years: it supports many vector and raster formats alongside newer ways of serving geospatial data, and it is still used by many organizations with great performance. Besides that, GeoServer has an extension for reading and tiling COGs, but I have not managed to get it running yet.
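To illustrate how cropping with gdalwarp relates to a single map tile, here is a hedged sketch. It computes the Web Mercator (EPSG:3857) bounds of one XYZ tile and builds a gdalwarp call that cuts exactly that extent at 256 x 256 pixels. The helper names and file names are hypothetical, and a real dynamic tile server does this in memory rather than by shelling out, so treat it as a way to see the idea rather than as a production approach.

```python
import math

# Half the width of the Web Mercator world in metres (about 20037508.34).
ORIGIN = math.pi * 6378137.0


def tile_bounds_3857(z, x, y):
    """Return (xmin, ymin, xmax, ymax) of an XYZ tile in EPSG:3857 metres."""
    size = 2 * ORIGIN / 2 ** z  # width of one tile at this zoom level
    xmin = -ORIGIN + x * size
    ymax = ORIGIN - y * size    # XYZ rows are counted from the top
    return xmin, ymax - size, xmin + size, ymax


def gdalwarp_tile_command(src, dst, z, x, y, tile_px=256):
    """Build a gdalwarp call that crops and resamples `src` to one tile."""
    xmin, ymin, xmax, ymax = tile_bounds_3857(z, x, y)
    return [
        "gdalwarp",
        "-t_srs", "EPSG:3857",                                # reproject to Web Mercator
        "-te", str(xmin), str(ymin), str(xmax), str(ymax),    # target extent
        "-ts", str(tile_px), str(tile_px),                    # output size in pixels
        "-r", "bilinear",                                     # resampling method
        src,
        dst,  # a GeoTIFF here; converting to PNG would be a separate step
    ]


if __name__ == "__main__":
    # Hypothetical input and tile address.
    print(" ".join(gdalwarp_tile_command("old_map_cog.tif", "tile.tif", 10, 605, 387)))
```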

There are also more modern options such as Rasterio and a very cool tool, TiTiler. TiTiler is a dynamic tile server built on top of FastAPI and Rasterio/GDAL. Installation and usage are super easy, and it comes with a default application that supports COG, STAC and MosaicJSON. Did I type STAC?
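From the client side, using a dynamic tile server mostly comes down to building a tile URL that points at the COG. The sketch below assumes a TiTiler instance running locally and a COG at a made-up address. The route layout (`/cog/tiles/{tileMatrixSetId}/{z}/{x}/{y}` with a `url` query parameter) is my understanding of the default application and can differ between versions, so it is worth checking the interactive docs page of the running instance.

```python
from urllib.parse import urlencode


def titiler_tile_url(endpoint, cog_url, z, x, y, tms="WebMercatorQuad"):
    """Build a tile request URL for a TiTiler-style COG endpoint.

    The route layout is an assumption and may vary with the TiTiler version.
    """
    query = urlencode({"url": cog_url})  # percent-encodes the COG address
    return f"{endpoint.rstrip('/')}/cog/tiles/{tms}/{z}/{x}/{y}?{query}"


if __name__ == "__main__":
    # Both addresses are hypothetical placeholders.
    print(
        titiler_tile_url(
            "http://localhost:8000",
            "https://example.com/data/old_map_cog.tif",
            10, 605, 387,
        )
    )
```

A web map library can then use the same pattern as a tile layer template, replacing the numbers with its `{z}/{x}/{y}` placeholders.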

Here is another term that was new, at least for me: the STAC API. Before STAC, it is worth saying a bit about OIN. OpenAerialMap has been using a standard named Open Imagery Network (OIN), a discoverable network of openly licensed imagery. Contributors to OIN make imagery and its associated metadata available under a common license, and they have developed an infrastructure model that lets imagery shared under an open license or free of charge be consumed easily by applications.

As infrastructure has become more accessible, earth imagery has become easier for organizations to reach, and the number of providers hosting imagery online has increased over the years. Applications that serve data online and interfaces for searching and retrieving images with their metadata have grown up to describe and provide access to that data. The SpatioTemporal Asset Catalog (STAC) specification aims to provide a unified framework for describing and linking to earth observation data, with the goal of increasing the interoperability of search tools and making the data easier to access.
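To give a feel for what “describing and linking to” data means in practice, here is a minimal sketch of a STAC Item built as a plain Python dictionary. A STAC Item is a GeoJSON Feature with a few extra required fields. The identifier, date, extent and asset address below are invented for illustration; only the field names and the COG media type follow the specification as I understand it.

```python
import json


def make_stac_item(item_id, bbox, datetime_utc, cog_href):
    """Build a minimal STAC Item (a GeoJSON Feature) that points at one COG asset."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last positions are the same.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {
            "image": {
                "href": cog_href,
                "type": "image/tiff; application=geotiff; profile=cloud-optimized",
                "roles": ["data"],
            }
        },
    }


if __name__ == "__main__":
    # All values are hypothetical placeholders.
    item = make_stac_item(
        "old-map-example",
        (32.6, 39.8, 33.0, 40.0),
        "2023-02-02T00:00:00Z",
        "https://example.com/data/old_map_cog.tif",
    )
    print(json.dumps(item, indent=2))
```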

The STAC ecosystem is driven by open source contributions and community support. The amount of data hosted in STAC APIs and static catalogs grows as data providers adopt STAC for their spatiotemporal data. As more datasets are created and more tools and resources are developed, the community adds them to stacindex.org so they can be discovered and explored.
