COG physical arrangements, best ways to decode/cache and encode other data (vector ?) #7

Farkal · 2020-07-13T13:45:02Z

I think you should also add a diagram of the COG physical arrangement. Here is what I found at https://www.fileformat.info/format/tiff/egff.htm

tiff_architecture.drawio.tar.gz

I added the 4 based on what I read in this spec. I also see that you added an area for the values of TIFF tags that don't fit inline in the IFD, such as TileOffsets, TileByteCounts and GeoTIFF keys, but there is an open issue asking whether this area comes before all the IFDs or after each IFD. The new solution is also proposed as being more efficient, and I think there should be more information about the best ways to decode and cache a COG file.
Here is a guideline for getting a tile from X, Y and Z (I can make a PR to add it to the website or to the spec):

Decoding a tile (X, Y, Z):

Make a first request for the first 1024 bytes of the file
Decode the IFDs in memory
If an IFD does not fit entirely in the bytes already fetched, make a new request starting at the IFD offset and covering the IFD size (in classic TIFF, 2 + 12 * entry count + 4 bytes)
Find the IFD corresponding to Z by matching the tile matrix resolution against the resolution of each IFD, derived from the full-resolution image
With the corresponding IFD, compute the index into TileOffsets and TileByteCounts as Y * ceil(ImageWidth / TileWidth) + X
Make the HTTP range request for the TileByteCounts[index] bytes starting at TileOffsets[index]
Cache the IFD structure in memory
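
To make the arithmetic in these steps concrete, here is a minimal Python sketch. It is only an illustration, not a reference decoder: all function names are hypothetical, it assumes a single-plane (chunky) tiled image, and the rule used to pick the IFD for a zoom level (closest resolution, with overview resolution derived from the width ratio) is my assumption rather than something the spec states.

```python
def ifd_byte_length(entry_count: int, bigtiff: bool = False) -> int:
    """Bytes occupied by one IFD: count field + entries + next-IFD offset."""
    if bigtiff:
        # BigTIFF: 8-byte count, 20-byte entries, 8-byte next-IFD offset
        return 8 + 20 * entry_count + 8
    # Classic TIFF: 2-byte count, 12-byte entries, 4-byte next-IFD offset
    return 2 + 12 * entry_count + 4


def ifd_for_resolution(ifd_widths, full_resolution, target_resolution):
    """Pick the IFD whose resolution is closest to the requested one.

    Assumption: ifd_widths[0] is the full-resolution image and each overview's
    resolution scales with full_width / overview_width.
    """
    full_width = ifd_widths[0]
    resolutions = [full_resolution * full_width / w for w in ifd_widths]
    return min(range(len(resolutions)),
               key=lambda i: abs(resolutions[i] - target_resolution))


def tile_index(x: int, y: int, image_width: int, tile_width: int) -> int:
    """Index into TileOffsets / TileByteCounts for tile column x, row y."""
    tiles_across = (image_width + tile_width - 1) // tile_width  # ceil division
    return y * tiles_across + x


def tile_range_header(x, y, image_width, tile_width,
                      tile_offsets, tile_byte_counts) -> str:
    """HTTP Range header value for one tile (Range end is inclusive)."""
    i = tile_index(x, y, image_width, tile_width)
    start = tile_offsets[i]
    end = start + tile_byte_counts[i] - 1
    return f"bytes={start}-{end}"
```

For example, with ImageWidth = 1000 and TileWidth = 256 there are 4 tiles per row, so tile (X=1, Y=2) is at index 9.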

With caching we reduce the number of requests for the other tiles of the same COG. But if the image has a very large resolution, we will saturate our memory.
So I think the best architecture would be to put the TileOffsets and TileByteCounts after all the IFDs. It would also be great to have the header size available in a tag on the first IFD. Then at most two requests would be needed to get all the IFDs: a first request of 1024 bytes and, if Tag::IFDTotalSize > 1024, a second request for the remaining bytes up to Tag::IFDTotalSize.
Store TileOffsets and TileByteCounts in memory only if their size is not too large.
For all the other requests to the same COG, if we have the TileOffsets and TileByteCounts in memory we can get a tile with one request; otherwise we need two requests.
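
As a rough sketch of this proposal, assuming the suggested IFDTotalSize tag existed (it is not a real TIFF tag today), the request planning and the size-limited cache could look like the code below. The names `header_ranges` and `CogIndexCache`, the 8 bytes per value and the memory budget are all hypothetical choices for illustration.

```python
from dataclasses import dataclass, field

FIRST_REQUEST_SIZE = 1024  # size of the initial request, as proposed above


def header_ranges(ifd_total_size: int,
                  first_request_size: int = FIRST_REQUEST_SIZE):
    """Inclusive byte ranges needed to fetch all IFDs (at most two requests)."""
    ranges = [(0, first_request_size - 1)]
    if ifd_total_size > first_request_size:
        # Second request: the remaining header bytes up to IFDTotalSize
        ranges.append((first_request_size, ifd_total_size - 1))
    return ranges


@dataclass
class CogIndexCache:
    """Keeps TileOffsets / TileByteCounts in memory only under a size budget."""
    max_bytes: int
    arrays: dict = field(default_factory=dict)

    def maybe_store(self, key, tile_offsets, tile_byte_counts,
                    bytes_per_value: int = 8) -> bool:
        size = (len(tile_offsets) + len(tile_byte_counts)) * bytes_per_value
        if size > self.max_bytes:
            return False  # too large: re-fetch the arrays when needed
        self.arrays[key] = (tile_offsets, tile_byte_counts)
        return True

    def requests_needed(self, key) -> int:
        """One request per tile if the arrays are cached, otherwise two."""
        return 1 if key in self.arrays else 2
```

A caller would key the cache by COG URL and IFD index, so large overviews can be skipped while small ones stay in memory.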

We could also add some guidelines for adding other dimensions. Today I don't know the best way to add another dimension to a COG (for example altitude). Also, how can we store vector data in a COG? On https://www.fileformat.info/format/tiff/egff.htm the author wrote:

TIFF files contain only bitmap data, although adding a few tags to support vector- or text-based images would not be a hard thing to do.

There are multiple projects trying to define a spec for vector data (for example https://github.com/planetfederal/cogj-spec), and by just adding some tags we should be able to handle vector data and get a simpler processing chain for all geographic data.
I think this spec should be a place to propose and discuss the best implementations for all these use cases.
