Cleaning methods when loading optical bands#

Let’s take a look at the cleaning methods available for optical bands and how long each one may take.

Warning: The durations shown below may not be representative of your computer’s performance. Please treat them as a hint about the relative performance of each method across constellations.

To summarize:

  • RAW is fast and dirty

  • NODATA is used by default; it is still relatively fast and sets the pixels outside the detector footprint to nodata

  • CLEAN is the most complete method (used before version 0.11.0) but can be very slow. As defective pixels are relatively rare, it may be overkill for your usage.
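
The difference between the three methods can be sketched on a toy band. This is only an illustration of the behaviour summarized above, not EOReader's implementation: `ToyCleanMethod`, `clean_band` and the two boolean masks are hypothetical names, and the sketch assumes that CLEAN also applies the detector footprint.

```python
import math
from enum import Enum


class ToyCleanMethod(Enum):
    """Hypothetical stand-in for the three cleaning methods (not EOReader's enum)."""

    RAW = "raw"
    NODATA = "nodata"
    CLEAN = "clean"


def clean_band(band, outside_footprint, defective, method):
    """Return a copy of `band` where the pixels masked by `method` become NaN."""
    cleaned = []
    for value, is_outside, is_defective in zip(band, outside_footprint, defective):
        if method is ToyCleanMethod.RAW:
            masked = False  # no pixel processing at all
        elif method is ToyCleanMethod.NODATA:
            masked = is_outside  # only the detector footprint is applied
        else:
            # Assumption: CLEAN applies the footprint and the provider's flags
            masked = is_outside or is_defective
        cleaned.append(math.nan if masked else float(value))
    return cleaned


# One row of a toy band: 0 is the raw nodata value, 9999 stands for a defective pixel
band = [0, 120, 9999, 135, 0]
outside_footprint = [True, False, False, False, True]
defective = [False, False, True, False, False]

for method in ToyCleanMethod:
    print(method.name, clean_band(band, outside_footprint, defective, method))
```

RAW keeps every value as stored, NODATA only masks the two border pixels, and CLEAN additionally masks the defective one.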

Note that this keyword works with both the load and stack functions.

Try with Landsat-8#

Let’s open a Landsat-8 OLI Collection 2 tile. Landsat Collection 2 products manage their nodata and defective pixels through two flag files:

  • QA_PIXEL

  • QA_RADSAT

See more about these files here
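
As a rough illustration of how such bit-packed flag files can be read, the sketch below tests single bits on a few sample values. The bit positions are assumptions taken from my reading of the USGS Collection 2 format (fill flag in bit 0 of the pixel quality file, band 3 saturation in bit 2 of the radiometric saturation file) and should be checked against the product guide. The helper names are hypothetical, and EOReader's own handling may differ.

```python
FILL_BIT = 0          # assumed: fill (nodata) flag of the pixel quality file
GREEN_RADSAT_BIT = 2  # assumed: band 3 (green) flag of the saturation file


def bit_is_set(value, bit):
    """Return True if the given bit of an integer flag value is set."""
    return (value >> bit) & 1 == 1


def invalid_flags(qa_pixel, qa_radsat):
    """Return two boolean lists: nodata pixels and saturated green pixels."""
    nodata = [bit_is_set(v, FILL_BIT) for v in qa_pixel]
    saturated = [bit_is_set(v, GREEN_RADSAT_BIT) for v in qa_radsat]
    return nodata, saturated


# Sample values: 1 has only bit 0 set, 64 is an arbitrary value with bit 0 unset
qa_pixel = [1, 64, 64, 1]
# Sample values: 4 has only bit 2 set
qa_radsat = [0, 0, 4, 0]

nodata, saturated = invalid_flags(qa_pixel, qa_radsat)
print(nodata)     # [True, False, False, True]
print(saturated)  # [False, False, True, False]
```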

# Imports
import os
from eoreader.reader import Reader
from eoreader.bands import GREEN
from eoreader.keywords import CLEAN_OPTICAL
from eoreader.products import CleanMethod
# Open the product
folder = os.path.join("/home", "data", "DS3", "CI", "eoreader", "optical")
path = os.path.join(folder, "LC08_L1TP_200030_20201220_20210310_02_T1.tar")
reader = Reader()
prod = reader.open(path)
There is no existing products in EOReader corresponding to /home/data/DS3/CI/eoreader/optical/LC08_L1TP_200030_20201220_20210310_02_T1.tar

Time the RAW method#

The RAW method is simple: just open the given tile with no pixel processing.

%%timeit
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
prod.clean_tmp()
AttributeError: 'NoneType' object has no attribute 'load'

Time the NODATA method#

Only the detector nodata is processed by the NODATA method.
The bands are set to nodata outside of the detector footprint (instead of keeping the raw nodata value).

%%timeit
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()
2.76 s ± 57.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Time the CLEAN method#

Every defective pixel given by the provider is processed by the CLEAN method. These pixels will be set to nodata.

%%timeit
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()
4.93 s ± 70.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Try another product: Sentinel-2#

Let’s open a Sentinel-2 product (processing baseline < 04.00, acquired roughly before the end of 2021, with flag files provided as vectors).
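
The processing baseline is encoded in the product name, in the `Nxxyy` field. A minimal sketch of how the mask format could be deduced from it is shown below. The function names are hypothetical and this is not how EOReader necessarily does it. The 04.00 threshold is the one stated in this page.

```python
import re


def processing_baseline(safe_name):
    """Return the processing baseline of a Sentinel-2 product name as (major, minor)."""
    match = re.search(r"_N(\d{2})(\d{2})_", safe_name)
    if match is None:
        raise ValueError(f"No processing baseline found in {safe_name!r}")
    major, minor = match.groups()
    return int(major), int(minor)


def mask_format(safe_name):
    """Flag files are vectors before baseline 04.00 and rasters from 04.00 onwards."""
    return "raster" if processing_baseline(safe_name) >= (4, 0) else "vector"


old = "S2B_MSIL2A_20200114T065229_N0213_R020_T40REQ_20200114T094749.SAFE"
new = "S2B_MSIL2A_20210517T103619_N7990_R008_T30QVE_20211004T113819.SAFE"

print(processing_baseline(old), mask_format(old))  # (2, 13) vector
print(processing_baseline(new), mask_format(new))  # (79, 90) raster
```

Comparing `(major, minor)` tuples avoids any floating-point comparison on the version number.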

The invalid pixels are retrieved from the following files:

  • DETFOO: Detector footprint (nodata outside the detectors)

  • NODATA: Pixel nodata (inside the detectors) (QT_NODATA_PIXELS)

  • DEFECT: Defective pixels

  • SATURA: Saturated Pixels

  • TECQUA: Technical quality mask (MSI_LOST, MSI_DEG)

Note: Open the 20 m bands, to have array shapes comparable to Landsat-8.

# Open the product
path = os.path.join(folder, "S2B_MSIL2A_20200114T065229_N0213_R020_T40REQ_20200114T094749.SAFE")
prod = reader.open(path)

Time the RAW method#

The RAW method is simple: just open the given tile with no pixel processing.

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
prod.clean_tmp()
1.75 s ± 46.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Time the NODATA method#

Only the detector nodata is processed by the NODATA method.
The bands are set to nodata outside of the detector footprint (instead of keeping the raw nodata value).

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()
2.23 s ± 27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Time the CLEAN method#

Every defective pixel given by the provider is processed by the CLEAN method. These pixels will be set to nodata.

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()
2.67 s ± 36 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Try with the latest Sentinel-2 baseline#

Let’s open a Sentinel-2 product (processing baseline >= 04.00, acquired roughly after the end of 2021, with flag files provided as rasters).

The invalid pixels are retrieved from the following file:

  • QUALIT: Groups together TECQUA, DEFECT, NODATA and SATURA

The nodata pixels (outside detector footprints) are now retrieved from null pixels, as a radiometric offset has been added.
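
To see why the offset makes null pixels usable as nodata, here is a small sketch. It assumes an additive offset of -1000 and a quantification value of 10000, which are the values I understand to be used since baseline 04.00. In practice both should be read from the product metadata, and `to_reflectance` is a hypothetical helper.

```python
import math

NODATA_DN = 0           # null pixels mark nodata
ADD_OFFSET = -1000      # assumed radiometric offset (read it from the metadata)
QUANTIFICATION = 10000  # assumed quantification value (read it from the metadata)


def to_reflectance(digital_numbers):
    """Convert digital numbers to reflectance, mapping null pixels to NaN."""
    return [
        math.nan if dn == NODATA_DN else (dn + ADD_OFFSET) / QUANTIFICATION
        for dn in digital_numbers
    ]


# A zero-reflectance pixel is stored as 1000, so 0 can only mean nodata
print(to_reflectance([0, 1000, 1500, 3000]))  # [nan, 0.0, 0.05, 0.2]
```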

See here for more information about the processing baseline update.

Note: Open the 20 m bands, to have array shapes comparable to Landsat-8.

# Open the product
path = os.path.join(folder, "S2B_MSIL2A_20210517T103619_N7990_R008_T30QVE_20211004T113819.SAFE")
prod = reader.open(path)

Time the RAW method#

The RAW method is simple: just open the given tile with no pixel processing.

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
prod.clean_tmp()
1.78 s ± 23.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Time the NODATA method#

Only the detector nodata is processed by the NODATA method.
The bands are set to nodata outside of the detector footprint (instead of keeping the raw nodata value).

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()
1.95 s ± 32.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Time the CLEAN method#

Every defective pixel given by the provider is processed by the CLEAN method. These pixels will be set to nodata.

%%timeit
prod.load(
    GREEN,
    resolution=20.,
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()
9.6 s ± 120 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
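
The timings above come from the notebook `%%timeit` magic. To get comparable numbers on your own machine outside a notebook, a minimal sketch with the standard `timeit` module could look like the following. `fake_load` is a hypothetical stand-in for a call such as `prod.load(...)` followed by `prod.clean_tmp()`. It is not part of EOReader.

```python
import timeit
from statistics import mean, stdev


def time_call(func, repeat=7, number=1):
    """Return (mean, std. dev.) in seconds per call over `repeat` runs."""
    # Each entry of `totals` is the total time taken by `number` calls
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return mean(per_call), stdev(per_call)


def fake_load():
    """Hypothetical stand-in for the load call: replace it with your own."""
    return sum(i * i for i in range(10_000))


avg, std = time_call(fake_load)
print(f"{avg * 1e3:.3f} ms ± {std * 1e3:.3f} ms per loop (mean ± std. dev. of 7 runs, 1 loop each)")
```

Timing each cleaning method with the same `repeat` and `number` values keeps the comparison between methods fair, even if the absolute durations differ from the ones shown here.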