Python API
The Python API provides programmatic access to gedixr's functionality for custom workflows.
Core Functions
download_data
Download GEDI data via the NASA Harmony API for a given time range and spatial subset.
Note that if subset_vector is provided, the download is subset to the bounding box
of the vector geometry, not to the exact geometry itself. For precise spatial
subsetting, pass the same vector file again during data extraction.
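The difference between a bounding-box subset and an exact-geometry subset can be illustrated with a small standalone sketch. It does not use gedixr; the triangle, the shot location, and the helper names `bounding_box`, `in_bbox`, and `in_polygon` are hypothetical and only serve to show why a shot can pass the download-time subset and still fall outside the area of interest.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat)


def bounding_box(ring: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) of a polygon ring."""
    lons = [p[0] for p in ring]
    lats = [p[1] for p in ring]
    return min(lons), min(lats), max(lons), max(lats)


def in_bbox(point: Point, bbox: Tuple[float, float, float, float]) -> bool:
    """Check whether a point lies inside (or on the edge of) a bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= point[0] <= max_lon and min_lat <= point[1] <= max_lat


def in_polygon(point: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting point-in-polygon test (points exactly on the boundary are not handled)."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical triangular area of interest and shot location
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
bbox = bounding_box(triangle)
shot = (3.0, 3.0)

print(in_bbox(shot, bbox))         # True: the shot would be part of the download
print(in_polygon(shot, triangle))  # False: the shot lies outside the actual geometry
```

Shots like the one above are the reason for passing the vector file a second time during extraction.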
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory | str or Path | Directory where downloaded files will be saved. A subdirectory named after the GEDI product will be created within this directory and files will be saved there. | required |
| gedi_product | str | GEDI product name: 'L2A' or 'L2B' | required |
| time_range | tuple of str | Time range as (start_date, end_date) in format 'YYYY-MM-DD' | None |
| subset_vector | str or Path | Path to vector file for spatial subsetting. Please note that the download will be subset to the bounding box of the vector geometry and not the exact geometry itself. To perform precise spatial subsetting, use the vector file again during data extraction. If provided, takes precedence over subset_bbox. | None |
| subset_bbox | tuple of float | Bounding box as (min_lon, min_lat, max_lon, max_lat). | None |
| job_id | str | Harmony job ID to resume a previous download. If provided, a new request will not be submitted and other parameters (time_range, subset_*) are ignored. | None |
| verbose | bool | Whether to print progress messages | True |
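Because a Harmony request can take a while, it may be worth checking the time range and bounding box locally before submitting. The following is a minimal sketch under stated assumptions: `check_time_range` and `check_bbox` are hypothetical helpers, not part of gedixr, and the bounding-box check assumes geographic coordinates in degrees and rejects boxes that cross the antimeridian.

```python
from datetime import date, datetime
from typing import Tuple


def check_time_range(time_range: Tuple[str, str]) -> Tuple[date, date]:
    """Parse ('YYYY-MM-DD', 'YYYY-MM-DD') and ensure start is not after end."""
    start, end = (datetime.strptime(s, "%Y-%m-%d").date() for s in time_range)
    if start > end:
        raise ValueError(f"start date {start} is after end date {end}")
    return start, end


def check_bbox(bbox: Tuple[float, float, float, float]) -> Tuple[float, ...]:
    """Ensure the order (min_lon, min_lat, max_lon, max_lat) and plausible ranges."""
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError(f"invalid longitude range: {min_lon}, {max_lon}")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError(f"invalid latitude range: {min_lat}, {max_lat}")
    return tuple(float(v) for v in bbox)


# Values taken from the documented example call
start, end = check_time_range(("2020-01-01", "2020-01-31"))
bbox = check_bbox((-10, 40, 5, 50))
```

A call such as `check_bbox((5, 40, -10, 50))` raises a `ValueError`, because the longitudes are given in the wrong order.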
Returns:
| Type | Description |
|---|---|
| tuple of (list of Path, str) | Downloaded file paths and the job ID for potential resumption. |
Examples:
>>> from gedixr.download import download_data
>>> # Initial download
>>> files, job_id = download_data(
...     directory='data/gedi',
...     gedi_product='L2A',
...     time_range=('2020-01-01', '2020-01-31'),
...     subset_bbox=(-10, 40, 5, 50)
... )
>>> # Resume interrupted download
>>> files, job_id = download_data(
...     directory='data/gedi',
...     gedi_product='L2A',
...     job_id=job_id
... )
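The resume example above relies on `job_id` still being available in the same Python session. One possible way to resume from a later session is to write the job ID to a small file once you have it. This is only a sketch: `save_job_id`, `load_job_id`, and the JSON state file are hypothetical and not part of gedixr.

```python
import json
from pathlib import Path
from typing import Optional


def save_job_id(job_id: str, state_file: Path) -> None:
    """Store a Harmony job ID in a small JSON file (hypothetical helper)."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps({"job_id": job_id}))


def load_job_id(state_file: Path) -> Optional[str]:
    """Return the stored job ID, or None if no state file exists yet."""
    if not state_file.exists():
        return None
    return json.loads(state_file.read_text()).get("job_id")
```

The value returned by `load_job_id` can then be passed as the `job_id` argument of `download_data`; if it is `None`, a regular request with `time_range` and a spatial subset is needed instead.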
Source code in gedixr/download.py
extract_data
Extracts data from GEDI L2A or L2B files in HDF5 format using the following steps:
(1) Search a root directory recursively for GEDI L2A or L2B HDF5 files
(2) OPTIONAL: Filter files by month of acquisition
(3) Extract data from each file for specified beams and variables into a DataFrame
(4) OPTIONAL: Filter out shots of poor quality
(5) Convert DataFrame to GeoDataFrame including geometry column
(6) OPTIONAL: Subset shots spatially using intersection via provided vector file or list of vector files
(7) Save the result as a GeoParquet file or multiple files (one per provided vector file, if applicable)
(8) Return a GeoDataFrame or dictionary of GeoDataFrame objects (one per provided vector file, if applicable)
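Steps (1) and (2) can be sketched with the standard library alone. The example below is not gedixr's implementation. It assumes the standard GEDI granule naming, in which the file name starts with `GEDI02_A_` or `GEDI02_B_` followed by the acquisition year and day of year, and the helper name `find_gedi_files` is hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import List, Tuple

# e.g. GEDI02_A_2019108002012_O01959_... -> product letter, year, day of year
_GRANULE = re.compile(r"GEDI02_([AB])_(\d{4})(\d{3})\d{6}_")


def find_gedi_files(root, gedi_product: str = "L2B",
                    filter_month: Tuple[int, int] = (1, 12)) -> List[Path]:
    """Recursively collect GEDI HDF5 files of one product, filtered by month."""
    letter = gedi_product[-1]  # 'L2A' -> 'A', 'L2B' -> 'B'
    first, last = filter_month
    hits = []
    for path in sorted(Path(root).rglob("*.h5")):
        match = _GRANULE.search(path.name)
        if match is None or match.group(1) != letter:
            continue  # not a granule of the requested product
        # Year plus day of year gives the acquisition date
        acquired = datetime.strptime(match.group(2) + match.group(3), "%Y%j")
        if first <= acquired.month <= last:
            hits.append(path)
    return hits
```

For example, `find_gedi_files('data/gedi', 'L2A', (6, 8))` would return only L2A granules acquired between June and August. Month ranges that wrap around the end of the year are not handled in this sketch.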
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory | str \| Path | Root directory to recursively search for GEDI L2A/L2B files. | required |
| gedi_product | str | GEDI product type. Either 'L2A' or 'L2B'. Default is 'L2B'. | required |
| variables | Optional[list[tuple[str, str]]] | List of tuples containing the desired column name in the returned GeoDataFrame and the GEDI layer name to be extracted. Defaults to those retrieved by | None |
| beams | Optional[str \| list[str]] | Which GEDI beams to extract values from? Defaults to all beams (power and coverage beams). Use | None |
| filter_month | Optional[tuple[int, int]] | Filter GEDI shots by month of the year? E.g. (6, 8) to only keep shots that were acquired between June 1st and August 31st of each year. Defaults to (1, 12), which keeps all shots of each year. | None |
| subset_vector | Optional[str \| Path \| list[str \| Path]] | Path or list of paths to vector files in a fiona supported format to subset the GEDI data spatially. Default is None, to keep all shots. Note that the basename of each vector file will be used in the output names, so it is recommended to give those files reasonable names beforehand! | None |
| apply_quality_filter | bool | Apply a basic quality filter to the GEDI data? Default is True. This basic filtering strategy will filter out shots with quality_flag != 1, degrade_flag != 0, num_detectedmodes > 1, and difference between detected elevation and DEM elevation < 100 m. | True |
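Since the basename of each vector file ends up in the output names, two files with the same basename in different folders could be hard to tell apart in the results. A quick check before calling `extract_data` might look like the sketch below. The helper `duplicate_basenames` is hypothetical, and it assumes that the basename is the file name without its extension.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, List


def duplicate_basenames(paths: Iterable) -> List[str]:
    """Return basenames (file names without extension) that occur more than once."""
    counts = Counter(Path(p).stem for p in paths)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical vector files intended for subset_vector
vectors = ["regions/site.gpkg", "backup/site.geojson", "plots/plot_01.shp"]
clashes = duplicate_basenames(vectors)
print(clashes)  # ['site']
```

An empty list means every vector file would map to a distinct name.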
Returns:
| Name | Type | Description |
|---|---|---|
| | GeoDataFrame or dictionary | In case of an output dictionary, these are the expected key, value pairs: |
| out_path | Path or None | In case no vector files were provided, the path to the output GeoParquet file is returned. Otherwise, None is returned as the output paths are included in the output dictionary. |
Source code in gedixr/extract.py
Post-extraction Functions
load_to_gdf
Loads GEDI L2A and/or L2B GeoParquet or GeoPackage files as GeoDataFrames. If both are provided, they will be merged into a single GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l2a | Optional[str \| Path] | Path to a GEDI L2A GeoParquet or GeoPackage file. | None |
| l2b | Optional[str \| Path] | Path to a GEDI L2B GeoParquet or GeoPackage file. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| final_gdf | GeoDataFrame | GeoDataFrame containing the data from the provided GEDI L2A and/or L2B files. |
Source code in gedixr/xr.py
merge_gdf
Merges the data of two GeoDataFrames containing GEDI L2A and L2B data. If
dictionaries are provided, they are assumed to follow the key, value structure of
the dictionary returned by gedixr.extract.extract_data. In that case, the function
merges the data of matching geometries and returns a dictionary of GeoDataFrames.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l2a | GeoDataFrame \| dict | GeoDataFrame or a dictionary of GeoDataFrames containing GEDI L2A data. | required |
| l2b | GeoDataFrame \| dict | GeoDataFrame or a dictionary of GeoDataFrames containing GEDI L2B data. | required |
| how | str | The type of merge to be performed. Default is 'inner'. | 'inner' |
| on | Optional[str \| list[str]] | The column(s) to merge on. Default is ['geometry', 'shot', 'acq_time']. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| merged_out | GeoDataFrame or dict | A GeoDataFrame or a dictionary of GeoDataFrames containing the merged GEDI L2A and L2B data. |
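To make the effect of `how='inner'` concrete, here is a plain-Python sketch of an inner merge on key columns. It is not how merge_gdf is implemented: it works on lists of dictionaries instead of GeoDataFrames, leaves out the 'geometry' key for brevity, assumes each key occurs at most once in the L2B records, and uses made-up values. The helper name `inner_merge` is hypothetical.

```python
from typing import Dict, List, Sequence


def inner_merge(l2a_rows: List[Dict], l2b_rows: List[Dict],
                on: Sequence[str] = ("shot", "acq_time")) -> List[Dict]:
    """Keep only records whose key columns occur in both inputs and combine them."""
    # Index the L2B records by their key tuple for fast lookup
    index = {tuple(row[k] for k in on): row for row in l2b_rows}
    merged = []
    for row in l2a_rows:
        key = tuple(row[k] for k in on)
        if key in index:
            merged.append({**index[key], **row})
    return merged


# Made-up records: only shot 2 is present in both products
l2a = [
    {"shot": 1, "acq_time": "2020-06-01T00:00:00", "rh98": 18.2},
    {"shot": 2, "acq_time": "2020-06-01T00:00:00", "rh98": 21.5},
]
l2b = [
    {"shot": 2, "acq_time": "2020-06-01T00:00:00", "pai": 3.1},
    {"shot": 3, "acq_time": "2020-06-01T00:00:00", "pai": 0.4},
]
merged = inner_merge(l2a, l2b)
print(len(merged))  # 1
```

With an inner merge, shots that appear in only one of the two products are dropped, which is why the result contains a single record here.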
Source code in gedixr/xr.py
gdf_to_xr
Rasterizes a GeoDataFrame containing GEDI L2A/L2B data to an xarray Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gdf | GeoDataFrame | GeoDataFrame containing GEDI L2A/L2B data. | required |
| measurements | Optional[list[str]] | List of measurement names (i.e. GEDI variables) to be included. Default is None, which will include all measurements. | None |
| resolution | Optional[tuple[float, float]] | A tuple of the pixel spacing of the returned data (Y, X). This includes the direction (as indicated by a positive or negative number). Default is (-0.0003, 0.0003), which corresponds to a spacing of approximately 30 m. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| cube | Dataset | An xarray Dataset containing the rasterized GEDI data. |
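The relation between the default resolution and a spacing of about 30 m can be checked with a short calculation. The sketch below assumes that the resolution is given in degrees of a geographic coordinate system and uses the common approximation of 111.32 km per degree of latitude; `pixel_size_m` is a hypothetical helper, not part of gedixr.

```python
import math
from typing import Tuple

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def pixel_size_m(resolution: Tuple[float, float],
                 latitude_deg: float) -> Tuple[float, float]:
    """Approximate pixel (height, width) in meters for a (Y, X) resolution in degrees."""
    res_y, res_x = resolution
    height = abs(res_y) * METERS_PER_DEGREE
    # One degree of longitude shrinks with the cosine of the latitude
    width = abs(res_x) * METERS_PER_DEGREE * math.cos(math.radians(latitude_deg))
    return height, width


equator = pixel_size_m((-0.0003, 0.0003), latitude_deg=0.0)
mid_lat = pixel_size_m((-0.0003, 0.0003), latitude_deg=50.0)
```

At the equator both sides come out at roughly 33 m, while at 50° latitude the pixel width drops to roughly 21 m. The stated 30 m is therefore a rule of thumb, not an exact ground distance.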