Photo by Justin Veenema on Unsplash
Exploring The Open Source World - What Is Moja FLINT All About, Anyways? (Temporal Distribution, Data Preparation) 4/🧵
Temporal Distribution, Time-Steps, Carbon Pools, Modules, IO, Simulation, Debris, Disturbances, Data Transformations, Variable Types
Temporal Distribution
To track the interactions between carbon pools, the pools need to be on a comparable temporal scale.
It is simple enough to track changes through multiple pools from multiple nodes.
It is more difficult when modules are built and calibrated for the time-frames that suit the pools they model, because those time-frames differ from module to module.
For example,
- Empirical forest growth module may provide annual growth data
- The debris module may operate monthly
- The soil carbon module may operate daily
The temporal scale in FLINT is referred to as "time-steps".
Time-Steps
Time-steps are lengths of time over which operations are reported.
It is only at the end of a time-step that carbon can be moved from one pool to another.
Time-steps reduce processing requirements of the model.
Because of time-steps, rather than continuous changes, there is "graininess" in the output data of FLINT.
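The rule that carbon only moves at the end of a time-step can be sketched in a few lines of Python. This is a minimal illustration, not FLINT's actual implementation: the `Pools` class, its method names and the pool names are all hypothetical.

```python
class Pools:
    """Hypothetical pool store: transfers are queued, then applied at step end."""

    def __init__(self, stocks):
        self.stocks = dict(stocks)   # pool name -> carbon stock
        self._pending = []           # transfers requested during the current step

    def request_transfer(self, source, sink, amount):
        # Nothing moves yet; the request is only recorded.
        self._pending.append((source, sink, amount))

    def end_time_step(self):
        # All queued transfers are applied together when the step closes.
        for source, sink, amount in self._pending:
            self.stocks[source] -= amount
            self.stocks[sink] += amount
        self._pending.clear()


pools = Pools({"biomass": 100.0, "debris": 20.0})
pools.request_transfer("biomass", "debris", 5.0)
stocks_mid_step = dict(pools.stocks)   # still the starting values
pools.end_time_step()
stocks_after_step = dict(pools.stocks)
```

Because stocks only change when `end_time_step` runs, the output is a series of snapshots rather than a continuous curve, which is the "graininess" described above.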
Handling Timing Of Modules In FLINT
FLINT has to balance the graininess of its output data against processing cost; its standard time-step is one month.
One month is the recommended time-step for modelling carbon.
With a standard time-step of one month, the FLINT automatically adjusts the output of each module through the unit controller.
The ability of FLINT to control the timing and flow of inputs and outputs from modules that operate at different time steps without adjusting the modules themselves is one of its key features.
Running an annual model at daily time steps can lead to significant errors.
Simulation Runtime
During a simulation, the FLINT will run at the finest time step of any input module or the output module.
Given a soil module running at a daily time step, FLINT will run daily.
The other modules still run at their native temporal resolution.
The FLINT will handle the interpolation of low temporal resolution data to high resolution data by proportionally allocating module output.
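As a rough sketch of proportional allocation, the snippet below spreads one annual module output evenly over daily steps. The function name and the growth figure are hypothetical, and the even split is an assumption made for illustration; the source does not say how FLINT weights the allocation.

```python
def allocate_to_substeps(total, substeps):
    """Split one coarse-step total into equal shares for finer sub-steps."""
    if substeps <= 0:
        raise ValueError("substeps must be positive")
    share = total / substeps
    return [share] * substeps


annual_growth = 3.65  # hypothetical annual growth output of a module
daily_growth = allocate_to_substeps(annual_growth, 365)
```

The daily shares add back up to the annual figure, so the annual module is never asked to run at a time step it was not calibrated for.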
Operations in FLINT ensure that all sub-time-step information is recorded and reported for the Flux summary tables, so that the appropriate summary statistics can be generated.
How do we handle disturbances if they occur part way through a time-step?
Disturbances
While a simulation is running, disturbances may occur part way through a time-step; FLINT is designed to handle this.
A disturbance is an event that significantly changes the predicted or current state of a system.
For example, a forest harvest occurring in a year.
The harvest operation as a whole may run over many weeks, but any sufficiently small region is affected within a single day.
The forest growth module attached to the FLINT is an annual growth model.
Data
Let's say a clearfell harvest occurs on day 200 of a growth year.
FLINT has growth data for the entire year.
It multiplies the growth by 200/365 and applies this to the pools prior to the harvest.
The harvest event then occurs, and FLINT sends the growth module the new data for the year: the updated pools.
Then, this is grown on for a full year from the new data point, and the growth data is proportioned back to the end of the original year (the remaining 165/365) using the same process described previously.
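The day-200 harvest can be worked through in code. This is a sketch under stated assumptions: `annual_growth` is a made-up growth curve standing in for a real annual growth module, the starting biomass is invented, and a clearfell is modelled as removing the whole pool.

```python
DAYS_IN_YEAR = 365


def annual_growth(biomass):
    """Hypothetical annual growth model: growth slows as biomass rises."""
    return max(0.0, 5.0 - 0.02 * biomass)


def run_year_with_harvest(biomass, harvest_day, harvest_fraction):
    # 1. Apply the share of the year's growth that falls before the harvest.
    pre_growth = annual_growth(biomass) * harvest_day / DAYS_IN_YEAR
    biomass += pre_growth

    # 2. The harvest event updates the pool.
    removed = biomass * harvest_fraction
    biomass -= removed

    # 3. Ask the growth model for a full year from the new state, then keep
    #    only the share that falls in the rest of the original year.
    remaining_days = DAYS_IN_YEAR - harvest_day
    post_growth = annual_growth(biomass) * remaining_days / DAYS_IN_YEAR
    biomass += post_growth

    return biomass, pre_growth, removed, post_growth


final, pre_growth, removed, post_growth = run_year_with_harvest(
    biomass=100.0, harvest_day=200, harvest_fraction=1.0
)
```

The growth module is still only ever called as an annual model; the 200/365 and 165/365 scaling happens outside it.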
Data Preparation
Sending Input
FLINT has a Variable System
Variables are of 2 types: Internal or External.
Each can return one of four data types,
- Static
- Time Series
- Object
- Table
Module writers will know which is which, each returning 1 of the 4 data types.
Internal Variables
Kept within the system, and used to share information between Modules
External Variables
Pulled from 4 categories of external Data Providers: Raster, Vector, SQL and NoSQL.
Extended FLINT
FLINT can be extended with new Data Providers, if they match 1 of the defined External Data Categories (EDC).
Through this, modules can remain independent from the core of the FLINT, significantly increasing the flexibility of the FLINT.
This allows modules to easily use available data, allowing FLINT to run spatially without having to set up individual modules.
Data Types
There are four data types for variables defined in the FLINT. A module may use any combination of these data depending on its requirements
- Static variables: single values,
- Examples are model parameters
- true/false
- value (integer or double precision)
- These variables can be internal or come from databases, vector or raster files
- Time series are variables that change through time
- These are typically integers or doubles, but may be Boolean (for example, forest/non-forest).
- Time step may be different for different variables
* For example, climate may be monthly, forest cover annual or between set dates
- The time series may require extrapolation or interpolation to:
* Project forward
* Fill in gaps in the time series
* Change the length of the time step depending on the module (e.g. a module may need annual data, and therefore sum monthly to annual)
- Time series can be internal or come from databases or stacks of rasters/vectors
- Object: a useful container of data (either static or time series) that relates to a single unit, for example soil type or species
- Objects are built up from various data that the system needs to use and allow for bulk loading of variables in one go
- for example, ask for soil ID, get all soil parameters.
- Table
- A table is effectively a database that can be looked up. For example, in the CBM, get table functions are used for:
- hardwood volume to biomass parameter
- softwood merchantable volume
- hardwood merchantable volume
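To make the four data types concrete, here is one way they could be modelled in Python. It is an illustrative sketch only: the class names, fields and every value below are hypothetical and do not reflect FLINT's real types or CBM's real parameters.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

Scalar = Union[bool, int, float]


@dataclass
class StaticVariable:
    value: Scalar            # a single value, e.g. a model parameter


@dataclass
class TimeSeriesVariable:
    values: List[Scalar]     # one value per step
    step: str                # e.g. "monthly" or "annual"


@dataclass
class ObjectVariable:
    # Bundles static and time-series data that belong to one unit.
    fields: Dict[str, Union[StaticVariable, TimeSeriesVariable]]


@dataclass
class TableVariable:
    rows: List[Dict[str, object]]

    def lookup(self, **criteria):
        # Return every row whose columns match all the given criteria.
        return [
            row for row in self.rows
            if all(row.get(key) == value for key, value in criteria.items())
        ]


# Hypothetical soil object: ask for one soil ID, get all its parameters.
soil = ObjectVariable(fields={
    "clay_fraction": StaticVariable(0.25),
    "is_forest": TimeSeriesVariable([True, True, False], step="annual"),
})

# Hypothetical lookup table with made-up parameter values.
volume_to_biomass = TableVariable(rows=[
    {"species_group": "hardwood", "parameter": 0.6},
    {"species_group": "softwood", "parameter": 0.4},
])
hardwood_rows = volume_to_biomass.lookup(species_group="hardwood")
```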
Data Provider Types
There are 4, as mentioned previously,
- Raster
- Vector
- SQL Database
- NoSQL Database
Data Transforms, covered last, work alongside these providers.
Raster
A format where spatial files contain data divided evenly into pixels
FLINT can query raster files in two ways,
- Specific Coordinates
- Raster File Index
When queried by coordinates, a raster returns one or more attributes relating to the queried latitude and longitude.
Indexing is the second method, where the raster data of a country is broken into smaller tiles, which are broken further into blocks, and then into cells.
For FLINT, raster data is made into 1 degree tiles, 0.1 degree blocks, and pixel resolution cells.
A 1 degree tile has been found to be the most efficient size, holding a large amount of data without overloading the processing capacity.
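The tile, block and cell hierarchy can be illustrated with a small coordinate lookup. This is a sketch, not FLINT's indexing code: the function name, the choice of 100 cells per block side, and counting rows and columns from the south-west corner of each tile are all assumptions made for the example.

```python
import math

BLOCKS_PER_TILE = 10  # 0.1 degree blocks along each side of a 1 degree tile


def locate(lat, lon, cells_per_block):
    """Return (tile, block, cell) index pairs, each as (row, col)."""
    cells_per_tile = BLOCKS_PER_TILE * cells_per_block

    # Tile: the whole-degree square that contains the point.
    tile_row, tile_col = math.floor(lat), math.floor(lon)

    # Position inside the tile, counted in cells from its south-west corner.
    row = min(int((lat - tile_row) * cells_per_tile), cells_per_tile - 1)
    col = min(int((lon - tile_col) * cells_per_tile), cells_per_tile - 1)

    # Split the in-tile position into a block index and a cell-in-block index.
    block = (row // cells_per_block, col // cells_per_block)
    cell = (row % cells_per_block, col % cells_per_block)
    return (tile_row, tile_col), block, cell


# Hypothetical point, with an assumed 100 cells per block side (0.001 degree pixels).
tile, block, cell = locate(-33.8675, 151.2075, cells_per_block=100)
```

Only the tile that contains the point needs to be opened, and only one block within it needs to be read to reach the cell.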
Vector
Vectors contain information relating to points, lines and polygons, rather than pixels.
SQL Database
Attributes are stored as records in rows organized by relations
NoSQL Database
No fixed set of attributes, to cater for an uncertain number and variety of data received.
Data Transform
A module should be given only the minimum data it needs to function.
Hence, information pulls and transforms are not part of the module itself.
Modules have no spatial awareness; all transforms of data happen external to the modules.
Case Study - Data Transforms
A module may need,
- Annual rainfall data to operate
- Where the data is stored in an SQL DB
- The SQL DB is updated on a monthly basis
To fetch this data,
- Module sends a request to the "Local Domain" (LD) with specifics about the data it needs
- LD calls on FLINT
- FLINT calls on the monthly data via the SQL DB
- Data is summed in FLINT through a transform
- Aggregated data sent back to Module
Hence, Modules only ever need to pull data from LD.
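The request chain in this case study can be mocked up with Python's built-in `sqlite3` module. Everything here is a stand-in for illustration: the `rainfall` table, the class names `FlintStandIn`, `LocalDomain` and `RainfallModule`, and the rainfall figures are hypothetical and are not FLINT's API.

```python
import sqlite3


def make_rainfall_db():
    """Build an in-memory SQL DB holding made-up monthly rainfall records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rainfall (year INTEGER, month INTEGER, mm REAL)")
    rows = [(2020, month, 10.0 * month) for month in range(1, 13)]
    conn.executemany("INSERT INTO rainfall VALUES (?, ?, ?)", rows)
    return conn


class FlintStandIn:
    """Hypothetical framework layer: knows the data source and the transform."""

    def __init__(self, conn):
        self.conn = conn

    def annual_rainfall(self, year):
        cursor = self.conn.execute(
            "SELECT mm FROM rainfall WHERE year = ? ORDER BY month", (year,)
        )
        monthly = [row[0] for row in cursor.fetchall()]
        return sum(monthly)  # the transform: monthly records summed to annual


class LocalDomain:
    """Hypothetical LD: the only thing a module talks to."""

    def __init__(self, flint):
        self.flint = flint

    def get(self, name, year):
        if name != "annual_rainfall":
            raise KeyError(name)
        return self.flint.annual_rainfall(year)


class RainfallModule:
    """Hypothetical module: asks for annual data, unaware of SQL or months."""

    def __init__(self, local_domain):
        self.local_domain = local_domain

    def run(self, year):
        return self.local_domain.get("annual_rainfall", year)


module = RainfallModule(LocalDomain(FlintStandIn(make_rainfall_db())))
annual_mm = module.run(2020)
```

The module never sees the database or the monthly records, so the storage could be swapped for another provider without touching the module.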
The catch is that FLINT needs to know where the appropriate data source is in order to fetch it on the Module's behalf.
Configuration in FLINT is straightforward.