Background
In February 2024, Subsurface Insights, in collaboration with Dr. John Bargar at PNNL, received a DOE SBIR award (DE-SC0024850) for a proposal responding to topic 17a (Complex Data: Advanced Data Analytic Technologies For Systems Biology And Bioenergy) of the FY2024 Phase I release I SBIR call.
Our proposal addresses needs defined by BER’s Biological Systems Science Division (BSSD) related to creating sustainable production systems for biofuels and bioproducts. Biofuels are a critical part of the US energy strategy – see e.g. the recently released 2023 Billion Ton Report. However, there are numerous challenges associated with going from feedstocks to biofuels and bioproducts at scale and in a cost-efficient manner. These challenges (which within DOE are largely addressed by four Bioenergy Research Centers) can be roughly grouped into the four research thrusts shown below.
Four different research thrusts associated with the research done by DOE scientists. Figure source: Bioenergy Research Centers 2022 Program Update (page 3).
Specific need addressed by our proposal
The specific need addressed by our proposal relates to the first research thrust, Sustainability, and specifically to the need for optimal feedstock development. Selecting and developing site-specific optimal feedstocks requires knowledge and insights about the complex multivariate interactions between crops and their environment, the impacts of crop choice and management systems, and key plant-microbe-environment interactions.
The increasing availability of large bioenergy crop, soil and environmental data sets promises major new opportunities to obtain this knowledge by fusing and analyzing this data to discover constitutive relationships that link key parameters such as genotype and soil microbiome to crop yield and water stress resiliency.
As is shown in the figure below, this data is multimodal and multiscale. It includes high-throughput plant phenotype data, multi-omic data from plants and soils, and imagery over scales ranging from molecular (e.g., protein structure models) to plot scale (e.g., Phenocam and drone data) to tens of meters (e.g., green normalized difference vegetation index (GNDVI) from satellite data). Critical ancillary data include soil types, topography, and environmental conditions (precipitation, solar radiation, and air and soil temperature). Our SBIR project is developing tools for fusing and analyzing this data.
Some of the different data sets associated with bioenergy crops. These datasets are multimodal and multiscale and can be categorized and grouped in different ways. One possible grouping and some of the sources are shown. Figure inspired by and modified from Venturas, Sperry et al. (2018).
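As a concrete illustration of one of the satellite-derived quantities mentioned above, GNDVI is commonly computed per pixel from the near-infrared (NIR) and green reflectance bands as (NIR - Green) / (NIR + Green). The following is a minimal sketch of that calculation, not code from our project; the function name and the reflectance values are hypothetical and chosen only for illustration.

```python
def gndvi(nir, green):
    """Green NDVI for one pixel: (NIR - Green) / (NIR + Green)."""
    total = nir + green
    if total == 0:
        # The index is undefined when both bands are zero (e.g. masked pixels).
        return float("nan")
    return (nir - green) / total


# Hypothetical reflectance values for three pixels (illustrative only).
nir_band = [0.50, 0.40, 0.0]
green_band = [0.10, 0.20, 0.0]

# One GNDVI value per pixel; the last pixel yields NaN because both bands are zero.
gndvi_values = [gndvi(n, g) for n, g in zip(nir_band, green_band)]
```

In practice the same arithmetic would be applied to whole raster bands rather than short lists, and values near 1 generally indicate denser green vegetation.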
SBIR approach
Approach summary: we will fuse multimodal biofuel crop data into an ODMX Database, implement ML-guided analysis capabilities and demonstrate the application of the analysis tool on different field data sets.
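To give a sense of what fusing data across scales can involve, the sketch below joins plot-scale yield records with a site-scale summary of daily precipitation. It is a simplified illustration under stated assumptions: the field names, the sample values, and the `fuse` helper are all hypothetical, and the code does not represent the ODMX schema or API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical plot-scale records (one row per plot); values are illustrative.
plots = [
    {"plot_id": "P1", "site": "A", "genotype": "G1", "yield_t_ha": 8.2},
    {"plot_id": "P2", "site": "A", "genotype": "G2", "yield_t_ha": 6.9},
    {"plot_id": "P3", "site": "B", "genotype": "G1", "yield_t_ha": 5.4},
]

# Hypothetical site-scale daily weather records (coarser spatial scale).
weather = [
    {"site": "A", "date": "2023-06-01", "precip_mm": 4.0},
    {"site": "A", "date": "2023-06-02", "precip_mm": 0.0},
    {"site": "B", "date": "2023-06-01", "precip_mm": 1.0},
    {"site": "B", "date": "2023-06-02", "precip_mm": 2.0},
]


def fuse(plots, weather):
    """Attach each site's mean daily precipitation to every plot at that site."""
    precip_by_site = defaultdict(list)
    for row in weather:
        precip_by_site[row["site"]].append(row["precip_mm"])
    site_means = {site: mean(values) for site, values in precip_by_site.items()}

    fused = []
    for plot in plots:
        # Plots at sites without weather data are kept, with None as the summary.
        fused.append({**plot, "mean_precip_mm": site_means.get(plot["site"])})
    return fused


fused = fuse(plots, weather)
```

Each fused row then carries genotype, yield, and an environmental covariate together, which is the kind of table an ML-guided analysis could take as input.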