Estimating aggregated nutrient fluxes in four Finnish rivers via Gaussian state space models
Reliable estimates of the nutrient fluxes carried by rivers from land-based sources to the sea are needed for efficient abatement of marine eutrophication. Although nutrient concentrations in rivers generally display large temporal variation, sampling and analysis for nutrients, unlike flow measurements, are rarely performed on a daily basis. The infrequent data call for ways to reliably estimate the nutrient concentrations on the missing days. Here, we use Gaussian state space models with daily water flow as a predictor variable to predict missing nutrient concentrations for four agriculturally impacted Finnish rivers. Via simulation of Gaussian state space models, we are able to esti…
Statistical classification and proportion estimation - an application to a macroinvertebrate image database
We apply and compare a random Bayes forest classifier and three traditional classification methods to a dataset of complex benthic macroinvertebrate images of known taxonomical identity. Since in biomonitoring changes in benthic macroinvertebrate taxa proportions correspond to changes in water quality, their correct estimation is pivotal. As classification errors are passed on to the allocated proportions, we explore a correction method known as a confusion matrix correction. Classification methods were compared using the misclassification error and the χ² distance measures of the true proportions to the allocated and to the corrected proportions. Using low misclassification error and small…
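The confusion matrix correction mentioned in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the common formulation in which the allocated proportions equal the transposed matrix of per-class allocation rates times the true proportions, so the true proportions are recovered by solving that linear system. The function name `correct_proportions` and the clipping step are illustrative assumptions.

```python
import numpy as np


def correct_proportions(confusion, allocated):
    """Estimate true class proportions from allocated (predicted) proportions.

    confusion[i, j] is assumed to hold the number of validation specimens of
    true class i that the classifier allocated to class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    allocated = np.asarray(allocated, dtype=float)

    # Row-normalise: rates[i, j] estimates P(allocated to j | true class i).
    rates = confusion / confusion.sum(axis=1, keepdims=True)

    # Under this model, allocated = rates.T @ true, so solve for `true`.
    # np.linalg.solve raises LinAlgError if the rate matrix is singular.
    corrected = np.linalg.solve(rates.T, allocated)

    # Assumption: negative estimates are clipped to zero and the result is
    # renormalised so that it is a valid vector of proportions.
    corrected = np.clip(corrected, 0.0, None)
    return corrected / corrected.sum()
```

Calling `correct_proportions(counts, q)` with a validation confusion matrix `counts` and the allocated proportions `q` of a new sample returns the corrected proportions. The correction is only as good as the estimated allocation rates, which are assumed to carry over from the validation data to the new sample.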
Breaking the curse of dimensionality in quadratic discriminant analysis models with a novel variant of a Bayes classifier enhances automated taxa identification of freshwater macroinvertebrates
Macroinvertebrate samples are commonly used in biomonitoring to study changes in aquatic ecosystems. Traditionally, specimens are identified manually to taxa by human experts, which is time-consuming and cost-intensive. Using the image data of 35 taxa and 64 features, we propose a novel variant of the quadratic discriminant analysis for breaking the curse of dimensionality in quadratic discriminant analysis models. Our variant, called a random Bayes array (RBA), uses bagging and random feature selection similar to random forest. We explore several variations of RBA. We consider three classification (i.e., taxa identification) decisions: majority vote, averaged posterior probabilities, and a novel…
The effect of automated taxa identification errors on biological indices
In benthic macroinvertebrate biomonitoring systems, the target is to determine the status of ecosystems based on several biological indices. To increase cost-efficiency, computer-based taxa identification for image data has recently been developed. Taxa identification errors can, however, have strong effects on the indices and thus on the determination of the ecological status. In order to shift the biomonitoring process towards automated expert systems, we need a clear understanding of the bias caused by automation. In this paper, we examine eleven classification methods in the case of macroinvertebrate image data and show how their classification errors propagate into different biological…
Classification and retrieval on macroinvertebrate image databases
Aquatic ecosystems are continuously threatened by a growing number of human-induced changes. Macroinvertebrate biomonitoring is particularly efficient in pinpointing the cause-effect structure between slow and subtle changes and their detrimental consequences in aquatic ecosystems. The greatest obstacle to implementing efficient biomonitoring is currently the cost-intensive human expert taxonomic identification of samples. While there is evidence that automated recognition techniques can match human taxa identification accuracy at greatly reduced costs, so far the development of automated identification techniques for aquatic organisms has been minimal. In this paper, we focus on advancing …
Benchmark database for fine-grained image classification of benthic macroinvertebrates
Managing the water quality of freshwaters is a crucial task worldwide. One of the most used methods to biomonitor water quality is to sample benthic macroinvertebrate communities, in particular to examine the presence and proportion of certain species. This paper presents a benchmark database for automatic visual classification methods to evaluate their ability for distinguishing visually similar categories of aquatic macroinvertebrate taxa. We make publicly available a new database, containing 64 types of freshwater macroinvertebrates, ranging in number of images per category from 7 to 577. The database is divided into three datasets, varying in number of categories (64, 29, and 9 categori…
Empirical Bayes improves assessments of diversity and similarity when overdispersion prevails in taxonomic counts with no covariates
The assessment of diversity and similarity is relevant in monitoring the status of ecosystems. The respective indicators are based on the taxonomic composition of biological communities of interest, currently estimated through the proportions computed from sampling multivariate counts. In this work we present a novel method to estimate the taxonomic composition able to work even with a single sample and no covariates, when data are affected by overdispersion. The presence of overdispersion in taxonomic counts may be the result of significant environmental factors which are often unobservable but influence communities. Following the empirical Bayes approach, we combine a Bayesian mo…
Automatic image‐based identification and biomass estimation of invertebrates
Understanding how biological communities respond to environmental changes is a key challenge in ecology and ecosystem management. The apparent decline of insect populations necessitates more biomonitoring but the time-consuming sorting and expert-based identification of taxa pose strong limitations on how many insect samples can be processed. In turn, this affects the scale of efforts to map and monitor invertebrate diversity altogether. Given recent advances in computer vision, we propose to enhance the standard human expert-based identification approach involving manual sorting and identification with an automatic image-based technology. We describe a robot-enabled image-based identificat…
Diel feeding periodicity, daily ration and prey selectivity in juvenile brown trout in a subarctic river
Feeding of age-1 brown trout Salmo trutta in a third-order river in northern Finland was usually highest in the twilight hours and lowest around midday. Diel periodicity in food intake was less distinct and rarely significant for age-0 trout. Daily rations declined seasonally, being lowest in October, and highest in June (age-1 trout) or early August (age-0 trout). Prey selection did not differ between day and night, but differences between age classes and sampling dates were distinct. Age-0 trout preferred Ephemerella nymphs in summer and Micrasema larvae later in the season. Age-1 trout fed selectively on caddis larvae on all sample dates. Aerial insects and Baetis nymphs were avoided by …
Evaluating the performance of artificial neural networks for the classification of freshwater benthic macroinvertebrates
Macroinvertebrates form an important functional component of aquatic ecosystems. Their ability to indicate various types of anthropogenic stressors is widely recognized, which has made them an integral component of freshwater biomonitoring. The use of macroinvertebrates in biomonitoring is dependent on manual taxa identification, which is currently a time-consuming and cost-intensive process conducted by highly trained taxonomical experts. Automated taxa identification of macroinvertebrates is a relatively recent research development. Previous studies have displayed great potential for solutions to this demanding data mining application. In this research we have a collection of 1350 …
The value of perfect and imperfect information in lake monitoring and management.
Highlights • Knowledge on the value of monitoring can assist decision-making in lake management. • We calculate value of perfect information theoretically. • We estimate value of imperfect information with Monte Carlo type of approach. • Generally, monitoring is profitable to invest in if VOI exceeds the cost. • Additional monitoring is profitable even if the lake is in good condition a priori. Uncertainty in the information obtained through monitoring complicates decision making about aquatic ecosystems management actions. We suggest the value of information (VOI) to assess the profitability of paying for additional monitoring information, when taking into account the costs and benefits of…
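The value of perfect information mentioned in this abstract can be made concrete with a small sketch. It is not the authors' calculation: it uses the standard decision-theoretic definition for a discrete problem, in which the value is the expected utility when the true state is known before acting minus the best expected utility achievable under the prior alone. The function name, the two-state lake example, and all numbers are hypothetical.

```python
def expected_value_of_perfect_information(prior, utility):
    """Expected value of perfect information for a discrete decision problem.

    prior[s]      : prior probability of state s (should sum to 1).
    utility[a][s] : utility of taking action a when the true state is s.
    """
    n_states = len(prior)

    # Best expected utility when the decision is made under the prior only.
    best_without_info = max(
        sum(p * u for p, u in zip(prior, row)) for row in utility
    )

    # Expected utility when the state is revealed first and the best action
    # is then chosen separately for each state.
    best_with_info = sum(
        prior[s] * max(row[s] for row in utility) for s in range(n_states)
    )

    return best_with_info - best_without_info


# Hypothetical example: the lake is in good status with probability 0.7.
# Action 0 = do nothing, action 1 = restore (costs 20 regardless of state).
prior = [0.7, 0.3]
utility = [
    [0.0, -100.0],  # do nothing: no loss if good, large loss if poor
    [-20.0, -20.0],  # restore: fixed cost in both states
]
evpi = expected_value_of_perfect_information(prior, utility)
```

In this framing, additional monitoring would be worth paying for only if its cost is below `evpi`. The value of imperfect information, which the abstract estimates with a Monte Carlo type of approach, is not covered by this sketch.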
Cost-efficiency assessments of marine monitoring methods lack rigor—a systematic mapping of literature and an end-user view on optimal cost-efficiency analysis
Global deterioration of marine ecosystems, together with increasing pressure to use them, has created a demand for new, more efficient and cost-efficient monitoring tools that enable assessing changes in the status of marine ecosystems. However, demonstrating the cost-efficiency of a monitoring method is not straightforward as there are no generally applicable guidelines. Our study provides a systematic literature mapping of methods and criteria that have been proposed or used since the year 2000 to evaluate the cost-efficiency of marine monitoring methods. We aimed to investigate these methods but discovered that examples of actual cost-efficiency assessments in literature were rare, contr…
Comparing long term sediment records to current biological quality element data – Implications for bioassessment and management of a eutrophic lake
Defining reference conditions for lakes situated in areas of human settlement and agriculture is rarely straightforward, and is especially difficult within easily eroding and nutrient-rich watersheds. We used diatoms, cyanobacterial akinetes, remains of green algae and chironomid head capsules from sediment samples of Lake Kirmanjarvi, Finland, to assess its deviation from the initial ecological status. These site-specific records of change were compared to current type-specific ecological status assessment. All paleolimnological data indicated deviation from natural conditions and mirrored the current, monitoring-based assessment of “moderate” ecological lake status. However, the sediment d…
Science Advances
River ecosystems receive and process vast quantities of terrestrial organic carbon, the fate of which depends strongly on microbial activity. Variation in and controls of processing rates, however, are poorly characterized at the global scale. In response, we used a peer-sourced research network and a highly standardized carbon processing assay to conduct a global-scale field experiment in greater than 1000 river and riparian sites. We found that Earth’s biomes have distinct carbon processing signatures. Slow processing is evident across latitudes, whereas rapid rates are restricted to lower latitudes. Both the mean rate and variability decline with latitude, suggesting temperature constrai…
Human experts vs. machines in taxa recognition
The step of expert taxa recognition currently slows down the response time of many bioassessments. Shifting to quicker and cheaper state-of-the-art machine learning approaches is still met with expert scepticism towards the ability and logic of machines. In our study, we investigate both the differences in accuracy and in the identification logic of taxonomic experts and machines. We propose a systematic approach utilizing deep Convolutional Neural Nets with the transfer learning paradigm and extensively evaluate it over a multi-pose taxonomic dataset with hierarchical labels specifically created for this comparison. We also study the prediction accuracy on different ranks of taxonomic hier…
Testate amoebae community analysis as a tool to assess biological impacts of peatland use
Like most ecosystems, peatlands have been heavily exploited for different human purposes. For example, in Finland the majority is under forestry, agriculture or peat mining use. Peatlands play an important role in carbon storage and the water cycle, and are a unique habitat for rare organisms. Such properties highlight their environmental importance and the need for their restoration. To monitor the success of peatland restoration, sensitive indicators are needed. Here we test whether testate amoebae can be used as a reliable bioindicator for assessing peatland condition. To qualify as reliable indicators, responses in testate amoebae community structure to ecological changes must be stronger than ra…