AUTHOR

Salme Kärkkäinen

Dangerous relationships : biases in freshwater bioassessment based on observed to expected ratios

Copyright by the Ecological Society of America. The ecological assessment of freshwaters is currently based primarily on biological communities and the reference condition approach (RCA). In the RCA, the communities in streams and lakes disturbed by humans are compared with communities in reference conditions with no or minimal anthropogenic influence. The currently favored approach uses selected community metrics for which the expected values (E) for each site are typically estimated from environmental variables using a predictive model based on the reference data. The proportional differences between the observed values (O) and E are then derived, and the decision rules for status ass…

research product

Orientation analysis of stochastic fibre systems with an application to paper research

research product

Statistical classification and proportion estimation - an application to a macroinvertebrate image database

We apply and compare a random Bayes forest classifier and three traditional classification methods to a dataset of complex benthic macroinvertebrate images of known taxonomical identity. Since in biomonitoring changes in benthic macroinvertebrate taxa proportions correspond to changes in water quality, their correct estimation is pivotal. As classification errors are passed on to the allocated proportions, we explore a correction method known as a confusion matrix correction. Classification methods were compared using the misclassification error and the χ2 distance measures of the true proportions to the allocated and to the corrected proportions. Using low misclassification error and small…
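The confusion matrix correction mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which entry (i, j) of a row-normalised confusion matrix is the probability that a specimen of true taxon i is allocated to taxon j, so the allocated proportions are the true proportions pushed through the transposed matrix. The function names, the example numbers, and the particular form of the χ2 distance are assumptions made for illustration, not taken from the article.

```python
def allocated_proportions(true_props, confusion):
    """Proportions a classifier would report, given true proportions.

    confusion[i][j] = P(allocated to class j | true class i); rows sum to 1.
    """
    k = len(true_props)
    return [sum(true_props[i] * confusion[i][j] for i in range(k)) for j in range(k)]


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("confusion matrix is singular; correction undefined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def corrected_proportions(allocated, confusion):
    """Invert the relation allocated = confusion^T * true."""
    k = len(allocated)
    transposed = [[confusion[i][j] for i in range(k)] for j in range(k)]
    return solve_linear(transposed, allocated)


def chi2_distance(true_props, other_props):
    """One common chi-square distance (assumed form): sum (q - p)^2 / p."""
    return sum((q - p) ** 2 / p for p, q in zip(true_props, other_props))


# Hypothetical three-taxon example.
true_props = [0.5, 0.3, 0.2]
confusion = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
allocated = allocated_proportions(true_props, confusion)
corrected = corrected_proportions(allocated, confusion)
```

In this idealised example the confusion matrix is known exactly, so the correction recovers the true proportions. In practice the matrix is itself estimated from labelled data, and the corrected proportions are only approximate and may fall outside the interval [0, 1].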

research product

Breaking the curse of dimensionality in quadratic discriminant analysis models with a novel variant of a Bayes classifier enhances automated taxa identification of freshwater macroinvertebrates

Macroinvertebrate samples are commonly used in biomonitoring to study changes in aquatic ecosystems. Traditionally, specimens are identified to taxa manually by human experts, which is time-consuming and cost-intensive. Using the image data of 35 taxa and 64 features, we propose a novel variant of the quadratic discriminant analysis for breaking the curse of dimensionality in quadratic discriminant analysis models. Our variant, called a random Bayes array (RBA), uses bagging and random feature selection similar to random forest. We explore several variations of RBA. We consider three classification (i.e. taxa identification) decisions: majority vote, averaged posterior probabilities, and a novel…
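The first two decision rules named in this abstract, majority vote and averaged posterior probabilities, can give different answers for the same ensemble. The sketch below is a generic illustration, not the RBA implementation: it assumes each ensemble member returns a list of posterior probabilities over the classes, and the function names and numbers are hypothetical. The third, novel rule is not shown because the abstract is truncated before describing it.

```python
from collections import Counter


def majority_vote(posteriors):
    """Each ensemble member votes for its most probable class.

    Ties are broken in favour of the class that was voted for first.
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
    return Counter(votes).most_common(1)[0][0]


def averaged_posterior(posteriors):
    """Average the members' posteriors, then pick the most probable class."""
    n_classes = len(posteriors[0])
    mean = [sum(p[c] for p in posteriors) / len(posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical ensemble of three members and three classes.
posteriors = [
    [0.40, 0.35, 0.25],  # weak preference for class 0
    [0.40, 0.35, 0.25],  # weak preference for class 0
    [0.05, 0.90, 0.05],  # strong preference for class 1
]
vote_decision = majority_vote(posteriors)
mean_decision = averaged_posterior(posteriors)
```

Here two members weakly prefer class 0 while one strongly prefers class 1, so the vote selects class 0 and the averaged posterior selects class 1. Averaging retains how confident each member is, whereas voting discards it.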

research product

Attentional modulation of interhemispheric (a)symmetry in children with developmental language disorder

Funding Information: The MEG recordings were conducted at Aalto University with the support of Grant #315553 from the Academy of Finland. This research was also supported by a personal grant to DH from the Jenni and Antti Wihuri Foundation and to RS from the Sigrid Jusélius Foundation. Publisher Copyright: © 2022, The Author(s). The nature of auditory processing problems in children with developmental language disorder (DLD) is still poorly understood. Much research has been devoted to determining the extent to which DLD is associated with general auditory versus language-specific dysfunction. However, less emphasis has been given to the role of different task conditions in these dysfunctio…

research product

Inclusion ratio based estimator for the mean length of the Boolean line segment model with an application to nanocrystalline cellulose

A novel estimator for estimating the mean length of fibres is proposed for censored data observed in square shaped windows. Instead of observing the fibre lengths, we observe the ratio between the intensity estimates of minus-sampling and plus-sampling. It is well-known that both intensity estimators are biased. In the current work, we derive the ratio of these biases as a function of the mean length assuming a Boolean line segment model with exponentially distributed lengths and uniformly distributed directions. Having the observed ratio of the intensity estimators, the inverse of the derived function is suggested as a new estimator for the mean length. For this estimator, an approximation…

research product

A Bayesian stable isotope mixing model for coping with multiple isotopes, multiple trophic steps and small sample sizes

We introduce a Bayesian stable isotope mixing model for estimating the relative contributions of different dietary components to the tissues of consumers within food webs. The model is implemented with the probabilistic programming language Stan. The model incorporates isotopes of multiple elements (e.g. C, N, H) for two trophic levels, when the structure of the food web is known. In addition, the model allows inclusion of latent trophic levels (i.e. for which no empirical data are available) intermediate between sources and measured consumers. Running the model in simulations driven by a real dataset from Finnish lakes, we tested the sensitivity of the posterior distributions by altering c…

research product

Classification and retrieval on macroinvertebrate image databases

Aquatic ecosystems are continuously threatened by a growing number of human-induced changes. Macroinvertebrate biomonitoring is particularly efficient in pinpointing the cause-effect structure between slow and subtle changes and their detrimental consequences in aquatic ecosystems. The greatest obstacle to implementing efficient biomonitoring is currently the cost-intensive human expert taxonomic identification of samples. While there is evidence that automated recognition techniques can match human taxa identification accuracy at greatly reduced costs, so far the development of automated identification techniques for aquatic organisms has been minimal. In this paper, we focus on advancing …

research product

Benchmark database for fine-grained image classification of benthic macroinvertebrates

Managing the water quality of freshwaters is a crucial task worldwide. One of the most widely used methods to biomonitor water quality is to sample benthic macroinvertebrate communities, in particular to examine the presence and proportion of certain species. This paper presents a benchmark database for evaluating the ability of automatic visual classification methods to distinguish visually similar categories of aquatic macroinvertebrate taxa. We make publicly available a new database, containing 64 types of freshwater macroinvertebrates, ranging in number of images per category from 7 to 577. The database is divided into three datasets, varying in number of categories (64, 29, and 9 categori…

research product

Estimation of fibre orientation from digital images

In this paper, estimation of fibre orientation is studied for fibre systems observable as a blurred greyscale image. The estimation method is based on scaled variograms observed along a set of sampling lines in different directions. The parameters of the orientation distribution are obtained numerically. Simulated data are used to study the statistical properties of the method.

research product

Determination of Fibre Orientation Distribution from Images of Fibre Networks

We recall two categories of algorithms for estimating fibre orientation distribution from an image of a spatial fibre system. In the first algorithm, the estimate is a magnitude-weighted distribution from angles perpendicular to the directions of the gradients in the image. The second algorithm is based on the scaled variogram of grey values scanned along a sampling line and its relation to the fibre orientation distribution. Using lines in several directions and assuming a parametric model for the orientation distribution, the orientation parameters are estimated numerically from a least-squares type procedure. Two versions of variogram-based methods are used in this work. We compare the p…

research product

Empirical Bayes improves assessments of diversity and similarity when overdispersion prevails in taxonomic counts with no covariates

The assessment of diversity and similarity is relevant in monitoring the status of ecosystems. The respective indicators are based on the taxonomic composition of biological communities of interest, currently estimated through the proportions computed from sampling multivariate counts. In this work we present a novel method to estimate the taxonomic composition able to work even with a single sample and no covariates, when data are affected by overdispersion. The presence of overdispersion in taxonomic counts may be the result of significant environmental factors which are often unobservable but influence communities. Following the empirical Bayes approach, we combine a Bayesian mo…

research product

Left hemisphere enhancement of auditory activation in language impaired children

openaire: EC/H2020/641652/EU//ChildBrain. Specific language impairment (SLI) is a developmental disorder linked to deficient auditory processing. In this magnetoencephalography (MEG) study we investigated a specific prolonged auditory response (N250m) that has been reported predominantly in children and is associated with level of language skills. We recorded auditory responses evoked by sine-wave tones presented alternately to the right and left ear of 9–10-year-old children with SLI (n = 10) and children with typical language development (n = 10). Source analysis was used to isolate the N250m response in the left and right hemisphere. In children with language impairment left-hemisphere …

research product

EMG, heart rate, and accelerometer as estimators of energy expenditure in locomotion.

Purpose: Precise measures of energy expenditure (EE) during everyday activities are needed. This study assessed the validity of novel shorts measuring EMG and compared this method with HR and accelerometry (ACC) when estimating EE. Methods: Fifty-four volunteers (39.4 ± 13.9 yr) performed a maximal treadmill test (3-min loads) including walking with different speeds uphill, downhill, and on level ground and one running load. The data were categorized into all, low, and level loads. EE was measured by indirect calorimetry, whereas HR, ACC, and EMG were measured continuously. EMG from quadriceps (Q) and hamstrings (H) was measured using shorts with textile electrodes. Validity of the met…

research product

On the orientational analysis of planar fibre systems from digital images

The orientational characteristics of fibres in digital images are studied. The fibres are modelled by a planar Boolean model whose typical grain is a thick (coloured) fibre. The aim is to make stereological inference on the rose of directions of the unobservable central fibres from observations made on a digital image of the thick fibres. For central fibres, the relation between the rose of directions and the point intensity, observed on a sampling line, is known. We derive, under regularity conditions, the relation between the unobservable point intensity and the scaled variogram observed on the line in a binary and a greyscale image. Using such a relation, it is possible to draw inference…

research product

Evaluating the performance of artificial neural networks for the classification of freshwater benthic macroinvertebrates

Macroinvertebrates form an important functional component of aquatic ecosystems. Their ability to indicate various types of anthropogenic stressors is widely recognized, which has made them an integral component of freshwater biomonitoring. The use of macroinvertebrates in biomonitoring is dependent on manual taxa identification, which is currently a time-consuming and cost-intensive process conducted by highly trained taxonomical experts. Automated taxa identification of macroinvertebrates is a relatively recent research development. Previous studies have displayed great potential for solutions to this demanding data mining application. In this research we have a collection of 1350 …

research product

A stochastic shape and orientation model for fibres with an application to carbon nanotubes

Methods are introduced for analysing the shape and orientation of planar fibres from greyscale images of fibrous systems. The sequence of image processing techniques needed for segmentation of fibres is described. The identified fibres were interpreted as deformed line segments for which two shape and two orientation parameters are estimated by the maximum likelihood method. The methods introduced are shown to perform quite well for simulated systems of deformed line segments with known properties. They were applied to TEM images of carbon nanotubes embedded in polycarbonate.

research product

A nonlinear mixed model approach to predict energy expenditure from heart rate.

Objective. Heart rate (HR) monitoring provides a convenient and inexpensive way to predict energy expenditure (EE) during physical activity. However, there is a lot of variation among individuals in the EE-HR relationship, which should be taken into account in predictions. The objective is to develop a model that allows the prediction of EE based on HR as accurately as possible and allows an improvement of the prediction using calibration measurements from the target individual. Approach. We propose a nonlinear (logistic) mixed model for EE and HR measurements and an approach to calibrate the model for a new person who does not belong to the dataset used to estimate the model. The …

research product

The value of perfect and imperfect information in lake monitoring and management.

Highlights
• Knowledge of the value of monitoring can assist decision-making in lake management.
• We calculate the value of perfect information theoretically.
• We estimate the value of imperfect information with a Monte Carlo type of approach.
• Generally, monitoring is profitable to invest in if VOI exceeds the cost.
• Additional monitoring is profitable even if the lake is in good condition a priori.

Uncertainty in the information obtained through monitoring complicates decision making about aquatic ecosystem management actions. We suggest using the value of information (VOI) to assess the profitability of paying for additional monitoring information, when taking into account the costs and benefits of…
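The distinction between perfect and imperfect information can be illustrated with a textbook two-state decision problem. This is a minimal sketch and not the article's model: the lake states, management actions, utilities, prior, and monitoring accuracy are all hypothetical, and the Monte Carlo routine is a generic simulation rather than the authors' procedure.

```python
import random

# Hypothetical numbers for illustration only (not from the article).
PRIOR = {"good": 0.7, "poor": 0.3}          # prior belief about lake status
ACTIONS = ("none", "restore")               # management options
UTILITY = {                                 # net benefit of action in state
    ("none", "good"): 100.0, ("none", "poor"): 20.0,
    ("restore", "good"): 80.0, ("restore", "poor"): 70.0,
}


def expected_utility(action, belief):
    return sum(p * UTILITY[(action, state)] for state, p in belief.items())


def best_action(belief):
    return max(ACTIONS, key=lambda a: expected_utility(a, belief))


def value_without_information():
    """Expected utility of the best action chosen under the prior alone."""
    return expected_utility(best_action(PRIOR), PRIOR)


def value_of_perfect_information():
    """Gain from learning the true state before acting."""
    with_info = sum(
        p * max(UTILITY[(a, state)] for a in ACTIONS) for state, p in PRIOR.items()
    )
    return with_info - value_without_information()


def posterior(signal, accuracy):
    """Bayes update after a monitoring signal that is correct with prob. accuracy."""
    joint = {
        s: p * (accuracy if s == signal else 1.0 - accuracy) for s, p in PRIOR.items()
    }
    total = sum(joint.values())
    return {s: j / total for s, j in joint.items()}


def value_of_imperfect_information(accuracy, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the gain from a noisy monitoring signal."""
    rng = random.Random(seed)
    states = list(PRIOR)
    weights = [PRIOR[s] for s in states]
    # Action taken after each possible signal, based on the posterior belief.
    policy = {sig: best_action(posterior(sig, accuracy)) for sig in states}
    total = 0.0
    for _ in range(n_draws):
        state = rng.choices(states, weights=weights)[0]
        if rng.random() < accuracy:
            signal = state
        else:  # with two states, a wrong signal points to the other state
            signal = next(s for s in states if s != state)
        total += UTILITY[(policy[signal], state)]
    return total / n_draws - value_without_information()
```

With these made-up numbers the best prior action has expected utility 77 and the value of perfect information is 14. A monitoring signal that is correct 80% of the time is worth less than that (8.2 by exact calculation, which the simulation approximates). Monitoring would be worth paying for only if its cost is below the corresponding value.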

research product

Orientational analysis of planar fibre systems observed as a Poisson shot-noise process

We consider two-dimensional fibrous materials observed as a digital greyscale image. The problem addressed is to estimate the orientation distribution of unobservable thin fibres from a greyscale image modelled by a planar Poisson shot-noise process. The classical stereological approach is not straightforward, because the point intensities of thin fibres along sampling lines may not be observable. For such cases, Kärkkäinen et al. (2001) suggested the use of scaled variograms determined from grey values along sampling lines in several directions. Their method is based on the assumption that the proportion between the scaled variograms and point intensities in all directions of sampl…

research product

Estimating Mean Lifetime from Partially Observed Events in Nuclear Physics

The mean lifetime is an important characteristic of particles to be identified in nuclear physics. State-of-the-art particle detectors can identify the arrivals of single radioactive nuclei as well as their subsequent radioactive decays (departures). Challenges arise when the arrivals and departures are unmatched and the departures are only partially observed. An inefficient solution is to run experiments where the arrival rate is set very low to allow for the matching of arrivals and departures. We propose an estimation method that works for a wide range of arrival rates. The method combines an initial estimator and a numerical bias correction technique. Simulations and examples b…

research product

Human experts vs. machines in taxa recognition

The step of expert taxa recognition currently slows down the response time of many bioassessments. Shifting to quicker and cheaper state-of-the-art machine learning approaches is still met with expert scepticism towards the ability and logic of machines. In our study, we investigate both the differences in accuracy and in the identification logic of taxonomic experts and machines. We propose a systematic approach utilizing deep Convolutional Neural Nets with the transfer learning paradigm and extensively evaluate it over a multi-pose taxonomic dataset with hierarchical labels specifically created for this comparison. We also study the prediction accuracy on different ranks of taxonomic hier…

research product

Research data of article: "Dangerous relationships: Biases in freshwater bioassessment based on observed to expected ratios"

research product