0000000001169959
AUTHOR
Juha Karvanen
Bayesian subcohort selection for longitudinal covariate measurements in follow-up studies
We consider planning longitudinal covariate measurements in follow-up studies where covariates are time-varying. We assume that the entire cohort cannot be selected for longitudinal measurements due to financial limitations and study how a subset of the cohort should be selected optimally in order to obtain precise estimates of covariate effects in a survival model. In our approach, the study will be designed sequentially utilizing the data collected in previous measurements of the individuals as prior information. We propose using a Bayesian optimality criterion in the subcohort selections, which is compared with simple random sampling using simulated and real follow-up data. This study ex…
Study design in causal models
The causal assumptions, the study design and the data are the elements required for scientific inference in empirical research. The research is adequately communicated only if all of these elements and their relations are described precisely. Causal models with design describe the study design and the missing data mechanism together with the causal structure and allow the direct application of causal calculus in the estimation of the causal effects. The flow of the study is visualized by ordering the nodes of the causal diagram in two dimensions by their causal order and the time of the observation. Conclusions whether a causal or observational relationship can be estimated from the collect…
Body weight and premature retirement: population-based evidence from Finland.
Abstract Background Health status is a principal determinant of labour market participation. In this study, we examined whether excess weight is associated with withdrawal from the labour market owing to premature retirement. Methods The analyses were based on nationally representative data from Finland over the period 2001–15 (N ∼ 2500). The longitudinal data included objective measures of body weight (i.e. body mass index and waist circumference) linked to register-based information on actual retirement age. The association between the body weight measures and premature retirement was modelled using cubic b-splines via logistic regression. The models accounted for other possible risk fact…
Estimating the causal effect of timing on the reach of social media posts
Abstract Modern companies regularly use social media to communicate with their customers. In addition to the content, the reach of a social media post may depend on the season, the day of the week, and the time of the day. We consider optimizing the timing of Facebook posts by a large Finnish consumers’ cooperative using historical data on previous posts and their reach. The content and the timing of the posts reflect the marketing strategy of the cooperative. These choices affect the reach of a post via a dynamic process where the reactions of users make the post more visible to others. We describe the causal relations of the social media publishing in the form of a directed acyclic graph, …
Comment on ‘Generating survival times to simulate Cox proportional hazards models with time-varying covariates’
Optimal selection of individuals for repeated covariate measurements in follow-up studies
Repeated covariate measurements bring important information on the time-varying risk factors in long epidemiological follow-up studies. However, due to budget limitations, it may be possible to carry out the repeated measurements only for a subset of the cohort. We study cost-efficient alternatives to simple random sampling in the selection of the individuals to be remeasured. The proposed selection criteria are based on forms of D-optimality. The selection methods are compared in simulation studies and illustrated with data from the East–West study carried out in Finland from 1959 to 1999. The results indicate that cost savings can be achieved if the selection is focuse…
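To illustrate what a D-optimality-based selection can look like, the following is a minimal sketch and not the criterion used in the paper. It assumes a plain linear model, a hypothetical covariate matrix with an intercept and one baseline covariate, a small ridge term to keep the information matrix nonsingular, and a greedy search; all function names are illustrative.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def information_matrix(covariates, selected, ridge):
    """ridge * I plus the sum of x x^T over the selected individuals."""
    p = len(covariates[0])
    m = [[ridge if i == j else 0.0 for j in range(p)] for i in range(p)]
    for idx in selected:
        x = covariates[idx]
        for i in range(p):
            for j in range(p):
                m[i][j] += x[i] * x[j]
    return m


def greedy_d_optimal(covariates, k, ridge=1e-6):
    """Greedily add the individual that most increases the determinant."""
    selected = []
    remaining = set(range(len(covariates)))
    for _ in range(k):
        best = max(
            sorted(remaining),
            key=lambda i: det(information_matrix(covariates, selected + [i], ridge)),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical cohort: intercept plus one centred baseline covariate.
covariates = [[1.0, 0.0], [1.0, 0.1], [1.0, -0.1], [1.0, 2.0], [1.0, -2.0]]
selected = greedy_d_optimal(covariates, k=2)
```

In this toy example the greedy rule selects the two individuals with the most extreme covariate values (indices 3 and 4), which is the usual behaviour of D-optimal designs for a straight-line model.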
Prioritizing covariates in the planning of future studies in the meta-analytic framework
Science can be seen as a sequential process where each new study augments evidence to the existing knowledge. To have the best prospects to make an impact in this process, a new study should be designed optimally taking into account the previous studies and other prior information. We propose a formal approach for the covariate prioritization, i.e., the decision about the covariates to be measured in a new study. The decision criteria can be based on conditional power, change of the p-value, change in lower confidence limit, Kullback-Leibler divergence, Bayes factors, Bayesian false discovery rate or difference between prior and posterior expectation. The criteria can also be used for decis…
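Etäteknologian vaikuttavuus liikunnallisessa kuntoutuksessa : järjestelmällinen kirjallisuuskatsaus ja meta-analyysi [Effectiveness of distance technology in physical rehabilitation: a systematic literature review and meta-analysis]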
Effectiveness of technology-based distance interventions promoting physical activity : Systematic review, meta-analysis and meta-regression
Objective: To determine the effectiveness of technology-based distance interventions for promoting physical activity, using systematic review and meta-analysis. Methods: A literature search of studies published between 2000 and 2015 was conducted in the following databases: CENTRAL, EMBASE, Ovid MEDLINE, CINAHL, PsycINFO, OTseeker, WOS and PEDro. Studies were selected according to the PICOS framework, as follows: P (population): adults; I (intervention): technology-based distance intervention for promoting physical activity; C (comparison): similar distance intervention without technology; O (outcomes): physical activity; S (study design): randomized controlled trial. Physical activity outcomes…
Päätäntätiedettä Suomessa 1968 ja 2018 [Decision science in Finland, 1968 and 2018]
Decision analytics is nowadays applied in many different fields. In Finland, Leo Törnqvist and Leif Nordberg wrote about decision science as early as 1968. [non-peer-reviewed]
Follow-Up Data Improve the Estimation of the Prevalence of Heavy Alcohol Consumption.
Aims. We aim to adjust for potential non-participation bias in the prevalence of heavy alcohol consumption. Methods. Population survey data from Finnish health examination surveys conducted in 1987–2007 were linked to the administrative registers for mortality and morbidity follow-up until the end of 2014. Utilising these data, available for both participants and non-participants, we model the association between heavy alcohol consumption and alcohol-related disease diagnoses. Results. Our results show that the estimated prevalence of heavy alcohol consumption is on average 1.5 times higher for men and 1.8 times higher for women than what was obtained from participants only (complete case an…
Surrogate outcomes and transportability
Identification of causal effects is one of the most fundamental tasks of causal inference. We consider an identifiability problem where some experimental and observational data are available but neither data alone is sufficient for the identification of the causal effect of interest. Instead of the outcome of interest, surrogate outcomes are measured in the experiments. This problem is a generalization of identifiability using surrogate experiments and we label it as surrogate outcome identifiability. We show that the concept of transportability provides a sufficient criterion for determining surrogate outcome identifiability for a large class of queries.
Harmonising and linking biomedical and clinical data across disparate data archives to enable integrative cross-biobank research
A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate glo…
Lifetime cumulative risk factors predict cardiovascular disease mortality in a 50-year follow-up study in Finland.
Summary. Background. Systolic blood pressure, total cholesterol and smoking are known predictors of cardiovascular disease (CVD) mortality. Less is known about the effect of lifetime accumulation and changes of risk factors over time as predictors of CVD mortality, especially in very long follow-up studies. Methods. Data from the Finnish cohorts of the Seven Countries Study were used. The baseline examination was in 1959 and seven re-examinations were carried out approximately in five-year intervals. Cohorts were followed up for mortality until the end of 2011. Time-dependent Cox models with regular time-updated risk factors, time-dependent averages of risk factors and latest changes in ris…
Effectiveness of technology-based distance physical rehabilitation interventions on physical activity and walking in multiple sclerosis: a systematic review and meta-analysis of randomized controlled trials
Objective: To determine the effectiveness of technology-based distance physical rehabilitation interventions in multiple sclerosis (MS) on physical activity and walking. Data sources: A systematic literature search was conducted in seven databases from January 2000 to September 2016. Randomized controlled trials of technology-based distance physical rehabilitation interventions on physical activity and walking outcome measures were included. Methods: Methodological quality of the studies was determined and a meta-analysis was performed. In addition, a subanalysis of technologies and an additional analysis comparing to no treatment were conducted. Results: The meta-analysis consisted of 11 s…
Bayesian models for data missing not at random in health examination surveys
In epidemiological surveys, data missing not at random (MNAR) due to survey nonresponse may potentially lead to a bias in the risk factor estimates. We propose an approach based on Bayesian data augmentation and survival modelling to reduce the nonresponse bias. The approach requires additional information based on follow-up data. We present a case study of smoking prevalence using FINRISK data collected between 1972 and 2007 with a follow-up to the end of 2012 and compare it to other commonly applied missing at random (MAR) imputation approaches. A simulation experiment is carried out to study the validity of the approaches. Our approach appears to reduce the nonresponse bias substantially…
Sima – an Open-source Simulation Framework for Realistic Large-scale Individual-level Data Generation
We propose a framework for realistic data generation and the simulation of complex systems and demonstrate its capabilities in a health domain example. The main use cases of the framework are predicting the development of variables of interest, evaluating the impact of interventions and policy decisions, and supporting statistical method development. We present the fundamentals of the framework by using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing, and fast random number gener…
Correction: Correcting for non-ignorable missingness in smoking trends
Physical activity, aerobic fitness, and brain white matter : Their role for executive functions in adolescence
Highlights
• Aerobic fitness level, but not physical activity, is related to white matter properties in the brain.
• The relation between physical activity and working memory is moderated by fractional anisotropy (FA) of the corpus callosum.
• The FA of the corpus callosum and superior corona radiata moderates the relation between aerobic fitness and working memory.
Bayesian subcohort selection for longitudinal covariate measurements in follow‐up studies
We propose an approach for the planning of longitudinal covariate measurements in follow-up studies where covariates are time-varying. We assume that the entire cohort cannot be selected for longitudinal measurements due to financial limitations, and study how a subset of the cohort should be selected optimally, in order to obtain precise estimates of covariate effects in a survival model. In our approach, the study will be designed sequentially utilizing the data collected in previous measurements of the individuals as prior information. We propose using a Bayesian optimality criterion in the subcohort selections, which is compared with simple random sampling using simulated and real follo…
Optimal design of observational studies: overview and synthesis
We review typical design problems encountered in the planning of observational studies and propose a unifying framework that allows us to use the same concepts and notation for different problems. In the framework, the design is defined as a probability measure in the space of observational processes that determine whether the value of a variable is observed for a specific unit at the given time. The optimal design is then defined, according to Bayesian decision theory, to be the one that maximizes the expected utility related to the design. We present examples on the use of the framework and discuss methods for deriving optimal or approximately optimal designs.
Predicting the age at natural menopause in middle-aged women
Objective To predict the age at natural menopause (ANM). Methods Cox models with time-dependent covariates were utilized for ANM prediction using longitudinal data from 47- to 55-year-old women (n = 279) participating in the Estrogenic Regulation of Muscle Apoptosis study. The ANM was assessed retrospectively for 105 women using bleeding diaries. The predictors were chosen from the set of 32 covariates by using lasso regression (model 1). Another easy-to-access model (model 2) was created by using a subset of 16 self-reported covariates. The predictive performance was quantified with c-indices and by studying the means and standard deviations of absolute errors (MAE ± SD) between the pre…
How many longitudinal covariate measurements are needed for risk prediction?
Abstract Objective In epidemiologic follow-up studies, many key covariates, such as smoking, use of medication, blood pressure, and cholesterol, are time varying. Because of practical and financial limitations, time-varying covariates cannot be measured continuously, but only at certain prespecified time points. We study how the number of these longitudinal measurements can be chosen cost-efficiently by evaluating the usefulness of the measurements for risk prediction. Study Design and Setting The usefulness is addressed by measuring the improvement in model discrimination between models using different amounts of longitudinal information. We use simulated follow-up data and the data from t…
Sublethal Pyrethroid Insecticide Exposure Carries Positive Fitness Effects Over Generations in a Pest Insect
Abstract Stress tolerance and adaptation to stress are known to facilitate species invasions. Many invasive species are also pests and insecticides are used to control them, which could shape their overall tolerance to stress. It is well-known that heavy insecticide usage leads to selection of resistant genotypes but less is known about potential effects of mild sublethal insecticide usage. We studied whether stressful, sublethal pyrethroid insecticide exposure has within-generational and/or maternal transgenerational effects on fitness-related traits in the Colorado potato beetle (Leptinotarsa decemlineata) and whether maternal insecticide exposure affects insecticide tolerance of offspring…
Effectiveness of Exergame Intervention on Walking in Older Adults: A Systematic Review and Meta-Analysis of Randomized Controlled Trials
Abstract Objective The objective of this review was to systematically evaluate the effectiveness of exergaming on walking in older adults. In addition, the aim was to investigate the relationship between the exergaming effect and age, baseline walking performance, exercise traits, technology used, and the risk of bias. Methods A literature search was carried out in the databases MEDLINE, CINAHL, CENTRAL, EMBASE, WoS, PsycInfo, and PEDro up to January 10, 2020. Studies with a randomized controlled trial design, people ≥60 years of age without neurological disorders, comparison group with other exercise or no exercise, and walking-related outcomes were included. Cochrane RoB2, meta-analysis, …
Body weight and premature retirement : population-based evidence from Finland
Background Health status is a principal determinant of labour market participation. In this study, we examined whether excess weight is associated with withdrawal from the labour market owing to premature retirement. Methods The analyses were based on nationally representative data from Finland over the period 2001–15 (N ∼ 2500). The longitudinal data included objective measures of body weight (i.e. body mass index and waist circumference) linked to register-based information on actual retirement age. The association between the body weight measures and premature retirement was modelled using cubic b-splines via logistic regression. The models accounted for other possible risk factors and p…
Do-search -- a tool for causal inference and study design with multiple data sources
Epidemiologic evidence is based on multiple data sources including clinical trials, cohort studies, surveys, registries, and expert opinions. Merging information from different sources opens up new possibilities for the estimation of causal effects. We show how causal effects can be identified and estimated by combining experiments and observations in real and realistic scenarios. As a new tool, we present do-search, a recently developed algorithmic approach that can determine the identifiability of a causal effect. The approach is based on do-calculus, and it can utilize data with nontrivial missing data and selection bias mechanisms. When the effect is identifiable, do-search outputs an i…
Underweight and obesity are related to higher mortality in patients undergoing coronary angiography: The KARDIO invasive cardiology register study
Background: In patients with some cardiovascular disease conditions, slightly elevated body mass index (BMI) is associated with a lower mortality risk (termed “obesity paradox”). It is uncertain, however, if this obesity paradox exists in patients who have had invasive cardiology procedures. We evaluated the association between BMI and mortality in patients who underwent coronary angiography. Methods: We utilised the KARDIO registry, which comprised data on demographics, prevalent diseases, risk factors, coronary angiographies, and interventions on 42,636 patients. BMI was categorised based on WHO cut-offs or transformed using P-splines. Hazard ratios (HRs) with 95% confidence intervals (CI…
Physical activity is positively related to local functional connectivity in adolescents’ brains
Abstract Adolescents have experienced decreased aerobic fitness levels and insufficient physical activity levels over the past decades. While both physical activity and aerobic fitness are related to physical and mental health, little is known concerning how they manifest in the brain during this stage of development, characterized by significant physical and psychosocial changes. Previous investigations have demonstrated associations of physical activity and aerobic fitness with the brain’s functional connectivity in both children and adults. However, it is difficult to generalize these results to adolescents because the development of functional connectivity has unique features during adol…
Adjusting for selective non-participation with re-contact data in the FINRISK 2012 survey
Aims: A common objective of epidemiological surveys is to provide population-level estimates of health indicators. Survey results tend to be biased under selective non-participation. One approach to bias reduction is to collect information about non-participants by contacting them again and asking them to fill in a questionnaire. This information is called re-contact data, and it allows the estimates to be adjusted for non-participation. Methods: We analyse data from the FINRISK 2012 survey, where re-contact data were collected. We assume that the respondents of the re-contact survey are similar to the remaining non-participants with respect to health, given their available background informa…
Puuttuva tieto ja vilppi [Missing data and fraud]
May a researcher add or remove observations from the data at the analysis stage? Every ethically aware researcher's first reaction should be to exclaim “Of course not!”. On closer consideration, the answer is no longer clear-cut. Besides deliberate restriction of the data, observations can be dropped almost unnoticed when statistical software is used. Fabricating data is clearly reprehensible, but on the other hand imputation methods create observations where none exist. So what is permitted and what is not?
Modeling atmospheric aging of small-scale wood combustion emissions: distinguishing causal effects from non-causal associations
Small-scale wood combustion is a significant source of particulate emissions. Atmospheric transformation of wood combustion emissions is a complex process involving multiple compounds interacting simultaneously. Thus, an advanced methodology is needed to study the process in order to gain a deeper understanding of the emissions. In this study, we introduce a methodology for simplifying this complex process by detecting dependencies of observed compounds based on a measured dataset. A statistical model was fitted to describe the evolution of combustion emissions with a system of differential equations derived from the measured data. The performance of the model was evaluated using simu…
Participation rates by educational levels have diverged during 25 years in Finnish health examination surveys
Background Declining participation rates in health examination surveys may impair the representativeness of surveys and introduce bias into the comparison of results between population groups if participation rates differ between them. Changes in the characteristics of non-participants over time may also limit comparability with earlier surveys. Methods We studied the association of socio-economic position with participation, and its changes over the past 25 years. Occupational class and educational level are used as indicators of socio-economic position. Data from six cross-sectional FINRISK surveys conducted between 1987 and 2012 in Finland were linked to national administrative registers…
Correcting for non-ignorable missingness in smoking trends
Data missing not at random (MNAR) is a major challenge in survey sampling. We propose an approach based on registry data to deal with non-ignorable missingness in health examination surveys. The approach relies on follow-up data available from administrative registers several years after the survey. For illustration we use data on smoking prevalence in the Finnish National FINRISK study conducted in 1972-1997. The data consist of measured survey information including missingness indicators, register-based background information and register-based time-to-disease survival data. The parameters of the missingness mechanism are estimable with these data although the original survey data are MNAR. The u…
Estimation of causal effects with small data in the presence of trapdoor variables
We consider the problem of estimating causal effects of interventions from observational data when well-known back-door and front-door adjustments are not applicable. We show that when an identifiable causal effect is subject to an implicit functional constraint that is not deducible from conditional independence relations, the estimator of the causal effect can exhibit bias in small samples. This bias is related to variables that we call trapdoor variables. We use simulated data to study different strategies to account for trapdoor variables and suggest how the related trapdoor bias might be minimized. The importance of trapdoor variables in causal effect estimation is illustrated with rea…
Selection bias was reduced by recontacting nonparticipants
Objective One of the main goals of health examination surveys is to provide unbiased estimates of health indicators at the population level. We demonstrate how multiple imputation methods may help to reduce the selection bias if partial data on some nonparticipants are collected. Study Design and Setting In the FINRISK 2007 study, a population-based health study conducted in Finland, a random sample of 10,000 men and women aged 25–74 years were invited to participate. The study included a questionnaire data collection and a health examination. A total of 6,255 individuals participated in the study. Out of 3,745 nonparticipants, 473 returned a simplified questionnaire after a recontact. Both…
Aerobic fitness, but not physical activity, is associated with grey matter volume in adolescents.
Higher levels of aerobic fitness and physical activity are linked to beneficial effects on brain health, especially in older adults. The generalizability of these earlier results to young individuals is not straightforward, because physiological responses (such as cardiovascular responses) to exercise may depend on age. Earlier studies have mostly focused on the effects of either physical activity or aerobic fitness on the brain. Yet, while physical activity indicates the amount of activity, aerobic fitness is an adaptive state or attribute that an individual has or achieves. Here, by measuring both physical activity and aerobic fitness in the same study, we aimed to differentiate the assoc…
Effectiveness of Technology-Based Distance Physical Rehabilitation Interventions for Improving Physical Functioning in Stroke: A Systematic Review and Meta-analysis of Randomized Controlled Trials.
OBJECTIVE: To study the effectiveness of technology-based distance physical rehabilitation interventions on physical functioning in stroke. DATA SOURCES: A systematic literature search was conducted in 6 databases from January 2000 to May 2018. STUDY SELECTION: Inclusion criteria applied the patient, intervention, comparison, outcome, study design framework as follows: (P) stroke; (I) technology-based distance physical rehabilitation interventions; (C) any comparison without the use of technology; (O) physical functioning; (S) randomized controlled trials (RCTs). The search identified in total 693 studies, and the screening of 162 full-text studies revealed 13 eligible studies. DATA EXTRACT…
Atmospheric aging of small-scale wood combustion emissions (model MECHA 1.0) – is it possible to distinguish causal effects from non-causal associations?
Abstract. Primary emissions of wood combustion are complex mixtures of hundreds or even over a thousand compounds, which pass through a series of chemical reactions and physical transformation processes in the atmosphere (aging). This aging process depends on atmospheric conditions, such as concentration of atmospheric oxidizing agents (OH radical, ozone and nitrate radicals), humidity and solar radiation, and is known to strongly affect the characteristics of atmospheric aerosols. However, there are only a few models that are able to represent the aging of emissions during their lifetime in the atmosphere. In this work, we implemented a model (Model for aging of Emissions in environmental CHAm…
Study Design in Causal Models
The causal assumptions, the study design and the data are the elements required for scientific inference in empirical research. The research is adequately communicated only if all of these elements and their relations are described precisely. Causal models with design describe the study design and the missing-data mechanism together with the causal structure and allow the direct application of causal calculus in the estimation of the causal effects. The flow of the study is visualized by ordering the nodes of the causal diagram in two dimensions by their causal order and the time of the observation. Conclusions on whether a causal or observational relationship can be estimated from the coll…
Systematic handling of missing data in complex study designs : experiences from the Health 2000 and 2011 Surveys
We present a systematic approach to the practical and comprehensive handling of missing data motivated by our experiences of analyzing longitudinal survey data. We consider the Health 2000 and 2011 Surveys (BRIF8901) where increased non-response and non-participation from 2000 to 2011 was a major issue. The model assumptions involved in the complex sampling design, repeated measurements design, non-participation mechanisms and associations are presented graphically using methodology previously defined as a causal model with design, i.e. a functional causal model extended with the study design. This tool forces the statistician to make the study design and the missing-data mechanism explicit…
Impact or No Impact for Women With Mild Knee Osteoarthritis: A Bayesian Meta‐Analysis of Two Randomized Controlled Trials With Contrasting Interventions
Objective We aim to predict the probability of a benefit from two contrasting exercise programs for a woman with a new diagnosis of mild knee osteoarthritis (OA). The short and long-term effects of aquatic resistance training (ART) and high-impact aerobic land training (HLT) compared with the control will be estimated. Methods Original data sets from two previously conducted randomised controlled trials (RCT) were combined and used in a Bayesian meta-analysis. Group differences in multiple response variables were estimated. Variables included cardiorespiratory fitness, dynamic maximum leg muscle power, maximal isometric knee extension and flexion force, pain, other symptoms and quality of l…
Efficient spatial designs using Hausdorff distances and Bayesian optimization
An iterative Bayesian optimisation technique is presented to find spatial designs of data that carry much information. We use the decision theoretic notion of value of information as the design criterion. Gaussian process surrogate models enable fast calculations of expected improvement for a large number of designs, while the full-scale value of information evaluations are only done for the most promising designs. The Hausdorff distance is used to model the similarity between designs in the surrogate Gaussian process covariance representation, and this allows the suggested algorithm to learn across different designs. We study properties of the Bayesian optimisation design algorithm in a sy…
Unicorn-Open science for assessing environmental state, human health and regional economy
Open data and models are becoming increasingly available, but there are not yet good methods and platforms to turn those into systematic evidence-based decision support. Unicorn will produce such an environment based on existing theoretical and practical knowledge about decision support and models. This consortium possesses the necessary models, data, and skills to set up an environment and demonstrate its functionality and usefulness with several case studies related to the environmental issues, human health, and economy. The Unicorn environment will be built in a generic and systematic way so that it could even become an international standard for evidence-based decision support. Devel…
Effectiveness of Exergame Intervention on Walking in Older Adults : A Systematic Review and Meta-Analysis of Randomized Controlled Trials
Objective. The objective of this review was to systematically evaluate the effectiveness of exergaming on walking in older adults. In addition, the aim was to investigate the relationship between the exergaming effect and age, baseline walking performance, exercise traits, technology used and the risk of bias. Methods. A literature search was carried out in the databases MEDLINE, CINAHL, CENTRAL, EMBASE, WoS, PsycInfo and PEDro up to January 10, 2020. Studies with a Randomized Controlled Trial (RCT) design, people ≥60 years of age without neurological disorders, comparison group with other exercise or no exercise, and walking related outcomes were included. Cochrane RoB2, meta-analysis, met…
The value of perfect and imperfect information in lake monitoring and management.
Highlights • Knowledge on the value of monitoring can assist decision-making in lake management. • We calculate value of perfect information theoretically. • We estimate value of imperfect information with Monte Carlo type of approach. • Generally, monitoring is profitable to invest in if VOI exceeds the cost. • Additional monitoring is profitable even if the lake is in good condition a priori. Uncertainty in the information obtained through monitoring complicates decision making about aquatic ecosystems management actions. We suggest the value of information (VOI) to assess the profitability of paying for additional monitoring information, when taking into account the costs and benefits of…
Tilastotieteestä valmistuneiden työllisyys : totuus ja tilastot [Employment of statistics graduates: truth and statistics]
At first glance, the figures in the Ministry of Education's vipunen.fi service give a completely wrong impression of the employment of statistics graduates. Combining the report's years 2009–2016 shows, however, that about 5% of statistics graduates have been unemployed one year after graduation. The misleading figures turn out to be caused by the data protection applied to the report. (Non-peer-reviewed.)
Identifying Causal Effects with the R Package causaleffect
Do-calculus is concerned with estimating the interventional distribution of an action from the observed joint probability distribution of the variables in a given causal structure. All identifiable causal effects can be derived using the rules of do-calculus, but the rules themselves do not give any direct indication whether the effect in question is identifiable or not. Shpitser and Pearl constructed an algorithm for identifying joint interventional distributions in causal models, which contain unobserved variables and induce directed acyclic graphs. This algorithm can be seen as a repeated application of the rules of do-calculus and known properties of probabilities, and it ultimately eit…
Value of information in multiple criteria decision making: an application to forest conservation
Abstract Developing environmental conservation plans involves assessing trade-offs between the benefits and costs of conservation. The benefits of conservation can be established with ecological inventories or estimated based on previously collected information. Conducting ecological inventories can be costly, and the additional information may not justify these costs. To clarify the value of these inventories, we investigate the multiple criteria value of information associated with the acquisition of improved ecological data. This information can be useful when informing the decision maker to acquire better information. We extend the concept of the value of information to a multiple crite…
Menopausal symptoms and cardiometabolic risk factors in middle-aged women : A cross-sectional and longitudinal study with 4-year follow-up
Objective To study associations of menopausal symptoms with cardiometabolic risk factors. Study design A cross-sectional and longitudinal study of a representative population sample of 1393 women aged 47–55 years with a sub-sample of 298 followed for four years. The numbers of vasomotor, psychological, somatic or pain, and urogenital menopausal symptoms were ascertained at baseline through self-report. Their associations with cardiometabolic risk factors were studied using linear regression and linear mixed-effect models. Models were adjusted for age, menopausal status, body mass index, the use of hormonal preparations, education, smoking, and alcohol consumption. Main outcome measures Card…
Recommendations for design and analysis of health examination surveys under selective non-participation
Background Decreasing participation rates and selective non-participation imperil the representativeness of health examination surveys (HESs). Methods Finnish HESs conducted in 1972–2012 are used to demonstrate that survey participation rates can be enhanced with well-planned recruitment procedures and that auxiliary information about survey non-participants can be used to reduce selection bias. Results Experiments incorporated into pilot surveys and experience from previously conducted surveys lead to practical improvements. For example, SMS reminders were adopted as a routine procedure in the Finnish HESs after testing their effect in a pilot study and finding them to be a cost-effective way to inc…
Avoin yritysdata hyödyttäisi yrityksiä ja yliopistoja [Open company data would benefit companies and universities]
Survey data and Bayesian analysis: a cost-efficient way to estimate customer equity
We present a Bayesian framework for estimating the customer lifetime value (CLV) and the customer equity (CE) based on the purchasing behavior deducible from the market surveys on customer purchasing behavior. The proposed framework systematically addresses the challenges faced when the future value of customers is estimated based on survey data. The scarcity of the survey data and the sampling variance are countered by utilizing the prior information and quantifying the uncertainty of the CE and CLV estimates by posterior distributions. Furthermore, information on the purchase behavior of the customers of competitors available in the survey data is integrated to the framework. The introduc…
Physical activity and aerobic fitness in relation to local and interhemispheric functional connectivity in adolescents' brains
Abstract Introduction Adolescents have experienced decreased aerobic fitness levels and insufficient physical activity levels over the past decades. While both physical activity and aerobic fitness are related to physical and mental health, little is known concerning how they manifest in the brain during this stage of development, characterized by significant physical and psychosocial changes. The aim of the study is to examine the associations of both physical activity and aerobic fitness with the brain's functional connectivity. Methods Here, we examined how physical activity and aerobic fitness are associated with local and interhemispheric functional connectivity of the adolescent brai…
Genome-Wide Association Study for Incident Myocardial Infarction and Coronary Heart Disease in Prospective Cohort Studies: The CHARGE Consortium
Background Data are limited on genome-wide association studies (GWAS) for incident coronary heart disease (CHD). Moreover, it is not known whether genetic variants identified to date also associate with risk of CHD in a prospective setting. Methods We performed a two-stage GWAS analysis of incident myocardial infarction (MI) and CHD in a total of 64,297 individuals (including 3898 MI cases, 5465 CHD cases). SNPs that passed an arbitrary threshold of 5×10⁻⁶ in Stage I were taken to Stage II for further discovery. Furthermore, in an analysis of prognosis, we studied whether known SNPs from former GWAS were associated with total mortality in individuals who experienced MI during follow-up. Res…
Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach
Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…
Non-participation modestly increased with distance to the examination clinic among adults in Finnish health examination surveys
Aims: Health examination surveys (HES) provide important information about population health and health-related factors, but declining participation rates threaten the representativeness of collected data. It is hard to conduct national HESs at examination clinics near to every sampled individual. Thus, it is interesting to look into the possible association between the distance from home to the examination clinic and non-participation, and whether there is a certain distance after which the participation activity decreases considerably. Methods: Data from two national HESs conducted in Finland in 2011 and 2012 were used and a logistic regression model was fitted to investigate how distanc…
Odottelua pysäkillä [Waiting at the stop]
Simulation Framework for Realistic Large-scale Individual-level Data Generation with an Application in the Health Domain
We propose a framework for realistic data generation and simulation of complex systems and demonstrate its capabilities in the health domain. The main use cases of the framework are predicting the development of risk factors and disease occurrence, evaluating the impact of interventions and policy decisions, and statistical method development. We present the fundamentals of the framework using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing and fast random number generation which…
Estimating Mean Lifetime from Partially Observed Events in Nuclear Physics
Abstract The mean lifetime is an important characteristic of particles to be identified in nuclear physics. State-of-the-art particle detectors can identify the arrivals of single radioactive nuclei as well as their subsequent radioactive decays (departures). Challenges arise when the arrivals and departures are unmatched and the departures are only partially observed. An inefficient solution is to run experiments where the arrival rate is set very low to allow for the matching of arrivals and departures. We propose an estimation method that works for a wide range of arrival rates. The method combines an initial estimator and a numerical bias correction technique. Simulations and examples b…
Effectiveness of physical activity promoting technology-based distance interventions compared to usual care. Systematic review, meta-analysis and meta-regression.
Introduction Technology has been thought to have strong potential for promoting physical activity, but the evidence has remained unclear. The aim of this study was to examine whether a technology-based distance intervention promoting physical activity is more effective than a physical activity intervention without the use of technology. This systematic review is registered in Prospero (CRD42016035831). Evidence acquisition A systematic literature search of studies published between January 2000 and December 2015 was conducted in CENTRAL, EMBASE, Ovid MEDLINE, CINAHL, PsycINFO, OT-Seeker, WOS and PEDro. Studies were selected by two independent authors applying the following PICOS criteria P) …
Impact or No Impact for Women With Mild Knee Osteoarthritis : A Bayesian Meta-Analysis of Two Randomized Controlled Trials With Contrasting Interventions
Objective: We aim to predict the probability of a benefit from two contrasting exercise programs for a woman with a new diagnosis of mild knee osteoarthritis (OA). The short and long-term effects of aquatic resistance training (ART) and high-impact aerobic land training (HLT) compared with the control will be estimated. Methods: Original data sets from two previously conducted randomised controlled trials (RCT) were combined and used in a Bayesian meta-analysis. Group differences in multiple response variables were estimated. Variables included cardiorespiratory fitness, dynamic maximum leg muscle power, maximal isometric knee extension and flexion force, pain, other symptoms and quality of…
Codes and datasets related to https://doi.org/10.5194/gmd-2020-13
Codes and datasets related to https://doi.org/10.5194/gmd-2020-13, discussion paper.
The value and costs of information for conservation decisions – a comparison of inventory strategies using imperfect and perfect information
Conservation decisions should be made considering the information available. The quality of information can vary, depending on how the data is collected. High quality (expensive) information could be obtained from detailed field inventories, or lower quality (inexpensive / free) information could be obtained from remotely sensed information or previously acquired information. From a Bayesian statistics perspective, the value of collecting better information can be evaluated. The remotely sensed or previously acquired information could serve as prior information while the detailed field inventories could be the posterior information. For a simple one stand decision, the value of information …
Survey on mobile phone purchases, February 2013
The mobile phone data were collected in February 2013 together with the National Consumer Net Shopping Study conducted by the market research company Tietoykkönen Oy. The target group was mobile phone owners aged 15–79 in Finland. The data were collected through telephone interviews using a computer-assisted telephone interviewing (CATI) system. The sample source was the targeting service Fonecta Finder B2C, which contains all publicly available phone numbers in Finland. Random sampling was carried out by setting quotas on respondents’ gender, age and region at the major-region level, excluding the autonomous region of Åland. The sample size was 536 completed interviews. All 536 survey respondents had a mo…