Author: Santtu Tikka

0000000001327694

showing 18 related works from this author

The gallium anomaly reassessed using a Bayesian approach

2020

The solar-neutrino detectors GALLEX and SAGE were calibrated using the electron-neutrino flux from $^{37}$Ar and $^{51}$Cr calibration sources. A deficit in the measured neutrino flux was recorded by counting the number of neutrino-induced conversions of $^{71}$Ga nuclei to $^{71}$Ge nuclei. This deficit was dubbed the "gallium anomaly", and it has led to speculation about beyond-the-standard-model physics in the form of eV-mass sterile neutrinos. Notably, this anomaly has defied a final solution for more than 20 years. Here we reassess the statistical significance of this anomaly and improve the related statistical approaches by treating the neutrino experiments as repeated Bernoulli…
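As a rough illustration of what a Bayesian treatment of repeated Bernoulli trials can look like, the sketch below performs a conjugate Beta update. It is not the model used in the paper, whose details are cut off above; the function names, the uniform Beta(1, 1) prior, and the counts are assumptions made purely for illustration.

```python
from fractions import Fraction


def beta_posterior(successes, trials, prior_a=1, prior_b=1):
    """Return the (a, b) parameters of the Beta posterior for a Bernoulli rate.

    A Beta(prior_a, prior_b) prior combined with `successes` out of `trials`
    Bernoulli outcomes gives a Beta(prior_a + successes, prior_b + failures)
    posterior.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, kept exact with fractions."""
    return Fraction(a, a + b)


# Hypothetical counts chosen for illustration only; not experimental data.
a, b = beta_posterior(successes=7, trials=10)
posterior_mean = beta_mean(a, b)
```

Under these assumptions the posterior concentrates around the observed success fraction as the number of trials grows, and the prior parameters control how strongly small samples are pulled toward the prior mean.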

High Energy Physics - Phenomenology; High Energy Physics - Phenomenology (hep-ph); FOS: Physical sciences

researchProduct

Sima – an Open-source Simulation Framework for Realistic Large-scale Individual-level Data Generation

2021

We propose a framework for realistic data generation and the simulation of complex systems and demonstrate its capabilities in a health domain example. The main use cases of the framework are predicting the development of variables of interest, evaluating the impact of interventions and policy decisions, and supporting statistical method development. We present the fundamentals of the framework by using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing, and fast random number gener…

data structures; open source; health sector; source code; statistical methods; forecasts; simulation; mathematical models; data processing; information systems

Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach

2021

Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Owing to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…

FOS: Computer and information sciences; Statistics and Probability; Computer Science - Machine Learning; causality; Computer Science - Artificial Intelligence; Heuristic (computer science); Computer science; education; Machine Learning (stat.ML); transportability; computer.software_genre; 01 natural sciences; Machine Learning (cs.LG); R language; missing data; QA76.75-76.765; QA273-280; 010104 statistics & probability; do-calculus; selection bias; case-control design; meta-analysis; Statistics - Machine Learning; Search algorithm; 0101 mathematics; Parametric statistics; inference; search algorithms; 113 Computer and information sciences; Identification (information); Artificial Intelligence (cs.AI); Causal inference; Identifiability; Probability distribution; Data mining; Statistics, Probability and Uncertainty; computer; Software; Journal of Statistical Software

Body weight and premature retirement: population-based evidence from Finland.

2021

Background: Health status is a principal determinant of labour market participation. In this study, we examined whether excess weight is associated with withdrawal from the labour market owing to premature retirement. Methods: The analyses were based on nationally representative data from Finland over the period 2001–15 (N ∼ 2500). The longitudinal data included objective measures of body weight (i.e. body mass index and waist circumference) linked to register-based information on actual retirement age. The association between the body weight measures and premature retirement was modelled using cubic B-splines via logistic regression. The models accounted for other possible risk fact…

Male; Waist; Overweight; Body weight; Logistic regression; Weight Gain; 03 medical and health sciences; 0302 clinical medicine; Risk Factors; medicine; Humans; AcademicSubjects/MED00860; AcademicSubjects/SOC01210; 030212 general & internal medicine; Occupations; Finland; Retirement; business.industry; Confounding; Work and Health; Public Health, Environmental and Occupational Health; 030210 environmental & occupational health; Weight distribution; Female; medicine.symptom; business; Body mass index; AcademicSubjects/SOC02610; Retirement age; Demography; European journal of public health

Surrogate outcomes and transportability

2019

Identification of causal effects is one of the most fundamental tasks of causal inference. We consider an identifiability problem where some experimental and observational data are available but neither data source alone is sufficient for the identification of the causal effect of interest. Instead of the outcome of interest, surrogate outcomes are measured in the experiments. This problem is a generalization of identifiability using surrogate experiments and we label it as surrogate outcome identifiability. We show that the concept of transportability provides a sufficient criterion for determining surrogate outcome identifiability for a large class of queries.

FOS: Computer and information sciences; experimentation; causality; Generalization; Computer science; Computer Science - Artificial Intelligence; 02 engineering and technology; Machine learning; computer.software_genre; Outcome (game theory); Theoretical Computer Science; Methodology (stat.ME); do-calculus; Artificial Intelligence; 020204 information systems; algorithms; 0202 electrical engineering, electronic engineering, information engineering; Statistics - Methodology; ta113; inference; ta112; experiment; business.industry; Surrogate endpoint; graph theory; Applied Mathematics; Causal effect; ta111; graph; identifiability; Identification (information); Artificial Intelligence (cs.AI); Causal inference; 020201 artificial intelligence & image processing; Observational study; business; mediator; computer; Software

Simplifying Probabilistic Expressions in Causal Inference

2018

Obtaining a non-parametric expression for an interventional distribution is one of the most fundamental tasks in causal inference. Such an expression can be obtained for an identifiable causal effect by an algorithm or by manual application of do-calculus. Often we are left with a complicated expression which can lead to biased or inefficient estimates when missing data or measurement errors are involved. We present an automatic simplification algorithm that seeks to eliminate symbolically unnecessary variables from these expressions by taking advantage of the structure of the underlying graphical model. Our method is applicable to all causal effect formulas and is readily available in the …
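The simplest kind of redundancy such a simplification can remove is a variable that is summed out again immediately: the expression sum_z P(y | x, z) P(z | x) equals P(y | x) by the law of total probability. The sketch below checks this numerically on a randomly generated joint distribution of three binary variables. It is not the paper's algorithm, which exploits the graph structure; the helper names and the random distribution are assumptions made for illustration.

```python
import itertools
import random

NAMES = ("x", "y", "z")


def random_joint(seed=0):
    """Build a strictly positive joint distribution P(x, y, z) over binary variables."""
    rng = random.Random(seed)
    weights = {v: rng.random() + 0.1 for v in itertools.product((0, 1), repeat=3)}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


def prob(joint, **fixed):
    """Marginal probability that the named variables take the given values."""
    return sum(
        p
        for values, p in joint.items()
        if all(values[NAMES.index(name)] == value for name, value in fixed.items())
    )


def unsimplified(joint, x, y):
    """Evaluate sum_z P(y | x, z) * P(z | x), which still mentions z."""
    return sum(
        prob(joint, x=x, y=y, z=z) / prob(joint, x=x, z=z)
        * prob(joint, x=x, z=z) / prob(joint, x=x)
        for z in (0, 1)
    )


def simplified(joint, x, y):
    """Evaluate P(y | x), the same quantity with z eliminated."""
    return prob(joint, x=x, y=y) / prob(joint, x=x)


joint = random_joint()
max_gap = max(
    abs(unsimplified(joint, x, y) - simplified(joint, x, y))
    for x, y in itertools.product((0, 1), repeat=2)
)
```

The two functions agree up to floating-point error, but only the simplified form avoids estimating anything that involves z, which is the practical motivation the abstract gives for simplification.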

FOS: Computer and information sciences; Computer Science - Artificial Intelligence; graph theory; simplicity; simplification; graphical model; Machine Learning (stat.ML); Machine Learning (cs.LG); Computer Science - Learning; probabilistic expression; Artificial Intelligence (cs.AI); Statistics - Machine Learning; causality; graphical symbols; causal inference; graphs

Sublethal Pyrethroid Insecticide Exposure Carries Positive Fitness Effects Over Generations in a Pest Insect

2019

Stress tolerance and adaptation to stress are known to facilitate species invasions. Many invasive species are also pests and insecticides are used to control them, which could shape their overall tolerance to stress. It is well-known that heavy insecticide usage leads to selection of resistant genotypes but less is known about potential effects of mild sublethal insecticide usage. We studied whether stressful, sublethal pyrethroid insecticide exposure has within-generational and/or maternal transgenerational effects on fitness-related traits in the Colorado potato beetle (Leptinotarsa decemlineata) and whether maternal insecticide exposure affects insecticide tolerance of offspring…

Male; 0301 basic medicine; Insecticides; Offspring; Science; Evolutionary ecology; Article; Insecticide Resistance; Toxicology; 03 medical and health sciences; 0302 clinical medicine; pest insects; Pyrethrins; Animals; non-native species; Leptinotarsa; species invasions; adaptation; Larva; Multidisciplinary; Invasive species; stress tolerance; biology; Q; Colorado potato beetle; R; stress; 15. Life on land; Pesticide; biology.organism_classification; resistance; Coleoptera; Pupa; 030104 developmental biology; Medicine; Female; PEST analysis; Introduced Species; adaptation to stress; 030217 neurology & neurosurgery; Scientific Reports

Do-search: a tool for causal inference and study design with multiple data sources

2020

Epidemiologic evidence is based on multiple data sources including clinical trials, cohort studies, surveys, registries, and expert opinions. Merging information from different sources opens up new possibilities for the estimation of causal effects. We show how causal effects can be identified and estimated by combining experiments and observations in real and realistic scenarios. As a new tool, we present do-search, a recently developed algorithmic approach that can determine the identifiability of a causal effect. The approach is based on do-calculus, and it can utilize data with nontrivial missing data and selection bias mechanisms. When the effect is identifiable, do-search outputs an i…

FOS: Computer and information sciences; Epidemiology; Computer science; media_common.quotation_subject; Information Storage and Retrieval; Machine learning; computer.software_genre; 01 natural sciences; Statistics - Applications; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Humans; Applications (stat.AP); 030212 general & internal medicine; 0101 mathematics; Salt intake; Statistics - Methodology; media_common; Selection bias; business.industry; Nutrition Surveys; Missing data; Causality; Research Design; Causal inference; Meta-analysis; Survey data collection; Identifiability; Artificial intelligence; business; computer

Enhancing identification of causal effects by pruning

2018

Causal models communicate our assumptions about causes and effects in real-world phenomena. Often the interest lies in the identification of the effect of an action which means deriving an expression from the observed probability distribution for the interventional distribution resulting from the action. In many cases an identifiability algorithm may return a complicated expression that contains variables that are in fact unnecessary. In practice this can lead to additional computational burden and increased bias or inefficiency of estimates when dealing with measurement error or missing data. We present graphical criteria to detect variables which are redundant in identifying causal effe…

inference; FOS: Computer and information sciences; algorithm; causal model; Machine Learning (stat.ML); Machine Learning (cs.LG); Computer Science - Learning; pruning (plants); machine learning; Statistics - Machine Learning; identifiability; algorithms; causality; causal inference; identification

Estimation of causal effects with small data in the presence of trapdoor variables

2021

We consider the problem of estimating causal effects of interventions from observational data when well-known back-door and front-door adjustments are not applicable. We show that when an identifiable causal effect is subject to an implicit functional constraint that is not deducible from conditional independence relations, the estimator of the causal effect can exhibit bias in small samples. This bias is related to variables that we call trapdoor variables. We use simulated data to study different strategies to account for trapdoor variables and suggest how the related trapdoor bias might be minimized. The importance of trapdoor variables in causal effect estimation is illustrated with rea…

FOS: Computer and information sciences; Statistics and Probability; Economics and Econometrics; bias; causality; Computer science; Bayesian probability; Context (language use); 01 natural sciences; Statistics - Computation; Methodology (stat.ME); 010104 statistics & probability; 0504 sociology; Econometrics; 0101 mathematics; Computation (stat.CO); Statistics - Methodology; Estimation; Small data; Bayesian method; 05 social sciences; 050401 social sciences methods; Estimator; Bayesian estimation; identifiability; Constraint (information theory); functional constraint; Conditional independence; Observational study; Statistics, Probability and Uncertainty; Social Sciences (miscellaneous)

Identifying Causal Effects via Context-specific Independence Relations

2019

Causal effect identification considers whether an interventional probability distribution can be uniquely determined from a passively observed distribution in a given causal structure. If the generating system induces context-specific independence (CSI) relations, the existing identification procedures and criteria based on do-calculus are inherently incomplete. We show that deciding causal effect non-identifiability is NP-hard in the presence of CSIs. Motivated by this, we design a calculus and an automated search procedure for identifying causal effects in the presence of CSIs. The approach is provably sound and it includes standard do-calculus as a special case. With the approach we can …

FOS: Computer and information sciences; context-specific independence relations; Computer Science - Machine Learning; Artificial Intelligence (cs.AI); Computer Science - Artificial Intelligence; education; causality; causal effect identification; 113 Computer and information sciences; Machine Learning (cs.LG)

Identifying Causal Effects with the R Package causaleffect

2017

Do-calculus is concerned with estimating the interventional distribution of an action from the observed joint probability distribution of the variables in a given causal structure. All identifiable causal effects can be derived using the rules of do-calculus, but the rules themselves do not give any direct indication whether the effect in question is identifiable or not. Shpitser and Pearl constructed an algorithm for identifying joint interventional distributions in causal models, which contain unobserved variables and induce directed acyclic graphs. This algorithm can be seen as a repeated application of the rules of do-calculus and known properties of probabilities, and it ultimately eit…
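To make the idea of deriving an interventional distribution from an observed joint distribution concrete, the sketch below applies back-door adjustment, P(y | do(x)) = sum_z P(y | x, z) P(z), to a toy model with an observed confounder Z (Z -> X, Z -> Y, X -> Y). This is only the simplest special case, not the Shpitser and Pearl algorithm and not the causaleffect package itself; the model parameters and function names are hypothetical.

```python
from itertools import product

# Hypothetical parameters of a toy model with graph Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.6, 1: 0.4}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.7}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y = 1 | X = x, Z = z)
                 (1, 0): 0.4, (1, 1): 0.8}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def observational_joint():
    """Observed joint P(x, y, z) implied by the toy model."""
    return {
        (x, y, z): P_Z[z] * bern(P_X1_GIVEN_Z[z], x) * bern(P_Y1_GIVEN_XZ[(x, z)], y)
        for x, y, z in product((0, 1), repeat=3)
    }


def marginal(joint, x=None, y=None, z=None):
    """Sum the joint over every variable that is left unspecified."""
    return sum(
        p
        for (xv, yv, zv), p in joint.items()
        if (x is None or xv == x) and (y is None or yv == y) and (z is None or zv == z)
    )


def backdoor_adjustment(joint, x, y):
    """P(y | do(x)) = sum_z P(y | x, z) P(z), using only the observed joint."""
    return sum(
        marginal(joint, x=x, y=y, z=z) / marginal(joint, x=x, z=z) * marginal(joint, z=z)
        for z in (0, 1)
    )


def naive_conditional(joint, x, y):
    """P(y | x), which mixes the causal effect with confounding through Z."""
    return marginal(joint, x=x, y=y) / marginal(joint, x=x)


joint = observational_joint()
causal = backdoor_adjustment(joint, x=1, y=1)
naive = naive_conditional(joint, x=1, y=1)
```

In this toy model the adjusted quantity differs from the plain conditional probability (about 0.56 versus 0.68 for X = 1, Y = 1), which shows why an identifying expression is needed rather than conditioning alone. When the confounder is unobserved, this simple adjustment is unavailable and a general identification algorithm has to decide whether any expression exists.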

Statistics and Probability; FOS: Computer and information sciences; Theoretical computer science; causality; Distribution (number theory); C-component; Computer science; causal model; 02 engineering and technology; Causal structure; Methodology (stat.ME); 03 medical and health sciences; 0302 clinical medicine; do-calculus; Joint probability distribution; 0202 electrical engineering, electronic engineering, information engineering; 030212 general & internal medicine; DAG; identifiability; graph; hedge; d-separation; lcsh:Statistics; lcsh:HA1-4737; Statistics - Methodology; computer.programming_language; ta112; Expression (mathematics); PEARL (programming language); Action (philosophy); 020201 artificial intelligence & image processing; Statistics, Probability and Uncertainty; computer; Software

The effects of short-term glyphosate-based herbicide exposure on insect gene expression profiles

2023

Glyphosate-based herbicides (GBHs) are the most frequently used herbicides worldwide. The use of GBHs is intended to tackle weeds, but GBHs have been shown to affect the life-history traits and antioxidant defense system of invertebrates found in agroecosystems. Thus far, the effects of GBHs on detoxification pathways among invertebrates have not been sufficiently investigated. We performed two different experiments—1) the direct pure glyphosate and GBH treatment, and 2) the indirect GBH experiment via food—to examine the possible effects of environmentally relevant GBH levels on the survival of the Colorado potato beetle (Leptinotarsa decemlineata) and the expression profiles of their deto…

Physiology; Colorado potato beetle; Cytochrome P450; pesticides; acetylcholinesterase; herbicides; detoxification genes; glyphosate; Insect Science; insects; gene expression; Roundup; Journal of Insect Physiology

Body weight and premature retirement: population-based evidence from Finland

2021

Background: Health status is a principal determinant of labour market participation. In this study, we examined whether excess weight is associated with withdrawal from the labour market owing to premature retirement. Methods: The analyses were based on nationally representative data from Finland over the period 2001–15 (N ∼ 2500). The longitudinal data included objective measures of body weight (i.e. body mass index and waist circumference) linked to register-based information on actual retirement age. The association between the body weight measures and premature retirement was modelled using cubic B-splines via logistic regression. The models accounted for other possible risk factors and p…

smoking; Finland; retirement; work ability; health status; overweight; labor market; body mass index procedure; body mass index; waist circumference

Simulation Framework for Realistic Large-scale Individual-level Data Generation with an Application in the Health Domain

2020

We propose a framework for realistic data generation and simulation of complex systems and demonstrate its capabilities in the health domain. The main use cases of the framework are predicting the development of risk factors and disease occurrence, evaluating the impact of interventions and policy decisions, and statistical method development. We present the fundamentals of the framework using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing and fast random number generation which…

Methodology (stat.ME); FOS: Computer and information sciences; Applications (stat.AP); Statistics - Applications; Statistics - Methodology

dynamite: An R Package for Dynamic Multivariate Panel Models

2023

dynamite is an R package for Bayesian inference of intensive panel (time series) data comprising multiple measurements per individual, for multiple individuals, measured over time. The package supports joint modeling of multiple response variables, time-varying and time-invariant effects, a wide range of discrete and continuous distributions, group-specific random effects, latent factors, and customization of prior distributions of the model parameters. Models in the package are defined via a user-friendly formula interface, and estimation of the posterior distribution of the model parameters takes advantage of state-of-the-art Markov chain Monte Carlo methods. The package enables efficient computation of …

Methodology (stat.ME); FOS: Computer and information sciences; Statistics - Methodology

Improving identification algorithms in causal inference

2018

Causal models provide a formal approach to the study of causality. One of the most useful features of causal modeling is that it enables one to make causal claims about a phenomenon using observational data alone under suitable conditions. This feature enables the analysis of interventions that may be infeasible to conduct in the real world for practical or ethical reasons. The uncertainty associated with the variables of interest is taken into account by including a probability distribution in the causal model, making it possible to study the effects of external interventions by examining how this distribution is changed by the action. The probability distribution of a specific variable i…

inference; R language; algorithms; causality; variables; models; probability; graphs

Itseopiskelumateriaalia: Kausaalimallintamisen perusteet tilastotieteessä (Self-study material: Fundamentals of causal modeling in statistics)

2016

This handout is intended as self-study material for master's-level students of statistics (or readers with equivalent knowledge). In particular, familiarity with probability theory and generalized linear models is required. The purpose of the material is to explain to the reader the fundamentals of the causal modeling and causal calculus developed by Judea Pearl. The material is based on Judea Pearl's book Causality [Pearl, 2009]. For each theorem and definition, the section of the book in which it can be found is given. Not peer-reviewed.

graph theory; statistics; causal modeling
