Search results for "hypothesis testing"
showing 10 items of 124 documents
Estimating the Number of Changepoints in Segmented Regression Models: Comparative Study and Application
2020
This paper deals with the problem of selecting the number of changepoints in segmented regression models. The aim is to review selection criteria, namely information criteria and hypothesis testing, and to propose a novel application in the context of students' careers in higher education. The performance of the selection criteria is assessed through simulation studies. Furthermore, we investigate the relationship between university students' performance and one of its main determinants, finding that this relationship in fact follows a broken line.
Time Series Momentum in the US Stock Market: Empirical Evidence and Theoretical Implications
2020
There is much controversy in the academic literature on the presence of short-term trends in financial markets and the trend-following strategy's profitability. This paper restricts its attention to the study of time-series momentum (TSMOM) in the US stock market. The paper aims to suggest answers to several important questions regarding TSMOM and to explain the existing controversy. Our answer to the question, whether short-term trends exist, is strongly affirmative. For the first time, we suppose that the returns follow a p-order autoregressive process with p>1 and evaluate this process's parameters. Fairly accurate knowledge of the momentum generating process allows us to provide analyti…
Interval Length Analysis in Multi Layer Model
2009
In this paper we present a hypothesis test of randomness based on the probability density function of the symmetrized Kullback-Leibler distance estimated, via a Monte Carlo simulation, from the distributions of the interval lengths detected using the Multi-Layer Model (MLM). The MLM is based on the generation of several sub-samples of an input signal; in particular, a set of optimal cut-set thresholds is applied to the data to detect signal properties. In this sense, MLM is a general pattern detection method and can be considered a preprocessing tool for pattern discovery. At present, the test has been evaluated on simulated signals which respect a particular tiled microarray approach …
Comparing local structures of spatio-temporal point processes on linear networks
2022
We employ the Local Indicators of Spatio-Temporal Association (LISTA) functions on linear networks to build a statistical test for local second-order structure. This allows us to identify differences in the spatio-temporal clustering behaviour of two point patterns, a point pattern of interest and a background one, both occurring on the same linear network. We illustrate the proposed methodology by analysing a traffic-related problem.
A novel sequential testing procedure for selecting the number of changepoints in segmented regression models
2023
In this work, we address the problem of selecting the number of changepoints in segmented regression models. We propose a novel stepwise procedure and assess its performance through simulation studies. We demonstrate that our proposal behaves well with Gaussian and Binomial responses.
Logit analysis in L2 research: measuring L1 and L2/Ln effects
1998
In quantitatively oriented L2 studies, we normally contrast two phenomena at a time, both at the group and individual level. However, it is generally acknowledged that what a learner produces is determined by a multitude of factors influencing the interlanguage (IL) simultaneously. When dealing with discrete, nominal categories, the numerical and causal relations between the variables involved cannot be adequately captured in an analysis where the phenomenon to be studied and the explanatory factors are subjected to a series of pairwise statistical tests. Instead of the two-variable approach, multivariate techniques should be applied, since they allow for the examination of the effects of m…
Nature and impacts of spatial effects on real estate values: the case of the urbanized Île-de-France area
2013
Knowledge Discovery from the Programme for International Student Assessment
2017
The Programme for International Student Assessment (PISA) is a worldwide study that assesses the proficiencies of 15-year-old students in reading, mathematics, and science every three years. Despite the high quality and open availability of the PISA data sets, which call for big data learning analytics, academic research using this rich and carefully collected data is surprisingly sparse. Our research contributes to reducing this deficit by discovering novel knowledge from the PISA through the development and use of appropriate methods. Since Finland has been the country of most international interest in the PISA assessment, a relevant review of the Finnish educational system is provided. T…
Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora
2018
Log-likelihood and Chi-square tests are probably the most popular statistical tests used in corpus linguistics, especially when the research is aiming to describe the lexical variations between corpora. However, because this specific use of the Chi-square test is not valid, it produces far too many significant results. This paper explains the source of the problem (i.e., the non-independence of the observations), the reasons for which the usual solutions are not acceptable and which kinds of statistical test should be used instead. A corpus analysis conducted on the lexical differences between American and British English is then reported, in order to demonstrate the problem and to confirm …
Statistical power of disease cluster and clustering tests for rare diseases: A simulation study of point sources
2012
Two recent epidemiological studies on clustering of childhood leukemia showed different results on the statistical power of disease cluster and clustering tests, possibly an effect of spatial data aggregation. Eight different leukemia cluster scenarios were simulated using individual addresses of all 1,009,332 children living in Denmark in 2006. For each scenario, a number of point sources were defined with an increased risk ratio at the centroid, decreasing linearly to 1.0 at the edge; aggregation levels were administrative units of Danish municipalities and squares of 5, 12.5 and 25 km². Six statistical methods were compared. Generally, statistical power decreased with increasing s…