
AUTHOR

Fabrizio Lillo

Applying complexity science to air traffic management

Accepted version obtained from WestminsterResearch, the online digital archive of the University of Westminster. Complexity science is the multidisciplinary study of complex systems. Its marked network orientation lends itself well to transport contexts. Key features of complexity science are introduced and defined, with a specific focus on the application to air traffic management. An overview of complex network theory is presented, with examples of its corresponding metrics and multiple scales. Complexity science is starting to make important contributions to performance assessment and system design: selected, applied air traffic management case studies are explored. The important contexts of…

research product

Modeling foreign exchange market activity around macroeconomic news: Hawkes-process approach

We present a Hawkes-model approach to the foreign exchange market in which the high-frequency price dynamics is affected by a self-exciting mechanism and an exogenous component, generated by the pre-announced arrival of macroeconomic news. By focusing on time windows around the news announcement, we find that the model is able to capture the increase of trading activity after the news, both when the news has a sizable effect on volatility and when this effect is negligible, either because the news is not important or because the announcement is in line with the forecast by analysts. We extend the model by considering noncausal effects, due to the fact that the existence of the news (but not…
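As a rough illustration of the self-exciting mechanism mentioned above, the following sketch evaluates the conditional intensity of a univariate Hawkes process. It is not the model of the paper: the exponential kernel, the function name `hawkes_intensity`, and the parameters `mu` (baseline rate), `alpha` (jump size) and `beta` (decay rate) are all assumptions made for this example, and the exogenous news component is not modeled.

```python
from math import exp


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumes an exponential kernel (an illustrative choice):
        lambda(t) = mu + sum over past events s < t of alpha * exp(-beta * (t - s))
    """
    # Only events strictly before t contribute to the intensity at t.
    excitation = sum(alpha * exp(-beta * (t - s)) for s in event_times if s < t)
    return mu + excitation


# Hypothetical event times (for example, trade timestamps in seconds).
events = [0.0, 0.4, 0.5]

# Shortly after a burst of events the intensity is above the baseline,
# and it relaxes back towards mu as time passes without new events.
just_after = hawkes_intensity(0.6, events, mu=0.5, alpha=0.8, beta=1.2)
much_later = hawkes_intensity(20.0, events, mu=0.5, alpha=0.8, beta=1.2)
```

With these hypothetical values, `just_after` is larger than `much_later`, which is the time clustering of activity that a self-exciting term is meant to capture.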

research product

Multi-scale analysis of the European airspace using network community detection

We show that the European airspace can be represented as a multi-scale traffic network whose nodes are airports, sectors, or navigation points and links are defined and weighted according to the traffic of flights between the nodes. By using a unique database of the air traffic in the European airspace, we investigate the architecture of these networks with a special emphasis on their community structure. We propose that unsupervised network community detection algorithms can be used to monitor the current use of the airspaces and improve it by guiding the design of new ones. Specifically, we compare the performance of three community detection algorithms, also by using a null model which t…

research product

Anomalous Spreading of Power-Law Quantum Wave Packets

We introduce power-law tail quantum wave packets. We show that they can be seen as eigenfunctions of a Hamiltonian with a physical potential. We prove that the free evolution of these packets presents an asymptotic decay of the maximum of the wave packets which is anomalous for an interval of the characterizing power-law exponent. We also prove that the number of finite moments of the wave packets is a conserved quantity during the evolution of the wave packet in the free space.

research product

Identification of clusters of investors from their real trading activity in a financial market

We use statistically validated networks, a recently introduced method to validate links in a bipartite system, to identify clusters of investors trading in a financial market. Specifically, we investigate a special database that allows us to track the trading activity of individual investors of the stock Nokia. We find that many statistically detected clusters of investors show a very high degree of synchronization in the time when they decide to trade and in the trading action taken. We investigate the composition of these clusters and we find that several of them show an over-expression of specific categories of investors.

research product

Variety and volatility in financial markets

We study the price dynamics of stocks traded in a financial market by considering the statistical properties both of a single time series and of an ensemble of stocks traded simultaneously. We use the $n$ stocks traded in the New York Stock Exchange to form a statistical ensemble of daily stock returns. For each trading day of our database, we study the ensemble return distribution. We find that a typical ensemble return distribution exists in most of the trading days with the exception of crash and rally days and of the days subsequent to these extreme events. We analyze each ensemble return distribution by extracting its first two central moments. We observe that these moments are fluctua…

research product

There's more to volatility than volume

It is widely believed that fluctuations in transaction volume, as reflected in the number of transactions and to a lesser extent their size, are the main cause of clustered volatility. Under this view bursts of rapid or slow price diffusion reflect bursts of frequent or less frequent trading, which cause both clustered volatility and heavy tails in price returns. We investigate this hypothesis using tick by tick data from the New York and London Stock Exchanges and show that only a small fraction of volatility fluctuations are explained in this manner. Clustered volatility is still very strong even if price changes are recorded on intervals in which the total transaction volume or number of…

research product

Do firms share the same functional form of their growth rate distribution? A statistical test

We introduce a new statistical test of the hypothesis that a balanced panel of firms have the same growth rate distribution or, more generally, that they share the same functional form of growth rate distribution. We applied the test to data on European Union and US publicly quoted manufacturing firms, considering functional forms belonging to the Subbotin family of distributions. While our hypotheses are rejected for the vast majority of sets at the sector level, we cannot reject them at the subsector level, indicating that homogeneous panels of firms could be described by a common functional form of growth rate distribution.

research product

Air Transport Network: a short review

research product

Modelling systemic price cojumps with Hawkes factor models

Instabilities in the price dynamics of a large number of financial assets are a clear sign of systemic events. By investigating a set of 20 high cap stocks traded at the Italian Stock Exchange, we find that there is a large number of high frequency cojumps. We show that the dynamics of these jumps is described neither by a multivariate Poisson nor by a multivariate Hawkes model. We introduce a Hawkes one factor model which is able to capture simultaneously the time clustering of jumps and the high synchronization of jumps across assets.

research product

How Tick Size Affects the High Frequency Scaling of Stock Return Distributions

We study the high frequency scaling of the distributions of returns for stocks traded on the NASDAQ market as a function of the tick-to-price ratio. The tick-to-price ratio is a measure of an effective tick size. We find dramatic differences between distributions for assets with large and small tick-to-price ratio. The presence of returns clustering is evident for large tick size assets. The statistical differences between large and small tick size assets appear to reduce at higher time scales of observation. A possible way to explain returns dynamics for large tick size assets is the coupling of returns with bid-ask spread dynamics. A simple Markov-switching model is able to reproduce the pro…

research product

Inverted and mirror repeats in model nucleotide sequences.

We analytically and numerically study the probabilistic properties of inverted and mirror repeats in model sequences of nucleic acids. We consider both perfect and non-perfect repeats, i.e. repeats with mismatches and gaps. The considered sequence models are independent identically distributed (i.i.d.) sequences, Markov processes and long range sequences. We show that the number of repeats in correlated sequences is significantly larger than in i.i.d. sequences and that this discrepancy increases exponentially with the repeat length for long range sequences.

research product

Scale-free relaxation of a wave packet in a quantum well with power-law tails

We propose a setup for which a power-law decay is predicted to be observable for generic and realistic conditions. The system we study is very simple: a quantum wave packet initially prepared in a potential well with (i) tails asymptotically decaying like ~ x^{-2} and (ii) an eigenvalue spectrum that shows a continuous part attached to the ground or equilibrium state. We analytically derive the asymptotic decay law from the spectral properties for generic, confined initial states. Our findings are supported by realistic numerical simulations for state-of-the-art expansion experiments with cold atoms.

research product

A theory for long-memory in supply and demand

Recent empirical studies have demonstrated long-memory in the signs of orders to buy or sell in financial markets [2, 19]. We show how this can be caused by delays in market clearing. Under the common practice of order splitting, large orders are broken up into pieces and executed incrementally. If the size of such large orders is power law distributed, this gives rise to power law decaying autocorrelations in the signs of executed orders. More specifically, we show that if the cumulative distribution of large orders of volume v is proportional to v to the power -alpha and the size of executed orders is constant, the autocorrelation of order signs as a function of the lag tau is asymptotica…

research product

Univariate and multivariate statistical aspects of equity volatility

We discuss univariate and multivariate statistical properties of volatility time series of equities traded in a financial market. Specifically, (i) we introduce a two-region stochastic volatility model that describes well the unconditional pdf of volatility over a wide range of values and (ii) we quantify the stability of the results of a correlation-based clustering procedure applied to the synchronous time evolution of a set of volatility time series.

research product

Statistically validated networks in bipartite complex systems.

Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the pr…

research product

Why Is Equity Order Flow so Persistent?

Order flow in equity markets is remarkably persistent in the sense that order signs (to buy or sell) are positively autocorrelated out to time lags of tens of thousands of orders, corresponding to many days. Two possible explanations are herding, corresponding to positive correlation in the behavior of different investors, or order splitting, corresponding to positive autocorrelation in the behavior of single investors. We investigate this using order flow data from the London Stock Exchange for which we have membership identifiers. By formulating models for herding and order splitting, as well as models for brokerage choice, we are able to overcome the distortion introduced by brokerage. O…

research product

Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present i…

research product

Kullback-Leibler distance as a measure of the information filtered from multivariate data

We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to s…

research product

Modeling the Dynamics of a Financial Index after a Crash

Supply and demand are perhaps the most fundamental concepts in economics. In a financial market they reflect the orders of the agents to buy or sell a given asset. In turn, the fluctuations of supply and demand influence the dynamics of the price of an asset such as a stock or a financial index. Therefore the dynamics of the price of an asset is affected by the actions and the beliefs of the agents. It is known that the dynamics of the price of an asset is far from simple. Several stylized facts have been empirically discovered such as, for example, the fat tails in the return distribution and the clustered volatility. These stylized facts have been detected by considering long t…

research product

The limit order book on different time scales

Financial markets can be described on several time scales. We use data from the limit order book of the London Stock Exchange (LSE) to compare how the fluctuation dominated microstructure crosses over to a more systematic global behavior.

research product

Modeling the coupled return-spread high frequency dynamics of large tick assets

Large tick assets, i.e. assets where one tick movement is a significant fraction of the price and bid-ask spread is almost always equal to one tick, display a dynamics in which price changes and spread are strongly coupled. We introduce a Markov-switching modeling approach for price change, where the latent Markov process is the transition between spreads. We then use a finite Markov mixture of logit regressions on past squared returns to describe the dependence of the probability of price changes. The model can thus be seen as a Double Chain Markov Model. We show that the model describes the shape of return distribution at different time aggregations, volatility clustering, and the anomalo…

research product

Statistical identification with hidden Markov models of large order splitting strategies in an equity market

Large trades in a financial market are usually split into smaller parts and traded incrementally over extended periods of time. We refer to these large trades as hidden orders. In order to identify and characterize hidden orders we fit hidden Markov models to the time series of the sign of the tick by tick inventory variation of market members of the Spanish Stock Exchange. Our methodology probabilistically detects trading sequences, which are characterized by a net majority of buy or sell transactions. We interpret these patches of sequential buying or selling transactions as proxies of the traded hidden orders. We find that the time, volume and number of transactions size distributions of …

research product

Market efficiency and the long-memory of supply and demand: is price impact variable and permanent or fixed and temporary?

In this comment we discuss the problem of reconciling the linear efficiency of price returns with the long-memory of supply and demand. We present new evidence that shows that efficiency is maintained by a liquidity imbalance that co-moves with the imbalance of buyer vs. seller initiated transactions. For example, during a period where there is an excess of buyer initiated transactions, there is also more liquidity for buy orders than sell orders, so that buy orders generate smaller and less frequent price responses than sell orders. At the moment a buy order is placed the transaction sign imbalance tends to dominate, generating a price impact. However, the liquidity imbalance rapidly incre…

research product

Modeling the dynamics of a financial index after a crash

research product

Statistical characterization of deviations from planned flight trajectories in air traffic management

Understanding the relation between planned and realized flight trajectories and the determinants of flight deviations is of great importance in air traffic management. In this paper we perform an in-depth investigation of the statistical properties of planned and realized air traffic in the German airspace during a 28-day period, corresponding to an AIRAC cycle. We find that realized trajectories are on average shorter than planned ones and this effect is stronger during night-time than daytime. Flights are more frequently deviated close to the departure airport and at a relatively large angle to destination. Moreover, the probability of a deviation is higher in low traffic phases. All the…

research product

Specialization and herding behavior of trading firms in a financial market

Agent-based models of financial markets usually make assumptions about agents' preferred stylized strategies. Empirical validations of these assumptions have not been performed so far on a full-market scale. Here we present a comprehensive study of the resulting strategies followed by the firms which are members of the Spanish Stock Exchange. We are able to show that they can be characterized by a resulting strategy and classified in three well-defined groups of firms. Firms of the first group have a change of inventory of the traded stock which is positively correlated with the synchronous stock return whereas firms of the second group show a negative correlation. Firms of the third group…

research product

Degree stability of a minimum spanning tree of price return and volatility

We investigate the time series of the degree of minimum spanning trees obtained with a correlation-based clustering procedure starting from (i) asset return and (ii) volatility time series. The minimum spanning tree is obtained at different times by computing correlation among time series over a time window of fixed length $T$. We find that the minimum spanning tree of asset return is characterized by stock degree values, which are more stable in time than the ones obtained by analyzing a minimum spanning tree computed starting from volatility time series. Our analysis also shows that the degree of stocks has a very slow dynamics with a time-scale of several years in both cases.
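The degree of each node in a correlation-based minimum spanning tree can be computed along the following lines. This is only a sketch under stated assumptions: the abstract does not specify the distance, so the commonly used mapping d = sqrt(2(1 - rho)) is assumed here, Prim's algorithm is one of several valid ways to build the tree, and the function name `mst_degrees` and the toy correlation matrix are hypothetical.

```python
from math import sqrt


def mst_degrees(corr):
    """Return the degree of each node in the minimum spanning tree of a
    correlation matrix, using the assumed distance d = sqrt(2 * (1 - rho))."""
    n = len(corr)
    dist = [[sqrt(max(0.0, 2.0 * (1.0 - corr[i][j]))) for j in range(n)]
            for i in range(n)]

    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known distance from each node to the tree
    parent = [-1] * n           # tree node realizing that cheapest distance
    degree = [0] * n
    best[0] = 0.0               # start Prim's algorithm from node 0

    for _ in range(n):
        # Pick the node outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            # Adding the edge (parent[u], u) raises the degree of both endpoints.
            degree[u] += 1
            degree[parent[u]] += 1
        # Update the cheapest connection of the remaining nodes.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return degree


# Hypothetical correlation matrix for three assets, estimated over one window.
toy_corr = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.1],
    [0.8, 0.1, 1.0],
]
toy_degrees = mst_degrees(toy_corr)
```

Repeating the computation on correlation matrices estimated over successive windows of length $T$ would give a time series of degrees per stock, which is the kind of object the abstract analyzes.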

research product

THE KEY ROLE OF LIQUIDITY FLUCTUATIONS IN DETERMINING LARGE PRICE CHANGES

Recent empirical analyses have shown that liquidity fluctuations are important for understanding large price changes of financial assets. These liquidity fluctuations are quantified by gaps in the order book, corresponding to blocks of adjacent price levels containing no quotes. Here we study the statistical properties of the state of the limit order book for 16 stocks traded at the London Stock Exchange (LSE). We show that the time series of the first three gaps are characterized by fat tails in the probability distribution and are described by long memory processes.

research product

On the origin of power law tails in price fluctuations

In a recent Nature paper, Gabaix et al. presented a theory to explain the power law tail of price fluctuations. The main points of their theory are that volume fluctuations, which have a power law tail with exponent roughly -1.5, are modulated by the average market impact function, which describes the response of prices to transactions. They argue that the average market impact function follows a square root law, which gives power law tails for prices with exponent roughly -3. We demonstrate that the long-memory nature of order flow invalidates their statistical analysis of market impact, and present a more careful analysis that properly takes this into account. This makes i…

research product

THE ROLE OF UNBOUNDED TIME-SCALES IN GENERATING LONG-RANGE MEMORY IN ADDITIVE MARKOVIAN PROCESSES

Any additive stationary and continuous Markovian process described by a Fokker–Planck equation can also be described in terms of a Schrödinger equation with an appropriate quantum potential. By using such analogy, it has been proved that a power-law correlated stationary Markovian process can stem from a quantum potential that (i) shows an x^{-2} decay for large x values and (ii) whose eigenvalue spectrum admits a null eigenvalue and a continuum part of positive eigenvalues attached to it. In this paper we show that these two features are both necessary. Specifically, we show that a potential with tails decaying like x^{-μ} with μ < 2 gives rise to a stationary Markovian process which is not p…

research product

The key role of liquidity fluctuations in determining large price fluctuations

research product

Generation of hierarchically correlated multivariate symbolic sequences: With an application to the assessment of bootstrap confidence in phylogenetic analysis.

We introduce a method to generate multivariate series of symbols from a finite alphabet with a given hierarchical structure of similarities based on the Hamming distance. The target hierarchical structure of similarities is arbitrary, for instance the one obtained by some hierarchical clustering method applied to an empirical matrix of similarities. The method that we present here is based on a generating mechanism that does not make use of mutation rate, which is widely used in phylogenetic analysis. Here we use the proposed simulation method to investigate the relationship between the bootstrap value associated with a node of a phylogeny and the probability of finding that node in the tru…

research product

Price Impact Function of a Single Transaction

Although supply and demand are perhaps the most fundamental concepts in economics, finding any general form for their behavior has proved to be elusive. Here we discuss our recent findings [1] on the price impact function empirically detected in the New York Stock Exchange (NYSE). Our study builds on earlier studies of how trading affects prices [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. In particular, we look at the short term response to a single trade. This is done by using huge amounts of data and by measuring the market activity in units of transactions rather than seconds, so that we can more naturally aggregate data for many different stocks. This allows us to find regularities in the respons…

research product

Hierarchically nested factor model from multivariate data

We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap based procedure to reduce the number of factors in the model with the aim of retaining only those factors significantly robust with respect to the statistical uncertainty due to the finite length of data records.

research product

Inverted Repeats in Viral Genomes

We investigate 738 complete genomes of viruses to detect the presence of short inverted repeats. The number of inverted repeats found is compared with the prediction obtained for a Bernoullian and for a Markovian control model. We find as a statistical regularity that the number of observed inverted repeats is often greater than the one expected in terms of a Bernoullian or Markovian model in several of the viruses and in almost all those with a genome longer than 30,000 bp.

research product

Statistical Properties of Statistical Ensembles of Stock Returns

We select n stocks traded in the New York Stock Exchange and we form a statistical ensemble of daily stock returns for each of the k trading days of our database from the stock price time series. We analyze each ensemble of stock returns by extracting its first four central moments. We observe that these moments are fluctuating in time and are stochastic processes themselves. We characterize the statistical properties of central moments by investigating their probability density function and temporal correlation properties.

research product

Special issue of Quantitative Finance on ‘Interlinkages and Systemic Risk’

This special issue of Quantitative Finance collects eight papers on the relation between interlinkages and systemic risk. The papers cover several types of interlinkages and follow different approaches, from agent-based modelling to empirical investigation of large and sometimes confidential data. The special issue collects some of the contributions presented at the international workshop ‘Interlinkages and systemic risk’, which took place in Ancona (Italy) on 4–5 July 2013. The workshop, organized within the research project ‘New tools in the credit network modeling with agents’ heterogeneity’ funded by the Institute for New Economic Thinking, was attended by a balanced mix of schola…

research product

Market Impact and Trading Profile of Hidden Orders in Stock Markets

We empirically study the market impact of trading orders. We are specifically interested in large trading orders that are executed incrementally, which we call hidden orders. These are statistically reconstructed based on information about market member codes using data from the Spanish Stock Market and the London Stock Exchange. We find that market impact is strongly concave, approximately increasing as the square root of order size. Furthermore, as a given order is executed, the impact grows in time according to a power law; after the order is finished, it reverts to a level of about 0.5-0.7 of its value at its peak. We observe that hidden orders are executed at a rate that more or less m…

research product

The non-random walk of stock prices: The long-term correlation between signs and sizes

We investigate the random walk of prices by developing a simple model relating the properties of the signs and absolute values of individual price changes to the diffusion rate (volatility) of prices at longer time scales. We show that this benchmark model is unable to reproduce the diffusion properties of real prices. Specifically, we find that for one hour intervals this model consistently over-predicts the volatility of real price series by about 70%, and that this effect becomes stronger as the length of the intervals increases. By selectively shuffling some components of the data while preserving others we are able to show that this discrepancy is caused by a subtle but long-range non-…

research product

Master curve for price-impact function

The price reaction to a single transaction depends on transaction volume, the identity of the stock, and possibly many other factors. Here we show that, by taking into account the differences in liquidity for stocks of different size classes of market capitalization, we can rescale both the average price shift and the transaction volume to obtain a uniform price-impact curve for all size classes of firm for four different years (1995–98). This single-curve collapse of the price-impact function suggests that fluctuations from the supply-and-demand equilibrium for many financial assets, differing in economic sectors of activity and market capitalization, are governed by the same statistical r…

research product

Community characterization of heterogeneous complex systems

We introduce an analytical statistical method to characterize the communities detected in heterogeneous complex systems. By posing a suitable null hypothesis, our method makes use of the hypergeometric distribution to assess the probability that a given property is over-expressed in the elements of a community with respect to all the elements of the investigated set. We apply our method to two specific complex networks, namely a network of world movies and a network of physics preprints. The characterization of the elements and of the communities is done in terms of languages and countries for the movie network and of journals and subject categories for papers. We find that our method is ab…
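The over-expression test described above can be illustrated with a one-sided hypergeometric tail probability. This is a minimal sketch, not the authors' implementation: the function name `hypergeom_overexpression_pvalue`, its argument names, and the example counts are hypothetical, and any multiple-testing correction a real analysis would need is left out.

```python
from math import comb


def hypergeom_overexpression_pvalue(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric.

    N: number of elements in the investigated set
    K: elements of the set that carry the property
    n: elements in the community
    k: elements of the community that carry the property
    """
    upper = min(n, K)          # X cannot exceed the community size or K
    total = comb(N, n)         # number of ways to draw the community
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / total


# Hypothetical example: 1000 movies, 100 of them in a given language;
# a community of 20 movies contains 12 in that language.
p_value = hypergeom_overexpression_pvalue(N=1000, K=100, n=20, k=12)
```

A small `p_value` would suggest that the property is over-expressed in the community relative to the null hypothesis of random assignment.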

research product

Diffusive behavior and the modeling of characteristic times in limit order executions

We present an empirical study of the first passage time (FPT) of order book prices needed to observe a prescribed price change Delta, the time to fill (TTF) for executed limit orders and the time to cancel (TTC) for canceled ones in a double auction market. We find that the distribution of all three quantities decays asymptotically as a power law, but that of FPT has significantly fatter tails than that of TTF. Thus a simple first passage time model cannot account for the observed TTF of limit orders. We propose that the origin of this difference is the presence of cancellations. We outline a simple model, which assumes that prices are characterized by the empirically observed distribution …

research product

Coupling and Complexity of Interaction of STCA Networks

This paper provides an overview on the results of an ENAV feasibility study, where we exploited an automatic safety data gathering tool to analyze the ATM system performances. In particular, it addresses the use of the EUROCONTROL tool, ASMT (Automatic Safety Monitoring Tool), as a support to monitor STCA performance. The contribution of this study is to explore how analysis methods derived from complex systems theory (i.e. network analysis) can assist in the understanding, monitoring and management of the performance of ATM systems. Our data show that a large number of STCAs do not occur in isolation, but rather that in roughly half of the cases the aircraft involved in an STCA are subsequ…

research product

Econophysics: the contribution of physicists to the study of economic systems (Econofisica: il contributo dei fisici allo studio dei sistemi economici)

research product

Correlation based hierarchical clustering in financial time series

We review a correlation based clustering procedure applied to a portfolio of assets synchronously traded in a financial market. The portfolio considered consists of the set of 500 highly capitalized stocks traded at the New York Stock Exchange during the time period 1987-1998. We show that meaningful economic information can be extracted from correlation matrices.

research product

Limit order placement as an utility maximization problem and the origin of power law distribution of limit order prices

I consider the problem of the optimal limit order price of a financial asset in the framework of the maximization of the utility function of the investor. The analytical solution of the problem gives insight into the origin of the recently empirically observed power law distribution of limit order prices. In the framework of the model, the most likely proximate cause of this power law is a power law heterogeneity of traders' investment time horizons.

research product

Modeling FX market activity around macroeconomic news: a Hawkes process approach

We present a Hawkes model approach to the foreign exchange market in which the high frequency price dynamics is affected by a self exciting mechanism and an exogenous component, generated by the pre-announced arrival of macroeconomic news. By focusing on time windows around the news announcement, we find that the model is able to capture the increase of trading activity after the news, both when the news has a sizeable effect on volatility and when this effect is negligible, either because the news is not important or because the announcement is in line with the forecast by analysts. We extend the model by considering non-causal effects, due to the fact that the existence of the news (but not i…

research product

The impact of systemic and illiquidity risk on financing with risky collateral

Repurchase agreements (repos) are one of the most important sources of funding liquidity for many financial investors and intermediaries. In a repo, some assets are given by a borrower as collateral in exchange for funding. The capital given to the borrower is the market value of the collateral, reduced by an amount termed the haircut (or margin). The haircut protects the capital lender from loss of value of the collateral contingent on the borrower's default. For this reason, the haircut is typically calculated with a simple Value at Risk estimation of the collateral for the purpose of preventing the risk associated to volatility. However, other risk factors should be included in th…

research product

Identification of Clusters of Investors from Their Real Trading Activity in a Financial Market

We use statistically validated networks, a recently introduced method to validate links in a bipartite system, to identify clusters of investors trading in a financial market. Specifically, we investigate a special database that allows us to track the trading activity of individual investors in the Nokia stock. We find that many statistically detected clusters of investors show a very high degree of synchronization in the time when they decide to trade and in the trading action taken. We investigate the composition of these clusters and we find that several of them show an over-expression of specific categories of investors.
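One way to picture link validation in a bipartite system is a hypergeometric test on co-occurrences followed by a multiple-testing correction. The sketch below assumes that setting: two investors are active on `n_a` and `n_b` of `n_total` days and share `n_ab` active days. The function names and the choice of a Bonferroni correction are assumptions made for illustration and are not taken from the abstract.

```python
from math import comb

def cooccurrence_pvalue(n_total, n_a, n_b, n_ab):
    """P(X >= n_ab) when n_a and n_b active days are placed at random among n_total days."""
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, x) * comb(n_total - n_a, n_b - x)
        for x in range(n_ab, upper + 1)
    )
    return tail / denom

def is_validated(p_value, alpha, n_tests):
    """Bonferroni-corrected decision: keep the link only if p < alpha / n_tests."""
    return p_value < alpha / n_tests

# Hypothetical example: 3 shared days out of 3 and 3 active days in a 10-day sample.
p = cooccurrence_pvalue(10, 3, 3, 3)
keep_link = is_validated(p, alpha=0.01, n_tests=1)
```

Links that survive the correction form the validated network, on which clusters of investors can then be searched for.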

research product

Complex Networks in Air Transport

The application of complex network theory (CNT) to air traffic management has seen significant growth in recent years. This is partly because air traffic can be seen as the superposition of different networks, including the networks of airports, sectors and navigation points. Moreover, each of these networks can be seen as a multiplex, for example by associating each layer with a different airline. The study of the topology of these networks is important for several reasons related to understanding, monitoring, controlling, and optimising the air traffic system. The topological properties of air traffic networks are useful: (i) for studying how the air traffic has changed in recent years; (ii) for identifying the …

research product

Dynamics of the Number of Trades of Financial Securities

We perform a parallel analysis of the spectral density of (i) the logarithm of price and (ii) the daily number of trades of a set of stocks traded in the New York Stock Exchange. The stocks are selected to be representative of a wide range of stock capitalization. The observed spectral densities show a different power-law behavior. We confirm the $1/f^2$ behavior for the spectral density of the logarithm of stock price whereas we detect a $1/f$-like behavior for the spectral density of the daily number of trades.

research product

Variety of Stock Returns in Normal and Extreme Market Days: The August 1998 Crisis

We investigate the recently introduced variety of a set of stock returns traded in a financial market. This investigation is done by considering daily and intraday time horizons in a 15-day time period centered at the August 31st, 1998 crash of the S&P500 index. All the stocks traded at the NYSE during that period are considered in the present analysis. We show that the statistical properties of the variety observed in analyses of daily returns also hold for intraday returns. In particular, the largest changes of the variety of the return distribution turn out to be mostly localized at the opening or (to a lesser degree) at the closing of the market.

research product

How markets slowly digest changes in supply and demand

In this article we revisit the classic problem of tatonnement in price formation from a microstructure point of view, reviewing a recent body of theoretical and empirical work explaining how fluctuations in supply and demand are slowly incorporated into prices. Because revealed market liquidity is extremely low, large orders to buy or sell can only be traded incrementally, over periods of time as long as months. As a result order flow is a highly persistent long-memory process. Maintaining compatibility with market efficiency has profound consequences on price formation, on the dynamics of liquidity, and on the nature of impact. We review a body of theory that makes detailed quantitative pr…

research product

Cluster analysis for portfolio optimization

We consider the problem of the statistical uncertainty of the correlation matrix in the optimization of a financial portfolio. We show that the use of clustering algorithms can improve the reliability of the portfolio in terms of the ratio between predicted and realized risk. Bootstrap analysis indicates that this improvement is obtained in a wide range of the parameters N (number of assets) and T (investment horizon). The predicted and realized risk level and the relative portfolio composition of the selected portfolio for a given value of the portfolio return are also investigated for each considered filtering method.

research product

Networks of equities in financial markets

We review the recent approach of correlation based networks of financial equities. We investigate portfolios of stocks at different time horizons, financial indices and volatility time series, and we show that meaningful economic information can be extracted from noise dressed correlation matrices. We show that the method can be used to falsify widespread market models by directly comparing the topological properties of networks of real and artificial markets.

research product

Calibration of optimal execution of financial transactions in the presence of transient market impact

Trading large volumes of a financial asset in order-driven markets requires the use of algorithmic execution, dividing the volume into many transactions in order to minimize costs due to market impact. A proper design of an optimal execution strategy strongly depends on a careful modeling of market impact, i.e. how the price reacts to trades. In this paper we consider a recently introduced market impact model (Bouchaud et al., 2004), which has the property of describing both the volume and the temporal dependence of price change due to trading. We show how this model can be used to describe price impact also in aggregated trade time or in real time. We then solve analytically and calibrate wit…

research product

Why is equity order flow so persistent?

Order flow in equity markets is remarkably persistent in the sense that order signs (to buy or sell) are positively autocorrelated out to time lags of tens of thousands of orders, corresponding to many days. Two possible explanations are herding, corresponding to positive correlation in the behavior of different investors, or order splitting, corresponding to positive autocorrelation in the behavior of single investors. We investigate this using order flow data from the London Stock Exchange for which we have membership identifiers. By formulating models for herding and order splitting, as well as models for brokerage choice, we are able to overcome the distortion introduced by bro…

research product

Spectral properties of correlation matrices for some hierarchically nested factor models

We show that spectral methods, such as Principal Component Analysis and Random Matrix Theory, are unable to reveal the hierarchical (or nested) structure of a set of multivariate data. We consider the method introduced in M. Tumminello et al., EPL 78, 30006 (2007) to associate a hierarchical factor model with a set of data by making use of clustering algorithms. This is done by proving the existence of a bijective correspondence between a hierarchical tree and a factor model.

research product

When do improved covariance matrix estimators enhance portfolio optimization? An empirical comparative study of nine estimators

The use of improved covariance matrix estimators as an alternative to the sample estimator is considered an important approach for enhancing portfolio optimization. Here we empirically compare the performance of 9 improved covariance estimation procedures by using daily returns of 90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance matrix estimators strongly depends on the ratio between estimation period T and number of stocks N, on the presence or absence of short selling, and on the performance metric considered. When short selling is allowed, several estimation methods achieve a realized risk that is significantly smaller than the one obtai…

research product

Economic Sector Identification in a Set of Stocks Traded at the New York Stock Exchange: A Comparative Analysis

We review some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a set of stocks traded at the New York Stock Exchange. The investigated time series are recorded at a daily time horizon. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methodologies provide different information about the considered set. Our comparative analysis suggests that th…

research product

Trading activity and price impact in parallel markets: SETS vs. off-book market at the London Stock Exchange

We empirically study the trading activity in the electronic on-book segment and in the dealership off-book segment of the London Stock Exchange, investigating separately the trading of active market members and of other market participants which are non-members. We find that (i) the volume distribution of off-book transactions has a significantly fatter tail than that of on-book transactions, (ii) groups of members and non-members can be classified in categories according to their trading profile, (iii) there is a strong anticorrelation between the daily inventory variation of a market member due to the on-book market transactions and inventory variation due to the off-book market transac…

research product

Tick Size and Price Diffusion

A tick size is the smallest increment of a security price. Tick size is typically regulated by the exchange where the security is traded and it may be modified either because the exchange enforces an overall tick size change or because the price of the security is too low or too high. There is an extensive literature, partially reviewed in Sect. 2 of the present paper, on the role of tick size in the price formation process. However, the role and the importance of tick size have not yet been fully understood, as testified, for example, by a recent document of the Committee of European Securities Regulators (CESR) [1].

research product

Segmentation algorithm for non-stationary compound Poisson processes

We introduce an algorithm for the segmentation of a class of regime switching processes. The segmentation algorithm is a nonparametric statistical method able to identify the regimes (patches) of a time series. The process is composed of consecutive patches of variable length. In each patch the process is described by a stationary compound Poisson process, i.e. a Poisson process where each count is associated with a fluctuating signal. The parameters of the process are different in each patch and therefore the time series is non-stationary. Our method is a generalization of the algorithm introduced by Bernaola-Galván et al. [Phys. Rev. Lett. 87, 168105 (2001)]. We show that the new algori…

research product

Topology of correlation-based minimal spanning trees in real and model markets

We present here a topological characterization of the minimal spanning tree that can be obtained by considering the price return correlations of stocks traded in a financial market. We compare the minimal spanning tree obtained from a large group of stocks traded at the New York Stock Exchange during a 12-year trading period with the one obtained from surrogate data simulated by using simple market models. We find that the empirical tree has features of a complex network that cannot be reproduced, even as a first approximation, by a random market model and by the one-factor model.
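To make the object concrete, the following sketch builds a minimal spanning tree from a symmetric distance matrix with Kruskal's algorithm and reports the degree of each node, one simple topological quantity. The 4x4 matrix is a hypothetical toy input; in an application the distances would be derived from return correlations.

```python
def minimum_spanning_tree(dist):
    """Kruskal's algorithm on a full symmetric distance matrix; returns (i, j, d) edges."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not close a loop
            parent[ri] = rj
            tree.append((i, j, d))
            if len(tree) == n - 1:
                break
    return tree

def degrees(tree, n):
    """Number of tree links attached to each of the n nodes."""
    deg = [0] * n
    for i, j, _ in tree:
        deg[i] += 1
        deg[j] += 1
    return deg

# Hypothetical distances between four stocks.
dist = [
    [0.0, 0.5, 1.2, 1.3],
    [0.5, 0.0, 0.6, 1.4],
    [1.2, 0.6, 0.0, 0.7],
    [1.3, 1.4, 0.7, 0.0],
]
tree = minimum_spanning_tree(dist)
deg = degrees(tree, len(dist))
```

Comparing the degree sequence of an empirical tree with that of trees built from simulated data is one way to contrast real and model markets.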

research product

How news affect the trading behavior of different categories of investors in a financial market

We investigate the trading behavior of a large set of single investors trading the highly liquid Nokia stock over the period 2003-2008 with the aim of determining the relative role of endogenous and exogenous factors that may affect their behavior. As endogenous factors we consider returns and volatility, whereas the exogenous factors we use are the total daily number of news items and a semantic variable based on a sentiment analysis of news. Linear regression and partial correlation analysis of data show that different categories of investors are differently correlated to these factors. Governmental and non-profit organizations are weakly sensitive to news and returns or volatility, and, typica…

research product

Spanning Trees and bootstrap reliability estimation in correlation based networks

We introduce a new technique to associate a spanning tree to the average linkage cluster analysis. We term this tree the Average Linkage Minimum Spanning Tree. We also introduce a technique to associate a value of reliability to links of correlation based graphs by using bootstrap replicas of data. Both techniques are applied to the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the time period 2001-2003. We show that the Average Linkage Minimum Spanning Tree recognizes economic sectors and sub-sectors as communities in the network slightly better than the Minimum Spanning Tree does. We also show that the average reliability of links in the Minimum …

research product

Econophysics and the challenge of efficiency

research product

High-Frequency Data

We introduce some of the most common types of high-frequency financial data: tick-by-tick data, trade and quote data, order book data, and market member data. We describe the types of variables that are usually available in the most popular high-frequency financial databases. We discuss the issues related to the handling of these data, including cleaning protocols, timing issues, and issues related to data size. We then briefly consider the issues related to the stylized facts detected in the empirical analysis of high-frequency data. Specifically, we consider (i) the irregular temporal spacing of the events at high frequency and its relevance for the econometric modeling of financial varia…

research product

A statistical analysis of the three-fold evolution of genomic compression through frame overlaps in prokaryotes

Background: Among microbial genomes, genetic information is frequently compressed, exploiting redundancies in the genetic code in order to store information in overlapping genes. We investigate the length, phase and orientation properties of overlap in 58 prokaryotic species evaluating neutral and selective mechanisms of evolution. Results: Using a variety of statistical null models we find patterns of compressive coding that cannot be explained purely in terms of the selective processes favoring genome minimization or translational coupling. The distribution of overlap lengths follows a fat-tailed distribution, in which a significant proportion of overlaps are in excess of 100 base…

research product

Complexity in Air Traffic Management, Complex Systems

research product

Drift-controlled anomalous diffusion: a solvable Gaussian model

We introduce a Langevin equation characterized by a time dependent drift. By assuming a temporal power-law dependence of the drift we show that a great variety of behavior is observed in the dynamics of the variance of the process. In particular, diffusive, subdiffusive, superdiffusive and stretched exponentially diffusive processes are described by this model for specific values of the two control parameters. The model is also investigated in the presence of an external harmonic potential. We prove that the relaxation to the stationary solution is power-law in time with an exponent controlled by one of the model parameters.

research product

Multiscale Model Selection for High-Frequency Financial Data of a Large Tick Stock by Means of the Jensen–Shannon Metric

Modeling financial time series at different time scales is still an open challenge. The choice of a suitable indicator quantifying the distance between the model and the data is therefore of fundamental importance for selecting models. In this paper, we propose a multiscale model selection method based on the Jensen–Shannon distance in order to select the model that is able to better reproduce the distribution of price changes at different time scales. Specifically, we consider the problem of modeling the ultra high frequency dynamics of an asset with a large tick-to-price ratio. We study the price process at different time scales and compute the Jensen–Shannon distance between the original…
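For reference, the Jensen–Shannon distance between two discrete distributions is the square root of the Jensen–Shannon divergence. The sketch below computes it for two histograms defined on the same support, using base-2 logarithms so the result lies in [0, 1]. The example histograms of price changes (in ticks) are hypothetical, and this is not the selection procedure of the paper, only the distance it builds on.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between p and q."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, jsd))

# Hypothetical distributions of price changes of -1, 0, +1 ticks.
empirical = [0.10, 0.80, 0.10]
model = [0.20, 0.60, 0.20]
distance = jensen_shannon_distance(empirical, model)
```

Repeating the computation for price changes aggregated at several time scales gives one distance per scale, which can then be compared across candidate models.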

research product

Market reaction to temporary liquidity crises and the permanent market impact

We study the relaxation dynamics of the bid-ask spread and of the midprice after a sudden, large variation of the spread, corresponding to a temporary crisis of liquidity in a double auction financial market. We find that the spread decays very slowly to its normal value as a consequence of the strategic limit order placement of liquidity providers. We consider several quantities, such as order placement rates and distribution, that affect the decay of the spread. We measure the permanent impact both of a generic event altering the spread and of a single transaction and we find an approximately linear relation between immediate and permanent impact in both cases.

research product

Levels of complexity in financial markets

We consider different levels of complexity which are observed in the empirical investigation of financial time series. We discuss recent empirical and theoretical work showing that statistical properties of financial time series are complex in several respects. Specifically, they are complex with respect to their (i) temporal and (ii) ensemble properties. Moreover, the ensemble return properties show a behavior which is specific to the nature of the trading day, reflecting whether it is a normal or an extreme trading day.

research product

Ultrametric matrices and factor models

research product

Correlation, hierarchies, and networks in financial markets

We discuss some methods to quantitatively investigate the properties of correlation matrices. Correlation matrices play an important role in portfolio optimization and in several other quantitative descriptions of asset price dynamics in financial markets. Specifically, we discuss how to define and obtain hierarchical trees, correlation based trees and networks from a correlation matrix. The hierarchical clustering and other procedures performed on the correlation matrix to detect statistically reliable aspects of the correlation matrix are seen as filtering procedures of the correlation matrix. We also discuss a method to associate a hierarchically nested factor model to a hierarchical tre…

research product

What really causes large price changes?

We study the cause of large fluctuations in prices in the London Stock Exchange. This is done at the microscopic level of individual events, where an event is the placement or cancellation of an order to buy or sell. We show that price fluctuations caused by individual market orders are essentially independent of the volume of orders. Instead, large price fluctuations are driven by liquidity fluctuations, variations in the market's ability to absorb new orders. Even for the most liquid stocks there can be substantial gaps in the order book, corresponding to a block of adjacent price levels containing no quotes. When such a gap exists next to the best price, a new order can remove the best q…

research product

Spectral density of the correlation matrix of factor models: a random matrix theory approach.

We studied the eigenvalue spectral density of the correlation matrix of factor models of multivariate time series. By making use of the random matrix theory, we analytically quantified the effect of statistical uncertainty on the spectral density due to the finiteness of the sample. We considered a broad range of models, ranging from one-factor models to hierarchical multifactor models.

research product

Networks in Finance

research product

The multiplex structure of interbank networks

The interbank market has a natural multiplex network representation. We employ a unique database of supervisory reports of Italian banks to the Banca d'Italia that includes all bilateral exposures broken down by maturity and by the secured and unsecured nature of the contract. We find that layers have different topological properties and persistence over time. The presence of a link in a layer is not a good predictor of the presence of the same link in other layers. Maximum entropy models reveal different unexpected substructures, such as network motifs, in different layers. Using the total interbank network or focusing on a specific layer as representative of the other layers provides a po…

research product

When do Improved Covariance Matrix Estimators Enhance Portfolio Optimization? An Empirical Comparative Study of Nine Estimators

The use of improved covariance matrix estimators as an alternative to the sample estimator is considered an important approach for enhancing portfolio optimization. Here we empirically compare the performance of 9 improved covariance estimation procedures by using daily returns of 90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance matrix estimators strongly depends on the ratio between estimation period T and number of stocks N, on the presence or absence of short selling, and on the performance metric considered. When short selling is allowed, several estimation methods achieve a realized risk that is significantly smaller than the one obtai…

research product

The adaptive nature of liquidity taking in limit order books

In financial markets, the order flow, defined as the process assuming value one for buy market orders and minus one for sell market orders, displays a very slowly decaying autocorrelation function. Since orders impact prices, reconciling the persistence of the order flow with market efficiency is a subtle issue. A possible solution is provided by asymmetric liquidity, which states that the impact of a buy or sell order is inversely related to the probability of its occurrence. We empirically find that when the order flow predictability increases in one direction, the liquidity in the opposite side decreases, but the probability that a trade moves the price decreases significantly. While the…
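The persistence mentioned above is usually measured through the sample autocorrelation of the order sign series. A minimal sketch follows, assuming signs are coded +1 for buy and -1 for sell market orders as in the definition above; the toy sequences and the function name are illustrative.

```python
def sign_autocorrelation(signs, lag):
    """Sample autocorrelation of an order sign series at a given positive lag."""
    n = len(signs)
    mean = sum(signs) / n
    var = sum((s - mean) ** 2 for s in signs) / n
    cov = sum(
        (signs[t] - mean) * (signs[t + lag] - mean) for t in range(n - lag)
    ) / (n - lag)
    return cov / var

# Hypothetical sequences: long runs of equal signs versus strict alternation.
persistent = [1] * 10 + [-1] * 10
alternating = [1, -1] * 10
rho_persistent = sign_autocorrelation(persistent, 1)
rho_alternating = sign_autocorrelation(alternating, 1)
```

A slowly decaying autocorrelation function corresponds to values that stay positive over many lags, as in the `persistent` example at short lags.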

research product

Ensemble properties of securities traded in the NASDAQ market

We study the price dynamics of stocks traded in the NASDAQ market by considering the statistical properties of an ensemble of stocks traded simultaneously. For each trading day of our database, we study the ensemble return distribution by extracting its first two central moments. In agreement with previous results obtained for the NYSE market, we find that the second moment is a long-range correlated variable. We compare time-averaged and ensemble-averaged price returns and we show that the two averaging procedures lead to different statistical results.
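As an illustration of the ensemble view, the sketch below takes, for each trading day, the returns of all stocks on that day and computes the cross-sectional mean and standard deviation (the square root of the second central moment). The input layout, a list of per-day lists, and the numbers are assumptions made for the example.

```python
import math

def ensemble_moments(returns_by_day):
    """For each day, return (mean, standard deviation) of returns across stocks."""
    moments = []
    for day_returns in returns_by_day:
        n = len(day_returns)
        mean = sum(day_returns) / n
        variance = sum((r - mean) ** 2 for r in day_returns) / n
        moments.append((mean, math.sqrt(variance)))
    return moments

# Hypothetical returns of a few stocks on two trading days.
returns_by_day = [
    [0.01, 0.03],
    [0.0, 0.0, 0.0],
]
moments = ensemble_moments(returns_by_day)
```

The resulting daily series of cross-sectional standard deviations is the kind of variable whose time correlation can then be studied.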

research product

Volatility in Financial Markets: Stochastic Models and Empirical Results

We investigate the historical volatility of the 100 most capitalized stocks traded in US equity markets. An empirical probability density function (pdf) of volatility is obtained and compared with the theoretical predictions of a lognormal model and of the Hull and White model. The lognormal model describes the pdf well in the region of low values of volatility, whereas the Hull and White model better approximates the empirical pdf for large values of volatility. Both models fail to describe the empirical pdf over a moderately large volatility range.

research product

High frequency data entry: statistical findings at high frequency

We introduce some of the most common types of high-frequency financial data: tick-by-tick data, trade and quote data, order book data, and market member data. We describe the types of variables that are usually available in the most popular high-frequency financial databases. We discuss the issues related to the handling of these data, including cleaning protocols, timing issues, and issues related to data size. We then briefly consider the issues related to the stylized facts detected in the empirical analysis of high-frequency data. Specifically, we consider (i) the irregular temporal spacing of the events at high frequency and its relevance for the econometric modeling of financial variables, (…

research product

How News Affect the Trading Behavior of Different Categories of Investors in a Financial Market

We investigate the trading behavior of a large set of single investors trading the highly liquid Nokia stock over the period 2003-2008 with the aim of determining the relative role of endogenous and exogenous factors that may affect their behavior. As endogenous factors we consider returns and volatility, whereas the exogenous factors we use are the total daily number of news items and a semantic variable based on a sentiment analysis of news. Linear regression and partial correlation analysis of data show that different categories of investors are differently correlated to these factors. Governmental and non-profit organizations are weakly sensitive to news and returns or volatility, and, typica…

research product

Coupling News Sentiment with Web Browsing Data Improves Prediction of Intra-Day Price Dynamics

The new digital revolution of big data is deeply changing our capability of understanding society and forecasting the outcome of many social and economic systems. Unfortunately, information can be very heterogeneous in the importance, relevance, and surprise it conveys, affecting severely the predictive power of semantic and statistical methods. Here we show that the aggregation of web users' behavior can be elicited to overcome this problem in a hard to predict complex system, namely the financial market. Specifically, our in-sample analysis shows that the combined use of sentiment analysis of news and browsing activity of users of Yahoo! Finance greatly helps forecasting intra-day and dai…

research product

The Structure of Financial Networks

We present here an overview of the use of networks in Finance and Economics. We show how this approach enables us to address important questions such as the structure of control chains in financial systems, the systemic risk associated with them and the evolution of trade between nations. All these results are new in the field and allow for a better understanding and modelling of different economic systems.

research product

How does the market react to your order flow?

We present an empirical study of the intertwined behaviour of members in a financial market. Exploiting a database where the broker that initiates an order book event can be identified, we decompose the correlation and response functions into contributions coming from different market participants and study how their behaviour is interconnected. We find evidence that (1) brokers are very heterogeneous in liquidity provision: some are consistently liquidity providers while others are consistently liquidity takers. (2) The behaviour of brokers is strongly conditioned on the actions of other brokers. In contrast, brokers are only weakly influenced by the impact of their own previous ord…

research product

Scaling laws of strategic behavior and size heterogeneity in agent dynamics

The dynamics of many socioeconomic systems is determined by the decision making process of agents. The decision process depends on agents' characteristics, such as preferences, risk aversion, behavioral biases, etc. In addition, in some systems the size of agents can be highly heterogeneous, leading to very different impacts of agents on the system dynamics. The large size of some agents poses challenging problems to agents who want to control their impact, either by forcing the system in a given direction or by hiding their intentionality. Here we consider the financial market as a model system, and we study empirically how agents strategically adjust the properties of large orders in orde…

research product

The effect of round-off error on long memory processes

We study how the round-off (or discretization) error changes the statistical properties of a Gaussian long memory process. We show that the autocovariance and the spectral density of the discretized process are asymptotically rescaled by a factor smaller than one, and we compute exactly this scaling factor. Consequently, we find that the discretized process is also long memory with the same Hurst exponent as the original process. We consider the properties of two estimators of the Hurst exponent, namely the local Whittle (LW) estimator and the Detrended Fluctuation Analysis (DFA). By using analytical considerations and numerical simulations we show that, in the presence of round-off error, both…

research product

Scaling and data collapse for the mean exit time of asset prices

We study theoretical and empirical aspects of the mean exit time of financial time series. The theoretical modeling is done within the framework of continuous time random walk. We empirically verify that the mean exit time follows a quadratic scaling law and has an associated pre-factor which is specific to the analyzed stock. We perform a series of statistical tests to determine which kinds of correlation are responsible for this specificity. The main contribution is associated with the autocorrelation property of stock returns. We introduce and solve analytically both a two-state and a three-state Markov chain model. The analytical results obtained with the two-state Markov chain model …

research product

Power-law relaxation in a complex system: Omori law after a financial market crash

We study the relaxation dynamics of a financial market just after the occurrence of a crash by investigating the number of times the absolute value of an index return exceeds a given threshold value. We show that the empirical observation of a power law evolution of the number of events exceeding the selected threshold (a behavior known as the Omori law in geophysics) is consistent with the simultaneous occurrence of (i) a return probability density function characterized by a power law asymptotic behavior and (ii) a power law relaxation decay of its typical scale. Our empirical observation cannot be explained within the framework of simple and widespread stochastic volatility models.
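The quantity investigated here can be computed directly from a return series. The sketch below counts, as a function of time after the crash, how many absolute returns have exceeded a threshold so far. In the geophysical Omori law the rate of events decays as a power law t^(-p), so this cumulative count grows as t^(1-p) for p different from 1. The return values and the threshold are hypothetical.

```python
def cumulative_exceedances(returns, threshold):
    """Running count of time steps where |return| is strictly above the threshold."""
    count = 0
    counts = []
    for r in returns:
        if abs(r) > threshold:
            count += 1
        counts.append(count)
    return counts

# Hypothetical index returns observed just after a crash.
post_crash_returns = [0.05, -0.04, 0.01, 0.03, -0.005, 0.002]
counts = cumulative_exceedances(post_crash_returns, threshold=0.02)
```

Plotting `counts` against time on logarithmic axes is a simple visual check for power law behavior before any fit is attempted.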

research product

Modelling Systemic Cojumps with Hawkes Factor Models

Instabilities in the price dynamics of a large number of financial assets are a clear sign of systemic events. By investigating a set of 20 high cap stocks traded at the Italian Stock Exchange, we find that there is a large number of high frequency cojumps. We show that the dynamics of these jumps is described neither by a multivariate Poisson nor by a multivariate Hawkes model. We introduce a Hawkes one factor model which is able to capture simultaneously the time clustering of jumps and the high synchronization of jumps across assets.
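For readers unfamiliar with Hawkes processes, the sketch below simulates the simplest case: a single asset with an exponential kernel, whose intensity is mu plus a sum of alpha * exp(-beta * (t - t_i)) over past jumps t_i. It uses the standard thinning method. This is only the univariate building block, not the one factor model of the paper, and the parameter values are arbitrary assumptions.

```python
import math
import random

def hawkes_intensity(t, events, mu, alpha, beta):
    """Intensity at time t given past event times (exponential kernel)."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on (0, horizon) by thinning."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Between events the intensity only decays, so its value just after
        # time t (counting an event at t itself) bounds it from above.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(lam_bar)
        if t >= horizon:
            break
        # Accept the candidate time with probability intensity / bound.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events

# Arbitrary parameters with alpha / beta < 1, which keeps the process stationary.
jump_times = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=50.0, seed=42)
```

Each accepted jump raises the intensity by `alpha`, which is what produces the clustering of jumps in time.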

research product

Market reaction to a bid-ask spread change: a power-law relaxation dynamics.

We study the relaxation dynamics of the bid-ask spread and of the midprice after a sudden variation of the spread in a double auction financial market. We find that the spread decays as a power law to its normal value. We measure the price reversion dynamics and the permanent impact, i.e., the long-time effect on price, of a generic event altering the spread and we find an approximately linear relation between immediate and permanent impact. We hypothesize that the power-law decay of the spread is a consequence of the strategic limit order placement of liquidity providers. We support this hypothesis by investigating several quantities, such as order placement rates and distribution of price…

research product

Dynamics of a financial market index after a crash

We discuss the statistical properties of index returns in a financial market just after a major market crash. The observed non-stationary behavior of index returns is characterized in terms of the exceedances over a given threshold. This characterization is analogous to the Omori law originally observed in geophysics. By performing numerical simulations and theoretical modelling, we show that the nonlinear behavior observed in real market crashes cannot be described by a GARCH(1,1) model. We also show that the time evolution of the Value at Risk observed just after a major crash is described by a power-law function lacking a typical scale.

research product

Diffusive Behavior and the Modeling of Characteristic Times in Limit Order Executions

We present a study of the order book data of the London Stock Exchange for five highly liquid stocks traded during the calendar year 2002. Specifically, we study the first passage time of order book prices needed to observe a prescribed price change Delta, the time to fill (TTF) for executed limit orders and the time to cancel (TTC) for canceled ones. We find that the distribution of the first passage time decays asymptotically in time as a power law with an exponent L_FPT ~ 1.5. The median of the same quantity scales as Delta^1.6, which is different from the Delta^2 behavior expected for Brownian motion. The quantities TTF and TTC are also asymptotically power law distributed with exponen…

research product

Statistics of order flow

research product

Do Firms Share the Same Functional Form of Their Growth Rate Distribution? A New Statistical Test

We propose a hypothesis testing procedure to investigate whether the same growth rate distribution is shared by all the firms in a balanced panel or, more generally, whether they share the same functional form for this distribution, without necessarily sharing the same parameters. We apply the test to panels of US and European Union publicly quoted manufacturing firms, both at the sectoral and at the subsectoral NAICS levels. We consider the following null hypotheses about the growth rate distribution of the individual firms: i) an unknown shape common to all firms, with all the firms sharing also the same parameters, or with the firm variance related to its firm size through a scaling rela…

research product

The long memory of the efficient market

For the London Stock Exchange we demonstrate that the signs of orders obey a long-memory process. The autocorrelation function decays roughly as a power law with an exponent of 0.6, corresponding to a Hurst exponent H = 0.7. This implies that the signs of future orders are quite predictable from the signs of past orders; all else being equal, this would suggest a very strong market inefficiency. We demonstrate, however, that fluctuations in order signs are compensated for by anti-correlated fluctuations in transaction size and liquidity, which are also long-memory processes that act to make the returns whiter. We show that some institutions display long-range memory and others don't.
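
The two exponents quoted in this abstract are linked by a standard relation for long-memory processes: if the autocorrelation function decays as a power law with exponent gamma between 0 and 1, the Hurst exponent is H = 1 - gamma / 2. A minimal check of the arithmetic follows; the function name is hypothetical.

```python
def hurst_from_acf_exponent(gamma):
    """Hurst exponent of a long-memory process whose ACF decays as lag**(-gamma)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("long memory requires 0 < gamma < 1")
    return 1.0 - gamma / 2.0


# With the exponent reported for order signs, gamma = 0.6, H is about 0.7.
print(hurst_from_acf_exponent(0.6))
```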

research product