0000000000018292
AUTHOR
Michele Tumminello
miR-1207-5p Can Contribute to Dysregulation of Inflammatory Response in COVID-19 via Targeting SARS-CoV-2 RNA
The present study focuses on the role of human miRNAs in SARS-CoV-2 infection. An extensive analysis of human miRNA binding sites on the viral genome led to the identification of miR-1207-5p as a potential regulator of the viral Spike protein. It is known that exogenous RNA can compete for miRNA targets of endogenous mRNAs, leading to their overexpression. Our results suggest that SARS-CoV-2 virus can act as an exogenous competing RNA, facilitating the over-expression of its endogenous targets. Transcriptomic analysis of human alveolar and bronchial epithelial cells confirmed that the CSF1 gene, a known target of miR-1207-5p, is over-expressed following SARS-CoV-2 infection. CSF1 enhances macr…
Additional file 4: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Overview of gene expression levels in IP and FT samples. Focus on the enriched genes in AGO2-IP and GW182-IP vs FT samples. The reported expression levels are computed as the average values of the three performed experimental replicates. (PDF 1423 kb)
Networks in biological systems: An investigation of the Gene Ontology as an evolving network
Many biological systems can be described as networks where different elements interact in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between any two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and we characterize its evolution over different releases of the Gene Ontology.
Identification of clusters of investors from their real trading activity in a financial market
We use statistically validated networks, a recently introduced method to validate links in a bipartite system, to identify clusters of investors trading in a financial market. Specifically, we investigate a special database that allows us to track the trading activity of individual investors in Nokia stock. We find that many statistically detected clusters of investors show a very high degree of synchronization in the time when they decide to trade and in the trading action taken. We investigate the composition of these clusters and we find that several of them show an over-expression of specific categories of investors.
Econophysics: a new tool to investigate financial markets
Designing Guarantee Options in Defined Contribution Pension Plans
The shift from defined benefit (DB) to defined contribution (DC) is pervasive among pension funds, due to demographic changes and macroeconomic pressures. In DB all risks are borne by the provider, while in plain vanilla DC all risks are borne by the beneficiary. For DC to provide income security some kind of guarantee is required. A minimum guarantee clause can be modeled as a put option written on some underlying reference portfolio of assets and we develop a discrete model that optimally selects the reference portfolio to minimise the cost of a guarantee. While the relation DB-DC is typically viewed as a binary one, the model can be used to price a wide range of guarantees creating a con…
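The put-option view of a minimum guarantee can be illustrated with a minimal sketch of the payoff at maturity. This is not the paper's discrete optimisation model; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def guarantee_payoff(portfolio_value: float, guaranteed_value: float) -> float:
    """Shortfall covered by the guarantor at maturity: max(G - V_T, 0).

    This is the payoff of a put option with strike G (the guaranteed value)
    written on the reference portfolio with terminal value V_T.
    """
    return max(guaranteed_value - portfolio_value, 0.0)


# Hypothetical terminal values of the reference portfolio, guarantee at 100.
for v_t in (80.0, 100.0, 120.0):
    print(v_t, guarantee_payoff(v_t, guaranteed_value=100.0))
```

The guarantee pays only when the reference portfolio ends below the guaranteed value, which is why its cost depends on the choice of reference portfolio.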
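Gli strumenti di monitoraggio e controllo e l’analisi quali-quantitativa dei dati sulla percezione del cyber-bullismo da parte degli insegnanti [Monitoring and control tools and the qualitative-quantitative analysis of data on teachers' perception of cyberbullying]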
Over the last ten years, it has become less common to hear teachers talk about "pranks" when discussing the aggressive, violent, discriminatory and, in general, victimizing behaviors of a small number of teenagers and very young students. At least, we register this trend in the perception of the teachers we interviewed. It is certainly a sign, or even a consequence, of the fact that the entire educational institution, from top management to teachers, has for over twenty years now been aware not only of the fact that these were not "pranks" but seriously aggressive behaviors, but also of the fact that, in addition to pro-sociality and non-ag…
Statistically validated networks in bipartite complex systems.
Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the pr…
Quantifying Preferential Trading in the e-MID Interbank Market
Interbank markets allow credit institutions to exchange capital for purposes of liquidity management. These markets are among the most liquid markets in the financial system. However, liquidity of interbank markets dropped during the 2007-2008 financial crisis, and such a lack of liquidity influenced the entire economic system. In this paper, we analyze transaction data from the e-MID market, which is the only electronic interbank market in the Euro Area and US, over a period of eleven years (1999-2009). We adapt a method developed to detect statistically validated links in a network, in order to reveal preferential trading in a directed network. Preferential trading between banks is detecte…
Ranking coherence in topic models using statistically validated networks
Probabilistic topic models have become one of the most widespread machine learning techniques in textual analysis. Topic discovery is an unsupervised process that does not guarantee the interpretability of its output. Hence, the automatic evaluation of topic coherence has attracted the interest of many researchers over the last decade, and it is an open research area. This article offers a new quality evaluation method based on statistically validated networks (SVNs). The proposed probabilistic approach consists of representing each topic as a weighted network of its most probable words. The presence of a link between each pair of words is assessed by statistically validating their co-oc…
Kullback-Leibler distance as a measure of the information filtered from multivariate data
We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to s…
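For reference, the standard closed form of the Kullback-Leibler distance between two zero-mean multivariate Gaussian densities with n × n covariance (here, correlation) matrices Σ1 and Σ2 is given below. The symbols are introduced here for illustration and are not taken from the abstract.

```latex
K(\Sigma_1, \Sigma_2) = \frac{1}{2}\left[ \ln\frac{\det\Sigma_2}{\det\Sigma_1}
  + \operatorname{tr}\!\left(\Sigma_2^{-1}\Sigma_1\right) - n \right]
```

The expression vanishes when Σ1 = Σ2 and is not symmetric in its two arguments.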
A comparative analysis of the statistical properties of large mobile phone calling networks.
Mobile phone calling is one of the most widely used communication methods in modern society. The records of calls among mobile phone users provide a valuable proxy for understanding the human communication patterns embedded in social networks. Mobile phone users call each other forming a directed calling network. If only reciprocal calls are considered, we obtain an undirected mutual calling network. The preferential communication behavior between two connected users can be statistically tested and it results in two Bonferroni networks with statistically validated edges. We perform a comparative analysis of the statistical properties of these four networks, which are constructed from …
Shrinkage and spectral filtering of correlation matrices: A comparison via the Kullback-Leibler distance
The problem of filtering information from large correlation matrices is of great importance in many applications. We have recently proposed the use of the Kullback-Leibler distance to measure the performance of filtering algorithms in recovering the underlying correlation matrix when the variables are described by a multivariate Gaussian distribution. Here we use the Kullback-Leibler distance to investigate the performance of filtering methods based on Random Matrix Theory and on the shrinkage technique. We also present some results on the application of the Kullback-Leibler distance to multivariate data which are not Gaussian distributed.
Evolution of Worldwide Stock Markets, Correlation Structure and Correlation Based Graphs
We investigate the daily correlation present among market indices of stock exchanges located all over the world in the time period Jan 1996 - Jul 2009. We discover that the correlation among market indices presents both a fast and a slow dynamics. The slow dynamics reflects the development and consolidation of globalization. The fast dynamics is associated with critical events that originate in a specific country or region of the world and rapidly affect the global system. We provide evidence that the short term timescale of correlation among market indices is less than 3 trading months (about 60 trading days). The average values of the non diagonal elements of the correlation matrix, corre…
Statistically validated mobile communication networks: the evolution of motifs in European and Chinese data
Big data open up unprecedented opportunities to investigate complex systems including the society. In particular, communication data serve as major sources for computational social sciences but they have to be cleaned and filtered as they may contain spurious information due to recording errors as well as interactions, like commercial and marketing activities, not directly related to the social network. The network constructed from communication data can only be considered as a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, gen…
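The Bonferroni step mentioned above can be sketched as follows, assuming each candidate link already carries a p-value from some statistical test. The significance level of 0.01, the choice of the number of tested links as the number of tests, and the toy p-values are all hypothetical.

```python
def bonferroni_validated(p_values: dict, alpha: float = 0.01) -> set:
    """Return the links whose p-value passes the Bonferroni-corrected threshold.

    p_values maps each tested link (a pair of node labels) to its p-value.
    The threshold is alpha divided by the number of tests performed, here
    assumed to be the number of tested links.
    """
    n_tests = len(p_values)
    if n_tests == 0:
        return set()
    threshold = alpha / n_tests
    return {link for link, p in p_values.items() if p < threshold}


# Toy p-values for three candidate links (hypothetical numbers).
p_values = {("a", "b"): 1e-6, ("a", "c"): 0.004, ("b", "c"): 0.2}
print(bonferroni_validated(p_values))
```

With three tests the corrected threshold is 0.01 / 3, so only the first link survives in this toy case.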
A multivariate statistical test for differential expression analysis
Statistical tests of differential expression usually suffer from two problems. Firstly, their statistical power is often limited when applied to small and skewed data sets. Secondly, gene expression data are usually discretized by applying arbitrary criteria to limit the number of false positives. In this work, a new statistical test obtained from a convolution of multivariate hypergeometric distributions, the Hy-test, is proposed to address these issues. Hy-test has been carried out on transcriptomic data from breast and kidney cancer tissues, and it has been compared with other differential expression analysis methods. Hy-test allows implicit discretization of the expression profi…
On the observability of Bell's inequality violation in the two-atoms optical Stern-Gerlach model
Using the optical Stern-Gerlach model, we have recently shown that the non-local correlations between the internal variables of two atoms that successively interact with the field of an ideal cavity in proximity of a nodal region are affected by the atomic translational dynamics. As a consequence, there can be some difficulties in observing a violation of Bell's inequality for the atomic internal variables. These difficulties persist even if the atoms travel through an antinodal region, except when the spatial wave packets are exactly centered at an antinodal point.
STRANIERI, MERIDIONALI O PROVINCIALI? I CONSUMI NEL TEMPO LIBERO DELLE SECONDE GENERAZIONI [Foreigners, Southerners or Provincials? Leisure-Time Consumption of the Second Generations]
In this paper, we analyze consumption patterns of leisure time among young people belonging to the so-called “second generation” of immigrants in Italy. Leisure time consumption describes how young immigrants use cultural products and services. We analyze data collected by the ISTAT through the survey on the “second generations” (2015). A comparison of leisure consumption patterns between second-generation immigrants and their Italian peers does not show significant differences. Rather, differences in consumption styles are associated with gender (male/female), geographic area of residence (North/South), and size of the municipality of residence (large municipality/small municipality).
Generation of hierarchically correlated multivariate symbolic sequences: With an application to the assessment of bootstrap confidence in phylogenetic analysis.
We introduce a method to generate multivariate series of symbols from a finite alphabet with a given hierarchical structure of similarities based on the Hamming distance. The target hierarchical structure of similarities is arbitrary, for instance the one obtained by some hierarchical clustering method applied to an empirical matrix of similarities. The method that we present here is based on a generating mechanism that does not make use of mutation rate, which is widely used in phylogenetic analysis. Here we use the proposed simulation method to investigate the relationship between the bootstrap value associated with a node of a phylogeny and the probability of finding that node in the tru…
Emergence of Statistically Validated Financial Intraday Lead-Lag Relationships
According to the leading models in modern finance, the presence of intraday lead-lag relationships between financial assets is negligible in efficient markets. With the advance of technology, however, markets have become more sophisticated. To determine whether this has resulted in an improved market efficiency, we investigate whether statistically significant lagged correlation relationships exist in financial markets. We introduce a numerical method to statistically validate links in correlation-based networks, and employ our method to study lagged correlation networks of equity returns in financial markets. Crucially, our statistical validation of lead-lag relationships accounts for mult…
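A lagged correlation between two return series can be sketched as below. This shows only the correlation estimate, not the statistical validation or the multiple-testing correction described in the abstract; the function names and the toy series are hypothetical, and both series are assumed to have the same length.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(leader, follower, lag=1):
    """Correlation between leader[t] and follower[t + lag], for lag >= 1."""
    if lag < 1 or lag >= len(leader):
        raise ValueError("lag must satisfy 1 <= lag < len(leader)")
    return pearson(leader[:-lag], follower[lag:])


# Toy returns: the follower repeats the leader one step later.
leader = [0.1, -0.2, 0.3, -0.1, 0.2, 0.0]
follower = [0.0, 0.1, -0.2, 0.3, -0.1, 0.2]
print(lagged_correlation(leader, follower, lag=1))
```

In a network setting, one such coefficient would be computed for every ordered pair of assets, which is what makes a multiple-testing correction necessary.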
Hierarchically nested factor model from multivariate data
We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap based procedure to reduce the number of factors in the model with the aim of retaining only those factors significantly robust with respect to the statistical uncertainty due to the finite length of data records.
Sector identification in a set of stock return time series traded at the London Stock Exchange
We compare some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a portfolio of stocks traded at the London Stock Exchange. The investigated time series are recorded both at a daily time horizon and at a 5-minute time horizon. The correlation coefficient matrix is very different at different time horizons confirming that more structured correlation coefficient matrices are observed for long time horizons. All the considered methods are able to detect econo…
Hybrid recommendation methods in complex networks
We propose here two new recommendation methods, based on the appropriate normalization of already existing similarity measures, and on the convex combination of the recommendation scores derived from similarity between users and between objects. We validate the proposed measures on three relevant data sets, and we compare their performance with several recommendation systems recently proposed in the literature. We show that the proposed similarity measures yield a performance improvement of up to 20% with respect to existing non-parametric methods, and that the accuracy of a recommendation can vary widely from one specific bipartite network to another, which suggests that a …
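A convex combination of two recommendation scores can be sketched as follows. The function name, the weight parameter and the example scores are hypothetical; how the user-based and object-based scores are computed and normalized is not shown here.

```python
def hybrid_score(user_score: float, object_score: float, weight: float) -> float:
    """Convex combination of a user-based and an object-based score.

    weight = 1 uses only the user-based score, weight = 0 only the
    object-based one; values in between mix the two.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * user_score + (1.0 - weight) * object_score


# Hypothetical scores for one user-object pair, mixed with equal weight.
print(hybrid_score(0.8, 0.2, weight=0.5))
```

Because the weight is restricted to [0, 1], the hybrid score always lies between the two input scores.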
Household Expenditure on Leisure: a Comparative Study of Italian Households with Children from Y- and Z-Generation
The intrinsic complexity of post-materialist society makes it challenging to investigate the connection between social changes and generations. However, the study of consumption might help in the analysis of such a connection. In this paper, we analyse empirical data on the leisure consumption of Italian households, and focus on families at a very precise stage of the family life-cycle, that is, couples with teenage children. We look at consumption of households at different points in time, 2001, 2007, and 2012, in order to investigate the impact of both social change and generation of children–Y-generation in 2001 and 2007, and Z-generation in 2012–on the leisure expenditure patterns of famili…
The Phenomenology of Specialization of Criminal Suspects
A criminal career can be either general, with the criminal committing different types of crimes, or specialized, with the criminal committing a specific type of crime. A central problem in the study of crime specialization is to determine, from the perspective of the criminal, which crimes should be considered similar and which crimes should be considered distinct. We study a large set of Swedish suspects to empirically investigate generalist and specialist behavior in crime. We show that there is a large group of suspects who can be described as generalists. At the same time, we observe a non-trivial pattern of specialization across age and gender of suspects. Women are less prone to commi…
Graphs and financial markets: a new approach
Covariance and correlation estimators in bipartite complex systems with a double heterogeneity
Complex bipartite systems are studied in Biology, Physics, Economics, and Social Sciences, and they can suitably be described as bipartite networks. The heterogeneity of elements in those systems makes it very difficult to perform a statistical analysis of similarity starting from empirical data. Though binary Pearson's correlation coefficient has proved effective to investigate the similarity structure of some real-world bipartite networks, here we show that both the usual sample covariance and correlation coefficient are affected by a bias, which is due to the aforementioned heterogeneity. Such a bias affects real bipartite systems, and, for example, we report its effects on empirical dat…
Dominating Clasp of the Financial Sector Revealed by Partial Correlation Analysis of the Stock Market
What are the dominant stocks which drive the correlations present among stocks traded in a stock market? Can a correlation analysis provide an answer to this question? In the past, correlation based networks have been proposed as a tool to uncover the underlying backbone of the market. Correlation based networks represent the stocks and their relationships, which are then investigated using different network theory methodologies. Here we introduce a new concept to tackle the above question--the partial correlation network. Partial correlation is a measure of how the correlation between two variables, e.g., stock returns, is affected by a third variable. By using it we define a proxy of stoc…
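The first-order partial correlation described above has a standard closed form in terms of the three pairwise correlation coefficients. The sketch below uses it; the function name and the example values are hypothetical.

```python
import math


def partial_correlation(r_ij: float, r_ik: float, r_jk: float) -> float:
    """Correlation between i and j after removing the linear effect of k.

    Standard first-order formula:
    (r_ij - r_ik * r_jk) / sqrt((1 - r_ik^2) * (1 - r_jk^2)).
    """
    denom = math.sqrt((1.0 - r_ik ** 2) * (1.0 - r_jk ** 2))
    if denom == 0.0:
        raise ValueError("undefined when i or j is perfectly correlated with k")
    return (r_ij - r_ik * r_jk) / denom


# Toy case: the correlation between i and j is fully explained by k.
print(partial_correlation(r_ij=0.25, r_ik=0.5, r_jk=0.5))
```

A large drop from the plain correlation to the partial correlation indicates that the third variable accounts for much of the co-movement between the first two.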
Additional file 6: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Wilcoxon test p-values summary. Wilcoxon test p-values (log10) obtained by comparing the variable values associated with the enriched/underrepresented genes sets. Three different miRNA target prediction tools (Targetscan, PITA and miRanda) were used to compute the necessary binding sites (BS) matrices. The BS matrices used to compute the p-values in the last panel were obtained by considering BS predicted by at least two of the three prediction tools. In each panel, the variables computed with the three AGO2 IN profiles were used to distinguish enriched and underrepresented genes in AGO2-IP vs FT and the variables computed with the three GW182 IN profiles were used to distinguish enriched a…
Atomic teleportation via cavity QED and position measurements: efficiency analysis
We have recently presented a novel protocol to teleport an unknown atomic state via cavity QED and position measurements. Here, after a brief review of our scheme, we provide a quantitative study of its efficiency. This is accomplished by an explicit description of the measurement process that allows us to derive the fidelity with respect to the atomic internal state to be teleported.
A tool for filtering information in complex systems
We introduce a technique to filter out complex data-sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation based graphs giving filtered graphs which preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0) triangular loops and 4 element cliques are formed. The application of this filtering procedure to 100 stocks in the USA equity markets shows that such loops and cliqu…
Community characterization of heterogeneous complex systems
We introduce an analytical statistical method to characterize the communities detected in heterogeneous complex systems. By posing a suitable null hypothesis, our method makes use of the hypergeometric distribution to assess the probability that a given property is over-expressed in the elements of a community with respect to all the elements of the investigated set. We apply our method to two specific complex networks, namely a network of world movies and a network of physics preprints. The characterization of the elements and of the communities is done in terms of languages and countries for the movie network and of journals and subject categories for papers. We find that our method is ab…
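The hypergeometric over-expression test described above can be sketched as a right-tail probability. The function and parameter names are hypothetical, and the null hypothesis assumed here is that the community is a random draw, without replacement, from the whole investigated set.

```python
from math import comb


def overexpression_p_value(n_total: int, n_with_property: int,
                           community_size: int, observed: int) -> float:
    """P(X >= observed) for a hypergeometric variable X.

    n_total         : number of elements in the investigated set
    n_with_property : elements of the set that have the property
    community_size  : number of elements in the community
    observed        : community elements that have the property
    """
    upper = min(n_with_property, community_size)
    denom = comb(n_total, community_size)
    tail = sum(
        comb(n_with_property, k)
        * comb(n_total - n_with_property, community_size - k)
        for k in range(observed, upper + 1)
    )
    return tail / denom


# Toy case: 10 elements, 3 with the property, and a community of 3
# in which all 3 elements have it.
print(overexpression_p_value(10, 3, 3, 3))
```

When many properties and communities are tested, the resulting p-values would still need a multiple-testing correction before a property is declared over-expressed.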
Pricing Sovereign Contingent Convertible Debt
We develop a pricing model for sovereign contingent convertible bonds (S-CoCo) with payment standstills triggered by a sovereign's credit default swap (CDS) spread. One innovation is the modeling of CDS spread regime switching, which is prevalent during crises. Regime switching is modeled as a hidden Markov process and is integrated with a stochastic process of spread levels to obtain S-CoCo prices through simulation. The paper goes a step further and uses the pricing model in a Longstaff-Schwartz American option pricing framework to compute state contingent S-CoCo prices at some risk horizon, thus facilitating risk management. Dual trigger pricing is also discussed using the idiosyncratic CD…
Pricing sovereign contingent convertible debt
We develop a pricing model for Sovereign Contingent Convertible bonds (S-CoCo) with payment standstills triggered by a sovereign's Credit Default Swap (CDS) spread. We model CDS spread regime switching, which is prevalent during crises, as a hidden Markov process, coupled with a mean-reverting stochastic process of spread levels under fixed regimes, in order to obtain S-CoCo prices through simulation. The paper uses the pricing model in a Longstaff-Schwartz American option pricing framework to compute future state contingent S-CoCo prices for risk management. Dual trigger pricing is also discussed using the idiosyncratic CDS spread for the sovereign debt together with a broad market index. …
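A regime-switching, mean-reverting spread path can be simulated along the following lines. This is a minimal sketch under stated assumptions, not the paper's model: two regimes, a discrete-time mean-reverting update with Gaussian shocks, a floor at zero, and hypothetical parameter values. Pricing and the standstill trigger are not shown.

```python
import random


def simulate_spread(n_steps, s0, regimes, stay_prob, seed=0):
    """Simulate one spread path driven by a two-state Markov chain.

    regimes   : [(long_run_mean, reversion_speed, volatility), ...] per state
    stay_prob : probability of remaining in each state at every step
    """
    rng = random.Random(seed)
    state, spread, path = 0, s0, [s0]
    for _ in range(n_steps):
        # Regime transition: leave the current state with prob 1 - stay_prob.
        if rng.random() > stay_prob[state]:
            state = 1 - state
        mean, speed, vol = regimes[state]
        # Mean-reverting update under the current regime.
        spread = spread + speed * (mean - spread) + vol * rng.gauss(0.0, 1.0)
        spread = max(spread, 0.0)  # assumption: spreads are floored at zero
        path.append(spread)
    return path


# Hypothetical calm and crisis regimes (spread levels in percent).
regimes = [(1.0, 0.05, 0.05), (4.0, 0.20, 0.30)]
stay_prob = [0.98, 0.90]
path = simulate_spread(250, 1.0, regimes, stay_prob, seed=42)
print(len(path), max(path))
```

Repeating the simulation over many seeds would give the distribution of paths from which trigger events and prices could then be estimated.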
Additional file 7: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Summary of miRNA expression profiles shuffling effects. ROC analysis was performed to evaluate the performance of F6 and F4d variables, computed with simulated miRNA profiles, in distinguishing enriched/underrepresented genes in AGO2 or GW182-IP samples. Each panel reports the AUC values obtained with simulated variables. Each boxplot refers to AUC values obtained with a specific set of simulations, where the expression profile of a set of miRNAs was shuffled. The boxplot in the center was obtained by shuffling all miRNAs. The boxplots from the center to the right refer to simulations where all the miRNAs were shuffled with the exception of n top expressed miRNAs, n increasing in the right …
Correlation based networks of equity returns sampled at different time horizons
We investigate the planar maximally filtered graphs of the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the time period 2001-2003. Topological properties such as the average length of shortest paths, the betweenness and the degree are computed on different planar maximally filtered graphs generated by sampling the returns at different time horizons ranging from 5 min up to one trading day. This analysis confirms that the selected stocks compose a hierarchical system progressively structuring as the sampling time horizon increases. Finally, a cluster formation, associated with economic sectors, is quantitatively investigated.
Translational dynamics effects on the non-local correlations between two atoms
A pair of atoms interacting successively with the field of the same cavity and exchanging a single photon leaves the cavity in an entangled state of Einstein-Podolsky-Rosen (EPR) type (see, for example, [S.J.D. Phoenix and S.M. Barnett, J. Mod. Opt. 40 (1993) 979]). By implementing the model with the translational degrees of freedom, we show in this letter that the entanglement with the translational atomic variables can lead, under appropriate conditions, towards the separability of the internal variables of the two atoms. This implies that the translational dynamics can lead, in some cases, to difficulties in observing the violation of Bell's inequality for massive particles.
How Lead-Lag Correlations Affect the Intraday Pattern of Collective Stock Dynamics
The degree of correlation among stock returns affects the possibility to diversify the risk of investment,
Structure and evolution of a European Parliament via a network and correlation analysis
We present a study of the network of relationships among elected members of the Finnish parliament, based on a quantitative analysis of initiative co-signatures, and its evolution over 16 years. To understand the structure of the parliament, we constructed a statistically validated network of members, based on the similarity between the patterns of initiatives they signed. We looked for communities within the network and characterized them in terms of members' attributes, such as electoral district and party. To gain insight on the nested structure of communities, we constructed a hierarchical tree of members from the correlation matrix. Afterwards, we studied parliament dynamics yearly, wi…
RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets.
Background: MicroRNAs (miRNAs) are small non-coding RNA molecules mediating the translational repression and degradation of target mRNAs in the cell. Mature miRNAs are used as a template by the RNA-induced silencing complex (RISC) to recognize the complementary mRNAs to be regulated. To discern further RISC functions, we analyzed the activities of two RISC proteins, AGO2 and GW182, in the MCF-7 human breast cancer cell line. Methods: We performed three RIP-Chip experiments using either anti-AGO2 or anti-GW182 antibodies and compiled a data set made up of the miRNA and mRNA expression profiles of three samples for each experiment. Specifically, we analyzed the input sample, the immunoprecipita…
Alexithymia and personality traits of patients with inflammatory bowel disease
Psychological factors, specific lifestyles and environmental stressors may influence etiopathogenesis and evolution of chronic diseases. We investigate the association between Chronic Inflammatory Bowel Diseases (IBD) and psychological dimensions such as personality traits, defence mechanisms, and Alexithymia, i.e. deficits of emotional awareness with inability to give a name to emotional states. We analyzed a survey of 100 patients with IBD and a control group of 66 healthy individuals. The survey involved filling out clinical and anamnestic forms and administering five psychological tests. These were then analyzed by using a network representation of the system by considering it a…
A network analysis of student mobility patterns from high school to master’s
Human migration involves the movement of people from one place to another. An example of undirected migration is Italian student mobility where students move from the South to the Center-North. This kind of mobility has become of general interest, and this work explores student mobility from Sicily towards universities outside the island. The data used in this paper cover six cohorts of students, from 2008/09 to 2013/14. In particular, our goal is to study the 3-step migration path: the area of origin (Sicilian provinces), the regional university for the bachelor’s degree, and the regional university for the master’s. Our analysis is conducted by building a multipartite network with four …
The non dissipative damping of the Rabi oscillations as a "which-path" information
Rabi oscillations may be viewed as an interference phenomenon due to a coherent superposition of different quantum paths, like in Young's two-slit experiment. The inclusion of the atomic external variables causes a non dissipative damping of the Rabi oscillations. More generally, the atomic translational dynamics induces damping in the correlation functions which describe non classical behaviors of the field and internal atomic variables, leading to the separability of these two subsystems. We discuss the possibility of interpreting this intrinsic decoherence as a "which-way" information effect and we apply to this case a quantitative analysis of the complementarity relation as intro…
Additional file 5: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Venn diagram of lists of enriched genes. The considered lists are: AGO2-IP (UP_AGO2 set), the list of enriched genes detected by Fan et al. [13] (UP_AGO2_Fan) and our list of enriched genes in the GW182-IP sample (UP_GW182). The reported p-values refer to the closest intersection set of genes and are computed with a one-tailed Fisher test. (PDF 34 kb)
Household Expenditures and the Status of Children: An Analysis of the Italian Case
Several factors influence consumption choices, which are not all related to the economic dimension. In this paper, we analyze Italian household consumption depending on family characteristics, such as the parents’ working status, their level of education, and the status of children (aged 21 through 30) as students, workers, or NEETs. We rely upon household consumption microdata and apply the «budgetary unit» concept to investigate preferential patterns of family consumption depending on the aforementioned characteristics. In particular, we are interested in understanding whether and how the status of children shapes the consumption patterns of the family unit. We analyze secondary data coll…
Spectral properties of correlation matrices for some hierarchically nested factor models
We show that spectral methods, such as Principal Component Analysis and Random Matrix Theory, are unable to reveal the hierarchical (or nested) structure of a set of multivariate data. We consider the method introduced in M. Tumminello et al., EPL 78, 30006 (2007) to associate a hierarchical factor model with a set of data by making use of clustering algorithms. This is done by proving the existence of a bijective correspondence between a hierarchical tree and a factor model.
Analysis of pipeline accidents in the United States from 1968 to 2009
Pipelines are responsible for the transportation of a significant portion of the U.S. energy supply. Unfortunately, pipeline failures are common and the consequences can be catastrophic. Drawing on data from the Pipeline and Hazardous Materials Safety Administration (PHMSA) that covers approximately 40,000 incidents from 1968 to 2009, this paper explores the trends, causes and consequences of natural gas and hazardous liquid pipeline accidents. The analysis indicates that fatalities and injuries from pipeline accidents are generally decreasing over time, while property damage and, in some cases, the numbers of incidents are increasing over time. In five of the ten cases considered in this p…
When do improved covariance matrix estimators enhance portfolio optimization? An empirical comparative study of nine estimators
The use of improved covariance matrix estimators as an alternative to the sample estimator is considered an important approach for enhancing portfolio optimization. Here we empirically compare the performance of 9 improved covariance estimation procedures by using daily returns of 90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance matrix estimators strongly depends on the ratio between estimation period T and number of stocks N, on the presence or absence of short selling, and on the performance metric considered. When short selling is allowed, several estimation methods achieve a realized risk that is significantly smaller than the one obtai…
Economic Sector Identification in a Set of Stocks Traded at the New York Stock Exchange: A Comparative Analysis
We review some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a set of stocks traded at the New York Stock Exchange. The investigated time series are recorded at a daily time horizon. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methodologies provide different information about the considered set. Our comparative analysis suggests that th…
Quantitative Analysis of Gender Stereotypes and Information Aggregation in a National Election
By analyzing a database of a questionnaire answered by a large majority of candidates and elected in a parliamentary election, we quantitatively verify that (i) female candidates on average present political profiles which are more compassionate and more concerned with social welfare issues than male candidates and (ii) the voting procedure acts as a process of information aggregation. Our results show that information aggregation proceeds with at least two distinct paths. In the first case candidates characterize themselves with a political profile aiming to describe the profile of the majority of voters. This is typically the case of candidates of political parties which are competing for…
How news affect the trading behavior of different categories of investors in a financial market
We investigate the trading behavior of a large set of single investors trading the highly liquid Nokia stock over the period 2003-2008 with the aim of determining the relative role of endogenous and exogenous factors that may affect their behavior. As endogenous factors we consider returns and volatility, whereas the exogenous factors we use are the total daily number of news and a semantic variable based on a sentiment analysis of news. Linear regression and partial correlation analysis of data show that different categories of investors are differently correlated to these factors. Governmental and non profit organizations are weakly sensitive to news and returns or volatility, and, typica…
Quantum erasure within the optical Stern-Gerlach model
In the optical Stern-Gerlach effect the two branches into which the incoming atomic packet splits can display an interference pattern outside the cavity when a field measurement is made which erases the which-way information on the quantum paths the system can follow. On the contrary, the mere possibility of acquiring this information causes a decoherence effect which cancels out the interference pattern. A phase space analysis is also carried out to investigate the negativity of the Wigner function and the connection between its covariance matrix and the distinguishability of the quantum paths.
Spanning Trees and bootstrap reliability estimation in correlation based networks
We introduce a new technique to associate a spanning tree with the average linkage cluster analysis. We term this tree the Average Linkage Minimum Spanning Tree. We also introduce a technique to associate a value of reliability with links of correlation based graphs by using bootstrap replicas of data. Both techniques are applied to the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the time period 2001-2003. We show that the Average Linkage Minimum Spanning Tree recognizes economic sectors and sub-sectors as communities in the network slightly better than the Minimum Spanning Tree does. We also show that the average reliability of links in the Minimum …
Student mobility in higher education: Sicilian outflow network and chain migrations
The most important student mobility (SM) flow in Italy is from the Southern to the Central-Northern regions, a phenomenon that has been magnified by an increasing number of outgoing students from Sicily over the last decade. In this paper, we rely upon micro-data of university enrollment and students' personal records for three cohorts of freshmen, in order to investigate preferential patterns of SM from Sicily toward universities in other regions. Our main goal is to reveal the existence of chain migrations, where students from a particular geographical area move towards a particular destination to follow other students that have previously moved. The paper provides aspects that are innova…
Additional file 1: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Analysis of miRNA expression in AGO2 and GW182-IP samples. a) miRNA expression level in AGO2-IP samples (average value from the three performed experiments) vs the expression level in IN samples (average value from the three performed experiments). The Pearson correlation values reported on the top of the picture were computed by using all the expressed miRNA, and the top 100 or 50 expressed miRNAs. The colored points refer to miRNA that have been validated by RT-PCR data. Green points refer to hsa-miR-141-3p, hsa-miR-21-5p, hsa-let-7f-5p, hsa-miR-16-5p, hsa-miR-24-3p, hsa-miR-27a-3p, hsa-miR-23a-3p. The red point refers to hsa-miR-1260a. b) Comparison of IP/IN ratios obtained by RT-PCR dat…
Glenoid bone loss in anterior shoulder dislocation: a multicentric study to assess the most reliable imaging method
Purpose: The aim of this multicentric study was to assess which imaging method has the best inter-reader agreement for glenoid bone loss quantification in anterior shoulder instability. A further aim was to calculate the inter-method agreement comparing bilateral CT with unilateral CT and MR arthrography (MRA) with CT measurements. Finally, calculations were carried out to find the least time-consuming method. Method: A retrospective evaluation was performed by 9 readers (or pairs of readers) on a consecutive series of 110 patients with MRA and bilateral shoulder CT. Each reader was asked to calculate the glenoid bone loss of all patients using the following methods: best fit circle area on…
Teleportation of atomic states via position measurements
We present a scheme for conditionally teleporting an unknown atomic state in cavity QED, which requires two atoms and one cavity mode. The translational degrees of freedom of the atoms are taken into account using the optical Stern-Gerlach model. We show that successful teleportation with probability 1/2 can be achieved through local measurements of the cavity photon number and atomic positions. Neither direct projection onto highly entangled states nor holonomous interaction-time constraints are required.
Gene-based and semantic structure of the Gene Ontology as a complex network
The last decade has seen the advent and consolidation of ontology based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The information accumulated time-by-time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. This approach might be usefully complemented by a bottom-up approach based on the knowledge of relationships amongst genes. To this end, we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium and a gene-based …
Bazaar economics
Competitive Equilibrium theory has been a widely accepted and extensively used cornerstone in economics for over a century. Here, we suggest a complementary model—motivated by the haggling in a bazaar—that offers a useful, first-principle account of market behavior that better accounts for the observed outcomes in forty market experiments. The Bazaar model uses simple stochastic processes to drive the matching of traders and the determination of price. We show that as agents become more impatient, the system tends toward more Competitive-Equilibrium-like outcomes.
Additional file 8: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Empirical Cumulative Distribution Function of 3′ UTR and coding region length of IP-Enriched genes. Enriched genes in AGO (1–4) and in GW182 protein family IP selected by considering log2 IP-Enrichment of transcript greater than 1. Data are downloaded from Landthaler et al. [14]. The Empirical Cumulative Distribution Function of the 3′ UTR length (top) and coding region length (bottom) of genes enriched exclusively by AGO-IP (red line), GW182-IP (blue line) and both IPs (black line) are reported. The reported p-value is computed by performing a Wilcoxon test to compare the length distributions of genes enriched exclusively in AGO-IP and in GW182-IP. (PDF 145 kb)
Ultrametric matrices and factor models
An improvement of ComiR algorithm for microRNA target prediction by exploiting coding region sequences of mRNAs
MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR is a web tool developed to predict the targets of a set of microRNAs, starting from their expression profile. ComiR was trained with the information regarding binding sites in the 3’UTR region, by using a reliable dataset containing the targets of endogenously expressed microRNAs in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition, and immunoprecipitation of the AGO1 protein--a comp…
Correlation, hierarchies, and networks in financial markets
We discuss some methods to quantitatively investigate the properties of correlation matrices. Correlation matrices play an important role in portfolio optimization and in several other quantitative descriptions of asset price dynamics in financial markets. Specifically, we discuss how to define and obtain hierarchical trees, correlation based trees and networks from a correlation matrix. The hierarchical clustering and other procedures performed on the correlation matrix to detect statistically reliable aspects of the correlation matrix are seen as filtering procedures of the correlation matrix. We also discuss a method to associate a hierarchically nested factor model to a hierarchical tre…
Evolution of worldwide stock markets, correlation structure and correlation based graphs
We investigate the daily correlation present among market indices of stock exchanges located all over the world in the time period Jan 1996 - Jul 2009. We discover that the correlation among market indices presents both a fast and a slow dynamics. The slow dynamics reflects the development and consolidation of globalization. The fast dynamics is associated with critical events that originate in a specific country or region of the world and rapidly affect the global system. We provide evidence that the short term timescale of correlation among market indices is less than 3 trading months (about 60 trading days). The average values of the non diagonal elements of the correlation matrix, corre…
When do Improved Covariance Matrix Estimators Enhance Portfolio Optimization? An Empirical Comparative Study of Nine Estimators
The use of improved covariance matrix estimators as an alternative to the sample estimator is considered an important approach for enhancing portfolio optimization. Here we empirically compare the performance of 9 improved covariance estimation procedures by using daily returns of 90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance matrix estimators strongly depends on the ratio between estimation period T and number of stocks N, on the presence or absence of short selling, and on the performance metric considered. When short selling is allowed, several estimation methods achieve a realized risk that is significantly smaller than the one obtai…
Happy Aged People Are All Alike, While Every Unhappy Aged Person Is Unhappy in Its Own Way
Aging of the world’s population represents one of the most remarkable success stories of medicine and of humankind, but it is also a source of various challenges. The aim of the collaborative cross-cultural European study of adult well being (ESAW) is to frame the concept of aging successfully within a causal model that embraces physical health and functional status, cognitive efficacy, material security, social support resources, and life activity. Within the framework of this project, we show here that the degree of heterogeneity among people who view aging in a positive light is significantly lower than the degree of heterogeneity of those who hold a negative perception of aging. We base…
How News Affect the Trading Behavior of Different Categories of Investors in a Financial Market
We investigate the trading behavior of a large set of single investors trading the highly liquid Nokia stock over the period 2003-2008 with the aim of determining the relative role of endogenous and exogenous factors that may affect their behavior. As endogenous factors we consider returns and volatility, whereas the exogenous factors we use are the total daily number of news and a semantic variable based on a sentiment analysis of news. Linear regression and partial correlation analysis of data show that different categories of investors are differently correlated to these factors. Governmental and non profit organizations are weakly sensitive to news and returns or volatility, and, typica…
Comparing Correlation Matrix Estimators Via Kullback-Leibler Divergence
We use a self-averaging measure called Kullback-Leibler divergence to evaluate the performance of four different correlation estimators: Fourier, Pearson, Maximum Likelihood and Hayashi-Yoshida estimator. The study uses simulated transaction prices for a large number of stocks and different data generating mechanisms, including synchronous and non-synchronous transactions, homogeneous and heterogeneous inter-transaction time. Different distributions of stock returns, i.e. multivariate Normal and multivariate Student's t-distribution, are also considered. We show that Fourier and Pearson estimators are equivalent proxies of the `true' correlation matrix within all the settings under analysis…
Networked relationships in the e-MID Interbank market: A trading model with memory
Interbank markets are fundamental for bank liquidity management. In this paper, we introduce a model of interbank trading with memory. Our model reproduces features of preferential trading patterns in the e-MID market recently empirically observed through the method of statistically validated networks. The memory mechanism is used to introduce a proxy of trust in the model. The key idea is that a lender, having lent many times to a borrower in the past, is more likely to lend to that borrower again in the future than to other borrowers, with which the lender has never (or has infrequently) interacted. The core of the model depends on only one parameter representing the initial attractiven…
Insurance fraud detection: A statistically validated network approach
Fraud is a social phenomenon, and fraudsters often collaborate with other fraudsters, taking on different roles. The challenge for insurance companies is to implement claim assessment and improve fraud detection accuracy. We developed an investigative system based on bipartite networks, highlighting the relationships between subjects and accidents or vehicles and accidents. We formalize filtering rules through probability models and test specific methods to assess the existence of communities in extensive networks and propose new alert metrics for suspicious structures. We apply the methodology to a real database, the Italian Antifraud Integrated Archive, and compare the results to out-of-sam…
Statistically Validated Networks for evaluating coherence in topic models
Probabilistic topic models have become one of the most widespread machine learning techniques for textual analysis. In this framework, Latent Dirichlet Allocation (LDA) has gained increasing popularity as a text modelling technique. The idea is that documents are represented as random mixtures over latent topics, where a distribution over words characterizes each topic. Unfortunately, topic models do not guarantee the interpretability of their outputs. The topics learned from the model may be characterized by a set of irrelevant or unchained words, making them useless for interpretation. In the framework of topic quality evaluation, the pairwise semantic cohesion among the top-N most pr…
Additional file 9: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Summary of miRNA expression profiles switch between experiment replicas. ROC analysis of F6&F4d SVM model trained with variables calculated with miRNA expression profiles from each of the three anti-AGO2 RIP experiments. SVM models were used to classify the top 1000 and the bottom 1000 genes with respect to the IP/FT mRNA expression ratio, computed for each of the three AGO2 RIP experiments. (PDF 653 kb)
Structure and Evolution of a European Parliament via a Network and Correlation Analysis
We present a study of the network of relationships among elected members of the Finnish parliament, based on a quantitative analysis of initiative co-signatures, and its evolution over 16 years. To understand the structure of the parliament, we constructed a statistically validated network of members, based on the similarity between the patterns of initiatives they signed. We looked for communities within the network and characterized them in terms of members’ attributes, such as electoral district and party. To gain insight on the nested structure of communities, we constructed a hierarchical tree of members from the correlation matrix. Afterwards, we studied parliament dynamics yearly, wi…
Meniscal ramp lesions: diagnostic performance of MRI with arthroscopy as reference standard
Background: The posteromedial meniscal region is gaining interest among orthopedic surgeons, as lesions of this area have been reported to be significantly associated with anterior cruciate ligament tears. The current imaging literature is unclear. Purpose: To evaluate the diagnostic performance of MR in the detection of meniscal ramp lesions having arthroscopy as reference standard. Materials and methods: We retrospectively included 56 patients (mean age of 25 ± 7 years; 14 females) from January to November 2017 with an arthroscopically proven ACL tear and posterior meniscocapsular separation. On preoperative MRI, two radiologists with 13 and 2 years’ experience in musculoskeletal imag…
Anagraphical relationships and crime specialization within Cosa Nostra
The aim of the present work is to investigate the relationships established within Cosa Nostra, by making use of networks and complex-systems methods. The analysis is performed at three different levels, that is, individuals, groups within mafia syndicates, and relationships amongst mafia syndicates. The reported empirical analysis is based on the criminal records of 632 affiliates to Cosa Nostra selected from a set of 125 judgments issued by the Palermo courts from 2000 to 2014. According to the criminal records of the Palermo Prosecutor Office, such a dataset includes approximately 10% of the whole population of Cosa Nostra affiliates in western Sicily. Furthermore, the vital s…
Quantifying preferential trading in the e-MID interbank market
Interbank markets allow credit institutions to exchange capital for purposes of liquidity management. These markets are among the most liquid markets in the financial system. However, liquidity of interbank markets dropped during the 2007-2008 financial crisis, and such a lack of liquidity influenced the entire economic system. In this paper, we analyze transaction data from the e-MID market which is the only electronic interbank market in the Euro Area and US, over a period of eleven years (1999-2009). We adapt a method developed to detect statistically validated links in a network, in order to reveal preferential trading in a directed network. Preferential trading between banks is detecte…
Additional file 3: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Summary of enriched and underrepresented genes. Summary of enriched and underrepresented genes in AGO2 and GW182-IP vs FT comparisons performed by SAMR (column 2–3). The enrichment results obtained with the REA algorithm are reported in columns 4–5. Columns 6 and 7 report the 3’UTR and Coding region (CR) lengths respectively. In columns 8–21 we report the number of binding sites predicted by Targetscan in the 3’UTR and the Coding region of seven highly expressed miRNAs. (XLS 294 kb)
Additional file 2: of RIP-Chip analysis supports different roles for AGO2 and GW182 proteins in recruiting and processing microRNA targets
Gene set enrichment analysis results for the predicted target sets of the seven top expressed miRNAs. Targets of miRNAs (column 1) were predicted with three different target prediction tools (column 2). The total number of predicted targets is indicated in column 3. Five lists of genes were analyzed. For each list of genes the number of genes in common with the predicted targets and the associated hypergeometric test p-value are provided. The total number of genes considered in the analysis is 16,392. The five considered lists are: a list of genes enriched in AGO2 IP sample from [13]; lists of genes enriched in AGO2 IP vs IN and IP vs FT samples; lists of genes enriched in GW182 IP vs IN and…