Search results for "Datasets"
showing 10 items of 45 documents
Acquired IFNγ resistance impairs anti-tumor immunity and gives rise to T-cell-resistant melanoma lesions
2016
Melanoma treatment has been revolutionized by antibody-based immunotherapies. IFNγ secretion by CD8+ T cells is critical for therapy efficacy, having anti-proliferative and pro-apoptotic effects on tumour cells. Our study demonstrates a genetic evolution of IFNγ resistance in different melanoma patient models. Chromosomal alterations and subsequent inactivating mutations in genes of the IFNγ signalling cascade, most often JAK1 or JAK2, protect melanoma cells from anti-tumour IFNγ activity. JAK1/2 mutants further evolve into T-cell-resistant HLA class I-negative lesions with genes involved in antigen presentation silenced and no longer inducible by IFNγ. Allelic JAK1/2 losses predisposing to …
An organelle-specific protein landscape identifies novel diseases and molecular mechanisms.
2016
Cellular organelles provide opportunities to relate biological mechanisms to disease. Here we use affinity proteomics, genetics and cell biology to interrogate cilia: poorly understood organelles, where defects cause genetic diseases. Two hundred and seventeen tagged human ciliary proteins create a final landscape of 1,319 proteins, 4,905 interactions and 52 complexes. Reverse tagging, repetition of purifications and statistical analyses produce a high-resolution network that reveals organelle-specific interactions and complexes not apparent in larger studies, and links vesicle transport, the cytoskeleton, signalling and ubiquitination to ciliary signalling and proteostasis. We observe sub…
Algorithmic paradigms for stability-based cluster validity and model selection statistical methods, with applications to microarray data analysis
2012
The advent of high-throughput technologies, in particular microarrays, for biological research has revived interest in clustering, resulting in a plethora of new clustering algorithms. However, model selection, i.e., the identification of the correct number of clusters in a dataset, has received relatively little attention. Indeed, although central for statistics, its difficulty is also well known. Fortunately, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained prominence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of pre…
A methodology for optimisation of solar dish-Stirling systems size, based on the local frequency distribution of direct normal irradiance
2021
In geographical areas where direct solar irradiation levels are relatively high, concentrated solar energy systems are one of the most promising green energy technologies. Dish-Stirling systems are those that achieve the highest levels of solar-to-electric conversion efficiency, and yet they are still among the least common commercially available technologies. This paper focuses on a strategy aimed at promoting greater diffusion of dish-Stirling systems, which involves optimizing the size of the collector aperture area based on the hourly frequency distributions of beam irradiance and defining a new incentive scheme with a feed-in tariff that is variable with the installed costs of…
Ventricular Fibrillation and Tachycardia detection from surface ECG using time-frequency representation images as input dataset for machine learning
2017
Parameter-less ventricular fibrillation detection with time-frequency representation. Time-frequency representations are treated as images for a classifier. A comparison of four classifiers demonstrates the validity of the proposed method. The proposed technique could be applied to any signal and research field. This is a novel approach to signal analysis. Background and objective: To safely select the proper therapy for Ventricular Fibrillation (VF), it is essential to distinguish it correctly from Ventricular Tachycardia (VT) and other rhythms. Since the required therapy is not the same, an erroneous detection might lead to serious injuries to the patient or even cause Ventricular Fibr…
piRNAclusterDB 2.0: update and expansion of the piRNA cluster database.
2021
PIWI-interacting RNAs (piRNAs) and their partnering PIWI proteins defend the animal germline against transposable elements and play a crucial role in fertility. Numerous studies in the past have uncovered many additional functions of the piRNA pathway, including gene regulation, anti-viral defense, and somatic transposon repression. Further, comparative analyses across phylogenetic groups showed that the PIWI/piRNA system evolves rapidly and exhibits great evolutionary plasticity. However, the presence of so-called piRNA clusters as the major source of piRNAs is common to nearly all metazoan species. These genomic piRNA-producing loci are highly divergent across taxa and critically…
Lightweight LCP construction for next-generation sequencing datasets
2012
The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hundreds of millions of DNA sequences are now commonplace in bioinformatics. Knowing the longest common prefix array (LCP) of such a collection would facilitate the rapid computation of maximal exact matches, shortest unique substrings and shortest absent words. CPU-efficient algorithms for computing the LCP of a string have been described in the literature, but require the presence in RAM of large data structures. This prevents such methods from being feasible for NGS datasets. In this paper we propose the first lightweight method that simultaneously computes, via sequential scans, the LCP and B…
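As a small illustration of what an LCP array is, the sketch below builds one naively for a single short string held entirely in memory. It is not the lightweight sequential-scan method the paper proposes, which is not described in this excerpt; the function names `suffix_array` and `lcp_array` are hypothetical and chosen only for this example.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix they begin.
    # Fine for short strings, far too slow and memory-hungry for NGS-scale data.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    # lcp[k] is the length of the longest common prefix shared by the
    # suffixes starting at sa[k - 1] and sa[k]; lcp[0] is 0 by convention.
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = sa[k - 1], sa[k]
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[k] = n
    return lcp


sa = suffix_array("banana")
print(sa)                       # [5, 3, 1, 0, 4, 2]
print(lcp_array("banana", sa))  # [0, 1, 3, 0, 0, 2]
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so adjacent suffixes share prefixes of length 1, 3, 0, 0 and 2.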
Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets
2016
In this article we show how a multi-paradigm framework can fulfil the requirements of tweets analysis and reduce the waiting time for researchers that use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…
Scalable robust clustering method for large and sparse data
2018
Datasets for unsupervised clustering can be large and sparse, with a significant portion of missing values. We present here a scalable version of a robust clustering method with the available data strategy. More precisely, a general algorithm is described and the accuracy and scalability of a distributed implementation of the algorithm are tested. The obtained results allow us to conclude that the proposed approach is viable.
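One common reading of an "available data strategy" is that distances are computed only over the coordinates observed in both vectors. The sketch below shows that idea alone. It assumes this reading, uses `None` to mark a missing value, and the name `available_distance` is hypothetical; the abstract does not specify the paper's actual algorithm.

```python
import math


def available_distance(x, y):
    # Euclidean distance restricted to coordinates present in both vectors.
    # Assumption: None marks a missing value; the paper may encode or weight
    # missing entries differently.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return math.nan  # no shared coordinates, so the distance is undefined
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs))


# The second coordinate of the first vector is missing and is skipped.
print(available_distance([1.0, None, 3.0], [4.0, 2.0, 7.0]))  # 5.0
```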
Selecting significant respondents from large audience datasets: The case of the World Hobbit Project
2016
International projects, online questionnaires, or data mining techniques now allow audience researchers to gather very large and complex datasets. But whilst data collection capacity is hugely growing, qualitative analysis, conversely, becomes increasingly difficult to conduct. In this paper, I suggest a strategy that might allow the researcher to manage this complexity. The World Hobbit Project dataset (36,109 cases), including answers to both closed and open-ended questions, was used for this purpose. The strategy proposed here is based on between-methods sequential triangulation, and tries to combine statistical techniques (k-means clustering) with textual analysis. K-means clustering pe…