
AUTHOR

Christophe Guyeux

Chloroplast genomes of Rubiaceae: Comparative genomics and molecular phylogeny in subfamily Ixoroideae.

In Rubiaceae phylogenetics, the number of markers often proved a limitation with authors failing to provide well-supported trees at tribal and generic levels. A robust phylogeny is a prerequisite to study the evolutionary patterns of traits at different taxonomic levels. Advances in next-generation sequencing technologies have revolutionized biology by providing, at reduced cost, huge amounts of data for an increased number of species. Due to their highly conserved structure, generally recombination-free, and mostly uniparental inheritance, chloroplast DNA sequences have long been used as choice markers for plant phylogeny reconstruction. The main objectives of this study are: 1) to gain in…

research product

Simulation-based estimation of branching models for LTR retrotransposons

Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model allows us to take into account both the positions and the degradation level of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented to simulate their spread, and visualization tools are proposed. Based on these simulation tools, we have developed a first method to evaluate the parameters of this propagation …

research product

Efficient cluster-based routing algorithm for body sensor networks

International audience; Body Sensor Networks have gained a lot of research interest lately for the variety of applications they can serve. In such networks where nodes might hold critical information about people's lives, designing efficient routing schemes is very important to guarantee data delivery with the lowest delay and energy consumption. Even though some cluster-based routing schemes were proposed in the literature, none of them offer a complete solution that guarantees energy- and delay-efficient routing in BSN. In this paper, we propose a robust cluster-based algorithm that increases the routing efficiency through every step of the routing process: cluster formation, cluster head…

research product

Efficient Online Laplacian Eigenmap Computation for Dimensionality Reduction in Molecular Phylogeny via Optimisation on the Sphere

Reconstructing the phylogeny of large groups of large divergent genomes remains a difficult problem to solve, whatever the methods considered. Methods based on distance matrices are blocked because computing these matrices is impossible in practice, whereas Bayesian inference or maximum likelihood methods presuppose a multiple alignment of the genomes, which is itself difficult to achieve if precision is required. In this paper, we propose to calculate new distances for randomly selected pairs of species over iterations, and then to map the biological sequences into a low-dimensional space based on the partial knowledge of this genome similarity matrix. This mapping is then used …

research product

Average Performance Analysis of the Stochastic Gradient Method for Online PCA

International audience; This paper studies the complexity of the stochastic gradient algorithm for PCA when the data are observed in a streaming setting. We also propose an online approach for selecting the learning rate. Simulation experiments confirm the practical relevance of the plain stochastic gradient approach and that drastic improvements can be achieved by learning the learning rate.
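
As an illustration of the plain stochastic gradient approach mentioned in this abstract, here is a minimal sketch of a streaming update for the leading principal direction (often called Oja's rule). It is not the authors' code: the function names, the synthetic data stream, and the fixed decaying step size `1 / (t + 10)` are assumptions made for this example, and the learning-rate selection procedure proposed in the paper is not reproduced.

```python
import math
import random


def oja_top_component(stream, dim, step=lambda t: 1.0 / (t + 10)):
    """Estimate the leading principal direction from a stream of zero-mean samples.

    Hypothetical helper: one stochastic gradient step per sample, followed by
    renormalisation onto the unit sphere.
    """
    w = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit-norm starting point
    for t, x in enumerate(stream):
        proj = sum(wi * xi for wi, xi in zip(w, x))  # <w, x>
        eta = step(t)
        # Gradient step on the variance of the projection, then renormalise.
        w = [wi + eta * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]
    return w


def sample_stream(n, seed=0):
    """Synthetic 2-D stream (an assumption): most variance lies on the first axis."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [3.0 * rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)]


w = oja_top_component(sample_stream(5000), dim=2)
```

Each sample is seen once and then discarded, which is what makes the update suitable for a streaming setting; the estimate `w` should align, up to sign, with the first coordinate axis for this synthetic stream.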

research product

Efficient Hybrid Emergency Aware MAC Protocol for Wireless Body Sensor Networks

International audience; In Body Sensor Networks (BSNs), two types of events should be addressed: periodic and emergency events. Traffic rate is usually low during periodic observation, and becomes very high upon emergency. One of the main and challenging requirements of BSNs is to design Medium Access Control (MAC) protocols that guarantee immediate and reliable transmission of data in emergency situations, while maintaining high energy efficiency in non-emergency conditions. In this paper, we propose a new emergency aware hybrid DTDMA/DS-CDMA protocol that can accommodate BSN traffic variations by addressing emergency and periodic traffic requirements. It takes advantage of the high delay …

research product

Anomaly‐based intrusion detection systems: The requirements, methods, measurements, and datasets

International audience; With the Internet's unprecedented growth and nations' reliance on computer networks, new cyber‐attacks are created every day as means for achieving financial gain, imposing political agendas, and developing cyberwarfare arsenals. Network security is thus acquiring increasing attention among researchers, practitioners, network architects, policy makers, and others. To defend organizations' networks from existing, foreseen, and future threats, intrusion detection systems (IDSs) are becoming a must. Existing surveys on anomaly‐based IDS (AIDS) focus on specific components such as detection mechanisms and lack many others. In contrast to existing surveys, this article co…

research product

A Hardware and Secure Pseudorandom Generator for Constrained Devices

Hardware security for an Internet of Things or cyber physical system drives the need for ubiquitous cryptography to different sensing infrastructures in these fields. In particular, generating strong cryptographic keys on such resource-constrained devices depends on a lightweight and cryptographically secure random number generator. In this research work, we have introduced a new hardware chaos-based pseudorandom number generator, which is mainly based on the deletion of a Hamiltonian cycle within the $N$-cube (or on the vectorial negation), plus one single permutation. We have rigorously proven the chaotic behavior and cryptographically secure property of the whole proposal: the mid-term eff…

research product

OSIP1 is a self‐assembling DUF3129 protein required to protect fungal cells from toxins and stressors

International audience; Secreted proteins are key players in fungal physiology and cell protection against external stressing agents and antifungals. Oak stress-induced protein 1 (OSIP1) is a fungal-specific protein with unknown function. By using Podospora anserina and Phanerochaete chrysosporium as models, we combined both in vivo functional approaches and biophysical characterization of OSIP1 recombinant protein. The P. anserina OSIP1(Delta) mutant showed an increased sensitivity to the antifungal caspofungin compared to the wild type. This correlated with the production of a weakened extracellular exopolysaccharide/protein matrix (ECM). Since the recombinant OSIP1 from P. chrysosporium …

research product

SpCLUST: Towards a fast and reliable clustering for potentially divergent biological sequences

International audience; This paper presents SpCLUST, a new C++ package that takes a list of sequences as input, aligns them with MUSCLE, computes their similarity matrix in parallel and then performs the clustering. SpCLUST extends a previously released software by integrating additional scoring matrices, which enables it to cover the clustering of amino-acid sequences. The similarity matrix is now computed in parallel according to the master/slave distributed architecture, using MPI. Performance analysis, realized on two real datasets of 100 nucleotide sequences and 1049 amino-acid sequences, shows that the resulting library substantially outperforms the original Python package. The proposed pac…

research product

A critical review on the implementation of static data sampling techniques to detect network attacks

International audience; Given that the Internet traffic speed and volume are growing at a rapid pace, monitoring the network in a real-time manner has introduced several issues in terms of computing and storage capabilities. Fast processing of traffic data and early warnings on the detected attacks are required while maintaining a single pass over the traffic measurements. To palliate these problems, one can reduce the amount of traffic to be processed by using a sampling technique and detect the attacks based on the sampled traffic. Different parameters have an impact on the efficiency of this process, mainly, the applied sampling policy and sampling ratio. In this paper, we investigate th…

research product

Evaluation of chloroplast genome annotation tools and application to analysis of the evolution of coffee species.

International audience; Chloroplast sequences are widely used for phylogenetic analysis due to their high degree of conservation in plants. Whole chloroplast genomes can now be readily obtained for plant species using new sequencing methods, giving invaluable data for plant evolution. However, new annotation methods are required for the efficient analysis of this data to deliver high quality phylogenetic analyses. In this study, the two main tools for chloroplast genome annotation were compared. More consistent detection and annotation of genes were produced with GeSeq when compared to the currently used Dogma. This suggests that the annotation of most of the previously annotated chloroplast …

research product

Online shortest paths with confidence intervals for routing in a time varying random network

International audience; The increase in the world's population and rising standards of living are leading to an ever-increasing number of vehicles on the roads, and with it ever-increasing difficulties in traffic management. This traffic management in transport networks can be clearly optimized by using information and communication technologies referred to as Intelligent Transport Systems (ITS). This management problem is usually reformulated as finding the shortest path in a time varying random graph. In this article, an online shortest path computation using stochastic gradient descent is proposed. This routing algorithm for ITS traffic management is based on the online Frank-Wolfe approach.…

research product

Collaborative body sensor networks: Taxonomy and open challenges

International audience; Single Body Sensor Networks (BSNs) have gained a lot of interest during the past few years. However, the need to monitor the activity of many individuals to assess the group status and take action accordingly has created a new research domain called Collaborative Body Sensor Network (CBSN). In such a new field, understanding the CBSN concept and its challenges from the ground up requires investigation to allow the development of suitable algorithms and protocols. Although there are many research studies in BSN, CBSN is still in its early phases and studies around it are very few. In this paper, we define and taxonomize CBSN, describe its architecture, and discuss its applicati…

research product

Reliable diagnostics using wireless sensor networks

International audience; Monitoring activities in industry may require the use of wireless sensor networks, for instance due to difficult access or hostile environment. But it is well known that this type of network has various limitations, such as the amount of available energy. Indeed, once a sensor node exhausts its resources, it is dropped from the network and stops forwarding information about possibly relevant features to the sink. This results in broken links and data loss, which impact the diagnostic accuracy at the sink level. It is therefore important to keep the network's monitoring service running as long as possible by preserving the energy held by the nodes. As packet trans…

research product

Efficient distributed average consensus in wireless sensor networks

International audience; Computing the distributed average consensus in Wireless Sensor Networks (WSNs) is investigated in this article. This problem, which is both natural and important, plays a significant role in various application fields such as mobile agents and fleet vehicle coordination, network synchronization, distributed voting and decision, load balancing of divisible loads in distributed computing network systems, and so on. By and large, the objective of average consensus is to have all nodes in the network converge to the average value of the initial nodes' measurements based only on local nodes' information states. In this paper, we introduce a fully distributed algorithm to a…
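
To make the average consensus objective concrete, the following sketch runs the textbook linear iteration in which every node repeatedly moves toward its neighbours' values. It is a generic illustration, not the algorithm introduced in the paper (whose details are cut off in this listing); the ring topology, the readings, the step size `epsilon`, and the function name are all assumptions.

```python
def average_consensus(values, neighbors, epsilon, rounds):
    """Synchronous linear consensus: each node uses only its neighbours' states.

    Hypothetical helper. With an undirected, connected graph and
    0 < epsilon < 1 / max_degree, every state converges to the initial average.
    """
    x = list(values)
    for _ in range(rounds):
        # All nodes update simultaneously from the previous round's states.
        x = [
            xi + epsilon * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Assumed example: five sensor nodes arranged in a ring.
neighbors = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
readings = [10.0, 20.0, 30.0, 40.0, 50.0]
result = average_consensus(readings, neighbors, epsilon=0.3, rounds=200)
```

Because the exchange between two neighbours is symmetric, the sum of all states is preserved at every round, so the common limit can only be the average of the initial readings (30.0 here).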

research product

A Route toward Protein Sequencing using Solid-State Nanopores Assisted by Machine Learning

Solid-State Nanopores made of 2-D materials such as MoS2 have emerged as one of the most versatile sensors for single-biomolecule detection, which is essential for early disease diagnosis (biomarker detection). One of the most promising applications of SSN is DNA and protein sequencing, at a low cost and faster than the current standard methods. The detection principle relies on measuring the relatively small variations of ionic current as charged biomolecules immersed in an electrolyte traverse the nanopore, in response to an external voltage applied across the membrane. The passage of a biomolecule through the pore yields information about its structure and chemical properties, as demonst…

research product

A Personal LPWAN Remote Monitoring System

Firefighters are equipped with an immobility detector device also called the Personal Alert Safety System (PASS) that is integrated into the user's Self-Contained Breathing Apparatus (SCBA). If a firefighter remains motionless for a certain period of time, a loud audible alert is triggered to notify the Firefighter Assist and Search Team (FAST) deployed in the area of intervention that the wearer of the PASS device is in trouble and in need of rescue. However, this device is not reliable enough since it frequently triggers false positives, which leads the crew to develop a tolerance for sounding alarms. As a consequence, they do not seem to be concerned about it as they should and th…

research product

On the collision property of chaotic iterations based post-treatments over cryptographic pseudorandom number generators

International audience; There is no proper mathematical definition of chaos; instead, there are quite a large number of definitions, each of which describes chaos in a more or less general context. Taking this into account, it is clear why it is hard to design an algorithm that produces random numbers, a kind of algorithm that could have plenty of concrete applications. However, we must use a finite state machine (e.g. a laptop) to produce such a sequence of random numbers; thus it is convenient, for obvious reasons, to redefine those aimed sequences as pseudorandom; also, problems arise with floating point arithmetic if one wants to recover some real chaotic property (i.e. propertie…

research product

CIPRNG: A VLSI Family of Chaotic Iterations Post-Processings for $\mathbb{F}_{2}$-Linear Pseudorandom Number Generation Based on Zynq MPSoC

Hardware pseudorandom number generators are continuously improved to satisfy both physical and ubiquitous computing security system challenges. The main contribution of this paper is to propose two post-processing modules in hardware, to improve the randomness of linear PRNGs while succeeding in passing the TestU01 statistical battery of tests. They are based on chaotic iterations and are denoted by CIPRNG-MC and CIPRNG-XOR. They have various interesting properties, encompassing the ability to improve the statistical profile of the generators on which they iterate. Such post-processings have been implemented on FPGA and ASIC without inferring any blocks (RAM or DSP). A comparison in terms of …

research product

Toward fast and accurate emergency cases detection in BSNs

International audience; In body sensor networks (BSNs), medical sensors capture physiological data from the human body and send them to the coordinator, which acts as a gateway to health care. The main aim of BSNs is to save people's lives. Therefore, fast and correct detection of emergencies while maintaining low energy consumption of sensors is an essential requirement of BSNs. In this study, the authors propose a new adaptive data sampling approach, where the sampling ratio is adapted based on the sensed data variation. The idea is to use the modified version of the cumulative sum (CUSUM) algorithm (modified CUSUM) that they previously proposed for wireless sensor networks to monitor the data v…

research product

Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance

International audience; Wireless Multimedia Sensor Network (WMSN) is a promising technology capturing rich multimedia data like audio and video, which can be useful to monitor an environment under surveillance. However, many scenarios in real time monitoring require 3D depth information. In this research work, we propose to use the disparity map that is computed from two or multiple images, in order to monitor the depth information in an object or event under surveillance using WMSN. Our system is based on distributed wireless sensors allowing us to notably reduce the computational time needed for 3D depth reconstruction, thus permitting the success of real time solutions. Each pa…

research product

Dendrochemical assessment of mercury releases from a pond and dredged-sediment landfill impacted by a chlor-alkali plant.

International audience; Although current Hg emissions from industrial activities may be accurately monitored, evidence of past releases to the atmosphere must rely on one or more environmental proxies. We used Hg concentrations in tree cores collected from poplars and willows to investigate the historical changes of Hg emissions from a dredged sediment landfill and compared them to a nearby control location. Our results demonstrated the potential value of using dendrochemistry to record historical Hg emissions from past industrial activities.

research product

Investigating Low Level Protocols for Wireless Body Sensor Networks

The rapid development of medical sensors has increased the interest in Wireless Body Area Network (WBAN) applications where physiological data from the human body and its environment is gathered, monitored, and analyzed to take the proper measures. In WBANs, it is essential to design MAC protocols that ensure adequate Quality of Service (QoS) such as low delay and high scalability. This paper investigates Medium Access Control (MAC) protocols used in WBAN, and compares their performance in a high traffic environment. Such a scenario can arise, for example, in case of emergency, where physiological data collected from all sensors on the human body should be sent simultaneously to take appropria…

research product

Random Walk in a N-cube Without Hamiltonian Cycle to Chaotic Pseudorandom Number Generation: Theoretical and Practical Considerations

Designing a pseudorandom number generator (PRNG) is a difficult and complex task. Many recent works have considered chaotic functions as the basis of built PRNGs: the quality of the output would indeed be an obvious consequence of some chaos properties. However, there is no direct reasoning that goes from chaotic functions to uniform distribution of the output. Moreover, embedding such functions into a PRNG does not necessarily yield a chaotic output, which could be required for simulating some chaotic behaviors. In a previous work, some of the authors have proposed the idea of walking into a $\mathsf{N}$-cube where a balanced Hamiltonian cycle has been removed as the basis o…

research product

Energy-Efficiency and Coverage Quality Management for Reliable Diagnostics in Wireless Sensor Networks

International audience; The processing of data and signals provided by sensors aims at extracting relevant features which can be used to assess and diagnose the health state of the monitored targets. Nevertheless, Wireless Sensor Networks (WSNs) present a number of shortcomings that have an impact on the quality of the gathered data at the sink level, leading to imprecise diagnostics of the observed targets. To improve data accuracy, two main critical and related issues, namely the energy consumption and coverage quality, need to be considered. The goal is to maximize the network lifetime while guaranteeing the complete coverage of all the targets. Unfortunately, these performance…

research product

Detection of Temporal Clusters of Healthcare-Associated Infections or Colonizations with Pseudomonas aeruginosa in Two Hospitals: Comparison of SaTScan and WHONET Software Packages.

International audience; The identification of temporal clusters of healthcare-associated colonizations or infections is a challenge in infection control. WHONET software is available to achieve these objectives using laboratory databases of hospitals but it has never been compared with SaTScan regarding its detection performance. This study provided the opportunity to evaluate the performance of WHONET software in comparison with SaTScan software as a reference to detect clusters of Pseudomonas aeruginosa. A retrospective study was conducted in two French university hospitals. Cases of P. aeruginosa colonizations or infections occurring between 1st January 2005 and 30th April 2014 in the fi…

research product

Ancestral Reconstruction and Investigations of Genomic Recombination on some Pentapetalae Chloroplasts

In this article, we propose a semi-automated method to rebuild genome ancestors of chloroplasts by taking into account gene duplication. Two methods have been used in order to achieve this work: a naked eye investigation using homemade scripts, whose results are considered as a basis of knowledge, and a dynamic programming based approach similar to Needleman-Wunsch. The latter fundamentally uses the Gestalt pattern matching method of sequence matcher to evaluate the occurrence probability of each gene in the last common ancestor of two given genomes. The two approaches have been applied on chloroplastic genomes from Apiales, Asterales, and Fabids orders, the latter belonging to Pe…
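
For readers unfamiliar with Gestalt pattern matching, the sketch below shows how Python's standard `difflib.SequenceMatcher`, which implements this kind of matching, compares two gene orders. It only illustrates the matching primitive, not the authors' ancestral reconstruction pipeline; the two gene lists are invented for the example and do not come from any real genome.

```python
from difflib import SequenceMatcher

# Hypothetical gene orders for two genomes (illustrative only).
genome_a = ["psbA", "matK", "rps16", "psbK", "psbI", "atpA"]
genome_b = ["psbA", "matK", "psbK", "psbI", "atpA", "atpF"]

# autojunk=False disables difflib's popularity heuristic, which is only
# relevant for long sequences but is switched off here for clarity.
matcher = SequenceMatcher(None, genome_a, genome_b, autojunk=False)

# ratio() returns 2 * M / T, where M is the number of matched elements
# and T is the total number of elements in both sequences.
similarity = matcher.ratio()

# Contiguous runs of genes shared by both orders (the final zero-length
# sentinel block returned by difflib is filtered out).
shared_blocks = [
    genome_a[m.a:m.a + m.size]
    for m in matcher.get_matching_blocks()
    if m.size > 0
]
```

Here five of the twelve listed genes pair up on each side, so `similarity` is 10/12, and `shared_blocks` contains the two conserved runs `["psbA", "matK"]` and `["psbK", "psbI", "atpA"]`.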

research product

panISa: ab initio detection of insertion sequences in bacterial genomes from short read sequence data.

Motivation: The advent of next-generation sequencing has boosted the analysis of bacterial genome evolution. Insertion sequence (IS) elements play a key role in prokaryotic genome organization and evolution, but their repetitions in genomes complicate their detection from short-read data. Results: PanISa is a software pipeline that identifies IS insertions ab initio in bacterial genomes from short-read data. It is a highly sensitive and precise tool based on the detection of read-mapping patterns at the insertion site. PanISa performs better than existing IS detection systems as it is based on a database-free approach. We applied it to a high-risk clone lineage of the pathogenic spec…

research product

Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox

International audience; In this paper, we present a toolbox for a specific optimization problem that frequently arises in bioinformatics or genomics. In this specific optimization problem, the state space is a set of words of specified length over a finite alphabet. Each word is associated with a score. The overall objective is to find the words which have the lowest possible score. This type of general optimization problem is encountered in, e.g., 3D conformation optimization for protein structure prediction, or largest core genes subset discovery based on best supported phylogenetic tree for a set of species. In order to solve this problem, we propose a toolbox that can be easily launched usin…

research product

Advances in the enumeration of foldable self-avoiding walks

Self-avoiding walks (SAWs) have been studied for a long time due to their intrinsic importance and the many application fields in which they operate. A new subset of SAWs, called foldable SAWs, has recently been discovered when investigating two different SAW manipulations embedded within existing protein structure prediction (PSP) software. Since then, several attempts have been made to find out more about these walks, including counting them. However, calculating the number of foldable SAWs has proved to be a difficult task, and current supercomputers fail to count foldable SAWs of length exceeding ≈ 30 steps. In this article, we present new progress in this enumeration, bo…

research product

Impact of Insertion Sequences and RNAs on Genomic Inversions in Pseudomonas aeruginosa

In this article, a bioinformatics pipeline is proposed that focuses on two types of elements, namely the mobile genetic elements (MGEs) and ribonucleic acids (RNAs). The MGEs are called insertion sequences (ISs) in the prokaryotic domain. The objective of this research work is to study the behaviour of RNA and MGE genes, and the effects of their presence around inversions in genome sequences. The proposed pipeline finds the relation between the transposase gene types (e.g., DDE and DEDD) located within insertion sequences according to their IS family and sub-family, and RNAs (tRNA and rRNA) on the one hand, and genomic inversion on the other hand. More precisely, we wonder whether…

research product

A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

International audience; In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clust…

research product