Search results for "bioinformatics"
showing 10 items of 74 documents
Adaptive Continuous Feature Binarization for Tsetlin Machines Applied to Forecasting Dengue Incidences in the Philippines
2020
The Tsetlin Machine (TM) is a recent interpretable machine learning algorithm that requires relatively modest computational power, yet attains competitive accuracy in several benchmarks. TMs are inherently binary; however, many machine learning problems are continuous. While binarization of continuous data through brute-force thresholding has yielded promising accuracy, such an approach is computationally expensive and hinders extrapolation. In this paper, we address these limitations by standardizing features to support scale shifts in the transition from training data to real-world operation, which are typical in, e.g., forecasting. For scalability, we employ sampling to reduce the number of binariz…
A one class classifier for Signal identification: a biological case study
2008
The paper describes an application of a one-class KNN to identify different signal patterns embedded in a noise structured background. The problem becomes harder whenever only one pattern is well represented in the signal; in such cases, one-class classifier techniques are more appropriate. The classification phase is applied after a preprocessing phase based on a Multi Layer Model (MLM) that provides a preliminary signal segmentation in an interval feature space. The one-class KNN has been tested on synthetic data that simulate microarray data for the identification of nucleosomes and linker regions across DNA. Results have shown a good recognition rate on synthetic data for nucleosome and lin…
Search for a Minimal Set of Parameters by Assessing the Total Optimization Potential for a Dynamic Model of a Biochemical Network.
2017
Selecting an efficient small set of adjustable parameters to improve metabolic features of an organism is important for reducing implementation costs and the risk of unpredicted side effects. In practice, to avoid the analysis of a huge combinatorial space for the possible sets of adjustable parameters, experience- and intuition-based subsets of parameters are often chosen, possibly leaving some interesting counter-intuitive combinations of parameters unrevealed. The combinatorial scan of possible adjustable parameter combinations at the model optimization level is possible; however, the number of analyzed combinations is still limited. The total optimization potential (TOP) approach is…
String kernels and high-quality data set for improved prediction of kinked helices in α-helical membrane proteins.
2011
The reasons for distortions from optimal α-helical geometry are largely unknown, but their influences on structural changes of proteins are significant. Hence, their prediction is a crucial problem in structural bioinformatics. For the particular case of kink prediction, we generated a data set of 132 membrane proteins containing 1014 manually labeled helices and examined the environment of kinks. Our sequence analysis confirms the great relevance of proline and reveals disproportionately high occurrences of glycine and serine at kink positions. The structural analysis shows significantly different solvent accessible surface area mean values for kinked and nonkinked helices. More important, …
Molecular analysis of the fungal community associated with phyllosphere and carposphere of fruit crops
DNA Microarray and Bioinformatics as Tools to Identify a Common Molecular Signature Shared by Human Aneuploid Cells
Genomic instability is a hallmark of the majority of human tumors, explaining the heterogeneity shown by tumor cells. This phenomenon is often associated with chromosomal instability (CIN) and aneuploidy, a condition in which tumor cells lose or gain chromosomes. Previously, we showed that posttranscriptional silencing by RNAi of pRb(1), DNMT1(2) and MAD2(3) is associated with aneuploidy in cultured human cells, reinforcing the idea that there are several roads leading to aneuploidy. In an attempt to understand whether a common molecular signature exists that underlies aneuploidy and its tolerance in tumor cells, we performed posttranscriptional silencing of Rb, MAD2 and DNMT1 in human fibroblasts (IM…
Gene expression in diapausing rotifer eggs in response to divergent environmental predictability regimes
2020
In unpredictable environments in which reliable cues for predicting environmental variation are lacking, a diversifying bet-hedging strategy for diapause exit is expected to evolve, whereby only a portion of diapausing forms will resume development at the first occurrence of suitable conditions. This study focused on diapause termination in the rotifer Brachionus plicatilis s.s., addressing the transcriptional profile of diapausing eggs from environments differing in the level of predictability and the relationship of such profiles with hatching patterns. RNA-Seq analyses revealed significant differences in gene expression between diapausing eggs produced in the laboratory under com…
The dimer-monomer equilibrium of SARS-CoV-2 main protease is affected by small molecule inhibitors
2021
The maturation of coronavirus SARS-CoV-2, which is the etiological agent at the origin of the COVID-19 pandemic, requires a main protease Mpro to cleave the virus-encoded polyproteins. Despite a wealth of experimental information already available, there is wide disagreement about the Mpro monomer-dimer equilibrium dissociation constant. Since the functional unit of Mpro is a homodimer, the detailed knowledge of the thermodynamics of this equilibrium is a key piece of information for possible therapeutic intervention, with small molecules interfering with dimerization being potential broad-spectrum antiviral drug leads. In the present study, we exploit Small Angle X-ray Scattering (…
A motif-independent metric for DNA sequence specificity
2011
Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM measure can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We…