Online Induction of Probabilistic Real Time Automata
Jana Schmidt, Stefan Kramer
subject
Theoretical computer science; Computer science; Probabilistic logic; Automaton; Data set; Trie; Automata theory; The Internet; Data mining; Cluster analysis
description
Probabilistic real time automata (PRTAs) are a representation of dynamic processes arising in the sciences and industry. Currently, the induction of automata is divided into two steps: the creation of the prefix tree acceptor (PTA) and the merge procedure based on clustering of the states. Both steps can be very time-intensive when a PRTA is to be induced for massive or even unbounded data sets. The second step, the merge procedure, can be processed efficiently, because scalable online clustering algorithms exist. The creation of the PTA, however, can still be very time-consuming. To overcome this problem, we propose a genuine online PRTA induction approach that incorporates new instances by first collapsing them and then applying a clustering based on maximum frequent patterns. The approach is tested against a predefined synthetic automaton and on real-world data sets, on which it proves scalable and stable. Moreover, we present a broad evaluation on a real-world disease group data set that shows the applicability of such a model to the analysis of medical processes.
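To make the first of the two steps more concrete, the following is a minimal sketch of a prefix tree acceptor that is updated one sequence at a time. It is not the authors' algorithm: it leaves out the collapsing of instances and the clustering-based merge entirely, and the names `State`, `PrefixTreeAcceptor`, `add`, and `transition_probability`, as well as the input format of `(event, delay)` pairs, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    # Number of sequences that passed through this state (hypothetical field).
    count: int = 0
    # Outgoing transitions: event symbol -> successor state.
    children: dict = field(default_factory=dict)
    # Observed time delays per outgoing event symbol.
    delays: dict = field(default_factory=dict)


class PrefixTreeAcceptor:
    """Illustrative prefix tree over timed event sequences."""

    def __init__(self):
        self.root = State()

    def add(self, sequence):
        """Insert one sequence [(event, delay), ...] without rebuilding the tree."""
        node = self.root
        node.count += 1
        for event, delay in sequence:
            node.delays.setdefault(event, []).append(delay)
            node = node.children.setdefault(event, State())
            node.count += 1

    def transition_probability(self, prefix, event):
        """Relative frequency of `event` after the event prefix `prefix`."""
        node = self.root
        for symbol in prefix:
            node = node.children.get(symbol)
            if node is None:
                return 0.0
        child = node.children.get(event)
        return child.count / node.count if child else 0.0


pta = PrefixTreeAcceptor()
pta.add([("a", 1.0), ("b", 2.5)])
pta.add([("a", 1.5), ("c", 0.5)])
pta.add([("a", 0.8), ("b", 3.0)])
print(pta.transition_probability(["a"], "b"))
```

In this sketch each new sequence only touches the states along its own path, which is the property an online setting needs; a full PRTA induction would additionally merge similar states, which is not shown here.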
year | journal |
---|---|
2012-12-01 | 2012 IEEE 12th International Conference on Data Mining |