6533b821fe1ef96bd127c110

RESEARCH PRODUCT

Learning of regular expressions by pattern matching

Alvis Brazma

subject

PolynomialFinite-state machineRegular languageComputer scienceBounded functionRegular expressionPattern matchingAlgorithmExpression (mathematics)Substring

description

We consider the problem of restoring regular expressions from good examples. We describe a natural learning algorithm for obtaining a “plausible” regular expression from one example. The algorithm is based on finding the longest substring which can be matched by some part of the so far obtained expression. We believe that the algorithm to a certain extent mimics humans guessing regular expressions from the same sort of examples. We show that for regular expressions of bounded length successful learning takes time linear in the length of the example, provided that the example is “good”. Under certain natural restrictions the run-time of the learning algorithm is polynomial also in unsuccessful cases. In the end we discuss the computer experiment of learning regular expressions via the described algorithm, showing that the proposed learning method is quite practical.

https://doi.org/10.1007/3-540-59119-2_194