A vision system for symbolic interpretation of dynamic scenes using ARSOM
We describe an artificial high-level vision system for the symbolic interpretation of data from a video camera that acquires image sequences of moving scenes. The system is based on ARSOM neural networks that learn to generate perception-grounded predicates from the image sequences. The ARSOM neural networks also provide a three-dimensional estimate of the movements of the relevant objects in the scene. The vision system has been employed in two scenarios: the monitoring of a robotic arm suitable for space operations, and the surveillance of an electronic data processing (EDP) center.