Author: Stan Sclaroff



Showing 13 related works from this author

Object Matching in Distributed Video Surveillance Systems by LDA-Based Appearance Descriptors

2009

Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the…

Settore ING-INF/05 (Information Processing Systems); Matching (statistics); Computer science; Node (networking); Video surveillance; Object matching; Object (computer science); Latent Dirichlet allocation; Salient; Margin (machine learning); Computer vision; Artificial intelligence; Correspondence problem; consistent labelling


Combining textual and visual cues for content-based image retrieval on the World Wide Web

2002

A system is proposed that combines textual and visual statistics in a single index vector for content-based search of a WWW image database. Textual statistics are captured in vector form using latent semantic indexing (LSI) based on text in the containing HTML document. Visual statistics are captured in vector form using color and orientation histograms. By using an integrated approach, it becomes possible to take advantage of possible statistical couplings between the content of the document (latent semantic content) and the contents of images (visual statistics). The combined approach allows improved performance in conducting content-based search. Search performance experiments are report…

Settore ING-INF/05 (Information Processing Systems); World Wide Web; Information retrieval; Index (publishing); Distributed database; Orientation (computer vision); Computer science; Histogram; Search engine indexing; Content-based image retrieval; Sensory cue; Image retrieval; CBIR; latent semantic indexing
Proceedings. IEEE Workshop on Content-Based Access of Image and Video Libraries (Cat. No.98EX173)
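
The single index vector described in the abstract above can be illustrated with a minimal sketch. It assumes the LSI text vector and the two histograms have already been computed; the function names, the per-part L2 normalization, and the weighting parameters are hypothetical choices for illustration, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combined_index_vector(text_vec, color_hist, orient_hist,
                          text_weight=1.0, visual_weight=1.0):
    """Concatenate textual and visual statistics into one index vector.

    Each part is normalized on its own so that no part dominates merely
    because of its scale. The weights are an assumption of this sketch.
    """
    parts = [
        (text_weight, text_vec),
        (visual_weight, color_hist),
        (visual_weight, orient_hist),
    ]
    combined = []
    for weight, part in parts:
        combined.extend(weight * x for x in l2_normalize(part))
    return combined


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


# Toy example: a 2-D text vector and two 2-bin histograms.
query = combined_index_vector([3.0, 4.0], [1.0, 0.0], [0.0, 2.0])
candidate = combined_index_vector([3.0, 4.0], [0.0, 1.0], [0.0, 2.0])
score = cosine_similarity(query, candidate)
```

Ranking database images by `cosine_similarity` against the query vector is one simple way to use such a vector; the distance measure actually used by the system is not stated in the truncated abstract.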


Online Multi-Person Tracking by Tracker Hierarchy

2012

Tracking-by-detection is a widely used paradigm for multi-person tracking but is affected by variations in crowd density, obstacles in the scene, varying illumination, human pose variation, scale changes, etc. We propose an improved tracking-by-detection framework for multi-person tracking where the appearance model is formulated as a template ensemble updated online given detections provided by a pedestrian detector. We employ a hierarchy of trackers to select the most effective tracking strategy and an algorithm to adapt the conditions for trackers' initialization and termination. Our formulation is online and does not require calibration information. In experiments with four pedestrian t…

Computer science; BitTorrent tracker; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Initialization; Tracking system; Tracking (particle physics); Object detection; Active appearance model; Video tracking; Tracking Experts Detector; Computer vision; Artificial intelligence; Mean-shift


Gesture Modeling by Hanklet-Based Hidden Markov Model

2015

In this paper we propose a novel approach for gesture modeling. We aim at decomposing a gesture into sub-trajectories that are the output of a sequence of atomic linear time invariant (LTI) systems, and we use a Hidden Markov Model to model the transitions from one LTI system to another. For this purpose, we represent the human body motion in a temporal window as a set of body joint trajectories that we assume are the output of an LTI system. We describe the set of trajectories in a temporal window by the corresponding Hankel matrix (Hanklet), which embeds the observability matrix of the LTI system that produced it. We train a set of HMMs (one for each gesture class) with a discriminative a…

Conditional random field; Kinect; Computer science; Maximum-entropy Markov model; Action Classification; Hankel matrix; Markov model; Hidden Markov model; LTI system theory; Gesture; Action Recognition; Gesture recognition; Observability; Artificial intelligence; Algorithm; Skeleton
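
The Hankel matrix of a temporal window, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration of the standard block-Hankel construction, assuming each frame is a list of joint coordinates; the function names and the Frobenius normalization step are assumptions of this sketch, not details confirmed by the truncated abstract.

```python
import math


def block_hankel(trajectory, num_block_rows):
    """Build a block Hankel matrix from a windowed trajectory.

    trajectory: list of T frames, each a list of d coordinates.
    Block row i, column j holds frame i + j, so every block anti-diagonal
    repeats the same frame. The result has num_block_rows * d rows and
    T - num_block_rows + 1 columns.
    """
    num_frames = len(trajectory)
    if not 1 <= num_block_rows <= num_frames:
        raise ValueError("num_block_rows must be between 1 and the window length")
    dim = len(trajectory[0])
    num_cols = num_frames - num_block_rows + 1
    hankel = []
    for i in range(num_block_rows):
        for k in range(dim):
            hankel.append([trajectory[i + j][k] for j in range(num_cols)])
    return hankel


def frobenius_normalize(matrix):
    """Divide a matrix by its Frobenius norm (an assumed normalization step)."""
    norm = math.sqrt(sum(x * x for row in matrix for x in row))
    if norm == 0:
        return [row[:] for row in matrix]
    return [[x / norm for x in row] for row in matrix]


# Toy window: 4 frames of a single 1-D coordinate.
window = [[1.0], [2.0], [3.0], [4.0]]
hankelet = block_hankel(window, num_block_rows=2)
normalized = frobenius_normalize(hankelet)
```

For the toy window, `hankelet` has two rows and three columns, with each row a shifted copy of the trajectory.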


Unifying Textual and Visual Cues for Content-Based Image Retrieval on the World Wide Web

1999

A system is proposed that combines textual and visual statistics in a single index vector for content-based search of a WWW image database. Textual statistics are captured in vector form using latent semantic indexing based on text in the containing HTML document. Visual statistics are captured in vector form using color and orientation histograms. By using an integrated approach, it becomes possible to take advantage of possible statistical couplings between the content of the document (latent semantic content) and the contents of images (visual statistics). The combined approach allows improved performance in conducting content-based search. Search performance experiments are reported for…

Settore ING-INF/05 (Information Processing Systems); Information retrieval; Computer science; Orientation (computer vision); Search engine indexing; HTML; Semantics; Content-based image retrieval; CBIR; latent semantic indexing; World Wide Web; Index (publishing); Histogram; Signal Processing; Computer Vision and Pattern Recognition; Sensory cue; Software


Head Tracking via Robust Registration in Texture Map Images.

1998

A novel method for 3D head tracking in the presence of large head rotations and facial expression changes is described. Tracking is formulated in terms of color image registration in the texture map of a 3D surface model. Model appearance is recursively updated via image mosaicking in the texture map as the head orientation varies. The resulting dynamic texture map provides a stabilized view of the face that can be used as input to many existing 2D techniques for face recognition, facial expressions analysis, lip reading, and eye tracking. Parameters are estimated via a robust minimization procedure; this provides robustness to occlusions, wrinkles, shadows and specular highlights. The syst…

Head tracking; Settore ING-INF/05 (Information Processing Systems); Facial expression; Computer science; Color image; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image registration; Facial recognition system; Robustness (computer science); Motion estimation; Specular highlight; Eye tracking; Computer vision; Artificial intelligence; Texture mapping; ComputingMethodologies_COMPUTERGRAPHICS; Data compression


Fully automatic, real-time detection of facial gestures from generic video

2005

A technique for the detection of facial gestures from low resolution video sequences is presented. The technique builds upon the automatic 3D head tracker formulation of [M. La Cascia et al., 2000]. The tracker is based on the registration of a texture-mapped cylindrical model. Facial gesture analysis is performed in the texture map by assuming that the residual registration error can be modeled as a linear combination of facial motion templates. Two formulations are proposed and tested. In one formulation, the head and facial motion are estimated in a single, combined linear system. In the other formulation, head motion and then facial motion are estimated in a two-step process. The two-st…

Head (linguistics); Computer science; Linear system; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image registration; Motion (physics); Image texture; Gesture recognition; Computer vision; Artificial intelligence; Texture mapping; ComputingMethodologies_COMPUTERGRAPHICS; Gesture
IEEE 6th Workshop on Multimedia Signal Processing, 2004.


Decoding Children's Social Behavior

2013

We introduce a new problem domain for activity recognition: the analysis of children's social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3-5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe met…

Behavior; Psychology; Dataset; Video analysis; Speech Analysis; Autism; Interaction protocols; Social and communicative behavior; 02 engineering and technology; Other Computer and Information Science; Session (web analytics); Activity recognition; Technical challenges; 0202 electrical engineering electronic engineering information engineering; Social behavior; Audio signal processing; Multimedia; Developmental disorders; 020207 software engineering; Semi-structured; Research questions; Problem domain; Key (cryptography); 020201 artificial intelligence & image processing; Artificial intelligence; Cognitive psychology
2013 IEEE Conference on Computer Vision and Pattern Recognition


Hankelet-based dynamical systems modeling for 3D action recognition

2015

This paper proposes to model an action as the output of a sequence of atomic Linear Time Invariant (LTI) systems. The sequence of LTI systems generating the action is modeled as a Markov chain, where a Hidden Markov Model (HMM) is used to model the transition from one atomic LTI system to another. In turn, the LTI systems are represented in terms of their Hankel matrices. For classification purposes, the parameters of a set of HMMs (one for each action class) are learned via a discriminative approach. This work proposes a novel method to learn the atomic LTI systems from training data, and analyzes in detail the action representation in terms of a sequence of Hankel matrices. Extensive eval…

Settore ING-INF/05 (Information Processing Systems); Sequence; Markov chain; Dynamical systems theory; Supervised learning; Hankel matrix; Hidden Markov model; LTI system theory; Discriminative learning; Linear time invariant system; Discriminative model; Action; Computer Science::Systems and Control; Control theory; Signal Processing; Computer Vision and Pattern Recognition; Electrical and Electronic Engineering; Algorithm; Mathematics
Image and Vision Computing


Joint Alignment and Modeling of Correlated Behavior Streams

2013

The Variable Time-Shift Hidden Markov Model (VTS-HMM) is proposed for learning and modeling pairs of correlated streams. Unlike previous coupled models for time series, the VTS-HMM accounts for varying time shifts between correlated events in pairs of streams having different properties. The VTS-HMM is learned on a set of pairs of unaligned streams and, thus, learning entails simultaneous estimation of the varying time shifts and of the parameters of the model. The formulation is demonstrated in the analysis of videos of dyadic social interactions between children and adults in the Multimodal Dyadic Behavior Dataset (MMDB). In dyadic social interactions, an agent starts an interaction …

Variable (computer science); Series (mathematics); Computer science; Multi-agent system; Speech recognition; Dyadic interaction; Behavior Modeling; Autism; STREAMS; Set (psychology); Hidden Markov model; Visualization


Path Modeling and Retrieval in Distributed Video Surveillance Databases

2012

We propose a framework for querying a distributed database of video surveillance data in order to retrieve a set of likely paths of a person moving in the area under surveillance. In our framework, each camera of the surveillance system locally processes the data and stores video sequences in a storage unit and the metadata for each detected person in the distributed database. A pedestrian’s path is formulated as a dynamic Bayesian network (DBN) to model the dependencies between subsequent observations of the person as he makes his way through the camera network. We propose a tool by which the analyst can pose queries about where a certain person appeared while moving in the site duri…

Settore ING-INF/05 (Information Processing Systems); Distributed database; Database; Computer science; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Computer Science Applications; Data modeling; Metadata; Set (abstract data type); Beam search; camera network; dynamic Bayesian network (DBN); path modeling; path retrieval; Signal Processing; Path (graph theory); Media Technology; Computer vision; Artificial intelligence; Data mining; Electrical and Electronic Engineering; Hidden Markov model


Fast, Reliable Head Tracking Under Varying Illumination

2003

An improved technique for 3D head tracking under varying illumination conditions is proposed. The head is modeled as a texture mapped cylinder. Tracking is formulated as an image registration problem in the cylinder's texture map image. To solve the registration problem in the presence of lighting variation and head motion, the residual error of registration is modeled as a linear combination of texture warping templates and orthogonal illumination templates. Fast and stable on-line tracking is then achieved via regularized weighted least squares minimization of the registration error. The regularization term tends to limit potential ambiguities that arise in the warping and illumination te…

Head tracking; Settore ING-INF/05 (Information Processing Systems); Computer science; Feature extraction; Detector; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image registration; Regularization (mathematics); Computer Science::Computer Vision and Pattern Recognition; Computer vision; Artificial intelligence; Image warping; Texture mapping; ComputingMethodologies_COMPUTERGRAPHICS
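
The regularized weighted least squares step named in the abstract above can be illustrated generically. The sketch below assumes the residual is a vector `r`, the warping and illumination templates are the columns of a matrix `B`, the per-pixel weights are given, and the regularizer is a plain ridge term `lam * I`; all of these names and the ridge form are assumptions of this sketch, since the truncated abstract does not specify them. It solves the normal equations (BᵀWB + λI) a = BᵀW r.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def regularized_wls(B, r, weights, lam):
    """Minimize sum_p w_p * (B[p] . a - r[p])^2 + lam * |a|^2 over a."""
    m, n = len(B), len(B[0])
    normal = [
        [
            sum(weights[p] * B[p][i] * B[p][j] for p in range(m))
            + (lam if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    rhs = [sum(weights[p] * B[p][i] * r[p] for p in range(m)) for i in range(n)]
    return solve_linear_system(normal, rhs)


# Toy problem: two templates, three consistent pixels, one outlier with weight 0.
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
r = [2.0, -1.0, 1.0, 100.0]
weights = [1.0, 1.0, 1.0, 0.0]
coeffs = regularized_wls(B, r, weights, lam=0.0)
```

Setting a pixel's weight to zero removes its influence, which is how a weighting scheme can discount outliers, and increasing `lam` shrinks the coefficients toward zero.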


ImageRover: A Content-Based Image Browser for the World Wide Web

1997

ImageRover is a search-by-image-content navigation tool for the World Wide Web (WWW). To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that se…

CBIR; Information retrieval; Distributed database; Computer science; Search engine indexing; Relevance feedback; World Wide Web; Information extraction; Tree (data structure); Robot; The Internet; Image retrieval
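
The abstract above mentions an approximate, optimized k-d tree for efficient search. As a rough illustration of the underlying data structure only, here is a minimal exact nearest-neighbor k-d tree. It is not the approximate variant the paper describes, and the function names and the dictionary-based node layout are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Recursively split the points at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared distance, point) of the exact nearest neighbor."""
    if node is None:
        return best
    dist = squared_distance(node["point"], query)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best


# Toy index of 2-D feature vectors.
features = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(features)
best_dist, best_point = nearest(tree, (9, 2))
```

An approximate search would typically relax the pruning test or cap the number of visited nodes, trading accuracy for speed; how ImageRover does this is not stated in the truncated abstract.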
