Multi-Sensor Human Tracking
From Experimental Robotics
- Hector Galan Marti
- Jake Foster
- Toby Rahilly
- Neeraj Wahi
A person-following robot able to follow a specific person in an environment containing other people.
- Presentation and paper
- Implement and test leg detection with laser scan data
- Feasibility tests – Some design decisions (such as whether to process image data on-board or off-board the robot) depend on the expected performance of the robot and its environment. We plan to experiment with the robot to see what performance is possible.
- Investigate robot processing power
- Investigate network performance
- Sensor tests
- Investigate quality/behaviour of laser range sensor
- Investigate quality/behaviour of camera
- Simple world model in place
- Incorporate leg detect module with world model
- Using world model and leg detect, have robot move towards humans that it finds
- OpenCV body recognition test/demo running on robot
- Investigate Kalman filtering
- Implement separate navigation module
- Fill out world model functionality
- Allow input from camera module
- Allow input/output from navigation module
- Incorporate initial image processing module into system
- Refine algorithms/behaviour based on tests of system
- Entire system (all modules) in place
- Not all modules will be fully featured yet, but every piece will be incorporated into the system
- Continue refining back/shoulder recognition image processing module
- Project clean up
- Start working on final demo scenario
- Start working on final paper & presentation
- Continue work on demo
- Continue work on paper & presentation
- Finish Demo, Paper, and Presentation
To allow multiple sensor types to work together in detecting people, a modular approach is required that splits the logic into intuitive pieces. A paper from Bielefeld University discusses a modular system in the context of robots tracking human attention. The paper describes using three different sensor types (camera, laser and sound) to detect people. The input from the different sensors is sent through a process of "anchoring" that updates a single world model used by the logic system. We envisage a similar system, split into three layers:
- The World Model
- Modules (Body detection, leg detection)
- Player Drivers (2D positioning, camera)
The world model contains information on the current state of encountered people in the explored world and the robot's position. Modules read world state from the world model, and post updates to update queues. An update processor takes updates from the update queues and applies them to the current world model.
The world model contains a collection of people. A single person contains a position and a list of attributes. An attribute is a tuple (Name, Value, Confidence, Time Stamp).
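The types above can be sketched directly. This is a minimal illustration in C++; the names (`Attribute`, `Person`, `WorldModel`) are our own, chosen to mirror the description, not taken from any existing codebase.

```cpp
#include <ctime>
#include <string>
#include <vector>

// An attribute is the tuple (Name, Value, Confidence, Time Stamp).
struct Attribute {
    std::string name;       // e.g. "shirt_colour"
    std::string value;      // e.g. "red"
    double confidence;      // 0.0 .. 1.0
    std::time_t timestamp;  // when the attribute was last observed
};

// A person is a position plus a list of attributes.
struct Person {
    double x, y;                        // position in the world frame
    std::vector<Attribute> attributes;  // open-ended attribute list
};

// The world model: the robot's own position and everyone encountered.
struct WorldModel {
    double robotX, robotY;
    std::vector<Person> people;
};
```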
Modules are separate pieces of logic that use and update the world model. Modules can only communicate via the world state. Modules also have access to the lower-level Player drivers that drive the actual robots. Each module will run in its own thread, allowing each to keep track of its own state.
For instance, the "Leg Detection" module contains the leg detection algorithm explained later in this report; it uses the laser Player driver and reports the positions of people back to the world model. The world model takes this information from the module's update queue and compares it to what else it knows about the current world, either updating a previously found person or adding a new person as appropriate.
This approach will allow us to add extra modules in the future, such as SLAM or other self-positioning systems, without drastically changing the design of the system.
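Since each module runs in its own thread and may only communicate via the world state, the update queues must be thread-safe. A minimal sketch, assuming a simple position-only update (the `WorldUpdate` and `UpdateQueue` names are illustrative, not part of any existing design):

```cpp
#include <mutex>
#include <queue>
#include <string>

// One observation posted by a module, to be applied by the
// update processor.
struct WorldUpdate {
    std::string source;  // which module produced this update
    double x, y;         // observed person position
};

// Thread-safe queue: modules push, the update processor pops.
class UpdateQueue {
public:
    void push(const WorldUpdate& u) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(u);
    }
    // Returns false when no update is pending.
    bool pop(WorldUpdate& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::queue<WorldUpdate> q_;
};
```

The mutex keeps a detection module and the update processor from corrupting the queue when they run concurrently.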
Leg Detection with Laser Range Scan Data
Our initial solution for human detection will rely on finding leg shaped patterns in the laser range scan data. Bellotto and Hu describe such an algorithm in their paper “Multisensor Integration for Human-Robot Interaction.” 
This paper explains that human legs appear in the scan as an alternating pattern of local maxima and minima. The following graph from the Bellotto and Hu paper shows one such leg pattern in laser scan data.
In addition to the max-min-max-min-max pattern, there are also additional constraints based on the differences between the maximum and minimum distances. These constraints are as follows:
- (1) PA – PB > 20 cm
- (2) PC – PB > 50 cm
- (3) PC – PD > 50 cm
- (4) PE – PD > 20 cm
- (5) |PD – PB| < 50 cm
These constraints have the following meanings:
- (1) and (4) ensure that the fronts of the legs are at least 20 cm closer to the robot than the background outside the legs.
- (2) and (3) ensure that the fronts of the legs are at least 50 cm closer to the robot than the background between the legs.
- (5) ensures that the fronts of the two legs are within 50 cm of each other.
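The five constraints reduce to a single predicate over the extracted extrema. A sketch (the function name is ours; PA, PC, PE are the local maxima, PB, PD the local minima, all range readings in metres, with thresholds as above):

```cpp
#include <cmath>

// Check whether the max-min-max-min-max extrema PA..PE satisfy the
// leg-pattern constraints. Distances are in metres.
bool isLegPattern(double pa, double pb, double pc, double pd, double pe) {
    const double outer = 0.20;  // 20 cm, constraints (1) and (4)
    const double inner = 0.50;  // 50 cm, constraints (2) and (3)
    const double legs  = 0.50;  // 50 cm, constraint (5)
    return (pa - pb > outer) &&          // (1) background left of legs
           (pc - pb > inner) &&          // (2) background between legs
           (pc - pd > inner) &&          // (3) background between legs
           (pe - pd > outer) &&          // (4) background right of legs
           (std::fabs(pd - pb) < legs);  // (5) leg fronts at similar range
}
```

For example, extrema of 2.0, 1.5, 2.5, 1.6 and 2.0 metres satisfy all five constraints, whereas raising the second reading to 1.95 m violates (1).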
We have implemented the data filtering algorithm and leg pattern detection algorithm as described in the Bellotto and Hu paper. The solution has been tested in the Player/Stage simulator and will next be tested on an actual robot.
The value constraints described above might need to be adjusted depending on how well the algorithm performs. We assume that the performance of each robot might be different, so we will be testing and adjusting the algorithm as necessary to ensure acceptable performance.
Person Detection with Camera Image Processing
Computer detection of a person in an image or video is a hard problem, let alone unique identification of that particular person. We are as yet unsure as to the method we will choose in our implementation, but we discuss several possible methods below.
OpenCV contains built-in feature detection through its Haar cascade classifier. The classifier requires training through the use of positive and negative images: positive images prominently contain the target feature (e.g. a face, or a body), while negative images do not. Several trained profiles are available out-of-the-box, including upper, lower and full body detection, but the performance of the algorithm for our application is unclear. The benefit of using Haar detection is its ease of use: the algorithm is well-integrated into the OpenCV framework and requires little effort to apply. We plan to implement a simple test program using such an approach by Week 6 to determine its suitability. Although OpenCV handles body detection, it does not handle identification between multiple bodies. A possible algorithm to identify bodies might create a fingerprint from the color histogram within the detected body's bounding box. Such an approach, however, would have difficulty in situations with dynamic lighting or multiple people dressed similarly (e.g. men in suits). For a demonstration of Haar body detection, please see this video.
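The histogram-fingerprint idea above can be sketched without committing to a detection library, assuming the RGB pixels of a detected body's bounding box are already available. Quantising each channel to 4 bins gives a 64-bin histogram; two fingerprints are then compared by histogram intersection (1.0 = identical). All names here are our own illustration, not OpenCV API:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t r, g, b; };

// Build a normalised 4x4x4 colour histogram from the pixels inside
// a detected body's bounding box.
std::array<double, 64> fingerprint(const std::vector<Pixel>& pixels) {
    std::array<double, 64> h{};
    for (const Pixel& p : pixels) {
        // Quantise each 0..255 channel into 4 bins (0..3).
        int bin = (p.r / 64) * 16 + (p.g / 64) * 4 + (p.b / 64);
        h[bin] += 1.0;
    }
    if (!pixels.empty())
        for (double& v : h) v /= pixels.size();
    return h;
}

// Histogram intersection: sum of per-bin minima, in [0, 1].
double similarity(const std::array<double, 64>& a,
                  const std::array<double, 64>& b) {
    double s = 0.0;
    for (int i = 0; i < 64; ++i) s += std::min(a[i], b[i]);
    return s;
}
```

The coarse 4-bin quantisation gives some robustness to small lighting changes, but, as noted above, it cannot separate two people dressed in similar colours.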
Another possible way to leverage OpenCV, should the built-in Haar classifier be insufficient, would be to make use of the cvBlobsLib extension. This particular extension provides "blob" detection, or region segmentation, based on discontinuities in the image. After segmenting the image into continuous regions, the shape and color histogram of each region could be compared to a template containing a human. The benefit of this technique is its greater customization, whereas the drawback is the additional complexity of implementing the stages beyond region clustering.
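To make the region-segmentation stage concrete, here is a minimal sketch of blob counting on an already-thresholded binary image, using an iterative 4-connected flood fill. This is our own illustration of the idea, not cvBlobsLib's API:

```cpp
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

// Count 4-connected foreground regions ("blobs") in a binary image,
// where 1 = foreground and 0 = background. Takes a copy so the
// caller's image is untouched.
int countBlobs(std::vector<std::vector<int>> img) {
    int blobs = 0;
    const std::size_t rows = img.size();
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < img[r].size(); ++c) {
            if (img[r][c] != 1) continue;
            ++blobs;  // unvisited foreground pixel: a new blob starts here
            std::stack<std::pair<std::size_t, std::size_t>> st;
            st.push({r, c});
            while (!st.empty()) {
                auto [y, x] = st.top();
                st.pop();
                if (y >= rows || x >= img[y].size() || img[y][x] != 1)
                    continue;
                img[y][x] = 2;  // mark visited
                st.push({y + 1, x});
                st.push({y, x + 1});
                if (y > 0) st.push({y - 1, x});
                if (x > 0) st.push({y, x - 1});
            }
        }
    }
    return blobs;
}
```

Each blob found this way would then feed the shape and colour-histogram comparison described above.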
In short, the ideal choice for person detection is OpenCV's built-in Haar classifier model, but the effectiveness of the algorithm must first be studied. A possible solution might use both the OpenCV Haar model along with additional detection algorithms should OpenCV alone be insufficient.
A paper from Carnegie Mellon University discusses two methods of navigating to a person once their position is known:
- Direct: The robot moves to the last known location of the target person.
- Path following: The robot attempts to follow the path that the target person is walking.
The direct method has the advantage of simplicity. The robot only needs to keep track of the last known position of the target person and move directly towards it.
The path following method has the advantage of being a natural way of avoiding obstacles: assuming the person does not move across anything the robot cannot traverse, following the human's path is a natural way of avoiding obstacles.
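One way to sketch path following is to record the target's observed positions as breadcrumb waypoints and steer toward the oldest one, discarding waypoints once reached. The class and names below are our own illustration under that assumption, not taken from the paper:

```cpp
#include <cmath>
#include <deque>

struct Waypoint { double x, y; };

class PathFollower {
public:
    // Record where the target person was last seen.
    void recordTarget(double x, double y) { path_.push_back({x, y}); }

    // Fill `goal` with the next waypoint to steer toward, consuming
    // waypoints already within `reachedDist` metres of the robot.
    // Returns false when no waypoints remain.
    bool nextGoal(double robotX, double robotY, double reachedDist,
                  Waypoint& goal) {
        while (!path_.empty() &&
               std::hypot(path_.front().x - robotX,
                          path_.front().y - robotY) < reachedDist) {
            path_.pop_front();  // waypoint reached, move to the next
        }
        if (path_.empty()) return false;
        goal = path_.front();
        return true;
    }

private:
    std::deque<Waypoint> path_;  // oldest observation at the front
};
```

Because the robot visits the positions in the order the person visited them, it inherits whatever obstacle-free route the person took.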
Open questions remain:
- Find the best following distance to the human
- Continue to avoid obstacles that were not previously on the path
- Where to go if the target is lost (path heuristics)
- Nicola Bellotto and Huosheng Hu, “Multisensor Integration for Human-Robot Interaction” 
- A. Morate, “People detecting and tracking using laser and vision”
- S. Lang, M. Kleinehagenbrock, S. Hohenner, J. Fritsch, G. A. Fink, and G. Sagerer "Providing the basis for human-robot-interaction: a multi-modal attention system for a mobile robot" 
Laser Leg Detection
- João Xavier, Marco Pacheco, Daniel Castro and António Ruano, “Fast line, arc/circle and leg detection from laser scan data in a Player driver” 
- A. Mendes, L. Conde and U. Nunes, "Multi-target detection and tracking with a laserscanner," Proc. IEEE Intelligent Vehicles Symposium, Parma, June 2004. 
Camera Image Processing
- S. Bahadori and L. Iocchi, "Human Body Detection in the RoboCup Rescue Scenario" 
- N. Harai and H. Mizoguchi, "Visual tracking of human back and shoulder for person following robot" 
- OpenCV Haar classifier usage 
- OpenCV cvBlobsLib library 
- R. Gockley, J. Forlizzi, R. Simmons, "Natural Person Following Behavior for Social Robots" 
- Kalman filtering, Embedded Systems Programming Magazine 
- Week 5
- Implementation of basic Leg detection algorithm as it comes in the paper
- Basic world model implementation begins.
- Week 6
- Testing the leg detection algorithm; mistakes found during testing led to improvements in the algorithm.
- World model implementation continues
- First test with webcams and the Pioneer camera.
- Found a solution to the "full message queue" bug.
- Week 7
- Navigation: changed from goto() to SetSpeed(). Problems with delays of GetXPos(). Understanding the behaviour of SetSpeed() and Read().
- Basic leg detection algorithm working and integrated into the (basic) world model. Tested in an empty room with one or two people; laser range 2 metres.
- Speed tests. We tried constant values for rotation and forward speed; 2 m/s is way too fast/dangerous.
- First steps testing image processing from the robot. It finds body shapes in motion, while the robot is stopped.
- Laser range 2m is too short.
- Week 8
- When the robot is too close to something, it goes backwards.
- New: the robot keeps following the same person, not just the closest one.
- Implemented failsafe crash avoidance: whenever there is something very close (<0.5 m), stop the robot. Problem: the robot does not stop immediately; it takes a few seconds, which is dangerous. SetSpeed() takes a while, but SetMotorEnable(false) stops the motors instantly.
- Image processing continues
- Week 9
- Better failsafe mode, uses FrontalCollision.
- Robot loses a person very easily and gets confused by fake legs. Trying different methods: velocity vector...
- Working on object avoidance module.
- Week 10
- Partial redesign of the system. Introducing states in the robot behaviour.
- Object avoidance implemented; testing begins. Issue: the robot's behaviour is getting complex and debugging is difficult.
- Tuning parameters of the camera module.
- Setting realistic goals for the project.