OpenCV Guide

From Experimental Robotics

Introduction

The Intel OpenCV package is an open source library of computer vision algorithms that many COMP4411 groups have used. On the web, it lives at http://www.intel.com/technology/computing/opencv/index.htm. This space is for sharing tips/tricks/pitfalls with OpenCV use.

One thing worth considering with OpenCV is that it comes with a Python interface as well as a C++ interface. Although the Python interface is not well documented and works only on Linux, it can be an excellent tool for rapid application development.

Selected applications

Capturing a camera frame from a robot

The following code snippet grabs a frame from a robot using Player/Stage (provided the camera driver is set up correctly!) and fills an OpenCV image structure with the frame data:

PlayerClient* robot = new PlayerClient("localhost");
CameraProxy* camera = new CameraProxy(robot, 1);

//Read from robot
robot->Read();

//Get the capture information
uint cam_width = camera->GetWidth();
uint cam_height = camera->GetHeight();
uint cam_depth = camera->GetDepth();

//Allocate enough memory for the image data buffer
uint8_t* imgBuffer = new uint8_t[cam_width * cam_height * cam_depth];

// Decompress image and copy it into the image buffer
camera->Decompress();
camera->GetImage(imgBuffer);

//Create an OpenCV image structure to hold the image
IplImage* img = cvCreateImage(cvSize(cam_width, cam_height), IPL_DEPTH_8U, 3);

// Copy the data from the image buffer into the frame, swapping the first and
// third channel. Rows of an IplImage are img->widthStep bytes apart, which may
// be more than cam_width*3 because of row padding.
for (uint j = 0; j < cam_height; j++) {
	for (uint i = 0; i < cam_width; i++) {
		img->imageData[img->widthStep * j + i*3 + 0] = (char)imgBuffer[cam_width * j*3 + i*3 + 2];
		img->imageData[img->widthStep * j + i*3 + 1] = (char)imgBuffer[cam_width * j*3 + i*3 + 1];
		img->imageData[img->widthStep * j + i*3 + 2] = (char)imgBuffer[cam_width * j*3 + i*3 + 0];
	}
}

//Release memory when finished processing
delete[] imgBuffer;
cvReleaseImage(&img);
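
The copy loop above swaps the first and third byte of every pixel, which suggests the Player buffer is packed RGB while OpenCV expects BGR. The following is a minimal, library-free sketch of that swap under that assumption. The function name copyRgbToBgr and its parameters are hypothetical and are not part of Player or OpenCV; dstStep stands in for the widthStep field of an IplImage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy a tightly packed RGB buffer into a BGR buffer whose rows are
// dstStep bytes apart (dstStep plays the role of IplImage::widthStep).
// Hypothetical helper, for illustration only.
void copyRgbToBgr(const std::uint8_t* src, std::uint8_t* dst,
                  std::size_t width, std::size_t height, std::size_t dstStep) {
    assert(dstStep >= width * 3);  // a row must fit into one destination stride
    for (std::size_t row = 0; row < height; ++row) {
        const std::uint8_t* s = src + row * width * 3;  // source rows are unpadded
        std::uint8_t* d = dst + row * dstStep;          // destination rows may be padded
        for (std::size_t col = 0; col < width; ++col) {
            d[col * 3 + 0] = s[col * 3 + 2];  // blue
            d[col * 3 + 1] = s[col * 3 + 1];  // green
            d[col * 3 + 2] = s[col * 3 + 0];  // red
        }
    }
}
```

Padding bytes at the end of each destination row are left untouched, so the caller decides whether to zero them first.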

Writing an image to a file:

How to use OpenCV to capture and display images from a camera

Introduction to programming with OpenCV: Working With Images

  if(!cvSaveImage(outFileName,img)) printf("Could not save: %s\n",outFileName);

The output file format is determined based on the file name extension.


Finding bounding boxes around regions of a binary image

//Linked list of connected pixel sequences in a binary image
CvSeq* seq;

//Array of bounding boxes
vector<CvRect> boxes;

//Memory allocated for OpenCV function operations
CvMemStorage* storage = cvCreateMemStorage(0);
cvClearMemStorage(storage);

//Find connected pixel sequences within a binary OpenCV image (diff), starting at the top-left corner (0,0)
cvFindContours(diff, storage, &seq, sizeof(CvContour), CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cvPoint(0,0));

//Iterate through segments
for(; seq; seq = seq->h_next) {
        //Find minimal bounding box for each sequence
	CvRect boundbox = cvBoundingRect(seq);
	boxes.push_back(boundbox);
}
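
To make the bounding-box step concrete, here is a small sketch of what computing an upright bounding box for one contour amounts to. It does not use OpenCV; the Pixel and Box types and the function name are hypothetical, and the +1 on width and height assumes the points are inclusive pixel coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Pixel { int x; int y; };
struct Box { int x; int y; int width; int height; };

// Smallest axis-aligned box that contains every point of the contour.
// Illustrative stand-in for the bounding-box step, not OpenCV's own code.
Box boundingBox(const std::vector<Pixel>& contour) {
    assert(!contour.empty());
    int minX = contour[0].x, maxX = contour[0].x;
    int minY = contour[0].y, maxY = contour[0].y;
    for (const Pixel& p : contour) {
        minX = std::min(minX, p.x);
        maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);
        maxY = std::max(maxY, p.y);
    }
    // Inclusive pixel coordinates: a single point yields a 1x1 box.
    return Box{minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```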

Color histograms

Calculating color histograms

Finding the hue component of an image

Convert the image of interest to the HSV color space and extract the hue component:

IplImage* hsv = cvCreateImage( cvGetSize(img), 8, 3 );
IplImage* hue = cvCreateImage( cvGetSize(img), 8, 1 );

//Convert img to the HSV color space
cvCvtColor(img, hsv, CV_BGR2HSV);

//Split out hue component and store in hue
cvSplit(hsv, hue, 0, 0, 0);

Calculating a hue histogram

Calculate a histogram from a rectangular region (the region of interest, or ROI) of the hue image:

//Range of hue values covered by the histogram. For 8-bit images CV_BGR2HSV
//stores hue as degrees/2, so the values lie between 0 and 180.
float hranges_arr[] = {0,181};
float* hranges = hranges_arr;

//Allocate memory for a histogram with num_bins = number of bins
CvHistogram* histogram = cvCreateHist( 1, &num_bins, CV_HIST_ARRAY, &hranges, 1 );

//Set the region of interest (of which we want the histogram) in the hue image.
//cvRect takes the x and y of the top-left corner, then the width and height.
cvSetImageROI(hue, cvRect(left, top, width, height) );

//Calculate the histogram and normalize it to facilitate comparisons
cvCalcHist(&hue, histogram);
cvNormalizeHist(histogram, 1024);

//Reset the image region of interest
cvResetImageROI(hue);
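
The binning and normalisation steps can be sketched without OpenCV as follows. This is only an illustration under stated assumptions: bins are taken to be uniform over the half-open range [lo, hi), and normalisation is taken to mean scaling the bins so that they sum to the given factor (1024 above). The function name hueHistogram is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Count 8-bit hue values into numBins uniform bins over [lo, hi), then
// scale the bins so that they sum to `factor`.
// Values outside the range are ignored. Illustrative sketch only.
std::vector<double> hueHistogram(const std::vector<std::uint8_t>& hues,
                                 int numBins, double lo, double hi,
                                 double factor) {
    assert(numBins > 0 && hi > lo);
    std::vector<double> bins(numBins, 0.0);
    double total = 0.0;
    for (std::uint8_t h : hues) {
        if (h < lo || h >= hi) continue;  // outside the histogram range
        int idx = static_cast<int>((h - lo) * numBins / (hi - lo));
        bins[idx] += 1.0;
        total += 1.0;
    }
    if (total > 0.0) {
        for (double& b : bins) b *= factor / total;  // bins now sum to factor
    }
    return bins;
}
```

Because both histograms are scaled to the same sum, regions of different sizes can be compared bin by bin.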

Calculating the derivative of a histogram

The code below finds the derivative of a histogram, defined here as the difference between successive bins of an OpenCV histogram structure:

/*
    int num_bins = number of bins in histogram
    CvHistogram* hist = filled OpenCV histogram structure
*/

double* deriv_hist = new double[num_bins-1];
for( int i = 0; i < (num_bins-1); i++ ) {
	deriv_hist[i] = cvGetReal1D(hist->bins,i) - cvGetReal1D(hist->bins,i+1);
}
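
The same difference can be written against a plain std::vector, which avoids the manual new[]/delete[] bookkeeping. This is a sketch using the definition given above (bin i minus bin i+1); the function name is hypothetical and the bin values are assumed to have been copied out of the OpenCV histogram beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Difference between successive bins: deriv[i] = bins[i] - bins[i + 1].
// The result has one element fewer than the input.
std::vector<double> histogramDerivative(const std::vector<double>& bins) {
    assert(!bins.empty());
    std::vector<double> deriv(bins.size() - 1);
    for (std::size_t i = 0; i + 1 < bins.size(); ++i) {
        deriv[i] = bins[i] - bins[i + 1];
    }
    return deriv;
}
```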

Finding the correlation between histogram derivatives

The code below calculates the correlation between two arrays containing the derivative of a color histogram. The code is adapted from OpenCV code for histogram correlation:

/*
     int num_bins = number of bins in histogram
     double* diff1, diff2 = histogram derivatives
*/

//The number of bins in the derivative of a histogram, one less than num_bins
int total = num_bins - 1;

double s1 = 0, s11 = 0, s2 = 0, s22 = 0, s12 = 0;
double num, denom, scale = 1./total;

for(int i = 0; i < total; i++ ) {
	double a = diff1[i];
	double b = diff2[i];

	s12 += a*b;
	s1 += a;
	s11 += a*a;
	s2 += b;
	s22 += b*b;
}
num = s12 - s1*s2*scale;
denom = (s11 - s1*s1*scale)*(s22 - s2*s2*scale);

//Find the correlation
double corr = fabs(denom) > 0? num/sqrt(denom) : 1;
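
For reuse, the calculation above can be wrapped in a function. The sketch below follows the same arithmetic; the function name and the use of std::vector are our own choices for illustration. As in the code above, a zero denominator (for example, two completely flat derivatives) yields a correlation of 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Correlation between two equally sized arrays, using the same running
// sums as the snippet above. Returns 1 when the denominator is zero.
double derivativeCorrelation(const std::vector<double>& a,
                             const std::vector<double>& b) {
    assert(a.size() == b.size() && !a.empty());
    const double scale = 1.0 / static_cast<double>(a.size());
    double s1 = 0, s11 = 0, s2 = 0, s22 = 0, s12 = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        s12 += a[i] * b[i];
        s1 += a[i];
        s11 += a[i] * a[i];
        s2 += b[i];
        s22 += b[i] * b[i];
    }
    const double num = s12 - s1 * s2 * scale;
    const double denom = (s11 - s1 * s1 * scale) * (s22 - s2 * s2 * scale);
    return std::fabs(denom) > 0 ? num / std::sqrt(denom) : 1.0;
}
```

The result lies between -1 (opposite shapes) and 1 (identical shapes up to a positive scale and offset).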

Detecting lines

 static IplImage*_imageGrey = cvCreateImage(cvGetSize(cameraImage), IPL_DEPTH_8U, 1);
 // Copy the greyscale version of input to grey
 cvCvtColor(cameraImage, _imageGrey, CV_BGR2GRAY);
 
 static IplImage*_imageWhite = cvCreateImage(cvGetSize(cameraImage), IPL_DEPTH_8U, 1);
 // threshold to find the white lines
 cvThreshold(_imageGrey, _imageWhite, 220, 255, CV_THRESH_BINARY);
       
 // Run the Canny edge detector on the thresholded image
 // (low threshold 120, high threshold 120*3, aperture size 3)
 cvCanny(_imageWhite, edgesImage, 120, 120*3, 3);
 
 static CvMemStorage* storage = cvCreateMemStorage(0);
 overheadLines = cvHoughLines2(edgesImage, storage, CV_HOUGH_PROBABILISTIC, 1, CV_PI/360, 30, 10, MAXIMUM_GAP);
 //This copies the lines into a new image.
 static IplImage* _imageLines // after line detection
     = cvCreateImage(cvGetSize(edgesImage), IPL_DEPTH_8U, 1);
 cvSetZero(_imageLines);
 for (int i = 0; i < overheadLines->total; i++) {
   CvPoint* line = (CvPoint*)cvGetSeqElem(overheadLines, i);
   cvLine(_imageLines, line[0], line[1], CV_RGB(255, 255, 255));
 }

Detecting circles

There are a few fiddly bits that need to be taken care of to detect circles in an image. Before you process an image with cvHoughCircles, the function for circle detection, you may wish to convert it to a gray image and smooth it. The general procedure follows, with an example of each function you need.

Create Image

Supposing you have an initial image for processing called 'img', first you want to create an image variable called 'gray' with the same dimensions as img using cvCreateImage.

IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 ); // allocate a 1 channel byte image
CvMemStorage* storage = cvCreateMemStorage(0);
IplImage* cvCreateImage(CvSize size, int depth, int channels);

  size:  cvSize(width,height);

  depth: pixel depth in bits: IPL_DEPTH_8U, IPL_DEPTH_8S, IPL_DEPTH_16U,
    IPL_DEPTH_16S, IPL_DEPTH_32S, IPL_DEPTH_32F, IPL_DEPTH_64F

  channels: Number of channels per pixel. Can be 1, 2, 3 or 4. The channels 
    are interleaved. The usual data layout of a color image is
    b0 g0 r0 b1 g1 r1 ...

Convert to Gray

Now you need to convert it to gray using cvCvtColor, which converts between colour spaces.

cvCvtColor( img, gray, CV_BGR2GRAY );
cvCvtColor(src,dst,code); // src -> dst

  code    = CV_<X>2<Y>
  <X>/<Y> = RGB, BGR, GRAY, HSV, YCrCb, XYZ, Lab, Luv, HLS

e.g.: CV_BGR2GRAY, CV_BGR2HSV, CV_BGR2Lab


Smooth Image

This is done to prevent a lot of false circles from being detected. You might need to play around with the last two parameters, which give the width and height of the Gaussian kernel; each of them must be a positive odd number.

cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 ); // smooth it, otherwise a lot of false circles may be detected
void cvSmooth( const CvArr* src, CvArr* dst,
               int smoothtype=CV_GAUSSIAN,
               int param1=3, int param2=0);

src
    The source image. 
dst
    The destination image. 
smoothtype
    Type of the smoothing:
        * CV_BLUR_NO_SCALE (simple blur with no scaling) - summation over a pixel param1×param2 neighborhood. 
          If the neighborhood size is not fixed, one may use cvIntegral function.
        * CV_BLUR (simple blur) - summation over a pixel param1×param2 neighborhood with subsequent scaling by 1/(param1•param2).
        * CV_GAUSSIAN (gaussian blur) - convolving image with param1×param2 Gaussian.
        * CV_MEDIAN (median blur) - finding median of param1×param1 neighborhood (i.e. the neighborhood is square).
        * CV_BILATERAL (bilateral filter) - applying bilateral 3x3 filtering with color sigma=param1 and space sigma=param2.
param1
    The first parameter of smoothing operation. 
param2
    The second parameter of smoothing operation. In case of simple scaled/non-scaled and Gaussian blur if param2 is zero, it is set to param1. 

Detect using Hough Circle

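The function cvHoughCircles is used to detect circles in the gray image. Again, the last two parameters in the call below may need tuning: param1 is the higher of the two thresholds passed to the internal Canny edge detector, and param2 is the accumulator threshold for circle centres (the smaller it is, the more false circles may be detected).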

CvSeq* circles = cvHoughCircles( gray, storage, CV_HOUGH_GRADIENT, 2, gray->height/4, 200, 100 );
CvSeq* cvHoughCircles( CvArr* image, void* circle_storage,
                       int method, double dp, double min_dist,
                       double param1=100, double param2=100,
                       int min_radius=0, int max_radius=0 );

Haar Classifier

A Haar Classifier is a machine learning approach for visual object detection originally developed by Viola & Jones (see the paper listed under Haar Info and Guides at the end of this page). It was originally intended for face detection but can be used for any object. The power of the Haar Classifier is that it quickly rejects regions that are highly unlikely to contain the object. It does this by using a cascade of classifiers. In this cascade, the early stages quickly reject the majority of false regions, so the detector can move on to other regions. The later stages require progressively more computational effort to reject a region. As a result, the Haar Classifier only spends substantial time on regions that are likely to contain the object being searched for.
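
The early-rejection idea can be illustrated with a toy sketch. This is not OpenCV's implementation: in a real cascade each stage evaluates a boosted set of Haar-like features over an image window, whereas here a window is reduced to a single number and each stage is a hypothetical threshold test. The names Stage and cascadeAccepts are invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A stage looks at a candidate window (reduced to one score here) and
// returns true if the window might still contain the object.
using Stage = std::function<bool(double)>;

// Run the stages in order and stop at the first rejection.
// stagesRun reports how many stages were actually evaluated.
bool cascadeAccepts(const std::vector<Stage>& stages, double window,
                    int* stagesRun) {
    assert(stagesRun != nullptr);
    *stagesRun = 0;
    for (const Stage& stage : stages) {
        ++*stagesRun;
        if (!stage(window)) return false;  // rejected: later stages are skipped
    }
    return true;  // survived every stage
}
```

With cheap stages first and expensive stages last, most windows cost only one or two stage evaluations, and only promising windows reach the end of the cascade.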

The applications needed to implement a Haar classifier are included in OpenCV, and they can be used to train a classifier for detecting objects in an image. At the time of writing, the version of OpenCV installed on the lab computers was 0.9.7. It is highly recommended to download the latest version from SourceForge, then extract and compile it in your home directory.


Steps in training and using a Haar classifier:

  • collect positive and negative training images
  • markup positive images using objectmarker.cpp
  • create a .vec file using createsamples.cpp
  • train the classifier using haartraining.cpp
  • run the classifier using cvHaarDetectObjects()

Collect Training Images

Generally you will need around 1000 images that contain the object you want to train for. When we did this for the Urban Challenge 2008 we used the Pioneer robot to collect images of a sign with the numeral 1 (see above for instructions on getting images from the robot). It is this collection of positive images that will be used to create the .vec file used to train the classifier. It is also necessary to supply the training with a number of negative images, i.e. ones that do not contain the object being trained for. Examples of positive and negative images are shown below.


[Image: Sign1 pos 4x1.jpg] Examples of positive images used to train the Haar Classifier

[Image: Negative images 4x1.jpg] Examples of negative images used to train the Haar Classifier


It is unclear exactly how many of each kind of image are needed. For Urban Challenge 2008 we used 1000 positive and 1000 negative images, whereas the earlier Grippered Bandit project used 5000. The Grippered Bandit classifier was much more accurate than ours. However, it took 3 weeks to train, whereas ours took only 2 days. These numbers gave reasonable results; however, it would be a worthwhile investigation for a future group to test the performance of the classifier using varying numbers of training images.

Mark Positive Images

This step creates a data file containing the file name of each image and the location of the object in it. The data file is created using the object marker utility. Compile the source code and then run it, giving the name of the text file to write the object location data into and the path where your positive images are located, e.g.:

./ObjectMarker output.txt /home/group03/haarimages/sign1positive/


Marking the images is a time-consuming task. Get comfortable. The images in the directory provided to ObjectMarker are displayed one at a time. To mark an object, click with the mouse at the top left corner of the object and again at the bottom right corner, drawing a bounding box around the object. Take care to always start the bounding box at either the top left or bottom right corner. If you use the other two corners, ObjectMarker will not write the coordinates to the output.txt file.

[Image: Objectmarker screenshot.png] Screenshot of the ObjectMarker utility


Press the Space Bar to confirm the rectangle. If you made a mistake drawing the rectangle, eg it was too small, just click to draw it again without pressing space. To save the marked objects and load the next image press Enter. We found it more convenient to change the key used to save and load the next image from Enter to B because B is closer to the 'Space Bar'. This can be changed in the source for ObjectMarker (ObjectMarker.cpp). When you are hand marking 1000+ images, this makes a big difference. Press Esc to close ObjectMarker and save the output.txt file.

The output.txt file will be a list of the image file names followed by the number of objects marked in the image, the coordinate of the top left corner of the rectangle bounding the object, then the width and height of the rectangle, eg,

/home/group03/haarimages/sign1positive/image000001.jpg 2 35 178 35 26 73 112 19 17
/home/group03/haarimages/sign1positive/image000002.jpg 2 41 181 28 23 76 113 15 16
/home/group03/haarimages/sign1positive/image000003.jpg 1 37 178 32 28
/home/group03/haarimages/sign1positive/image000004.jpg 2 75 113 14 14 35 176 37 32
/home/group03/haarimages/sign1positive/image000005.jpg 2 73 111 19 20 33 179 34 26
/home/group03/haarimages/sign1positive/image000006.jpg 2 78 110 18 23 35 171 49 41
etc...
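
Reading the first line above: the image contains 2 marked objects, one at (35, 178) that is 35 wide and 26 high, and one at (73, 112) that is 19 wide and 17 high. If you want to check or post-process the file, a line can be parsed as in the sketch below. The type and function names are hypothetical, and the sketch assumes that file paths contain no spaces.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct MarkedBox { int x; int y; int width; int height; };
struct MarkedImage { std::string path; std::vector<MarkedBox> boxes; };

// Parse one line of the form:
//   <path> <count> followed by <count> groups of <x> <y> <width> <height>
// Returns false if the line is malformed. Assumes the path has no spaces.
bool parseInfoLine(const std::string& line, MarkedImage* out) {
    assert(out != nullptr);
    std::istringstream in(line);
    MarkedImage result;
    int count = 0;
    if (!(in >> result.path >> count) || count < 0) return false;
    for (int i = 0; i < count; ++i) {
        MarkedBox b;
        if (!(in >> b.x >> b.y >> b.width >> b.height)) return false;
        result.boxes.push_back(b);
    }
    *out = result;
    return true;
}
```

A check like this is a cheap way to find lines where the object count does not match the number of rectangles before starting a long training run.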

Create Vector File

The marked images need to be packed into a vector file. The createsamples (/opencv_library_install_path/opencv-1.0.0/bin) utility is used to do this, eg:

./opencv-createsamples -info /home/group03/haarimages/sign1positive/output.txt \
    -vec /home/group03/haarimages/sign1positive/positives.vec -w 24 -h 24

The -w and -h options set the sample width and height respectively, the -info option specifies the file made using ObjectMarker, and the -vec option specifies the name and location of the vector file output by the createsamples utility.

Train Classifier

The classifier is trained with the utility opencv-haartraining (/opencv_library_install_path/opencv-1.0.0/bin/), e.g.:

$ opencv-haartraining -data /home/group03/trainout -vec /home/group03/haarimages/sign1positive/positives.vec -bg /home/group03/sign1_negative/negatives.txt -npos 1000 -nneg 1000 -nstages 20

Here the -data option specifies where to place the .xml file that holds the trained classifier, -vec is the vector file from the previous createsamples step, -bg is a text file containing a listing of the negative images, -npos is the number of positive images, -nneg is the number of negative images, and -nstages is the number of classifier stages to train.

Use Classifier

The result of training is a cascaded classifier that can be used by the robot to detect the object. Signs are detected in the image from the robot's camera by using the OpenCV function cvHaarDetectObjects. A video of our classifier is available on YouTube.


Tips and tricks

cvWaitKey

Certain functions of OpenCV may not behave correctly unless cvWaitKey is invoked with an appropriate time delay (in milliseconds) during each iteration of an application's main frame processing loop. For example, windows created with OpenCV may clip their contents. The following value worked correctly for our particular projects (Multi-Sensor Human Tracking and Urban Challenge 2008):

cvWaitKey(10);

OpenCV Links

OpenCV on the wiki

OpenCV on the web

Haar Info and Guides

Viola and Jones: Rapid Object Detection using a Boosted Cascade of Simple Features (pdf)

Haar on Opencv Wiki

Tutorial: OpenCV haartraining (rapid object detection with a cascade of boosted classifiers based on Haar-like features) - Naotoshi Seo

OpenCV tutorial - Haartraining [AIT Computer Vision Wiki]

How to build a cascade of classifiers based on haar-like features (pdf)
