Algorithms for Motion Tracking?
Keith Handy asks: "I seem to be unable to find algorithms and/or open source programs that will do accurate motion tracking, i.e. you mark a point on an object in frame 36, and the program can follow that point on that object through all the frames following it. This is useful not just for analyzing motion, but also for interpolating/extrapolating frames of video -- so if you had something at only 15 fps, you could generate in-between frames (which are not just crossfades between the frames) and actually smooth the effect of the motion. Not something so complicated as to get into actual physics -- just something that will indicate where (in 2D only) that part of the object has moved from one frame to the next, for any given point in the whole picture. And for that matter it doesn't have to be 100% accurate, just any means of generating a reasonable motion-flow map." This doesn't strike me as an easy algorithm to develop, but are there any papers, online or offline, that might describe an algorithm that can at least track objects in an image?
"In other words, I want something that does this,
in order to write code that will do things like this and this. I already know how to write code to blur and warp images, so to be able to track motion would give me (and you) the same capabilities as these expensive plug-ins.
Anyone know any other resources, directions, or existing code I could look into to find out more about how this works, so I can incorporate it into my own programming instead of paying hundreds or thousands of dollars for limited, proprietary use of the technology?"
Book (Score:5, Informative)
Re:Book (Score:2)
--
Evan "Tired, and I'm not gonna rewrite that for easier parsing" E.
Video compression (Score:3, Informative)
Which is exactly what MPEG does... very crudely. The MPEG approach is block matching: compare each block of pixels (a 16x16 macroblock) against candidate positions within a search window of the previous frame, and keep the best match as that block's motion vector.
The fact that MPEG doesn't use anything more sophisticated than this suggests to me that there probably aren't any algorithms which consistently work better.
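For anyone who wants to experiment, here is a rough sketch of that kind of block matching in Python/NumPy. The 16x16 block size, the search range, and the sum-of-absolute-differences cost are all just common choices, not anything mandated by MPEG:

```python
import numpy as np

# Brute-force block matching: for one block in the current frame, find the
# best-matching block in a small search window of the previous frame.
def best_match(prev_frame, cur_frame, bx, by, block=16, search=8):
    """Return the (dx, dy) motion vector for the block whose top-left
    corner is at (bx, by) in cur_frame."""
    target = cur_frame[by:by + block, bx:bx + block].astype(np.int32)
    h, w = prev_frame.shape
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + block > w or y + block > h:
                continue
            candidate = prev_frame[y:y + block, x:x + block].astype(np.int32)
            cost = np.abs(target - candidate).sum()   # sum of absolute differences
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best
```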
Re:Video compression (Score:2, Informative)
Re:Video compression (Score:2)
Not that I know of, but it could also be that there aren't any algorithms that work better given the horsepower available in many devices. There could be many algorithms that work much better if you assume a dual Athlon 1900+ to execute them.
Re:Video compression (Score:1)
I'm sure we will see something like this in the future, but even now, with our 2 GHz machines, point tracking is pretty slow.
Download a trial copy of Shake [nothingreal.com] and try some tracking for yourself. Just trying to follow 4 points takes about 1 second per frame; imagine how long it would take to process every pixel (or even 8x8 blocks) in a 30-minute video!
openCV (Score:5, Informative)
There is also a Yahoo Groups support forum here [yahoo.com].
The original Intel pages are here [intel.com].
cheers,
bjpirt
what about intercorrelation ? (Score:3, Interesting)
Compute the intercorrelation (cross-correlation) function of two neighbouring frames; the maxima are more or less where the objects have moved.
I have only used this method on artificially generated frames, i.e. one frame with a translation and noise added. Even there the correlation peak drops off quite fast, so on natural images there would be a lot of fiddling to do.
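If you want to play with this, here is a small sketch of the Fourier-domain version (phase correlation) for finding the dominant translation between two greyscale frames; the normalisation epsilon and the wrap-around handling are my own choices:

```python
import numpy as np

def estimate_translation(frame_a, frame_b):
    """Estimate the dominant (dx, dy) shift between two equally sized
    greyscale frames via phase correlation (normalised cross-correlation
    computed in the Fourier domain)."""
    fa = np.fft.fft2(frame_a)
    fb = np.fft.fft2(frame_b)
    cross_power = fa * np.conj(fb)
    cross_power /= np.abs(cross_power) + 1e-12   # keep only the phase
    corr = np.fft.ifft2(cross_power).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond the midpoint wrap around and mean a negative shift.
    h, w = frame_a.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dx, dy
```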
Try (Score:1)
CJC
KLT Feature Tracker (Score:5, Informative)
An implementation can be found here:
http://vision.stanford.edu/~birch/klt/
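OpenCV (mentioned in another comment) ships a pyramidal Lucas-Kanade tracker in the same family as KLT. A minimal point-tracking sketch using its Python bindings, with a made-up file name and arbitrary parameters:

```python
import cv2

cap = cv2.VideoCapture("input.avi")          # hypothetical input file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Pick good corners to track (Shi-Tomasi features).
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Follow each point into the new frame with pyramidal Lucas-Kanade.
    new_points, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                       points, None)
    points = new_points[status.flatten() == 1].reshape(-1, 1, 2)
    prev_gray = gray
```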
Identify Features and Label (Score:1, Interesting)
Basically, identify all "peaks" (whatever feature you're interested in) and sort them. Start with the most outstanding feature and associate its nearest neighbours with it. Repeat many times. You will end up with a data structure of references which produces a map of islands and isthmuses, depending on how far down you look.
Attach a "label" (unique ID) to each significant feature in the frame.
Repeat for the next frame.
Compare significant features. Using some sort of threshold, you can attach a confidence level that you're looking at the same feature as in the previous frame.
That's a simplistic overview, but I did it many years ago for looking at the output of stellar formation simulations.
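Here is a toy sketch of that label-and-match step (Python/NumPy). The "peak detection" is just a threshold stand-in, and the distance threshold is arbitrary:

```python
import numpy as np

def detect_peaks(frame, min_value=200):
    """Return (y, x) coordinates of pixels above a threshold
    (a crude stand-in for real feature/peak detection)."""
    ys, xs = np.where(frame >= min_value)
    return list(zip(ys, xs))

def match_peaks(prev_labeled, new_peaks, max_dist=10.0):
    """prev_labeled maps label -> (y, x) from the previous frame.
    Returns the same kind of mapping for the new frame: a peak keeps its
    old label if it is close enough to a previous one, otherwise it gets
    a fresh label."""
    matched = {}
    next_label = max(prev_labeled, default=-1) + 1
    for p in new_peaks:
        best_label, best_d = None, float("inf")
        for label, q in prev_labeled.items():
            d = np.hypot(p[0] - q[0], p[1] - q[1])
            if d < best_d:
                best_label, best_d = label, d
        if best_label is not None and best_d <= max_dist:
            matched[best_label] = p          # confident it is the same feature
        else:
            matched[next_label] = p          # new (or unmatched) feature
            next_label += 1
    return matched
```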
tracking motion (Score:3, Interesting)
http://motion.technolust.cx [technolust.cx]
there are some examples and a sample video which demonstrate tracking "motion."
Re:tracking motion (Score:1)
related GPL software (Score:1)
"Motion uses a video4linux device and detects changes in the image. If a change is detected a snapshot will be taken. "
MPEG (Score:2)
For other applications (e.g., colorization), you need somewhat better segmentation. Doing this well in the general case is still a research topic; but that's good: you can get lots of research software from around the net that does this sort of thing. Look for keywords like "computer vision", "motion", "segmentation", and "tracking" on Google.
some brainstormed ideas... (Score:2, Interesting)
Break up the image into N x N submatrices and do a Fourier transform on each subsection of the image. Then do this for the next frame, calculate the phase differences between each pair of blocks, and use linear/cubic/etc. interpolation to generate the frames in between. Not too difficult, and I think there is even a 2-D FFT library located somewhere on download.com. This might introduce a couple of artifacts, but if you're doing high-framerate video, it shouldn't be too noticeable.
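A rough sketch of that blockwise idea, reusing the phase-correlation trick from the intercorrelation sketch above on each block (the block size is arbitrary, and boundary effects and outliers are ignored):

```python
import numpy as np

def block_motion_map(frame_a, frame_b, n=32):
    """Estimate one (dx, dy) translation per n x n block, giving a coarse
    motion-flow map between two greyscale frames."""
    h, w = frame_a.shape
    flow = np.zeros((h // n, w // n, 2))
    for by in range(h // n):
        for bx in range(w // n):
            a = frame_a[by * n:(by + 1) * n, bx * n:(bx + 1) * n]
            b = frame_b[by * n:(by + 1) * n, bx * n:(bx + 1) * n]
            spec = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
            spec /= np.abs(spec) + 1e-12
            corr = np.fft.ifft2(spec).real
            dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
            flow[by, bx] = (dx - n if dx > n // 2 else dx,
                            dy - n if dy > n // 2 else dy)
    return flow
```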
or even more far-fetched:
Assuming that the translations of the objects in the image plane between frames are small and uniform enough, you might also be able to pull this off with a properly trained neural network on subsections of the image (so each individual feature fits approximately in each subsection). Neural networks can do non-linear regression, and their outputs are continuous, so I figure if you train it right, it'll give you what you want.
good luck
just use the MPEG algorithm (Score:4, Informative)
Once you have done this for every block in the original frame, you have a set of motion vectors from which you can construct an intermediate frame.
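A rough sketch of that last step, assuming you already have one motion vector per block (a real interpolator also has to handle holes, overlapping blocks and occlusions, which this ignores):

```python
import numpy as np

def interpolate_frame(prev_frame, vectors, block=16):
    """Build a frame half-way between prev_frame and the next frame by
    moving each block half-way along its motion vector.
    vectors[by, bx] is the (dx, dy) motion of the block at (bx*block, by*block)."""
    mid = np.zeros_like(prev_frame)
    h, w = prev_frame.shape
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dx, dy = vectors[by // block, bx // block]
            ty = int(round(by + dy / 2.0))
            tx = int(round(bx + dx / 2.0))
            ty = max(0, min(h - block, ty))   # clamp to the frame
            tx = max(0, min(w - block, tx))
            mid[ty:ty + block, tx:tx + block] = prev_frame[by:by + block,
                                                           bx:bx + block]
    return mid
```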
voice from academia (Score:1)
http://robotics.stanford.edu/~birch/klt/ [stanford.edu]
I think that different MPEG compression schemes track motion differently - some use a brute-force method. Another treats your image like a linear function so that it can search for the region of interest in the next image using a "Newton's method"-like scheme - much more efficient than brute-force pixel comparison. I could be wrong, though - I wasn't really paying attention in class.
Possible Hardcopy Resource (Score:1)
I picked up this book (and many other computer and math books) at my local Coles bookstore for $2-$5 CDN each... I guess they were trying to get rid of them. I don't know if you'll be able to find a copy, but here's the info anyway:
Lecture Notes in Computer Science
Volume 1310
Image Analysis and Processing
Alberto Del Bimbo (Editor)
Published by Springer
ISSN: 0302-9743
ISBN: 3-540-63507-6
The editor's email address is listed on the cover page: delbimbo@aguirre.ing.unifi.it, so you might be able to contact him to see where you could find a copy... Good luck!
Holy Grail... (Score:1)
interesting aside (Score:2)
Also, the BBC have something camera-based in the works:
http://www.bbc.co.uk/rd/tour/virtualproduction.
Actually... (Score:1)
That way, if the structure of the picture changes, with more or less pixels of the same color, the
Will this work?
Re:Actually... (Score:1)
no.
Re:Actually... (Score:2)
But try searching for webcam motion detector [google.com] on Google and you will find some useful stuff.
Re:Actually... (Score:1)
MSE = (Sigma over i,j of [f(i,j) - f'(i,j)]^2) / N^2
where the image is N x N pixels (so N^2 is the total number of pixels). PSNR in decibels (dB) is then calculated as
PSNR = 10 log10 (255^2 / MSE)
(equivalently, 20 log10 (255 / sqrt(MSE))). A higher PSNR between two images indicates a greater degree of similarity; the PSNR of identical images is infinite. Although this calculation is a useful way of determining overall similarity between images, it does not necessarily correlate particularly well with human judgments. One crucial limitation of PSNR is that it is a global measure that treats all deviations the same: a slight, barely perceptible, uniform degradation over the entire image may produce the same PSNR as an obvious, severe degradation in a small, prominent region of the picture.
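As a concrete example, the calculation for two 8-bit greyscale images in Python/NumPy:

```python
import numpy as np

def psnr(img_a, img_b):
    """PSNR in dB between two 8-bit greyscale images of the same size."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")       # identical images
    return 10 * np.log10(255.0 ** 2 / mse)
```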
Alternatively, you could look into DCTune, a proprietary algorithm developed at NASA. It is based on some principles of the human visual system and gives a score that is better correlated with human judgments.
VideoOrbits will do this (Score:1)
Our lab is doing very similar work. We've interpolated frames of video from an 8 fps image sequence (taken with a wearable computer) into a smooth 30 fps video sequence, using VideoOrbits. There's a short video example available somewhere on my homepage. Perhaps this would be of interest to you. VideoOrbits is freely available at http://wearcam.org/orbits [wearcam.org].
VideoOrbits runs at over 11 fps on a 700 MHz dual-processor machine. It's also a featureless tracking algorithm, so no point correspondences need to be identified.