Software To Flatten a Photographed Book?
davidy writes "I have photographed some pages of a book for reading on my PDA. This is much faster than scanning and I don't have to carry the heavy books. However, the photographed books are not as nice: curved, skewed, and shadowed, as opposed to the much flatter, cleaner scanned books. I have searched for software that can flatten the pages for better reading on the PDA. So far I have come across Unpaper and Scan Tailor. Unpaper doesn't seem to have a windows GUI, and Scan Tailor doesn't unskew well. I remember reading about Google's technique of converting books to e-books with a camera and a laser overlay. Is there any home user software that can do a similar job without the need for a laser overlay or other sophisticated (and patented) technology?"
Snapter (Score:5, Informative)
Re: (Score:3, Informative)
Re:Snapter (Score:4, Informative)
My short review: FAIL.
Re:Snapter (Score:4, Informative)
Re:Snapter (Score:5, Informative)
1. The book must be on a uniform surface.
2. All the edges of the book must be in the frame.
3. Only hold the book down from the side.
4. The photograph must be taken directly over the book.
5. Use a dSLR for best results.
Okay, so now try holding a dSLR directly over an open book that you're holding with another hand, from the side, and at a range where the entire book fits in the frame. At that point, you might as well build that book scanning rig.
In short: FAIL.
Try a heavy piece of non-glare glass (Score:2, Insightful)
Try a heavy piece of non-glare glass
Re: (Score:2)
You could hire an illegal immigrant midget to follow you around and carry the book.
Re: (Score:3, Interesting)
I trialed the software using files from a Pentax istDS (6 megapixel) DSLR, and since it worked quite well I purchased a copy. However, when I attempted to use it with files from a D700 (12 megapixel), it failed completely. So I would only recommend it for files of 6 megapixels or less.
Also, Snapter hasn't released a new version in over a year, so it's almost abandonware. It's not possible that they haven't received a bug report or two that needed fixing over the past 12 months.
Net conclusion: try
Re: (Score:2)
Have you tried qipit [qipit.com]???
It's supposed to work directly from your phone camera. And if your phone is not directly supported, it should still work if you email the picture to: copy AT kipit . com
sooo.... (Score:1, Interesting)
After photographing your book you have these huge image files that are barely readable, and now you want to spend MORE time trying to make 'em legible. Wouldn't it just be faster to scan 'em? After OCR they would be much smaller, and you could edit/annotate 'em to your heart's delight too.
Seems like you made a problem looking for a solution rather than just scanning 'em in the first place.
Re: (Score:2, Funny)
Aye, this Ask Slashdot sounds more like "Doctor, it hurts whenever I do this." The traditional response to such nonsense is, "well, then don't do that!"
Next question?
No it wouldn't be faster (Score:2)
Re: (Score:1, Informative)
Seriously, have you ever compared the time photographing a book vs. scanning it? The fastest scanners run like photocopiers. With a book, all you need is to set up a decent or ghetto rig for the camera and turn the pages. Until now, I've been shooting with a DSLR at the same lighting/camera settings for each shot, and applying a batch transform process followed by a universal levels setting, finishing up with a PDF assembly. But I'll report back on how Snapter works on the same files.
Exactly, the document scanners used in libraries and archives are pretty much high resolution cameras on an adjustable stand. They don't work like flatbed desktop scanners where you have to squash the book flat on a plate of glass. As a result they are much faster, easier on the books and you get better quality scans for OCR processing.
Re: (Score:2)
Orbital/Planetary scanners - much faster, essentially the same as your DSLR hanging above the page. Less than a second for a 'scan', rate limited by the speed of turning the pages. At 5 seconds to turn and scan, that's 12 images a minute or 720 an hour.
What the GP was talking about: Duplex feeder scanners. If you can cut the spine off the book and feed it into one of these then you're laughing. Some of them will do 200-300 full colour pages a minute
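To put the orbital-scanner figure above in code form, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the only input taken from the post is the 5 seconds per turn-and-shoot.

```python
def pages_per_hour(seconds_per_page: float) -> float:
    """Images captured per hour when one page turn plus one shot takes seconds_per_page."""
    if seconds_per_page <= 0:
        raise ValueError("seconds_per_page must be positive")
    return 3600.0 / seconds_per_page


def hours_for_book(page_count: int, seconds_per_page: float) -> float:
    """Hours needed to capture page_count page images at a steady pace."""
    return page_count / pages_per_hour(seconds_per_page)


if __name__ == "__main__":
    # 5 seconds per page is the figure quoted in the comment above.
    print(pages_per_hour(5))        # images per hour
    print(hours_for_book(360, 5))   # hours for a 360-image book (hypothetical size)
```

The same two functions can be fed a flatbed's real per-page handling time to compare the two approaches on equal terms.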
Re: (Score:2)
Re: (Score:2)
Sounds like it's time to find a place where the legal department doesn't run IT?
When you leave, tell them their tools are "defective by design" and you won't support them.
Using a camera would be MUCH better. (Score:2)
The Fujitsu fi-6230 Sheet-Fed and Flatbed Scanner [fujitsu.com] gets good reviews and the flatbed scanner is fast, but it costs $1,200, and the sheet-fed and flatbed scanners are weirdly and unnecessarily connected.
Less expensive Fujitsu scanners lack TWAIN or ISIS d
PDF v. paper contracts (Score:2)
... even contracts are scanned and the original destroyed as legal has deemed that a PDF scan of a signed document is as legally binding and secure as the actual paper.
Wow, you have a dumb legal department.
It is "as legally binding" only if it can be used to coerce the other party into admitting that they signed the document. A wise but immoral signer could take the opportunity to say they signed something else and that you must have manipulated the file. There's nothing you can do to prove that you didn't. At best you can only show that it is unlikely that you did. I hope they keep the physical paper for any truly important contracts they have.
Re: (Score:2)
Flat bed scanners - well slow.
AFAIK "flat bed scanner" simply means a scanner where you place the item to be scanned on a flat plane of glass to be scanned. The technology that actually does the scanning is irrelevant.
In which case, the Xerox printer/scanner in my office does flat bed scanning in the blink of an eye. I ran it with the lid up once to prove to myself it wasn't just a camera - and sure enough a sensor bar whips across the page.
Of course, it's not as practical for scanning bound books as the Google solution.
Re: (Score:2)
I suspect that Xerox machine you talk of will actually be moving just a light/mirror which is reflecting into a camera (or possibly a CCD arrangement, but usually they are in cheaper machines). But either way, scanning a book on a 'flatbed', whereby the book has to be placed face down, is a major chore. Turning the page means lifting the book, turning the page, then turning the book back over and repositioning it, etc. Even if the scan takes a blink, all that manual work will make it tedious at
Re: (Score:2)
Flat bed scanners - well slow. So slow it's not even worth bothering with
Any decent mid-range multi-function copier will be able to feed its flatbed scanner from the input tray, meaning you can do 25+ pages at a time (depending on the model). And while it's not as fast as a drum scanner, it's not nearly as slow as you're implying here.
Of course you have to destroy the book to use that option. But the equipment isn't hard to find: you likely have one in your office right now.
Re: (Score:2)
dude, hopefully you shot both left and right page open into 1 file. If that's the case, drag one into photoshop, record actions while you tweak. Then auto/batch the whole series with that action and see what's up.
Re: (Score:1)
Re: (Score:2)
I like paper books. Still.
Re:sooo.... (Score:5, Funny)
He can easily get his tiny 10-megapixel camera into the book store, but he would be stopped immediately if he tried to bring his scanner instead.
How about a $300 home-built scanner? (Score:5, Informative)
Some guy posted a great instructables [instructables.com] on building your own high speed book scanner [instructables.com], purposely designed to rapidly photograph book pages without curves. He even includes a software stream that OCRs the contents and sticks them into PDFs.
It's been quite popular -- so much so that he's created an online forum at http://www.diybookscanner.org/ [diybookscanner.org] dedicated to discussions from DIY book scanners all over the place, where they talk about builds, parts, and software.
I've been very tempted to build one myself just to avoid carrying heavy books around in my backpack.
Re: (Score:3, Interesting)
Something the narrator in the bookscanner video said at the end of his video really resonates with me, which is that
Lately, I've been posting like crazy about digital print and related topics such as the conversion of paper print to digital audio [mistersquid.com]. Google started their project several years back and publishers are suing to stop the
Re: (Score:2)
I think the day when eBook readers can flick pages in 0.1 seconds is the day when the shit really hits the fan for publishers. Trying to find information in textbooks when it takes 2 seconds per page turn is not very practical at the moment.
People will want to buy the latest Stephen King novel, but they don't really care about the publisher at all. If they can get it directly from his web site, they will. At best there will be a load of out-of-work editors willing to work on a per-book contract.
The key is go
Re: (Score:2)
Follow the http://diybookscanner.org/ [diybookscanner.org] link. He says he's migrating it to there.
Re: (Score:2)
I have a book I'd like to photograph - it's a large, hand-typed bound family history, done in the 70s by a relative. You can't get it on a scanner without breaking the binding. I tried years ago to scan & OCR it with a handheld scanner, but the tools available then were just too painful. I'm going to give it another shot now. So, some people may well find this discussion useful.
Anonymous Coward (Score:4, Informative)
Get a thick, heavy piece of glass and lay it atop the pages to flatten them out before you photograph them. Use ambient light and avoid the flash.
Re: (Score:3, Informative)
Also use a zoom lens and take the shot from as far as possible, to reduce curvature. The longer the focal distance, the flatter the picture will appear.
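A rough way to see why distance helps, assuming a simple pinhole camera model (an idealization, not anything the poster specified): a part of the page that bulges toward the camera by some height is drawn larger than the flat part by the ratio distance / (distance - height). The function name and the sample numbers below are illustrative only.

```python
def relative_magnification(camera_distance_mm: float, bulge_height_mm: float) -> float:
    """Image scale at the top of a page bulge relative to the flat part of the page.

    Pinhole model: image scale is proportional to 1 / distance, so a point that is
    bulge_height_mm closer to the camera appears larger by D / (D - h).
    """
    if not 0 <= bulge_height_mm < camera_distance_mm:
        raise ValueError("bulge height must be non-negative and less than the camera distance")
    return camera_distance_mm / (camera_distance_mm - bulge_height_mm)


if __name__ == "__main__":
    # Hypothetical 20 mm page bulge, shot from 30 cm and from 2 m.
    for distance in (300.0, 2000.0):
        ratio = relative_magnification(distance, 20.0)
        print(f"{distance:6.0f} mm away: bulge drawn {100 * (ratio - 1):.1f}% larger")
```

Zooming in only restores the framing; it is the longer shooting distance that shrinks the ratio toward 1.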
Re: (Score:2)
Re: (Score:2)
The flash still doesn't typically work well when photographing something behind glass, though.
Re: (Score:2)
Using the right flash allows you to use a flash.
I take closeups within 8-12 inches of a person's face. I get perfectly illuminated flash photography.
Standing back and zooming works great for low end cameras; upgrade to a real flash with a bounce or diffuser and power setting and you can do far more. Upgrade to a pair of light boxes and adjustable stand flashes and you can do everything.
Re: (Score:2)
The biggest problem is that a macro lens can be somewhat on the expensive side. If you want to stay cheap, Cosina makes a 100mm f/3.5 macro that looks and feels cheap, but has quite decent quality optics. This is widely available under various other names (Promaster, Quantaray, etc.
Re: (Score:3, Interesting)
If you're doing fixed height/lighting camera photography, you might as well just buy a cheapo screw mount macro lens + screw mount adapter.
Re: (Score:2)
Re: (Score:3, Informative)
Barrel distortion can be easily fixed in photoshop, and once you get the right settings for your first pic, you can batch process the rest of them.
Re: (Score:2, Informative)
That's not what polymeris is getting at. Wide angle lenses create strong perspective foreshortening. That's why there is a sweet spot for portrait photography: too wide makes noses look big, too long leaves no perspective. Lens distortion is easily removed because it is inherent to the lens, so you only need to calibrate once and can use the profile for all pictures shot at the same focal length. Perspective distortion depends on the scene, so there is no "calibrate once, correct all" option without creatin
Re: (Score:2)
Actually, lens distortion also depends on the scene, but usually the scene is much farther than the focal length so it doesn't change very much. But for close up shots, you'd do better with a macro lens.
Re:Anonymous Coward (Score:5, Funny)
Also, avoid being seen by the bookstore clerks.
Re: (Score:2, Informative)
It doesn't have to be glass. Target stores have these nice plexiglass photo boxes. An advantage of them over glass is that the edge of the box helps hold the opposing page up.
Re: (Score:2, Insightful)
I think such a device is called a "scanner" :P
Re: (Score:2)
If money isn't an issue and you can rip the book apart, the ScanSnap series of scanners is nice and fast. Just drop in 50 sheets at a time, and, depending on the settings, a 500 page book will be scanned in under 20 minutes. Pages stay flat, and you'll have an automatic PDF too - no conversion necessary except probably for small devices (iPhone).
I tend to think photographing pages is slow and requires either an expensive set-up, or you just get half-assed results that will drive you nuts when you actually sit
Contact Scan Tailor Author? (Score:5, Insightful)
At version 0.9.6, perhaps Scan Tailor is 96% of what you want and it's F/OSS. If you *politely* contact the author(s) and lay out your concerns perhaps you can get what you need AND help make a project better. Worth a try.
Re: (Score:2)
From the screens on the site it looks like the author does not value Windows very much (the screens are done on Linux).
If you want him to do development for the Windows platform, then you will have to ask nicely with some motivational argument (ask him how much he wants for making it work on the Windows platform, pay half in advance).
Re: (Score:2)
The home page says there is both a Windows and a Linux version. Just because the screenshots are Linux doesn't mean the author doesn't value Windows; they have to have been made on one platform or the other, and whoever made them happens to have done them in Linux.
If the Windows version was broken for some reason it still might be possible to build the Linux version under Cygwin.
QT3 (Score:3, Informative)
Re: (Score:2)
Or just check the Pirate Bay^H^H^H^H Mininova for a .pdf version and save yourself some hassle.
Seriously, if fair use allows format shifting, does that mean you are required to do the shift yourself?
You need the proper kit (Score:1, Interesting)
There are perfectly good machines that will do this for you.
They have suction systems to turn the pages of the book, and hold the book partially open so that the pages are more or less flat. There's one camera for each page, and the software that comes with the system deals with the curving, and obviously gets the lighting right to avoid shadowing etc.
OK, so maybe these machines aren't exactly cheap ... but at least one publisher is using them to photograph books (ones that are out of copyright, obvious
Re: (Score:2)
This is what the OP is asking about. Plenty of these systems ship with the combined hardware/software pack and cost thousands or even tens of thousands of dollars.
It sounds like the OP has taken a few shots of a handy book, maybe at a friend's house or whatever, and would like to just 'sort' them out before finally archiving them on to his system. Kind of like how some people like to tag all the MP3s before 'committing' them to their system. Free,
Re: (Score:2)
Yeah, I don't know why people keep suggesting such expensive and complex stuff for his "project". It isn't a project- he is just photographing a few pages.
You could just fix them in Gimp.
Anyway, as others have suggested, there is Scan Tailor, which is FOSS and multiplatform (and GUI)
http://sourceforge.net/project/screenshots.php?group_id=227253&ssid=90796 [sourceforge.net]
ahhh - book scanning (Score:4, Informative)
Not everyone has 5-10mm thick pieces of book-sized glass lying around, and it can be hard to carry that sort of thing about the place in case you need to photograph a book.
There is software called Book Restorer that does this (removes curves, 'geometrical correction', etc.) but it's pricey.
I've tried unpaper and it's pretty decent for what it does, but it does have some limitations and it's not the most convenient to use.
Deskewing, cropping, filling, etc. are all easily done, and I've even written ImageMagick batch scripts in Windows to do these things. The major trick is the curve removal.
There are various ways you can determine the curve from a scanned image. If you have the edge of the page, you can calculate the movement required to straighten that, and then apply it to the whole image. You can use text-based curve removal, similar to well-known deskew algorithms for text, but taking into account that different parts of the text may be 'more' skewed, i.e. a 'sliced' deskew rather than a rotational deskew. This needs to be done from the top to the middle and the bottom to the middle.
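The page-edge idea can be sketched roughly as follows. This is only an illustration under strong assumptions: the image is a plain list of grayscale rows, the row where the page's top edge sits in each column (`top_edge`) has already been detected somehow, and only the top edge is used, whereas the post suggests working from both top and bottom toward the middle. The function name is hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values, 0 = black, 255 = white


def straighten_by_edge(image: Image, top_edge: List[int], fill: int = 255) -> Image:
    """Shift every pixel column vertically so the detected page edge becomes a straight row.

    top_edge[x] is the row index where the page's top edge was found in column x.
    Each column is moved up by the amount its edge sags below the highest edge point.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if len(top_edge) != width:
        raise ValueError("top_edge needs one entry per image column")
    target = min(top_edge)  # reference row the whole edge is moved to
    out = [[fill] * width for _ in range(height)]
    for x in range(width):
        shift = top_edge[x] - target  # rows this column sags below the reference
        for y in range(height):
            src_y = y + shift
            if 0 <= src_y < height:
                out[y][x] = image[src_y][x]
    return out


if __name__ == "__main__":
    # Tiny 5x4 test image with a dark, curved "edge" drawn into a white page.
    edge = [1, 2, 3, 2]
    img = [[255] * 4 for _ in range(5)]
    for x, y in enumerate(edge):
        img[y][x] = 0
    for row in straighten_by_edge(img, edge):
        print(row)
```

A whole-column shift like this only removes the vertical sag; it does not undo the horizontal compression of text near the gutter, which is part of why curve removal is the hard bit.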
If you have a good 'shape' of the page, and know the true size of the page, you can use a kind of morph operator to morph the corners back to the right position and hope the image follows.
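One common way to "morph the corners back" is a four-point perspective transform (a homography). That is a swapped-in standard technique, not necessarily what the poster had in mind, and the sketch below only solves for the mapping and applies it to points; all names and corner coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corners are degenerate (for example, three in a line)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_corners(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return h0..h7 of the perspective map sending the 4 src corners to the 4 dst corners."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four corner pairs are required")
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(h: Sequence[float], p: Point) -> Point:
    """Apply the perspective map to one point."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


if __name__ == "__main__":
    # Hypothetical photographed page corners (pixels) and the true page rectangle.
    photographed = [(12.0, 8.0), (95.0, 15.0), (102.0, 140.0), (5.0, 130.0)]
    true_page = [(0.0, 0.0), (90.0, 0.0), (90.0, 130.0), (0.0, 130.0)]
    h = homography_from_corners(photographed, true_page)
    for corner in photographed:
        print(corner, "->", warp_point(h, corner))
```

This fixes keystone and skew exactly at the corners, but a homography keeps straight lines straight, so any curl between the corners is left in place; that is the "hope the image follows" part.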
Using a Greyscale/colour source will work better than a black and white source image in general.
The other option, if the scanned/photographed page is actually of reasonably good quality but is just a bit squint, is to OCR it to a PDF and generate a new document using the OCR text, which will be pin-sharp, compress a lot better and be easier to use, although it may not be ideal if there are too many errors.
Re: (Score:2)
A 6mm 12"x9.5" piece of regular (not anti-glare) glass costs less than $10 to get cut. However, I haven't figured out a good way yet to use the glass without damaging the book's spine (while also operating the entire contraption quickly). That's why I want to use "heavy" image manipulation myself.
Re: (Score:2)
Some cheap cameras can do it (Score:1)
Re: (Score:2)
That's a Nikon CoolPix 5200. p38 in the manual.
"Scene Mode" "Copy"
"Copy provides clear pictures of text or drawings on a white
board or in printed matter such as a business card."
"Colored text and drawings may not show up well in the final picture"
Look at some Google books (Score:4, Insightful)
I remember reading about Google's technique of converting books to e-books
My suggestion is that you look at some of the Google books that are on-line. I have, and they show the problems that you mention and more: curved pages, dark areas, and even text that is distorted and harder to read than most captchas. Whatever you have read (and yeah, I remember reading it too), it doesn't seem to actually be viable in practice. Sure, photographs are easier than scanning, particularly if you do it fast and cheap, but the result is poorer. If you can scan the books without damaging them I suggest you go back and do that.
Depending on the book of course... (Score:1)
I'm surprised that Google doesn't do this, it would be SO much faster than scanning each page one at a time.
Another option, see if Amazon sells the book in digital format. Sometimes a few bucks saves a world of headache.
Now if these are expensive textbooks or reference books, or don't belong to you, the above may not apply, just my first thought on the subject.
Re: (Score:2)
I'm surprised that Google doesn't do this, it would be SO much faster than scanning each page one at a time.
Except that they use high-quality book scanners that can go through a hundred pages in a few seconds flat, and it would cost a fortune for them to do it this way, and no libraries would let them touch any archival materials which is half the point.
Re: (Score:2, Funny)
I'm surprised that Google doesn't do this, it would be SO much faster than scanning each page one at a time.
Yeah, I don't see why they don't just slice the spine off that one-of-a-kind 16th century book so they can scan it in. That's such an easier way to do it. And I have NO idea why a library would have an issue with that.
Are people around here really this dumb?
Re: (Score:2)
You just described a major plot point in Vernor Vinge's book "Rainbows End" except that instead of cutting off the spine they actually used paper shredders to cut up the books and then used computers to put the books back together. It was incredibly fast (no need to cut spines and feed pages, just sh
Casio Exilim digital cameras built-in mode (Score:4, Interesting)
Use a homemade book scanner. (Score:5, Informative)
Re: (Score:2)
Re: (Score:2)
Decapod project (Score:2)
Take a look at decapod-project.org for a complete system. Note that software dewarping is quite a hard problem, but it is part of decapod.
If you had a Mac... (Score:2)
Hugin could help (Score:5, Interesting)
where are my mod points when I need them ? (Score:2)
So where was the magic (Score:2)
The link was very interesting, but gave no clue as to how you take the perfect image of a book with a crease down the middle (which is the original starting point of the question), and then get an image with no crease. They demonstrated a result with no crease, that was very convincing - but I didn't see just how they got rid of it. Simple cropping of each side?
Re: (Score:2)
I don't think there was a crease in that set of images. It appeared like it because the left edge of the right image was near the edge of the scanner, and didn't get a lot of data (maybe? I'm not really sure about that). I think this tutorial is more about how to combine two scanned images that are each part of the full image. Notice how control points were dropped that indicate that the images share an overlap. They aren't two separate pages, just a left and right portion of the same page/image.
As to t
Re: (Score:2)
See also this LWN article: http://lwn.net/Articles/351053/ [lwn.net]
He shows how he straightened a picture he took and also mentions the SIFT algorithm which is disabled by default because of software patents. He says:
Your editor, being a daring sort of person, decided that he wanted to find out just what sort of functionality is being denied to hugin users by the oppressive US software patent regime. As it happens, Fedora users can get around patent-based repression by installing the autopano-sift-C package from the rpmfusion repository and tweaking the program preferences to use the real autopano tool. The difference is striking: with autopano-sift-C installed, the program proceeds immediately from image selection to a preview window; the whole "control points" and "optimization" process just sort of goes away. This package does a great job of finding control points, at least on your editor's sample image set.
Here you go (Score:2, Funny)
You're welcome.
Much faster than scanning? (Score:2)
Does that include the time needed to now fix the artefacts that scanning doesn't get you?
If you have a scanner, then why don't you just use that? And if you do not have a scanner, why even bother with the speed comparison and not settle for "I don't have a scanner"?
There's a reason that scanning takes time compared to just pointing a camera at a book and snapping a picture. You've now found one of those reasons. Congratulations.
Now you just have to find out if the up front time savings are greater than the
Re: (Score:2)
I want I want (Score:2, Insightful)
I want software that will do it. For free.
Can I do it without a camera, too?
Actually I'd like it if there were some way I could get paid for using the software.
Can I just put my iPhone/PDA on the book and have it all sucked in via osmosis?
and then have the book read back to me w/ Morgan Freeman as the narrator?
Is there
Re:I want I want (Score:5, Insightful)
It never hurts to ask; if there were an easy magic software solution to do X, wouldn't you rather find out about it now, instead of after doing X?
Re: (Score:2, Insightful)
Re: (Score:2)
Good thing we haven't had any enlightened lazy people in the past who actually solved problems similar to this. If they had, we'd be able to keep things cool without having to constantly refill our iceboxes, calculate things without having to do them by hand or mentally, or even have a machine write t
Try using one of the Planon Pen Scanners (Score:2, Interesting)
Re: (Score:2)
Low nerd-factor approach (Score:2)
Not as fun as figuring out some massive kluge to do the job, but if it's a book that you can easily find used copies of just cut the binding off or remove the pages with a razor blade, and photograph them flat.
Some of the ebook-torrent scans use that method. It destroys the book but makes for cleanly readable scans.
How about putting the book under glass? (Score:2)
Of course, you'll still have to deal with lens effects like trapezoidal issues, or skew, but these I imagine are much easier to deal with than curl.
Personally, I've switched to mostly
OpenGL (Score:2)
What you get out of it (Score:2)
I'm sorry but this question is just ridiculous, or if not actually ridiculous then wasteful.
Let me see if I understand this correctly: The author has photographed a selection of pages from a book that he wants to have available to read at his leisure. He doesn't want to carry the book around because it's too heavy, so what he wants to do, because the photographs aren't as pretty as scanned copies but scanned copies have artifacts of the scanning process, is to develop a home rig to scan the pages he's in
Re: (Score:2)
Did I miss somethin
ABBYY FineReader? (Score:2)
UGGGHHH (Score:2)
For $deity's sake (Score:2)
Re: (Score:2)
Allow me to extract the informative part of this.. (Score:3, Insightful)
Unpaper may work for you if you're not afraid to deal with a CLI.
There's no harm in giving it a look. Assuming it's properly designed I can see it being quite elegant.
Re: (Score:2)
I don't know; for many purposes ImageMagick is perfect for my needs. Unlike the GIMP (as of 2.6.7), it supports 16-bit channels in images, which are useful for grayscale work on my side (256 grays is too limiting). It's also a lot faster for doing work over a lot of images in sequence.
Re: (Score:2)
Exactly.
How you stitch together a book doesn't matter if it's for your own personal use.
Now, if you want to go commercial you've got quite a few things to figure out.
Regardless, what you're wanting to do is basically orthorectification. There is an open source package out there that does that. Figuring out how to do so would be left to you, but I'd recommend using some sort of yellow projection grid (or red from a red laser) to map the distortion and correct it by treating it as a DEM.
Poor man method- so
Re:What does "and patented" have to do with it? (Score:4, Informative)
Really, if you are doing this for yourself and have no intention of selling your product, then you are free to use their method all you want.
35 U.S.C. 271 (a) Except as otherwise provided in this title, whoever without authority makes, uses, offers to sell, or sells any patented invention, within the United States, or imports into the United States any patented invention during the term of the patent therefor, infringes the patent.
Yes, it's extremely unlikely that anyone would ever sue you for infringing a patent in the privacy of your own home because the damages would be minuscule and it would be very difficult to prove infringement, but it's still an infringement.
patented tech is teh evil (Score:2)
Of course, neither his digital camera nor his PDA use any patented technology.
Re: (Score:2)
Re: (Score:3, Informative)
Re: (Score:2)
The macro recording is more or less planned for GIMP 3.0
http://www.gimp.org/docs/userfaq.html#Macro [gimp.org]