OCR!

Entropy

榜眼
So my first try using this in the field was successful. I was able to read characters on a laminated menu with colored background. Not a bad start. But makes for lazy students! :lol:

mikelove said:
Though an option to move the overlay might make sense too... I'm not quite sure where we could fit it on that already-overcrowded screen, though.

In the middle third of the display. IOW, divide the current camera space in half, recognizer on top, recognized characters underneath.

Entropy said:
Will I be able to download it after I buy it on my phone? Much easier to test still image rec on an iPad.

mikelove said:
Unfortunately no - that does work on the iPod Touch (though you have to load the OCR data files manually and we haven't posted the link yet) but it won't even let you in on the iPad; as I said, we didn't want to spend time designing / testing an iPad interface for something we weren't yet selling on iPad.

That's annoying, since testing the still image recognizer would be much easier on the iPad, for a variety of reasons, including but not limited to:

I can't figure out how to zoom the chosen image. If I try to zoom, I trigger the recognizer. And the drag handles are too big relative to the display, while the image is physically small, which together make it hard to select the characters I want.

The drag handles can obscure the character I'm trying to recognize--since I can't zoom the image, I have to shrink the selection. If I make it too small, I can't tap on it.

When the text is recognized, I can't compare it to the image, since it takes up the whole screen.

mikelove said:
We're actually leaning more towards something like the document reader, but with a photo - tap on a character in an image and get a popup definition bubble just like in the document reader. A key advantage to that is that it keeps the actual image right in front of you,

On the iPad, you could divide the screen into panes to allow the user to edit text while recognizing.

So, if not the iPad, can I run Pleco using the iPhone development environment on a MacBook Air to do still-image recognition that way? :D

(That's actually an interesting idea, since the phone simulator would presumably be phone-sized leaving the rest of the display for my own uses.)

~ Kiran <entropy@io.com>
 

sharnyo

Member
Mike,

I have an iPod touch 4G and wish that I could use the OCR. I know that you've indicated that the quality is not good given that the camera doesn't autofocus.
But still I think the OCR should work fine in some situations where the image is still in focus. And hence IMO, it should be made available for iPod touch owners - of course with the caveat.
Would you still consider releasing it for the iPod touch 4G?

EDIT: Oops. I saw that it's also been released for iPod touch. But why can't I download the add-on? It said that it's not supported.
 

djbass

Member
mikelove said:
That's odd... does the focus work correctly in the built-in camera app? It could be that there's something awry mechanically with the camera itself... if focus works OK in the built-in camera, does it help in Pleco at all if you shake your iPhone? For some reason that seems to convince the phone it needs to re-focus, at least on our test device.

Not too sure anymore, now it works.
 

Entropy

榜眼
One more issue. I hit pause when the right character comes up in the flickering green mess, and the recognizer doesn't actually trust my judgment and thus doesn't display the translation for that character.

(I have no idea if the one it displayed was among the choices for that one, or the previous one.)

~ Kiran <entropy@io.com>
 

kun4

举人
The OCR is very nice. But maybe a simple switch "Simplified" - "Traditional" would cut down on the number of false positives. You might even tune the OCR output based upon geolocation.
 
*OCR for iPod Touch 4g*

All the buzz around the OCR feature has gotten me really excited about using it. I have an iPod Touch 4g with a camera, and I paid for the Professional Bundle. Please let me have the option to try to use OCR!

If you think OCR doesn't work well enough to sell as a standalone option for iPod Touch 4g users, at least let people who purchased a bundle have access to it as part of the bundle. A sort of working OCR is a million times better than no OCR. Please :!:
 

dustpuppy

榜眼
Feature request:
could you let the user copy all the characters in the OCR history? that way i can copy paste a string of traditional cantonese characters into the cantodict online parser. this would alleviate the need for a cantonese dictionary for me.
thanks again for this awesome feature. it really feels like we're in the 21st century now.
 

tennonn8

秀才
mikelove said:
...that does work on the iPod Touch (though you have to load the OCR data files manually and we haven't posted the link yet)...

So, if I understand correctly, the OCR package will be made available for the iPod Touch too, but it is not available yet. Is that right? When (approximately) will this link become available for us? I suppose, it will be posted on this page: http://www.pleco.com/ipdirectdownload.html, right?

I hope it will become available for us (iPod Touch users) indeed. I'm 100% ready to buy it at my own risk in order to try it out, even if only on big size characters. (Well, in the worst case, if it doesn't work at all, I'll be happy to "donate" a little bit more for my favorite program, even if it's just a few more dollars! :D :D )

My main target for the OCR is basically the subtitles of Chinese movies and series. The size of characters in that case is significantly bigger than that of a regular printed text, especially on a large display, so maybe there's a hope for the iPod Touch users as well here. Well, if it doesn't work, that's OK. Then maybe I'll give some more thought to the idea of buying an iPhone, even though I don't need it for anything else, especially for 700+ Euro. :shock: If I had a choice, I'd better give more money to the cool guys from Pleco rather than help Apple become a new world monopolist! :) :wink:

And by the way, thank you very much for other improvements and bug fixes in 2.2! Some of them were very important for me and I was really looking forward to them.
 

mikelove

皇帝
Staff member
IMPORTANT: if you have an iPhone 3GS, DO NOT upgrade to iOS 4.2 (due to be released any day now) yet; there appears to be a bug in Apple's video capture code that screws up OCR, and while we've come up with a workaround it'll take at least a few days for Apple to review and approve it.

(will reply to other posts soon, just wanted to get that posted since it's not quite clear when 4.2 will hit - we've already submitted our workaround update (2.2.1), and are preparing a bug report for Apple (the problem is on their end) in the hopes that they'll fix it, but it seems unlikely they'll fix the problem or approve our update before OS 4.2 is out)
 
I finally got the in-app purchase to work (shakes fist at Apple) and OCR is incredible. I purchased the flashcards upgrade as well and the two together make the most powerful Chinese language learning combination I've used. I will definitely write a review of OCR on my website, chengduliving.com

Thanks Mike and anyone else who was involved in the development of this. I've worked on barcode reader apps for iPhone and I know how advanced the technology in Pleco is. Very impressive stuff!

I'm looking forward to updates which refine the algorithm and make operation between features (flashcards, etc.) smoother. Keep up the good work!
 

anchan42

探花
I gave Pleco 5 stars and also put a comment in the App Store, but the comment is still not showing. I could see only two comments from two users in the App Store. That's really strange; seeing the amount of buzz the OCR generated here, there should be more than 2 people wanting to put a few words in an app review.

Anyway, OCR is just great. It works really well with standard fonts in textbooks and newspapers. I tried it on a New Year sign I have in my house but it did not pick the characters up too well, maybe because of all the decorative patterns that surround the characters. I also tried it on some handwritten notes written by a Taiwanese person, but that seems to be too much for the OCR. This is in no way the OCR system's fault. It is just me being happy with my new toy/tool and pushing it to the limit, trying to see what it can do.

I showed it to a few English expats around here and they were all excited and agreed that this is extremely useful when traveling in China.
 

Entropy

榜眼
A few more issues.

kun4 said:
maybe a simple switch "Simplified" - "Traditional" would cut down on the number of false positives.

That certainly seems like a good idea to me. I would expect that in many cases it could be automated, since presumably there are characters that are clearly simplified, like 鸡. And, other user-defined limitations would also be useful, such as being able to specify that you're translating a menu. (That would be facilitated by having a specialized dictionary, I suppose.)

Why is the recognizer box centered, instead of being pinned to the upper left corner?

In Lookup mode, when I have a line of text in the box, only the first character ever seems to turn blue, the rest remain green (but the characters do settle down after a while.) I'm guessing that's the desired behavior, trying to recognize only one word, but I don't want to switch to capture mode just to see the translation of the entire selection. Of course I could take a screenshot and use still image mode.... :lol:

(My test case was 田鸡. Maybe you don't think that's a word. :D But if I have a block of code in the recognizer, I really want that block to be recognized, not just someone else's idea of what a word is. I suppose I should try to find an instance of 烧鸭 and see whether it recognizes that as "fever". In fact, this makes me think there should be a character-by-character option. Furthermore, if Pleco thinks it recognizes a word, it should attempt to recognize the next word, and so on until it runs out of text.)

In capture mode, I can capture a whole line of text, but none of it ever turns blue. Am I wrong in thinking that the text should turn blue when the recognizer is done with its work?

Is there a technical reason I need to switch between lookup, flashcard and document capture modes? Modes are bad. I'd rather just have three buttons in the toolbar.

In capture mode, turning on "hide unused chars" leads to *no* characters being displayed, green or blue. There's still a translation, though.

Camera shake is a huge problem FWICT. I'm still thinking that working from a picture is going to be much easier than trying to get the green flicker to settle down. In the future, I'd like to be able to remotely control a real digicam with stabilization. In fact, that's one more thing you could do if you wrote a version for OS X. Being able to put the camera onto a tripod while controlling it remotely from an iPad would also be cool.

In capture mode, I'd like to be able to save both the raw image and the translation, so I can compare them later.

I don't want to have a bunch of English characters recognized, even if they're written on the menu.

If there's formatting, I want captured text to reflect that. (I don't know whether it already does.) I'd like to be able to capture a whole menu page at one time, and get individual items on individual lines. In fact, it would be great if Pleco could recognize vertical text and rearrange it as horizontal lines and vice versa (the latter in the editor, not OCR.)

If I go to the settings, they should reflect the settings for the module I was using when I chose to go to the settings.

BTW, for the most part the recognizer does a surprisingly good job on the 26 year old printing in McCawley's book. I'm amazed at how well it works in situations I would've expected to be pretty challenging, such as bad printing, laminated menus with colored backgrounds, pictures on my iPad, Web text on my MBP, even writing in a TV show on my ancient desklamp iMac! Now if only I had a desktop version, I could scan and translate some modern Sichuan cookbooks. :D

~ Kiran <entropy@io.com>
 

cai

Member
Like someone already suggested, I would also like to have a feature that saves all the captured text in the reader, not just the last scan.

Now when I see a sentence and OCR it, it is saved in the reader... BUT when I OCR another sentence, the previous sentence gets lost? Why?
 

Entropy

榜眼
cai said:
I would also like to have a feature that saves all the captured text in the reader, not just the last scan. Now when I see a sentence and OCR it, it is saved in the reader... BUT when I OCR another sentence, the previous sentence gets lost? Why?

There's a setting to append text, (OCR>Capture text>Combine captured text) but I think you still have to manually save the captured text using the little folder icon. I want a setting to save the file automatically.

~ Kiran <entropy@io.com>
 

cai

Member
Thanks for the heads up, Entropy.

I second the automatic saving... I just scanned 20+ sentences and then Pleco crashed and lost everything :evil:
 

jacky89

秀才
The OCR doesn't seem to recognize handwriting very well. It just keeps on jumping around. Is there a way to get the OCR system to only recognize one character at a time?
 

mikelove

皇帝
Staff member
Entropy said:
In the middle third of the display. IOW, divide the current camera space in half, recognizer on top, recognized characters underneath.

That's a pretty thin band left for the camera preview - are you sure you'd be able to see enough to find where you were on the page?

Entropy said:
That's annoying, since testing the still image recognizer would be much easier on the iPad, for a variety of reasons, including but not limited to:

I can't figure out how to zoom the chosen image. If I try to zoom, I trigger the recognizer. And the drag handles are too big relative to the display, while the image is physically small, which together make it hard to select the characters I want.

The drag handles can obscure the character I'm trying to recognize--since I can't zoom the image, I have to shrink the selection. If I make it too small, I can't tap on it.

When the text is recognized, I can't compare it to the image, since it takes up the whole screen.

The UI for still image capture is just awful at the moment - threw it together in an afternoon but it needs (and will get) a lot of work in future releases to make it a real feature. (there actually is no zoom option yet, e.g.) Releasing a product now with live capture working and still images a work in progress seemed better than making everyone wait another month or two.

Entropy said:
On the iPad, you could divide the screen into panes to allow the user to edit text while recognizing.

Maybe we could display the image alongside the recognized text, but "while recognizing" is a bit of a stretch.

Entropy said:
So, if not the iPad, can I run Pleco using the iPhone development environment on a MacBook Air to do still-image recognition that way?

(That's actually an interesting idea, since the phone simulator would presumably be phone-sized leaving the rest of the display for my own uses.)

Nope, that requires a recompiled version of our app and since it would lack anything in the way of DRM there's no way we're releasing it outside of Pleco. (perhaps a future dedicated Mac app, though)

sharnyo said:
I have an iPod touch 4G and wish that I could use the OCR. I know that you've indicated that the quality is not good given that the camera doesn't autofocus.
But still I think the OCR should work fine in some situations where the image is still in focus. And hence IMO, it should be made available for iPod touch owners - of course with the caveat.
Would you still consider releasing it for the iPod touch 4G?

EDIT: Oops. I saw that it's also been released for iPod touch. But why can't I download the add-on? It said that it's not supported.

The add-on is not currently available on iPod; since we've gotten quite a few emails / posts like this, though, any day now we'll be adding a form to our website where you can register your iPod's UDID and then be able to purchase / download OCR. But basically we really really don't think it works well enough to sell it, and we didn't want a bunch of people ignoring whatever warnings we put in front of it, downloading / purchasing it without demoing it first, discovering that it in fact doesn't work, then badmouthing us / posting negative reviews / demanding their money back / etc - for now, we'd rather lose a few sales than suffer a lot of bad PR that could hurt long-term interest in our OCR system.

Entropy said:
One more issue. I hit pause when the right character comes up in the flickering green mess, and the recognizer doesn't actually trust my judgment and thus doesn't display the translation for that character.

Settings / OCR / Mode-specific / Word detect samples > 0 should fix this (at the cost of making the definition change even more often).

daniel123 said:
just two word about OCR: unbelievable great!

Thank you!

kun4 said:
The OCR is very nice. But maybe a simple switch "Simplified" - "Traditional" would cut down on the number of false positives. You might even tune the OCR output based upon geolocation.

Not feasible, unfortunately - we're not exactly sure of the technical reasons, but basically it would cost a ridiculous amount of money / time to get this feature added to the recognizer engine. (far more than our current jitter-improvement efforts) I suppose we could at least put together a makeshift version that simply looked through the list of the top 10 matches for each character and filtered out all of the ones from the wrong character set, though (returning a character from the wrong set if not a single one is from the correct set).
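A minimal sketch of what that makeshift filter could look like, assuming the recognizer hands back a best-first list of candidate characters for each position. The character sets here are tiny made-up samples and the function names are hypothetical; none of this is Pleco's actual code.

```python
# Hypothetical sample sets; a real filter would need full character tables.
SIMPLIFIED_ONLY = {"鸡", "门", "书"}
TRADITIONAL_ONLY = {"雞", "門", "書"}


def in_wanted_set(ch, want_simplified):
    """Accept a character unless it belongs only to the other character set."""
    excluded = TRADITIONAL_ONLY if want_simplified else SIMPLIFIED_ONLY
    return ch not in excluded


def pick_match(candidates, want_simplified, top_n=10):
    """Pick the best candidate from the wanted set among the top N matches.

    candidates: recognizer guesses for one character, best first (assumed).
    """
    top = candidates[:top_n]
    for ch in top:
        if in_wanted_set(ch, want_simplified):
            return ch
    # Not a single candidate is from the wanted set: fall back to the
    # recognizer's best guess, even though it is from the wrong set.
    return top[0] if top else None
```

For example, with simplified selected, a candidate list of 雞, 鸡, 田 would yield 鸡, while a list containing only 雞 and 門 would fall back to 雞.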

nihao pengyou said:
All the buzz around the OCR feature has gotten me really excited about using it. I have an iPod Touch 4g with a camera, and I paid for the Professional Bundle. Please let me have the option to try to use OCR!

If you think OCR doesn't work well enough to sell as a standalone option for iPod Touch 4g users, at least let people who purchased a bundle have access to it as part of the bundle. A sort of working OCR is a million times better than no OCR. Please

It's actually not included in the Professional or any other Bundle - we couldn't afford to add it to them for free given the royalties we have to pay, and we didn't want to raise the prices of bundles for everyone when iPod/iPad/old iPhone users can't access OCR. But per my earlier comment, we will be adding an option to our website very soon to let individual iPod users (after clicking through a whole bunch of warnings / pleas not to post negative reviews / etc) register their device to be able to purchase OCR. (basically you'll enter your UDID on an online form, reload the Add-ons catalog and magically see that OCR is now available to download / purchase)

dustpuppy said:
could you let the user copy all the characters in the OCR history? that way i can copy paste a string of traditional cantonese characters into the cantodict online parser. this would alleviate the need for a cantonese dictionary for me.
thanks again for this awesome feature. it really feels like we're in the 21st century now.

Thanks! And yes, we could certainly add something like that, though you could also do this with Capture Text + OCR>Capture text>Combine captured text.

tennonn8 said:
So, if I understand correctly, the OCR package will be made available for the iPod Touch too, but it is not available yet. Is that right? When (approximately) will this link become available for us? I suppose, it will be posted on this page: http://www.pleco.com/ipdirectdownload.html, right?

I hope it will become available for us (iPod Touch users) indeed. I'm 100% ready to buy it at my own risk in order to try it out, even if only on big size characters. (Well, in the worst case, if it doesn't work at all, I'll be happy to "donate" a little bit more for my favorite program, even if it's just a few more dollars!)

Thanks!

We actually ended up deciding to do it a different way - you'll be able to download it right in the app after registering your device's ID on our website - though I really wouldn't get your hopes up; even for subtitles the prospects are grim, unfortunately, it's just not a very good camera for reading text. Should have that website form up by the end of the weekend (just digging out from under a metric ton of email / forum posts / etc first).

justcharlie said:
I finally got the in-app purchase to work (shakes fist at Apple) and OCR is incredible. I purchased the flashcards upgrade as well and the two together make the most powerful Chinese language learning combination I've used. I will definitely write a review of OCR on my website, chengduliving.com

Thanks! I look forward to seeing that.

anchan42 said:
I gave Pleco 5 stars and also put a comment in the App Store, but the comment is still not showing. I could see only two comments from two users in the App Store. That's really strange; seeing the amount of buzz the OCR generated here, there should be more than 2 people wanting to put a few words in an app review.

Anyway, OCR is just great. It works really well with standard fonts in textbooks and newspapers. I tried it on a New Year sign I have in my house but it did not pick the characters up too well, maybe because of all the decorative patterns that surround the characters. I also tried it on some handwritten notes written by a Taiwanese person, but that seems to be too much for the OCR. This is in no way the OCR system's fault. It is just me being happy with my new toy/tool and pushing it to the limit, trying to see what it can do.

Reviews can sometimes take a while to appear, I think Apple reads all of them first... thanks! Decorative patterns and handwriting can both be problematic, yeah... I'll be curious to see what happens the first time someone tries to use it to read a Chinese character tattoo :)

Entropy said:
That certainly seems like a good idea to me. I would expect that in many cases it could be automated, since presumably there are characters that are clearly simplified, like 鸡. And, other user-defined limitations would also be useful, such as being able to specify that you're translating a menu. (That would be facilitated by having a specialized dictionary, I suppose.)

That wouldn't really make sense - if the system knows the character is simplified then we're already past the point where limiting to one system or the other would make a difference. A specialized menu filter might help, but as I noted above we can't do the filtering at the engine level so we'd have to screen the character results to try to find menu words.

Entropy said:
Why is the recognizer box centered, instead of being pinned to the upper left corner?

Easier to adjust (thumb high or low, left- or right-handed), easier to aim, less finger travel to resize since we do it symmetrically...

Entropy said:
In capture mode, I can capture a whole line of text, but none of it ever turns blue. Am I wrong in thinking that the text should turn blue when the recognizer is done with its work?

The blue text doesn't mean the recognizer's finished - usually by the time it shows up the recognizer's already recognized that block of text several times over; the blue text indicates which characters are in the dictionary entry it's currently showing you. So there's no blue text at all in Capture mode.

Entropy said:
Is there a technical reason I need to switch between lookup, flashcard and document capture modes? Modes are bad. I'd rather just have three buttons in the toolbar.

Which toolbar would you propose adding them to? :) (not like we have a lot of extra room, and this is a relatively infrequently-made switch, not to mention the fact that we might vary the user interface between modes more in future releases)

Entropy said:
In capture mode, turning on "hide unused chars" leads to *no* characters being displayed, green or blue. There's still a translation, though.

That might actually be preferable under the circumstances.

Entropy said:
Camera shake is a huge problem FWICT. I'm still thinking that working from a picture is going to be much easier than trying to get the green flicker to settle down. In the future, I'd like to be able to remotely control a real digicam with stabilization. In fact, that's one more thing you could do if you wrote a version for OS X. Being able to put the camera onto a tripod while controlling it remotely from an iPad would also be cool.

Accuracy is a problem with photos, though - as I've said before in this thread, usually it works but 20-30% of the time you get a page full of garbage, especially if you're not careful about lighting / straightness / etc. But yes, we're working on the shake problem.

Entropy said:
In capture mode, I'd like to be able to save both the raw image and the translation, so I can compare them later.

Good idea, already doing that in some of our test builds.

Entropy said:
I don't want to have a bunch of English characters recognized, even if they're written on the menu.

So you'd rather have it leave those spaces blank? Difficult to filter since occasionally one or two English characters are recognized as Chinese ones.

Entropy said:
If there's formatting, I want captured text to reflect that. (I don't know whether it already does.) I'd like to be able to capture a whole menu page at one time, and get individual items on individual lines. In fact, it would be great if Pleco could recognize vertical text and rearrange it as horizontal lines and vice versa (the latter in the editor, not OCR.)

Can't do much for formatting, but we do recognize / rearrange vertical text even with still image capture.

Entropy said:
If I go to the settings, they should reflect the settings for the module I was using when I chose to go to the settings.

That one we really need to do system-wide.

Entropy said:
BTW, for the most part the recognizer does a surprisingly good job on the 26 year old printing in McCawley's book. I'm amazed at how well it works in situations I would've expected to be pretty challenging, such as bad printing, laminated menus with colored backgrounds, pictures on my iPad, Web text on my MBP, even writing in a TV show on my ancient desklamp iMac! Now if only I had a desktop version, I could scan and translate some modern Sichuan cookbooks.

Thanks! Desktop version may actually be forthcoming, I'm thinking a $20 or $30 Chinese OCR app for Mac would be an interesting little way of experimenting with native desktop Mac development without committing to bringing the full version of Pleco to it. Along with perhaps a $10 to $20 drop-in replacement for Apple's trackpad handwriting recognizer that used our more accurate engine and two-finger clear gesture.

cai said:
I second the automatic saving... I just scanned 20+ sentences and then Pleco crashed and lost everything

Very sorry about that... could you send me your crash log? It's at:

Mac OS X : /Users/(your username)/Library/Logs/CrashReporter/MobileDevice/(your device name)
Windows XP: C:\Documents and Settings\(your username)\Application Data\Apple Computer\Logs\CrashReporter\MobileDevice\(your device name)
Windows Vista / 7: C:\Users\(your username)\AppData\Roaming\Apple Computer\Logs\CrashReporter\MobileDevice\(your device name)

(on Windows that "Application Data" folder may be hidden - choose "Folder Options" from the "Tools" menu, go to the "View" tab and check the radio button to "Show hidden files and folders" to make hidden folders visible)
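If it helps, here is a rough Python sketch that builds the folder path listed above for the machine you sync with. The function name and its parameters are made up for illustration, and it assumes the standard APPDATA environment variable on Windows (which points at "Application Data" on XP and "AppData\Roaming" on Vista / 7).

```python
import ntpath
import os
import platform
import posixpath

# Sub-folders below the per-user base directory, as listed in the post above.
MAC_PARTS = ("Library", "Logs", "CrashReporter", "MobileDevice")
WIN_PARTS = ("Apple Computer", "Logs", "CrashReporter", "MobileDevice")


def crash_log_dir(device_name, system=None, home=None, appdata=None):
    """Return the folder where synced crash logs for device_name should be.

    system, home and appdata are optional overrides (hypothetical parameters
    added so the function can be tried on any machine).
    """
    system = system or platform.system()
    if system == "Darwin":
        home = home or os.path.expanduser("~")
        return posixpath.join(home, *MAC_PARTS, device_name)
    if system == "Windows":
        appdata = appdata or os.environ.get("APPDATA", "")
        return ntpath.join(appdata, *WIN_PARTS, device_name)
    raise ValueError("no known crash log location for " + system)
```

Calling crash_log_dir("(your device name)") with no overrides uses the current machine's platform and user folders.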

Assuming it's something we can fix we can hopefully take care of this in 2.2.2.

jacky89 said:
The OCR doesn't seem to recognize handwriting very well. It just keeps on jumping around. Is there a way to get the OCR system to only recognize one character at a time?

We can't do much for handwriting, unfortunately - too much variation to recognize accurately. Settings / OCR / Mode-specific / Lookup Words / Word detect samples > 0 will at least get the definition to update every time the character does, so when you hit pause it'll actually match the character (and you'll see all of the characters it's alternating between in the history).
 
I just bought OCR for my iPod Touch 4g, and guess what - it's not all gloom and doom - I'm really happy with it! It's true that OCR would be much better with auto-focus to identify some of the smaller more complicated characters. However, OCR still works decently in my opinion even without auto-focus. Perhaps part of the reason I'm happy is that all of Mr. Love's scary disclaimers set my expectations really really low for OCR on the iPod Touch 4g; however, despite all that - I really think OCR is still useful and good to have as part of Pleco on an iPod Touch 4g.

Question 1: Sometimes while I am looking up a word with OCR, the green characters OCR is trying to lock on to are changing too fast for me to pause on the correct one. Is there a way to make the characters change more slowly, so I don't feel like I'm playing Whack-a-Mole? :)

Question 2: I looked in the instruction manual, but I didn't find an explanation for the OCR settings "Word detect samples" and "Word detect compare length." Can you tell me what those settings do?

Thanks!
 