Insane idea for a new feature (OCR)

bodybag77

秀才
Hi There,
I was thinking that it would be cool to have some basic OCR functions in the next versions of Pleco Dict. It would be cool to take a picture of some Chinese characters using the PDA's camera and then process the characters inside Pleco, to have them conveniently displayed in the reader module. Think how cool it would be when having to deal with books or restaurant menus!!! Yes yes yes, I know, I'm a lazy guy... but it would be a nice feature, wouldn't it? Is it technically possible?

Ugo
 

mikelove

皇帝
Staff member
Technically possible, but very tricky - it's hard enough to OCR Chinese characters correctly from a scanned black-and-white document where they're neatly aligned. It'll be possible eventually but probably isn't within the range of what we can do at Pleco anytime soon.
 
I've seen your insane idea realized... in HK in 2002: a pen-shaped electronic translator with a lens at the point and a small LCD display for translating English into Chinese on the spot. Like most new features, I figured this would become standard equipment on most (Asian) phones (or at least on electronic dictionaries) within a few years, but that hasn't happened.
 

ldolse

状元
The other problem with this idea is that you need a macro mode on the PDA's camera. The original HTC TyTN is the only one I know of that had a macro mode. It shipped with a business card scanner which could recognize Chinese. It made a lot of mistakes though, as I recall.
 

Eggwind

举人
I've seen this on a Japanese phone... You could OCR Japanese text with the camera phone, and then send the result to the J-E dictionary. Not always accurate, but worked ok. Only took a couple characters at a time though; don't think it would work on entire pages like you suggest...

Solution: spend half an hour looking up all the characters and words and adding them to flashcards and study them until you don't need this feature any longer. :p
 

FeiGeiWay

Member
I have two apps on my iPhone that would be awesome if they were combined. One is Perfect OCR, which scans a full letter-size page with the iPhone 3GS and converts the image to text.

The other app is CamCard, which can scan a business card in Chinese and enter it into your address book. How hard would it be for Pleco to implement this feature?
 

mikelove

皇帝
Staff member
Hard - we'd have to find someone to license the OCR technology from, since it's beyond what we can develop ourselves. Does the card scanner actually work well / reliably? Even with relatively obscure name characters?
 
Actually, Tsinghua licenses their OCR scanner for not too much money. Also, I've got two different commercial OCR packages that both do Chinese. ReadIris Corporate does a really good job on scanned stuff (for me), but I am always somewhat careful about scan quality. Omnipage Professional uses the Tsinghua engine as far as I can tell. I don't like its results as much, however.

Of course, those run on a desktop!.... Perhaps a web/3G app that takes a photo, sends it up to, say.... PlecoOCR (online service)... and then returns it to the device!

(The title does say "insane idea..." :D )
 

character

状元
stephanhodges said:
Of course, those run on a desktop!.... Perhaps a web/3G app that takes a photo, sends it up to, say.... PlecoOCR (online service)... and then returns it to the device!

(The title does say "insane idea..." :D )
Less insane than running all of Pleco as a web app. :D

I like the idea if the price was right, it worked well, and there were some privacy protections (no storage of what is scanned or the results, no logging of location, etc.).
 
Hmm

Maybe I should set up something, and license it out to Pleco :)... Anyone else want to join with me to discuss this (off board as it will involve investing time and/or money)? (I hope that doesn't sound mercenary -- I don't want to abuse this forum!) I'll PM Mike about this post and ask him to remove it if he doesn't like it...
 

gato

状元
Perhaps a web/3G app that takes a photo, sends it up to, say.... PlecoOCR (online service)... and then returns it to the device!

(The title does say "insane idea..." :D )

Dragon Dictate on the iPhone does the same thing for voice recognition. It sends a recording to a server for processing and spits back the result to you. Dragon is able to make the process seamless. If you are connected by Wi-Fi, you get your result back in about 1 second. If you can do this with OCR, then it would be almost as good as running natively on the device.
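
The round trip described here (capture on the device, recognize on a server, send text back) can be sketched in a few lines. This is only an illustration under stated assumptions: the `/ocr` path, the JSON reply shape, and every name below are hypothetical, and `fake_recognize` is a canned stand-in for whatever OCR engine would actually be licensed. Nothing here reflects a real Pleco service.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_recognize(image_bytes: bytes) -> str:
    """Hypothetical stand-in for a licensed OCR engine; returns canned text."""
    return "你好" if image_bytes else ""


class OcrHandler(BaseHTTPRequestHandler):
    """Server side: read the uploaded image, reply with recognized text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image = self.rfile.read(length)
        body = json.dumps({"text": fake_recognize(image)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress the default per-request access log


def recognize_remote(url: str, image_bytes: bytes, timeout: float = 5.0) -> str:
    """Device side: POST the photo bytes and return the text from the reply."""
    request = urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    # An empty ProxyHandler keeps this local demo from going through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["text"]


def demo() -> str:
    """Start a throwaway local server, send one fake photo, return the text."""
    server = HTTPServer(("127.0.0.1", 0), OcrHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/ocr"
        return recognize_remote(url, b"fake image bytes")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The device-side half is small; the hard parts raised in this thread (recognition quality, licensing, latency over 3G) all sit behind `fake_recognize` and the network call.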
 

mikelove

皇帝
Staff member
stephanhodges - investigate it if you like, but honestly unless you have a giant team of research scientists working for you it's likely we'd want to cut out the middleman and license an OCR system from whoever actually makes it :)

I'll look into Tsinghua's system - certainly might be portable to iPhone. A server-based solution isn't impossible either, but it'd be a lot more complicated.
 

mfcb

状元
i've never seen it in action, but i guess the most trivial "poor man's OCR" would be to display the image, zoom in manually to where the character(s) are, and then just redraw the character with the transparent HWR... i already suggested that once, don't remember why it was rejected...
 

mikelove

皇帝
Staff member
I remember that suggestion, but I think we dismissed it at the time because that sort of image zooming was actually going to be quite a lot of work to implement well on WM; it might be doable on iPhone once we get transparent handwriting to be a bit less laggy, though.
 

mfcb

状元
but on WM i would have no problem to do the zooming and cropping in an external app, remember, we have full multitasking, hehe
 

mikelove

皇帝
Staff member
True, but it would be pretty awkward to have to carefully zoom / re-crop the image for every individual character or group of characters you wanted to trace over; if we want this feature to be widely used enough to justify adding it (not that it's a lot of work, but every feature we add has to be thought of as an extra burden on usability since it's yet another option for users to deal with - we've gotten much stricter about that post-2.0) we need to make it really streamlined and automatic - direct capture from a built-in camera, zoom-in / invocation of the handwriting overlay in just a few taps, etc. "It's not worth doing if it's not done well" is kind of a motto for us going forward... :)
 

FeiGeiWay

Member
CamCard and WorldCard Mobile both scan business cards in simplified and traditional Chinese. They're pretty accurate too. I like CamCard better because I think it's a little bit faster and simpler. Maybe Mike can ask the developers of those apps where they license their Chinese OCR.
 

mikelove

皇帝
Staff member
Worldcard seems to be from PenPower, who are Hanwang's Taiwanese arch-rivals, so given all of the wonderful things Hanwang has done for us (we license our handwriting recognizer from them) I'm not particularly eager to do business with PenPower. (Hanwang does have their own OCR system, though...)

CamCard seems to have their own technology judging by their website, and they even mention licensing, so that one might be worth investigating further.
 

radioman

状元
Might have been brought up before - and might be easy to implement ...

1) take a photo of a bunch of characters

2) blow up character on screen with typical zoom in photo feature

3) trace character by hand

4) process via HWR.

Figure this is just another tool in the box - eliminates the problems of hard-to-differentiate color combinations, not being able to automatically recognize characters, etc.
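
The four steps above can be sketched roughly as follows. It is a toy model under stated assumptions: the photo is a plain 2D list of grayscale values, and `fake_hwr` is a hypothetical stand-in for the licensed handwriting recognizer (its real interface is not described in this thread). The point it illustrates is that the photo only serves as a backdrop for tracing; the recognizer sees strokes, never pixels, so colors and contrast in the photo stop mattering.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]   # grayscale pixels, row-major
Point = Tuple[int, int]   # (x, y) in zoomed-backdrop pixels
Stroke = List[Point]


def crop(image: Image, left: int, top: int, width: int, height: int) -> Image:
    """Step 2a: cut out the region of the photo that holds the character."""
    return [row[left:left + width] for row in image[top:top + height]]


def zoom(image: Image, factor: int) -> Image:
    """Step 2b: nearest-neighbour enlargement so the character is traceable."""
    out: Image = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fake_hwr(strokes: List[Stroke]) -> List[str]:
    """Hypothetical recognizer: just reports what it was given."""
    points = sum(len(s) for s in strokes)
    return [f"{len(strokes)} stroke(s), {points} point(s)"]


def trace_and_recognize(
    photo: Image,
    region: Tuple[int, int, int, int],
    factor: int,
    strokes: List[Stroke],
    recognizer: Callable[[List[Stroke]], List[str]] = fake_hwr,
) -> List[str]:
    """Steps 2-4: build the backdrop, keep traced points on it, run HWR."""
    backdrop = zoom(crop(photo, *region), factor)
    if not backdrop or not backdrop[0]:
        return []
    height, width = len(backdrop), len(backdrop[0])
    # Step 3: drop any traced points that fall outside the zoomed backdrop.
    clipped = [
        [(x, y) for x, y in stroke if 0 <= x < width and 0 <= y < height]
        for stroke in strokes
    ]
    clipped = [stroke for stroke in clipped if stroke]
    # Step 4: only the strokes go to the recognizer, not the image.
    return recognizer(clipped)


if __name__ == "__main__":
    photo = [[r * 4 + c for c in range(4)] for r in range(4)]  # step 1 stand-in
    traced = [[(0, 0), (5, 5), (9, 9)], [(7, 7)]]
    print(trace_and_recognize(photo, (1, 1, 2, 2), 3, traced))
```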
 

mikelove

皇帝
Staff member
radioman said:
Might have been brought up before - and might be easy to implement ...

1) take a photo of a bunch of characters

2) blow up character on screen with typical zoom in photo feature

3) trace character by hand

4) process via HWR.

Figure this is just another tool in the box - eliminates the problems of hard-to-differentiate color combinations, not being able to automatically recognize characters, etc.

It's been brought up a few times, but it's one of those nice little features that we never get around to because it's time-consuming enough to go beyond the quick-little-bonus-in-an-update level but simple enough to not be a major selling point. Also in this particular case there's a big question as to where we might fit a camera button into the handwriting UI in such a way as to make it easily discoverable; we'd basically have to either jam another button into the top input bar (which probably means this would end up as an option buried in Settings that would replace the Full or the Wild button) or reduce the number of candidate characters displayed in the bottom bar in handwriting. We could put it in its own separate area in the Reader tab, but I'm not sure if anybody would bother switching to it at that point when they could just bring up the regular handwriting screen - do you think you'd be willing to use this if you had to go into Reader, tap on a command in Reader to open this, tap on the screen a few more times to select or take a picture, zoom into the characters you're interested in, handwrite them in one-by-one, then tap on a button to copy your input back to / switch into the Dict tab?
 