iPad 3 / Still OCR

character · Mar 15, 2012

Re: iPad 3

radioman said:
- In this new mode, there would be no need for the capture or the clip functions at the bottom. Replace the capture and clip function with previous or next photos, respectively.

Or perhaps be able to swipe the controls to one side to reveal next/previous image buttons. I don't read Chinese fast enough for the current clunky process of loading a new image to be a big deal, but I can definitely see how it could be with easier material or better reading skills.

Before I tried it, I didn't think reading from an image would be so valuable a use of Pleco. But it's nice to just hold the iPad instead of trying to juggle a book and use an iPhone to look up characters, esp. on the bus. The limitations of OCR make extracting text and reading that non-optimal.

---

I just got an email from Cheng & Tsui which mentioned Integrated Chinese is coming out as an iBook for the iPad. I hope Pleco can capitalize on the interest in the iPad in Education. Some more tablet-friendly features such as an enhanced reading experience might help. Would still love to see Chinese Breeze or other licensed content.

radioman · Mar 15, 2012

Re: iPad 3

character said:
I don't read Chinese fast enough for the current clunky process of loading a new image to be a big deal, but I can definitely see how it could be with easier material or better reading skills.

For me, this has less to do with my ability to read fast, and more to do about the practical handling of documents, be it in the classroom, a business setting, or just when doing personal reading.

Expanding on some of the previous use cases (I'll call them "use cases" now), here are a few more real life ones that I actually have encountered. The great "enabler" in all cases would be Pleco Reader with an Enhanced Photo Advance Function.

Use Case 4 - Teacher says, "Ok students, let's look at the word list that starts at the bottom of page 13. You will see that the word is simply a verb. Who can tell me the meaning of that word? ... ok great. Now let's go to the top of page 14. The second word down, who knows an alternate meaning for that word? ... that's right, it is a noun, not a verb, and cannot be used in the way that Sally proposed. Ok, very good. Now let's go back to the dialog, please start reading at the bottom of page 12, continuing from the second line from the bottom ...". So you start to read the dialog on page 12, and 2 lines later need to get immediately to page 13.

Use Case 5a - Teacher says, "Ok class, for the dialog, everyone take turns reading a sentence", and you are number 9 in the class list. So each student reads a sentence. But after sentence 7 (more or less, you haven't figured it out yet), the dialog continues on the next page. And it winds up being sentence 8 where the split is. You need to get to the next page fast.

Use Case 5b - Alternatively, you could be counting the sentences and find out that indeed your sentence is on the next page, second sentence in. You review the sentence and find that there is 1 word you are not sure of, and one character you do not recognize. So while the students are only up to sentence 4 in the reading, you have already reviewed your sentence, know everything you need to read it aloud, and then page back to the dialog on the previous page and follow along with the class until it is time for you to read your sentence.

Use Case 6 - This one is a business setting. You are going to sit through an afternoon product review where 5 product managers are going to present their latest information - You are sitting in Beijing, and the presentations are ALL in Chinese. Five Powerpoint decks, each at about 25 slides. So you export to PNG or JPG and can now effortlessly sit through and actually comprehend much of what is going on. Should someone reference a previous slide (as is often the case), you simply arrow or swipe back. (Arguably, just about every 外国人 doing business in China, even if they are not studying Chinese, would want this function.)

These situations, where the material being referenced straddles pages or bounces back and forth between pages, happen all the time in business and classroom settings. The proposed Enhanced Photo Advance Function would nicely address the problem, allowing the user to seamlessly move from page to page in a natural page-turning fashion.

character said:
Before I tried it, I didn't think reading from an image would be so valuable a use of Pleco. But it's nice to just hold the iPad instead of trying to juggle a book and use an iPhone to look up characters, esp. on the bus. The limitations of OCR make extracting text and reading that non-optimal.

This I agree with. I have an iPhone 4s and it works great, and fast. And for tasks like building word lists, the OCR on the iPhone is crazy-good. But for a classroom setting, waving around an iPhone trying to get the characters focused, and doing it on the fly is just not natural or seamless. And it immediately draws attention to the fact that you are having "issues" with your Chinese.

With an iPad, it sits there like a book; no waving things around. You might just touch a character, look something up, or save a flashcard, but those activities are unobtrusive. So rather than looking like you are having "issues" with your Chinese, you are in full control.

mikelove · Mar 15, 2012

Re: iPad 3

Vzzzbx said:
The autofocus camera will definitely come in useful for OCR and, er, not much else. I've not used my iPad 2's camera once.

I suspect that's mostly a casing thickness thing - the iPad 2 like the iPod Touch was too thin for an autofocus camera, but since they happened to be making the iPad 3 thicker anyway for the sake of its enormous battery (which is now getting up into laptop range capacity-wise) they found they now had the extra space for all of those lens elements and figured that if they were building in one anyway they might as well improve it. Might also be getting them some quantity discount benefits if they're using more of the same parts in the iPad 3 and the iPhone 4S.

radioman said:
My thought is, it would be really useful if there was a way in Photo OCR Block Read mode where you can just open a photo, Pleco-OCR it (like you do now)... BUT, there would be a new function where you can simply swipe or arrow to the next or previous photo. I figure that anyone taking multiple photos for OCR interpretation is likely taking the photos in order.

This makes sense except for the issue of how we define the "previous" and "next" photo - do we sort by file name? How do we handle numbers in the names? What about Chinese numbers? With PDF at least there's no problem there since the file defines the previous and next pages quite clearly.
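To make the sorting question concrete, here is a minimal sketch of a "natural" sort key that compares digit runs as numbers and also understands simple Chinese numerals from 0 to 99. It is only an illustration: the names `natural_key` and `chinese_to_int` are hypothetical, nothing here reflects how Pleco actually sorts files, and larger numerals (百, 千, ...) are deliberately left out.

```python
import re

# Hypothetical helpers for illustration; not Pleco's actual sorting code.
_CN_DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
              "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
_CN_RUN = r"[零〇一二三四五六七八九十]+"
_TOKEN = re.compile(r"(\d+|" + _CN_RUN + r")")


def chinese_to_int(text):
    """Convert a simple Chinese numeral (0 to 99) to an int, or return None."""
    if "十" in text:
        tens, _, ones = text.partition("十")
        tens_value = 1 if tens == "" else _CN_DIGITS.get(tens)
        ones_value = 0 if ones == "" else _CN_DIGITS.get(ones)
        if tens_value is None or ones_value is None:
            return None  # e.g. 十十, or anything beyond 99
        return tens_value * 10 + ones_value
    return _CN_DIGITS.get(text)  # single digit, or None for runs like 一二


def natural_key(name):
    """Sort key: numbers compare by value, everything else compares as text."""
    key = []
    for part in _TOKEN.split(name.lower()):
        if not part:
            continue
        number = None
        if re.fullmatch(r"\d+", part):
            number = int(part)
        elif re.fullmatch(_CN_RUN, part):
            number = chinese_to_int(part)
        # Tag each piece so ints and strings are never compared directly.
        key.append((0, number) if number is not None else (1, part))
    return key


pages = sorted(["page10.png", "page2.png", "page1.png"], key=natural_key)
chinese_pages = sorted(["第十二页.jpg", "第三页.jpg", "第二十页.jpg"], key=natural_key)
```

One known limitation of this sketch: a character such as 一 that appears inside an ordinary word in a file name would also be read as a number, which is part of why the ordering question is not trivial.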

character said:
I would benefit from this as well. Combine these changes with an option for hiding the OCR results on startup and the OCR-based reading experience would be much improved.

What do you mean by "hiding the OCR results on startup"? Do you want the overlaid characters hidden, or the definition popup from the bottom of the screen hidden? (not quite clear if this is Block or Scroll Lookup mode)

radioman said:
I agree with Character about hiding the OCR results. Perhaps, like the highlighted OCR area box position, just have the OCR green be in whatever state/position you had it set on the previous photo.

That seems like a very sensible way to do it, yes.

radioman said:
- In this new mode, there would be no need for the capture or the clip functions at the bottom. Replace the capture and clip function with previous or next photos, respectively.

Why would those no longer be necessary? Presumably people might still want to send a page of text somewhere else. Since we're on an iPad and we have acres of space anyway we can probably just squeeze in a few more buttons in the top bar - the page scroll ones might make the most sense at the top right corner of the screen (following standard iOS design convention).

radioman said:
- Provide high contrast, so within the green box, do not change the background shading (leave it per the original photo). This would benefit my poor eyesight.

Might have to make that one optional, though I suppose it would be unnecessary if we took advantage of the Retina Display to make our characters white-with-black-outlines a la movie subtitles. (we'd have done that already, but on a regular iPad they're too fuzzy for it to work)

radioman said:
- Leave the "show characters" box in place, so that you can still check if necessary if there are OCR issues.

Indeed, that's pretty much the only reason why that's a button rather than a setting somewhere.

yoose said:
i agree with this, i was actually about to post the same thing. swiping might be hard since you need to swipe around to move the photo if you are zoomed in, but perhaps some sort of film strip on the iPad, there is plenty of room. Having a row of pictures at the top that you can just select would be very nice.

Film strip might be interesting, but a swipe is definitely out since it conflicts with normal scrolling. (it's actually a problem even with things that only scroll vertically - people don't tend to move their fingers in very straight lines - which is why we don't support it as a way to scroll between entries in the regular dictionary)

yoose said:
also now the app only looks in the root folder for pictures. I wanted to organize it into a different folder, but when i did those pictures do not appear in the list.

Have you turned on "Multi-level file move"? There's a bug relating to that - fixed in the submitted-almost-a-week-ago 2.2.11 update.

character said:
I just got an email from Cheng & Tsui which mentioned Integrated Chinese is coming out as an iBook for the iPad. I hope Pleco can capitalize on the interest in the iPad in Education. Some more tablet-friendly features such as an enhanced reading experience might help. Would still love to see Chinese Breeze or other licensed content.

We've just signed a license for some custom Chinese fonts, which is probably the biggest single thing we can do for the reading experience - beyond that, since we already support custom color schemes, font sizes, and margins, I'm not sure what else we can do; the main reader improvements people have been asking for are EPUB support, page-based scrolling (in spite of the fact that you end up having to flip a lot to read sentences) and a search button. Anything else you've seen in other e-reader apps that you feel we're missing?

As far as licensed content, since we've got a built-in dictionary which largely obviates the need for end-of-chapter vocabulary lists I'm inclined to think that we really just need to license some general-purpose Chinese reading content at a variety of skill levels - not quite sure what the state of the children's and young adult fiction market is in China right now, but that might be a good place to look.

radioman said:
Use Case 6 - This one is a business setting. You are going to sit through an afternoon product review where 5 product managers are going to present their latest information - You are sitting in Beijing, and the presentations are ALL in Chinese. Five Powerpoint decks, each at about 25 slides. So you export to PNG or JPG and can now effortlessly sit through and actually comprehend much of what is going on. Should someone reference a previous slide (as is often the case), you simply arrow or swipe back. (Arguably, just about every 外国人 doing business in China, even if they are not studying Chinese, would want this function.)

Ideally we'll eventually be able to handle this one by hooking into iOS' own PowerPoint viewer - it's a little tricky since we can't really tell when you've changed pages and hence when we need to re-run OCR, but in theory we could add a button to the Office-document-browsing interface that would OCR the current screen image and pop up a tappable overlay. (though it's debatable whether that would offer an advantage over simply highlighting the text and tapping [>], and the [>] approach is more accurate)

character · Mar 15, 2012

Re: iPad 3

mikelove said:
This makes sense except for the issue of how we define the "previous" and "next" photo - do we sort by file name? How do we handle numbers in the names? What about Chinese numbers? With PDF at least there's no problem there since the file defines the previous and next pages quite clearly.

If there's not a default ordering on the contents of a directory, how about supporting an ordering listed in a simple text file? If people are going to scan in lots of images, they won't mind the little additional work of creating the text file.
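As a rough sketch of the text-file idea, assuming one image file name per line, with blank lines and lines starting with `#` ignored. The function name `ordered_images` and the file format are assumptions made for illustration only, not an existing Pleco feature.

```python
def ordered_images(manifest_text, available):
    """Return image names in manifest order; unlisted images follow, sorted by name.

    manifest_text: contents of the (hypothetical) ordering file.
    available: names of the image files actually present in the folder.
    """
    remaining = set(available)
    ordered = []
    for line in manifest_text.splitlines():
        name = line.strip()
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments
        if name in remaining:  # ignore missing files and repeated entries
            ordered.append(name)
            remaining.remove(name)
    # Anything the manifest did not mention goes at the end.
    ordered.extend(sorted(remaining))
    return ordered


manifest = """# chapter 3
dialog.png
wordlist.png

notes.png
missing.png
"""
result = ordered_images(manifest, ["notes.png", "extra.png", "dialog.png", "wordlist.png"])
```

Here `result` lists the three images named in the file in that order, followed by `extra.png`, and the entry for the missing file is skipped.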

What do you mean by "hiding the OCR results on startup"? Do you want the overlaid characters hidden [...]

Yes -- it's very useful for live OCR, but IMO it should be configurable for image-based OCR.

Anything else you've seen in other e-reader apps that you feel we're missing?

In this case, I meant specifically improving the reading experience when reading images in Pleco.

[...] I'm inclined to think that we really just need to license some general-purpose Chinese reading content at a variety of skill levels

That would be great. Fiction is far more interesting than non-fiction.

Aside: If you had reading material for up to 500 characters, I wonder if it would be worthwhile to pay someone to design/adapt an intro Mandarin course around Pleco -- along with the basic educational material, include teacher-friendly instructions for using various aspects of Pleco. Even if instructors don't use the course, they would see the value Pleco brings to the classroom.

yoose · Mar 16, 2012

Re: iPad 3

mikelove said:
yoose said:

i agree with this, i was actually about to post the same thing. swiping might be hard since you need to swipe around to move the photo if you are zoomed in, but perhaps some sort of film strip on the iPad, there is plenty of room. Having a row of pictures at the top that you can just select would be very nice.

Film strip might be interesting, but a swipe is definitely out since it conflicts with normal scrolling. (it's actually a problem even with things that only scroll vertically - people don't tend to move their fingers in very straight lines - which is why we don't support it as a way to scroll between entries in the regular dictionary)

is there a way to implement a multi-touch gesture, like a two finger swipe that is used for navigation? if not then a film strip would be awesome

mikelove said:
yoose said:

also now the app only looks in the root folder for pictures. I wanted to organize it into a different folder, but when i did those pictures do not appear in the list.

Have you turned on "Multi-level file move"? There's a bug relating to that - fixed in the submitted-almost-a-week-ago 2.2.11 update.

i have turned on multi-level file move. i moved the pictures in to pleco through itunes. then went to my ipad->OCR->Block Recognizer->Image File at which time the popup list comes up. They do show up, but when i go to the file browser and move them into another folder that I created and go back to OCR->Block Recognizer->Image File the popup list is empty.

mikelove said:
As far as licensed content, since we've got a built-in dictionary which largely obviates the need for end-of-chapter vocabulary lists I'm inclined to think that we really just need to license some general-purpose Chinese reading content at a variety of skill levels - not quite sure what the state of the children's and young adult fiction market is in China right now, but that might be a good place to look.

i think it would be even better if you can license textbooks. instead of having to lug the textbooks around it can be in Pleco and if you have access to the files instead of scanning the pages in then it can have quick lookup of words and perhaps add search and eventually notes. If i could buy a digital version of the textbook in Pleco for the same price, or even a little bit more, as the paperback (although prices do vary a lot depending on where you buy them, the textbooks I use are much cheaper here in China than they are on the US amazon store) then I would def get the Pleco version.

radioman · Mar 16, 2012

Re: iPad 3

I'd like to comment a little (a lot?) further on this topic of Photo. For brevity, I will reference this concept as EPAF (Enhanced Photo Advance Function), and would like to propose two phases.

- Phase 1) Near Term (getting something out simple and effective, addressing 80% of the need), and
- Phase 2) a more polished implementation.

The reason for the two phases is totally self serving. I need this basically as soon as… well, over a year ago. But I do think the simplicity of the first Phase has real merit, so will pitch it as such.

*** PHASE 1 START ***
Add two new EPAF buttons (forward and backward) on the "show chars/capture/clip" toolbar.
*** PHASE 1 END ***

So the Phase 1 function would be:
- If EPAF buttons are pressed, then you advance to the next photo (previous or next).
- Use the same photo order as currently exists in the Photo Stream or Saved Photos.
- Use the current Photo Block Reader function. So if you open the 197th photo in the Photo Stream, then new EPAF buttons will load photo 196 or 198 from whatever stream/folder you are drawing files.
... That's it.
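The Phase 1 behavior above can be sketched as a simple index over an already-ordered photo list. This is an illustration under stated assumptions: `PhotoPager` is a hypothetical name, the list stands in for whatever order the Photo Stream or Saved Photos already provides, and stopping at either end (rather than wrapping around) is a design choice the proposal does not specify.

```python
class PhotoPager:
    """Step backward and forward through an ordered list of photos."""

    def __init__(self, photos, start_index=0):
        if not photos:
            raise ValueError("photos must not be empty")
        if not 0 <= start_index < len(photos):
            raise IndexError("start_index out of range")
        self._photos = list(photos)
        self._index = start_index

    @property
    def current(self):
        return self._photos[self._index]

    def next(self):
        """Advance one photo; stay put when already on the last one."""
        if self._index < len(self._photos) - 1:
            self._index += 1
        return self.current

    def previous(self):
        """Go back one photo; stay put when already on the first one."""
        if self._index > 0:
            self._index -= 1
        return self.current


# Opening the 197th photo (index 196) of a 300-photo stream.
stream = ["photo%d" % n for n in range(1, 301)]
pager = PhotoPager(stream, start_index=196)
opened = pager.current
after_next = pager.next()
after_two_back = (pager.previous(), pager.previous())[1]
```

In this example the pager opens on photo 197, the forward button loads 198, and two presses of the back button land on 196.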

With the above support in place, I can do the following:
- Quickly screen capture dozens of pages of anything on the iPad for immediate use in Pleco.
- Export pdf pages as graphics on my computer to my iPad Saved Photos, to be used in Pleco.
- Export PPT or Keynote as JPG or PNG files and use them. They are sequentially date/time stamped, and therefore would be in order per the current Pleco file system.

Phase 1 tweak (if easy to code) would be:
- Taking out the gray background within the Green OCR box (see below).

As for Phase 2:
- Option to leave OCR "hide chars" setting per previous photo.
- Option to leave OCR box position setting per previous photo.
- Advanced File Organization, including handling of Photo folders, sorting by name, date, etc.
- A way to preview the photos before you load it (if desired). This could be part of a film strip function, coverflow or a big grid. Maybe all of them should be available. However, whatever the approach, there should be a way to preview the document prior to loading - something better than just looking at a little thumbnail.
- Ordered List - (along the lines of what @character referenced) where you could have text list of photo URLs in Dropbox. Pleco points to the text file in Dropbox, reads the list of Photo URLS, and loads the photos into the reader in their predetermined order.

GENERAL COMMENTS
In no particular order.

- Yes, keeping capture and clip functions would be useful. I just figured they were not that important for this "EPAF" mode. However, given that there is a lot of room on the iPad toolbar, this makes sense.

mikelove wrote:

radioman wrote:
- Provide high contrast, so within the green box, do not change the background shading (leave it per the original photo). This would benefit my poor eyesight.

Might have to make that one optional, though I suppose it would be unnecessary if we took advantage of the Retina Display to make our characters white-with-black-outlines a la movie subtitles. (we'd have done that already, but on a regular iPad they're too fuzzy for it to work)

- Next, in the above quoted item, I think you are talking about laying the definition text on top of the page and maybe having a presentation of the definition box more like a subtitle presentation. This might be effective. But I believe that when I want to see a pop-up definition, I will not want to see any of the original page behind the popup bubble (i.e., totally opaque). However, my comments about providing high contrast were specifically about how Pleco grays out the area inside the green OCR box. So if the green box is big, the whole document is basically more "dim". When "Hide Char" is in use, I believe what I want is:

- the area inside the green box to be totally transparent (so the original photo is presented as intended, with its original level of contrast).
- if a word is selected, the definition box will pop up as it does now.
- definition box be opaque.

- As for Powerpoint, yes, tie-in to a viewer, etc. would be great and of course very useful. However, that's another integration step, be it a Microsoft Windows product, or Mac's iWork, etc. And referencing back to my original 2-Phase approach, well, my hope is this is something that could be addressed in a second, later phase.

The nice thing about going the photo route near term is that Photo Stream is there, as is the Photo library. It's just done. And many Office-type products, including variations from Apple, Adobe, etc., all export out to picture files. For everything else, you can screen capture. And with better cameras available, and more and more people taking pictures, I figure that this simple way of collecting data and information will only become more useful.

laodie · Mar 16, 2012

Re: iPad 3

New iPad was delivered and Pleco transferred to it. Fifteen minutes to redownload all the software. OCR works great in "live" mode. I will post again after using "still" mode a bit. This is a very welcome development after using my iPhone 3GS and my iPad 1 for so long. Looking forward to the next release.

mikelove · Mar 17, 2012

Re: iPad 3

radioman said:
- Phase 1) Near Term (getting something out simple and effective, addressing 80% of the need), and
- Phase 2) a more polished implementation.

Basic idea makes sense, though we've got to investigate further just how tricky the prev/next functionality would be regarding the photo library, and as I said I think the prev/next buttons make more sense at the top of the screen - even ignoring the fact that iOS conventions would have us put them there, it's just a more logical grouping with the exit / save / etc buttons than with the buttons on the bottom that do something with the content you've captured.

radioman said:
- Next, in the above quoted item, I think you are talking about laying the definition text on top of the page and maybe having a presentation of the definition box more like a subtitle presentation. This might be effective. But I believe that when I want to see a pop-up definition, I will not want to see any of the original page behind the popup bubble (i.e., totally opaque). However, my comments about providing high contrast were specifically about how Pleco grays out the area inside the green OCR box. So if the green box is big, the whole document is basically more "dim". When "Hide Char" is in use, I believe what I want is:

Actually I was talking about the green character overlay - the biggest reason for the differently-colored inside of the box is to provide better contrast with that, so if we made them stand out better the shading would be unnecessary.

radioman said:
The nice thing about going the photo route near term is that Photo Stream is there, as is the Photo library. It's just done. And many Office-type products, including variations from Apple, Adobe, etc., all export out to picture files. For everything else, you can screen capture. And with better cameras available, and more and more people taking pictures, I figure that this simple way of collecting data and information will only become more useful.

That's true, though since we get a lot more requests for PDF support I'm wondering whether we could come up with a similarly quick-and-dirty way to do that - have to investigate the iOS PDF APIs some more.

laodie said:
New iPad was delivered and Pleco transferred to it. Fifteen minutes to redownload all the software. OCR works great in "live" mode. I will post again after using "still" mode a bit. This is a very welcome development after using my iPhone 3GS and my iPad 1 for so long. Looking forward to the next release.

Thanks! Retina-iPad-enabled minor update was submitted to Apple a week ago so we're just waiting on them, though if you were talking about a bigger update that's still a little ways off...

radioman · Mar 17, 2012

Re: iPad 3

Yeah, I'm sure there is more to this than just setting some buttons (which is why I wanted to keep this simple to start). As for position, having them at the top for this function sounds appropriate to me, with the bottom bar being the current, active page control.

mikelove said:
radioman said:

- Phase 1) Near Term (getting something out simple and effective, addressing 80% of the need), and
- Phase 2) a more polished implementation.

Basic idea makes sense, though we've got to investigate further just how tricky the prev/next functionality would be regarding the photo library, and as I said I think the prev/next buttons make more sense at the top of the screen - even ignoring the fact that iOS conventions would have us put them there, it's just a more logical grouping with the exit / save / etc buttons than with the buttons on the bottom that do something with the content you've captured.

Again, I might be a little confused. But from my perspective, the basic tenet should be that, during "hide chars" mode, the original document should be rendered as it was originally presented, with no shading or overlay. The only exception to this might be the line of the green box identifying the OCR perimeter. Perhaps you are talking about when "hide chars" is inactive? If that is the case, I do not care whether the shading is there or not during that time, as I will not be actually "reading", just looking to see if the OCR rendering was reasonably successful across the OCR area.

mikelove said:
radioman said:

- Next, in the above quoted item, I think you are talking about laying the definition text on top of the page and maybe having a presentation of the definition box more like a subtitle presentation. This might be effective. But I believe that when I want to see a pop-up definition, I will not want to see any of the original page behind the popup bubble (i.e., totally opaque). However, my comments about providing high contrast were specifically about how Pleco grays out the area inside the green OCR box. So if the green box is big, the whole document is basically more "dim". When "Hide Char" is in use, I believe what I want is:

Actually I was talking about the green character overlay - the biggest reason for the differently-colored inside of the box is to provide better contrast with that, so if we made them stand out better the shading would be unnecessary.

Quick and Dirty would be GREAT. And with your OCR, you could potentially ignore all the OCR data in the document (I hear dealing with that, screen positioning, etc. is not easy) and handle the pages like photos - with your on-the-fly OCR.

But Here is my concern.

I don't pretend to know Pleco's business/marketing/development challenges (just my own), but what I have learned over the years is that Pleco likes to implement things "right". I get it. But I hear "EPUB", integration of Microsoft Powerpoint hooks, potential PDF APIs, and I figure any of these implementation efforts will take real work and real time. No doubt, any of these would be welcome additions. But my gut tells me that the effort to implement any of them would also likely take an order of magnitude (pick a number) more effort than the "Phase 1" portion of the proposed enhanced photo advance function. As well, Phase 1 would provide a way to enhance the viewing experience of all of these different document formats, as well as others (at least via the export-photo or screenshot route).

I first posted about my interest in a PDF reader back in October 2009, and a few times since then. But now, with the University classes in full force over the past year, and now the new semester hitting me squarely in the face... well, let's just say my interest in this function has become exponentially greater. The iPad was useful last semester, but the effort involved with page turning was frankly excruciating.

In the end, Phase 1 is my attempt to propose a solution to my page turning problem - the fastest path forward but one that also provides real benefits not limited to PDFs. It is a feature I would happily pay for. Anything to get it over my self-serving goal-line.

mikelove said:
radioman said:

The nice thing about going the photo route near term is that Photo Stream is there, as is the Photo library. It's just done. And many Office-type products, including variations from Apple, Adobe, etc., all export out to picture files. For everything else, you can screen capture. And with better cameras available, and more and more people taking pictures, I figure that this simple way of collecting data and information will only become more useful.

That's true, though since we get a lot more requests for PDF support I'm wondering whether we could come up with a similarly quick-and-dirty way to do that - have to investigate the iOS PDF APIs some more.

character · Mar 18, 2012

Re: iPad 3

radioman said:
But from my perspective, the basic tenet should be that, during "hide chars" mode, the original document should be rendered as it was originally presented, with no shading or overlay. The only exception to this might be the line of the green box identifying the OCR perimeter.

I agree that this would be a very welcome tweak to the interface. It would also make for more impressive demos, as it would be easier for the potential Pleco customer to see the original document, and thus imagine reading the document in Pleco.

I don't pretend to know Pleco's business/marketing/development challenges (just my own), but what I have learned over the years is that Pleco likes to implement things "right". I get it.

Agreed. Pleco also has their own list of priorities. So, while I hope improving the "hide chars" mode as above is low-effort and can be added to the next release, the photo advance function does have some complexity and problematic aspects, as Mike mentioned. OTOH, this does feel like an unexpected but valuable use of Pleco which might cause the priorities to shift.

In the meantime, have you looked at making images which include more than one page? I don't know the practical limits of what iOS/Pleco can handle in terms of image size and character size (perhaps Mike can provide some guidance), but there should be some sweet spot to be found by concatenating several low-resolution images together, removing excess whitespace, etc. before importing the image to Pleco. I don't know if a whole chapter will fit in an image, but certainly several pages will. This would decrease the amount of time spent loading a different image.

As you described your use of Pleco (whole textbook scanned in and then read in Pleco) and the problem (teacher jumping around in the textbook) I wonder if next/previous buttons alone would solve the problem. It sounds like you would also need some delay in the start of OCR so you could tap next several times without OCR starting on each page before you got to the page you wanted to view.

radioman · Mar 18, 2012

Re: iPad 3

character said:
In the meantime, have you looked at making images which include more than one page?

Yeah, that's a good thought. And I definitely looked into it. Even downloaded special software to patch the pages together. Unfortunately, it was a pain to create, and the graphic was so large, it crashed the program. Just not convenient.

character said:
As you described your use of Pleco (whole textbook scanned in and then read in Pleco) and the problem (teacher jumping around in the textbook) I wonder if next/previous buttons alone would solve the problem. It sounds like you would also need some delay in the start of OCR so you could tap next several times without OCR starting on each page before you got to the page you wanted to view.

I really do believe this photo advance thing would by and large solve the issue. For any given reading effort, I can just screen-capture or photograph the images of interest. It takes about 20-30 seconds or so to get all set up for a 2 or 3 hour full-blown reading or classroom session. Maybe it would make sense to have a 1 second delay before the OCR function engaged so you could scroll through a few pages, but maybe that's a "phase 2" thing. Might be oversimplifying, but I figure 96% of the paging issues are going back one page or going forward one page. 3% would be spanning 2 pages, and the rest are whatever.
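To make the "delay before OCR engages" idea concrete, here is a minimal sketch of how such a delay could behave. It is not Pleco code: the function name `pages_to_ocr`, the tap timestamps, and the 1 second default are assumptions for illustration. It only simulates which pages would end up being OCR'd if OCR waits for the taps to settle.

```python
def pages_to_ocr(taps, delay=1.0):
    """Return the pages on which OCR would actually start.

    taps:  list of (time_in_seconds, page_number), sorted by time
           (hypothetical data standing in for next/previous taps).
    delay: how long the viewer waits after a tap before starting OCR.
    """
    fired = []
    for i, (t, page) in enumerate(taps):
        is_last = i == len(taps) - 1
        # OCR starts only if no further tap arrives within the delay window.
        if is_last or taps[i + 1][0] - t >= delay:
            fired.append(page)
    return fired


# Tapping "next" quickly from page 13 to 15, then going back to 14 later.
taps = [(0.0, 13), (0.3, 14), (0.6, 15), (5.0, 14)]
print(pages_to_ocr(taps))  # [15, 14]
```

Pages 13 and 14 are skipped because another tap arrives within the delay, so OCR only runs on the pages the reader actually stops on.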

character · Mar 18, 2012

Re: iPad 3

radioman said:
character said:

In the meantime, have you looked at making images which include more than one page?

Yeah, that's a good thought, and I definitely looked into it. I even downloaded special software to patch the pages together. Unfortunately, it was a pain to create, and the graphic was so large that it crashed the program. Just not convenient.

I know it's extra work, but any paging solution from Pleco is months away. I've loaded 2.5 MB, 3264 × 2448 pixel images into Pleco OCR (on an iPad 2); I don't know what the size limit is, or how small/low-resolution characters can be and still be successfully OCR'd.

Might be oversimplifying, but I figure 96% of the paging issues are going back one page or going forward one page. 3% would be spanning 2 pages, and the rest are whatever.

So if you made images with as few as 3-5 pages each, the frequency with which you would be caught out by paging issues would be drastically reduced? I'd like to understand the situation, as you have me thinking about scanning my Chinese textbooks in over the summer break. It would be nice to just bring my new iPad to class instead of a bag of books.

radioman · Mar 18, 2012

Re: iPad 3

Well, call me optimistic (desperate?) but hoping some patch can get put together sooner than a few months out. (Gotta try, right??)

But be that as it may, the way my classes go is that we might do a chapter over the course of 2 hours of class, with about 10 to 15 pages in the chapter: reading passages, etc. And prior to that class, I would want to read the chapter as well. So I have to screenshot 10 to 15 pages (each one takes a few seconds). I also have high-res pictures that can be used rather than screenshots, but using the screenshots is more convenient right now because I do not have to sort out pictures ahead of time. I just find out what is on the agenda, quickly prepare the screenshots, and start reading.

It's just convenient to open the PDF viewer of choice and page right to where you know the class will be reviewing. And it's really useful to have the full pages OCRed (not just copying the text to the clipboard). That is, looking at the page as it's intended in the document, with pictures and all, allows you to see the info in context.

character · Mar 18, 2012

Re: iPad 3

radioman said:
Well, call me optimistic (desperate?) but hoping some patch can get put together sooner than a few months out.

Sure, never say never with Mike. Can't hardly tear him away from Java/Android these days, for ex. :wink:

On your iPad, to combine images on your camera roll into a new image, try Photo Wall Pro ($3) -- I also tried Diptic and FrameMagic, both of which wouldn't let me resize the imported images, as far as I could tell. In PWP, I created a new project with a 2048x1536 pixel size canvas and put four Chinese documents on it and resized them to be as large as possible. I exported it to my camera roll and pulled it into Pleco OCR, which worked on it.
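For anyone who would rather script the layout than drag pages around by hand, here is a rough sketch of the arithmetic involved. It is not how Photo Wall Pro works internally; the function name `tile_layout` and the 768x1024 page size are made up for illustration. It just computes where each page would go on a fixed canvas when every page is scaled to be as large as its grid cell allows.

```python
def tile_layout(canvas_w, canvas_h, page_sizes, cols):
    """Place pages on a canvas in a grid, each scaled to fit its cell.

    page_sizes: list of (width, height) in pixels for each source page.
    Returns a list of (x, y, width, height) rectangles, one per page.
    """
    rows = -(-len(page_sizes) // cols)  # ceiling division
    cell_w = canvas_w / cols
    cell_h = canvas_h / rows
    placements = []
    for i, (w, h) in enumerate(page_sizes):
        # Largest scale that keeps the aspect ratio and fits the cell.
        scale = min(cell_w / w, cell_h / h)
        new_w, new_h = round(w * scale), round(h * scale)
        col, row = i % cols, i // cols
        # Center the scaled page inside its cell.
        x = round(col * cell_w + (cell_w - new_w) / 2)
        y = round(row * cell_h + (cell_h - new_h) / 2)
        placements.append((x, y, new_w, new_h))
    return placements


# Four hypothetical 768x1024 page images on a 2048x1536 canvas, 2 per row.
for rect in tile_layout(2048, 1536, [(768, 1024)] * 4, cols=2):
    print(rect)
```

With these numbers each page ends up 576x768 pixels, so whether the characters stay large enough for OCR would still need to be checked in Pleco itself.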

ETA: the first time I launched Photo Wall Lite and Photo Wall Pro they crashed. Launched them again and they worked. Either they have an initialization bug or they have some slight issue with the iPad 3rd gen.

mikelove · Mar 18, 2012

Re: iPad 3

radioman said:
Again, I might be a little confused. But from my perspective, the basic tenet should be that, during "hide chars" mode, the original document should be rendered as it was originally presented, with no shading or overlay. The only exception to this might be the line of the green box identifying the OCR perimeter. Perhaps you are talking about when the "hide chars" is inactive? And if that is the case, during that time I do not care if the shading is there or not as I will not be actually "reading" during that time, just looking to see if the OCR rendering was reasonably successful across the OCR area.

I was talking about when it's inactive, yes. There's not really any good reason to keep the shading when it's active except that it makes it easier to identify whether you're inside or outside of the box if you're zoomed in enough that none of the lines are visible - that's an obscure enough function that we can probably just eliminate the shading altogether in those instances.

radioman said:
I don't pretend to know Pleco's business/marketing/development challenges (just my own), but what I have learned over the years is that Pleco likes to implement things "right". I get it. But I hear "EPUB", integration of Microsoft PowerPoint hooks, potential PDF APIs, and I figure any of these implementation efforts will take real work and real time. No doubt, any of these would be welcome additions. But my gut tells me that the effort to implement any of them would also likely take an order of magnitude (pick a number) more effort than the "Phase 1" portion of the proposed enhanced photo advance function. As well, Phase 1 would provide a way to enhance the viewing experience of all of these different document formats, as well as others (at least via the export-photo or screenshot route).

Having spent a little time looking over the PDF documentation I'm actually feeling optimistic that that might not be an order of magnitude more work than image scrolling - in fact it may be hardly any more work at all. We could also look at having a third-party library like this one do the heavy lifting, but since we're already doing most of the tricky background work with efficient image scrolling ourselves I don't know if we'd even need to. If we consider PDF support to be an OCR feature rather than a document reader feature, it's probably the most requested single OCR feature (or at least the most requested one that doesn't require a total overhaul of the recognition engine), and since it would be nice to have some sort of significant OCR improvement in 2.3 (as it's a popular feature even if it's not the main focus of that update), PDF support is a very good candidate for that.

character said:
Sure, never say never with Mike. Can't hardly tear him away from Java/Android these days, for ex.

Not for the last few weeks, and probably not that much in general in 2012 - Android simply isn't generating enough money to sustain our business on its own, and may never do so, so while we'll continue releasing lots of little Android updates, the main focus of most of them is going to be on improvements to cross-platform engine code: stuff that we're actually developing originally on iOS, but want to get more widely tested than is possible given Apple's restrictions on device distribution. Which makes sense even from a technical perspective given the UI transition that Android is in the middle of (no sense doing any big UI overhauls there while we're still waiting to see how 4.0 adoption shakes out), but happens to also be the smart business move since iOS users are the ones paying our bills

(which is not to say that the Android version has been a mistake - still making more money than, say, OCR, which took nearly as long to develop, and it's done all sorts of good things for us both tech- and licensing-wise, but it would be extremely foolish for us to continue to make it our primary focus when iOS is doing so well)

Vzzzbx · Mar 18, 2012

Re: iPad 3

I've got this new iPad and, as I'm sure many of you can already attest, Chinese text looks incredible. No more lost strokes, anywhere, at any legible text size. GQ China looks terrible because all the text is compressed for the old iPad, but it'll catch up at some point.

mikelove said:
Vzzzbx said:

The autofocus camera will definitely come in useful for OCR and, er, not much else. I've not used my iPad 2's camera once.

I suspect that's mostly a casing thickness thing - the iPad 2 like the iPod Touch was too thin for an autofocus camera, but since they happened to be making the iPad 3 thicker anyway for the sake of its enormous battery (which is now getting up into laptop range capacity-wise) they found they now had the extra space for all of those lens elements and figured that if they were building in one anyway they might as well improve it.

Definitely, but the utility of a camera in a thing the size and shape of a notepad is questionable. For users of Pleco, of course, it's indispensable.

One potential problem with whacking a bigger camera in this iPad is that successive models will need to maintain the sensor and optical quality in a thinner case. I don't envy the engineers.

radioman · Mar 19, 2012

Re: iPad 3

mikelove said:
If we consider PDF support to be an OCR feature rather than a document reader feature

That makes some sense. I guess it really isn't the reader per se, but an OCR engine function. To the extent that might complement some of the planned feature enhancements being rolled out, obviously that's a good thing.

mikelove said:
Having spent a little time looking over the PDF documentation I'm actually feeling optimistic that that might not be an order of magnitude more work than image scrolling.

This would be great news if an "easy implementation" (oxymoron?) actually turns out to be the case.

Well, I have gone on here in this forum probably longer than I should have. But this PDF and/or photo feature from my perspective is really important. If there is any way to get it brought out sooner rather than later, I would be happy to assist in any testing, etc., whatever is needed.

mikelove · Mar 19, 2012

Re: iPad 3

Vzzzbx said:
One potential problem with whacking a bigger camera in this iPad is that successive models will need to maintain the sensor and optical quality in a thinner case. I don't envy the engineers.

That's true - going by this article it sounds like the extra power consumption from the display is largely backlight-related (more pixels = more light blocking = 3x as much power needed to light the screen) and should go down considerably in a year or so, and I assume that LTE chipsets will likewise get a good bit more efficient, so the iPad 4 will probably be able to shed some battery capacity and return to the iPad 2's level of thickness. But I'm sure Apple have planned around this and are confident that they'll be able to get things to fit, perhaps by shuffling the internals so that there's less distance between the back of the camera and the front of the device.

radioman said:
That makes some sense. I guess it really isn't the reader per se, but an OCR engine function. To the extent that might complement some of the planned feature enhancements being rolled out, obviously that's a good thing.

It raises the question of whether we should consider merging the OCR and document reader add-ons into a single "Pleco Reader" module at - say - $20; if we want to play up the still-image aspects of OCR going forward - and we probably do - and to leverage OCR for other document-reading-related features (look up characters in images on web pages, say) then that's a more logical place for it.

radioman said:
This would be great news if an "easy implementation" (oxymoron?) actually turns out to be the case.

Well it looks like it's only minimally more complicated than rendering a regular image into an offscreen buffer - CGPDFDocumentCreateWithURL / CGPDFDocumentGetPage / CGContextDrawPDFPage and you're done. For some reason I'd had it in mind that it was a lot more work, or else we probably would have done it in one of our .. updates - then again, it feels like enough of a marquee feature that we would probably be better off putting it in 2.3 regardless so that we can claim credit for it then

And there still may be some complications concerning memory usage or efficient scrolling / zooming that won't become clear until we actually try to implement it, so probably the sort of feature that needs a proper testing period.
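As a rough illustration of why memory usage could matter here, the sketch below estimates the size of the offscreen bitmap needed to render one PDF page at a given zoom. It does not call the Core Graphics functions named above; the helper name `bitmap_bytes`, the US Letter page size (612 x 792 points), and the 4 bytes per pixel (RGBA) are assumptions for the sake of the arithmetic.

```python
import math


def bitmap_bytes(width_pt, height_pt, scale, bytes_per_pixel=4):
    """Estimate bytes needed for an offscreen bitmap of one PDF page.

    width_pt, height_pt: page size in PDF points (1/72 inch).
    scale: pixels per point (e.g. 2.0 for a Retina-style rendering).
    """
    width_px = math.ceil(width_pt * scale)
    height_px = math.ceil(height_pt * scale)
    return width_px * height_px * bytes_per_pixel


# Assumed US Letter page rendered at 1x, 2x and 4x.
for scale in (1.0, 2.0, 4.0):
    size = bitmap_bytes(612, 792, scale)
    print(f"{scale:.0f}x: {size / (1024 * 1024):.1f} MiB")
```

Doubling the scale quadruples the buffer, so a fully zoomed-in page gets expensive quickly, which is presumably why rendering only the visible region or tiling is a common approach for this kind of viewer.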

radioman said:
Well, I have gone on here in this forum probably longer than I should have. But this PDF and/or photo feature from my perspective is really important. If there is any way to get it brought out sooner rather than later, I would be happy to assist in any testing, etc., whatever is needed.

You're certainly welcome into the 2.3 beta once that gets going...

radioman · Mar 19, 2012

Re: iPad 3

mikelove said:
It raises the question of whether we should consider merging the OCR and document reader add-ons into a single "Pleco Reader" module at - say - $20; if we want to play up the still-image aspects of OCR going forward - and we probably do - and to leverage OCR for other document-reading-related features (look up characters in images on web pages, say) then that's a more logical place for it.

Well, if there was an enhanced reader function for $20 that helps in the way we've been referencing, I certainly would shell out the money. While it's all OCR-driven, it certainly sounds like a "reader" function to me.

As for 2.3 Beta testers... for sure, get me on that list!

mikelove · Mar 19, 2012

Re: iPad 3

radioman said:
Well, if there was an enhanced reader function for $20 that helps in the way we've been referencing, I certainly would shell out the money. While it's all OCR-driven, it certainly sounds like a "reader" function to me.

Oh no, we wouldn't charge for that as a new add-on, at least I don't think we would - this would just be changing the way we presented things to new buyers to avoid confusion over the fact that you might need a different add-on to open text files than to open PDFs.
