Today I noticed there is yet another volunteer project trying to get people to type in text from the Hawaiian-Language Newspapers using images that are not clear. This time people are being asked to correct text done by Cambodians. I won’t speak on the ramifications of using Cambodians to OCR Hawaiian newspapers, because that is a whole separate issue in itself.
I will say once again, however, that I believe the information written in the Hawaiian-Language Newspapers is important enough to reproduce accurately so that we can search and find what was originally written within their pages. I don’t know if there is anyone who feels any differently.
The example given in the ad calling for volunteers shows precisely why we need to FIRST get good, clean images of the newspapers AND THEN typescript them so they are word-searchable. The highlighted column will never be fully legible using this image, because there is a big fold running down the left side, obscuring two or three letters in each line. There are pages and pages like this (and many are even less legible).
Is getting 70 or 80 or 90% of the words sufficient? What if your kupuna wrote something or was written about; would 90% of it be good enough for you? What if the one time her name was mentioned in the article was in a part that was folded over, or was too dark to read…