I'm looking for some optical character/text recognition (OCR) software. I have no prior experience in this field.
Strict requirements:
* accepts one or more widely used *BITMAP* formats.
* pixel color format: one or more of 8-bit palettized, 888 RGB, and 8888 CMYK/RGB-none.
* virtually perfect accuracy for large type, and very high reliability for small type. Character omission must be extremely low or nonexistent, if necessary at the cost of a higher false-positive rate.
* fairly rapid.
* fully automated operation. (no user interaction required)
* cheap or free. This is for academic-interest use, currently with no budget.
* memory restriction: max ~200 MB available for the application.
* output to raw text, 8-bit cp437/cp865 (extended ASCII). Templates/macro scripting for output formatting is a big plus.
* Win32 API app on x86 (IA-32). Processor requirement may be i80586 or i80686 instruction set compatibility (max Pentium III set). Pentium 4 or later specifics are not tolerated.
Human-assisted mode is a plus, but full automation is most important. Built-in bulk processing features are welcome but not required (will be scripted).
Targeted input: computer-generated text and scans/photos of printed text, i.e. almost anything you might see displayed on a computer screen, short of handwriting. No handwriting recognition capability is necessary. Some custom pre-recognition filtering can be arranged if that helps satisfy the requirements.
As long as a single popular bitmap format is accepted, that's fine. (Bulk) conversion can be done as needed.
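If a tool only emits Unicode text, the cp437/cp865 output requirement could probably be covered by a small scripted conversion step afterwards. A rough Python sketch of what I mean (the function name `to_dos_bytes` is made up for illustration; the `cp437` and `cp865` codecs are the ones shipped with Python's standard library):

```python
def to_dos_bytes(text: str, codepage: str = "cp437") -> bytes:
    """Encode recognized text into an 8-bit DOS code page.

    Characters that do not exist in the chosen code page are replaced
    with '?' instead of aborting, so a bulk run never stops halfway.
    """
    if codepage not in ("cp437", "cp865"):
        raise ValueError("only cp437 and cp865 are handled in this sketch")
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    # cp865 (Nordic) contains the letter 'ø'; cp437 does not.
    print(to_dos_bytes("ø", "cp865"))
    print(to_dos_bytes("ø", "cp437"))
```

This only handles the encoding; templates or other output formatting would still have to come from the OCR tool or from further scripting.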
Target text is largely monochromatic, or at least can be preprocessed into low-variance color in bulk, but inputs may have potentially any background.
Key text orientations are *dead* horizontal left-to-right and *dead* vertical up and down. Expected deviation from this is very low. Some degree of compensation for scan misalignment is needed, but general text orientation can be specified.
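To give an idea of the kind of pre-recognition filtering I could arrange: since the text color is known and nearly uniform, each image can be reduced to a 1-bit mask (text vs. everything else) before it reaches the recognizer. A minimal sketch in plain Python, assuming the image is already decoded into rows of (r, g, b) tuples; the name `isolate_text` and the default tolerance of 48 are placeholders, not tuned values:

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def isolate_text(pixels: Sequence[Sequence[RGB]],
                 text_rgb: RGB,
                 tolerance: int = 48) -> List[List[int]]:
    """Return a 1-bit mask: 1 where a pixel is close to the known text
    color, 0 for everything else (i.e. the arbitrary background).

    Closeness is the Euclidean distance in RGB space, compared in
    squared form to avoid a square root per pixel.
    """
    limit = tolerance * tolerance
    tr, tg, tb = text_rgb
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            dist_sq = (r - tr) ** 2 + (g - tg) ** 2 + (b - tb) ** 2
            mask_row.append(1 if dist_sq <= limit else 0)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    # Two near-black "text" pixels on a white and a red background.
    image = [[(0, 0, 0), (255, 255, 255)],
             [(10, 10, 10), (200, 30, 30)]]
    print(isolate_text(image, text_rgb=(0, 0, 0)))
```

Background pixels that happen to match the text color would still get through, so this only helps where text and background colors are reasonably distinct.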