Re: [hylafax-users] Extract text strings from fax



On Tue, 20 May 2003 16:33:50 +0800
Wei Ming Long <WEI_Ming_Long@dsi.a-star.edu.sg> wrote:

> Hi everyone,
> Can we extract some text strings from the fax such as "To: Matthew" and then
> route this fax to a folder named "Matthew" so that all incoming faxes can be
> routed to the respective recipients' folders? Would this be possible? Has
> anyone done this before, and would they care to share how they did it?

To extract the text from a TIFF file, you can use the open source OCR
tool gocr, which is available from http://jocr.sourceforge.net.

The TIFF file first has to be converted to PNM format with the netpbm
suite of tools (http://netpbm.sourceforge.net).

Both of these should be available as RPM packages for most distributions.
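
As a rough illustration of the two steps, here is a minimal Python sketch
that shells out to tifftopnm (from netpbm) and then to gocr.  It assumes
both programs are on the PATH, that tifftopnm writes the PNM image to
standard output, and that gocr accepts the input file with -i; check the
man pages of your installed versions.  The function names and the
temporary file name are made up for the example, and multi-page faxes
are not handled specially.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(tiff_path, pnm_path):
    """Return the two argv lists: TIFF -> PNM, then PNM -> text."""
    # Assumption: tifftopnm writes the converted image to stdout.
    convert = ["tifftopnm", str(tiff_path)]
    # Assumption: gocr reads the image named by -i and prints text to stdout.
    ocr = ["gocr", "-i", str(pnm_path)]
    return convert, ocr


def ocr_fax(tiff_path):
    """Convert one fax TIFF to PNM, run gocr on it, return the text."""
    with tempfile.TemporaryDirectory() as tmp:
        pnm_path = Path(tmp) / "fax.pnm"  # hypothetical scratch file
        convert, ocr = build_commands(tiff_path, pnm_path)
        # Step 1: TIFF -> PNM, captured into the scratch file.
        with open(pnm_path, "wb") as pnm_file:
            subprocess.run(convert, stdout=pnm_file, check=True)
        # Step 2: PNM -> text; gocr's stdout is the recognised text.
        result = subprocess.run(ocr, capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Calling ocr_fax() with the path of a received fax returns whatever text
gocr managed to recognise, which may well contain misread characters.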

However, gocr is very sensitive to the type style or font.  It has
difficulty with italic fonts, for example, and unusual fonts may be
beyond its current capability.

It may be worth a try, but I doubt that such a method would give
consistent results.
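
Given that unreliability, any routing built on OCR output probably needs
a fallback.  The following is only a sketch of one way to do it in
Python, not something I have run in production: it looks for a "To:"
line in the recognised text, moves the fax into a folder of that name
only if the folder already exists, and otherwise drops it into a
catch-all folder.  The names extract_recipient, route_fax and "unsorted"
are invented for the example.

```python
import re
import shutil
from pathlib import Path

# A line starting with "To:" followed by a name (letters, spaces, . ' -).
TO_LINE = re.compile(r"^\s*To\s*:\s*([A-Za-z][A-Za-z .'-]*)",
                     re.IGNORECASE | re.MULTILINE)


def extract_recipient(ocr_text):
    """Return the name on the first 'To:' line, or None if none is found."""
    match = TO_LINE.search(ocr_text)
    if not match:
        return None
    return match.group(1).strip() or None


def route_fax(fax_path, ocr_text, base_dir, fallback="unsorted"):
    """Move the fax into the recipient's folder, or into the fallback."""
    recipient = extract_recipient(ocr_text)
    # Only existing folders are valid targets, so a misread name cannot
    # create a stray directory.  Matching ignores upper/lower case.
    known = {p.name.lower(): p for p in Path(base_dir).iterdir() if p.is_dir()}
    target = known.get(recipient.lower()) if recipient else None
    if target is None:
        target = Path(base_dir) / fallback
        target.mkdir(exist_ok=True)
    return Path(shutil.move(str(fax_path), str(target)))
```

Anything that lands in the catch-all folder still has to be sorted by
hand, which is where the inconsistency of the OCR will show up.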

I have used gocr to recover text from files that I had lost but which
were fortunately still available as stored faxes.

Frank Peters
