> On the server, the documents that must be sent are coming in a PostScript
> format, OK?

Not a requirement. Any format that can be recognized, and for which a converter has been configured, should do.

> That means that a particular text, let's say:
>
> <HYLAFAX-PHONE-NUMBER=xxxxxxxxx> (the same font, size and
> characteristics)
>
> would be found in the PostScript file as clear text. Some simple filters

It is very unlikely to be in clear text from any Windows application; they are notorious for producing very large PostScript files through very heavy use of fine positioning movements. You need a full PostScript interpreter to track font size etc., so any content search should probably ignore font information. (I sometimes wonder if the microspacing is done to make Word a better format than PDF by inflating the PDF.) (Early Windows drivers produce less noisy PostScript.)

It's not impossible to recover the text, but you must make some (probably simple) assumptions about the structure of the PostScript program generated and then delete everything except the string literals. Spaces are likely to get lost, unless you interpret the PostScript and infer them from the character positions.
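As a rough illustration of the "delete everything except string literals" approach, here is a minimal Python sketch. It is not part of HylaFAX; the function names `extract_ps_strings` and `find_phone_number` are made up for this example. It assumes the driver emits text as ordinary `(...)` string literals (hex strings in `<...>` and re-encoded fonts are not handled), ignores font information entirely, and simply glues the fragments together, so any spaces produced by positioning moves are lost. Since the tag itself contains no spaces, the search strips all whitespace before matching.

```python
import re
from typing import List, Optional

# Single-character escapes allowed inside a PostScript (...) string.
_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
            "\\": "\\", "(": "(", ")": ")"}

# The tag from the question; the capture group is the number itself.
_TAG = re.compile(r"<HYLAFAX-PHONE-NUMBER=([^>]*)>")


def extract_ps_strings(ps: str) -> List[str]:
    """Return the contents of every (...) literal, in order of appearance."""
    out = []
    i, n = 0, len(ps)
    while i < n:
        ch = ps[i]
        if ch == "%":
            # Comment outside a string: skip to end of line.
            while i < n and ps[i] not in "\r\n":
                i += 1
        elif ch == "(":
            depth, buf = 1, []
            i += 1
            while i < n and depth > 0:
                c = ps[i]
                if c == "\\" and i + 1 < n:
                    nxt = ps[i + 1]
                    if nxt in _ESCAPES:
                        buf.append(_ESCAPES[nxt])
                        i += 2
                    elif nxt in "01234567":
                        # Octal escape: one to three octal digits.
                        j = i + 1
                        while j < n and j < i + 4 and ps[j] in "01234567":
                            j += 1
                        buf.append(chr(int(ps[i + 1:j], 8) & 0xFF))
                        i = j
                    elif nxt in "\r\n":
                        # Backslash-newline is a line continuation.
                        i += 2
                        if nxt == "\r" and i < n and ps[i] == "\n":
                            i += 1
                    else:
                        # Unknown escape: the backslash is dropped.
                        buf.append(nxt)
                        i += 2
                elif c == "(":
                    # Balanced parentheses may nest unescaped.
                    depth += 1
                    buf.append(c)
                    i += 1
                elif c == ")":
                    depth -= 1
                    if depth > 0:
                        buf.append(c)
                    i += 1
                else:
                    buf.append(c)
                    i += 1
            out.append("".join(buf))
        else:
            i += 1
    return out


def find_phone_number(ps: str) -> Optional[str]:
    """Look for the tag in the concatenated string literals, ignoring spaces."""
    text = re.sub(r"\s+", "", "".join(extract_ps_strings(ps)))
    match = _TAG.search(text)
    return match.group(1) if match else None
```

Calling `find_phone_number(open("job.ps", encoding="latin-1").read())` on a spooled job (the file name here is only a placeholder) would return the number if the tag survives as string literals, or `None` otherwise. Whether it survives depends on the driver: output that shows one character at a time is still handled by the concatenation, but output that re-encodes the font is not.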