Hi everybody,

On Fri, 21 Apr 2000, David Woolley wrote:

> > * zlib compressed image data in the generated PDF output file
>
> Why? The images are already in an appropriate compressed format
> supported by PDF. If one wanted better compression, the only
> reasonable choice would be to go from group3 to group4 fax.
> Fax encoding should even be supported by Acrobat 2.

I thought about it as well, but the answer is quite simple: because it was the simplest thing to do. libtiff decompresses the image it reads, and it lets you hook into the function it uses to store pixels when it reads a TIFF file. I simply replaced that function with one that stores the 1bpp data coming from the TIFF as 1bpp. The result is the uncompressed TIFF image in memory. A single fax page thus needs around 250kB of memory. Temporarily, the tool allocates about the same amount of memory again for the compression. After compression it just outputs the file and frees the allocated memory.

The choice of the compression algorithm mostly had to do with:

* ease of programming (zlib is VERY easy to use)
* accessibility of documentation
* pure coincidence
* I wanted this thing quickly

I agree that it would be conceptually simpler to just copy the TIFF images into the PDF, but that would have required a lot more work (how do I access the compressed data in the TIFF?). The libtiff framework just works well for me. Having said this, I would like to know how much work it would be to implement exactly that. I also think that this approach would speed up the conversion a lot.

Currently the tool converts a 17-page fax of about 540kB to PDF within 10 seconds on my 166MHz Pentium Linux machine.
[ not the fastest thing in the world, but some of my customers have 233MHz machines, where the thing should be a little bit faster. ]

I would also like to say that I wrote this little tool because my experiments with the alternatives were not very promising:

* tiff -> ps | ghostscript -> pdf
  seemed too complicated (not DOING it, but the concept) and too slow.
* tiff -> pdf using ImageMagick
  ImageMagick is a very resource-intensive tool. It needed up to 80MB just to convert one page.
* tiff -> pdf using other tools
  The available tools were either commercial (I didn't want to pay around USD 500 just to license such a tool) or did not work for faxes.

BTW, I have stomped out a tiny bug that caused Acrobat Reader to complain about the PDF being corrupted. The very last line of a PDF must be %%EOF, but the tool generated only %EOF. Since a % introduces a comment in PDF, this shouldn't really hurt, but a PDF viewer presumably uses this tag, just like PostScript viewers use the PS Document Structuring Conventions.

Another improvement is the proper generation of the cross-reference table.

If anybody is interested, I can send him/her the new version.

peter
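
For illustration, the memory figure and the zlib step described in the message can be sketched in a few lines of Python. This is not the tool's code. The page geometry (1728 pixels wide, 1145 scan lines, roughly a standard-resolution fax page) and the function names are assumptions made for the example, and the raster is synthetic rather than read through libtiff.

```python
import zlib

# Assumed geometry of a standard-resolution fax page (illustrative values).
WIDTH, HEIGHT = 1728, 1145


def make_test_raster(width=WIDTH, height=HEIGHT):
    """Build a synthetic 1bpp raster: a white page with one black bar.

    Rows are padded to whole bytes. With PDF's default decoding for
    1-bit DeviceGray images, a set bit is white and a clear bit is black.
    """
    row_bytes = (width + 7) // 8
    white = b"\xff" * row_bytes
    black = b"\x00" * row_bytes
    return b"".join(black if 100 <= y < 120 else white for y in range(height))


def flate_compress(raster):
    """Compress the whole raster in one call, as a second in-memory buffer."""
    return zlib.compress(raster, 9)


raster = make_test_raster()
packed = flate_compress(raster)

# 216 bytes per row * 1145 rows = 247320 bytes, i.e. "around 250kB".
print("uncompressed:", len(raster), "bytes")
print("compressed:  ", len(packed), "bytes")
```

A zlib stream like `packed` is what a PDF image object declared with `/Filter /FlateDecode` expects, which is why no further wrapping is needed. A real fax page compresses far less well than this nearly blank test raster.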
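
The two fixes mentioned at the end (the `%%EOF` marker and the cross-reference table) can be illustrated with a minimal PDF writer. Again this is only a sketch and not the tool's implementation: `build_pdf` and the three page-tree objects are made up for the example, and the page is empty, with no image attached.

```python
def build_pdf(object_bodies):
    """Serialize object bodies into a PDF with an xref table and trailer.

    Objects are numbered 1..n in list order; object 1 must be the catalog.
    """
    out = bytearray(b"%PDF-1.2\n")
    offsets = []
    for number, body in enumerate(object_bodies, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    count = len(object_bodies) + 1  # +1 for the free entry of object 0
    out += b"xref\n0 %d\n" % count
    # Every xref entry is exactly 20 bytes, including its end-of-line.
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % count
    # The file must end with %%EOF (two percent signs), preceded by the
    # byte offset of the xref section.
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] >>",
]
pdf = build_pdf(objects)
print(pdf.decode("ascii"))
```

The point of recording `len(out)` before each object is that the offsets in the table are then correct by construction, rather than computed separately and at risk of drifting from the actual output.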