HylaFAX The world's most advanced open source fax server |
We have a hylafax system that works fine most of the time, but a few days ago I sent a fax and got this email back in response:
Fax job 48 failed:
Submitted From: localhost
Page Width: 215 (mm)
Page Length: 279 (mm)
Resolution: 98 (lpi)
Status: Kill time expired
Dialogs: 0
Dials: 0
Calls: 0
Pages: 0
TotPages: 0
Attempts: 0
Dirnum:
/var/spool/hylafax/log normally contains files like c000000436 that show a detailed log about each fax, but that file is missing for this fax. /var/spool/hylafax/log/xferfaxlog seems to imply the job was never assigned the ID that would generate this detailed log. It skips from 000000436 to 000000437 and job 48 is missing this ID entirely:
01/30/09 14:51 SEND 000000436 ttyS0 46 "" cdragon@xxxxxxxxxxx "323 221-1830" "3232211830" 2220073 12 5:01 4:33 "" "" "" "" "fax" "00 46 1F 22"
01/30/09 18:19 UNSENT 48 "" kelly@xxxxxxxxxxxx "18185584620" "" 0 0 0:00 0:00 "Kill time expired" "" "" "" "root" ""
01/30/09 14:57 SEND 000000437 ttyS0 47 "" cdragon@xxxxxxxxxxx "310 858-7212" "3108587212" 2220033 11 16:17:30 16:17:04 "" "" "" "" "fax" "00 52 1F 22"
Syslog seems to show there were three faxes going out around the same time. These were sent via the WHFC print driver. The failed job 48 was sent via an automated PDF generation process using the sendQueuedMsgs.pl script. Running the same script using the same PDF files worked in a later test (more on that later).
Jan 30 14:49:31 intranet HylaFAX[14087]: Filesystem has SysV-style file creation
semantics.
Jan 30 14:49:31 intranet FaxQueuer[6277]: SUBMIT JOB 45
Jan 30 14:49:32 intranet FaxGetty[6287]: LOCKWAIT
Jan 30 14:49:48 intranet FaxSend[14098]: MODEM AGERE OCM V.92 VER2.7A (JUN 14 20
04) VOICE MERCURY DP2SH MODE-II SERIAL Agere OCM V.92 Ver2.7a (Jun 14 2004) Voic
e Mercury DP2SH mode-ii SERIAL/
Jan 30 14:49:49 intranet FaxSend[14098]: SEND FAX: JOB 45 DEST 310 527-6435 COMMID 000000435 DEVICE '/dev/ttyS0' FROM 'Chris Dragon <cdragon@xxxxxxxxxxx>' USER fax
Jan 30 14:50:01 intranet /USR/SBIN/CRON[14112]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 14:50:25 intranet HylaFAX[14132]: Filesystem has SysV-style file creation semantics.
Jan 30 14:50:25 intranet FaxQueuer[6277]: SUBMIT JOB 46
Jan 30 14:51:04 intranet FaxSend[14098]: SEND FAX: JOB 45 SENT in 0:47
Jan 30 14:51:08 intranet FaxQueuer[6277]: NOTIFY: bin/notify "doneq/q45" "done" "1:30"
Jan 30 14:51:08 intranet FaxGetty[6287]: LOCKWAIT
Jan 30 14:51:09 intranet FaxQueuer[6277]: NOTIFY exit status: 0 (14157)
Jan 30 14:51:09 intranet postfix/pickup[12289]: 26841108F8: uid=10 from=<FaxMaster>
Jan 30 14:51:09 intranet postfix/cleanup[14241]: 26841108F8: message-id=<20090130225109.26841108F8@xxxxxxxxxxxxxxxxxxxxxxxx>
Jan 30 14:51:09 intranet postfix/qmgr[6503]: 26841108F8: from=<FaxMaster@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, size=1066, nrcpt=1 (queue active)
Jan 30 14:51:12 intranet HylaFAX[14247]: Filesystem has SysV-style file creation semantics.
Jan 30 14:51:12 intranet FaxQueuer[6277]: SUBMIT JOB 47
Jan 30 14:51:14 intranet postfix/smtp[14244]: 26841108F8: to=<cdragon@xxxxxxxxxxx>, relay=dummy0.junkemailfilter.com[65.49.42.60]:25, delay=5.7, delays=0.73/0.1/3.7/1.1, dsn=2.0.0, status=sent (250 OK id=1LT2Cg-0003zn-0C)
Jan 30 14:51:14 intranet postfix/qmgr[6503]: 26841108F8: removed
Jan 30 14:51:33 intranet FaxSend[14254]: MODEM AGERE OCM V.92 VER2.7A (JUN 14 2004) VOICE MERCURY DP2SH MODE-II SERIAL Agere OCM V.92 Ver2.7a (Jun 14 2004) Voice Mercury DP2SH mode-ii SERIAL/
Jan 30 14:51:33 intranet FaxSend[14254]: SEND FAX: JOB 46 DEST 323 221-1830 COMMID 000000436 DEVICE '/dev/ttyS0' FROM 'Chris Dragon <cdragon@xxxxxxxxxxx>' USER fax
Jan 30 14:55:01 intranet /USR/SBIN/CRON[14385]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 14:56:30 intranet FaxSend[14254]: SEND FAX: JOB 46 SENT in 4:27
Jan 30 14:56:34 intranet FaxQueuer[6277]: NOTIFY: bin/notify "doneq/q46" "done" "5:11"
Jan 30 14:56:34 intranet FaxGetty[6287]: LOCKWAIT
Jan 30 14:56:35 intranet FaxQueuer[6277]: NOTIFY exit status: 0 (14450)
Jan 30 14:56:35 intranet postfix/pickup[12289]: CD2371090B: uid=10 from=<FaxMaster>
Jan 30 14:56:35 intranet postfix/cleanup[14534]: CD2371090B: message-id=<20090130225635.CD2371090B@xxxxxxxxxxxxxxxxxxxxxxxx>
Jan 30 14:56:35 intranet postfix/qmgr[6503]: CD2371090B: from=<FaxMaster@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, size=1067, nrcpt=1 (queue active)
Jan 30 14:56:41 intranet postfix/smtp[14537]: CD2371090B: to=<cdragon@xxxxxxxxxxx>, relay=dummy0.junkemailfilter.com[65.49.42.60]:25, delay=6.3, delays=0.71/0.04/5.2/0.36, dsn=2.0.0, status=sent (250 OK id=1LT2Hx-0006GC-92)
Jan 30 14:56:41 intranet postfix/qmgr[6503]: CD2371090B: removed
Jan 30 14:57:10 intranet FaxSend[14551]: MODEM AGERE OCM V.92 VER2.7A (JUN 14 2004) VOICE MERCURY DP2SH MODE-II SERIAL Agere OCM V.92 Ver2.7a (Jun 14 2004) Voice Mercury DP2SH mode-ii SERIAL/
Jan 30 14:57:11 intranet FaxSend[14551]: SEND FAX: JOB 47 DEST 310 858-7212 COMMID 000000437 DEVICE '/dev/ttyS0' FROM 'Chris Dragon <cdragon@xxxxxxxxxxx>' USER fax
Jan 30 15:00:01 intranet /USR/SBIN/CRON[14643]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 15:05:01 intranet /USR/SBIN/CRON[14805]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 15:09:01 intranet /USR/SBIN/CRON[14933]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Jan 30 15:10:01 intranet /USR/SBIN/CRON[14970]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 15:15:01 intranet /USR/SBIN/CRON[15130]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 15:17:01 intranet /USR/SBIN/CRON[15201]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jan 30 15:20:01 intranet /USR/SBIN/CRON[15293]: (root) CMD (/home/cdragon/Automation/sendQueuedMsgs.pl)
Jan 30 15:20:01 intranet HylaFAX[15303]: Filesystem has SysV-style file creation semantics.
Jan 30 15:20:02 intranet HylaFAX[15303]: What we have here is a PDF file
Jan 30 15:20:02 intranet last message repeated 3 times
Jan 30 15:20:02 intranet FaxQueuer[6277]: SUBMIT JOB 48
Jan 30 15:20:02 intranet postfix/pickup[12289]: 4648310914: uid=0 from=<root>
Jan 30 15:20:02 intranet postfix/cleanup[15307]: 4648310914: message-id=<20090130232002.4648310914@xxxxxxxxxxxxxxxxxxxxxxxx>
Jan 30 15:20:02 intranet postfix/qmgr[6503]: 4648310914: from=<root@xxxxxxxxxxxx
ensnatureinstitute.com>, size=638, nrcpt=1 (queue active)
Jan 30 15:20:02 intranet postfix/local[15309]: 4648310914: to=<cdragon@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, orig_to=<root>, relay=local, delay=0.31, delays=0.16/0.11/0/0.05, dsn=2.0.0, status=sent (delivered to mailbox)
Jan 30 15:20:02 intranet postfix/qmgr[6503]: 4648310914: removed
So, I see job 48 was submitted, but grep "JOB 48" shows it's never referenced again in the log.
Looking at /var/spool/hylafax/doneq/q48 the file confirms the job was seen as containing 0 pages, but near the end it shows:
pdf:0::docq/doc95.pdf.48
pdf:0::docq/doc96.pdf.48
pdf:0::docq/doc97.pdf.48
pdf:0::docq/doc98.pdf.48
These four pdf files do exist in the docq directory. I submitted this job again using the same script and identical PDF files (I did not regenerate the files) and the fax worked fine. The working PDFs have an identical size and identical ownership:
-rw-r----- 1 uucp fax 169725 2009-02-03 08:25 doc103.pdf
-rw-r----- 1 uucp fax 218289 2009-02-03 08:25 doc104.pdf
-rw-r----- 1 uucp fax 155251 2009-02-03 08:25 doc105.pdf
-rw-r----- 1 uucp fax 240979 2009-02-03 08:25 doc106.pdf
-rw-r----- 1 uucp fax 169725 2009-01-30 15:20 doc95.pdf.48
-rw-r----- 1 uucp fax 218289 2009-01-30 15:20 doc96.pdf.48
-rw-r----- 1 uucp fax 155251 2009-01-30 15:20 doc97.pdf.48
-rw-r----- 1 uucp fax 240979 2009-01-30 15:20 doc98.pdf.48
I'm using HylaFax version 4.4.3 but according to the changelog, only one unrelated thing was changed in 4.4.4. I see more changes in the CVS tree that sound more promising, but I don't want to upgrade unless there's a reasonable chance of preventing this in the future. Can anyone tell what went wrong?
----------
For reference, here's what syslog looks like on a resubmit of job 48 when it worked correctly and generated a 7 page fax (I did change the fax number to one I knew would be busy, and I manually killed the job after it dialed):
Feb 3 10:25:02 intranet HylaFAX[11773]: Filesystem has SysV-style file creation semantics.
Feb 3 10:25:02 intranet HylaFAX[11773]: What we have here is a PDF file
Feb 3 10:25:02 intranet last message repeated 3 times
Feb 3 10:25:02 intranet FaxQueuer[6277]: SUBMIT JOB 51
Feb 3 10:25:03 intranet FaxGetty[6287]: LOCKWAIT
Feb 3 10:25:27 intranet FaxSend[11816]: MODEM AGERE OCM V.92 VER2.7A (JUN 14 2004) VOICE MERCURY DP2SH MODE-II SERIAL Agere OCM V.92 Ver2.7a (Jun 14 2004) Voice Mercury DP2SH mode-ii SERIAL/
Feb 3 10:25:27 intranet FaxSend[11816]: SEND FAX: JOB 51 DEST 13108609019 COMMI
D 000000443 DEVICE '/dev/ttyS0' FROM 'kelly <kelly@xxxxxxxxxxxx>' USER root
Feb 3 10:25:36 intranet FaxSend[11816]: SEND FAILED: JOB 51 DEST 13108609019 ERR [1] Busy signal detected
Feb 3 10:25:37 intranet FaxQueuer[6277]: NOTIFY: bin/notify "sendq/q51" "requeued" "" "10:28"
|