HylaFAX The world's most advanced open source fax server


Re: [hylafax-users] sendfax sometimes hangs, can't identify protocol?



Mark,

Are you using triggers at all, or JWAIT (which uses triggers internally)?
It looks as if faxq has just "lost" track of the job that that
particular hfaxd submitted.  The only way I know of for that to happen
is the trigger bug, which has been fixed recently (but is still present
in 4.3.3).

Also, the slowness of the faxq process when handling large queues is a
well-known deficiency of the scheduler in faxq.  If you're interested in
handling large queues (with or without batching), I would recommend you
try out current CVS.  You can get a snapshot of it from:
	ftp://ftp.hylafax.org/source/hylafax-SNAPSHOT.tar.gz
or, get it straight from CVS:
	:pserver:cvs:cvs@xxxxxxxxxxxxxxx:/cvsroot 
or from GIT:
	git://cvs.hylafax.org/HylaFAX

For busy queues where all devices are almost always busy, the new
scheduler is essentially O(1), whereas the original HylaFAX scheduler
was O(N), and the one in 4.3 is O(N**2).
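
To give a rough sense of what those growth rates mean at the queue sizes
Mark describes, here is a small Python sketch. It is not the faxq
scheduler; it only counts abstract work units per scheduling pass,
assuming that O(1), O(N) and O(N**2) translate to exactly 1, N and N*N
steps. The function name and the step model are made up for illustration.

```python
def steps_per_pass(queue_len):
    """Abstract work units per scheduling pass for each growth rate.

    Assumption: all constant factors are 1. A real scheduler's constants
    differ, so only the relative growth is meaningful here.
    """
    n = queue_len
    return {"O(1)": 1, "O(N)": n, "O(N**2)": n * n}


# Compare a small queue with the roughly 8000-job queue from the report.
for n in (100, 1000, 8000):
    costs = steps_per_pass(n)
    print(f"N={n:>5}: " + ", ".join(f"{k}={v}" for k, v in costs.items()))
```

Under this simplified model, a queue of 8000 jobs costs 64,000,000 steps
per pass in the quadratic case, against 8000 in the linear case and 1 in
the constant case.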

* Mark Hunting <mark@xxxxxxxxxx> [070508 06:33]:
> The server is very busy today, and I already have a new hanging sendfax 
> process. It has been hanging for almost an hour now, and I guess it will 
> never finish:
> 
> strace -p26875
> uucp     26875  0.0  0.1   4152  1664 ?        S    11:42   0:00 
> /usr/bin/sendfax -m -T 3 -I 300 -n -k now + 3 days -P 128 -f 
> xxxxxx@xxxxxxxxxxxxx -d 084xxxxxxx 92726.pdf
> 
> This is the strace:
> 
> Process 26875 attached - interrupt to quit
> read(3,
> 
> Under normal circumstances the last line becomes something like
> 
> read(3, "200 Job 118161 submitted.\r\n", 1024) = 27
> etc...
> 
> And here is the strace of the corresponding hfaxd process:
> 
> strace -p26876
> Process 26876 attached - interrupt to quit
> select(5, [0 4], [], [], NULL
> 
> Which under normal circumstances becomes something like
> 
> select(5, [0 4], [], [], NULL***)          = 1 (in [4])
> read(4, "S*\0", 2047)                   = 3
> read(4, 0xbfb03500, 2047)               = -1 EAGAIN (Resource 
> temporarily unavailable)
> write(1, "200 Job 119050 submitted.\r\n", 27) = 27
> fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
> etc...
> 
> I also notice that adding faxes to the queue (using sendfax) is always 
> very slow when the queue is long (7000+ faxes). It becomes so slow that 
> the queue never becomes bigger than +/- 8000 faxes. At that point adding 
> faxes to the queue (from my perl loop) goes at the same speed as the 
> sending of the faxes itself using 60 phone lines. In the past this 
> problem was even worse, until I set MaxBatchJobs to 1. I don't know why 
> adding faxes is still slow now when the queue is long. When I do a 
> strace on those slow processes, they hang for some seconds (or longer) 
> at the same point as the straces above. Apparently sometimes these slow 
> processes are not only slow, but hang forever. Slow processes are no 
> problem, but the faxes should be sent at some point, and not hang forever.
> 
> I hope you can help me with this problem. Please let me know if you need 
> more information.
> 
> Best regards,
> Mark
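
The straces above show the client blocked in read() waiting for the
"200 Job N submitted." line that hfaxd never writes. As a sketch only,
here is how a client-side read of that reply could give up after a
timeout instead of blocking forever. This is not sendfax's code; the
function names and the timeout values are assumptions for illustration,
and the reply format is taken from the strace output in the quoted
message.

```python
import re
import socket

# Reply format as seen in the strace: "200 Job 118161 submitted.\r\n"
REPLY_RE = re.compile(rb"^(\d{3}) Job (\d+) submitted\.\r\n$")


def parse_submit_reply(line):
    """Return the job id from a '200 Job <id> submitted.' line, else None."""
    if line is None:
        return None
    m = REPLY_RE.match(line)
    if m and m.group(1) == b"200":
        return int(m.group(2))
    return None


def read_reply(sock, timeout_s):
    """Read one CRLF-terminated reply; return None on timeout or EOF.

    The timeout applies to each recv() call, not to the whole reply.
    """
    sock.settimeout(timeout_s)
    buf = b""
    try:
        while not buf.endswith(b"\r\n"):
            chunk = sock.recv(1024)
            if not chunk:  # peer closed the connection
                return None
            buf += chunk
    except socket.timeout:
        return None
    return buf


if __name__ == "__main__":
    # A local socket pair stands in for the client/hfaxd connection.
    client, server = socket.socketpair()
    server.sendall(b"200 Job 118161 submitted.\r\n")
    print(parse_submit_reply(read_reply(client, 1.0)))
    # Nothing more is sent, so this read gives up instead of hanging.
    print(read_reply(client, 0.2))
```

A timeout like this only keeps the submitting side from hanging; it does
not tell you whether faxq actually queued the job, so the job state would
still need to be checked separately.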



-- 
Aidan Van Dyk                                             aidan@xxxxxxxx
Senior Software Developer                          +1 215 825-8700 x8103
iFAX Solutions, Inc.                                http://www.ifax.com/




