HylaFAX The world's most advanced open source fax server
* Dan Brummer <dan.brummer@xxxxxxxxx> [070222 11:39]:
> Hello,
> I currently run HylaFAX 4.3.1 in a clustered environment.  I'm noticing
> an issue when two jobs are submitted the FaxQueuer process
> crashes/exits.  I don't see much in the log but here are the entries
> just before the cluster fails over to the other node:
>
> Feb 21 21:27:42 hylafax01 FaxQueuer[5131]: SUBMIT JOB 780144
> Feb 21 21:27:42 hylafax01 FaxQueuer[5131]: SUBMIT JOB 780145
> Feb 21 21:27:43 hylafax01 FaxGetty[5180]: LOCKWAIT
> Feb 21 21:27:43 hylafax01 FaxGetty[5180]: STATE CHANGE: RUNNING -> LOCKWAIT (timeout 30)
> Feb 21 21:27:43 hylafax01 FaxGetty[5184]: LOCKWAIT
> Feb 21 21:27:43 hylafax01 FaxGetty[5184]: STATE CHANGE: RUNNING -> LOCKWAIT (timeout 30)
>
> Right after these messages my cluster software, LinuxHA Heartbeat, fails
> over to the other node because it sees FaxQueuer not running anymore.
> Is this a known bug and is there any resolutions?  I just recently
> upgraded to 4.3.1 from 4.3.0 because I was having the same issue,
> FaxQueuer crashing on same time job submits but in 4.3.0 it gave me a
> 'Assertion failed "QLink::remove: item not on a list", file "QLink.c++"
> line 53' error.

Dan,

There are no *known* bugs in 4.3.1 that would cause that.

Do you get any core files?  Can you increase logging so we can see what is
going on when it's exiting?  ServerTracing to 0xFFFFF would be good.

Also, if you could put a GDB on the faxq, and give us a backtrace, that
would be helpful.  If doing this, remember to protect GDB in a screen
session (or something similar) and set pagination off.

a.

-- 
Aidan Van Dyk                 aidan@xxxxxxxx
Senior Software Developer     +1 215 825-8700 x8103
iFAX Solutions, Inc.          http://www.ifax.com/
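
The GDB suggestion above could be set up roughly as follows. This is only a sketch, not a procedure taken from the thread: the command-file path and the screen session name are hypothetical, and it assumes gdb, screen, and pgrep are installed and that the scheduler process is named faxq. The script writes a GDB command file and prints the commands to run by hand; it does not attach to anything itself.

```shell
#!/bin/sh
# Sketch under stated assumptions; the file name below is hypothetical.
CMDS="${TMPDIR:-/tmp}/faxq-gdb.cmds"

# GDB commands: disable the pager so the backtrace is not cut off at a
# "--More--" prompt, let faxq keep running, and print a backtrace once
# GDB regains control (for example when the process receives a fatal signal).
cat > "$CMDS" <<'EOF'
set pagination off
continue
backtrace
EOF

# Print the manual steps. Running GDB inside screen keeps the debugger
# (and the attached faxq) alive if the terminal or SSH session drops.
echo 'Run as root on the active node:'
echo '  screen -S faxq-gdb        # session name is an arbitrary choice'
echo "  gdb -p \"\$(pgrep -o -x faxq)\" -x $CMDS"
```

If faxq exits normally rather than on a signal, the final backtrace command reports that there is no stack, which is itself useful to know. Keep in mind that faxq is stopped from the moment GDB attaches until the continue command runs, so the cluster monitor may need a generous timeout.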