HylaFAX The world's most advanced open source fax server

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [hylafax-users] Dropping Faxes -- With Logs



Todd Patton wrote:

Jan 15 23:09:59.20: [ 1845]: <-- [11:AT+FRM=122\r]
Jan 15 23:10:00.42: [ 1845]: --> [7:CONNECT]
Jan 15 23:10:00.42: [ 1845]: MODEM input buffering enabled
Jan 15 23:10:00.42: [ 1845]: MODEM set XON/XOFF/FLUSH: input ignored,
output generated
Jan 15 23:10:04.02: [ 1845]: Timeout awaiting synchronization sequence
Jan 15 23:10:04.22: [ 1845]: MODEM TIMEOUT: reading line from modem
Jan 15 23:10:04.22: [ 1845]: MODEM <Timeout>
Jan 15 23:10:04.22: [ 1845]: <-- [11:AT+FRM=122\r]
Jan 15 23:10:11.22: [ 1845]: MODEM TIMEOUT: reading line from modem
Jan 15 23:10:11.33: [ 1845]: --> [2:]
Jan 15 23:10:11.33: [ 1845]: --> [2:OK]
Jan 15 23:10:11.33: [ 1845]: <-- [9:AT+FRH=3\r]
Jan 15 23:10:11.67: [ 1845]: --> [7:CONNECT]
Jan 15 23:10:12.61: [ 1845]: --> [2:10 03]
Jan 15 23:10:12.61: [ 1845]: --> [10:NO CARRIER]
Jan 15 23:10:12.61: [ 1845]: MODEM No carrier
Jan 15 23:10:12.61: [ 1845]: <-- [9:AT+FRS=7\r]
Jan 15 23:10:42.61: [ 1845]: MODEM TIMEOUT: reading line from modem
Jan 15 23:10:42.61: [ 1845]: MODEM <Timeout>
Jan 15 23:10:42.72: [ 1845]: --> [2:]
Jan 15 23:10:42.72: [ 1845]: --> [2:OK]


Although the final demise of the first log that you posted was not precisely the problem you see here (which shows the T.31 protocol demise of the second log), the first log did exhibit several of these same kinds of messy protocol sequences.


Basically what you see there is HylaFAX's Class 1 driver telling the modem to listen for V.17 12000 bps signalling, and the modem reports CONNECT back indicating that it found it. The modem then starts delivering demodulated data, which isn't being logged here, but would probably show us a sequence of 1-bits (0xFF bytes). These are called "flags" and I suspect that they continued uninterrupted for the 3.6 seconds until that first timeout occurs.

Now, one could say that HylaFAX should not timeout at 3.6 seconds, but instead should wait 5 or 10 or 30 seconds here. However, 3.6 seconds is already very generous per the spec requirements, I believe. This is twelve times the length of time permitted by T.4 A.3.1. If the sender needs to use flow control at this point they should be using sync flags (0x7E) instead of 0xFF. So anyway, HylaFAX then attempts to abort the AT+FRH=122 command. The first "MODEM TIMEOUT: reading line from modem" apparently is because the modem did not honor the method of aborting that HylaFAX uses by default. This may have only been a temporary problem, or it's possible that the modem needs to be configured differently for this to work better. However, as in this case, this abort problem is not usually detrimental to the session.

Now, at this point we really don't have an option to re-issue the AT+FRM=122 command as you see happening there. The carrier flags are only sent at the outset of a carrier, and the modem won't be able to train. So the fact that we see no CONNECT message following it is no surprise. The subsequent timeout was inevitable. After aborting an AT+FRM=122 command we only have an option to do AT+FRH=3... which we do after the second "MODEM TIMEOUT: reading line from modem" there. (Notice that the abort seems to have worked the second time - so the abort problem must be limited in its scope.)

The modem unfortunately reports CONNECT here almost immediately after AT+FRH=3 (listen for V.21 HDLC signalling). It probably mis-reported the V.17 signalling with the CONNECT. This is a common shortcoming with many modems in their V.21 HDLC detection. They seem to report CONNECT at any carrier signal without first checking to see if the demodulation of that carrier results in HDLC. So the subsequent "NO CARRIER" message from the modem could mean just about anything from a loss of V.17 carrier or simply the chance demodulation of a DLE+ETX in the audio (which appears to have been the case). The modem then begins to listen for silence (before transmitting some signal back to the sender)... but if we believe the modem here it then waits 30 seconds for silence and still never gets it.

So if we trust the modem as much as it appears to be trustworthy, it would seem that the sender transmitted way too many 0xFF flags in V.17 before sending any image data. There is no guarantee that the sender actually started transmitting any image data at all, either. However, the rest of the logs seem to indicate that it does... and that would lead me to believe that it's using the 0xFF V.17 flags inappropriately as an extended flow control.

If my interpretation is closely correct, then you should see this happening only with some senders. Is that correct? If so, do you know what kind of sender it is?

I think that maybe extending the timeout from 3.6 seconds to maybe 30 would be a useful test. Attached is a patch that does this. Let me know if a code patch is not useful for you.

Lee.
--- hylafax.orig/faxd/Class1.c++	2007-01-12 18:11:37.000000000 -0800
+++ hylafax/faxd/Class1.c++	2007-01-16 10:59:29.888368752 -0800
@@ -899,9 +899,9 @@
     // look for the first sync flag
 
     time_t start = Sys::now();
-    startTimeout(3600);		// twelve times T.4 A.3.1
+    startTimeout(30000);	// despite indications by T.4 A.3.1
     do {
-	if ((unsigned) Sys::now()-start >= 3) {
+	if ((unsigned) Sys::now()-start >= 30) {
 	    protoTrace("Timeout awaiting synchronization sequence");
 	    setTimeout(true);
 	    return (false);
@@ -909,7 +909,7 @@
 	bit = getModemBit(0);
     } while (bit != 0 && !didBlockEnd());
     do {
-	if ((unsigned) Sys::now()-start >= 3) {
+	if ((unsigned) Sys::now()-start >= 30) {
 	    protoTrace("Timeout awaiting synchronization sequence");
 	    setTimeout(true);
 	    return (false);



Project hosted by iFAX Solutions