This page is about using serial TNCs with Linux Packet Radio.
Mostly Plug and Play
For the majority of people, it appears that serial TNCs work great with Linux. Many people use kissattach to use them in KISS mode, and this seems to work just out of the box for almost everyone. It did for me (KR0L) with my Kenwood TS-2000 and standard 16550A UART on my Core 2 Quad machine.
Transmit Hanging Problem
I’m going to go into a lot of detail here about a particular problem I had – and also the thought process for troubleshooting. This may be more detail than you need, but it should perhaps help illuminate the problem.
I set up another system – a laptop with a USB-to-serial converter and an old Kantronics KPC-3+ TNC. The computer is talking to the TNC at 19200bps and the TNC is running in KISS mode. So several things are different from my working setup: the use of a USB-to-serial converter, the make and model of TNC, and the serial interface baud rate (19200 instead of 9600, though the over-the-air baud rate is 1200 in both instances.)
The Symptoms
I initially noticed a problem almost immediately. I tried telnetting from my main station to this new one over VHF. Initially things started to work fine, but partway through the session setup I stopped receiving packets from the KPC-3+ TNC. The first question was: was there something weird with the TCP setup causing this? So I placed an AX.25 call to it and had the same problem, ruling out TCP.
I then went over to the problem machine itself and ran axlisten. axlisten showed incoming packets just fine, but showed no outgoing packets. I ran axcall on that machine, which should have immediately caused an outgoing packet, but still nothing. I restarted the system and it worked fine for a while.
I also observed that stty sometimes would even hang after this problem occurred, and once even had a kernel panic.
Flow Control Background
This made me suspect a flow control problem from the start. Here’s what flow control means. Let’s say you have a 19200bps link to your TNC, but your TNC can only transmit at 1200bps. (This is also a common situation with a modem.) Your TNC or modem probably has some sort of internal transmit buffer, but it’s going to be small. When it gets full, it needs to tell the PC to stop sending for a bit so that it doesn’t lose characters.
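To put rough numbers on that mismatch, here is a minimal sketch of the arithmetic. The function name seconds_to_fill and the 1024-byte buffer used in the note below are assumptions for illustration only, not documented figures for any TNC, and the model ignores AX.25 framing overhead and transmitter key-up delays.

```c
#include <assert.h>

/* Estimate how long a TNC buffer lasts if the host sends continuously.
 * host_bps:  serial link speed; 8N1 framing puts 10 bits on the wire per byte.
 * air_bps:   over-the-air speed; assumed 8 bits per byte, no protocol overhead.
 * buf_bytes: assumed buffer size (hypothetical, not taken from any manual).
 * Returns seconds until the buffer is full, or -1.0 if it never fills. */
double seconds_to_fill(double host_bps, double air_bps, double buf_bytes)
{
    assert(host_bps > 0 && air_bps > 0 && buf_bytes > 0);
    double in_rate = host_bps / 10.0;  /* bytes per second arriving */
    double out_rate = air_bps / 8.0;   /* bytes per second leaving */
    if (in_rate <= out_rate)
        return -1.0;                   /* radio drains faster than host fills */
    return buf_bytes / (in_rate - out_rate);
}
```

With the rates from this page and the assumed 1024-byte buffer, seconds_to_fill(19200, 1200, 1024) comes out a little under 0.6 seconds, against roughly 1.3 seconds at 9600bps. Either way the buffer would be gone almost immediately under sustained output.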
Unfortunately, there is no standard way to accomplish this on a serial link. The two most common ways are called XON/XOFF or software flow control, and RTS/CTS or hardware flow control. XON/XOFF works by having the device send a Ctrl-S when it needs the remote end to stop transmitting, and a Ctrl-Q when it’s ready for it to resume. The advantage of this is that it doesn’t require any additional signaling cabling. The disadvantages are many. First, it takes a certain time to transmit a Ctrl-S or Ctrl-Q. Moreover, what if the application in question needs to send one of those characters as part of its data? Sending a binary file, for instance, is almost guaranteed to send a Ctrl-S at some point. Without careful avoidance of this, it can result in a connection appearing to hang (lock up, or deadlock) due to XON/XOFF processing.
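The hang described above can be sketched as a tiny state machine. This is only an illustration of the idea, not how any particular driver implements it; the function name tx_allowed_after is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XON  0x11  /* Ctrl-Q: resume sending */
#define XOFF 0x13  /* Ctrl-S: stop sending */

/* Run a stream of received bytes past a side that honors XON/XOFF and
 * report whether that side is still allowed to transmit afterwards. */
bool tx_allowed_after(const unsigned char *rx, size_t len)
{
    assert(rx != NULL || len == 0);
    bool allowed = true;
    for (size_t i = 0; i < len; i++) {
        if (rx[i] == XOFF)
            allowed = false;  /* a data byte that happens to be 0x13 also lands here */
        else if (rx[i] == XON)
            allowed = true;
    }
    return allowed;
}
```

If binary data containing 0x13 arrives and no 0x11 follows, the function returns false: the transmitter stays paused indefinitely even though nobody asked it to stop.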
RTS/CTS involves signaling pins on the serial connector. The computer applies voltage to a pin when it’s able to receive and removes it when it isn’t. This is elegant in that it is out of band from the data, so there are no issues with handling special characters. It also is virtually instantaneous. But it’s not supported by some older hardware or non-standard cabling. In practice, people prefer to use RTS/CTS whenever possible.
Flow Control Investigation
So my problem smells, in part, of XON/XOFF flow control deadlock. That is, the PC perhaps received a Ctrl-S (XOFF) from the TNC and never got a corresponding Ctrl-Q. That could happen for a few reasons. One reason could be that the packets sent to the TNC over the air contained a Ctrl-S character. Another could be that the TNC sent an XOFF but for whatever reason the XON was never received (or never sent by the TNC).
So, how could that explain the stty hang or the kernel panic? Well, a bit of investigation reveals that the USB serial port converter driver appears to handle XON/XOFF either directly in that driver or in the firmware of the converter itself. If the driver or firmware is stuck waiting for an XON that never arrives, anything else that touches the port, stty included, could plausibly block as well. This could also explain the kernel panic, or the kernel panic could have come from the AX.25 stack being unexpectedly prevented from transmitting for so long.
Either way, more investigation into flow control is warranted.
kissattach doesn’t have any documented options about flow control. Strangely, mkiss from the same package has a -h option to enable hardware flow control (“handshaking” according to the mkiss manpage), but it has nothing that disables XON/XOFF (it is possible, though useless, to run both at the same time.)
It was time to look at the source code.
In ax25-tools/kiss/mkiss.c, hwflag is set to true if hardware flow control is requested. Then it calls:
tty_raw(tty->fd, hwflag);
tty_raw itself is defined in libax25 as, partially:
if (tcgetattr(fd, &term) == -1) { .... }
term.c_cc[VMIN] = 1;
term.c_cc[VTIME] = 0;
term.c_iflag = IGNBRK | IGNPAR;
term.c_oflag = 0;
term.c_lflag = 0;
#ifdef CIBAUD
term.c_cflag = (term.c_cflag & (CBAUD | CIBAUD)) | CREAD | CS8 | CLOCAL;
#else
term.c_cflag = (term.c_cflag & CBAUD) | CREAD | CS8 | CLOCAL;
#endif
if (hwflag)
    term.c_cflag |= CRTSCTS;
Note, by the way, that kissattach never calls this function at all. Also note that nothing in libax25 or ax25-tools ever mentions XON/XOFF by name. The plain assignment to c_iflag above does appear to drop IXON and IXOFF as a side effect for programs that call tty_raw, but kissattach, which never calls it, leaves whatever software flow control setting the port already had.
The TNC
The second piece of the puzzle is what the TNC is doing. According to the Kantronics manual, the XFLOW option defines whether software flow control will be used, and it defaults to ON. The TNC I was using had just been reset to factory defaults, so of course this was ON over there. The manual says that in “transparent mode”, TRFLOW and TXFLOW can override that. KISS is different from transparent mode, so there is no apparent overriding from there. So it may have been sending XOFF characters as flow control, although the documentation did not specifically address its use in KISS.
The documentation did state that “The TNC always uses hardware flow control”. There is no documented option to disable it, and no documentation that it’s disabled in KISS. (However, there is some cause to doubt the accuracy of both statements in the KISS context, though they may still prove accurate.)
The KISS Specification
Thus far, we have a suspicion that XON/XOFF flow control could be a problem. The PC may be seeing XOFF characters generated by the TNC, or possibly passed through in other packets. So what does the KISS specification say about it? We’re particularly interested in what it says about flow control in general, and also what steps it may take to prevent incoming Ctrl-S characters from being presented directly as such to the PC.
On the first point, we see:
- “One of the things that makes the KISS TNC simple is the deliberate lack of TNC/host flow control.” [section 5]
Essentially, KISS says that there is to be no flow control; if buffers are exceeded, packets just get dropped, but the underlying AX.25 protocol already has other ways to deal with that.
So what about XON/XOFF characters in particular?
- “No RS-232C handshaking signals are employed.” [section 2]
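The “handshaking signals” here most naturally mean the RS-232 control lines such as RTS and CTS, but taken together with the section 5 statement, the intent appears to be that no flow control of any kind, XON/XOFF included, passes between TNC and host. Another concern is what happens if a Ctrl-S naturally occurs in the incoming data. KISS defines ways of dealing with four characters that have special meaning in section 2 (FEND, FESC, TFEND, and TFESC). However, Ctrl-S (0x13) is not among them.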
Therefore:
- The TNC must not be inserting XON or XOFF in its communications to the PC
- Also, XON/XOFF may naturally occur in the data sent to the PC, so the PC must NOT be using XON/XOFF handshaking.
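To make the second point concrete, here is a minimal sketch of KISS frame encoding using the four special byte values from the specification. The function name kiss_encode and its buffer handling are my own choices for illustration, not code from ax25-tools.

```c
#include <assert.h>
#include <stddef.h>

#define FEND  0xC0  /* frame end */
#define FESC  0xDB  /* frame escape */
#define TFEND 0xDC  /* transposed frame end */
#define TFESC 0xDD  /* transposed frame escape */

/* Wrap a payload in a KISS data frame for port 0.
 * Returns the number of bytes written to out, or 0 if out is too small.
 * The space check is conservative: it always reserves room for a
 * two-byte escape plus the closing FEND. */
size_t kiss_encode(const unsigned char *data, size_t len,
                   unsigned char *out, size_t cap)
{
    assert(data != NULL || len == 0);
    assert(out != NULL);
    size_t n = 0;
    if (cap < 3)
        return 0;
    out[n++] = FEND;
    out[n++] = 0x00;  /* command byte: data frame, port 0 */
    for (size_t i = 0; i < len; i++) {
        if (n + 3 > cap)
            return 0;
        if (data[i] == FEND) {
            out[n++] = FESC;
            out[n++] = TFEND;
        } else if (data[i] == FESC) {
            out[n++] = FESC;
            out[n++] = TFESC;
        } else {
            out[n++] = data[i];  /* 0x11 and 0x13 pass through untouched */
        }
    }
    out[n++] = FEND;
    return n;
}
```

Only 0xC0 and 0xDB are escaped. A payload byte of 0x13 or 0x11 goes onto the serial line as-is, which is why a port that still honors XON/XOFF can stall on perfectly ordinary packet data.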
Other Efforts
I did Google about this, and found one other report of the problem. That solution, however, disabled RTS/CTS, and reported it helped the problem – but only a little bit.
Further Investigation
So far, we’ve established that the TNC has XON/XOFF turned on, and thus may be generating these characters in KISS mode. What is the PC side doing?
# stty -a -F /dev/ttyUSB0
speed 19200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>; start = ^Q;
stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
-parenb -parodd cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc -ixany -imaxbel -iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke
Notice in there the -crtscts. The dash means it’s disabled, though that’s probably because I did that. Note also ixon -ixoff. Reading the stty manpage shows that that means it is processing XON/XOFF on incoming traffic but is not generating XON/XOFF on outgoing traffic. (It is unlikely that the PC would do so anyhow, since its processing speed is so far higher than 1200bps).
The Attempted Solution
So looking at the above evidence, I think it makes sense that I need to:
- Set XFLOW OFF in the TNC to make sure it doesn’t insert XON/XOFF characters
- Completely disable XON/XOFF on the PC side
- Experiment with turning on hardware flow control on the PC side
Why enable hardware flow control, even though the KISS spec says not to use flow control? Well, there’s a chance that this could avoid the occasional dropped packet and help throughput. The danger is how big the buffer on the PC may be; bufferbloat could turn out to be an even bigger problem. My first attempt, though, will be to turn it on.
So, how to do that? Just add this line to my startup script:
stty -F /dev/ttyUSB0 raw crtscts
The raw option includes -ixon -ixoff, which disables XON/XOFF handling, as well as disabling handling of other special control characters. This doesn’t set parity; maybe -parenb -parodd cs8 should be added, but stty -a shows that these are already set in my instance.
Result
After doing the above, I found that the thing worked better, but not perfectly. Also disabling RTS/CTS and switching to 9600bps fixed it. I suspect a buggy 19200bps implementation in the Edgeport (the USB-to-serial converter).
Addendum January 2017
There is apparently a serious bug in the KPC-3 and KPC-3+ KISS mode that may be at the root of this. As an experimental measure, I have dropped the serial interface rate from 9600 to 1200 to try to address it. Perhaps it was not the Edgeport with the buggy 19200bps implementation, but actually the Kantronics.