From: steveb@shade.UUCP (Steve Barber)
Newsgroups: unix-pc.general,comp.sys.att
Subject: Kernel Patch to Improve Serial Line Response Time (update)
Date: 15 Apr 90 06:15:25 GMT
Reply-To: steveb@shade.ann-arbor.mi.us (Steve Barber)
Organization: Ripley Computing/Consulting Services (Ripley, MI)

Back in December, Gene Olson (gene@digibd) posted information explaining
how to patch your 3.51a kernel to allow it to move characters from the
raw interrupt queue to the raw input queue every 1/60th of a second,
rather than every 4/60ths of a second as the code in the kernel is
written.

The rest of this article is an update to Gene's original article
explaining how to make the change to the 3.51m kernel. (Yes, due to the
MeterMaid code, the addresses are different.)

I should add a few comments here regarding the use of this code, now
that I've been running this way for a while:

1) I can definitely notice a difference when typing on a terminal at
19200. The characters are echoed almost instantaneously, giving the
system a much crisper feel.

2) These changes do tend to bog the system down if you have a lot of
data moving in and out of your serial ports. I don't have any kind of
numbers to back this up, but during a 19200 uucp transfer to a directly
connected Sun, doing much of anything else with the system can be
frustrating. The same number of interrupts are being serviced, but now
characters are being moved 4 times as often. I dunno, maybe it's just
me. A busy 19200 port is going to slow things down with or without this
modification.

3) I can testify that setting nttyhog in ktune to 0 and running two
serial ports heavily (in my case, 19200 and 2400) can lock up your
system. I guess 19200 is fast enough to steal all the clists if you
don't use ttyhog to flush the buffers.

4) My friend whose Sun I'm now connected to used to have a 3B1. When we
applied this patch to both of our machines, our average UUCP transfer
rates jumped from around 750 bytes/sec to almost 1400 bytes/sec.
When only one machine of the two had the patch, average transfer rates
were around 1000 bytes/sec, but presumably the machine with the patch
could read data at 1400 bytes/sec, and the other machine was still
reading at 750 bytes/sec, thus causing the average to be 1000 bytes/sec.

5) Just as another data point in the hardware flow control issue (please
don't start another argument out of this!): I've been using hardware
flow control on 19200 ports on my 3b1 since OS 3.51. For the most part,
it works. uucp works great that way, and in fact due to the packet
nature of the g protocol, I seem to recall that it doesn't end up
needing to use the flow control much anyway. The only problems I've
noticed have been when I'm logged in with cu over a 19200 serial line
with flow control. Occasionally there are lost characters, and I suspect
(during a long ls, for instance) that sometimes I lose more like a few
lines than a few characters, but I've never worried about it too much.
pcomm 1.1 handled the speed with no problem, but 1.2 seems to lose some
characters now and then too. (That and the feeble vt100 emulation slows
it down a lot. I may just comment that out.)

The remainder of this article was written by Gene Olson. It describes
how to patch your 3.51m kernel using adb. I've changed the portions
that need to be changed after the upgrade from 3.51a to 3.51m.

---------------------------------------------------------------------------

BACKGROUND
----------

I noticed that the response time to keystrokes on my UNIX-pc was not as
good as I have come to expect from terminals and other systems. I poked
around a bit and found the cause.

In the Kernel, there is a routine called "serscan" which moves data
(this is conjecture) from the raw interrupt queue to the raw input
queue. As it turns out, "serscan" (apparently) is only called by the
"clock" routine, and only every 4 clock TICs.
This 4 clock delay causes up to a 4/60 second variation in keystroke
echo times which is quite noticeable to some people. (Me for example).

The following "adb" patch changes the "clock" routine to call "serscan"
every TIC instead, improving both instant response time and consistency.

Some extra CPU time is taken with the extra call. I found on my system
that CPU intensive benchmarks took .3% longer. This decrease in
performance was not noticeable of course, and it is well worth it to a
crackpot like myself.

*WARNING*  +----------------------------------------------------------+
*WARNING*  |                  ***** DISCLAIMER *****                  |
*WARNING*  |                                                          |
*WARNING*  |  If you do this and blow it, you have just corrupted     |
*WARNING*  |  your UNIX kernel and may wind up reloading your system  |
*WARNING*  |  from scratch. Make a backup copy of /unix before        |
*WARNING*  |  beginning so you can copy it back if necessary.         |
*WARNING*  |                                                          |
*WARNING*  |  You may wish to patch the in-core version of UNIX       |
*WARNING*  |  first so you can try it out before altering your        |
*WARNING*  |  permanent version of /unix.                             |
*WARNING*  |                                                          |
*WARNING*  |  You take personal responsibility for any patches to     |
*WARNING*  |  your kernel; you will have no-one to blame but          |
*WARNING*  |  yourself if you lose or corrupt data.                   |
*WARNING*  |                                                          |
*WARNING*  |  [SWB: Please DO NOT try this if you are not familiar    |
*WARNING*  |  with the use of the adb debugger. It is very easy to    |
*WARNING*  |  make a mistake or typo without realizing it. At that    |
*WARNING*  |  point recovery can be tricky! I recommend making a      |
*WARNING*  |  copy of your kernel to work with. I also recommend      |
*WARNING*  |  stepping through the procedure the first time WITHOUT   |
*WARNING*  |  the write (-w) option to adb!]                          |
*WARNING*  +----------------------------------------------------------+

--------------- Log of ADB session begins here --------------
---------------        # comments added        --------------

##### The -w means WRITE.
##### With the -w option set, any changes you make
##### while in adb WILL be written. Make sure you know what you're doing!
# adb -w unix

##### Check out a portion of the Clock
##### interrupt routine.
#####
##### Note that it calls "serscan" only after "fserscan"
##### is incremented to 4, so "serscan" is called only
##### every 4 clock tics.
clock+490,10?ia
clockspecial+a4:        tst.b   serinprogress
clockspecial+aa:        bne.b   clockspecial+bc
clockspecial+ac:        tst.l   serbufcnt
clockspecial+b2:        beq.b   clockspecial+bc
clockspecial+b4:        clr.l   (%sp)
clockspecial+b6:        jsr     serrint
clockspecial+bc:        and.w   &-701,sr
clockspecial+c0:        mov.b   fserscan,%d0
clockspecial+c6:        addq.b  &1,fserscan
clockspecial+cc:        cmp.b   &2,%d0
clockspecial+d0:        ble.b   clockspecial+de
clockspecial+d2:        clr.b   fserscan
clockspecial+d8:        jsr     serscan
clockspecial+de:        and.w   &-701,sr
clockspecial+e2:        tst.l   idleflag
clockspecial+e8:        beq.b   clockspecial+104

##### Change the compare so we test "fserscan" against
##### 0 instead of 2. Now we call "serscan" every 2
##### clock tics.
clock+4ba?x
clockspecial+ce:        2
.?w 0
clockspecial+ce:        2       =       0

##### Change the "ble" to a "blt", so now we call "serscan"
##### every clock tic.
clock+4bc?x
clockspecial+d0:        6f0c
.?w 6d0c
clockspecial+d0:        6f0c    =       6d0c

##### Check our work. Note that the intent of the original
##### code is preserved. If you wish to slow the scan back
##### down, you can still do it by patching clock+4ba.
clock+490,10?ia
clockspecial+a4:        tst.b   serinprogress
clockspecial+aa:        bne.b   clockspecial+bc
clockspecial+ac:        tst.l   serbufcnt
clockspecial+b2:        beq.b   clockspecial+bc
clockspecial+b4:        clr.l   (%sp)
clockspecial+b6:        jsr     serrint
clockspecial+bc:        and.w   &-701,sr
clockspecial+c0:        mov.b   fserscan,%d0
clockspecial+c6:        addq.b  &1,fserscan
clockspecial+cc:        cmp.b   &0,%d0
clockspecial+d0:        blt.b   clockspecial+de
clockspecial+d2:        clr.b   fserscan
clockspecial+d8:        jsr     serscan
clockspecial+de:        and.w   &-701,sr
clockspecial+e2:        tst.l   idleflag
clockspecial+e8:        beq.b   clockspecial+104

--
--**-Steve Barber----steveb@shade.Ann-Arbor.MI.US----(cmode)-------------------
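
For anyone who would rather check the counter arithmetic on paper before
touching a kernel, here is a rough Python model of the "fserscan" test
shown in the listings above. It only mimics the five instructions from
clockspecial+c0 through clockspecial+d8; the function name and its
parameters are made up for this illustration and are not part of the
kernel, and the 60 tics per second figure is taken from the 1/60th
second clock mentioned at the top of the article.

```python
HZ = 60  # clock tics per second, per the 1/60th second figure in the article


def ticks_between_calls(threshold, skip_if_equal, n_ticks=240):
    """Return the distinct gaps (in tics) between simulated serscan calls.

    threshold      immediate operand of the cmp.b (2 originally, 0 patched)
    skip_if_equal  True models ble.b (skip when d0 <= threshold),
                   False models blt.b (skip when d0 < threshold)
    Both names are hypothetical, chosen only for this sketch.
    """
    fserscan = 0
    call_ticks = []
    for tick in range(1, n_ticks + 1):
        d0 = fserscan              # mov.b  fserscan,%d0 (value before increment)
        fserscan += 1              # addq.b &1,fserscan
        if skip_if_equal:
            skip = d0 <= threshold  # cmp.b &threshold,%d0 ; ble.b
        else:
            skip = d0 < threshold   # cmp.b &threshold,%d0 ; blt.b
        if not skip:
            fserscan = 0           # clr.b  fserscan
            call_ticks.append(tick)  # jsr   serscan
    return sorted({b - a for a, b in zip(call_ticks, call_ticks[1:])})


if __name__ == "__main__":
    cases = [
        ("original (cmp &2, ble)", 2, True),
        ("after first patch (cmp &0, ble)", 0, True),
        ("after both patches (cmp &0, blt)", 0, False),
    ]
    for label, threshold, skip_if_equal in cases:
        for gap in ticks_between_calls(threshold, skip_if_equal):
            delay_ms = gap * 1000 / HZ
            print(f"{label}: every {gap} tic(s), up to {delay_ms:.1f} ms")
```

The model treats "fserscan" as a plain integer rather than a byte, which
is harmless here because the counter is cleared long before it could
wrap.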