Debugging SSH logins

Adam La Rosa

Posted on May 10, 2020

One of the main reasons I got into Unix systems was the ability to log into another machine remotely. While this is now common across most operating systems, being able to telnet into another machine was once a novelty. Before long people realized that telnet's use of plain text was a security issue, and ssh became the dominant remote execution protocol.

I love being able to ssh into one of my machines at home from wherever I like (or wherever I can find a client to use). Having recently installed OpenBSD 6.6 on a box lying around the house, I of course enabled ssh during the installation. Everything seemed fine from the machine itself. It wasn't until I tried to log in remotely that I got this gem.

Connection Refused

Not exactly my dream of having the resources of my server available at the touch of a button. After a reboot it worked fine; after another, I got the same error message. Then, after waiting a while, I could log in again. But where to look for answers? My first step was to check the dmesg buffer from boot and see which network device all this was happening on.

rl0 at pci2 dev 2 function 0 "Realtek 8139" rev 0x10: apic 2 int 22
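The driver name is simply the device name without its unit number, so rl0 points at the rl driver. As a rough sketch of that split, here is a small parser for an attach line like the one above. The function name and the regular expression are my own illustration, and the pattern assumes the simple "name plus number, then at, then parent" shape seen in this line; other dmesg lines may not fit it.

```python
import re

# Assumed shape of a dmesg attach line: "<driver><unit> at <parent> ..."
ATTACH_RE = re.compile(r"^([a-z]+)(\d+) at (\S+)")

def parse_attach_line(line):
    """Return (driver, unit, parent) for an attach line, or None if it does not match."""
    match = ATTACH_RE.match(line.strip())
    if match is None:
        return None
    driver, unit, parent = match.groups()
    return driver, int(unit), parent

line = 'rl0 at pci2 dev 2 function 0 "Realtek 8139" rev 0x10: apic 2 int 22'
info = parse_attach_line(line)
if info is not None:
    driver, unit, parent = info
    # The manual page to read is named after the driver, without the unit number.
    print(f"device {driver}{unit} on {parent}: see the {driver} manual page")
```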

So I knew to look in the "rl" manual page for answers. A brief glance turned up this bit of knowledge.

BUGS
     Since outbound packets must be longword aligned, the transmit routine has
     to copy an unaligned packet into an mbuf cluster buffer before
     transmission.  The driver abuses the fact that the cluster buffer pool is
     allocated at system startup time in a contiguous region starting at a
     page boundary.  Since cluster buffers are 2048 bytes, they are longword
     aligned by definition.  The driver probably should not be depending on
     this characteristic.

     The Realtek data sheets are of especially poor quality: the grammar and
     spelling are awful and there is a lot of information missing,
     particularly concerning the receiver operation.  One particularly
     important fact that the data sheets fail to mention relates to the way in
     which the chip fills in the receive buffer.  When an interrupt is posted
     to signal that a frame has been received, it is possible that another
     frame might be in the process of being copied into the receive buffer
     while the driver is busy handling the first one.  If the driver manages
     to finish processing the first frame before the chip is done DMAing the
     rest of the next frame, the driver may attempt to process the next frame
     in the buffer before the chip has had a chance to finish DMAing all of
     it.

     The driver can check for an incomplete frame by inspecting the frame
     length in the header preceding the actual packet data: an incomplete
     frame will have the magic length of 0xFFF0.  When the driver encounters
     this value, it knows that it has finished processing all currently
     available packets.  Neither this magic value nor its significance are
     documented anywhere in the Realtek data sheets.

That made me focus on this passage:

One particularly important fact that the data sheets fail to mention relates to the way 
in which the chip fills in the receive buffer.  When an interrupt is posted to signal 
that a frame has been received, it is possible that another frame might be in the process of 
being copied into the receive buffer while the driver is busy handling the first one.
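To make the manual page's description concrete, here is a toy model of the check it describes, written in Python rather than driver C. It is only a sketch under my own assumptions: each frame is preceded by a 4-byte little-endian header holding a 16-bit status and a 16-bit length, and alignment, ring wraparound and the CRC are all ignored. The names drain_frames and pack_frame are hypothetical. The only detail taken from the manual page is the magic length 0xFFF0 that marks a frame the chip has not finished copying.

```python
import struct

RX_INCOMPLETE = 0xFFF0          # magic length quoted in the rl manual page
HEADER = struct.Struct("<HH")   # assumed layout: 16-bit status, 16-bit length

def pack_frame(payload, status=0x0001):
    """Build a header plus payload the way this toy model expects it."""
    return HEADER.pack(status, len(payload)) + payload

def drain_frames(buf):
    """Return every complete frame in buf, stopping at the first incomplete one."""
    frames = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        _status, length = HEADER.unpack_from(buf, offset)
        if length == RX_INCOMPLETE:
            # The chip is still copying this frame; leave it for the next interrupt.
            break
        start = offset + HEADER.size
        end = start + length
        if end > len(buf):
            # Header claims more data than is present; treat it as not ready.
            break
        frames.append(buf[start:end])
        offset = end
    return frames

# One finished frame followed by one the chip is still writing.
ring = pack_frame(b"first frame") + HEADER.pack(0x0001, RX_INCOMPLETE) + b"partial"
print(drain_frames(ring))
```

A driver that skipped the 0xFFF0 check would read the half-written frame as if it were complete, which is the race the manual page warns about.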

I swapped out the network card and voilà! No more refused connections.
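Because the failure came and went between reboots, a single successful login does not prove much. A minimal sketch of a probe that tries the ssh port repeatedly and records what happens might look like this. The function names, the default port of 22, and the timing values are my own choices for illustration, and it checks only that a TCP connection can be opened, not that a login succeeds.

```python
import socket
import time

def probe(host, port=22, timeout=3.0):
    """Try one TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"

def watch(host, attempts=10, interval=5.0, port=22):
    """Probe the host several times, printing a timestamped result for each try."""
    results = []
    for _ in range(attempts):
        result = probe(host, port)
        print(time.strftime("%H:%M:%S"), host, result)
        results.append(result)
        time.sleep(interval)
    return results
```

Calling watch("192.0.2.10") (a placeholder address) from another machine on the network, before and after a few reboots, gives a simple log of whether "refused" ever shows up again.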
