* Status of the stuck sockets bugs.
@ 2021-06-29 17:32 Dave van der Locht
  2021-07-22 17:42 ` Dave van der Locht
  0 siblings, 1 reply; 12+ messages in thread
From: Dave van der Locht @ 2021-06-29 17:32 UTC (permalink / raw)
  To: linux-hams

Hello,

Is there any news about the well-known bug where sockets get stuck in
the LISTENING state?

Some years ago Marius Petrescu YO2LOJ wrote a patch for the
ax25_subr.c file which seems to work very well and solves the issue.
But it's really annoying to have to patch the kernel again with each
update.

What happened to that patch? I've heard it was rejected several times
for some reason, but I can't find any information about that.
What can be done - or who is able - to get rid of this bug and get it
fixed in the kernel?

Kind regards,
Dave van der Locht


* Re: Status of the stuck sockets bugs.
  2021-06-29 17:32 Status of the stuck sockets bugs Dave van der Locht
@ 2021-07-22 17:42 ` Dave van der Locht
  2021-07-22 23:22   ` David Ranch
  2021-07-23  8:19   ` Ralf Baechle
  0 siblings, 2 replies; 12+ messages in thread
From: Dave van der Locht @ 2021-07-22 17:42 UTC (permalink / raw)
  To: linux-hams

Is anybody able to tell me more, or answer my questions about this issue?

Kind regards,
Dave van der Locht


* Re: Status of the stuck sockets bugs.
  2021-07-22 17:42 ` Dave van der Locht
@ 2021-07-22 23:22   ` David Ranch
  2021-07-23  8:14     ` Ralf Baechle
  2021-07-23  8:19   ` Ralf Baechle
  1 sibling, 1 reply; 12+ messages in thread
From: David Ranch @ 2021-07-22 23:22 UTC (permalink / raw)
  To: linux-hams


This issue is still present in all current Linux kernels.  I believe 
Ralf Baechle (current AX.25 kernel module maintainer) has been aware of 
this issue for some time.

--David
KI6ZHD



* Re: Status of the stuck sockets bugs.
  2021-07-22 23:22   ` David Ranch
@ 2021-07-23  8:14     ` Ralf Baechle
  2021-07-23 19:09       ` David Ranch
  0 siblings, 1 reply; 12+ messages in thread
From: Ralf Baechle @ 2021-07-23  8:14 UTC (permalink / raw)
  To: David Ranch; +Cc: linux-hams

On Thu, Jul 22, 2021 at 04:22:48PM -0700, David Ranch wrote:

> This issue is still present in all current Linux kernels.  I believe Ralf
> Baechle (current AX.25 kernel module maintainer) has been aware of this
> issue for some time.

I've never been able to reproduce the issue which made it really hard to
debug.  On my own system I've never observed the issue even once.

  Ralf


* Re: Status of the stuck sockets bugs.
  2021-07-22 17:42 ` Dave van der Locht
  2021-07-22 23:22   ` David Ranch
@ 2021-07-23  8:19   ` Ralf Baechle
       [not found]     ` <CAH4uzPMp_4bwPf0+tjviM=5aDVGLRKfz+fC_gVybujwKriF48A@mail.gmail.com>
  2021-07-30 11:47     ` Roland Schwarz
  1 sibling, 2 replies; 12+ messages in thread
From: Ralf Baechle @ 2021-07-23  8:19 UTC (permalink / raw)
  To: Dave van der Locht; +Cc: linux-hams

On Thu, Jul 22, 2021 at 07:42:18PM +0200, Dave van der Locht wrote:

> > What about that patch, I've heard it was rejected several times for
> > some reason? But can't find info regarding that.
> > What can be done - or who is able - to get rid of this bug and get it
> > fixed in the kernel?

I wasn't even aware of these patches and an internet search didn't turn
up anything.  Fortunately it turned out I have friends who happen to
know Marius and who pointed me at his "patches" really quickly.

And they were not even patches but a broken out net/ax25 directory from
a Debian kernel with random changes.  That said, while the way these
changes were hidden away and presented leaves room for improvement,
technically they appear to have benefit, so I'm now working through
them.

  Ralf


* Re: Status of the stuck sockets bugs.
       [not found]     ` <CAH4uzPMp_4bwPf0+tjviM=5aDVGLRKfz+fC_gVybujwKriF48A@mail.gmail.com>
@ 2021-07-23  8:49       ` Dave van der Locht
       [not found]       ` <CAH4uzPOuGD939KgYW5Rwn2or_xNtpo=TuAcCS6dbhrJ7GdZyQQ@mail.gmail.com>
  1 sibling, 0 replies; 12+ messages in thread
From: Dave van der Locht @ 2021-07-23  8:49 UTC (permalink / raw)
  To: linux-hams

Ralf;

Below is the change Marius made and distributed, which seems to
resolve the issue 100% without any side effects.
The issue is fairly easy to reproduce: without the patch I've never
seen AX.25 sockets getting closed after NETROM disconnects, from
kernel 4.4 or 4.9 and up.

Maybe you're willing to review it and nail this one. I've heard
this particular patch was submitted and rejected several times, and
I know many hams are waiting for a solution (or have already given up
on it).
But since you say you weren't aware of this particular patch,
I wonder whether those rumours are true.

Thanks for the clarification / info on this one Ralf!

@@ -287,5 +287,7 @@
                }
                bh_unlock_sock(ax25->sk);
                local_bh_enable();
+       } else {
+               ax25_destroy_socket(ax25);
        }
 }

Kind regards,
Dave van der Locht


* Re: Status of the stuck sockets bugs.
       [not found]       ` <CAH4uzPOuGD939KgYW5Rwn2or_xNtpo=TuAcCS6dbhrJ7GdZyQQ@mail.gmail.com>
@ 2021-07-23  8:50         ` Dave van der Locht
  0 siblings, 0 replies; 12+ messages in thread
From: Dave van der Locht @ 2021-07-23  8:50 UTC (permalink / raw)
  To: linux-hams

Excuse me, I left the filename out of the patch. It applies to /net/ax25/ax25_subr.c.

Kind regards,
Dave van der Locht


* Re: Status of the stuck sockets bugs.
  2021-07-23  8:14     ` Ralf Baechle
@ 2021-07-23 19:09       ` David Ranch
  2021-07-24  6:44         ` Dave van der Locht
  0 siblings, 1 reply; 12+ messages in thread
From: David Ranch @ 2021-07-23 19:09 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-hams


Hello Ralf,

This is reproducible on my systems:

    - Use a modern kernel say from Ubuntu 20.04 (kernel: 5.8.0-59-generic):
    - Use a multi-core CPU with SMP enabled
    - Use a KISS-based TNC using the
    - Have a remote station make an incoming connected AX.25 session to 
your station
    - Have that station then gracefully or ungracefully disconnect
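One way to see what these steps leave behind is to snapshot the AX.25 socket table before the remote station connects and again after it disconnects, then compare the two. The following is a minimal sketch, assuming the `netstat -A ax25` column layout shown in the script comments below (Dest, Source, Device, State, ...). The helper names `ax25_sockets_in_state` and `new_ax25_sockets` are hypothetical and not part of ax25-tools.

```shell
#!/bin/sh
# Sketch only: helpers for comparing `netstat -A ax25` snapshots.
# Assumed column layout: Dest Source Device State Vr/Vs Send-Q Recv-Q

# ax25_sockets_in_state STATE
# Read netstat output on stdin and print "Dest Source Device" for every
# row whose 4th column equals STATE (single-word states only).
ax25_sockets_in_state() {
    awk -v want="$1" '$4 == want { print $1, $2, $3 }'
}

# new_ax25_sockets BEFORE_FILE AFTER_FILE
# Print the lines that appear in AFTER_FILE but not in BEFORE_FILE.
new_ax25_sockets() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$1" "$2"
}
```

A possible use: run `netstat -A ax25 > before.txt`, let the remote station connect and disconnect, run `netstat -A ax25 > after.txt`, and then `new_ax25_sockets before.txt after.txt | ax25_sockets_in_state LISTENING`. Any rows printed are sockets that were not there before the session.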


Below is a script I wrote back in 2015 that I used to force-clear these 
dead sessions and also includes comments about what is seen.

--David
KI6ZHD


--
#!/bin/bash

# Clear stranded AX.25 connection
# 11/27/15

#Part of the HamPacket documentation
#dranch@trinnet.net

# 11/27/15 - Added setup agnostic clearing of AX25 sessions
# 03/22/15 - Added clearing sessions in "DISC SENT" state
# 05/07/13 - Original version

DEBUG=1

NETSTAT="/bin/netstat"

NR_CALL="ki6zhd-5"

# ----------------------------------------------------------------------------------------

function SHOW_SESSIONS {
    #Session to be cleared
    echo -e "Current AX.25 Sessions: "
    echo -e "-------"
    $NETSTAT -A ax25

    echo -e "\nCurrent Netrom Sessions: "
    echo -e "--------"
    $NETSTAT -A netrom
}

if [ $UID -ne 0 ]; then
    echo -e "\nError: you must be root.  \n\nAborting\n"
    exit 1
fi

if [ -z "$1" ]; then
    echo -e "\nUsage: you must specify a remotely connected callsign WITHOUT an SSID\n"
    SHOW_SESSIONS
    echo -e "\nAborting\n"
    exit 1
fi

RCALLSIGN=$1

# Per dranch packet folder : email subject: "bugs and kernel panics"
#
# There seems to be a bug in the Linux AX.25 stack.  As I understand, AX.25 is a Layer-2
#  protocol and Netrom is a Layer-3 protocol.  I assume that if you kill the L2 connection,
#  the L3 should come down too.  Bad assumption?  If you create a netrom connection with
#  the following assuming your system is full setup and has an interface called "netrom"
#  in the /etc/ax25/nrports file:
#
#     #netrom_call - netrom device - remote netrom call (or alias)
#     call netrom wbay
#
#  Then issue the command:
#  program - ax25 interface - remote ax25 call - local ax25 call - kill the socket
#  "axctl d710 n6zx-5 ki6zhd-5 kill", I see:
#
#         d710: fm KI6ZHD-5 to N6ZX-5 ctl DISC+
#         d710: fm N6ZX-5 to KI6ZHD-5 ctl UA-
#
#  the ax.25 session is now gone :
#         # netstat -A ax25
#         Active AX.25 sockets
#         Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q
#         *          KI6ZHD-2   ax0     LISTENING    000/000  0       0
#         *          KI6ZHD-1   ax0     LISTENING    000/000  0       0
#         *          KI6ZHD-0   ax0     LISTENING    000/000  0       0
#         *          SCLARA-0   ax0     LISTENING    000/000  0       0
#         *          KI6ZHD-5   ax0     LISTENING    000/000  0       0
#
# but the netrom connection persists:
#         netstat -A netrom
#          Active NET/ROM sockets
#          User       Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q
#          KI6ZHD-0   N6ZX-5     KI6ZHD-5   nr0     ESTABLISHED  006/003  0       0
#          *          *          KI6ZHD-5   nr0     LISTENING    000/000  0       0
#
#  If I send traffic on the netrom connection, the AX.25 stack automatically re-establishes
#  a L2 AX.25 connection.  Why?  Is this expected?
#
#      d710: fm KI6ZHD-5 to N6ZX-5 ctl SABM+
#      d710: fm N6ZX-5 to KI6ZHD-5 ctl UA-



#We need to find and clear any netrom based sessions FIRST if any
# Test to see if we are Netrom enabled

$NETSTAT -A netrom -r > /dev/null 2>&1
if [ $? -eq 0 ]; then
    #  Expand any used netrom aliases
    REMOTE_NR_RELATED_CALL="`$NETSTAT -A netrom -r | grep -i $RCALLSIGN** | awk '{print $1}'`"
    LOCAL_NR_RELATED_CALL="`$NETSTAT -A netrom | grep -i $REMOTE_NR_RELATED_CALL | awk '{print $3}'`"

    if [ -n "$LOCAL_NR_RELATED_CALL" ]; then
       #Crafty workaround to find netrom related session
       NR_RELATED_SOCKET="`cat /proc/net/nr | grep -i $REMOTE_NR_RELATED_CALL | grep -i $NR_CALL | awk '{print $19}'`"
       NR_RELATED_PROCESS_NAME="`lsof -nP | grep -i $NR_RELATED_SOCKET | awk '{print $1}'`"
       NR_RELATED_PROCESS_PID="`lsof -nP | grep -i $NR_RELATED_SOCKET | awk '{print $2}'`"
    fi

    if [ $DEBUG -eq 1 ]; then
       #echo -e "\nDEBUG: command: netstat -A netrom -r | grep -i $RCALLSIGN** | awk '{print $1}'"
       echo -e "DEBUG: Show any aliased netrom call for $RCALLSIGN** : $REMOTE_NR_RELATED_CALL"
       echo -e "DEBUG: NR_RELATED_SOCKET inode is $NR_RELATED_SOCKET"
       echo -e "DEBUG: NR_RELATED_SOCKET process name is $NR_RELATED_PROCESS_NAME"
       echo -e "DEBUG: NR_RELATED_SOCKET process PID is $NR_RELATED_PROCESS_PID"
       echo -e " "
    fi

    if [ -n "$NR_RELATED_PROCESS_PID" ]; then
       echo -e "Killing netrom related process: $NR_RELATED_PROCESS_PID"
       kill $NR_RELATED_PROCESS_PID
      else
       echo "No netrom processes found"
    fi
   else
    echo -e "\nSystem not Netrom enabled.. skipping"
fi

if [ $DEBUG -eq 1 ]; then
    echo -e "DEBUG: Show any calls for RCALLSIGN $RCALLSIGN : $REMOTE_NR_RELATED_CALL"
    echo -e " "
fi

echo -e "\nKilling AX.25 processes: "
LOCAL_AX_RELATED_CALL="`$NETSTAT -A ax25 | grep -i $RCALLSIGN  | awk '{print $2}'`"
# Quoted so the test still works when several sessions match
if [ -z "$LOCAL_AX_RELATED_CALL" ]; then
    echo -e "\nNo session associated with $RCALLSIGN found. Exiting\n"
    exit 1
fi
LOCAL_AX_INT="`grep -i $LOCAL_AX_RELATED_CALL /etc/ax25/axports | awk '{print $1}'`"
$NETSTAT -an | grep -i -e $RCALLSIGN** | grep -e "ESTABLISHED" -e "DISC SENT"
$NETSTAT -an | grep -i -e $RCALLSIGN** | grep -e "ESTABLISHED" -e "DISC SENT" -e "RECOVERY" | awk '{ system ("/usr/sbin/axctl '$LOCAL_AX_INT' "$1" "$2" kill") }'
#$NETSTAT -an | grep -i -e $RCALLSIGN** -e $REMOTE_NR_RELATED_CALL | grep -e "ESTABLISHED" -e "DISC SENT" -e "RECOVERY" | awk '{ system ("/usr/sbin/axctl d710 "$1" "$2" kill") }'

echo -e "\ndone\n"
--




* Re: Status of the stuck sockets bugs.
  2021-07-23 19:09       ` David Ranch
@ 2021-07-24  6:44         ` Dave van der Locht
  0 siblings, 0 replies; 12+ messages in thread
From: Dave van der Locht @ 2021-07-24  6:44 UTC (permalink / raw)
  To: David Ranch; +Cc: Ralf Baechle, linux-hams

Hi Ralf,

It can be reproduced very easily without setting up a TNC.
Just an AX.25 over UDP link to another fairly minimal system, plus
NETROM, is enough. Create a NETROM connection, disconnect it or let it
disconnect when idle, and see what happens to the sockets.

I'm not sure when it started occurring, as it has been there for years
already. I thought it was somewhere around 4.4.23 or 4.9.23, and
personally I haven't seen it work correctly without patches since. It
seems to go wrong 100% of the time, which makes me wonder why you
haven't seen the issue on your own system.

Bottom line: many others have been having and reporting the exact same
issue over the past years. A working fix seems to be available, without
side effects (as far as I've seen), but it hasn't been committed yet and
the issue hasn't been solved otherwise.
I hope someone is willing and able to do one of those. It would help
many packet radio enthusiasts.

Kind regards,
Dave van der Locht


* Re: Status of the stuck sockets bugs.
  2021-07-23  8:19   ` Ralf Baechle
       [not found]     ` <CAH4uzPMp_4bwPf0+tjviM=5aDVGLRKfz+fC_gVybujwKriF48A@mail.gmail.com>
@ 2021-07-30 11:47     ` Roland Schwarz
  2021-07-31 16:05       ` Roland Schwarz
  1 sibling, 1 reply; 12+ messages in thread
From: Roland Schwarz @ 2021-07-30 11:47 UTC (permalink / raw)
  To: linux-hams; +Cc: Ralf Baechle



FWIW:

I am playing with direwolf to use it as a replacement for soundmodem,
since soundmodem is currently lacking ALSA support.

I have set up two machines: one is a Raspberry Pi 4 running stock Debian
11.0 with kernel 5.10 and the other is an x86 Ubuntu 21.04 based machine.

I am using two USB soundcards with a cross-over sound connection to get
started.

After setting up /etc/ax25/axports on both machines, I start direwolf
and use kissattach -l /dev/pts/<nr> dw0 to attach the ports. Then I
adjust kissparms -c 1 -p dw0 because direwolf suggests so.

Without having set up any services (no ax25d entries yet and 
consequently no processes using the port) from one machine I try to 
axcall dw0 my-callsign-ssid to the other one. Of course this will not 
succeed since I have not set up an application on the remote peer. 
However when I forcibly close the connection (~. within axcall) on the 
other machine netstat --ax25 lists the socket in listening state, which 
should not be the case as I understand it. The socket being in that 
state will not allow me to do anything with it from the remote machine 
since I always get the error that the address is in use.
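The sequence described above can be written down as a short dry-run helper that only prints the commands, so nothing is attached or transmitted by accident. This is a sketch under stated assumptions: `repro_commands` is a hypothetical name, the flags are copied from this message rather than checked against any particular ax25-tools version, and the callsign in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch only: print (do not run) the command sequence from the message.
# repro_commands PTS PORT REMOTE
#   PTS    - pseudo-terminal number reported by direwolf
#   PORT   - port name from /etc/ax25/axports, e.g. dw0
#   REMOTE - callsign-SSID of the other machine
repro_commands() {
    pts="$1"
    port="$2"
    remote="$3"
    # On both machines: attach the KISS port and adjust its parameters.
    printf 'kissattach -l /dev/pts/%s %s\n' "$pts" "$port"
    printf 'kissparms -c 1 -p %s\n' "$port"
    # On the calling machine: connect, then force-close with ~. in axcall.
    printf 'axcall %s %s\n' "$port" "$remote"
    # On the called machine: inspect the AX.25 socket table afterwards.
    printf 'netstat --ax25\n'
}
```

For example, `repro_commands 3 dw0 N0CALL-1` prints the four commands for pseudo-terminal 3; the last one is the check that shows whether a socket was left in the listening state.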

I guess what I am observing is an incarnation of the "stuck sockets 
bug", yes?

73 de Roland OE1RSA

-- 
__________________________________________
   _  _  | Roland Schwarz
  |_)(_  |
  | \__) | mailto:roland.schwarz@blackspace.at
________| http://www.blackspace.at



* Re: Status of the stuck sockets bugs.
  2021-07-30 11:47     ` Roland Schwarz
@ 2021-07-31 16:05       ` Roland Schwarz
  2021-10-15  9:21         ` Dave van der Locht
  0 siblings, 1 reply; 12+ messages in thread
From: Roland Schwarz @ 2021-07-31 16:05 UTC (permalink / raw)
  To: linux-hams; +Cc: Ralf Baechle, d.vanderlocht



Partly answering my own question:

On 30.07.21 at 13:47, Roland Schwarz wrote:
 > I guess what I am observing is an incarnation of the "stuck sockets
 > bug", yes?

I applied YO2LOJ's changes to the current sources and verified that now
the connection is not left stuck in listening mode.

What I still do not understand is why the socket still signals connected
when doing an axcall from a remote machine although there is no peer
process connected. Is this normal behavior of the socket layer?

Attached to this mail also is a (hopefully) properly formatted 
patchfile. At least I was able to apply it to a current kernel source 
tree. I also have verified that these are the only changes that have 
been introduced by YO2LOJ with respect to stock kernel.
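A quick way to tell whether a given source tree already carries this change is to look for the call that the patch adds. The sketch below assumes that `ax25_destroy_socket(ax25);` does not occur anywhere else in net/ax25/ax25_subr.c, which has not been verified for every kernel version, so a match is only a hint. `has_destroy_call` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: check a source file for the call added by the patch.
# has_destroy_call FILE
# Exit status 0 if FILE contains the added call, non-zero otherwise.
has_destroy_call() {
    # -F: match the text literally, -q: no output, only the exit status
    grep -qF 'ax25_destroy_socket(ax25);' "$1"
}
```

Run from the top of a kernel tree, `has_destroy_call net/ax25/ax25_subr.c && echo "change appears to be present"` gives a first indication before attempting to apply the attached patch file.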

vy 73 de Roland oe1rsa

-- 
__________________________________________
   _  _  | Roland Schwarz
  |_)(_  |
  | \__) | mailto:roland.schwarz@blackspace.at
________| http://www.blackspace.at

[-- Attachment #1.1.2: 0001-apply-YO2LOJ-s-changes-for-proper-connection-cleanup.patch --]
[-- Type: text/x-patch, Size: 739 bytes --]

From aebb2b16522e50af1acf50d5d198e027aabc3513 Mon Sep 17 00:00:00 2001
From: Roland Schwarz <roland.schwarz@blackspace.at>
Date: Sat, 31 Jul 2021 17:51:34 +0200
Subject: [PATCH] apply YO2LOJ's changes for proper connection cleanup

---
 net/ax25/ax25_subr.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ax25/ax25_subr.c b/net/ax25/ax25_subr.c
index 15ab812c4fe4..7ee0b56513e7 100644
--- a/net/ax25/ax25_subr.c
+++ b/net/ax25/ax25_subr.c
@@ -285,4 +285,9 @@ void ax25_disconnect(ax25_cb *ax25, int reason)
 		bh_unlock_sock(ax25->sk);
 		local_bh_enable();
 	}
+	else
+	{
+		// YO2LOJ: this is needed for proper NETROM connection cleanup on timeout
+		ax25_destroy_socket(ax25);
+	}
 }
-- 
2.30.2



* Re: Status of the stuck sockets bugs.
  2021-07-31 16:05       ` Roland Schwarz
@ 2021-10-15  9:21         ` Dave van der Locht
  0 siblings, 0 replies; 12+ messages in thread
From: Dave van der Locht @ 2021-10-15  9:21 UTC (permalink / raw)
  To: Roland Schwarz; +Cc: linux-hams, Ralf Baechle

Hi,

Any status update on the 'stuck sockets bug'?

73! Dave

