All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] ax25: Fix (more) netdev refcount issues
@ 2024-04-26 21:29 Lars Kellogg-Stedman
  2024-04-27  8:48 ` Dan Carpenter
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Kellogg-Stedman @ 2024-04-26 21:29 UTC (permalink / raw)
  To: linux-hams

Folks,

I'm posting the following patch here before sending it to netdev in the
hopes that someone can take a look at the change and comment on the
correctness. This resolves all of the issues I've been experiencing
recently with ax.25.

...

When closing a socket, the ax.25 code releases references via
netdev_put() and ax25_dev_put(). In the case when the socket was the
result of an incoming connection, these references were never allocated in
the first place, causing underflows in both ax25_dev->refcount and
ax25_dev->dev->refcnt_tracker->untracked. This would result in a variety of
errors:

- After an initial connection and then again after several subsequent
  connections:

      refcount_t: decrement hit 0; leaking memory.

- After several subsequent connections:

      refcount_t: underflow; use-after-free.

A typical call trace for the above two issues would look like:

    Call Trace:
    <TASK>
    ? show_regs+0x64/0x70
    ? __warn+0x83/0x120
    ? refcount_warn_saturate+0xb2/0x100
    ? report_bug+0x158/0x190
    ? prb_read_valid+0x20/0x30
    ? handle_bug+0x3e/0x70
    ? exc_invalid_op+0x1c/0x70
    ? asm_exc_invalid_op+0x1f/0x30
    ? refcount_warn_saturate+0xb2/0x100
    ? refcount_warn_saturate+0xb2/0x100
    ax25_release+0x2ad/0x360
    __sock_release+0x35/0xa0
    sock_close+0x19/0x20
    [...]

On reboot, the kernel would get stuck in an infinite loop:

    unregister_netdevice: waiting for ax1 to become free. Usage count = 0

The attached patch corrects all three of the above problems by ensuring
that we call netdev_hold() and ax25_dev_hold() for incoming connections.

Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
---
 net/ax25/ax25_in.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
index 1cac25aca63..35a55ad05f2 100644
--- a/net/ax25/ax25_in.c
+++ b/net/ax25/ax25_in.c
@@ -411,6 +411,8 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
 	ax25->state = AX25_STATE_3;
 
 	ax25_cb_add(ax25);
+	netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
+	ax25_dev_hold(ax25_dev);
 
 	ax25_start_heartbeat(ax25);
 	ax25_start_t3timer(ax25);
-- 
2.44.0

-- 
Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/                | N1LKS

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC] ax25: Fix (more) netdev refcount issues
  2024-04-26 21:29 [PATCH RFC] ax25: Fix (more) netdev refcount issues Lars Kellogg-Stedman
@ 2024-04-27  8:48 ` Dan Carpenter
  2024-04-27 17:15   ` Lars Kellogg-Stedman
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Carpenter @ 2024-04-27  8:48 UTC (permalink / raw)
  To: Lars Kellogg-Stedman, Duoming Zhou; +Cc: linux-hams

The commit message needs a Fixes tag.

Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")

Let me add Duoming Zhou to the CC list.  That commit is two years old
now.  This sort of bug should have been caught by basic testing, right?
Could you confirm that that's actually the commit which breaks it?

regards,
dan carpenter

On Fri, Apr 26, 2024 at 05:29:40PM -0400, Lars Kellogg-Stedman wrote:
> Folks,
> 
> I'm posting the following patch here before sending it to netdev in the
> hopes that someone can take a look at the change and comment on the
> correctness. This resolves all of the issues I've been experiencing
> recently with ax.25.
> 
> ...
> 
> When closing a socket, the ax.25 code releases references via
> netdev_put() and ax25_dev_put(). In the case when the socket was the
> result of an incoming connection, these references were never allocated in
> the first place, causing underflows in both ax25_dev->refcount and
> ax25_dev->dev->refcnt_tracker->untracked. This would result in a variety of
> errors:
> 
> - After an initial connection and then again after several subsequent
>   connections:
> 
>       refcount_t: decrement hit 0; leaking memory.
> 
> - After several subsequent connections:
> 
>       refcount_t: underflow; use-after-free.
> 
> A typical call trace for the above two issues would look like:
> 
>     Call Trace:
>     <TASK>
>     ? show_regs+0x64/0x70
>     ? __warn+0x83/0x120
>     ? refcount_warn_saturate+0xb2/0x100
>     ? report_bug+0x158/0x190
>     ? prb_read_valid+0x20/0x30
>     ? handle_bug+0x3e/0x70
>     ? exc_invalid_op+0x1c/0x70
>     ? asm_exc_invalid_op+0x1f/0x30
>     ? refcount_warn_saturate+0xb2/0x100
>     ? refcount_warn_saturate+0xb2/0x100
>     ax25_release+0x2ad/0x360
>     __sock_release+0x35/0xa0
>     sock_close+0x19/0x20
>     [...]
> 
> On reboot, the kernel would get stuck in an infinite loop:
> 
>     unregister_netdevice: waiting for ax1 to become free. Usage count = 0
> 
> The attached patch corrects all three of the above problems by ensuring
> that we call netdev_hold() and ax25_dev_hold() for incoming connections.
> 
> Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
> ---
>  net/ax25/ax25_in.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
> index 1cac25aca63..35a55ad05f2 100644
> --- a/net/ax25/ax25_in.c
> +++ b/net/ax25/ax25_in.c
> @@ -411,6 +411,8 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
>  	ax25->state = AX25_STATE_3;
>  
>  	ax25_cb_add(ax25);
> +	netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
> +	ax25_dev_hold(ax25_dev);
>  
>  	ax25_start_heartbeat(ax25);
>  	ax25_start_t3timer(ax25);
> -- 
> 2.44.0
> 
> -- 
> Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
> http://blog.oddbit.com/                | N1LKS

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC] ax25: Fix (more) netdev refcount issues
  2024-04-27  8:48 ` Dan Carpenter
@ 2024-04-27 17:15   ` Lars Kellogg-Stedman
  2024-04-28 11:01     ` Dan Carpenter
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Kellogg-Stedman @ 2024-04-27 17:15 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Duoming Zhou, linux-hams

On Sat, Apr 27, 2024 at 11:48:48AM GMT, Dan Carpenter wrote:
> The commit message needs a Fixes tag.
> 
> Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")

This wasn't meant explicitly as a fix for that commit. Is the Fixes: tag
still appropriate?

> Let me add Duoming Zhou to the CC list.  That commit is two years old
> now.  This sort of bug should have been caught by basic testing, right?

You'd think, right? The errors were trivial to reproduce (and can be
tested by anyone; you don't need a radio to exercise the ax.25 stack).

> Could you confirm that that's actually the commit which breaks it?

I'll see what I can do. There are already several patches on top of
d01ffb9eee4a (I count 19), so figuring that out might be tricky.

Thanks for taking a look!

> regards,
> dan carpenter
> 
> On Fri, Apr 26, 2024 at 05:29:40PM -0400, Lars Kellogg-Stedman wrote:
> > Folks,
> > 
> > I'm posting the following patch here before sending it to netdev in the
> > hopes that someone can take a look at the change and comment on the
> > correctness. This resolves all of the issues I've been experiencing
> > recently with ax.25.
> > 
> > ...
> > 
> > When closing a socket, the ax.25 code releases references via
> > netdev_put() and ax25_dev_put(). In the case when the socket was the
> > result of an incoming connection, these references were never allocated in
> > the first place, causing underflows in both ax25_dev->refcount and
> > ax25_dev->dev->refcnt_tracker->untracked. This would result in a variety of
> > errors:
> > 
> > - After an initial connection and then again after several subsequent
> >   connections:
> > 
> >       refcount_t: decrement hit 0; leaking memory.
> > 
> > - After several subsequent connections:
> > 
> >       refcount_t: underflow; use-after-free.
> > 
> > A typical call trace for the above two issues would look like:
> > 
> >     Call Trace:
> >     <TASK>
> >     ? show_regs+0x64/0x70
> >     ? __warn+0x83/0x120
> >     ? refcount_warn_saturate+0xb2/0x100
> >     ? report_bug+0x158/0x190
> >     ? prb_read_valid+0x20/0x30
> >     ? handle_bug+0x3e/0x70
> >     ? exc_invalid_op+0x1c/0x70
> >     ? asm_exc_invalid_op+0x1f/0x30
> >     ? refcount_warn_saturate+0xb2/0x100
> >     ? refcount_warn_saturate+0xb2/0x100
> >     ax25_release+0x2ad/0x360
> >     __sock_release+0x35/0xa0
> >     sock_close+0x19/0x20
> >     [...]
> > 
> > On reboot, the kernel would get stuck in an infinite loop:
> > 
> >     unregister_netdevice: waiting for ax1 to become free. Usage count = 0
> > 
> > The attached patch corrects all three of the above problems by ensuring
> > that we call netdev_hold() and ax25_dev_hold() for incoming connections.
> > 
> > Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
> > ---
> >  net/ax25/ax25_in.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
> > index 1cac25aca63..35a55ad05f2 100644
> > --- a/net/ax25/ax25_in.c
> > +++ b/net/ax25/ax25_in.c
> > @@ -411,6 +411,8 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
> >  	ax25->state = AX25_STATE_3;
> >  
> >  	ax25_cb_add(ax25);
> > +	netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
> > +	ax25_dev_hold(ax25_dev);
> >  
> >  	ax25_start_heartbeat(ax25);
> >  	ax25_start_t3timer(ax25);
> > -- 
> > 2.44.0
> > 
> > -- 
> > Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
> > http://blog.oddbit.com/                | N1LKS

-- 
Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/                | N1LKS

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC] ax25: Fix (more) netdev refcount issues
  2024-04-27 17:15   ` Lars Kellogg-Stedman
@ 2024-04-28 11:01     ` Dan Carpenter
  2024-04-29 18:06       ` Lars Kellogg-Stedman
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Carpenter @ 2024-04-28 11:01 UTC (permalink / raw)
  To: Lars Kellogg-Stedman; +Cc: Duoming Zhou, linux-hams

On Sat, Apr 27, 2024 at 01:15:38PM -0400, Lars Kellogg-Stedman wrote:
> On Sat, Apr 27, 2024 at 11:48:48AM GMT, Dan Carpenter wrote:
> > The commit message needs a Fixes tag.
> > 
> > Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
> 
> This wasn't meant explicitly as a fix for that commit. Is the Fixes: tag
> still appropriate?
> 

We would want to have a Fixes tag to say how the bug was introduced.
That's the commit which introduced the underflow I think.

> > Let me add Duoming Zhou to the CC list.  That commit is two years old
> > now.  This sort of bug should have been caught by basic testing, right?
> 
> You'd think, right? The errors were trivial to reproduce (and can be
> tested by anyone; you don't need a radio to exercise the ax.25 stack).
> 
> > Could you confirm that that's actually the commit which breaks it?
> 
> I'll see what I can do. There are already several patches on top of
> d01ffb9eee4a (I count 19), so figuring that out might be tricky.

Instead of reverting just the commit directly before that commit.

git checkout 4e0f718daf97d47cf7dec122da1be970f145c809

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC] ax25: Fix (more) netdev refcount issues
  2024-04-28 11:01     ` Dan Carpenter
@ 2024-04-29 18:06       ` Lars Kellogg-Stedman
  2024-04-30 10:08         ` Dan Carpenter
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Kellogg-Stedman @ 2024-04-29 18:06 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Duoming Zhou, linux-hams

On Sun, Apr 28, 2024 at 02:01:54PM +0300, Dan Carpenter wrote:
> We would want to have a Fixes tag to say how the bug was introduced.
> That's the commit which introduced the underflow I think.

It doesn't look like d01ffb9eee4a introduced either of the problems.

I ran a couple of git bisects between d01ffb9eee4a and c942a0cd36
(that's the HEAD of my local repository, 6.9.0-rc5); the traces
involving ax25_release first show up in 9fd75b66b8f ("ax25: Fix refcount
leaks caused by ax25_cb_del()"). The "waiting for ax* to become free"
problem first shows up in feef318c855 ("ax25: fix UAF bugs of net_device
caused by rebinding operation").  Since feef318c855 is the older commit,
I guess we pick that one as the target for the Fixes: tag.

-- 
Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/                | N1LKS

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC] ax25: Fix (more) netdev refcount issues
  2024-04-29 18:06       ` Lars Kellogg-Stedman
@ 2024-04-30 10:08         ` Dan Carpenter
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Carpenter @ 2024-04-30 10:08 UTC (permalink / raw)
  To: Lars Kellogg-Stedman; +Cc: Duoming Zhou, linux-hams

On Mon, Apr 29, 2024 at 02:06:30PM -0400, Lars Kellogg-Stedman wrote:
> On Sun, Apr 28, 2024 at 02:01:54PM +0300, Dan Carpenter wrote:
> > We would want to have a Fixes tag to say how the bug was introduced.
> > That's the commit which introduced the underflow I think.
> 
> It doesn't look like d01ffb9eee4a introduced either of the problems.
> 

I really wanted this to be introduced by this patch because then the
issue would have been more simple.

> I ran a couple of git bisects between d01ffb9eee4a and c942a0cd36
> (that's the HEAD of my local repository, 6.9.0-rc5); the traces
> involving ax25_release first show up in 9fd75b66b8f ("ax25: Fix refcount
> leaks caused by ax25_cb_del()"). The "waiting for ax* to become free"
> problem first shows up in feef318c855 ("ax25: fix UAF bugs of net_device
> caused by rebinding operation").  Since feef318c855 is the older commit,
> I guess we pick that one as the target for the Fixes: tag.

Meanwhile commit feef318c855a ("ax25: fix UAF bugs of net_device caused
by rebinding operation") is a much more complicated patch...

I don't think your patch is correct.  It's fixing the use after free but
it introduces leaks instead.

I don't think it makes sense to take a reference in the recv() path...
How would we balance that with a _put()?  It has to be something like
increment in the open() and decrement in close().  In your patch, you
say that it's dropping the reference in ax25_release() but that pairs
with ax25_bind() and that takes the reference when it calls
ax25_addr_ax25dev().  I don't see a problem there.

1)  The reference count starts as 2 in ax25_dev_device_up().

	refcount_set(&ax25_dev->refcount, 1);
	ax25_dev_hold(ax25_dev);

Then in ax25_dev_device_down() it drops the reference once or twice
depending on if we goto unlock_put or not.  What is the logic there?
Seems suspicious.

2) The ax25_addr_ax25dev() has a small bug.  It potentially increments
   more than one reference.  That's unrelated to the underflow.  Also
   The function should be renamed to show that it increments the ref
   count.

diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c
index 282ec581c072..13fe57deafef 100644
--- a/net/ax25/ax25_dev.c
+++ b/net/ax25/ax25_dev.c
@@ -34,11 +34,12 @@ ax25_dev *ax25_addr_ax25dev(ax25_address *addr)
 	ax25_dev *ax25_dev, *res = NULL;
 
 	spin_lock_bh(&ax25_dev_lock);
-	for (ax25_dev = ax25_dev_list; ax25_dev != NULL; ax25_dev = ax25_dev->next)
-		if (ax25cmp(addr, (const ax25_address *)ax25_dev->dev->dev_addr) == 0) {
+	for (ax25_dev = ax25_dev_list; ax25_dev != NULL; ax25_dev = ax25_dev->next) {
+		if (ax25cmp(addr, (const ax25_address *)ax25_dev->dev->dev_addr) == 0)
 			res = ax25_dev;
-			ax25_dev_hold(ax25_dev);
-		}
+	}
+	if (res)
+		ax25_dev_hold(res);
 	spin_unlock_bh(&ax25_dev_lock);
 
 	return res;

3) The ax25_dev_free() function doesn't do have reference counting.
   Suspicous.

The other increment/decrement is bind/release which seems okay to me.

regards,
dan carpenter



^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-04-30 10:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-26 21:29 [PATCH RFC] ax25: Fix (more) netdev refcount issues Lars Kellogg-Stedman
2024-04-27  8:48 ` Dan Carpenter
2024-04-27 17:15   ` Lars Kellogg-Stedman
2024-04-28 11:01     ` Dan Carpenter
2024-04-29 18:06       ` Lars Kellogg-Stedman
2024-04-30 10:08         ` Dan Carpenter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.