All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
@ 2021-06-18 13:39 Prabhakar Kushwaha
  2021-06-18 13:42 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Prabhakar Kushwaha @ 2021-06-18 13:39 UTC (permalink / raw)
  To: linux-nvme, sagi, hch, kbusch, axboe, martin.belanger
  Cc: aelior, okulkarni, pkushwaha, smalin, prabhakar.pkin, malin1024

dev_get_by_name() finds network device by name but it also increases
reference count.
Increasing the ref count,
If nvme-tcp queue is present and the network device driver is removed
before nvme_tcp, we will face the following continuous log:
  "kernel:unregister_netdevice: waiting for <eth> to become
  free. Usage count = 2"
And rmmod further halts. Similar case arises during reboot/shutdown
with nvme-tcp queue present and both never completes.

As a fix we will use __dev_get_by_name() which find network device by
name without increasing any reference counter.

Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
---
 drivers/nvme/host/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c7bd37103cf4..f9b527e71c13 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2533,7 +2533,7 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 	}
 
 	if (opts->mask & NVMF_OPT_HOST_IFACE) {
-		ctrl->ndev = dev_get_by_name(&init_net, opts->host_iface);
+		ctrl->ndev = __dev_get_by_name(&init_net, opts->host_iface);
 		if (!ctrl->ndev) {
 			pr_err("invalid interface passed: %s\n",
 			       opts->host_iface);
-- 
2.30.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-18 13:39 [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE Prabhakar Kushwaha
@ 2021-06-18 13:42 ` Christoph Hellwig
  2021-06-18 19:14   ` Sagi Grimberg
  2021-06-23 21:34 ` Sagi Grimberg
  2021-06-24  6:35 ` Christoph Hellwig
  2 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2021-06-18 13:42 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: linux-nvme, sagi, hch, kbusch, axboe, martin.belanger, aelior,
	okulkarni, smalin, prabhakar.pkin, malin1024

On Fri, Jun 18, 2021 at 04:39:56PM +0300, Prabhakar Kushwaha wrote:
> dev_get_by_name() finds network device by name but it also increases
> reference count.
> Increasing the ref count,
> If nvme-tcp queue is present and the network device driver is removed
> before nvme_tcp, we will face the following continuous log:
>   "kernel:unregister_netdevice: waiting for <eth> to become
>   free. Usage count = 2"
> And rmmod further halts. Similar case arises during reboot/shutdown
> with nvme-tcp queue present and both never completes.
> 
> As a fix we will use __dev_get_by_name() which find network device by
> name without increasing any reference counter.

And when you remove it without the refcount we'll now have a stale
kernel pointer?

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-18 13:42 ` Christoph Hellwig
@ 2021-06-18 19:14   ` Sagi Grimberg
  2021-06-21  1:43     ` Prabhakar Kushwaha
  0 siblings, 1 reply; 9+ messages in thread
From: Sagi Grimberg @ 2021-06-18 19:14 UTC (permalink / raw)
  To: Christoph Hellwig, Prabhakar Kushwaha
  Cc: linux-nvme, kbusch, axboe, martin.belanger, aelior, okulkarni,
	smalin, prabhakar.pkin, malin1024


>> dev_get_by_name() finds network device by name but it also increases
>> reference count.
>> Increasing the ref count,
>> If nvme-tcp queue is present and the network device driver is removed
>> before nvme_tcp, we will face the following continuous log:
>>    "kernel:unregister_netdevice: waiting for <eth> to become
>>    free. Usage count = 2"
>> And rmmod further halts. Similar case arises during reboot/shutdown
>> with nvme-tcp queue present and both never completes.
>>
>> As a fix we will use __dev_get_by_name() which find network device by
>> name without increasing any reference counter.
> 
> And when you remove it without the refcount we'll now have a stale
> kernel pointer?

The netdev is not actually needed, its to make sure that it exist as
reference in the host_iface afaict (the netdev is never referenced
afterwards), and I'm pretty sure there was a matching put somewhere in
the former versions of this patch.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-18 19:14   ` Sagi Grimberg
@ 2021-06-21  1:43     ` Prabhakar Kushwaha
  2021-06-23 17:19       ` Belanger, Martin
  0 siblings, 1 reply; 9+ messages in thread
From: Prabhakar Kushwaha @ 2021-06-21  1:43 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, Prabhakar Kushwaha, linux-nvme, kbusch, axboe,
	martin.belanger, aelior, Omkar Kulkarni, Shai Malin, malin1024

On Sat, Jun 19, 2021 at 12:44 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> >> dev_get_by_name() finds network device by name but it also increases
> >> reference count.
> >> Increasing the ref count,
> >> If nvme-tcp queue is present and the network device driver is removed
> >> before nvme_tcp, we will face the following continuous log:
> >>    "kernel:unregister_netdevice: waiting for <eth> to become
> >>    free. Usage count = 2"
> >> And rmmod further halts. Similar case arises during reboot/shutdown
> >> with nvme-tcp queue present and both never completes.
> >>
> >> As a fix we will use __dev_get_by_name() which find network device by
> >> name without increasing any reference counter.
> >
> > And when you remove it without the refcount we'll now have a stale
> > kernel pointer?
>
> The netdev is not actually needed, its to make sure that it exist as
> reference in the host_iface afaict (the netdev is never referenced
> afterwards), and I'm pretty sure there was a matching put somewhere in
> the former versions of this patch.

Patch 3ede8f72a9a2 (“nvme-tcp: allow selecting the network interface for
connections") exposed to this problem in the following cases:
A)  create nvme-tcp queue, disconnect queue and remove network driver
B)  create nvme-tcp queue and remove network driver

“A” can be solved by putting dev_put() before kfree(ctrl):
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c7bd37103cf4..93639b5b795f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2155,6 +2155,8 @@ static void nvme_tcp_free_ctrl(struct nvme_ctrl *nctrl)
        nvmf_free_options(nctrl->opts);
free_ctrl:
        kfree(ctrl->queues);
+       if (ctrl->ndev)
+               dev_put(ctrl->ndev);
        kfree(ctrl);
}

@@ -2586,6 +2588,8 @@ static struct nvme_ctrl
*nvme_tcp_create_ctrl(struct device *dev,
out_kfree_queues:
        kfree(ctrl->queues);
out_free_ctrl:
+       if (ctrl->ndev)
+               dev_put(ctrl->ndev);
        kfree(ctrl);
        return ERR_PTR(ret);
}

But even using dev_put() will not solve “B” in case of the removal of
the network
driver before nvme-tcp.

This was our initial motivation for using  __dev_get_by_name().
It will solve both “A” and “B” and it will behave the same with or without the
usage of the OPT_HOST_IFACE option.
i.e. If we remove the network driver while controllers are connected it will
end with nvme timeouts.

--pk

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-21  1:43     ` Prabhakar Kushwaha
@ 2021-06-23 17:19       ` Belanger, Martin
  0 siblings, 0 replies; 9+ messages in thread
From: Belanger, Martin @ 2021-06-23 17:19 UTC (permalink / raw)
  To: Prabhakar Kushwaha, Sagi Grimberg
  Cc: Christoph Hellwig, Prabhakar Kushwaha, linux-nvme, kbusch, axboe,
	aelior, Omkar Kulkarni, Shai Malin, malin1024

> On Sat, Jun 19, 2021 at 12:44 AM Sagi Grimberg <sagi@grimberg.me> wrote:
> >
> >
> > >> dev_get_by_name() finds network device by name but it also
> > >> increases reference count.
> > >> Increasing the ref count,
> > >> If nvme-tcp queue is present and the network device driver is
> > >> removed before nvme_tcp, we will face the following continuous log:
> > >>    "kernel:unregister_netdevice: waiting for <eth> to become
> > >>    free. Usage count = 2"
> > >> And rmmod further halts. Similar case arises during reboot/shutdown
> > >> with nvme-tcp queue present and both never completes.
> > >>
> > >> As a fix we will use __dev_get_by_name() which find network device
> > >> by name without increasing any reference counter.
> > >
> > > And when you remove it without the refcount we'll now have a stale
> > > kernel pointer?
> >
> > The netdev is not actually needed, its to make sure that it exist as
> > reference in the host_iface afaict (the netdev is never referenced
> > afterwards), and I'm pretty sure there was a matching put somewhere in
> > the former versions of this patch.
> 
> Patch 3ede8f72a9a2 (“nvme-tcp: allow selecting the network interface for
> connections") exposed to this problem in the following cases:
> A)  create nvme-tcp queue, disconnect queue and remove network driver
> B)  create nvme-tcp queue and remove network driver
> 
> “A” can be solved by putting dev_put() before kfree(ctrl):
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index
> c7bd37103cf4..93639b5b795f 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2155,6 +2155,8 @@ static void nvme_tcp_free_ctrl(struct nvme_ctrl
> *nctrl)
>         nvmf_free_options(nctrl->opts);
> free_ctrl:
>         kfree(ctrl->queues);
> +       if (ctrl->ndev)
> +               dev_put(ctrl->ndev);
>         kfree(ctrl);
> }
> 
> @@ -2586,6 +2588,8 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct
> device *dev,
> out_kfree_queues:
>         kfree(ctrl->queues);
> out_free_ctrl:
> +       if (ctrl->ndev)
> +               dev_put(ctrl->ndev);
>         kfree(ctrl);
>         return ERR_PTR(ret);
> }
> 
> But even using dev_put() will not solve “B” in case of the removal of the
> network driver before nvme-tcp.
> 
> This was our initial motivation for using  __dev_get_by_name().
> It will solve both “A” and “B” and it will behave the same with or without the
> usage of the OPT_HOST_IFACE option.
> i.e. If we remove the network driver while controllers are connected it will end
> with nvme timeouts.
> 
> --pk

Sagi is right. The function dev_get_by_name() was used to verify that the interface name exists. We don't need to increase the ref count here and using __dev_get_by_name() is the right thing to do.
Martin
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-18 13:39 [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE Prabhakar Kushwaha
  2021-06-18 13:42 ` Christoph Hellwig
@ 2021-06-23 21:34 ` Sagi Grimberg
  2021-06-24  6:35 ` Christoph Hellwig
  2 siblings, 0 replies; 9+ messages in thread
From: Sagi Grimberg @ 2021-06-23 21:34 UTC (permalink / raw)
  To: Prabhakar Kushwaha, linux-nvme, hch, kbusch, axboe, martin.belanger
  Cc: aelior, okulkarni, smalin, prabhakar.pkin, malin1024

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-18 13:39 [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE Prabhakar Kushwaha
  2021-06-18 13:42 ` Christoph Hellwig
  2021-06-23 21:34 ` Sagi Grimberg
@ 2021-06-24  6:35 ` Christoph Hellwig
  2021-07-11 13:17   ` Prabhakar Kushwaha
  2 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2021-06-24  6:35 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: linux-nvme, sagi, hch, kbusch, axboe, martin.belanger, aelior,
	okulkarni, smalin, prabhakar.pkin, malin1024

I've applied the slightly updated version below to make it a little
more clear what is going on:

---
From 9de7d173c10b6be09fe9d5b7010ef182465897a1 Mon Sep 17 00:00:00 2001
From: Prabhakar Kushwaha <pkushwaha@marvell.com>
Date: Fri, 18 Jun 2021 16:39:56 +0300
Subject: nvme-tcp: use __dev_get_by_name instead dev_get_by_name for
 OPT_HOST_IFACE

dev_get_by_name() finds network device by name but it also increases the
reference count.

If a nvme-tcp queue is present and the network device driver is removed
before nvme_tcp, we will face the following continuous log:

  "kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2"

And rmmod further halts. Similar case arises during reboot/shutdown
with nvme-tcp queue present and both never completes.

To fix this, use __dev_get_by_name() which finds network device by
name without increasing any reference counter.

Fixes: 3ede8f72a9a2 ("nvme-tcp: allow selecting the network interface for connections")
Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: remove the ->ndev member entirely]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/tcp.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c7bd37103cf4..2b46c6e0226f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -123,7 +123,6 @@ struct nvme_tcp_ctrl {
 	struct blk_mq_tag_set	admin_tag_set;
 	struct sockaddr_storage addr;
 	struct sockaddr_storage src_addr;
-	struct net_device	*ndev;
 	struct nvme_ctrl	ctrl;
 
 	struct work_struct	err_work;
@@ -2533,8 +2532,7 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 	}
 
 	if (opts->mask & NVMF_OPT_HOST_IFACE) {
-		ctrl->ndev = dev_get_by_name(&init_net, opts->host_iface);
-		if (!ctrl->ndev) {
+		if (!__dev_get_by_name(&init_net, opts->host_iface)) {
 			pr_err("invalid interface passed: %s\n",
 			       opts->host_iface);
 			ret = -ENODEV;
-- 
2.30.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-06-24  6:35 ` Christoph Hellwig
@ 2021-07-11 13:17   ` Prabhakar Kushwaha
  2021-07-12  5:30     ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Prabhakar Kushwaha @ 2021-07-11 13:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Prabhakar Kushwaha, linux-nvme, Sagi Grimberg, kbusch, axboe,
	martin.belanger, aelior, Omkar Kulkarni, Shai Malin, malin1024

Hi Christoph,

On Thu, Jun 24, 2021 at 12:05 PM Christoph Hellwig <hch@lst.de> wrote:
>
> I've applied the slightly updated version below to make it a little
> more clear what is going on:
>
> ---
> From 9de7d173c10b6be09fe9d5b7010ef182465897a1 Mon Sep 17 00:00:00 2001
> From: Prabhakar Kushwaha <pkushwaha@marvell.com>
> Date: Fri, 18 Jun 2021 16:39:56 +0300
> Subject: nvme-tcp: use __dev_get_by_name instead dev_get_by_name for
>  OPT_HOST_IFACE
>
> dev_get_by_name() finds network device by name but it also increases the
> reference count.
>
> If a nvme-tcp queue is present and the network device driver is removed
> before nvme_tcp, we will face the following continuous log:
>
>   "kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2"
>
> And rmmod further halts. Similar case arises during reboot/shutdown
> with nvme-tcp queue present and both never completes.
>
> To fix this, use __dev_get_by_name() which finds network device by
> name without increasing any reference counter.
>
> Fixes: 3ede8f72a9a2 ("nvme-tcp: allow selecting the network interface for connections")
> Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
> Signed-off-by: Shai Malin <smalin@marvell.com>
> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> [hch: remove the ->ndev member entirely]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

I am not finding this patch on updated "branch nvme-5.14" having the
latest commit 0755d3be2d9b("nvme-tcp: can't set sk_user_data without
write_lock").
Am I missing something?  Please note, earlier it was there with commit
ff5af4bfb6af.

Issue can still be found on the nvme-5.14 branch (top of the tree).
Are you planning to fix this problem using any other approach?

--pk

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
  2021-07-11 13:17   ` Prabhakar Kushwaha
@ 2021-07-12  5:30     ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2021-07-12  5:30 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: Christoph Hellwig, Prabhakar Kushwaha, linux-nvme, Sagi Grimberg,
	kbusch, axboe, martin.belanger, aelior, Omkar Kulkarni,
	Shai Malin, malin1024

On Sun, Jul 11, 2021 at 06:47:37PM +0530, Prabhakar Kushwaha wrote:
> I am not finding this patch on updated "branch nvme-5.14" having the
> latest commit 0755d3be2d9b("nvme-tcp: can't set sk_user_data without
> write_lock").
> Am I missing something?  Please note, earlier it was there with commit
> ff5af4bfb6af.
> 
> Issue can still be found on the nvme-5.14 branch (top of the tree).
> Are you planning to fix this problem using any other approach?

Sorry, it looks like it got lost in a rebase.  I'll pick it up for
nvme-5.14 again.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-07-12  5:30 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-18 13:39 [PATCH] nvme-tcp: Use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE Prabhakar Kushwaha
2021-06-18 13:42 ` Christoph Hellwig
2021-06-18 19:14   ` Sagi Grimberg
2021-06-21  1:43     ` Prabhakar Kushwaha
2021-06-23 17:19       ` Belanger, Martin
2021-06-23 21:34 ` Sagi Grimberg
2021-06-24  6:35 ` Christoph Hellwig
2021-07-11 13:17   ` Prabhakar Kushwaha
2021-07-12  5:30     ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.