linux-kernel.vger.kernel.org archive mirror
* [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer
@ 2018-03-09 23:29 Doug Oucharek
  2018-03-10  0:34 ` Doug Oucharek
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Oucharek @ 2018-03-09 23:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Drokin, Oleg, Dilger, Andreas,
	James Simmons, alexander.boyko
  Cc: Linux Kernel Mailing List, Lustre Development List, Doug Oucharek

The cmid is destroyed by OFED if kiblnd_cm_callback returns an error.
If that error happens before kiblnd_connect_peer finishes, the function
touches the destroyed cmid and fails with:
(o2iblnd_cb.c:1315:kiblnd_connect_peer())
           ASSERTION( cmid->device != ((void *)0) ) failed:

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10015
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Doug Oucharek <dougso@me.com>
---
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
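
The race described in the commit message can be sketched in user space
as follows. This is an illustration under stated assumptions, not the
kernel code: every name in it (fake_cm_id, fake_cm_callback,
fake_resolve_addr, connect_peer_racy, connect_peer_safe, fake_dev0) is
hypothetical, the callback runs synchronously so the bad interleaving is
deterministic, and "destroying" the id only sets a flag instead of
freeing memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rdma_cm_id; not the kernel type. */
struct fake_cm_id {
	const char *device;	/* NULL until bound, NULL again once destroyed */
	int destroyed;		/* set when the simulated CM layer tears the id down */
};

static int fake_route_rc;	/* result the simulated route resolution returns */
static int bound_logs;		/* number of "connection bound" messages emitted */

/* Simulated event callback: log only here, where the id is known to be alive. */
static int fake_cm_callback(struct fake_cm_id *id)
{
	if (fake_route_rc == 0) {
		printf("connection bound to %s\n", id->device);
		bound_logs++;
		return 0;
	}
	return fake_route_rc;	/* non-zero asks the CM layer to destroy the id */
}

/* Simulated CM layer: binds the id, runs the callback, destroys the id on error. */
static int fake_resolve_addr(struct fake_cm_id *id)
{
	id->device = "fake_dev0";
	if (fake_cm_callback(id) != 0) {
		id->device = NULL;
		id->destroyed = 1;
	}
	return 0;	/* the request itself was accepted */
}

/* Old shape: inspects the id after the callback may already have run. */
static int connect_peer_racy(struct fake_cm_id *id)
{
	fake_resolve_addr(id);
	return id->device != NULL;	/* 0 is the case where LASSERT would fire */
}

/* New shape: never touches the id once resolution has been started. */
static void connect_peer_safe(struct fake_cm_id *id)
{
	fake_resolve_addr(id);
}
```

With fake_route_rc set to a non-zero value, connect_peer_racy() returns
0, which corresponds to the failed cmid->device assertion.
connect_peer_safe() has nothing left to check, and the "connection
bound" message comes only from the callback's success path, which
mirrors where the patch moves the CDEBUG (next to rdma_resolve_route()).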

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 6690a6c..080c2a1 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1290,11 +1290,6 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		goto failed2;
 	}
 
-	LASSERT(cmid->device);
-	CDEBUG(D_NET, "%s: connection bound to %s:%pI4h:%s\n",
-	       libcfs_nid2str(peer->ibp_nid), dev->ibd_ifname,
-	       &dev->ibd_ifip, cmid->device->name);
-
 	return;
 
  failed2:
@@ -2996,8 +2991,19 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		} else {
 			rc = rdma_resolve_route(
 				cmid, *kiblnd_tunables.kib_timeout * 1000);
-			if (!rc)
+			if (!rc) {
+				kib_net_t *net = peer_ni->ibp_ni->ni_data;
+				kib_dev_t *dev = net->ibn_dev;
+
+				CDEBUG(D_NET, "%s: connection bound to "\
+				       "%s:%pI4h:%s\n",
+				       libcfs_nid2str(peer_ni->ibp_nid),
+				       dev->ibd_ifname,
+				       &dev->ibd_ifip, cmid->device->name);
+
 				return 0;
+			}
+
 			/* Can't initiate route resolution */
 			CERROR("Can't resolve route for %s: %d\n",
 			       libcfs_nid2str(peer->ibp_nid), rc);
-- 
1.8.3.1


* Re: [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer
  2018-03-09 23:29 [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer Doug Oucharek
@ 2018-03-10  0:34 ` Doug Oucharek
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Oucharek @ 2018-03-10  0:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Drokin, Oleg, Dilger, Andreas,
	James Simmons, alexander.boyko
  Cc: Linux Kernel Mailing List, Lustre Development List

Please ignore this patch.  Turns out it depends on a series which has not been submitted yet.  I’ll resend this one once all of those are done.

Doug


_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


* [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer
@ 2018-05-02  5:22 Doug Oucharek
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Oucharek @ 2018-05-02  5:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger, James Simmons
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Doug Oucharek, Alexander Boyko

From: Doug Oucharek <dougso@me.com>

The cmid is destroyed by OFED if kiblnd_cm_callback returns an error.
If that error happens before kiblnd_connect_peer finishes, the function
touches the destroyed cmid and fails with:
(o2iblnd_cb.c:1315:kiblnd_connect_peer())
            ASSERTION( cmid->device != ((void *)0) ) failed:

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10015
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Doug Oucharek <dougso@me.com>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index b4a182d..a76c1f2 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1290,11 +1290,6 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		goto failed2;
 	}
 
-	LASSERT(cmid->device);
-	CDEBUG(D_NET, "%s: connection bound to %s:%pI4h:%s\n",
-	       libcfs_nid2str(peer->ibp_nid), dev->ibd_ifname,
-	       &dev->ibd_ifip, cmid->device->name);
-
 	return;
 
  failed2:
@@ -2996,8 +2991,19 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		} else {
 			rc = rdma_resolve_route(
 				cmid, *kiblnd_tunables.kib_timeout * 1000);
-			if (!rc)
+			if (!rc) {
+				struct kib_net *net = peer->ibp_ni->ni_data;
+				struct kib_dev *dev = net->ibn_dev;
+
+				CDEBUG(D_NET, "%s: connection bound to "\
+				       "%s:%pI4h:%s\n",
+				       libcfs_nid2str(peer->ibp_nid),
+				       dev->ibd_ifname,
+				       &dev->ibd_ifip, cmid->device->name);
+
 				return 0;
+			}
+
 			/* Can't initiate route resolution */
 			CERROR("Can't resolve route for %s: %d\n",
 			       libcfs_nid2str(peer->ibp_nid), rc);
-- 
1.8.3.1


* Re: [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer
  2018-03-09  7:49 Doug Oucharek
@ 2018-03-14 11:55 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2018-03-14 11:55 UTC (permalink / raw)
  To: Doug Oucharek
  Cc: devel, Doug Oucharek, Andreas Dilger, Linux Kernel Mailing List,
	Oleg Drokin, Alexander Boyko, Lustre Development List

On Fri, Mar 09, 2018 at 02:49:18AM -0500, Doug Oucharek wrote:
> From: Alexander Boyko <alexander.boyko@seagate.com>
> 
> The cmid is destroyed by OFED if kiblnd_cm_callback returns an error.
> If that error happens before kiblnd_connect_peer finishes, the function
> touches the destroyed cmid and fails with:
> (o2iblnd_cb.c:1315:kiblnd_connect_peer())
>             ASSERTION( cmid->device != ((void *)0) ) failed:
> 
> Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10015
> Reviewed-by: Alexey Lyashkov <c17817@cray.com>
> Reviewed-by: Doug Oucharek <dougso@me.com>
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Signed-off-by: Doug Oucharek <dougso@me.com>

Your email address here does not match the From: line :(

Please fix up and resend...

thanks,

greg k-h


* [PATCH] staging: lustre: o2iblnd: fix race at kiblnd_connect_peer
@ 2018-03-09  7:49 Doug Oucharek
  2018-03-14 11:55 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Oucharek @ 2018-03-09  7:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger, James Simmons
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Alexander Boyko, Doug Oucharek

From: Alexander Boyko <alexander.boyko@seagate.com>

The cmid is destroyed by OFED if kiblnd_cm_callback returns an error.
If that error happens before kiblnd_connect_peer finishes, the function
touches the destroyed cmid and fails with:
(o2iblnd_cb.c:1315:kiblnd_connect_peer())
            ASSERTION( cmid->device != ((void *)0) ) failed:

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10015
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Doug Oucharek <dougso@me.com>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 6690a6c..080c2a1 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1290,11 +1290,6 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		goto failed2;
 	}
 
-	LASSERT(cmid->device);
-	CDEBUG(D_NET, "%s: connection bound to %s:%pI4h:%s\n",
-	       libcfs_nid2str(peer->ibp_nid), dev->ibd_ifname,
-	       &dev->ibd_ifip, cmid->device->name);
-
 	return;
 
  failed2:
@@ -2996,8 +2991,19 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 		} else {
 			rc = rdma_resolve_route(
 				cmid, *kiblnd_tunables.kib_timeout * 1000);
-			if (!rc)
+			if (!rc) {
+				kib_net_t *net = peer_ni->ibp_ni->ni_data;
+				kib_dev_t *dev = net->ibn_dev;
+
+				CDEBUG(D_NET, "%s: connection bound to "\
+				       "%s:%pI4h:%s\n",
+				       libcfs_nid2str(peer_ni->ibp_nid),
+				       dev->ibd_ifname,
+				       &dev->ibd_ifip, cmid->device->name);
+
 				return 0;
+			}
+
 			/* Can't initiate route resolution */
 			CERROR("Can't resolve route for %s: %d\n",
 			       libcfs_nid2str(peer->ibp_nid), rc);
-- 
1.8.3.1

