* [PATCH V5] SA Busy Handling
@ 2010-12-03 21:57 Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A208DF42A-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Mike Heinz @ 2010-12-03 21:57 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
The purpose of this patch is to cause the ib_mad driver to discard busy responses from the SA, effectively turning busy responses into timeouts.
This ensures that naïve IB applications cannot overwhelm the SA with queries, which could happen when a cluster is being rebooted, or when a large HPC application is started.
Signed-Off-By: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
----
drivers/infiniband/core/mad.c | 13 +++++++++++++
include/rdma/ib_mad.h | 9 +++++++++
2 files changed, 22 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 64e660c..b322173 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1828,6 +1828,9 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
/* Complete corresponding request */
if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+ u16 busy = be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
+ IB_MGMT_MAD_STATUS_BUSY;
+
spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
if (!mad_send_wr) {
@@ -1836,6 +1839,16 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
deref_mad_agent(mad_agent_priv);
return;
}
+
+ if (busy && mad_send_wr->retries_left &&
+ (mad_recv_wc->recv_buf.mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS)) {
+ /* Just let the query timeout and have it requeued later */
+ spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+ ib_free_recv_mad(mad_recv_wc);
+ deref_mad_agent(mad_agent_priv);
+ return;
+ }
+
ib_mark_mad_done(mad_send_wr);
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index d3b9401..b901968 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -77,6 +77,15 @@
#define IB_MGMT_MAX_METHODS 128
+/* MAD Status field bit masks */
+#define IB_MGMT_MAD_STATUS_SUCCESS 0x0000
+#define IB_MGMT_MAD_STATUS_BUSY 0x0001
+#define IB_MGMT_MAD_STATUS_REDIRECT_REQD 0x0002
+#define IB_MGMT_MAD_STATUS_BAD_VERSION 0x0004
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD 0x0008
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB 0x000c
+#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE 0x001c
+
/* RMPP information */
#define IB_MGMT_RMPP_VERSION 1
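A note on the status values above, going only by the numbers in the patch: BUSY (0x0001) and REDIRECT_REQD (0x0002) are single bits, while the last four values overlap (0x000c equals 0x0004 | 0x0008), so they read as codes within the 0x001c field rather than as independent bit masks. The userspace sketch below illustrates the difference. The helper names and MAD_STATUS_INVALID_FIELD_MASK are hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define IB_MGMT_MAD_STATUS_BUSY                      0x0001
#define IB_MGMT_MAD_STATUS_REDIRECT_REQD             0x0002
#define IB_MGMT_MAD_STATUS_BAD_VERSION               0x0004
#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD        0x0008
#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB 0x000c
#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE      0x001c

/* Hypothetical: mask covering the multi-bit field, inferred from the values. */
#define MAD_STATUS_INVALID_FIELD_MASK 0x001c

/* status is expected in host byte order (after be16_to_cpu() in the kernel). */
static bool mad_status_busy(uint16_t status)
{
	/* Single-bit flag: a plain AND is sufficient. */
	return (status & IB_MGMT_MAD_STATUS_BUSY) != 0;
}

static bool mad_status_is(uint16_t status, uint16_t code)
{
	/* Multi-bit code: compare the whole field, not just one bit. */
	return (status & MAD_STATUS_INVALID_FIELD_MASK) == code;
}
```

With a plain `status & IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD` test, a status of 0x000c would also match, which is why the sketch compares against the whole field. The busy check in the patch is unaffected, since it tests a single bit.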
* Re: [PATCH V5] SA Busy Handling
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A208DF42A-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2010-12-06 16:45 ` Hal Rosenstock
[not found] ` <4CFD1339.4050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2010-12-06 16:54 ` [PATCH V5] SA Busy Handling - style questions Mike Heinz
0 siblings, 2 replies; 5+ messages in thread
From: Hal Rosenstock @ 2010-12-06 16:45 UTC
To: Mike Heinz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 12/3/2010 4:57 PM, Mike Heinz wrote:
> The purpose of this patch is to cause the ib_mad driver to discard busy responses from the SA, effectively causing busy responses to become time outs.
>
> This ensures that naïve IB applications cannot overwhelm the SA with queries, which could happen when a cluster is being rebooted, or when a large HPC application is started.
>
> Signed-Off-By: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
>
> ----
>
> drivers/infiniband/core/mad.c | 13 +++++++++++++
> include/rdma/ib_mad.h | 9 +++++++++
> 2 files changed, 22 insertions(+), 0 deletions(-)
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 64e660c..b322173 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -1828,6 +1828,9 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
>
> /* Complete corresponding request */
> if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
> + u16 busy = be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
> + IB_MGMT_MAD_STATUS_BUSY;
> +
> spin_lock_irqsave(&mad_agent_priv->lock, flags);
> mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
> if (!mad_send_wr) {
> @@ -1836,6 +1839,16 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
> deref_mad_agent(mad_agent_priv);
> return;
> }
> +
> + if (busy && mad_send_wr->retries_left &&
> + (mad_recv_wc->recv_buf.mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS)) {
> + /* Just let the query timeout and have it requeued later */
> + spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
> + ib_free_recv_mad(mad_recv_wc);
> + deref_mad_agent(mad_agent_priv);
> + return;
> + }
This code is duplicated so it should be combined with if (!mad_send_wr)
clause above to eliminate the duplication.
-- Hal
> +
> ib_mark_mad_done(mad_send_wr);
> spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
>
> diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
> index d3b9401..b901968 100644
> --- a/include/rdma/ib_mad.h
> +++ b/include/rdma/ib_mad.h
> @@ -77,6 +77,15 @@
>
> #define IB_MGMT_MAX_METHODS 128
>
> +/* MAD Status field bit masks */
> +#define IB_MGMT_MAD_STATUS_SUCCESS 0x0000
> +#define IB_MGMT_MAD_STATUS_BUSY 0x0001
> +#define IB_MGMT_MAD_STATUS_REDIRECT_REQD 0x0002
> +#define IB_MGMT_MAD_STATUS_BAD_VERSION 0x0004
> +#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD 0x0008
> +#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB 0x000c
> +#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE 0x001c
> +
> /* RMPP information */
> #define IB_MGMT_RMPP_VERSION 1
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: [PATCH V5] SA Busy Handling
[not found] ` <4CFD1339.4050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2010-12-06 16:52 ` Mike Heinz
0 siblings, 0 replies; 5+ messages in thread
From: Mike Heinz @ 2010-12-06 16:52 UTC
To: Hal Rosenstock; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
> This code is duplicated so it should be combined with if (!mad_send_wr)
> clause above to eliminate the duplication.
>
> -- Hal
I guess that's true. The two clauses used to be different, but since we've reverted to simply discarding the response, you're right: they can be merged.
* RE: [PATCH V5] SA Busy Handling - style questions
2010-12-06 16:45 ` Hal Rosenstock
[not found] ` <4CFD1339.4050601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2010-12-06 16:54 ` Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A208DF4E8-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
1 sibling, 1 reply; 5+ messages in thread
From: Mike Heinz @ 2010-12-06 16:54 UTC
To: Hal Rosenstock; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Which is preferred?
if ((!mad_send_wr) ||
    (busy && mad_send_wr->retries_left &&
     (mad_recv_wc->recv_buf.mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS))) {
Or
if ((!mad_send_wr) ||
    (busy && mad_send_wr->retries_left &&
     (mad_recv_wc->recv_buf.mad->mad_hdr.method !=
      IB_MGMT_METHOD_TRAP_REPRESS))) {
?
* Re: [PATCH V5] SA Busy Handling - style questions
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A208DF4E8-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2010-12-06 17:01 ` Hal Rosenstock
0 siblings, 0 replies; 5+ messages in thread
From: Hal Rosenstock @ 2010-12-06 17:01 UTC
To: Mike Heinz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 12/6/2010 11:54 AM, Mike Heinz wrote:
> Which is preferred?
>
> if ((!mad_send_wr) ||
>     (busy && mad_send_wr->retries_left &&
>      (mad_recv_wc->recv_buf.mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS))) {
>
> Or
>
> if ((!mad_send_wr) ||
>     (busy && mad_send_wr->retries_left &&
>      (mad_recv_wc->recv_buf.mad->mad_hdr.method !=
>       IB_MGMT_METHOD_TRAP_REPRESS))) {
>
> ?
>
More like the latter IMO:
if (!mad_send_wr ||
    (busy && mad_send_wr->retries_left &&
     (mad_recv_wc->recv_buf.mad->mad_hdr.method !=
      IB_MGMT_METHOD_TRAP_REPRESS))) {
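As a sanity check on the merged condition, here is a small userspace model of it. It relies on short-circuit evaluation of || so that mad_send_wr is never dereferenced when it is NULL, and the merge is only safe because both original branches run the same cleanup (unlock, ib_free_recv_mad, deref_mad_agent, return). struct send_wr_model and should_drop_response() are hypothetical stand-ins rather than kernel code, and the 0x07 value for IB_MGMT_METHOD_TRAP_REPRESS is what I believe ib_mad.h defines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IB_MGMT_MAD_STATUS_BUSY      0x0001 /* from the patch */
#define IB_MGMT_METHOD_TRAP_REPRESS  0x07   /* assumed to match ib_mad.h */

/* Hypothetical stand-in for the one field of the send work request used here. */
struct send_wr_model {
	int retries_left;
};

/*
 * Returns true when the received response should be freed and ignored:
 * either no matching send was found, or the response is "busy", retries
 * remain, and the method is not TrapRepress. status is in host byte order.
 */
static bool should_drop_response(const struct send_wr_model *send_wr,
				 uint16_t status, uint8_t method)
{
	bool busy = (status & IB_MGMT_MAD_STATUS_BUSY) != 0;

	/* || short-circuits, so send_wr is only dereferenced when non-NULL. */
	return !send_wr ||
	       (busy && send_wr->retries_left &&
		method != IB_MGMT_METHOD_TRAP_REPRESS);
}
```

Once retries_left reaches zero, the predicate returns false for a busy response, so in this model the final busy response is delivered to the caller instead of being discarded.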