* [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
@ 2015-10-11 12:58 Matan Barak
       [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-11 12:58 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Matan Barak, Sean Hefty, Jason Gunthorpe, Doron Tsur

From: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

ib_send_cm_sidr_rep could sometimes erase the node from the sidr
rb-tree (depending on errors in the process). Since ib_send_cm_sidr_rep
is called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
could be either erased from the rb-tree twice or not erased at all.
Fix that by making sure it's erased only once before freeing
cm_id_priv.

Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
Signed-off-by: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---

Hi Doug,
This patch fixes a bug in the CM. In some flows, an rb-tree node could
be freed twice or used after it was freed. This bug was picked up by
our regression tests, and the fix was verified.

Thanks,
Doron and Matan

 drivers/infiniband/core/cm.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index f5cf1c4..56ff0f3 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -844,6 +844,11 @@ retest:
 	case IB_CM_SIDR_REQ_RCVD:
 		spin_unlock_irq(&cm_id_priv->lock);
 		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
+		spin_lock_irq(&cm.lock);
+		if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
+			rb_erase(&cm_id_priv->sidr_id_node,
+				 &cm.remote_sidr_table);
+		spin_unlock_irq(&cm.lock);
 		break;
 	case IB_CM_REQ_SENT:
 	case IB_CM_MRA_REQ_RCVD:
@@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
 	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
 
 	spin_lock_irqsave(&cm.lock, flags);
-	rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
+	if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
+		rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
+		RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
+	}
 	spin_unlock_irqrestore(&cm.lock, flags);
 	return 0;
 
-- 
2.1.0
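
For readers following the thread, the erase-once guard in the diff above
can be sketched in userspace C. This is a minimal sketch under stated
assumptions, not the kernel code: struct node, struct table and every
function below are hypothetical stand-ins (a linked list instead of an
rb-tree), chosen only to mirror the roles of rb_erase(), RB_EMPTY_NODE()
and RB_CLEAR_NODE(). It also assumes the node starts out in the cleared
state; the hunks above do not show where sidr_id_node is first
initialized.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rb_node: a list node is enough here. */
struct node {
	struct node *prev, *next;
};

struct table {
	struct node head;	/* sentinel */
	int erase_calls;	/* counts raw unlink operations */
};

static void table_init(struct table *t)
{
	t->head.prev = t->head.next = &t->head;
	t->erase_calls = 0;
}

/* Plays the role of RB_CLEAR_NODE(): mark the node as "in no table". */
static void node_clear(struct node *n)
{
	n->prev = n->next = n;
}

/* Plays the role of RB_EMPTY_NODE(): true only for a cleared node. */
static bool node_empty(const struct node *n)
{
	return n->next == n;
}

static void table_insert(struct table *t, struct node *n)
{
	n->prev = &t->head;
	n->next = t->head.next;
	t->head.next->prev = n;
	t->head.next = n;
}

/* Plays the role of rb_erase(): unlinks, but leaves stale pointers in n. */
static void table_erase(struct table *t, struct node *n)
{
	assert(!node_empty(n));	/* a raw erase of an unlinked node is a bug */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	t->erase_calls++;
}

/* Safe to call from every teardown path: unlinks at most once. */
static bool erase_once(struct table *t, struct node *n)
{
	if (node_empty(n))
		return false;
	table_erase(t, n);
	node_clear(n);
	return true;
}
```

The point of the sketch is that the raw erase does not mark the node as
unlinked by itself, so the "empty" test is only meaningful when every
erase is followed by a clear (or the object is freed right afterwards).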

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-11 12:58   ` Matan Barak
       [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-11 15:28   ` [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Or Gerlitz
  2015-10-12 16:37   ` Hefty, Sean
  2 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-11 12:58 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Matan Barak, Sean Hefty, Jason Gunthorpe

When IP based addressing was introduced, ib_create_ah_from_wc was
changed in order to support a suitable AH, since this AH should
now contain the DMAC (which isn't a simple derivative of the GID).
In order to find the DMAC, an ARP request sometimes has to be sent,
and waiting for its response may sleep.

ib_create_ah_from_wc is called from cm_alloc_response_msg, which is
sometimes called from an atomic context. This caused a
sleeping-while-atomic bug. Fix this by splitting
cm_alloc_response_msg into an atomic part and a sleepable part. When
cm_alloc_response_msg is used in an atomic context, we try to create
the AH before entering the atomic context.

Fixes: 66bd20a72d2f ('IB/core: Ethernet L2 attributes in verbs/cm structures')
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---

Hi Doug,
This patch fixes an old bug in the CM. IP based addressing requires
ARP resolution, which may sleep by its nature. This resolution
was sometimes done in a non-sleepable context. Our regression tests
picked up this bug and verified this fix.

Thanks,
Matan

 drivers/infiniband/core/cm.c | 60 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index ea4db9c..f5cf1c4 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -287,17 +287,12 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
 	return 0;
 }
 
-static int cm_alloc_response_msg(struct cm_port *port,
-				 struct ib_mad_recv_wc *mad_recv_wc,
-				 struct ib_mad_send_buf **msg)
+static int _cm_alloc_response_msg(struct cm_port *port,
+				  struct ib_mad_recv_wc *mad_recv_wc,
+				  struct ib_ah *ah,
+				  struct ib_mad_send_buf **msg)
 {
 	struct ib_mad_send_buf *m;
-	struct ib_ah *ah;
-
-	ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
-				  mad_recv_wc->recv_buf.grh, port->port_num);
-	if (IS_ERR(ah))
-		return PTR_ERR(ah);
 
 	m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
 			       0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
@@ -312,6 +307,20 @@ static int cm_alloc_response_msg(struct cm_port *port,
 	return 0;
 }
 
+static int cm_alloc_response_msg(struct cm_port *port,
+				 struct ib_mad_recv_wc *mad_recv_wc,
+				 struct ib_mad_send_buf **msg)
+{
+	struct ib_ah *ah;
+
+	ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
+				  mad_recv_wc->recv_buf.grh, port->port_num);
+	if (IS_ERR(ah))
+		return PTR_ERR(ah);
+
+	return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+}
+
 static void cm_free_msg(struct ib_mad_send_buf *msg)
 {
 	ib_destroy_ah(msg->ah);
@@ -2201,6 +2210,7 @@ static int cm_dreq_handler(struct cm_work *work)
 	struct cm_id_private *cm_id_priv;
 	struct cm_dreq_msg *dreq_msg;
 	struct ib_mad_send_buf *msg = NULL;
+	struct ib_ah *ah;
 	int ret;
 
 	dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
@@ -2213,6 +2223,11 @@ static int cm_dreq_handler(struct cm_work *work)
 		return -EINVAL;
 	}
 
+	ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
+				  work->mad_recv_wc->wc,
+				  work->mad_recv_wc->recv_buf.grh,
+				  work->port->port_num);
+
 	work->cm_event.private_data = &dreq_msg->private_data;
 
 	spin_lock_irq(&cm_id_priv->lock);
@@ -2234,9 +2249,13 @@ static int cm_dreq_handler(struct cm_work *work)
 	case IB_CM_TIMEWAIT:
 		atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
 				counter[CM_DREQ_COUNTER]);
-		if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
+		if (IS_ERR(ah))
+			goto unlock;
+		if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
+					   &msg))
 			goto unlock;
 
+		ah = NULL;
 		cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
 			       cm_id_priv->private_data,
 			       cm_id_priv->private_data_len);
@@ -2259,6 +2278,8 @@ static int cm_dreq_handler(struct cm_work *work)
 		list_add_tail(&work->list, &cm_id_priv->work_list);
 	spin_unlock_irq(&cm_id_priv->lock);
 
+	if (!IS_ERR_OR_NULL(ah))
+		ib_destroy_ah(ah);
 	if (ret)
 		cm_process_work(cm_id_priv, work);
 	else
@@ -2266,6 +2287,8 @@ static int cm_dreq_handler(struct cm_work *work)
 	return 0;
 
 unlock:	spin_unlock_irq(&cm_id_priv->lock);
+	if (!IS_ERR_OR_NULL(ah))
+		ib_destroy_ah(ah);
 deref:	cm_deref_id(cm_id_priv);
 	return -EINVAL;
 }
@@ -2761,6 +2784,7 @@ static int cm_lap_handler(struct cm_work *work)
 	struct cm_lap_msg *lap_msg;
 	struct ib_cm_lap_event_param *param;
 	struct ib_mad_send_buf *msg = NULL;
+	struct ib_ah *ah;
 	int ret;
 
 	/* todo: verify LAP request and send reject APR if invalid. */
@@ -2770,6 +2794,12 @@ static int cm_lap_handler(struct cm_work *work)
 	if (!cm_id_priv)
 		return -EINVAL;
 
+
+	ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
+				  work->mad_recv_wc->wc,
+				  work->mad_recv_wc->recv_buf.grh,
+				  work->port->port_num);
+
 	param = &work->cm_event.param.lap_rcvd;
 	param->alternate_path = &work->path[0];
 	cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
@@ -2786,9 +2816,13 @@ static int cm_lap_handler(struct cm_work *work)
 	case IB_CM_MRA_LAP_SENT:
 		atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
 				counter[CM_LAP_COUNTER]);
-		if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
+		if (IS_ERR(ah))
+			goto unlock;
+		if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
+					   &msg))
 			goto unlock;
 
+		ah = NULL;
 		cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
 			      CM_MSG_RESPONSE_OTHER,
 			      cm_id_priv->service_timeout,
@@ -2818,6 +2852,8 @@ static int cm_lap_handler(struct cm_work *work)
 		list_add_tail(&work->list, &cm_id_priv->work_list);
 	spin_unlock_irq(&cm_id_priv->lock);
 
+	if (!IS_ERR_OR_NULL(ah))
+		ib_destroy_ah(ah);
 	if (ret)
 		cm_process_work(cm_id_priv, work);
 	else
@@ -2825,6 +2861,8 @@ static int cm_lap_handler(struct cm_work *work)
 	return 0;
 
 unlock:	spin_unlock_irq(&cm_id_priv->lock);
+	if (!IS_ERR_OR_NULL(ah))
+		ib_destroy_ah(ah);
 deref:	cm_deref_id(cm_id_priv);
 	return -EINVAL;
 }
-- 
2.1.0
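
The shape of the handler changes above (create the resource before
taking the lock, check for failure only in the branch that needs it,
hand ownership to the message, release any leftover after unlocking) can
be sketched in userspace C. This is only an illustration under stated
assumptions: fake_lock(), create_handle_may_sleep(), struct handle,
struct message and fail_create are hypothetical names, a boolean stands
in for the spinlock, and NULL stands in for the ERR_PTR-style error
value used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static bool in_atomic;		/* stand-in for "a spinlock is held" */
static bool fail_create;	/* test knob: force creation to fail */

static void fake_lock(void)   { in_atomic = true; }
static void fake_unlock(void) { in_atomic = false; }

struct handle { int id; };		/* stand-in for struct ib_ah */
struct message { struct handle *h; };	/* stand-in for the send buffer */

/* May sleep (think: address resolution), so it must not run while locked. */
static struct handle *create_handle_may_sleep(int id)
{
	struct handle *h;

	assert(!in_atomic);
	if (fail_create)
		return NULL;
	h = malloc(sizeof(*h));
	if (h)
		h->id = id;
	return h;
}

/* Returns 0 on success, -1 if a response was needed but could not be built. */
static int handle_request(bool need_response, struct message *out)
{
	/* Sleepable step happens first, outside the atomic section. */
	struct handle *h = create_handle_may_sleep(1);
	int ret = 0;

	fake_lock();
	if (need_response) {
		if (!h) {		/* failure only matters in this branch */
			ret = -1;
			goto unlock;
		}
		out->h = h;
		h = NULL;		/* ownership moved to the message */
	}
unlock:
	fake_unlock();
	free(h);			/* unused handle; free(NULL) is a no-op */
	return ret;
}
```

The trade-off is that the handle is created even on paths that never
send a response, which costs a possibly slow resolution in exchange for
never sleeping under the lock.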


* Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-11 12:58   ` [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC Matan Barak
@ 2015-10-11 15:28   ` Or Gerlitz
       [not found]     ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310@HE1PR05MB1466.eurprd05.prod.outlook.com>
  2015-10-12 16:37   ` Hefty, Sean
  2 siblings, 1 reply; 23+ messages in thread
From: Or Gerlitz @ 2015-10-11 15:28 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Sean Hefty, Jason Gunthorpe, Doron Tsur

> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')

Please remove the "[PATCH]" thing from the change-log and respin

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-12 12:59       ` Devesh Sharma
       [not found]         ` <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-10-12 16:42       ` Hefty, Sean
  1 sibling, 1 reply; 23+ messages in thread
From: Devesh Sharma @ 2015-10-12 12:59 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Sean Hefty, Jason Gunthorpe

Looks good, just one doubt inline:

On Sun, Oct 11, 2015 at 6:28 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> When IP based addressing was introduced, ib_create_ah_from_wc was
> changed in order to support a suitable AH. Since this AH should
> now contains the DMAC (which isn't a simple derivative of the GID).
> In order to find the DMAC, an ARP should sometime be sent. This ARP
> is a sleeping context.
>
> ib_create_ah_from_wc is called from cm_alloc_response_msg, which is
> sometimes called from an atomic context. This caused a
> sleeping-while-atomic bug. Fixing this by splitting
> cm_alloc_response_msg to an atomic and sleep-able part. When
> cm_alloc_response_msg is used in an atomic context, we try to create
> the AH before entering the atomic context.
>
> Fixes: 66bd20a72d2f ('IB/core: Ethernet L2 attributes in verbs/cm structures')
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>
> Hi Doug,
> This patch fixes an old bug in the CM. IP based addressing requires
> ARP resolution which isn't sleep-able by its nature. This resolution
> was sometimes done in non sleep-able context. Our regression tests
> picked up this bug and verified this fix.
>
> Thanks,
> Matan
>
>  drivers/infiniband/core/cm.c | 60 ++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 49 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index ea4db9c..f5cf1c4 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -287,17 +287,12 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
>         return 0;
>  }
>
> -static int cm_alloc_response_msg(struct cm_port *port,
> -                                struct ib_mad_recv_wc *mad_recv_wc,
> -                                struct ib_mad_send_buf **msg)
> +static int _cm_alloc_response_msg(struct cm_port *port,
> +                                 struct ib_mad_recv_wc *mad_recv_wc,
> +                                 struct ib_ah *ah,
> +                                 struct ib_mad_send_buf **msg)
>  {
>         struct ib_mad_send_buf *m;
> -       struct ib_ah *ah;
> -
> -       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
> -                                 mad_recv_wc->recv_buf.grh, port->port_num);
> -       if (IS_ERR(ah))
> -               return PTR_ERR(ah);
>
>         m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
>                                0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
> @@ -312,6 +307,20 @@ static int cm_alloc_response_msg(struct cm_port *port,
>         return 0;
>  }
>
> +static int cm_alloc_response_msg(struct cm_port *port,
> +                                struct ib_mad_recv_wc *mad_recv_wc,
> +                                struct ib_mad_send_buf **msg)
> +{
> +       struct ib_ah *ah;
> +
> +       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
> +                                 mad_recv_wc->recv_buf.grh, port->port_num);
> +       if (IS_ERR(ah))
> +               return PTR_ERR(ah);
> +
> +       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
> +}
> +
>  static void cm_free_msg(struct ib_mad_send_buf *msg)
>  {
>         ib_destroy_ah(msg->ah);
> @@ -2201,6 +2210,7 @@ static int cm_dreq_handler(struct cm_work *work)
>         struct cm_id_private *cm_id_priv;
>         struct cm_dreq_msg *dreq_msg;
>         struct ib_mad_send_buf *msg = NULL;
> +       struct ib_ah *ah;
>         int ret;
>
>         dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
> @@ -2213,6 +2223,11 @@ static int cm_dreq_handler(struct cm_work *work)
>                 return -EINVAL;
>         }
>
> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
> +                                 work->mad_recv_wc->wc,
> +                                 work->mad_recv_wc->recv_buf.grh,
> +                                 work->port->port_num);
> +

Shouldn't the IS_ERR(ah) check below be done here instead?

>         work->cm_event.private_data = &dreq_msg->private_data;
>
>         spin_lock_irq(&cm_id_priv->lock);
> @@ -2234,9 +2249,13 @@ static int cm_dreq_handler(struct cm_work *work)
>         case IB_CM_TIMEWAIT:
>                 atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>                                 counter[CM_DREQ_COUNTER]);
> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
> +               if (IS_ERR(ah))
> +                       goto unlock;
> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
> +                                          &msg))
>                         goto unlock;
>
> +               ah = NULL;
>                 cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
>                                cm_id_priv->private_data,
>                                cm_id_priv->private_data_len);
> @@ -2259,6 +2278,8 @@ static int cm_dreq_handler(struct cm_work *work)
>                 list_add_tail(&work->list, &cm_id_priv->work_list);
>         spin_unlock_irq(&cm_id_priv->lock);
>
> +       if (!IS_ERR_OR_NULL(ah))
> +               ib_destroy_ah(ah);
>         if (ret)
>                 cm_process_work(cm_id_priv, work);
>         else
> @@ -2266,6 +2287,8 @@ static int cm_dreq_handler(struct cm_work *work)
>         return 0;
>
>  unlock:        spin_unlock_irq(&cm_id_priv->lock);
> +       if (!IS_ERR_OR_NULL(ah))
> +               ib_destroy_ah(ah);
>  deref: cm_deref_id(cm_id_priv);
>         return -EINVAL;
>  }
> @@ -2761,6 +2784,7 @@ static int cm_lap_handler(struct cm_work *work)
>         struct cm_lap_msg *lap_msg;
>         struct ib_cm_lap_event_param *param;
>         struct ib_mad_send_buf *msg = NULL;
> +       struct ib_ah *ah;
>         int ret;
>
>         /* todo: verify LAP request and send reject APR if invalid. */
> @@ -2770,6 +2794,12 @@ static int cm_lap_handler(struct cm_work *work)
>         if (!cm_id_priv)
>                 return -EINVAL;
>
> +
> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
> +                                 work->mad_recv_wc->wc,
> +                                 work->mad_recv_wc->recv_buf.grh,
> +                                 work->port->port_num);
> +
>         param = &work->cm_event.param.lap_rcvd;
>         param->alternate_path = &work->path[0];
>         cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
> @@ -2786,9 +2816,13 @@ static int cm_lap_handler(struct cm_work *work)
>         case IB_CM_MRA_LAP_SENT:
>                 atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>                                 counter[CM_LAP_COUNTER]);
> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
> +               if (IS_ERR(ah))
> +                       goto unlock;
> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
> +                                          &msg))
>                         goto unlock;
>
> +               ah = NULL;
>                 cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
>                               CM_MSG_RESPONSE_OTHER,
>                               cm_id_priv->service_timeout,
> @@ -2818,6 +2852,8 @@ static int cm_lap_handler(struct cm_work *work)
>                 list_add_tail(&work->list, &cm_id_priv->work_list);
>         spin_unlock_irq(&cm_id_priv->lock);
>
> +       if (!IS_ERR_OR_NULL(ah))
> +               ib_destroy_ah(ah);
>         if (ret)
>                 cm_process_work(cm_id_priv, work);
>         else
> @@ -2825,6 +2861,8 @@ static int cm_lap_handler(struct cm_work *work)
>         return 0;
>
>  unlock:        spin_unlock_irq(&cm_id_priv->lock);
> +       if (!IS_ERR_OR_NULL(ah))
> +               ib_destroy_ah(ah);
>  deref: cm_deref_id(cm_id_priv);
>         return -EINVAL;
>  }
> --
> 2.1.0

* Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found]       ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310-eBadYZ65MZ+I1hPkL3GmLNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2015-10-12 13:14         ` Or Gerlitz
  0 siblings, 0 replies; 23+ messages in thread
From: Or Gerlitz @ 2015-10-12 13:14 UTC (permalink / raw)
  To: Matan Barak; +Cc: Hefty, Sean, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Oct 12, 2015 at 10:13 AM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> But that's the name of the commit....
>
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Or Gerlitz
> Sent: Sunday, October 11, 2015 6:28 PM
> To: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>; Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>; Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>; Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Subject: Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
>
>> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
>
> Please remove the the "[PATCH]" thing from the change-log and respin

oops, back in 2005 this appeared in the commit title... so ok to let
it go here...

Or.

* RE: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-11 12:58   ` [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC Matan Barak
  2015-10-11 15:28   ` [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Or Gerlitz
@ 2015-10-12 16:37   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373A9734333-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2 siblings, 1 reply; 23+ messages in thread
From: Hefty, Sean @ 2015-10-12 16:37 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe, Doron Tsur

> ib_send_cm_sidr_rep could sometimes erase the node from the sidr
> (depending on errors in the process). Since ib_send_cm_sidr_rep is
> called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv

This should clarify that it is the app calling from the callback, and not a direct call from the cm_sidr_req_handler.

> could be either erased from the rb_tree twice or not erased at all.

In an error case, I can see why it would be left in the rbtree, but I don't see how it can be removed twice.


> Fixing that by making sure it's erased only once before freeing
> cm_id_priv.
> 
> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
> Signed-off-by: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
> 
> Hi Doug,
> This patch fixes a bug in the CM. In some flow, rb-tree could be
> freed twice or used after it was freed. This bug was picked by
> our regression tests and this fix was verified.
> 
> Thanks,
> Doron and Matan
> 
>  drivers/infiniband/core/cm.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index f5cf1c4..56ff0f3 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -844,6 +844,11 @@ retest:
>  	case IB_CM_SIDR_REQ_RCVD:
>  		spin_unlock_irq(&cm_id_priv->lock);
>  		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
> +		spin_lock_irq(&cm.lock);
> +		if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
> +			rb_erase(&cm_id_priv->sidr_id_node,
> +				 &cm.remote_sidr_table);
> +		spin_unlock_irq(&cm.lock);

We should be able to use a return value from cm_reject_sidr_req() -- passed through from ib_send_cm_sidr_rep() to determine if the id was removed from the tree.

>  		break;
>  	case IB_CM_REQ_SENT:
>  	case IB_CM_MRA_REQ_RCVD:
> @@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
>  	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
> 
>  	spin_lock_irqsave(&cm.lock, flags);
> -	rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
> +	if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
> +		rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
> +		RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
> +	}
>  	spin_unlock_irqrestore(&cm.lock, flags);

Something is very wrong in this function if the id is not in the tree at this point.
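
One possible reading of the return-value suggestion made earlier in this
reply, sketched in userspace C. This is not code from the thread or from
cm.c: struct entry, send_reply(), destroy_entry() and the send_fails
knob are hypothetical, and the sketch assumes that a failing reply
returns before it reaches the removal step.

```c
#include <assert.h>
#include <stdbool.h>

struct entry {
	bool in_table;		/* stand-in for "linked into the tree" */
	bool send_fails;	/* test knob: make the reply path fail early */
};

static int removals;		/* how many times any entry was unlinked */

static void remove_from_table(struct entry *e)
{
	assert(e->in_table);	/* catches a double removal */
	e->in_table = false;
	removals++;
}

/* Hypothetical reply path: removes the entry only when it succeeds. */
static int send_reply(struct entry *e)
{
	if (e->send_fails)
		return -1;	/* error before the removal step */
	remove_from_table(e);
	return 0;
}

/* Teardown: the return value says whether removal is still pending. */
static void destroy_entry(struct entry *e)
{
	if (send_reply(e) != 0)
		remove_from_table(e);
}
```

With this shape the caller never has to inspect the node itself; it
relies entirely on the callee reporting whether it got as far as the
removal.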

* RE: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-12 12:59       ` Devesh Sharma
@ 2015-10-12 16:42       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373A9734356-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Hefty, Sean @ 2015-10-12 16:42 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe

> When IP based addressing was introduced, ib_create_ah_from_wc was
> changed in order to support a suitable AH. Since this AH should
> now contains the DMAC (which isn't a simple derivative of the GID).
> In order to find the DMAC, an ARP should sometime be sent. This ARP
> is a sleeping context.

Wait - are you saying that the CM may now be waiting for an ARP response before it can send a message?


* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373A9734356-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-10-13  8:22           ` Matan Barak
       [not found]             ` <CAAKD3BCe_LAuyxifm=j-Am44S1k4nT328WrGBC+Day+XxMxk9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-13  8:22 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

On Mon, Oct 12, 2015 at 7:42 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> When IP based addressing was introduced, ib_create_ah_from_wc was
>> changed in order to support a suitable AH. Since this AH should
>> now contains the DMAC (which isn't a simple derivative of the GID).
>> In order to find the DMAC, an ARP should sometime be sent. This ARP
>> is a sleeping context.
>
> Wait - are you saying that the CM may now be waiting for an ARP response before it can send a message?
>

ib_create_ah_from_wc needs to resolve the DMAC in order to create the
AH (this may result in sending an ARP request and waiting for the
response). The CM uses this function (which can now sleep).


* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]         ` <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-13  8:24           ` Matan Barak
  0 siblings, 0 replies; 23+ messages in thread
From: Matan Barak @ 2015-10-13  8:24 UTC (permalink / raw)
  To: Devesh Sharma
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Sean Hefty, Jason Gunthorpe



On 10/12/2015 3:59 PM, Devesh Sharma wrote:
> Looks good, just one doubt inline:
>

Thanks for looking at this patch.

> On Sun, Oct 11, 2015 at 6:28 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> When IP based addressing was introduced, ib_create_ah_from_wc was
>> changed in order to support a suitable AH. Since this AH should
>> now contains the DMAC (which isn't a simple derivative of the GID).
>> In order to find the DMAC, an ARP should sometime be sent. This ARP
>> is a sleeping context.
>>
>> ib_create_ah_from_wc is called from cm_alloc_response_msg, which is
>> sometimes called from an atomic context. This caused a
>> sleeping-while-atomic bug. Fixing this by splitting
>> cm_alloc_response_msg to an atomic and sleep-able part. When
>> cm_alloc_response_msg is used in an atomic context, we try to create
>> the AH before entering the atomic context.
>>
>> Fixes: 66bd20a72d2f ('IB/core: Ethernet L2 attributes in verbs/cm structures')
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>
>> Hi Doug,
>> This patch fixes an old bug in the CM. IP based addressing requires
>> ARP resolution which isn't sleep-able by its nature. This resolution
>> was sometimes done in non sleep-able context. Our regression tests
>> picked up this bug and verified this fix.
>>
>> Thanks,
>> Matan
>>
>>   drivers/infiniband/core/cm.c | 60 ++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 49 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index ea4db9c..f5cf1c4 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -287,17 +287,12 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
>>          return 0;
>>   }
>>
>> -static int cm_alloc_response_msg(struct cm_port *port,
>> -                                struct ib_mad_recv_wc *mad_recv_wc,
>> -                                struct ib_mad_send_buf **msg)
>> +static int _cm_alloc_response_msg(struct cm_port *port,
>> +                                 struct ib_mad_recv_wc *mad_recv_wc,
>> +                                 struct ib_ah *ah,
>> +                                 struct ib_mad_send_buf **msg)
>>   {
>>          struct ib_mad_send_buf *m;
>> -       struct ib_ah *ah;
>> -
>> -       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> -                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> -       if (IS_ERR(ah))
>> -               return PTR_ERR(ah);
>>
>>          m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
>>                                 0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
>> @@ -312,6 +307,20 @@ static int cm_alloc_response_msg(struct cm_port *port,
>>          return 0;
>>   }
>>
>> +static int cm_alloc_response_msg(struct cm_port *port,
>> +                                struct ib_mad_recv_wc *mad_recv_wc,
>> +                                struct ib_mad_send_buf **msg)
>> +{
>> +       struct ib_ah *ah;
>> +
>> +       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> +                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> +       if (IS_ERR(ah))
>> +               return PTR_ERR(ah);
>> +
>> +       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
>> +}
>> +
>>   static void cm_free_msg(struct ib_mad_send_buf *msg)
>>   {
>>          ib_destroy_ah(msg->ah);
>> @@ -2201,6 +2210,7 @@ static int cm_dreq_handler(struct cm_work *work)
>>          struct cm_id_private *cm_id_priv;
>>          struct cm_dreq_msg *dreq_msg;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
>> @@ -2213,6 +2223,11 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  return -EINVAL;
>>          }
>>
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>
> Shouldn't below IS_ERR(ah) on ah be here, instead of there?
>

I don't think we want to fail on error if the state != IB_CM_TIMEWAIT 
(other states don't use this ah).
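
To illustrate the point, here is a minimal userspace sketch of the pattern: create the resource before taking the lock, and look at the error only in the one state that consumes it. Everything in it (fake_ah, create_ah, handle_dreq, the in_atomic flag) is a hypothetical stand-in written for this example, not code from cm.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct fake_ah { int port; };          /* hypothetical address handle */
enum fake_state { ST_ESTABLISHED, ST_TIMEWAIT };

int in_atomic;                         /* models "spinlock held" */
int live_ahs;                          /* leak check for the demo */

/* Models a call that may sleep, so it must never run while in_atomic. */
struct fake_ah *create_ah(int fail)
{
	struct fake_ah *ah;

	assert(!in_atomic);
	if (fail)
		return err_ptr(-ENOMEM);
	ah = malloc(sizeof(*ah));
	if (!ah)
		return err_ptr(-ENOMEM);
	live_ahs++;
	return ah;
}

void destroy_ah(struct fake_ah *ah)
{
	free(ah);
	live_ahs--;
}

/* Returns 0 on success, -EINVAL when the response could not be built. */
int handle_dreq(enum fake_state state, int fail_ah)
{
	struct fake_ah *ah = create_ah(fail_ah);   /* before the lock */
	int ret = 0;

	in_atomic = 1;                             /* stands for spin_lock_irq() */
	switch (state) {
	case ST_TIMEWAIT:
		if (is_err(ah)) {                  /* only this state needs the AH */
			ret = -EINVAL;
			break;
		}
		/* Ownership would move to the message; freed here for brevity. */
		destroy_ah(ah);
		ah = NULL;
		break;
	default:
		break;                             /* AH unused, error ignored */
	}
	in_atomic = 0;                             /* stands for spin_unlock_irq() */

	if (ah && !is_err(ah))                     /* like !IS_ERR_OR_NULL(ah) */
		destroy_ah(ah);
	return ret;
}
```

With this shape, handle_dreq(ST_ESTABLISHED, 1) still returns 0 even though the handle could not be created, while handle_dreq(ST_TIMEWAIT, 1) fails, which is the behaviour argued for above.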

>>          work->cm_event.private_data = &dreq_msg->private_data;
>>
>>          spin_lock_irq(&cm_id_priv->lock);
>> @@ -2234,9 +2249,13 @@ static int cm_dreq_handler(struct cm_work *work)
>>          case IB_CM_TIMEWAIT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_DREQ_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
>>                                 cm_id_priv->private_data,
>>                                 cm_id_priv->private_data_len);
>> @@ -2259,6 +2278,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2266,6 +2287,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> @@ -2761,6 +2784,7 @@ static int cm_lap_handler(struct cm_work *work)
>>          struct cm_lap_msg *lap_msg;
>>          struct ib_cm_lap_event_param *param;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          /* todo: verify LAP request and send reject APR if invalid. */
>> @@ -2770,6 +2794,12 @@ static int cm_lap_handler(struct cm_work *work)
>>          if (!cm_id_priv)
>>                  return -EINVAL;
>>
>> +
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>>          param = &work->cm_event.param.lap_rcvd;
>>          param->alternate_path = &work->path[0];
>>          cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
>> @@ -2786,9 +2816,13 @@ static int cm_lap_handler(struct cm_work *work)
>>          case IB_CM_MRA_LAP_SENT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_LAP_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
>>                                CM_MSG_RESPONSE_OTHER,
>>                                cm_id_priv->service_timeout,
>> @@ -2818,6 +2852,8 @@ static int cm_lap_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2825,6 +2861,8 @@ static int cm_lap_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> --
>> 2.1.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]             ` <CAAKD3BCe_LAuyxifm=j-Am44S1k4nT328WrGBC+Day+XxMxk9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-13 16:18               ` Hefty, Sean
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A82373A9734A57-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Hefty, Sean @ 2015-10-13 16:18 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

> On Mon, Oct 12, 2015 at 7:42 PM, Hefty, Sean <sean.hefty@intel.com> wrote:
> >> When IP based addressing was introduced, ib_create_ah_from_wc was
> >> changed in order to support a suitable AH. Since this AH should
> >> now contains the DMAC (which isn't a simple derivative of the GID).
> >> In order to find the DMAC, an ARP should sometime be sent. This ARP
> >> is a sleeping context.
> >
> > Wait - are you saying that the CM may now be waiting for an ARP response
> before it can send a message?
> >
> 
> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
> AH (this may result sending an ARP and waiting for response).
> CM uses this function (which is now sleepable).

This is a significant change to the CM.  The CM calls are invoked
assuming that they return relatively quickly.  They're invoked from
callbacks and internally.  Having the calls now wait for an ARP response
requires that this be re-architected, so the calling thread doesn't go
out to lunch for several seconds.

- Sean

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A82373A9734A57-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-10-14  7:44                   ` Matan Barak
       [not found]                     ` <CAAKD3BAF763brdhsrHtpm_peHk--g3iza53AioiUefepHM_s2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-14  7:44 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

On Tue, Oct 13, 2015 at 7:18 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> On Mon, Oct 12, 2015 at 7:42 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> >> When IP based addressing was introduced, ib_create_ah_from_wc was
>> >> changed in order to support a suitable AH. Since this AH should
>> >> now contains the DMAC (which isn't a simple derivative of the GID).
>> >> In order to find the DMAC, an ARP should sometime be sent. This ARP
>> >> is a sleeping context.
>> >
>> > Wait - are you saying that the CM may now be waiting for an ARP response
>> before it can send a message?
>> >
>>
>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>> AH (this may result sending an ARP and waiting for response).
>> CM uses this function (which is now sleepable).
>
> This is a significant change to the CM.  The CM calls are invoked assuming that they return relatively quickly.  They're invoked from callbacks and internally.  Having the calls now wait for an ARP response requires that this be re-architected, so the calling thread doesn't go out to lunch for several seconds.

Agreed - this is a significant change, but it was done a long time ago
(at v4.3 if I recall). When we need to send a message we need to
figure out the destination MAC. Even the passive side needs to do that,
as some vendors don't report the source MAC of the packet in their wc.
Even if they did, since IP based addressing is routable by its
nature, it should follow the networking stack rules. Some crazy
configurations could force sending responses to packets that came from
router1 to router2 - so we have no choice but to resolve the DMAC on
every side.

>
> - Sean

Matan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373A9734333-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-10-15 15:15       ` Matan Barak
       [not found]         ` <561FC309.2030102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-15 15:15 UTC (permalink / raw)
  To: Hefty, Sean, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe, Doron Tsur



On 10/12/2015 7:37 PM, Hefty, Sean wrote:
>> ib_send_cm_sidr_rep could sometimes erase the node from the sidr
>> (depending on errors in the process). Since ib_send_cm_sidr_rep is
>> called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
>
> This should clarify that it is the app calling from the callback, and not a direct call from the cm_sidr_req_handler.
>

Consider the following error flows:

Double free:
  cm_sidr_req_handler:3156 -> cm_reject_sidr_req:663 ->
    ib_send_cm_sidr_rep:3233 -> rb_erase
  cm_sidr_req_handler:3173 -> ib_destroy_cm_id -> cm_destroy_id:846 ->
    ib_send_cm_sidr_rep:3233 -> rb_erase

RB contains free node:
  cm_sidr_req_handler:3156 -> cm_reject_sidr_req:663 ->
    ib_send_cm_sidr_rep -> returns error (for example cm_alloc_msg, 3219)
  cm_sidr_req_handler:3173 -> ib_destroy_cm_id -> cm_destroy_id:846 ->
    cm_reject_sidr_req -> cm_reject_sidr_req:663 ->
    returns error (for example cm_alloc_msg, 3219) ->
    RB wasn't erased but memory is freed (:910 kfree(cm_id_priv))


>> could be either erased from the rb_tree twice or not erased at all.
>
> In an error case, I can see why it would be left in the rbtree, but I don't see how it can be removed twice.
>
>
>> Fixing that by making sure it's erased only once before freeing
>> cm_id_priv.
>>
>> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
>> Signed-off-by: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>
>> Hi Doug,
>> This patch fixes a bug in the CM. In some flow, rb-tree could be
>> freed twice or used after it was freed. This bug was picked by
>> our regression tests and this fix was verified.
>>
>> Thanks,
>> Doron and Matan
>>
>>   drivers/infiniband/core/cm.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index f5cf1c4..56ff0f3 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -844,6 +844,11 @@ retest:
>>   	case IB_CM_SIDR_REQ_RCVD:
>>   		spin_unlock_irq(&cm_id_priv->lock);
>>   		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
>> +		spin_lock_irq(&cm.lock);
>> +		if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
>> +			rb_erase(&cm_id_priv->sidr_id_node,
>> +				 &cm.remote_sidr_table);
>> +		spin_unlock_irq(&cm.lock);

This change removes the about-to-be-freed node from the rb tree, after
verifying that it has not been erased already.

>
> We should be able to use a return value from cm_reject_sidr_req() -- passed through from ib_send_cm_sidr_rep() to determine if the id was removed from the tree.
>

But that wouldn't protect against the double free in
ib_send_cm_sidr_rep unless we also passed this value to the cm destroy
function, and that alternative is cumbersome.

>>   		break;
>>   	case IB_CM_REQ_SENT:
>>   	case IB_CM_MRA_REQ_RCVD:
>> @@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
>>   	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>>
>>   	spin_lock_irqsave(&cm.lock, flags);
>> -	rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>> +	if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
>> +		rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>> +		RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
>> +	}

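This change protects against double free: once the node has been erased
and cleared, RB_EMPTY_NODE() is true and a second rb_erase() is skipped.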
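
As a rough illustration of that guard, the sketch below models the convention in userspace. In the kernel, RB_CLEAR_NODE() makes a node's parent word point at the node itself and RB_EMPTY_NODE() tests for that. Here the same idea is applied to a toy doubly linked list rather than a real rb-tree, and all names (table, node_clear, node_empty, table_erase_once) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy container node; "next == self" plays the role of a cleared rb_node. */
struct node {
	struct node *prev, *next;
};

struct table {
	struct node head;   /* circular list sentinel */
	int count;
};

void table_init(struct table *t)
{
	t->head.prev = t->head.next = &t->head;
	t->count = 0;
}

/* Counterparts of RB_CLEAR_NODE() and RB_EMPTY_NODE(). */
void node_clear(struct node *n)
{
	n->prev = n->next = n;
}

int node_empty(const struct node *n)
{
	return n->next == n;
}

void table_insert(struct table *t, struct node *n)
{
	assert(node_empty(n));          /* must not already be linked */
	n->next = t->head.next;
	n->prev = &t->head;
	t->head.next->prev = n;
	t->head.next = n;
	t->count++;
}

/* Unconditional unlink, like a bare rb_erase(): unsafe to call twice. */
static void raw_erase(struct table *t, struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	t->count--;
}

/* Guarded erase: repeated calls are no-ops. Returns 1 if it erased. */
int table_erase_once(struct table *t, struct node *n)
{
	if (node_empty(n))
		return 0;
	raw_erase(t, n);
	node_clear(n);
	return 1;
}
```

The guard only works if a node is put into the cleared state when it is created, so in this sketch every node gets a node_clear() call before its first insert or erase. Calling table_erase_once() from two different teardown paths then removes the node exactly once, whichever path gets there first.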

>>   	spin_unlock_irqrestore(&cm.lock, flags);
>
> Something is very wrong in this function if the id is not in the tree at this point.
>

We agree, but there's an error flow that triggers this behavior.

Regards,
Matan and Doron.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                     ` <CAAKD3BAF763brdhsrHtpm_peHk--g3iza53AioiUefepHM_s2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-15 16:58                       ` Hefty, Sean
       [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Hefty, Sean @ 2015-10-15 16:58 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

> >> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
> >> AH (this may result sending an ARP and waiting for response).
> >> CM uses this function (which is now sleepable).
> >
> > This is a significant change to the CM.  The CM calls are invoked
> assuming that they return relatively quickly.  They're invoked from
> callbacks and internally.  Having the calls now wait for an ARP response
> requires that this be re-architected, so the calling thread doesn't go out
> to lunch for several seconds.
> 
> Agree - this is a significant change, but it was done a long time ago
> (at v4.3 if I recall). When we need to send a message we need to

We're at 4.3-rc5?

> figure out the destination MAC. Even the passive side needs to do that
> as some vendors don't report the source MAC of the packet in their wc.
> Even if they did, since IP based addressing is rout-able by its
> nature, it should follow the networking stack rules. Some crazy
> configurations could force sending responses to packets that came from
> router1 to router2 - so we have no choice than resolving the DMAC at
> every side.

ib_create_ah_from_wc is broken.  It is now an asynchronous operation;
only the call itself was left as synchronous.  We can't block kernel
threads for a minute, or however long ARP takes to resolve.  The call
itself must change to be async, and all users of it updated to allocate
some request, queue it, and handle all race conditions that result --
such as state changes or destruction of the work that caused the request
to be initiated.
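
For concreteness, a very small single-threaded sketch of that shape follows: submit a request without blocking, complete it later from some worker, and let the owner cancel it if it is destroyed or changes state first. None of these names (resolve_req, resolve_submit, resolve_cancel, resolve_complete_all, demo_owner) exist in the kernel; they are placeholders, and a real version would also need locking around the queue.

```c
#include <assert.h>
#include <stdlib.h>

struct resolve_req;
typedef void (*resolve_cb)(struct resolve_req *req, int status);

/* Hypothetical pending address-resolution request. */
struct resolve_req {
	struct resolve_req *next;
	resolve_cb cb;
	void *owner;        /* object that asked for the resolution */
	int cancelled;      /* set when the owner goes away first */
};

struct resolve_queue {
	struct resolve_req *head, *tail;
};

/* Example owner and callback, used only to observe completions. */
struct demo_owner {
	int completions;
	int last_status;
};

void demo_on_resolved(struct resolve_req *req, int status)
{
	struct demo_owner *o = req->owner;

	o->completions++;
	o->last_status = status;
}

/* Non-blocking: records the request and returns to the caller at once. */
struct resolve_req *resolve_submit(struct resolve_queue *q, void *owner,
				   resolve_cb cb)
{
	struct resolve_req *req;

	assert(cb != NULL);
	req = calloc(1, sizeof(*req));
	if (!req)
		return NULL;
	req->cb = cb;
	req->owner = owner;
	if (q->tail)
		q->tail->next = req;
	else
		q->head = req;
	q->tail = req;
	return req;
}

/* Called when the owner is destroyed or changes state before completion. */
void resolve_cancel(struct resolve_req *req)
{
	req->cancelled = 1;
	req->owner = NULL;
}

/*
 * Runs later, once resolution has finished. Cancelled requests are
 * dropped without touching their (possibly freed) owner.
 * Returns the number of callbacks delivered.
 */
int resolve_complete_all(struct resolve_queue *q, int status)
{
	struct resolve_req *req;
	int delivered = 0;

	while ((req = q->head) != NULL) {
		q->head = req->next;
		if (!q->head)
			q->tail = NULL;
		if (!req->cancelled) {
			req->cb(req, status);
			delivered++;
		}
		free(req);
	}
	return delivered;
}
```

The cancelled flag is the part that corresponds to the race conditions mentioned above: the completion path has to tolerate the requester having gone away, instead of assuming the work item is still valid.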

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-10-18  7:28                           ` Matan Barak
       [not found]                             ` <CAAKD3BCdbD8MC4PJGuVzUxiq5wEbCNjX4e91vBkcoErJVM8FQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-12-23 20:04                           ` Doug Ledford
  1 sibling, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-10-18  7:28 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

On Thu, Oct 15, 2015 at 7:58 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> >> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>> >> AH (this may result sending an ARP and waiting for response).
>> >> CM uses this function (which is now sleepable).
>> >
>> > This is a significant change to the CM.  The CM calls are invoked
>> assuming that they return relatively quickly.  They're invoked from
>> callbacks and internally.  Having the calls now wait for an ARP response
>> requires that this be re-architected, so the calling thread doesn't go out
>> to lunch for several seconds.
>>
>> Agree - this is a significant change, but it was done a long time ago
>> (at v4.3 if I recall). When we need to send a message we need to
>
> We're at 4.3-rc5?
>

Sorry, meant v3.14.

>> figure out the destination MAC. Even the passive side needs to do that
>> as some vendors don't report the source MAC of the packet in their wc.
>> Even if they did, since IP based addressing is rout-able by its
>> nature, it should follow the networking stack rules. Some crazy
>> configurations could force sending responses to packets that came from
>> router1 to router2 - so we have no choice than resolving the DMAC at
>> every side.
>
> Ib_create_ah_from_wc is broken.   It is now an asynchronous operation, only the call itself was left as synchronous.  We can't block kernel threads for a minute, or however long ARP takes to resolve.  The call itself must change to be async, and all users of it updated to allocate some request, queue it, and handle all race conditions that result -- such as state changes or destruction of the work that caused the request to be initiated.


Today, the CM assumes paths are reversible (primary_path->reversible = 1).
That's true for both IB and RoCE. We could say vendors must report the
SMAC in the WC, and then ib_create_ah_from_wc would be atomic (for these
cases). If we wish to lift these limitations, we need to make
ib_create_ah_from_wc asynchronous, but that was true even prior to the
RoCE IP based addressing patch.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                             ` <CAAKD3BCdbD8MC4PJGuVzUxiq5wEbCNjX4e91vBkcoErJVM8FQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-20 15:57                               ` Hefty, Sean
       [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Hefty, Sean @ 2015-10-20 15:57 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

> Today, cm assumes paths are reversible primary_path->reversible = 1.

I can't quickly find a link, but I believe all CM MADs are reversible, per the spec.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-10-20 16:03                                   ` Hal Rosenstock
  2015-10-20 16:36                                   ` Jason Gunthorpe
  1 sibling, 0 replies; 23+ messages in thread
From: Hal Rosenstock @ 2015-10-20 16:03 UTC (permalink / raw)
  To: Hefty, Sean, Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe



On 10/20/2015 11:57 AM, Hefty, Sean wrote:
>> Today, cm assumes paths are reversible primary_path->reversible = 1.
> 
> I can't quickly find a link, but I believe all CM MADs are reversible, per the spec.

Spec citation for this is:

C12-5.1.3: All responses generated by the CM protocol shall follow the
rules for response generation that are enumerated in 13.5.4 Response
Generation and Reversible Paths.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-10-20 16:03                                   ` Hal Rosenstock
@ 2015-10-20 16:36                                   ` Jason Gunthorpe
  1 sibling, 0 replies; 23+ messages in thread
From: Jason Gunthorpe @ 2015-10-20 16:36 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Matan Barak, Matan Barak, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha

On Tue, Oct 20, 2015 at 03:57:54PM +0000, Hefty, Sean wrote:
> > Today, cm assumes paths are reversible primary_path->reversible = 1.
> 
> I can't quickly find a link, but I believe all CM MADs are
> reversible, per the spec.

But the Linux CM code doesn't always create the reverse CM MAD from
the GMP headers, it sometimes will do it by looking into the data path
in the MAD, which means it still could need to sleep to do the MAC
resolution.

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found]         ` <561FC309.2030102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-20 20:27           ` Doug Ledford
       [not found]             ` <5626A39D.6030906-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2015-10-20 20:27 UTC (permalink / raw)
  To: Matan Barak, Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe, Doron Tsur

On 10/15/2015 11:15 AM, Matan Barak wrote:
> 
> 
> On 10/12/2015 7:37 PM, Hefty, Sean wrote:
>>> ib_send_cm_sidr_rep could sometimes erase the node from the sidr
>>> (depending on errors in the process). Since ib_send_cm_sidr_rep is
>>> called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
>>
>> This should clarify that it is the app calling from the callback, and
>> not a direct call from the cm_sidr_req_handler.
>>
> 
> Consider the following error flows:
> 
> Double free:
> cm_sidr_req_handler:3156->cm_reject_sidr_req:663->ib_send_cm_sidr_rep:3233->erase_rb
> 
> cm_sidr_req_handler:3173->ib_destroy_cm_id->cm_destroy_id:846->ib_send_cm_sidr_rep:3233->erase_rb
> 
> 
> RB contains free node:
> cm_sidr_req_handler:3156->cm_reject_sidr_req:663->ib_send_cm_sidr_rep->returns
> error(for example cm_alloc_msg,3219)
> 
> cm_sidr_req_handler:3173->ib_destroy_cm_id->cm_destroy_id:846->cm_reject_sidr_req->cm_reject_sidr_req:663->returns
> 
> error(for example cm_alloc_msg,3219)->RB wasn't erased but memory is
> freed :910 kfree(cm_id_priv)
> 
> 
>>> could be either erased from the rb_tree twice or not erased at all.
>>
>> In an error case, I can see why it would be left in the rbtree, but I
>> don't see how it can be removed twice.
>>
>>
>>> Fixing that by making sure it's erased only once before freeing
>>> cm_id_priv.
>>>
>>> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
>>> Signed-off-by: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> ---
>>>
>>> Hi Doug,
>>> This patch fixes a bug in the CM. In some flow, rb-tree could be
>>> freed twice or used after it was freed. This bug was picked by
>>> our regression tests and this fix was verified.
>>>
>>> Thanks,
>>> Doron and Matan
>>>
>>>   drivers/infiniband/core/cm.c | 10 +++++++++-
>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>>> index f5cf1c4..56ff0f3 100644
>>> --- a/drivers/infiniband/core/cm.c
>>> +++ b/drivers/infiniband/core/cm.c
>>> @@ -844,6 +844,11 @@ retest:
>>>       case IB_CM_SIDR_REQ_RCVD:
>>>           spin_unlock_irq(&cm_id_priv->lock);
>>>           cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
>>> +        spin_lock_irq(&cm.lock);
>>> +        if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
>>> +            rb_erase(&cm_id_priv->sidr_id_node,
>>> +                 &cm.remote_sidr_table);
>>> +        spin_unlock_irq(&cm.lock);
> 
> This change seeks to remove the about to be freed node from the rb tree,
> while verifying it has not been freed already
> 
>>
>> We should be able to use a return value from cm_reject_sidr_req() --
>> passed through from ib_send_cm_sidr_rep() to determine if the id was
>> removed from the tree.
>>
> 
> But this won't protect from double free in ib_send_cm_sidr_rep, unless
> we pass this parameter to the cm destroy function, but this alternative
> is cumbersome.
> 
>>>           break;
>>>       case IB_CM_REQ_SENT:
>>>       case IB_CM_MRA_REQ_RCVD:
>>> @@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
>>>       spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>>>
>>>       spin_lock_irqsave(&cm.lock, flags);
>>> -    rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>>> +    if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
>>> +        rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>>> +        RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
>>> +    }
> 
> This change protects against double free
> 
>>>       spin_unlock_irqrestore(&cm.lock, flags);
>>
>> Something is very wrong in this function if the id is not in the tree
>> at this point.
>>
> 
> We agree, but there's an error flow that triggers this behavior.

Sean, I need to close on this patch.  What is your position after
Matan's explanation?

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found]             ` <5626A39D.6030906-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-21 19:58               ` Doug Ledford
       [not found]                 ` <5627EE5A.7030303-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2015-10-21 19:58 UTC (permalink / raw)
  To: Matan Barak, Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe, Doron Tsur

On 10/20/2015 04:27 PM, Doug Ledford wrote:
> On 10/15/2015 11:15 AM, Matan Barak wrote:
>>
>>
>> On 10/12/2015 7:37 PM, Hefty, Sean wrote:
>>>> ib_send_cm_sidr_rep could sometimes erase the node from the sidr
>>>> (depending on errors in the process). Since ib_send_cm_sidr_rep is
>>>> called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
>>>
>>> This should clarify that it is the app calling from the callback, and
>>> not a direct call from the cm_sidr_req_handler.
>>>
>>
>> Consider the following error flows:
>>
>> Double free:
>> cm_sidr_req_handler:3156->cm_reject_sidr_req:663->ib_send_cm_sidr_rep:3233->erase_rb
>>
>> cm_sidr_req_handler:3173->ib_destroy_cm_id->cm_destroy_id:846->ib_send_cm_sidr_rep:3233->erase_rb
>>
>>
>> RB contains free node:
>> cm_sidr_req_handler:3156->cm_reject_sidr_req:663->ib_send_cm_sidr_rep->returns
>> error(for example cm_alloc_msg,3219)
>>
>> cm_sidr_req_handler:3173->ib_destroy_cm_id->cm_destroy_id:846->cm_reject_sidr_req->cm_reject_sidr_req:663->returns
>>
>> error(for example cm_alloc_msg,3219)->RB wasn't erased but memory is
>> freed :910 kfree(cm_id_priv)
>>
>>
>>>> could be either erased from the rb_tree twice or not erased at all.
>>>
>>> In an error case, I can see why it would be left in the rbtree, but I
>>> don't see how it can be removed twice.
>>>
>>>
>>>> Fixing that by making sure it's erased only once before freeing
>>>> cm_id_priv.
>>>>
>>>> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
>>>> Signed-off-by: Doron Tsur <doront-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>> ---
>>>>
>>>> Hi Doug,
>>>> This patch fixes a bug in the CM. In some flow, rb-tree could be
>>>> freed twice or used after it was freed. This bug was picked by
>>>> our regression tests and this fix was verified.
>>>>
>>>> Thanks,
>>>> Doron and Matan
>>>>
>>>>   drivers/infiniband/core/cm.c | 10 +++++++++-
>>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>>>> index f5cf1c4..56ff0f3 100644
>>>> --- a/drivers/infiniband/core/cm.c
>>>> +++ b/drivers/infiniband/core/cm.c
>>>> @@ -844,6 +844,11 @@ retest:
>>>>       case IB_CM_SIDR_REQ_RCVD:
>>>>           spin_unlock_irq(&cm_id_priv->lock);
>>>>           cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
>>>> +        spin_lock_irq(&cm.lock);
>>>> +        if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
>>>> +            rb_erase(&cm_id_priv->sidr_id_node,
>>>> +                 &cm.remote_sidr_table);
>>>> +        spin_unlock_irq(&cm.lock);
>>
>> This change seeks to remove the about to be freed node from the rb tree,
>> while verifying it has not been freed already
>>
>>>
>>> We should be able to use a return value from cm_reject_sidr_req() --
>>> passed through from ib_send_cm_sidr_rep() to determine if the id was
>>> removed from the tree.
>>>
>>
>> But this won't protect from double free in ib_send_cm_sidr_rep, unless
>> we pass this parameter to the cm destroy function, but this alternative
>> is cumbersome.
>>
>>>>           break;
>>>>       case IB_CM_REQ_SENT:
>>>>       case IB_CM_MRA_REQ_RCVD:
>>>> @@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
>>>>       spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>>>>
>>>>       spin_lock_irqsave(&cm.lock, flags);
>>>> -    rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>>>> +    if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
>>>> +        rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
>>>> +        RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
>>>> +    }
>>
>> This change protects against double free
>>
>>>>       spin_unlock_irqrestore(&cm.lock, flags);
>>>
>>> Something is very wrong in this function if the id is not in the tree
>>> at this point.
>>>
>>
>> We agree, but there's an error flow that triggers this behavior.
> 
> Sean, I need to close on this patch.  What is your position after
> Matan's explanation?
> 

Absent an objection from Sean, I've pulled this in.  A use after free
bug is a pretty serious issue, and you've listed an error flow that
triggers it.  The only thing bugging me is that this code is 10+ years
old and this didn't show up until now, which makes me think that some
recent change is the cause of this.  I've made note of that fact in my
tag commit and I think this warrants further examination in the next
kernel cycle.  But since we are so close to out of time on 4.3, I deemed
it better to fix the use after free issue, even if it isn't necessarily
the perfect fix, than leave that hanging about.
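
As a rough user-space illustration of the guarded erase in the patch
quoted above: the node records whether it is still linked, so a second
erase becomes a no-op instead of corrupting the container. This is only
a sketch under stated assumptions. A circular list stands in for the
rb-tree, node_empty()/node_clear() are hypothetical stand-ins for
RB_EMPTY_NODE()/RB_CLEAR_NODE(), and the cm.lock locking from the real
patch is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rb_node: a list node that marks
 * itself "not linked anywhere" by pointing at itself. */
struct node {
	struct node *prev, *next;
};

struct table {
	struct node head;	/* circular list sentinel */
	int erase_count;	/* number of real unlink operations */
};

static void table_init(struct table *t)
{
	t->head.prev = t->head.next = &t->head;
	t->erase_count = 0;
}

/* Plays the role of RB_CLEAR_NODE(): mark the node as unlinked. */
static void node_clear(struct node *n)
{
	n->prev = n->next = n;
}

/* Plays the role of RB_EMPTY_NODE(): true if the node is unlinked. */
static bool node_empty(const struct node *n)
{
	return n->next == n;
}

static void table_insert(struct table *t, struct node *n)
{
	assert(node_empty(n));	/* caller must have cleared the node */
	n->next = t->head.next;
	n->prev = &t->head;
	t->head.next->prev = n;
	t->head.next = n;
}

/* Idempotent erase: unlink only if still linked, then mark unlinked. */
static void table_erase_once(struct table *t, struct node *n)
{
	if (node_empty(n))
		return;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_clear(n);
	t->erase_count++;
}

static bool table_is_empty(const struct table *t)
{
	return t->head.next == &t->head;
}
```

With this shape, both the reply path and the destroy path can call
table_erase_once() on the same node without coordinating; whichever
runs second finds the node already cleared and does nothing.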

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




* RE: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free
       [not found]                 ` <5627EE5A.7030303-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-10-26 17:39                   ` Hefty, Sean
  0 siblings, 0 replies; 23+ messages in thread
From: Hefty, Sean @ 2015-10-26 17:39 UTC (permalink / raw)
  To: Doug Ledford, Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha,
	Jason Gunthorpe, Doron Tsur

> > Sean, I need to close on this patch.  What is your position after
> > Matan's explanation?
> >
> 
> Absent an objection from Sean, I've pulled this in.  A use after free
> bug is a pretty serious issue, and you've listed an error flow that
> triggers it.  The only thing bugging me is that this code is 10+ years
> old and this didn't show up until now, which makes me think that some
> recent change is the cause of this.  I've made note of that fact in my
> tag commit and I think this warrants further examination in the next
> kernel cycle.  But since we are so close to out of time on 4.3, I deemed
> it better to fix the use after free issue, even if it isn't necessarily
> the perfect fix, than leave that hanging about.

I was out last week.  I think one of the reasons that this bug hasn't shown up is that very few apps use UD QPs, and those that do likely exchange QP information using some other out-of-band mechanism.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-10-18  7:28                           ` Matan Barak
@ 2015-12-23 20:04                           ` Doug Ledford
       [not found]                             ` <567AFE42.2080107-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2015-12-23 20:04 UTC (permalink / raw)
  To: Hefty, Sean, Matan Barak
  Cc: Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Jason Gunthorpe


On 10/15/2015 12:58 PM, Hefty, Sean wrote:
>>>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>>>> AH (this may result in sending an ARP and waiting for a response).
>>>> CM uses this function (which is now sleepable).
>>>
>>> This is a significant change to the CM.  The CM calls are invoked
>> assuming that they return relatively quickly.  They're invoked from
>> callbacks and internally.  Having the calls now wait for an ARP response
>> requires that this be re-architected, so the calling thread doesn't go out
>> to lunch for several seconds.
>>
>> Agree - this is a significant change, but it was done a long time ago
>> (at v4.3 if I recall). When we need to send a message we need to
> 
> We're at 4.3-rc5?
> 
>> figure out the destination MAC. Even the passive side needs to do that
>> as some vendors don't report the source MAC of the packet in their wc.
>> Even if they did, since IP based addressing is routable by its
>> nature, it should follow the networking stack rules. Some crazy
>> configurations could force sending responses to packets that came from
>> router1 to router2 - so we have no choice but to resolve the DMAC on
>> every side.
> 
> ib_create_ah_from_wc is broken.  It is now an asynchronous operation; only the call itself was left synchronous.  We can't block kernel threads for a minute, or however long ARP takes to resolve.  The call itself must change to be async, and all users of it updated to allocate some request, queue it, and handle all race conditions that result -- such as state changes or destruction of the work that caused the request to be initiated.
> 

I don't know who had intended to address this, but it got left out of
the 4.4 work.  We need to not let this drop through the cracks (for
another release).  Can someone please put fixing this properly on their
TODO list?
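
The "allocate some request, queue it, and handle the races" shape that
Sean describes above could look roughly like the following. This is a
single-threaded user-space sketch, not a proposal for cm.c: struct
owner, struct resolve_req and the resolver_* functions are all
hypothetical names, and real kernel code would additionally need
locking, reference counting and a work queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for the object that asked for resolution. */
struct owner {
	bool destroyed;
	int replies_sent;
};

/* Hypothetical queued address-resolution request. */
struct resolve_req {
	struct owner *owner;
	bool in_use;
	bool cancelled;
};

struct resolver {
	struct resolve_req slots[MAX_PENDING];
};

/* Queue a request and return immediately; NULL means the queue is full. */
static struct resolve_req *resolver_submit(struct resolver *r,
					   struct owner *o)
{
	for (size_t i = 0; i < MAX_PENDING; i++) {
		struct resolve_req *req = &r->slots[i];

		if (!req->in_use) {
			req->owner = o;
			req->in_use = true;
			req->cancelled = false;
			return req;
		}
	}
	return NULL;
}

/* Destroying the owner cancels its pending requests, so a late
 * completion never touches the destroyed object. */
static void owner_destroy(struct resolver *r, struct owner *o)
{
	for (size_t i = 0; i < MAX_PENDING; i++) {
		struct resolve_req *req = &r->slots[i];

		if (req->in_use && req->owner == o) {
			req->cancelled = true;
			req->owner = NULL;
		}
	}
	o->destroyed = true;
}

/* Called later, when resolution finishes. Returns true if a reply was
 * sent on behalf of a still-living owner. */
static bool resolver_complete(struct resolve_req *req)
{
	bool sent = false;

	assert(req->in_use);
	if (!req->cancelled) {
		req->owner->replies_sent++;
		sent = true;
	}
	req->in_use = false;	/* slot can be reused */
	return sent;
}
```

The point of the sketch is only the ordering: submission never waits,
and destruction of the owner is reconciled with the outstanding
request before the completion runs.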

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                             ` <567AFE42.2080107-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-12-24  7:46                               ` Matan Barak
       [not found]                                 ` <CAAKD3BBGhWqp7kJfBQAhYHDVz5JgjNg4M+0GBTwg7Us4hioO-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Matan Barak @ 2015-12-24  7:46 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Hefty, Sean, Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

On Wed, Dec 23, 2015 at 10:04 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 10/15/2015 12:58 PM, Hefty, Sean wrote:
>>>>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>>>>> AH (this may result in sending an ARP and waiting for a response).
>>>>> CM uses this function (which is now sleepable).
>>>>
>>>> This is a significant change to the CM.  The CM calls are invoked
>>> assuming that they return relatively quickly.  They're invoked from
>>> callbacks and internally.  Having the calls now wait for an ARP response
>>> requires that this be re-architected, so the calling thread doesn't go out
>>> to lunch for several seconds.
>>>
>>> Agree - this is a significant change, but it was done a long time ago
>>> (at v4.3 if I recall). When we need to send a message we need to
>>
>> We're at 4.3-rc5?
>>
>>> figure out the destination MAC. Even the passive side needs to do that
>>> as some vendors don't report the source MAC of the packet in their wc.
>>> Even if they did, since IP based addressing is routable by its
>>> nature, it should follow the networking stack rules. Some crazy
>>> configurations could force sending responses to packets that came from
>>> router1 to router2 - so we have no choice but to resolve the DMAC on
>>> every side.
>>
>> ib_create_ah_from_wc is broken.  It is now an asynchronous operation; only the call itself was left synchronous.  We can't block kernel threads for a minute, or however long ARP takes to resolve.  The call itself must change to be async, and all users of it updated to allocate some request, queue it, and handle all race conditions that result -- such as state changes or destruction of the work that caused the request to be initiated.
>>
>
> I don't know who had intended to address this, but it got left out of
> the 4.4 work.  We need to not let this drop through the cracks (for
> another release).  Can someone please put fixing this properly on their
> TODO list?
>

IMHO, the proposed patch makes things better. Not applying the current
patch means we have a "sleeping while atomic" error (in addition to
the fact that kernel threads could wait until the ARP process
finishes), which is pretty bad. I tend to agree that adding another CM
state is probably a better approach, but unless someone steps up and
adds this for v4.5, I think that's the best thing we have.

> --
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>               GPG KeyID: 0E572FDD
>
>

Matan

* Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
       [not found]                                 ` <CAAKD3BBGhWqp7kJfBQAhYHDVz5JgjNg4M+0GBTwg7Us4hioO-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-01-05 11:33                                   ` Matan Barak
  0 siblings, 0 replies; 23+ messages in thread
From: Matan Barak @ 2016-01-05 11:33 UTC (permalink / raw)
  To: Doug Ledford, Yishai Hadas
  Cc: Hefty, Sean, Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Eran Ben Elisha, Jason Gunthorpe

On Thu, Dec 24, 2015 at 9:46 AM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Wed, Dec 23, 2015 at 10:04 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 10/15/2015 12:58 PM, Hefty, Sean wrote:
>>>>>> ib_create_ah_from_wc needs to resolve the DMAC in order to create the
>>>>>> AH (this may result in sending an ARP and waiting for a response).
>>>>>> CM uses this function (which is now sleepable).
>>>>>
>>>>> This is a significant change to the CM.  The CM calls are invoked
>>>> assuming that they return relatively quickly.  They're invoked from
>>>> callbacks and internally.  Having the calls now wait for an ARP response
>>>> requires that this be re-architected, so the calling thread doesn't go out
>>>> to lunch for several seconds.
>>>>
>>>> Agree - this is a significant change, but it was done a long time ago
>>>> (at v4.3 if I recall). When we need to send a message we need to
>>>
>>> We're at 4.3-rc5?
>>>
>>>> figure out the destination MAC. Even the passive side needs to do that
>>>> as some vendors don't report the source MAC of the packet in their wc.
>>>> Even if they did, since IP based addressing is routable by its
>>>> nature, it should follow the networking stack rules. Some crazy
>>>> configurations could force sending responses to packets that came from
>>>> router1 to router2 - so we have no choice but to resolve the DMAC on
>>>> every side.
>>>
>>> ib_create_ah_from_wc is broken.  It is now an asynchronous operation; only the call itself was left synchronous.  We can't block kernel threads for a minute, or however long ARP takes to resolve.  The call itself must change to be async, and all users of it updated to allocate some request, queue it, and handle all race conditions that result -- such as state changes or destruction of the work that caused the request to be initiated.
>>>
>>
>> I don't know who had intended to address this, but it got left out of
>> the 4.4 work.  We need to not let this drop through the cracks (for
>> another release).  Can someone please put fixing this properly on their
>> TODO list?
>>
>
> IMHO, the proposed patch makes things better. Not applying the current
> patch means we have a "sleeping while atomic" error (in addition to
> the fact that kernel threads could wait until the ARP process
> finishes), which is pretty bad. I tend to agree that adding another CM
> state is probably a better approach, but unless someone steps up and
> adds this for v4.5, I think that's the best thing we have.
>
>> --
>> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>>               GPG KeyID: 0E572FDD
>>
>>
>
> Matan

Yishai has found a double free bug in the error flow of this patch.
The fix is pretty simple.
Thanks Yishai for catching and testing this fix.

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 07a3bbf..832674f 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -296,10 +296,9 @@ static int _cm_alloc_response_msg(struct cm_port *port,
                               0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
                               GFP_ATOMIC,
                               IB_MGMT_BASE_VERSION);
-       if (IS_ERR(m)) {
-               ib_destroy_ah(ah);
+       if (IS_ERR(m))
                return PTR_ERR(m);
-       }
+
        m->ah = ah;
        *msg = m;
        return 0;
@@ -310,13 +309,18 @@ static int cm_alloc_response_msg(struct cm_port *port,
                                 struct ib_mad_send_buf **msg)
 {
        struct ib_ah *ah;
+       int ret;

        ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
                                  mad_recv_wc->recv_buf.grh, port->port_num);
        if (IS_ERR(ah))
                return PTR_ERR(ah);

-       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+       ret = _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
+       if (ret)
+               ib_destroy_ah(ah);
+
+       return ret;
 }

 static void cm_free_msg(struct ib_mad_send_buf *msg)
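
The diff above moves ib_destroy_ah() out of _cm_alloc_response_msg()
and into cm_alloc_response_msg(), so that the function which created
the AH is the only one that destroys it on failure. A minimal
user-space sketch of that ownership rule follows. Every name in it
(struct handle, struct msg, msg_alloc and so on) is a hypothetical
stand-in, and live_handles exists only so the cleanup can be checked.

```c
#include <assert.h>
#include <stdlib.h>

static int live_handles;	/* outstanding handles, for checking only */

struct handle { int id; };		/* stand-in for the AH */
struct msg { struct handle *h; };	/* stand-in for the send buffer */

static struct handle *handle_create(void)
{
	struct handle *h = malloc(sizeof(*h));

	if (h) {
		h->id = 1;
		live_handles++;
	}
	return h;
}

static void handle_destroy(struct handle *h)
{
	live_handles--;
	free(h);
}

/* Inner helper: borrows the handle. On failure it leaves the handle
 * alone, because it did not create it. */
static int msg_alloc_with_handle(struct handle *h, int fail,
				 struct msg **out)
{
	struct msg *m;

	if (fail)
		return -1;
	m = malloc(sizeof(*m));
	if (!m)
		return -1;
	m->h = h;	/* ownership passes to the message on success */
	*out = m;
	return 0;
}

/* Outer helper: creates the handle, so it alone destroys it when the
 * inner helper fails. */
static int msg_alloc(int fail, struct msg **out)
{
	struct handle *h = handle_create();
	int ret;

	if (!h)
		return -1;
	ret = msg_alloc_with_handle(h, fail, out);
	if (ret)
		handle_destroy(h);
	return ret;
}

static void msg_free(struct msg *m)
{
	assert(m && m->h);
	handle_destroy(m->h);
	free(m);
}
```

Keeping the cleanup in exactly one place means the failure path
releases the handle once, whichever helper reported the error.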


Doug, if you intend to take this patch, I can squash this fix and respin it.

Thanks,
Matan

end of thread, newest message: 2016-01-05 11:33 UTC

Thread overview: 23+ messages
2015-10-11 12:58 [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Matan Barak
     [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-11 12:58   ` [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC Matan Barak
     [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-12 12:59       ` Devesh Sharma
     [not found]         ` <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13  8:24           ` Matan Barak
2015-10-12 16:42       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373A9734356-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-13  8:22           ` Matan Barak
     [not found]             ` <CAAKD3BCe_LAuyxifm=j-Am44S1k4nT328WrGBC+Day+XxMxk9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13 16:18               ` Hefty, Sean
     [not found]                 ` <1828884A29C6694DAF28B7E6B8A82373A9734A57-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-14  7:44                   ` Matan Barak
     [not found]                     ` <CAAKD3BAF763brdhsrHtpm_peHk--g3iza53AioiUefepHM_s2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-15 16:58                       ` Hefty, Sean
     [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-18  7:28                           ` Matan Barak
     [not found]                             ` <CAAKD3BCdbD8MC4PJGuVzUxiq5wEbCNjX4e91vBkcoErJVM8FQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-20 15:57                               ` Hefty, Sean
     [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-20 16:03                                   ` Hal Rosenstock
2015-10-20 16:36                                   ` Jason Gunthorpe
2015-12-23 20:04                           ` Doug Ledford
     [not found]                             ` <567AFE42.2080107-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-12-24  7:46                               ` Matan Barak
     [not found]                                 ` <CAAKD3BBGhWqp7kJfBQAhYHDVz5JgjNg4M+0GBTwg7Us4hioO-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-05 11:33                                   ` Matan Barak
2015-10-11 15:28   ` [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Or Gerlitz
     [not found]     ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310@HE1PR05MB1466.eurprd05.prod.outlook.com>
     [not found]       ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310-eBadYZ65MZ+I1hPkL3GmLNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-10-12 13:14         ` Or Gerlitz
2015-10-12 16:37   ` Hefty, Sean
     [not found]     ` <1828884A29C6694DAF28B7E6B8A82373A9734333-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-15 15:15       ` Matan Barak
     [not found]         ` <561FC309.2030102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-20 20:27           ` Doug Ledford
     [not found]             ` <5626A39D.6030906-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-21 19:58               ` Doug Ledford
     [not found]                 ` <5627EE5A.7030303-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-26 17:39                   ` Hefty, Sean
