All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Devesh Sharma <devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>
Cc: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jason Gunthorpe
	<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Subject: Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
Date: Tue, 13 Oct 2015 11:24:20 +0300	[thread overview]
Message-ID: <561CBFB4.1020504@mellanox.com> (raw)
In-Reply-To: <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>



On 10/12/2015 3:59 PM, Devesh Sharma wrote:
> Looks good, just one doubt inline:
>

Thanks for looking at this patch.

> On Sun, Oct 11, 2015 at 6:28 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> When IP based addressing was introduced, ib_create_ah_from_wc was
>> changed in order to support a suitable AH. Since this AH should
>> now contains the DMAC (which isn't a simple derivative of the GID).
>> In order to find the DMAC, an ARP should sometime be sent. This ARP
>> is a sleeping context.
>>
>> ib_create_ah_from_wc is called from cm_alloc_response_msg, which is
>> sometimes called from an atomic context. This caused a
>> sleeping-while-atomic bug. Fixing this by splitting
>> cm_alloc_response_msg to an atomic and sleep-able part. When
>> cm_alloc_response_msg is used in an atomic context, we try to create
>> the AH before entering the atomic context.
>>
>> Fixes: 66bd20a72d2f ('IB/core: Ethernet L2 attributes in verbs/cm structures')
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>
>> Hi Doug,
>> This patch fixes an old bug in the CM. IP based addressing requires
>> ARP resolution which isn't sleep-able by its nature. This resolution
>> was sometimes done in non sleep-able context. Our regression tests
>> picked up this bug and verified this fix.
>>
>> Thanks,
>> Matan
>>
>>   drivers/infiniband/core/cm.c | 60 ++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 49 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index ea4db9c..f5cf1c4 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -287,17 +287,12 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
>>          return 0;
>>   }
>>
>> -static int cm_alloc_response_msg(struct cm_port *port,
>> -                                struct ib_mad_recv_wc *mad_recv_wc,
>> -                                struct ib_mad_send_buf **msg)
>> +static int _cm_alloc_response_msg(struct cm_port *port,
>> +                                 struct ib_mad_recv_wc *mad_recv_wc,
>> +                                 struct ib_ah *ah,
>> +                                 struct ib_mad_send_buf **msg)
>>   {
>>          struct ib_mad_send_buf *m;
>> -       struct ib_ah *ah;
>> -
>> -       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> -                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> -       if (IS_ERR(ah))
>> -               return PTR_ERR(ah);
>>
>>          m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
>>                                 0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
>> @@ -312,6 +307,20 @@ static int cm_alloc_response_msg(struct cm_port *port,
>>          return 0;
>>   }
>>
>> +static int cm_alloc_response_msg(struct cm_port *port,
>> +                                struct ib_mad_recv_wc *mad_recv_wc,
>> +                                struct ib_mad_send_buf **msg)
>> +{
>> +       struct ib_ah *ah;
>> +
>> +       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> +                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> +       if (IS_ERR(ah))
>> +               return PTR_ERR(ah);
>> +
>> +       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
>> +}
>> +
>>   static void cm_free_msg(struct ib_mad_send_buf *msg)
>>   {
>>          ib_destroy_ah(msg->ah);
>> @@ -2201,6 +2210,7 @@ static int cm_dreq_handler(struct cm_work *work)
>>          struct cm_id_private *cm_id_priv;
>>          struct cm_dreq_msg *dreq_msg;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
>> @@ -2213,6 +2223,11 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  return -EINVAL;
>>          }
>>
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>
> Shouldn't below IS_ERR(ah) on ah be here, instead of there?
>

I don't think we want to fail on error if the state != IB_CM_TIMEWAIT 
(other states don't use this ah).

>>          work->cm_event.private_data = &dreq_msg->private_data;
>>
>>          spin_lock_irq(&cm_id_priv->lock);
>> @@ -2234,9 +2249,13 @@ static int cm_dreq_handler(struct cm_work *work)
>>          case IB_CM_TIMEWAIT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_DREQ_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
>>                                 cm_id_priv->private_data,
>>                                 cm_id_priv->private_data_len);
>> @@ -2259,6 +2278,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2266,6 +2287,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> @@ -2761,6 +2784,7 @@ static int cm_lap_handler(struct cm_work *work)
>>          struct cm_lap_msg *lap_msg;
>>          struct ib_cm_lap_event_param *param;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          /* todo: verify LAP request and send reject APR if invalid. */
>> @@ -2770,6 +2794,12 @@ static int cm_lap_handler(struct cm_work *work)
>>          if (!cm_id_priv)
>>                  return -EINVAL;
>>
>> +
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>>          param = &work->cm_event.param.lap_rcvd;
>>          param->alternate_path = &work->path[0];
>>          cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
>> @@ -2786,9 +2816,13 @@ static int cm_lap_handler(struct cm_work *work)
>>          case IB_CM_MRA_LAP_SENT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_LAP_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
>>                                CM_MSG_RESPONSE_OTHER,
>>                                cm_id_priv->service_timeout,
>> @@ -2818,6 +2852,8 @@ static int cm_lap_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2825,6 +2861,8 @@ static int cm_lap_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> --
>> 2.1.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-10-13  8:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-11 12:58 [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Matan Barak
     [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-11 12:58   ` [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC Matan Barak
     [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-12 12:59       ` Devesh Sharma
     [not found]         ` <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13  8:24           ` Matan Barak [this message]
2015-10-12 16:42       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373A9734356-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-13  8:22           ` Matan Barak
     [not found]             ` <CAAKD3BCe_LAuyxifm=j-Am44S1k4nT328WrGBC+Day+XxMxk9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13 16:18               ` Hefty, Sean
     [not found]                 ` <1828884A29C6694DAF28B7E6B8A82373A9734A57-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-14  7:44                   ` Matan Barak
     [not found]                     ` <CAAKD3BAF763brdhsrHtpm_peHk--g3iza53AioiUefepHM_s2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-15 16:58                       ` Hefty, Sean
     [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-18  7:28                           ` Matan Barak
     [not found]                             ` <CAAKD3BCdbD8MC4PJGuVzUxiq5wEbCNjX4e91vBkcoErJVM8FQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-20 15:57                               ` Hefty, Sean
     [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-20 16:03                                   ` Hal Rosenstock
2015-10-20 16:36                                   ` Jason Gunthorpe
2015-12-23 20:04                           ` Doug Ledford
     [not found]                             ` <567AFE42.2080107-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-12-24  7:46                               ` Matan Barak
     [not found]                                 ` <CAAKD3BBGhWqp7kJfBQAhYHDVz5JgjNg4M+0GBTwg7Us4hioO-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-05 11:33                                   ` Matan Barak
2015-10-11 15:28   ` [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Or Gerlitz
     [not found]     ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310@HE1PR05MB1466.eurprd05.prod.outlook.com>
     [not found]       ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310-eBadYZ65MZ+I1hPkL3GmLNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-10-12 13:14         ` Or Gerlitz
2015-10-12 16:37   ` Hefty, Sean
     [not found]     ` <1828884A29C6694DAF28B7E6B8A82373A9734333-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-15 15:15       ` Matan Barak
     [not found]         ` <561FC309.2030102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-20 20:27           ` Doug Ledford
     [not found]             ` <5626A39D.6030906-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-21 19:58               ` Doug Ledford
     [not found]                 ` <5627EE5A.7030303-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-26 17:39                   ` Hefty, Sean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=561CBFB4.1020504@mellanox.com \
    --to=matanb-vpraknaxozvwk0htik3j/w@public.gmane.org \
    --cc=devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.