All of lore.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Erez Shitrit
	<erezsh-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
Subject: Re: [PATCH 6/8] IPoIB: Use dedicated workqueues per interface
Date: Tue, 09 Sep 2014 03:09:38 -0400	[thread overview]
Message-ID: <540EA7B2.2050805@redhat.com> (raw)
In-Reply-To: <54080B6B.8050707-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>

On 09/04/2014 02:49 AM, Erez Shitrit wrote:
> Hi Doug,
>
> The idea that "big" ipoib workqueue is only for the IB_EVENT_XXX looks
> good to me.
>
> one comment here: I don't see where the driver flushes it at all.
> IMHO, when the driver goes down you should cancel all pending tasks over
> that global wq prior the call for the destroy_workqueue.

I understand your thought, but I disagree with your conclusion.  If we 
ever get to the portion of the driver down sequence where we are 
removing the big flush work queue and there are still items on that work 
queue, then we have already failed and we are going to crash because 
there are outstanding flushes for a deleted device.  The real issue here 
is that we need to make sure it's not possible to delete a device that 
has outstanding flushes, and not possible to flush a device in the 
process of being deleted (Wendy, don't get mad at me, but the rtnl lock 
jumps out as appropriate for this purpose).

> The current logic calls the destroy_workqueue after the remove_one, i
> think you should cancel the pending works after the
> ib_unregister_event_handler call in the ipoib_remove_one function,

The ib_remove_one is device specific while the big flush work queue is 
not.  As such, that can cancel work for devices other than the device 
being removed with no clear means of getting it back.

> otherwise if there are pending tasks in that queue they will schedule
> after the dev/priv are gone.

I agree that there is a problem here.  I think what needs to happen is 
that now that we have a work queue per device, and all flushes come in 
to us via a work queue that does not belong to the device, it is now 
possible to handle flushes synchronously.  This would allow us to lock 
around the flush itself and prevent removal until after the flush is 
complete without getting into the hairy rat hole that is trying to flush 
the flush workqueue that might result in either flushing or canceling 
the very device we are working on.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2014-09-09  7:09 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-12 23:38 [PATCH 0/8] IPoIB: Fix multiple race conditions Doug Ledford
     [not found] ` <cover.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-12 23:38   ` [PATCH 1/8] IPoIB: Consolidate rtnl_lock tasks in workqueue Doug Ledford
     [not found]     ` <2394730ce5ae1d46522dca04066293dd842edf16.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-15 22:11       ` Wendy Cheng
2014-09-04 14:28       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 2/8] IPoIB: Make the carrier_on_task race aware Doug Ledford
     [not found]     ` <d05cdce70f1312f35f8be2d14bafd2a06809b137.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-18 23:26       ` Wendy Cheng
     [not found]         ` <CABgxfbE6edfZZ58=mTvhGqWSkCxsik0XuQPR0L-yayze=803cg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:32           ` Doug Ledford
     [not found]             ` <2CC1794B-10CD-49A2-8F5D-0C66A6684DBC-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 22:05               ` Wendy Cheng
2014-09-04 12:13       ` Erez Shitrit
     [not found]         ` <5408576C.7040609-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:17           ` Doug Ledford
2014-08-12 23:38   ` [PATCH 3/8] IPoIB: fix MCAST_FLAG_BUSY usage Doug Ledford
     [not found]     ` <a410e80dc5ca7cfa64229bbbf50c1337317e3bd8.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 18:08       ` Wendy Cheng
     [not found]         ` <CABgxfbH-Dt3CpxJKwCAZeHTUyupaA9y_WXVXuxgiPMet26PTQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:28           ` Doug Ledford
     [not found]             ` <902D5BF2-159A-4B31-A87F-7491F3C8057F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-25 19:51               ` Wendy Cheng
     [not found]                 ` <CABgxfbHOD75vLdZ0TtWZbk8ne3kHd_eWObxPHmoJ-D8DjE0bkw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-25 20:03                   ` Doug Ledford
     [not found]                     ` <E3EFCBAC-2D6E-49D3-A556-DBD40701CC5F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-26 18:04                       ` Wendy Cheng
2014-08-12 23:38   ` [PATCH 4/8] IPoIB: fix mcast_dev_flush/mcast_restart_task race Doug Ledford
     [not found]     ` <e1dbcfc25d8930b281aad12699ebf8fa82485b0e.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-29 21:53       ` Wendy Cheng
     [not found]         ` <CABgxfbHDuUrdHuLJT2oD07Cy3Ys4_rj-bJ6eR=9+uv0CuPH7_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-30 15:39           ` Wendy Cheng
     [not found]             ` <CABgxfbGJOdmAn1sokEtisDdnA=r_4mfP=PfqZVsP0cd_oL50dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 18:06               ` Doug Ledford
     [not found]                 ` <1409767614.26762.7.camel-v+aXH1h/sVwpzh8Nc7Vzg+562jBIR2Zt@public.gmane.org>
2014-09-03 19:45                   ` Wendy Cheng
2014-09-03 17:49           ` Doug Ledford
2014-08-12 23:38   ` [PATCH 5/8] IPoIB: change init sequence ordering Doug Ledford
     [not found]     ` <ead9800512c1cb412b86cb1de3868c40f07c72be.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-04 12:36       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 6/8] IPoIB: Use dedicated workqueues per interface Doug Ledford
     [not found]     ` <f7af9c251d722675a549e4a673f46c0f31dfa266.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:01       ` Estrin, Alex
2014-09-04  6:49       ` Erez Shitrit
     [not found]         ` <54080B6B.8050707-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:09           ` Doug Ledford [this message]
     [not found]             ` <540EA7B2.2050805-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-10 17:27               ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 7/8] IPoIB: Make ipoib_mcast_stop_thread flush the workqueue Doug Ledford
     [not found]     ` <ae3912431eeacd81d920a405a6bdeb3853791b1a.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:03       ` Estrin, Alex
2014-08-12 23:38   ` [PATCH 8/8] IPoIB: No longer use flush as a parameter Doug Ledford
     [not found]     ` <ad7bb2b8da52f187cf2978e6a1c77ead32b60de3.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:04       ` Estrin, Alex
2014-08-15 22:08   ` [PATCH 0/8] IPoIB: Fix multiple race conditions Wendy Cheng
     [not found]     ` <CABgxfbEGfiNGUKT4NJi1GoDRouFznxpgogDt5yr47TLfwDB7hA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 16:26       ` Wendy Cheng
2014-09-03 13:52   ` Or Gerlitz
     [not found]     ` <54071D14.9040404-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-09-03 18:12       ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=540EA7B2.2050805@redhat.com \
    --to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=erezsh-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.