From: Jason Gunthorpe <jgg@ziepe.ca>
To: Bart Van Assche <bvanassche@acm.org>,
	Bob Pearson <rpearsonhpe@gmail.com>
Cc: "Daisuke Matsuda (Fujitsu)" <matsuda-daisuke@fujitsu.com>,
	'Zhu Yanjun' <yanjun.zhu@linux.dev>,
	Leon Romanovsky <leon@kernel.org>,
	"zyjzyj2000@gmail.com" <zyjzyj2000@gmail.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"shinichiro.kawasaki@wdc.com" <shinichiro.kawasaki@wdc.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	Zhu Yanjun <yanjun.zhu@intel.com>
Subject: Re: [PATCH 1/1] Revert "RDMA/rxe: Add workqueue support for rxe tasks"
Date: Wed, 11 Oct 2023 12:51:04 -0300
Message-ID: <20231011155104.GF55194@ziepe.ca>
In-Reply-To: <a4808fa6-5bd5-4a64-a437-6a7e89ca7e9f@acm.org>

On Tue, Oct 10, 2023 at 02:29:19PM -0700, Bart Van Assche wrote:
> On 10/10/23 09:09, Jason Gunthorpe wrote:
> > On Tue, Oct 10, 2023 at 04:53:55AM +0000, Daisuke Matsuda (Fujitsu) wrote:
> > 
> > > Solution 1: Reverting "RDMA/rxe: Add workqueue support for rxe tasks"
> > > I see this is supported by Zhu, Bart and approved by Leon.
> > > 
> > > Solution 2: Serializing execution of work items
> > > > -       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
> > > > +       rxe_wq = alloc_workqueue("rxe_wq", WQ_HIGHPRI | WQ_UNBOUND, 1);
> > > 
> > > Solution 3: Merging requester and completer (not yet submitted/tested)
> > > https://lore.kernel.org/all/93c8ad67-f008-4352-8887-099723c2f4ec@gmail.com/
> > > Not clear to me if we should call this a new feature or a fix.
> > > If it can eliminate the hang issue, it could be an ultimate solution.
> > > 
> > > It is understandable some people do not want to wait for solution 3 to be submitted and verified.
> > > Is there any problem if we adopt solution 2?
> > > If so, then I agree to going with solution 1.
> > > If not, solution 2 is better to me.
> > 
> > I also do not want to go backwards, I don't believe the locking is
> > magically correct under tasklets. 2 is painful enough to continue to
> > motivate people to fix this while unbreaking block tests.
> 
> In my opinion (2) is not a solution. Zhu Yanjun reported test failures with
> rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1). Adding WQ_HIGHPRI probably
> made it less likely to trigger any race conditions but I don't believe that
> this is sufficient as a solution.

I've been going on the assumption that rxe has always been full of
bugs. I don't believe the work queue change added new bugs, it just
made the existing bugs easier to hit.

It is hard to be sure until someone can find out what is going wrong.

If we revert it then rxe will probably just stop development
entirely. Daisuke's ODP work will be blocked and if Bob was able to
fix it he would have done so already. Which means Bob's ongoing work
is lost too.

I *vastly* prefer we root cause and fix it properly. Rxe was finally
starting to get a reasonable set of people interested in it, I do not
want to kill that off.

Again, I'm troubled that this doesn't seem to be reproducing for other
people.

> > I'm still puzzled why Bob can't reproduce the things Bart has seen.
> 
> Is this necessary?

It is always easier to debug something you can change than to try and
guess what an oops is trying to say.

> The KASAN complaint that I reported should be more than enough for
> someone who is familiar with the RXE driver to identify and fix the
> root cause. I can help with testing candidate fixes.

Bob?

Jason
