From: Jason Gunthorpe
Subject: Re: Kernel oops
Date: Thu, 27 Jul 2017 14:44:37 -0600
Message-ID: <20170727204437.GA16986@obsidianresearch.com>
References: <20170724211606.GA1705@obsidianresearch.com>
To: Matan Barak
Cc: Doug Ledford, linux-rdma, Yishai Hadas
List-Id: linux-rdma@vger.kernel.org

On Thu, Jul 27, 2017 at 03:54:07PM +0300, Matan Barak wrote:

> Digging a bit, we found a fix that might be related to this issue.
> I would be happy if you could try that and report if it solved this problem.
> We plan to send it soon.

Yep, this looks like it.

FWIW, in my experience it causes random kernel memory corruption and
failures; I was very lucky to get such a clean oops the first time.

> commit 1d4ecbf034193f000fe6686586c40ab4b2a95da1
> Author: Yishai Hadas
> Date: Thu Jul 27 15:49:00 2017 +0200
>
>     IB/uverbs: Fix device cleanup
>
>     Uverbs device should be cleaned up only when there is no
>     potential usage of.
>
>     As part of ib_uverbs_remove_one which might be triggered upon reset flow
>     the device reference count is decreased as expected and leave the final
>     cleanup to the FDs that were opened.
>
>     Current code increases reference count upon opening a new command FD and
>     decreases it upon closing the file. The event FD is opened internally
>     and rely on the command FD by taking on it a reference count.
>
>     In case that the command FD was closed and just later the event FD we
>     may ensure that the device resources as of srcu are still alive as they
>     are still in use.
>
>     Fixing the above by moving the reference count decreasing to the place
>     where the command FD is really freed instead of doing that when it was
>     just closed.
>
>     Signed-off-by: Yishai Hadas
>     Reviewed-by: Matan Barak

Reviewed-by: Jason Gunthorpe
Tested-by: Jason Gunthorpe

Please add a fixes line

Jason
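
The ordering problem described in the quoted commit message can be
modelled in a few lines of userspace C. This is only a toy sketch under
stated assumptions: every name in it (toy_device, cmd_file,
close_cmd_fd_first and so on) is hypothetical and none of it is the
ib_uverbs code. It shows just the idea of dropping the device reference
when the command file object is finally released, rather than when its
FD is closed, so that a later-closed event FD still finds the device
alive.

```c
#include <stdbool.h>

/* Hypothetical userspace model; names are illustrative, not the kernel's. */
struct toy_device {
    int refs;     /* holders that may still touch device resources */
    bool freed;   /* set once the final cleanup has run */
};

struct cmd_file {
    int refs;                 /* one for the open FD, one per event FD */
    struct toy_device *dev;
};

struct outcome {
    bool alive_after_cmd_close;
    bool freed_after_event_close;
};

static void device_put(struct toy_device *dev)
{
    if (--dev->refs == 0)
        dev->freed = true;    /* final cleanup of device resources */
}

/* Runs only when the last reference to the command file is dropped. */
static void cmd_file_release(struct cmd_file *f)
{
    device_put(f->dev);       /* device ref dropped here, not at close time */
}

static void cmd_file_put(struct cmd_file *f)
{
    if (--f->refs == 0)
        cmd_file_release(f);
}

static void cmd_file_open(struct cmd_file *f, struct toy_device *dev)
{
    dev->refs++;              /* opening a command FD pins the device */
    f->refs = 1;
    f->dev = dev;
}

/* The event FD relies on the command file by holding a reference on it. */
static void event_file_open(struct cmd_file *f)  { f->refs++; }
static void event_file_close(struct cmd_file *f) { cmd_file_put(f); }
static void cmd_file_close(struct cmd_file *f)   { cmd_file_put(f); }

/* Scenario from the commit message: command FD closed before the event FD. */
static struct outcome close_cmd_fd_first(void)
{
    struct toy_device dev = { .refs = 1, .freed = false }; /* driver's ref */
    struct cmd_file f;
    struct outcome o;

    cmd_file_open(&f, &dev);
    event_file_open(&f);

    device_put(&dev);         /* device removal drops the driver's ref */
    cmd_file_close(&f);       /* command FD goes away first */
    o.alive_after_cmd_close = !dev.freed;

    event_file_close(&f);     /* last user gone, device may now be freed */
    o.freed_after_event_close = dev.freed;
    return o;
}
```

In this model the device stays alive after the command FD is closed and
is freed only once the event FD is closed as well. If device_put() were
called directly from cmd_file_close() instead, the device would be
marked freed while the event FD was still open, which is the kind of
use-after-free the commit message describes.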