From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-io0-x235.google.com (mail-io0-x235.google.com [IPv6:2607:f8b0:4001:c06::235])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by ml01.01.org (Postfix) with ESMTPS id DAB4C2195DA76
	for ; Mon, 1 May 2017 09:16:52 -0700 (PDT)
Received: by mail-io0-x235.google.com with SMTP id k87so124013793ioi.0
	for ; Mon, 01 May 2017 09:16:52 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <1493655131.30303.17.camel@hpe.com>
References: <149355594185.9917.1577772489949690281.stgit@dwillia2-desk3.amr.corp.intel.com>
	<1493652871.30303.15.camel@hpe.com>
	<1493655131.30303.17.camel@hpe.com>
From: Dan Williams
Date: Mon, 1 May 2017 09:16:51 -0700
Subject: Re: [PATCH] libnvdimm: rework region badblocks clearing
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: "Kani, Toshimitsu"
Cc: "linux-kernel@vger.kernel.org", "linux-nvdimm@lists.01.org"

On Mon, May 1, 2017 at 9:12 AM, Kani, Toshimitsu wrote:
> On Mon, 2017-05-01 at 08:52 -0700, Dan Williams wrote:
>> On Mon, May 1, 2017 at 8:43 AM, Dan Williams wrote:
>> > On Mon, May 1, 2017 at 8:34 AM, Kani, Toshimitsu wrote:
>> > > On Sun, 2017-04-30 at 05:39 -0700, Dan Williams wrote:
>  :
>> > >
>> > > Hi Dan,
>> > >
>> > > I was testing the change with CONFIG_DEBUG_ATOMIC_SLEEP set this
>> > > time, and hit the following BUG with BTT. This is a separate
>> > > issue (not introduced by this patch), but it shows that we have
>> > > an issue with the DSM call path as well.
>> >
>> > Ah, great find, thanks! We don't see this in the unit tests because
>> > the nfit_test infrastructure takes no sleeping actions in its
>> > simulated DSM path. Outside of converting btt to use sleeping locks
>> > I'm not sure I see a path forward.
>> > I wonder how bad the performance
>> > impact of that would be? Perhaps with opportunistic spinning it
>> > won't be so bad, but I don't see another choice.
>>
>> It's worse than that. Part of the performance optimization of BTT I/O
>> was to avoid locking altogether when we could rely on a BTT lane
>> percpu, so that would also need to be removed.
>
> I do not have a good idea either, but I'd rather disable this clearing
> in the regular BTT write path than adding sleeping locks to BTT.
> Clearing a bad block in the BTT write path is difficult/challenging
> since it allocates a new block.

Actually, that may make things easier. Can we teach BTT to track error
blocks and clear them before they are reassigned?
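
To make the question above concrete, here is a rough userspace sketch of
the idea, not actual BTT code. It assumes a per-lane free-block entry
that carries an error flag. The atomic write path only sets the flag,
and the (potentially sleeping) clear is deferred until the block is
about to be handed out again. All names here (free_entry, note_error,
take_free_block, clear_error) are hypothetical, and the sketch simply
assumes the reassignment point is a context where sleeping is allowed,
which the thread has not settled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NFREE 4	/* hypothetical number of lanes / free entries */

/* Hypothetical free-list entry: one spare block plus an error flag. */
struct free_entry {
	unsigned int block;
	bool has_error;
};

static struct free_entry freelist[NFREE];
static unsigned int clear_calls;	/* counts calls to the stand-in below */

/* Stand-in for the sleeping clear-error operation (the DSM call). */
static void clear_error(unsigned int block)
{
	(void)block;
	clear_calls++;
}

/* Atomic context: only record that the block has an error, never sleep. */
static void note_error(size_t lane)
{
	assert(lane < NFREE);
	freelist[lane].has_error = true;
}

/*
 * Reassignment path: if the free block was flagged, clear it first,
 * then hand it out. Assumed (not confirmed) to run where sleeping is
 * permitted.
 */
static unsigned int take_free_block(size_t lane)
{
	struct free_entry *e;

	assert(lane < NFREE);
	e = &freelist[lane];
	if (e->has_error) {
		clear_error(e->block);
		e->has_error = false;
	}
	return e->block;
}
```

In this sketch the write path never calls clear_error() directly, so it
would take no sleeping action while holding a lane. The cost moves to
whichever path next reuses the flagged block.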