Subject: Re: [PATCH 10/18] lightnvm: pblk: ensure that emeta is written
From: Igor Konopko
To: Javier González
Cc: Matias Bjørling, Hans Holmberg, linux-block@vger.kernel.org
Date: Thu, 21 Mar 2019 14:34:42 +0100

On 18.03.2019 19:26, Javier González wrote:
>> On 18 Mar 2019, at 14.02, Igor Konopko wrote:
>>
>>
>>
>> On 17.03.2019 20:44, Matias Bjørling wrote:
>>> On 3/14/19 9:04 AM, Igor Konopko wrote:
>>>> When we are trying to switch to the new line, we need to ensure that
>>>> emeta for n-2 line is already written. In other case we can end with
>>>> deadlock scenario, when the writer has no more requests to write and
>>>> thus there is no way to trigger emeta writes from writer thread. This
>>>> is a corner case scenario which occurs in a case of multiple writes
>>>> error and thus kind of early line close due to lack of line space.
>>>>
>>>> Signed-off-by: Igor Konopko
>>>> ---
>>>>  drivers/lightnvm/pblk-core.c  |  2 ++
>>>>  drivers/lightnvm/pblk-write.c | 24 ++++++++++++++++++++++++
>>>>  drivers/lightnvm/pblk.h       |  1 +
>>>>  3 files changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
>>>> index 38e26fe..a683d1f 100644
>>>> --- a/drivers/lightnvm/pblk-core.c
>>>> +++ b/drivers/lightnvm/pblk-core.c
>>>> @@ -1001,6 +1001,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
>>>>  					struct pblk_line_mgmt *l_mg,
>>>>  					struct pblk_line_meta *lm)
>>>>  {
>>>> +	struct pblk *pblk = container_of(l_mg, struct pblk, l_mg);
>>>>  	int meta_line;
>>>>  	lockdep_assert_held(&l_mg->free_lock);
>>>> @@ -1009,6 +1010,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
>>>>  	meta_line = find_first_zero_bit(&l_mg->meta_bitmap, PBLK_DATA_LINES);
>>>>  	if (meta_line == PBLK_DATA_LINES) {
>>>>  		spin_unlock(&l_mg->free_lock);
>>>> +		pblk_write_emeta_force(pblk);
>>>>  		io_schedule();
>>>>  		spin_lock(&l_mg->free_lock);
>>>>  		goto retry_meta;
>>>> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
>>>> index 4e63f9b..4fbb9b2 100644
>>>> --- a/drivers/lightnvm/pblk-write.c
>>>> +++ b/drivers/lightnvm/pblk-write.c
>>>> @@ -505,6 +505,30 @@ static struct pblk_line *pblk_should_submit_meta_io(struct pblk *pblk,
>>>>  	return meta_line;
>>>>  }
>>>> +void pblk_write_emeta_force(struct pblk *pblk)
>>>> +{
>>>> +	struct pblk_line_meta *lm = &pblk->lm;
>>>> +	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>>>> +	struct pblk_line *meta_line;
>>>> +
>>>> +	while (true) {
>>>> +		spin_lock(&l_mg->close_lock);
>>>> +		if (list_empty(&l_mg->emeta_list)) {
>>>> +			spin_unlock(&l_mg->close_lock);
>>>> +			break;
>>>> +		}
>>>> +		meta_line = list_first_entry(&l_mg->emeta_list,
>>>> +						struct pblk_line, list);
>>>> +		if (meta_line->emeta->mem >= lm->emeta_len[0]) {
>>>> +			spin_unlock(&l_mg->close_lock);
>>>> +			io_schedule();
>>>> +			continue;
>>>> +		}
>>>> +		spin_unlock(&l_mg->close_lock);
>>>> +		pblk_submit_meta_io(pblk, meta_line);
>>>> +	}
>>>> +}
>>>> +
>>>>  static int pblk_submit_io_set(struct pblk *pblk, struct nvm_rq *rqd)
>>>>  {
>>>>  	struct ppa_addr erase_ppa;
>>>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>>>> index 0a85990..a42bbfb 100644
>>>> --- a/drivers/lightnvm/pblk.h
>>>> +++ b/drivers/lightnvm/pblk.h
>>>> @@ -877,6 +877,7 @@ int pblk_write_ts(void *data);
>>>>  void pblk_write_timer_fn(struct timer_list *t);
>>>>  void pblk_write_should_kick(struct pblk *pblk);
>>>>  void pblk_write_kick(struct pblk *pblk);
>>>> +void pblk_write_emeta_force(struct pblk *pblk);
>>>>  /*
>>>>   * pblk read path
>>> Hi Igor,
>>> Is this an error that qemu can force pblk to expose? Can you provide
>>> a specific example on what is needed to force the error?
>>
>> So I hit this error on PBLKs with low number of LUNs and multiple
>> write IO errors (should be reproducible with error injection). Then
>> pblk_map_remaining() quickly mapped all the sectors in line and thus
>> writer thread was not able to issue all the necessary emeta IO writes,
>> so it gets stuck when trying to replace the line with a new one. So
>> this is definitely an error/corner case scenario.
>
> If the cause is emeta writes, then there is a bug in
> pblk_line_close_meta(), as the logic to prevent this case is in place.
>

I definitely saw this function being called a few times in corner-case
scenarios, but I will drop this patch for now and try to find out the
reason for this behavior. After the discussion, this patch looks more
like a workaround than a real fix to me.

Thanks
Igor