From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 11 Apr 2022 15:33:34 +0200
From: David Sterba
Reply-To: dsterba@suse.cz
To: Naohiro Aota
Cc: Sweet Tea Dorminy, Chris Mason, Josef Bacik, David Sterba,
 linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: wait between incomplete batch allocations
Message-ID: <20220411133334.GF15609@suse.cz>
In-Reply-To: <20220411071124.zwtcarqngqqkdd6q@naota-xeon>
References: <07d6dbf34243b562287e953c44a70cbb6fca15a1.1649268923.git.sweettea-kernel@dorminy.me>
 <20220411071124.zwtcarqngqqkdd6q@naota-xeon>

On Mon, Apr 11, 2022 at 07:11:24AM +0000, Naohiro Aota wrote:
> On Wed, Apr 06, 2022 at 02:24:18PM -0400, Sweet Tea Dorminy wrote:
> > When allocating memory in a loop, each iteration should call
> > memalloc_retry_wait() in order to prevent starving memory-freeing
> > processes (and to mark where allocation loops are).
> > ext4, f2fs, and xfs all use this function at present for their
> > allocation loops; btrfs ought to as well.
> >
> > The bulk page allocation is the only place in btrfs with an
> > allocation retry loop, so add an appropriate call to it.
> >
> > Suggested-by: David Sterba
> > Signed-off-by: Sweet Tea Dorminy
>
> The fstests btrfs/187 becomes incredibly slow with this patch applied.
>
> For example, on an NVMe ZNS SSD (zoned) device, it takes over 10 hours
> to finish the test case. It only takes 765 seconds if I revert this
> commit from the misc-next branch.
>
> I also confirmed the same slowdown occurs on regular btrfs. For the
> baseline, with this commit reverted, it takes 335 seconds on an 8GB
> ZRAM device running on QEMU (8GB RAM), and 768 seconds on a (non-zoned)
> HDD running on a real machine (128GB RAM). The tests on misc-next with
> the same setup are still running, but they have already taken 2 hours.
>
> The test case runs a full btrfs send 5 times and an incremental btrfs
> send 10 times at the same time. Also, a dedupe loop and a balance loop
> run simultaneously until all the send commands finish.
>
> The slowdown of the test case basically comes from the slow "btrfs
> send" command. On the HDD run, it takes 25 minutes to run a full btrfs
> send command and 1 hour 18 minutes to run an incremental btrfs send
> command. Thus, we need 78 minutes x 5 = 6.5 hours to finish all the
> send commands, making the test case incredibly slow.
>
> > ---
> >  fs/btrfs/extent_io.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index 9f2ada809dea..4bcc182744e4 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -6,6 +6,7 @@
> >  #include <linux/mm.h>
> >  #include <linux/pagemap.h>
> >  #include <linux/page-flags.h>
> > +#include <linux/sched/mm.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/blkdev.h>
> >  #include <linux/swap.h>
> > @@ -3159,6 +3160,8 @@ int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array)
> >  	 */
> >  	if (allocated == last)
> >  		return -ENOMEM;
> > +
> > +	memalloc_retry_wait(GFP_NOFS);
>
> And, I just noticed this is because we are waiting for the retry even
> if we successfully allocated all the pages. We should exit the loop if
> (allocated == nr_pages).

Can you please test if the fixup restores the run time? This looks like
a mistake, and the delays are not something we'd observe otherwise. If
it does not fix the problem, then the last option is to revert the
patch.
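
For clarity, here is a minimal sketch of how the allocation loop would
look with the early exit suggested above folded in, so that
memalloc_retry_wait() is only reached after an incomplete batch. This is
reconstructed from the hunks quoted in this thread and is illustrative
only; the exact fixup that gets committed may differ.

    /* Sketch only, based on the quoted patch context; not the committed fixup. */
    #include <linux/gfp.h>
    #include <linux/mm.h>
    #include <linux/sched/mm.h>

    int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array)
    {
    	unsigned int allocated;

    	for (allocated = 0; allocated < nr_pages;) {
    		unsigned int last = allocated;

    		/* Try to fill the remaining NULL slots in one bulk call. */
    		allocated = alloc_pages_bulk_array(GFP_NOFS, nr_pages, page_array);

    		/* All pages allocated: return before waiting for a retry. */
    		if (allocated == nr_pages)
    			return 0;

    		/*
    		 * No progress at all in this iteration, even though the bulk
    		 * allocator falls back to single-page allocation, so fail.
    		 */
    		if (allocated == last)
    			return -ENOMEM;

    		/* Incomplete batch: back off before retrying the allocation. */
    		memalloc_retry_wait(GFP_NOFS);
    	}
    	return 0;
    }

With this ordering, a fully successful first pass never sleeps, which is
what the btrfs/187 slowdown reported above would require.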