From: Damien Le Moal
To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel@lists.sourceforge.net
Cc: linux-fsdevel@vger.kernel.org, Matias Bjorling, Masato Suzuki
Subject: [PATCH 2/3] f2fs: Reduce zoned block device memory usage
Date: Sat, 16 Mar 2019 09:13:07 +0900
Message-Id: <20190316001308.18115-3-damien.lemoal@wdc.com>
In-Reply-To: <20190316001308.18115-1-damien.lemoal@wdc.com>
References: <20190316001308.18115-1-damien.lemoal@wdc.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

For zoned block devices, an array of zone types for each device is
allocated and initialized in order to determine if a section is stored
on a sequential zone (zone reset needed) or a conventional zone (no
zone reset needed and regular discard applies).
Considering this usage, the zone types stored in memory can be replaced
with a bitmap that carries equivalent information, namely whether a zone
is sequential or not. This reduces the memory usage for each zoned
device by roughly a factor of 8: on a 14TB disk with zones of 256 MB,
the zone type array consumes 13x4KB pages while the bitmap uses only
2x4KB pages.

This patch changes the f2fs_dev_info structure blkz_type field to the
bitmap blkz_seq. Access to this bitmap is done using the helper
function f2fs_blkz_is_seq(), which is a rewrite of the function
get_blkz_type().

Signed-off-by: Damien Le Moal
---
 fs/f2fs/f2fs.h    | 14 +++++---------
 fs/f2fs/segment.c | 36 ++++++++++++++++--------------------
 fs/f2fs/super.c   | 13 ++++++++-----
 3 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e79f426a9f2f..576e637ef568 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1066,8 +1066,8 @@ struct f2fs_dev_info {
 	block_t start_blk;
 	block_t end_blk;
 #ifdef CONFIG_BLK_DEV_ZONED
-	unsigned int nr_blkz;			/* Total number of zones */
-	u8 *blkz_type;				/* Array of zones type */
+	unsigned int nr_blkz;		/* Total number of zones */
+	unsigned long *blkz_seq;	/* Bitmap indicating sequential zones */
 #endif
 };
 
@@ -3513,16 +3513,12 @@ F2FS_FEATURE_FUNCS(lost_found, LOST_FOUND);
 F2FS_FEATURE_FUNCS(sb_chksum, SB_CHKSUM);
 
 #ifdef CONFIG_BLK_DEV_ZONED
-static inline int get_blkz_type(struct f2fs_sb_info *sbi,
-			struct block_device *bdev, block_t blkaddr)
+static inline bool f2fs_blkz_is_seq(struct f2fs_sb_info *sbi, int devi,
+				    block_t blkaddr)
 {
 	unsigned int zno = blkaddr >> sbi->log_blocks_per_blkz;
-	int i;
 
-	for (i = 0; i < sbi->s_ndevs; i++)
-		if (FDEV(i).bdev == bdev)
-			return FDEV(i).blkz_type[zno];
-	return -EINVAL;
+	return test_bit(zno, FDEV(devi).blkz_seq);
 }
 #endif
 
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index d8f531b33350..f40148b735d7 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1701,40 +1701,36 @@ static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
 
 	if (f2fs_is_multi_device(sbi)) {
 		devi = f2fs_target_device_index(sbi, blkstart);
+		if (blkstart < FDEV(devi).start_blk ||
+		    blkstart > FDEV(devi).end_blk) {
+			f2fs_msg(sbi->sb, KERN_ERR, "Invalid block %x",
+				 blkstart);
+			return -EIO;
+		}
 		blkstart -= FDEV(devi).start_blk;
 	}
 
-	/*
-	 * We need to know the type of the zone: for conventional zones,
-	 * use regular discard if the drive supports it. For sequential
-	 * zones, reset the zone write pointer.
-	 */
-	switch (get_blkz_type(sbi, bdev, blkstart)) {
-
-	case BLK_ZONE_TYPE_CONVENTIONAL:
-		if (!blk_queue_discard(bdev_get_queue(bdev)))
-			return 0;
-		return __queue_discard_cmd(sbi, bdev, lblkstart, blklen);
-	case BLK_ZONE_TYPE_SEQWRITE_REQ:
-	case BLK_ZONE_TYPE_SEQWRITE_PREF:
+	/* For sequential zones, reset the zone write pointer */
+	if (f2fs_blkz_is_seq(sbi, devi, blkstart)) {
 		sector = SECTOR_FROM_BLOCK(blkstart);
 		nr_sects = SECTOR_FROM_BLOCK(blklen);
 
 		if (sector & (bdev_zone_sectors(bdev) - 1) ||
 				nr_sects != bdev_zone_sectors(bdev)) {
-			f2fs_msg(sbi->sb, KERN_INFO,
-				"(%d) %s: Unaligned discard attempted (block %x + %x)",
+			f2fs_msg(sbi->sb, KERN_ERR,
+				 "(%d) %s: Unaligned zone reset attempted (block %x + %x)",
 				devi, sbi->s_ndevs ? FDEV(devi).path: "",
 				blkstart, blklen);
 			return -EIO;
 		}
 		trace_f2fs_issue_reset_zone(bdev, blkstart);
-		return blkdev_reset_zones(bdev, sector,
-					  nr_sects, GFP_NOFS);
-	default:
-		/* Unknown zone type: broken device ? */
-		return -EIO;
+		return blkdev_reset_zones(bdev, sector, nr_sects, GFP_NOFS);
 	}
+
+	/* For conventional zones, use regular discard if supported */
+	if (!blk_queue_discard(bdev_get_queue(bdev)))
+		return 0;
+	return __queue_discard_cmd(sbi, bdev, lblkstart, blklen);
 }
 #endif
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index d1ccc52afc93..8d0caf4c5f2b 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1017,7 +1017,7 @@ static void destroy_device_list(struct f2fs_sb_info *sbi)
 	for (i = 0; i < sbi->s_ndevs; i++) {
 		blkdev_put(FDEV(i).bdev, FMODE_EXCL);
 #ifdef CONFIG_BLK_DEV_ZONED
-		kvfree(FDEV(i).blkz_type);
+		kvfree(FDEV(i).blkz_seq);
 #endif
 	}
 	kvfree(sbi->devs);
@@ -2765,9 +2765,11 @@ static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 	if (nr_sectors & (bdev_zone_sectors(bdev) - 1))
 		FDEV(devi).nr_blkz++;
 
-	FDEV(devi).blkz_type = f2fs_kmalloc(sbi, FDEV(devi).nr_blkz,
-								GFP_KERNEL);
-	if (!FDEV(devi).blkz_type)
+	FDEV(devi).blkz_seq = f2fs_kzalloc(sbi,
+					BITS_TO_LONGS(FDEV(devi).nr_blkz)
+					* sizeof(unsigned long),
+					GFP_KERNEL);
+	if (!FDEV(devi).blkz_seq)
 		return -ENOMEM;
 
 #define F2FS_REPORT_NR_ZONES   4096
@@ -2794,7 +2796,8 @@ static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 		}
 
 		for (i = 0; i < nr_zones; i++) {
-			FDEV(devi).blkz_type[n] = zones[i].type;
+			if (zones[i].type != BLK_ZONE_TYPE_CONVENTIONAL)
+				set_bit(n, FDEV(devi).blkz_seq);
 			sector += zones[i].len;
 			n++;
 		}
-- 
2.20.1
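
For readers who want to check the memory figures quoted in the commit
message, here is a minimal userspace sketch of the same idea. It is not
the kernel code. The helper names (zone_count, bits_to_words,
bytes_to_pages, zone_bitmap_alloc, zone_mark_seq, zone_is_seq) are
hypothetical, the page size is assumed to be 4096 bytes, and the bit
operations are plain non-atomic C, unlike the kernel's set_bit().

```c
/* Userspace sketch only: mirrors the idea of the patch, not the kernel code. */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BYTES	4096u	/* assumed page size, as in the commit message */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Number of zones, counting a smaller last zone as one more entry. */
static size_t zone_count(uint64_t dev_bytes, uint64_t zone_bytes)
{
	assert(zone_bytes != 0);
	return (size_t)((dev_bytes + zone_bytes - 1) / zone_bytes);
}

/* Words needed for a bitmap of nr_bits bits (userspace BITS_TO_LONGS). */
static size_t bits_to_words(size_t nr_bits)
{
	return (nr_bits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Round a byte count up to whole pages. */
static size_t bytes_to_pages(size_t bytes)
{
	return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Allocate a zero-filled bitmap: every zone starts out as conventional. */
static unsigned long *zone_bitmap_alloc(size_t nr_zones)
{
	return calloc(bits_to_words(nr_zones), sizeof(unsigned long));
}

/* Non-atomic counterpart of set_bit(): mark zone zno as sequential. */
static void zone_mark_seq(unsigned long *map, size_t zno)
{
	map[zno / BITS_PER_WORD] |= 1UL << (zno % BITS_PER_WORD);
}

/* Counterpart of test_bit(): is zone zno sequential? */
static bool zone_is_seq(const unsigned long *map, size_t zno)
{
	return (map[zno / BITS_PER_WORD] >> (zno % BITS_PER_WORD)) & 1UL;
}
```

Taking 14TB as 14 * 10^12 bytes and a zone as 256 * 2^20 bytes,
zone_count() gives 52155 zones. One byte per zone rounds up to 13 pages
and one bit per zone rounds up to 2 pages, which agrees with the numbers
in the commit message. Allocating the bitmap zero-filled is what lets
the initialization loop touch only the non-conventional zones.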