From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F761C31E4B for ; Fri, 14 Jun 2019 18:17:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 610202177E for ; Fri, 14 Jun 2019 18:17:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725889AbfFNSRj (ORCPT ); Fri, 14 Jun 2019 14:17:39 -0400 Received: from szxga05-in.huawei.com ([45.249.212.191]:18617 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726429AbfFNSRV (ORCPT ); Fri, 14 Jun 2019 14:17:21 -0400 Received: from DGGEMS408-HUB.china.huawei.com (unknown [172.30.72.59]) by Forcepoint Email with ESMTP id 65C475463CF33BAB43D3; Sat, 15 Jun 2019 02:17:18 +0800 (CST) Received: from architecture4.huawei.com (10.140.130.215) by smtp.huawei.com (10.3.19.208) with Microsoft SMTP Server (TLS) id 14.3.439.0; Sat, 15 Jun 2019 02:17:11 +0800 From: Gao Xiang To: , Greg Kroah-Hartman , CC: LKML , , , Miao Xie , , Fang Wei , Gao Xiang Subject: [RFC PATCH 6/8] staging: erofs: introduce LZ4 decompression inplace Date: Sat, 15 Jun 2019 02:16:17 +0800 Message-ID: <20190614181619.64905-7-gaoxiang25@huawei.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190614181619.64905-1-gaoxiang25@huawei.com> References: <20190614181619.64905-1-gaoxiang25@huawei.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.140.130.215] X-CFilter-Loop: Reflected Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org compressed data will be usually loaded into last pages of the extent (the last page for 4k) for in-place decompression (more specifically, in-place IO), as ilustration below, start of compressed logical extent | end of this logical extent | | ______v___________________________v________ ... | page 6 | page 7 | page 8 | page 9 | ... |__________|__________|__________|__________| . ^ . ^ . |compressed| . | data | . . . |< dstsize >|| oend iend op ip Therefore, it's possible to do decompression inplace (thus no memcpy at all) if the margin is sufficient and safe enough [1], and it can be implemented only for fixed-size output compression compared with fixed-size input compression. No memcpy for most of in-place IO (about 99% of enwik9) after decompression inplace is implemented and sequential read will be improved of course (see the following patch for test results). [1] https://github.com/lz4/lz4/commit/b17f578a919b7e6b078cede2d52be29dd48c8e8c https://github.com/lz4/lz4/commit/5997e139f53169fa3a1c1b4418d2452a90b01602 Signed-off-by: Gao Xiang --- drivers/staging/erofs/compress.h | 1 + drivers/staging/erofs/decompressor.c | 21 ++++++++++++++++++--- drivers/staging/erofs/erofs_fs.h | 3 ++- 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/drivers/staging/erofs/compress.h b/drivers/staging/erofs/compress.h index ebeccb1f4eae..c43aa3374d28 100644 --- a/drivers/staging/erofs/compress.h +++ b/drivers/staging/erofs/compress.h @@ -17,6 +17,7 @@ enum { }; struct z_erofs_decompress_req { + struct super_block *sb; struct page **in, **out; unsigned short pageofs_out; diff --git a/drivers/staging/erofs/decompressor.c b/drivers/staging/erofs/decompressor.c index 4f644bdea0bc..727bae6a3cb0 100644 --- a/drivers/staging/erofs/decompressor.c +++ b/drivers/staging/erofs/decompressor.c @@ -14,6 +14,9 @@ #endif #define LZ4_MAX_DISTANCE_PAGES DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) +#ifndef LZ4_DECOMPRESS_INPLACE_MARGIN +#define LZ4_DECOMPRESS_INPLACE_MARGIN(srcsize) (((srcsize) >> 8) + 32) +#endif struct z_erofs_decompressor { /* @@ -131,9 +134,21 @@ static int lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out) copied = false; inlen = rq->inputsize - inputmargin; if (rq->inplace_io) { - src = generic_copy_inplace_data(rq, src, inputmargin); - inputmargin = 0; - copied = true; + const uint oend = (rq->pageofs_out + + rq->outputsize) & ~PAGE_MASK; + const uint nr = PAGE_ALIGN(rq->pageofs_out + + rq->outputsize) >> PAGE_SHIFT; + + if (rq->partial_decoding || + !(EROFS_SB(rq->sb)->requirements & + EROFS_REQUIREMENT_LZ4_0PADDING) || + rq->out[nr - 1] != rq->in[0] || + rq->inputsize - oend < + LZ4_DECOMPRESS_INPLACE_MARGIN(inlen)) { + src = generic_copy_inplace_data(rq, src, inputmargin); + inputmargin = 0; + copied = true; + } } ret = LZ4_decompress_safe_partial(src + inputmargin, out, diff --git a/drivers/staging/erofs/erofs_fs.h b/drivers/staging/erofs/erofs_fs.h index a05139f1df60..353322a3206c 100644 --- a/drivers/staging/erofs/erofs_fs.h +++ b/drivers/staging/erofs/erofs_fs.h @@ -21,7 +21,8 @@ * Any bits that aren't in EROFS_ALL_REQUIREMENTS should be * incompatible with this kernel version. */ -#define EROFS_ALL_REQUIREMENTS 0 +#define EROFS_REQUIREMENT_LZ4_0PADDING 0x00000001 +#define EROFS_ALL_REQUIREMENTS EROFS_REQUIREMENT_LZ4_0PADDING struct erofs_super_block { /* 0 */__le32 magic; /* in the little endian */ -- 2.17.1