From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52B97C3A5AB for ; Thu, 5 Sep 2019 12:51:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 61F802067B for ; Thu, 5 Sep 2019 12:51:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388073AbfIEMv0 (ORCPT ); Thu, 5 Sep 2019 08:51:26 -0400 Received: from mx2.suse.de ([195.135.220.15]:43736 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730864AbfIEMvZ (ORCPT ); Thu, 5 Sep 2019 08:51:25 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 9D270B642; Thu, 5 Sep 2019 12:51:22 +0000 (UTC) Subject: Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data From: Johannes Thumshirn To: Dave Chinner , Omar Sandoval Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com, linux-fsdevel@vger.kernel.org References: <8eae56abb90c0fe87c350322485ce8674e135074.1567623877.git.osandov@fb.com> <20190905021012.GL7777@dread.disaster.area> <8acbff04-aee0-9f88-b2cd-a85bb7b94df8@suse.de> Openpgp: preference=signencrypt Autocrypt: addr=jthumshirn@suse.de; prefer-encrypt=mutual; keydata= xsFNBFTTwPEBEADOadCyru0ZmVLaBn620Lq6WhXUlVhtvZF5r1JrbYaBROp8ZpiaOc9YpkN3 rXTgBx+UoDGtnz9DZnIa9fwxkcby63igMPFJEYpwt9adN6bA1DiKKBqbaV5ZbDXR1tRrSvCl 2V4IgvgVuO0ZJEt7gakOQlqjQaOvIzDnMIi/abKLSSzYAThsOUf6qBEn2G46r886Mk8MwkJN hilcQ7F5UsKfcVVGrTBoim6j69Ve6EztSXOXjFgsoBw4pEhWuBQCkDWPzxkkQof1WfkLAVJ2 X9McVokrRXeuu3mmB+ltamYcZ/DtvBRy8K6ViAgGyNRWmLTNWdJj19Qgw9Ef+Q9O5rwfbPZy SHS2PVE9dEaciS+EJkFQ3/TBRMP1bGeNbZUgrMwWOvt37yguvrCOglbHW+a8/G+L7vz0hasm OpvD9+kyTOHjqkknVJL69BOJeCIVUtSjT9EXaAOkqw3EyNJzzhdaMXcOPwvTXNkd8rQZIHft SPg47zMp2SJtVdYrA6YgLv7OMMhXhNkUsvhU0HZWUhcXZnj+F9NmDnuccarez9FmLijRUNgL 6iU+oypB/jaBkO6XLLwo2tf7CYmBYMmvXpygyL8/wt+SIciNiM34Yc+WIx4xv5nDVzG1n09b +iXDTYoWH82Dq1xBSVm0gxlNQRUGMmsX1dCbCS2wmWbEJJDEeQARAQABzSdKb2hhbm5lcyBU aHVtc2hpcm4gPGp0aHVtc2hpcm5Ac3VzZS5kZT7CwYAEEwEIACoCGwMFCwkIBwIGFQgJCgsC BBYCAwECHgECF4AFCQo9ta8FAlohZmoCGQEACgkQA5OWnS12CFATLQ//ajhNDVJLK9bjjiOH 53B0+hCrRBj5jQiT8I60+4w+hssvRHWkgsujF+V51jcmX3NOXeSyLC1Gk43A9vCz5gXnqyqG tOlYm26bihzG02eAoWr/glHBQyy7RYcd97SuRSv77WzuXT3mCnM15TKiqXYNzRCK7u5nx4eu szAU+AoXAC/y1gtuDMvANBEuHWE4LNQLkTwJshU1vwoNcTSl+JuQWe89GB8eeeMnHuY92T6A ActzHN14R1SRD/51N9sebAxGVZntXzSVKyMID6eGdNegWrz4q55H56ZrOMQ6IIaa7KSz3QSj 3E8VIY4FawfjCSOuA2joemnXH1a1cJtuqbDPZrO2TUZlNGrO2TRi9e2nIzouShc5EdwmL6qt WG5nbGajkm1wCNb6t4v9ueYMPkHsr6xJorFZHlu7PKqB6YY3hRC8dMcCDSLkOPWf+iZrqtpE odFBlnYNfmAXp+1ynhUvaeH6eSOqCN3jvQbITUo8mMQsdVgVeJwRdeAOFhP7fsxNugii721U acNVDPpEz4QyxfZtfu9QGI405j9MXF/CPrHlNLD5ZM5k9NxnmIdCM9i1ii4nmWvmz9JdVJ+8 6LkxauROr2apgTXxMnJ3Desp+IRWaFvTVhbwfxmwC5F3Kr0ouhr5Kt8jkQeD/vuqYuxOAyDI egjo3Y7OGqct+5nybmbOwU0EVNPA8QEQAN/79cFVNpC+8rmudnXGbob9sk0J99qnwM2tw33v uvQjEGAJTVCOHrewDbHmqZ5V1X1LI9cMlLUNMR3W0+L04+MH8s/JxshFST+hOaijGc81AN2P NrAQD7IKpA78Q2F3I6gpbMzyMy0DxmoKF73IAMQIknrhzn37DgM+x4jQgkvhFMqnnZ/xIQ9d QEBKDtfxH78QPosDqCzsN9HRArC75TiKTKOxC12ZRNFZfEPnmqJ260oImtmoD/L8QiBsdA4m Mdkmo6Pq6iAhbGQ5phmhUVuj+7O8rTpGRXySMLZ44BimM8yHWTaiLWxCehHgfUWRNLwFbrd+ nYJYHoqyFGueZFBNxY4bS2rIEDg+nSKiAwJv3DUJDDd/QJpikB5HIjg/5kcSm7laqfbr1pmC ZbR2JCTp4FTABVLxt7pJP40SuLx5He63aA/VyxoInLcZPBNvVfq/3v3fkoILphi77ZfTvKrl RkDdH6PkFOFpnrctdTWbIFAYfU96VvySFAOOg5fsCeLv9/zD4dQEGsvva/qKZXkH/l2LeVp3 xEXoFsUZtajPZgyRBxer0nVWRyeVwUQnLG8kjEOcZzX27GUpughi8w42p4oMD+96tr3BKTAr guRHJnU1M1xwRPbw5UsNXEOgYsFc8cdto0X7hQ2Ugc07CRSDvyH50IKXf2++znOTXFDhABEB AAHCwV8EGAECAAkFAlTTwPECGwwACgkQA5OWnS12CFAdRg//ZGV0voLRjjgX9ODzaz6LP+IP /ebGLXe3I+QXz8DaTkG45evOu6B2J53IM8t1xEug0OnfnTo1z0AFg5vU53L24LAdpi12CarV Da53WvHzG4BzCVGOGrAvJnMvUXf0/aEm0Sen2Mvf5kvOwsr9UTHJ8N/ucEKSXAXf+KZLYJbL NL4LbOFP+ywxtjV+SgLpDgRotM43yCRbONUXEML64SJ2ST+uNzvilhEQT/mlDP7cY259QDk7 1K6B+/ACE3Dn7X0/kp8a+ZoNjUJZkQQY4JyMOkITD6+CJ1YsxhX+/few9k5uVrwK/Cw+Vmae A85gYfFn+OlLFO/6RGjMAKOsdtPFMltNOZoT+YjgAcW6Q9qGgtVYKcVOxusL8C3v8PAYf7Ul Su7c+/Ayr3YV9Sp8PH4X4jK/zk3+DDY1/ASE94c95DW1lpOcyx3n1TwQbwp6TzPMRe1IkkYe 0lYj9ZgKaZ8hEmzuhg6FKXk9Dah+H73LdV57M4OFN8Xwb7v+oEG23vdsb2KBVG5K6Tv7Hb2N sfHWRdU3quYIistrNWWeGmfTlhVLgDhEmAsKZFH05QsAv3pQv7dH/JD+Tbn6sSnNAVrATff1 AD3dXmt+5d3qYuUxam1UFGufGzV7jqG5QNStp0yvLP0xroB8y0CnnX2FY6bAVCU+CqKu+n1B LGlgwABHRtLCwe0EGAEIACAWIQTsOJyrwsTyXYYA0NADk5adLXYIUAUCWsTXAwIbAgCBCRAD k5adLXYIUHYgBBkWCAAdFiEEx1U9vxg1xAeUwus20p7yIq+KHe4FAlrE1wMACgkQ0p7yIq+K He6RfAEA+frSSvrHiuatNqvgYAJcraYhp1GQJrWSWMmi2eFcGskBAJyLp47etEn3xhJBLVVh 2y2K4Nobb6ZgxA4Svfnkf7AAdicQALiaOKDwKD3tgf90ypEoummYzAxv8MxyPXZ7ylRnkheA eQDxuoc/YwMA4qyxhzf6K4tD/aT12XJd95gk+YAL6flGkJD8rA3jsEucPmo5eko4Ms2rOEdG jKsZetkdPKGBd2qVxxyZgzUkgRXduvyux04b9erEpJmoIXs/lE0IRbL9A9rJ6ASjFPGpXYrb 73pb6Dtkdpvv+hoe4cKeae4dS0AnDc7LWSW3Ub0n61uk/rqpTmKuesmTZeB2GHzLN5GAXfNj ELHAeSVfFLPRFrjF5jjKJkpiyq98+oUnvTtDIPMTg05wSN2JtwKnoQ0TAIHWhiF6coGeEfY8 ikdVLSZDEjW54Td5aIXWCRTBWa6Zqz/G6oESF+Lchu/lDv5+nuN04KZRAwCpXLS++/givJWo M9FMnQSvt4N95dVQE3kDsasl960ct8OzxaxuevW0OV/jQEd9gH50RaFif412DTrsuaPsBz6O l2t2TyTuHm7wVUY2J3gJYgG723/PUGW4LaoqNrYQUr/rqo6NXw6c+EglRpm1BdpkwPwAng63 W5VOQMdnozD2RsDM5GfA4aEFi5m00tE+8XPICCtkduyWw+Z+zIqYk2v+zraPLs9Gs0X2C7X0 yvqY9voUoJjG6skkOToGZbqtMX9K4GOv9JAxVs075QRXL3brHtHONDt6udYobzz+ Message-ID: <026f0c89-2a31-6a87-12d6-884818164a63@suse.de> Date: Thu, 5 Sep 2019 14:51:22 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <8acbff04-aee0-9f88-b2cd-a85bb7b94df8@suse.de> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On 05/09/2019 14:16, Johannes Thumshirn wrote: > On 05/09/2019 04:10, Dave Chinner wrote: >> On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote: >>> From: Omar Sandoval >>> >>> This adds an API for writing compressed data directly to the filesystem. >>> The use case that I have in mind is send/receive: currently, when >>> sending data from one compressed filesystem to another, the sending side >>> decompresses the data and the receiving side recompresses it before >>> writing it out. This is wasteful and can be avoided if we can just send >>> and write compressed extents. The send part will be implemented in a >>> separate series, as this ioctl can stand alone. >>> >>> The interface is essentially pwrite(2) with some extra information: >>> >>> - The input buffer contains the compressed data. >>> - Both the compressed and decompressed sizes of the data are given. >>> - The compression type (zlib, lzo, or zstd) is given. >> >> So why can't you do this with pwritev2()? Heaps of flags, and >> use a second iovec to hold the decompressed size of the previous >> iovec. i.e. >> >> iov[0].iov_base = compressed_data; >> iov[0].iov_len = compressed_size; >> iov[1].iov_base = NULL; >> iov[1].iov_len = uncompressed_size; >> pwritev2(fd, iov, 2, offset, RWF_COMPRESSED_ZLIB); >> >> And you don't need to reinvent pwritev() with some whacky ioctl that >> is bound to be completely screwed up is ways not noticed until >> someone else tries to use it... >> >> I'd also suggest atht if we are going to be able to write compressed >> data directly, then we should be able to read them as well directly >> via preadv2().... > > > While I'm with you on this from a design PoV, one question remains: > What to do with the file systems that do not support compression? > > Currently there's only a kernel global check for known RWF_* flags in > kiocb_set_rw_flags(). > > So we need a way for the individual file systems to opt into the new > RWF_COMPRESSED_* flags and fail early if they're not supported, that > will cause a lot of code churn if we cannot do it in the vfs layer. > > From the 52 ->write_iter callbacks in fs/ 32 are not using > generic_file_write_iter(). So we'd have to patch 33 functions (+/- 1-2 > because my grep | wc fu isn't the best). > This (from Anthony Iliopoulos) should be sufficient: diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 58a18ed11546..86f7ff0387d7 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3299,7 +3299,7 @@ static loff_t btrfs_file_llseek(struct file *file, static int btrfs_file_open(struct inode *inode, struct file *filp) { - filp->f_mode |= FMODE_NOWAIT; + filp->f_mode |= (FMODE_NOWAIT|FMODE_CAN_COMPRESS); return generic_file_open(inode, filp); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 997a530ff4e9..1b59e795f448 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3357,6 +3357,11 @@ static inline int kiocb_set_rw_flags(struct kiocb ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); if (flags & RWF_APPEND) ki->ki_flags |= IOCB_APPEND; + if (flags & RWF_COMPRESSED) { + if (!(ki->ki_filp->fmode & FMODE_CAN_COMPRESS)) + return -EOPNOTSUPP; + ki->ki_flags |= IOCB_COMPRESSED; + } return 0; } -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE Software Solutions Germany GmbH Maxfeldstr. 5 90409 Nürnberg Germany (HRB 247165, AG München) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850