Date: Thu, 18 Jun 2020 11:11:13 +0200
From: "javier.gonz@samsung.com"
To: Damien Le Moal
Cc: Kanchan Joshi, "axboe@kernel.dk", "viro@zeniv.linux.org.uk",
 "bcrl@kvack.org", "linux-fsdevel@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-aio@kvack.org",
 "io-uring@vger.kernel.org", "linux-block@vger.kernel.org",
 "selvakuma.s1@samsung.com", "nj.shetty@samsung.com"
Subject: Re: [PATCH 3/3] io_uring: add support for zone-append
Message-ID: <20200618091113.eu2xdp6zmdooy5d2@mpHalley.local>
References: <1592414619-5646-1-git-send-email-joshi.k@samsung.com>
 <1592414619-5646-4-git-send-email-joshi.k@samsung.com>
 <20200618083529.ciifu4chr4vrv2j5@mpHalley.local>
X-Mailing-List: linux-kernel@vger.kernel.org

On 18.06.2020 08:47, Damien Le Moal wrote:
>On 2020/06/18 17:35, javier.gonz@samsung.com wrote:
>> On 18.06.2020 07:39, Damien Le Moal wrote:
>>> On 2020/06/18 2:27, Kanchan Joshi wrote:
>>>> From: Selvakumar S
>>>>
>>>> Introduce three new opcodes for zone-append -
>>>>
>>>>    IORING_OP_ZONE_APPEND       : non-vectored, similar to IORING_OP_WRITE
>>>>    IORING_OP_ZONE_APPENDV      : vectored, similar to IORING_OP_WRITEV
>>>>    IORING_OP_ZONE_APPEND_FIXED : append using fixed-buffers
>>>>
>>>> Repurpose cqe->flags to return zone-relative offset.
>>>>
>>>> Signed-off-by: SelvaKumar S
>>>> Signed-off-by: Kanchan Joshi
>>>> Signed-off-by: Nitesh Shetty
>>>> Signed-off-by: Javier Gonzalez
>>>> ---
>>>>  fs/io_uring.c                 | 72 +++++++++++++++++++++++++++++++++++++++++--
>>>>  include/uapi/linux/io_uring.h |  8 ++++-
>>>>  2 files changed, 77 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index 155f3d8..c14c873 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -649,6 +649,10 @@ struct io_kiocb {
>>>>  	unsigned long		fsize;
>>>>  	u64			user_data;
>>>>  	u32			result;
>>>> +#ifdef CONFIG_BLK_DEV_ZONED
>>>> +	/* zone-relative offset for append, in bytes */
>>>> +	u32			append_offset;
>>>
>>> this can overflow. u64 is needed.
>>
>> We chose to do it this way to start with because struct io_uring_cqe
>> only has space for u32 when we reuse the flags.
>>
>> We can of course create a new cqe structure, but that will come with
>> larger changes to io_uring for supporting append.
>>
>> Do you believe this is a better approach?
>
>The problem is that zone sizes are 32 bits in the kernel, as a number of sectors.
>So any device that has a zone size smaller than or equal to 2^31 512B sectors can be
>accepted. Using a zone-relative offset in bytes for returning the zone append result
>is OK-ish, but to match the kernel-supported range of possible zone sizes, you
>need 31+9 bits... 32 does not cut it.

Agreed. Our initial assumption was that u32 would cover current zone size
requirements, but if this is a no-go, we will take the longer path.

>
>Since you need a 64-bit sized result, I would also prefer that you drop the zone
>relative offset as a result and return the absolute offset instead. That makes
>life easier for the applications since the zone append requests also must use
>absolute offsets for zone start. An absolute offset as a result becomes
>consistent with that and all other read/write system calls that all use absolute
>offsets (seek() is the only one that I know of that can use a relative offset,
>but that is not an IO system call).

Agreed. Using relative offsets was a product of reusing the existing u32.
If we move to u64, there is no need to do an extra transformation.

Thanks, Damien!
Javier
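
For reference, the "31+9 bits" arithmetic discussed above can be checked
with a small userspace sketch. This is only an illustration under stated
assumptions: 512-byte sectors (shift of 9), and a zone append that writes at
least one sector, so the largest zone-relative byte offset is the start of
the zone's last sector. The helper names (zone_size_bytes, max_rel_offset,
rel_offset_fits_u32) are hypothetical and are not part of the patch or of
the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* assumption: 512-byte sectors */

/* Zone size in bytes for a zone of zone_sectors 512B sectors. */
static inline uint64_t zone_size_bytes(uint64_t zone_sectors)
{
	return zone_sectors << SECTOR_SHIFT;
}

/*
 * Largest zone-relative byte offset an append could report, assuming
 * the append covers at least one sector: the start of the last sector.
 */
static inline uint64_t max_rel_offset(uint64_t zone_sectors)
{
	assert(zone_sectors > 0);
	return zone_size_bytes(zone_sectors) - (UINT64_C(1) << SECTOR_SHIFT);
}

/* True if every zone-relative byte offset of such a zone fits in a u32. */
static inline bool rel_offset_fits_u32(uint64_t zone_sectors)
{
	return max_rel_offset(zone_sectors) <= UINT32_MAX;
}
```

With zone_sectors = 2^31, zone_size_bytes() gives 2^40, so byte offsets need
up to 40 bits and rel_offset_fits_u32() is false. Under these assumptions a
u32 byte offset is only sufficient for zones of up to 2^23 sectors (4 GiB).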