From mboxrd@z Thu Jan 1 00:00:00 1970
From: Amir Goldstein
Date: Fri, 5 Apr 2019 17:02:33 +0300
Subject: Re: [POC][PATCH] xfs: reduce ilock contention on buffered randrw workload
To: Dave Chinner
Cc: "Darrick J . Wong", Christoph Hellwig, Matthew Wilcox, linux-xfs, linux-fsdevel
In-Reply-To: <20190404211730.GD26298@dastard>
References: <20190404165737.30889-1-amir73il@gmail.com> <20190404211730.GD26298@dastard>
List-ID: linux-fsdevel@vger.kernel.org

On Fri, Apr 5, 2019 at 12:17 AM Dave Chinner wrote:
>
> On Thu, Apr 04, 2019 at 07:57:37PM +0300, Amir Goldstein wrote:
> > This patch improves performance of a mixed random rw workload
> > on xfs without relaxing the atomic buffered read/write guarantee
> > that xfs has always provided.
> >
> > We achieve that by calling generic_file_read_iter() twice.
> > Once with a discard iterator to warm up the page cache before taking
> > the shared ilock, and once again under the shared ilock.
> This will race with things like truncate, hole punching, etc. that
> serialise IO and invalidate the page cache for data integrity
> reasons under the IOLOCK. These rely on there being no IO to the
> inode in progress at all to work correctly, which this patch
> violates. IOWs, while this is fast, it is not safe and so not a
> viable approach to solving the problem.
>

This statement leaves me wondering: if ext4 does not take i_rwsem in
generic_file_read_iter(), how does ext4 (or any other fs for that
matter) guarantee that buffered reads are synchronized with truncate,
hole punching, etc.?

The answer in ext4's case is i_mmap_sem, which is read-locked in the
page fault handler. And xfs does the same type of synchronization with
MMAPLOCK. So while my patch may not be safe, I cannot follow why from
your explanation alone; please explain if I am missing something.

One thing that Darrick mentioned earlier was that the IOLOCK is also
used by xfs to synchronize pNFS leases (probably listed under 'etc.'
in your explanation). I concede that my patch does not look safe
w.r.t. pNFS leases, but that can be sorted out with a hammer
#ifndef CONFIG_EXPORTFS_BLOCK_OPS or with finer instruments.

> FYI, I'm working on a range lock implementation that should both
> solve the performance issue and the reader starvation issue at the
> same time by allowing concurrent buffered reads and writes to
> different file ranges.
>
> IO range locks will allow proper exclusion for other extent
> manipulation operations like fallocate and truncate, and eventually
> even allow truncate, hole punch, file extension, etc to run
> concurrently with other non-overlapping IO. They solve more than
> just the performance issue you are seeing....
>

I'm glad to hear that. IO range locks are definitely a more complete
solution to the problem going forward. However, I am still interested
in continuing the discussion on my POC patch.
One reason is that I am guessing it would be much easier for distros
to backport and pick up to solve performance issues.

Even if my patch doesn't get applied upstream nor picked up by
distros, I would still like to understand its flaws and limitations.
I know... if I break it, I get to keep the pieces, but the information
that you provide helps me make my risk assessments.

Thanks,
Amir.