From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F26EC76195 for ; Mon, 22 Jul 2019 23:35:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2839F2199C for ; Mon, 22 Jul 2019 23:35:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="U09JlmRF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731026AbfGVXfa (ORCPT ); Mon, 22 Jul 2019 19:35:30 -0400 Received: from mail-pg1-f195.google.com ([209.85.215.195]:35739 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731025AbfGVXfa (ORCPT ); Mon, 22 Jul 2019 19:35:30 -0400 Received: by mail-pg1-f195.google.com with SMTP id s1so12091132pgr.2 for ; Mon, 22 Jul 2019 16:35:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=85QgoVCMoUBXEV57x4G0PDzl0QhwTAJz3PR0yob8+QQ=; b=U09JlmRF+1e0SXt1nVtPE+wsgbeVT+ODWkwh5EDZK/HI2W80i+GnSjYreZoZXSI25X H/SgYIii0FLjAEyz5fr1WEYrj2sBL7nya8427TlnXTltUbOXOAlyW3PNZsaBZvxVlt9p RgLWM5tK8Sx6N9INgAByIFLu9Ox5yaiWxUh//d2f8VulJfzD/2OGyTqkfx1tsYqpi2aZ 9pTGAYQZrPIxM09wjYq9BG2H+3Xk4U/1Xw1EXLWHs6+EyDzpF9FgQvWLD1cdX6f1bApk +PT94ubpctMbcTGgGWiIX4I+Bn5Ly+Usy4Fm5gUH2puNJ6ZQxnG4w0K/QY7JYtZ7JJqP eD4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=85QgoVCMoUBXEV57x4G0PDzl0QhwTAJz3PR0yob8+QQ=; b=fEJN3rmdFoJX/mleHq5tSaDbaXKPw2kQv/hLL39L/ey4HMQcEfKbNsV5AyYdqiR1G9 ZwaXj9vnFl5ZZVt5SHLiuVma1eNFdQGteDHlmJak+SOTmuibisoQWiHAHIq5zJAbAAUc T63oNEJgoq2DYu2KKuvpKtqtRFnDMgKl1pf6zk+V+k3FeYzDzPAyyngS6+zoxmCTlSGW 6/OjOgZwybQcC5uEjb0RDQRGnga56BNevSIg3G08IgOcnB/CA4Bsz+65muZD0DpYvjYL WneQdCdS/6VMafZQTeZHZV/8FUQroStpEMTHaTsXpZukE0cUoxzyxqsBjpeB1A0+RHdt moVw== X-Gm-Message-State: APjAAAUkeVgbuaSkk4+wDNgVqbFhPY/XJonxAOttu8nKxUKnKryYuF8E FKYih7ENm0pzg/E1NkDq8oI= X-Google-Smtp-Source: APXvYqxd+2dzeBJuFB5qvEgV0O6zCpRw01LkZ+IA8kHWZ1vLbAH4HIeBFqUtXGy3QN5SDNyoAA0HPw== X-Received: by 2002:a63:188:: with SMTP id 130mr72665111pgb.231.1563838529829; Mon, 22 Jul 2019 16:35:29 -0700 (PDT) Received: from localhost ([2620:10d:c091:500::d1c7]) by smtp.gmail.com with ESMTPSA id l6sm40554336pga.72.2019.07.22.16.35.28 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Mon, 22 Jul 2019 16:35:29 -0700 (PDT) Date: Mon, 22 Jul 2019 19:35:27 -0400 From: Johannes Weiner To: Andrew Morton Cc: linux-mm@kvack.org, linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] psi: annotate refault stalls from IO submission Message-ID: <20190722233527.GA21594@cmpxchg.org> References: <20190722201337.19180-1-hannes@cmpxchg.org> <20190722152607.dd175a9d517a5f6af06a8bdc@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190722152607.dd175a9d517a5f6af06a8bdc@linux-foundation.org> User-Agent: Mutt/1.12.0 (2019-05-25) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On Mon, Jul 22, 2019 at 03:26:07PM -0700, Andrew Morton wrote: > On Mon, 22 Jul 2019 16:13:37 -0400 Johannes Weiner wrote: > > > psi tracks the time tasks wait for refaulting pages to become > > uptodate, but it does not track the time spent submitting the IO. The > > submission part can be significant if backing storage is contended or > > when cgroup throttling (io.latency) is in effect - a lot of time is > > spent in submit_bio(). In that case, we underreport memory pressure. > > It's a somewhat broad patch. How significant is this problem in the > real world? Can we be confident that the end-user benefit is worth the > code changes? The error scales with how aggressively IO is throttled compared to the device's capability. For example, we have system maintenance software throttled down pretty hard on IO compared to the workload. When memory is contended, the system software starts thrashing cache, but since the backing device is actually pretty fast, the majority of "io time" is from injected throttling delays during submit_bio(). As a result we barely see memory pressure, when the reality is that there is almost no progress due to the thrashing and we should be killing misbehaving stuff. > > Annotate the submit_bio() paths (or the indirection through readpage) > > for refaults and swapin to get proper psi coverage of delays there. > > > > Signed-off-by: Johannes Weiner > > --- > > fs/btrfs/extent_io.c | 14 ++++++++++++-- > > fs/ext4/readpage.c | 9 +++++++++ > > fs/f2fs/data.c | 8 ++++++++ > > fs/mpage.c | 9 +++++++++ > > mm/filemap.c | 20 ++++++++++++++++++++ > > mm/page_io.c | 11 ++++++++--- > > mm/readahead.c | 24 +++++++++++++++++++++++- > > We touch three filesystems. Why these three? Are all other > filesystems OK or will they need work as well? These are the ones that I found open-coding add_to_page_cache_lru() followed by submit_bio() instead of going through generic code like mpage, use read_cache_pages(), implement ->readpage only. > > @@ -2753,11 +2763,14 @@ static struct page *do_read_cache_page(struct address_space *mapping, > > void *data, > > gfp_t gfp) > > { > > + bool refault = false; > > struct page *page; > > int err; > > repeat: > > page = find_get_page(mapping, index); > > if (!page) { > > + unsigned long pflags; > > + > > That was a bit odd. This? > > --- a/mm/filemap.c~psi-annotate-refault-stalls-from-io-submission-fix > +++ a/mm/filemap.c > @@ -2815,12 +2815,12 @@ static struct page *do_read_cache_page(s > void *data, > gfp_t gfp) > { > - bool refault = false; > struct page *page; > int err; > repeat: > page = find_get_page(mapping, index); > if (!page) { > + bool refault = false; > unsigned long pflags; > > page = __page_cache_alloc(gfp); > _ > It's so that when we jump to 'filler:' from outside the branch, the 'refault' variable is initialized from the first time through: bool refault = false; struct page *page; page = find_get_page(mapping, index); if (!page) { __page_cache_alloc() add_to_page_cache_lru() refault = PageWorkingset(page); filler: if (refault) psi_memstall_enter(&pflags); readpage() if (refault) psi_memstall_leave(&pflags); } lock_page() if (PageUptodate()) goto out; goto filler;