From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EDD69C10F27 for ; Mon, 9 Mar 2020 14:32:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B6BF121D56 for ; Mon, 9 Mar 2020 14:32:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=stapelberg-ch.20150623.gappssmtp.com header.i=@stapelberg-ch.20150623.gappssmtp.com header.b="Y56O/CGG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726695AbgCIOcm (ORCPT ); Mon, 9 Mar 2020 10:32:42 -0400 Received: from mail-ot1-f67.google.com ([209.85.210.67]:40699 "EHLO mail-ot1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726520AbgCIOcl (ORCPT ); Mon, 9 Mar 2020 10:32:41 -0400 Received: by mail-ot1-f67.google.com with SMTP id x19so9736299otp.7 for ; Mon, 09 Mar 2020 07:32:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=stapelberg-ch.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=wwttuEgJLrHDrsoeti/WRvxYN1vo6oc1Gkxoo3LNrYk=; b=Y56O/CGGKX6JjG/tEDKs8wpA5NNiUKYMORJsS8olRIwd59un5niG5ggzfTBmi4Bg07 0H1ibRSQqIFB7VLHmmT06mg5RCILlUW9t/SD5sGQJJ+U2GSY/lIu0s1ea36tIEZ+XJzI TBxZrdg5Cy1scfui5yXeQFBJ6rw2oHdp6vaaS5yN66UQLQrp4pvVYm+KPahqIj00uVqY HQNpDynu37ITkchS+hc7B7c37s81wORIk3IBpeMvwVwHzb2V6ZlelvM7adcK5iT48dsk HNg+ObkT602w7saoTpwRHm88aaE9h1QcmHAJ7Az283YsljOM3K/Dctx5oa157lQ8mGbm GW1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=wwttuEgJLrHDrsoeti/WRvxYN1vo6oc1Gkxoo3LNrYk=; b=cH/F7uwqltoel9BiTQkJSUsSIrdf5TcooDsH4G6U+ofdj2nW/yeGLTzR3PeDLoC1eh CSFfkYFAiSjGwINuU8wsjil39bQdGWo3iSOmsSNEV/e3Ws6Jxiik33MbnxUHO6vbz9Et gx3Q4JI1ZgvWM0T3ONdCLKPmcmghf7YlZhus0/8lWB1WETQOMj0HljOClHCsaAD0/HM3 TQNp1Vxz0Ucidy5KuI/iaqi1TiNTnzGgBfnHYgnzNe8a07R//tnVID1VpkzE0VPQwVdi a+xCzZlkSW4S4ViPnMW61Dt+y/snMJadtta3rX0x4KsGJL30+xCWBaTlumeSaBnostA1 sd1A== X-Gm-Message-State: ANhLgQ08SIMIkNduvLYF21QXWB1jUJTThhl1fMF5nmEv/NAcvI3cxoLt /q3GL9LpZMXbQN+3DtAwvLeKbz/Zog2PPUkfs62UqQ== X-Google-Smtp-Source: ADFU+vurIp6uaBT8quIe8mrP94VZjXTFnvdeCAPZtjSz8BnPkGOgpzEY/LeTYfHsFd1fuP3uoZtuU1x60QjucjHoAY8= X-Received: by 2002:a9d:3e89:: with SMTP id b9mr13399637otc.3.1583764360881; Mon, 09 Mar 2020 07:32:40 -0700 (PDT) MIME-Version: 1.0 References: <20200303130421.GA5186@mtj.thefacebook.com> <20200303141311.GA189690@mtj.thefacebook.com> <20200303142512.GC189690@mtj.thefacebook.com> In-Reply-To: From: Michael Stapelberg Date: Mon, 9 Mar 2020 15:32:29 +0100 Message-ID: Subject: Re: [fuse-devel] Writing to FUSE via mmap extremely slow (sometimes) on some machines? To: Tejun Heo Cc: Miklos Szeredi , Jack Smith , fuse-devel , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Here=E2=80=99s one more thing I noticed: when polling /sys/kernel/debug/bdi/0:93/stats, I see that BdiDirtied and BdiWritten remain at their original values while the kernel sends FUSE read requests, and only goes up when the kernel transitions into sending FUSE write requests. Notably, the page dirtying throttling happens in the read phase, which is most likely why the write bandwidth is (correctly) measured as 0. Do we have any ideas on why the kernel sends FUSE reads at all? On Thu, Mar 5, 2020 at 3:45 PM Michael Stapelberg wrote: > > Thanks for taking a look! > > Find attached a trace file which illustrates that the device=E2=80=99s wr= ite > bandwidth (write_bw) decreases from the initial 100 MB/s down to, > eventually, 0 (not included in the trace). When seeing the > pathologically slow write-back performance, I observed write_bw=3D0! > > The trace was generated with these tracepoints enabled: > echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/e= nable > echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/e= nable > > I wonder why the measured write bandwidth decreases so much. Any thoughts= ? > > On Tue, Mar 3, 2020 at 3:25 PM Tejun Heo wrote: > > > > On Tue, Mar 03, 2020 at 03:21:47PM +0100, Michael Stapelberg wrote: > > > Find attached trace.log (cat /sys/kernel/debug/tracing/trace) and > > > fuse-debug.log (FUSE daemon with timestamps). > > > > > > Does that tell you something, or do we need more data? (If so, how?) > > > > This is likely the culprit. > > > > .... 1319822.406198: balance_dirty_pages: ... bdi_dirty=3D68 dirty_rat= elimit=3D28 ... > > > > For whatever reason, bdp calculated that the dirty throttling > > threshold for the fuse device is 28 pages which is extremely low. Need > > to track down how that number came to be. I'm afraid from here on it'd > > mostly be reading source code and sprinkling printks around but the > > debugging really comes down to figuring out how we ended up with 68 > > and 28. > > > > Thanks. > > > > -- > > tejun From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51A00C10F29 for ; Mon, 9 Mar 2020 14:32:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1801A20578 for ; Mon, 9 Mar 2020 14:32:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=stapelberg-ch.20150623.gappssmtp.com header.i=@stapelberg-ch.20150623.gappssmtp.com header.b="Y56O/CGG" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1801A20578 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=stapelberg.ch Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id ABC836B0005; Mon, 9 Mar 2020 10:32:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A46486B0006; Mon, 9 Mar 2020 10:32:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9357C6B0007; Mon, 9 Mar 2020 10:32:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0189.hostedemail.com [216.40.44.189]) by kanga.kvack.org (Postfix) with ESMTP id 798D16B0005 for ; Mon, 9 Mar 2020 10:32:42 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3381A181AEF00 for ; Mon, 9 Mar 2020 14:32:42 +0000 (UTC) X-FDA: 76576064964.18.sleet04_34a5ba8e8a642 X-HE-Tag: sleet04_34a5ba8e8a642 X-Filterd-Recvd-Size: 5717 Received: from mail-ot1-f68.google.com (mail-ot1-f68.google.com [209.85.210.68]) by imf11.hostedemail.com (Postfix) with ESMTP for ; Mon, 9 Mar 2020 14:32:41 +0000 (UTC) Received: by mail-ot1-f68.google.com with SMTP id g15so3629913otr.0 for ; Mon, 09 Mar 2020 07:32:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=stapelberg-ch.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=wwttuEgJLrHDrsoeti/WRvxYN1vo6oc1Gkxoo3LNrYk=; b=Y56O/CGGKX6JjG/tEDKs8wpA5NNiUKYMORJsS8olRIwd59un5niG5ggzfTBmi4Bg07 0H1ibRSQqIFB7VLHmmT06mg5RCILlUW9t/SD5sGQJJ+U2GSY/lIu0s1ea36tIEZ+XJzI TBxZrdg5Cy1scfui5yXeQFBJ6rw2oHdp6vaaS5yN66UQLQrp4pvVYm+KPahqIj00uVqY HQNpDynu37ITkchS+hc7B7c37s81wORIk3IBpeMvwVwHzb2V6ZlelvM7adcK5iT48dsk HNg+ObkT602w7saoTpwRHm88aaE9h1QcmHAJ7Az283YsljOM3K/Dctx5oa157lQ8mGbm GW1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=wwttuEgJLrHDrsoeti/WRvxYN1vo6oc1Gkxoo3LNrYk=; b=gqOqFQa6hZRaRPVPjf7zqwd/F+Iy/khphl2ubetfBLPDlH2z2P9uSLyTxALryyOuzt Terj93p8Y3DyFchMAsSsP7WDIJhfdlDDrSqwhGPWplJBPDgV5mi8zHo2bP42Tb09sOxg WtX938k91Y95FTqAPg8FV/HvdvYq8bpaeVyKsqzfMT5l/HQn6p1RKjNT3smbmEvXoU4/ yBXK1qXLPq5IXudQohXPYyF+IJMveudhdU4nKTnUZxpV6MSfQxBkP7MtgY0NddyhlL45 lk5jvnobe9t0ySLxMA+0rTeykp8xAf2B+FFocX9qUXOqdeAa2I79Pcizs9OvsKA/2cww t1Aw== X-Gm-Message-State: ANhLgQ1L/WN0DJSj+9rPMAw/Xn36nTRVx5g4P4uW7yY55h7yOzitYKZg +GCINrJlWFd2RnygPkuFfofThjGMIry+ANFn9yA4mA== X-Google-Smtp-Source: ADFU+vurIp6uaBT8quIe8mrP94VZjXTFnvdeCAPZtjSz8BnPkGOgpzEY/LeTYfHsFd1fuP3uoZtuU1x60QjucjHoAY8= X-Received: by 2002:a9d:3e89:: with SMTP id b9mr13399637otc.3.1583764360881; Mon, 09 Mar 2020 07:32:40 -0700 (PDT) MIME-Version: 1.0 References: <20200303130421.GA5186@mtj.thefacebook.com> <20200303141311.GA189690@mtj.thefacebook.com> <20200303142512.GC189690@mtj.thefacebook.com> In-Reply-To: From: Michael Stapelberg Date: Mon, 9 Mar 2020 15:32:29 +0100 Message-ID: Subject: Re: [fuse-devel] Writing to FUSE via mmap extremely slow (sometimes) on some machines? To: Tejun Heo Cc: Miklos Szeredi , Jack Smith , fuse-devel , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.030077, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Here=E2=80=99s one more thing I noticed: when polling /sys/kernel/debug/bdi/0:93/stats, I see that BdiDirtied and BdiWritten remain at their original values while the kernel sends FUSE read requests, and only goes up when the kernel transitions into sending FUSE write requests. Notably, the page dirtying throttling happens in the read phase, which is most likely why the write bandwidth is (correctly) measured as 0. Do we have any ideas on why the kernel sends FUSE reads at all? On Thu, Mar 5, 2020 at 3:45 PM Michael Stapelberg wrote: > > Thanks for taking a look! > > Find attached a trace file which illustrates that the device=E2=80=99s wr= ite > bandwidth (write_bw) decreases from the initial 100 MB/s down to, > eventually, 0 (not included in the trace). When seeing the > pathologically slow write-back performance, I observed write_bw=3D0! > > The trace was generated with these tracepoints enabled: > echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/e= nable > echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/e= nable > > I wonder why the measured write bandwidth decreases so much. Any thoughts= ? > > On Tue, Mar 3, 2020 at 3:25 PM Tejun Heo wrote: > > > > On Tue, Mar 03, 2020 at 03:21:47PM +0100, Michael Stapelberg wrote: > > > Find attached trace.log (cat /sys/kernel/debug/tracing/trace) and > > > fuse-debug.log (FUSE daemon with timestamps). > > > > > > Does that tell you something, or do we need more data? (If so, how?) > > > > This is likely the culprit. > > > > .... 1319822.406198: balance_dirty_pages: ... bdi_dirty=3D68 dirty_rat= elimit=3D28 ... > > > > For whatever reason, bdp calculated that the dirty throttling > > threshold for the fuse device is 28 pages which is extremely low. Need > > to track down how that number came to be. I'm afraid from here on it'd > > mostly be reading source code and sprinkling printks around but the > > debugging really comes down to figuring out how we ended up with 68 > > and 28. > > > > Thanks. > > > > -- > > tejun