From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
    DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
    SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no
    version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id B5D63C4338F
    for ; Sat, 24 Jul 2021 19:21:31 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 984F260725
    for ; Sat, 24 Jul 2021 19:21:31 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S229623AbhGXSkr (ORCPT );
    Sat, 24 Jul 2021 14:40:47 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51352 "EHLO
    lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S229476AbhGXSkq (ORCPT );
    Sat, 24 Jul 2021 14:40:46 -0400
Received: from bedivere.hansenpartnership.com (bedivere.hansenpartnership.com [IPv6:2607:fcd0:100:8a00::2])
    by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E9FFC061575;
    Sat, 24 Jul 2021 12:21:18 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
    by bedivere.hansenpartnership.com (Postfix) with ESMTP id 251C81280541;
    Sat, 24 Jul 2021 12:21:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
    d=hansenpartnership.com; s=20151216; t=1627154478;
    bh=RwmyDLi8Otc7MLTn5MLWY8ZIvHMRyae2630SrNyNvd4=;
    h=Message-ID:Subject:From:To:Date:In-Reply-To:References:From;
    b=EsGCwytkbLX6pUETj7D3+5M3ugtzjkH76d1GnEPLwRl8a7rSwC6FxNxqdQJ2eQqYf
     rzf5sfyyGcl/n/0aqXBBp2IZZNQr8EErbWOf9SvKVUcjzYqpJNkWOFqmYvEvA9yxJ2
     Oadd9mn5PzfgwKOpbJDL/a6gedSqdr+XH7AoOFAA=
Received: from bedivere.hansenpartnership.com ([127.0.0.1])
    by localhost (bedivere.hansenpartnership.com [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id F2DEa1vXkPjj; Sat, 24 Jul 2021 12:21:18 -0700 (PDT)
Received: from jarvis.int.hansenpartnership.com (unknown [IPv6:2601:600:8280:66d1::527])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by bedivere.hansenpartnership.com (Postfix) with ESMTPSA id 798DF1280534;
    Sat, 24 Jul 2021 12:21:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
    d=hansenpartnership.com; s=20151216; t=1627154477;
    bh=RwmyDLi8Otc7MLTn5MLWY8ZIvHMRyae2630SrNyNvd4=;
    h=Message-ID:Subject:From:To:Date:In-Reply-To:References:From;
    b=UoAYYAcLzUOuZJV9op1vWS6kU+0PyuMTg++SW4Jb/zU4jOJTzWv7u0x6DH8acKDrz
     6jhc4biqzwPMeX1IfwKyMNVumkwu3DHv8bhFmInjqG1t5sQmnsT0dVu/EYS4CxYjUn
     WEU/WBB19lmhYdkyvX6nVEv88/ZJiSMTfhD6BZRE=
Message-ID:
Subject: Re: Folios give an 80% performance win
From: James Bottomley
To: Matthew Wilcox
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, Linus Torvalds, Andrew Morton,
    "Darrick J. Wong", Christoph Hellwig, Andres Freund, Michael Larabel
Date: Sat, 24 Jul 2021 12:21:16 -0700
In-Reply-To:
References: <20210715033704.692967-1-willy@infradead.org>
    <1e48f7edcb6d9a67e8b78823660939007e14bae1.camel@HansenPartnership.com>
Content-Type: text/plain; charset="UTF-8"
User-Agent: Evolution 3.34.4
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 2021-07-24 at 19:50 +0100, Matthew Wilcox wrote:
> On Sat, Jul 24, 2021 at 11:23:25AM -0700, James Bottomley wrote:
> > On Sat, 2021-07-24 at 19:14 +0100, Matthew Wilcox wrote:
> > > On Sat, Jul 24, 2021 at 11:09:02AM -0700, James Bottomley wrote:
> > > > On Sat, 2021-07-24 at 18:27 +0100, Matthew Wilcox wrote:
> > > > > What blows me away is the 80% performance improvement for
> > > > > PostgreSQL. I know they use the page cache extensively, so
> > > > > it's plausibly real. I'm a bit surprised that it has such good
> > > > > locality, and the size of the win far exceeds my
> > > > > expectations. We should probably dive into it and figure out
> > > > > exactly what's going on.
> > > >
> > > > Since none of the other tested databases showed more than a 3%
> > > > improvement, this looks like an anomalous result specific to
> > > > something in postgres ... although the next biggest db: mariadb
> > > > wasn't part of the tests so I'm not sure that's
> > > > definitive. Perhaps the next step should be to test mariadb?
> > > > Since they're fairly similar in domain (both full
> > > > SQL) if mariadb shows this type of improvement, you can
> > > > safely assume it's something in the way SQL databases handle
> > > > paging and if it doesn't, it's likely fixing a postgres
> > > > inefficiency.
> > >
> > > I think the thing that's specific to PostgreSQL is that it's a
> > > heavy user of the page cache. My understanding is that most
> > > databases use direct IO and manage their own page cache, while
> > > PostgreSQL trusts the kernel to get it right.
> >
> > That's testable with mariadb, at least for the innodb engine since
> > the flush_method is settable.
>
> We're still not communicating well. I'm not talking about writes,
> I'm talking about reads. Postgres uses the page cache for reads.
> InnoDB uses O_DIRECT (afaict). See articles like this one:
> https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/

If it were all about reads, wouldn't the Phoronix pgbench read only
test have shown a better improvement than 7%? I think the Phoronix
data shows that whatever it is it's to do with writes ... that does
imply something in the way the log syncs data.

James
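
The buffered versus O_DIRECT distinction discussed in the quoted text can be sketched in a few lines. This is only an illustration under stated assumptions, not how PostgreSQL or InnoDB actually implement their I/O: the helper names read_buffered, read_direct and demo are hypothetical, the 4096-byte alignment is an assumed value, and the O_DIRECT path simply gives up and returns None where the platform or filesystem refuses it.

```python
import mmap
import os
import tempfile

# Assumed alignment for O_DIRECT; real code should query the device's
# logical block size instead of hard-coding it.
BLOCK = 4096


def read_buffered(path, size=BLOCK):
    """Ordinary read: the data is served through the kernel page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)


def read_direct(path, size=BLOCK):
    """O_DIRECT read: the transfer bypasses the kernel page cache.

    Returns None if O_DIRECT is unavailable on this platform or is
    rejected by the filesystem (some filesystems refuse it).
    """
    flag = getattr(os, "O_DIRECT", None)
    if flag is None:
        return None
    try:
        fd = os.open(path, os.O_RDONLY | flag)
    except OSError:
        return None
    try:
        # An anonymous mmap is page-aligned, which satisfies the buffer
        # alignment that O_DIRECT requires.
        buf = mmap.mmap(-1, size)
        try:
            n = os.readv(fd, [buf])
            return bytes(buf[:n])
        finally:
            buf.close()
    except OSError:
        return None
    finally:
        os.close(fd)


def demo():
    """Write one block to a temporary file and read it back both ways."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * BLOCK)
        path = f.name
    try:
        return read_buffered(path), read_direct(path)
    finally:
        os.unlink(path)
```

Where O_DIRECT is accepted, demo() returns the same bytes from both paths. Only read_buffered goes through the page cache, so it is the path a page cache change such as folios would be expected to touch.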