From mboxrd@z Thu Jan 1 00:00:00 1970
From: Barry Song <21cnbao@gmail.com>
Date: Thu, 2 Sep 2021 12:15:06 +1200
Subject: Re: Is it possible to implement the per-node page cache for programs/libraries?
To: Matthew Wilcox
Cc: Huang Shijie, Shijie Huang, Linus Torvalds, Al Viro, Andrew Morton,
 Linux-MM, Barry Song, LKML, Frank Wang
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 2, 2021 at 12:00 PM Matthew Wilcox wrote:
>
> On Wed, Sep 01, 2021 at 02:25:34PM +0000, Huang Shijie wrote:
> > On Wed, Sep 01, 2021 at 01:30:45PM +0000, Huang Shijie wrote:
> > > On Wed, Sep 01, 2021 at 04:25:01AM +0100, Matthew Wilcox wrote:
> > > > On Wed, Sep 01, 2021 at 11:07:41AM +0800, Shijie Huang wrote:
> > > > > In the NUMA, we only have one page cache for each file. For the
> > > > > program/shared libraries, the remote-access delays longer than
> > > > > the local-access.
> > > > >
> > > > > So, is it possible to implement the per-node page cache for
> > > > > programs/libraries?
> > > >
> > > > At this point, we have no way to support text replication within a
> > > > process. So what you're suggesting (if implemented) would work for
> > >
> > > I created a glibc patch which can do the text replication within a process.
> > The "text replication" means the shared libraries, not program itself.
>
> Thinking about it some more, if you're ok with it only being shared
> libraries, you can do this:
>
> for i in `seq 0 3`; do \
>         cp --reflink=always /lib/x86_64-linux-gnu/libc.so.6 \
>                 /lib/x86_64-linux-gnu/libc.so.6.numa$i; \
> done
>
> Reflinked files don't share page cache, so you can do this all in
> userspace with no kernel changes.

Not quite sure I catch your point. In case we are running mysql on a
machine with 128 cores (4 NUMA nodes, 32 cores per node), how will the
reflinked copies help that single mysql process use its node-local libc
copy?

Thanks
Barry
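
For readers who want to try the quoted loop without hard-coding four nodes, here is a minimal sketch under stated assumptions. The function name `replicate_per_node` and the destination directory argument are hypothetical and not from the thread. It enumerates nodes from /sys/devices/system/node and uses `--reflink=auto` rather than the `--reflink=always` in the quoted mail, so it still runs (as a plain copy) on filesystems without reflink support. Each copy is a separate file either way. The sketch only creates the copies; it does not address the question above of how a single process would map its node-local copy.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): create one copy of SRC per
# NUMA node in DEST_DIR, named <basename>.numa<id>.
# Usage: replicate_per_node SRC DEST_DIR
replicate_per_node() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1

    found=0
    # Each online node appears as /sys/devices/system/node/node<id>.
    for d in /sys/devices/system/node/node[0-9]*; do
        [ -d "$d" ] || continue
        found=1
        id=${d##*/node}   # strip everything up to the final "node"
        # Assumption: --reflink=auto instead of --reflink=always, so cp
        # falls back to a full copy where reflinks are unsupported.
        cp --reflink=auto "$src" "$dest_dir/${src##*/}.numa$id" || return 1
    done

    # No NUMA information in sysfs: treat the machine as a single node 0.
    if [ "$found" -eq 0 ]; then
        cp --reflink=auto "$src" "$dest_dir/${src##*/}.numa0" || return 1
    fi
}
```

For example, `replicate_per_node /lib/x86_64-linux-gnu/libc.so.6 /tmp/libc-per-node` would leave one `libc.so.6.numa<id>` file per node in a scratch directory, without touching the system library directory.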