From: Linus Torvalds
Date: Wed, 1 Sep 2021 18:13:56 -0700
Subject: Re: Is it possible to implement the per-node page cache for programs/libraries?
To: Barry Song <21cnbao@gmail.com>
Cc: Matthew Wilcox, Huang Shijie, Shijie Huang, Al Viro, Andrew Morton, Linux-MM, Barry Song, LKML, Frank Wang

On Wed, Sep 1, 2021 at 5:15 PM Barry Song <21cnbao@gmail.com> wrote:
>
> In case we are running mysql on a machine with 128 cores
> (4numa, 32cores in each numa), how will the reflink help the only
> mysql process to leverage its local libc copy?

That's a fundamentally harder problem anyway, and for the foreseeable
future you should expect the answer to that to be "Not a way in hell".

Because it's not about "local libc copies" at that point any more,
it's about "a single process only has a single page table".

So a single process will have a particular virtual address mapped to
*one* physical page.

And no, it doesn't matter how many threads you have. What makes them
threads - not processes - is that they share the same VM image.

So the only way you will have local NUMA copies is if you

 (a) run multiple processes

 (b) bind each process to a particular NUMA node

 (c) do something special to then have per-node mappings

That "(c)" is what is up for discussion, whether it be with various
user mode hacks, or the "NUMA COW" thing, or whatever.

But (a) and (b) are basically required.

             Linus
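
As a rough illustration of steps (a) and (b) only, here is a minimal
Python sketch, assuming Linux with the sysfs NUMA topology under
/sys/devices/system/node. The helper names (parse_cpulist, node_cpus,
run_per_node, report) are made up for this example and do not come from
the thread. It pins CPUs only and sets no memory policy, and it does
nothing about (c): file-backed pages such as libc text still come from
the one shared page cache.

```python
import os
from pathlib import Path

NODE_DIR = Path("/sys/devices/system/node")  # Linux sysfs NUMA topology


def parse_cpulist(text):
    """Parse a sysfs cpulist such as "0-3,8,10-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # memory-only nodes have an empty cpulist
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def node_cpus():
    """Map NUMA node id -> CPUs on that node that this process may use."""
    allowed = os.sched_getaffinity(0)
    nodes = {}
    for path in sorted(NODE_DIR.glob("node[0-9]*")):
        cpus = parse_cpulist((path / "cpulist").read_text()) & allowed
        if cpus:
            nodes[int(path.name[4:])] = cpus
    return nodes


def run_per_node(worker):
    """(a) fork one process per node, (b) pin it to that node's CPUs."""
    pids = []
    for node, cpus in node_cpus().items():
        pid = os.fork()
        if pid == 0:
            # Child: a separate process with its own address space,
            # unlike a thread, which would share the parent's.
            os.sched_setaffinity(0, cpus)
            worker(node)
            os._exit(0)
        pids.append(pid)
    for pid in pids:
        os.waitpid(pid, 0)


def report(node):
    """Hypothetical worker: show where this process ended up."""
    cpus = sorted(os.sched_getaffinity(0))
    print(f"pid {os.getpid()} bound to node {node}, cpus {cpus}", flush=True)


if __name__ == "__main__":
    run_per_node(report)
```

On a machine without NUMA sysfs entries node_cpus() returns an empty
mapping and nothing is forked. Processes are used instead of threads
because threads of one process share a single page table and so cannot
see different physical pages at the same virtual address.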