From: Dan Williams
Date: Thu, 27 Feb 2020 10:08:09 -0800
Subject: Re: [PATCH v3 0/7] Allow setting caching mode in arch_add_memory() for P2PDMA
To: Jason Gunthorpe
Cc: Logan Gunthorpe, Linux Kernel Mailing List, Linux ARM,
 linux-ia64@vger.kernel.org, linuxppc-dev, linux-s390, Linux-sh,
 platform-driver-x86@vger.kernel.org, Linux MM, Michal Hocko,
 David Hildenbrand, Andrew Morton, Christoph Hellwig, Catalin Marinas,
 Will Deacon, Benjamin Herrenschmidt, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Eric Badger

On Thu, Feb 27, 2020 at 10:03 AM Jason Gunthorpe wrote:
>
> On Thu, Feb 27, 2020 at 09:55:04AM -0800, Dan Williams wrote:
> > On Thu, Feb 27, 2020 at 9:43 AM Jason Gunthorpe wrote:
> > >
> > > On Thu, Feb 27, 2020 at 10:21:50AM -0700, Logan Gunthorpe wrote:
> > > >
> > > >
> > > > On 2020-02-27 10:17 a.m., Jason Gunthorpe wrote:
> > > > >> Instead of this, this series proposes a change to arch_add_memory()
> > > > >> to take the pgprot required by the mapping which allows us to
> > > > >> explicitly set pagetable entries for P2PDMA memory to WC.
> > > > >
> > > > > Is there a particular reason why WC was selected here? I thought for
> > > > > the p2pdma cases there was no kernel user that touched the memory?
> > > >
> > > > Yes, that's correct. I chose WC here because the existing users are
> > > > registering memory blocks without side effects which fit the WC
> > > > semantics well.
> > >
> > > Hm, AFAIK WC memory is not compatible with the spinlocks/mutexes/etc in
> > > Linux, so while it is true the memory has no side effects, there would
> > > be surprising concurrency risks if anything in the kernel tried to
> > > write to it.
> > >
> > > Not compatible means the locks don't contain stores to WC memory the
> > > way you would expect. AFAIK on many CPUs extra barriers are required
> > > to keep WC stores ordered, the same way ARM already has extra barriers
> > > to keep UC stores ordered with locking.
> > >
> > > The spinlocks are defined to contain UC stores though.
> >
> > How are spinlocks and mutexes getting into p2pdma ranges in the first
> > instance? Even with UC, the system has bigger problems if it's trying
> > to send bus locks targeting PCI, see the flurry of activity of trying
> > to trigger faults on split locks [1].
>
> This is not what I was trying to explain.
>
> Consider
>
>   static spinlock lock; // CPU DRAM
>   static idx = 0;
>   u64 *wc_memory = [..];
>
>   spin_lock(&lock);
>   wc_memory[0] = idx++;
>   spin_unlock(&lock);
>
> You'd expect that the PCI device will observe stores where idx is
> strictly increasing, but this is not guaranteed. idx may decrease, idx
> may skip. It just won't duplicate.
>
> Or perhaps
>
>   wc_memory[0] = foo;
>   writel(doorbell)
>
> foo is not guaranteed observable by the device before doorbell reaches
> the device.
>
> All of these are things that do not happen with UC or NC memory, and
> are surprising violations of our programming model.
>
> Generic kernel code should never touch WC memory unless the code is
> specifically designed to handle it.

Ah, yes, agree.
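
For illustration, here is roughly what "specifically designed to handle
it" could look like for the two quoted cases. This is a minimal userspace
sketch, not kernel code and not anything from the patch series. wmb() and
writel() are real kernel interfaces, but here they are replaced by
stand-ins so the file compiles on its own; toy_spinlock, fake_dev,
post_index() and post_and_ring() are hypothetical names made up for the
example. It assumes that an explicit write barrier after the WC store is
what keeps that store ordered before the unlock or the doorbell; whether
wmb() alone is enough depends on the architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's wmb(); assumption: a full fence models it. */
#define wmb() atomic_thread_fence(memory_order_seq_cst)

/* Stand-in for the kernel's writel(); here just a volatile store. */
static inline void writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Toy lock standing in for spinlock_t (hypothetical, userspace only). */
typedef struct { atomic_flag held; } toy_spinlock;

static inline void toy_spin_lock(toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static inline void toy_spin_unlock(toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical device: a window imagined as WC-mapped, plus a doorbell. */
struct fake_dev {
	volatile uint64_t wc_window[8];
	volatile uint32_t doorbell;
};

static toy_spinlock lock = { ATOMIC_FLAG_INIT };
static uint64_t idx;

/* Case 1: store under a lock, with a barrier before the lock is dropped. */
static uint64_t post_index(struct fake_dev *dev)
{
	uint64_t posted;

	assert(dev != NULL);
	toy_spin_lock(&lock);
	posted = idx++;
	dev->wc_window[0] = posted;	/* store to (notionally) WC memory */
	wmb();				/* order the WC store before unlock */
	toy_spin_unlock(&lock);
	return posted;
}

/* Case 2: payload store, barrier, then the doorbell write. */
static void post_and_ring(struct fake_dev *dev, uint64_t foo, uint32_t db)
{
	assert(dev != NULL);
	dev->wc_window[0] = foo;	/* payload into (notionally) WC memory */
	wmb();				/* order the payload before the doorbell */
	writel(db, &dev->doorbell);
}
```

On ordinary cacheable memory the barriers above make no visible
difference, so this only shows where they would sit in the sequence, not
the reordering itself.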