Subject: Re: [PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits erratum
To: Dave Hansen, linux-kernel@vger.kernel.org
References: <20160708001909.FB2443E2@viggo.jf.intel.com>
Cc: x86@kernel.org, linux-mm@kvack.org, torvalds@linux-foundation.org,
    akpm@linux-foundation.org, bp@alien8.de, ak@linux.intel.com,
    mhocko@suse.com, dave.hansen@intel.com, Benjamin Herrenschmidt
From: Vlastimil Babka
Date: Wed, 13 Jul 2016 11:54:25 +0200
In-Reply-To: <20160708001909.FB2443E2@viggo.jf.intel.com>

On 07/08/2016 02:19 AM, Dave Hansen wrote:
> This patch survived a bunch of testing over the past week, including
> on hardware affected by the issue. A debugging patch showed the
> "stray" bits being set, and no ill effects were noticed.
>
> Barring any heartburn from folks, I think this is ready for the tip
> tree.

I don't see any answer to Benjamin's question on the previous version?
https://lkml.org/lkml/2016/7/1/703

> --
>
> The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
> Landing) has an erratum where a processor thread setting the Accessed
> or Dirty bits may not do so atomically against its checks for the
> Present bit. This may cause a thread (which is about to page fault)
> to set A and/or D, even though the Present bit had already been
> atomically cleared.
>
> These bits are truly "stray". In the case of the Dirty bit, the
> thread associated with the stray set was *not* allowed to write to
> the page. This means that we do not have to launder the bit(s); we
> can simply ignore them.
>
> More details can be found in the "Specification Update" under "KNL4":
>
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-phi-processor-specification-update.pdf
>
> If the PTE is used for storing a swap index or a NUMA migration index,
> the A bit could be misinterpreted as part of the swap type. The stray
> bits being set cause a software-cleared PTE to be interpreted as a
> swap entry. In some cases (like when the swap index ends up being
> for a non-existent swapfile), the kernel detects the stray value
> and WARN()s about it, but there is no guarantee that the kernel can
> always detect it.
>
> This patch changes the kernel to attempt to ignore those stray bits
> when they get set. We do this by making our swap PTE format
> completely ignore the A/D bits, and also by ignoring them in our
> pte_none() checks.
>
> Andi Kleen wrote the original version of this patch. Dave Hansen
> wrote the later ones.
>
> v4: complete rework: let the bad bits stay around, but try to
>     ignore them
> v3: huge rework to keep batching working in unmap case
> v2: out of line. avoid single thread flush. cover more clear
>     cases
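
For reference, the "ignore the stray bits in pte_none()" idea from the
quoted cover letter can be sketched roughly as below. This is a userspace
model, not the kernel patch. The bit positions for Present, Accessed and
Dirty (0, 5 and 6) are the architectural x86 ones. Every identifier
(STRAY_AD_MASK, pte_none_strict, pte_none_tolerant, stray_bits_selfcheck)
is a hypothetical name chosen for illustration, and the swap-entry
encoding is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 PTE flag bits. */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)

/* Hypothetical name: bits the erratum may set on an already-cleared PTE. */
#define STRAY_AD_MASK (PTE_ACCESSED | PTE_DIRTY)

/* Strict check: any set bit makes the entry look "not none". */
static bool pte_none_strict(uint64_t pte)
{
	return pte == 0;
}

/* Tolerant check: stray A/D bits are masked off before the test. */
static bool pte_none_tolerant(uint64_t pte)
{
	return (pte & ~STRAY_AD_MASK) == 0;
}

/* Model of the failure case: software clears the PTE, hardware sets D. */
static void stray_bits_selfcheck(void)
{
	uint64_t cleared = 0;
	uint64_t stray = cleared | PTE_DIRTY;

	assert(pte_none_strict(cleared));
	assert(!pte_none_strict(stray));   /* stray bit fools the strict check */
	assert(pte_none_tolerant(stray));  /* tolerant check still sees "none" */
	assert(!pte_none_tolerant(PTE_PRESENT | PTE_ACCESSED));
}
```

The sketch only covers the pte_none() half of the description. The other
half, keeping the swap type and offset out of bits 5 and 6, depends on the
kernel's swap PTE layout and is left out here.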