From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756933Ab3EVSpy (ORCPT );
        Wed, 22 May 2013 14:45:54 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34979 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1756914Ab3EVSps (ORCPT );
        Wed, 22 May 2013 14:45:48 -0400
Message-ID: <519D11BF.5000604@redhat.com>
Date: Wed, 22 May 2013 14:43:11 -0400
From: Rik van Riel
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: "H. Peter Anvin"
CC: Stanislav Meduna, Steven Rostedt, Linus Torvalds,
        "linux-rt-users@vger.kernel.org", "linux-kernel@vger.kernel.org",
        Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Hai Huang
Subject: Re: [PATCH] mm: fix up a spurious page fault whenever it happens
References: <5195ED8B.7060002@meduna.org>
        <1369183168.6828.168.camel@gandalf.local.home>
        <519CBB30.3060200@redhat.com>
        <20130522134111.33a695c5@cuia.bos.redhat.com>
        <519D08B0.8050707@meduna.org>
        <1369246316.6828.176.camel@gandalf.local.home>
        <519D0CAB.7020800@meduna.org>
        <519D0FF8.5080200@redhat.com>
        <519D118B.6010306@zytor.com>
In-Reply-To: <519D118B.6010306@zytor.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/22/2013 02:42 PM, H. Peter Anvin wrote:
> On 05/22/2013 11:35 AM, Rik van Riel wrote:
>> On 05/22/2013 02:21 PM, Stanislav Meduna wrote:
>>> On 22.05.2013 20:11, Steven Rostedt wrote:
>>>
>>>> Did you apply both patches? Without the first one, this one is
>>>> meaningless.
>>>
>>> Sure.
>>>
>>> BTW, back when I tried to pinpoint it I also tried adding
>>>     flush_tlb_page(vma, address)
>>> at the beginning of handle_pte_fault, which as I read should
>>> be basically the same. It did not change anything.
>>
>> I'm stumped.
>>
>> If the Geode knows how to flush single TLB entries, it
>> should do that when flush_tlb_page is called.
>>
>> If it does not know, it should throw an invalid instruction
>> exception, and not quietly complete the instruction without
>> doing anything.
>>
>
> Some CPUs have had errata when it comes to flushing large pages that
> have been split into small pages by hardware, e.g. due to MTRR
> conflicts. In that case, fragments of the large page may have been left
> in the TLB.
>
> Could that explain what you are seeing?

That would be testable by changing __native_flush_tlb_single()
to call __flush_tlb(), instead of doing an invlpg instruction.

In other words, make the code look like this, for testing:

static inline void __native_flush_tlb_single(unsigned long addr)
{
        __flush_tlb();
}

This on top of the other two patches.
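
To illustrate the hypothesis being tested, here is a toy userspace model
of why a full flush could help where a single-entry flush does not. It is
only a sketch: every name in it (toy_tlb, toy_flush_single, toy_flush_all,
toy_run_experiment) is invented for illustration, it is not kernel code,
and the "only the matching fragment is dropped" behaviour is an assumption
standing in for the suspected erratum, not a description of what the Geode
actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12   /* 4 KiB small pages */
#define TOY_TLB_SIZE   8    /* arbitrary size for this toy model */

/* Hypothetical TLB entry: one cached translation for a 4 KiB page. */
struct toy_tlb_entry {
	unsigned long vpn;  /* virtual page number */
	bool valid;
};

struct toy_tlb {
	struct toy_tlb_entry e[TOY_TLB_SIZE];
};

/* Fill the TLB with nfrag small-page fragments of one large page
 * starting at base, as if hardware had split the large page. */
static void toy_load_split_page(struct toy_tlb *tlb, unsigned long base,
				size_t nfrag)
{
	for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
		tlb->e[i].vpn = (base >> TOY_PAGE_SHIFT) + i;
		tlb->e[i].valid = i < nfrag;
	}
}

/* Assumed erratum behaviour: a single-entry flush drops only the
 * fragment that contains addr and leaves the sibling fragments. */
static void toy_flush_single(struct toy_tlb *tlb, unsigned long addr)
{
	unsigned long vpn = addr >> TOY_PAGE_SHIFT;

	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		if (tlb->e[i].valid && tlb->e[i].vpn == vpn)
			tlb->e[i].valid = false;
}

/* Full flush: every entry is invalidated, fragments included. */
static void toy_flush_all(struct toy_tlb *tlb)
{
	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		tlb->e[i].valid = false;
}

/* Count the translations still cached after a flush. */
static size_t toy_count_valid(const struct toy_tlb *tlb)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		if (tlb->e[i].valid)
			n++;
	return n;
}

/* Load four fragments, flush one address inside the large page,
 * and return how many stale fragments remain. */
static size_t toy_run_experiment(bool use_full_flush)
{
	struct toy_tlb tlb;
	const unsigned long base = 0x400000UL;

	toy_load_split_page(&tlb, base, 4);
	if (use_full_flush)
		toy_flush_all(&tlb);
	else
		toy_flush_single(&tlb, base + 0x1000UL);
	return toy_count_valid(&tlb);
}
```

In this model toy_run_experiment(false) leaves three of the four fragments
cached, while toy_run_experiment(true) leaves none. On real hardware the
corresponding signal would be whether the spurious faults go away once the
single-entry flush is replaced by a full flush.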