Message-ID: <1496234670.29205.82.camel@redhat.com>
Subject: Re: [x86/mm] e2a7dcce31: kernel_BUG_at_arch/x86/mm/tlb.c
From: Rik van Riel
To: Arjan van de Ven, Andy Lutomirski
Cc: kernel test robot, X86 ML, Dave Hansen, Nadav Amit, Michal Hocko, Andrew Morton, LKML, LKP
Date: Wed, 31 May 2017 08:44:30 -0400
In-Reply-To: <06ab9499-0c74-f2f0-251c-57244360219f@linux.intel.com>
References: <20170527133113.GA33229@inn.lkp.intel.com> <06ab9499-0c74-f2f0-251c-57244360219f@linux.intel.com>
Organization: Red Hat, Inc
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2017-05-30 at 21:05 -0700, Arjan van de Ven wrote:
> On 5/27/2017 9:56 AM, Andy Lutomirski wrote:
> > On Sat, May 27, 2017 at 9:00 AM, Andy Lutomirski wrote:
> > > On Sat, May 27, 2017 at 6:31 AM, kernel test robot wrote:
> > > >
> > > > FYI, we noticed the following commit:
> > > >
> > > > commit: e2a7dcce31f10bd7471b4245a6d1f2de344e7adf ("x86/mm:
> > > > Rework lazy TLB to track the actual loaded mm")
> > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git
> > > > x86/tlbflush_cleanup
> > >
> > > Ugh, there's an unpleasant interaction between this patch and
> > > intel_idle.  I suspect that the intel_idle code in question is
> > > either wrong or pointless, but I want to investigate further.
> > > Ingo, can you hold off on applying this patch?
> >
> > I think this is what's going on: intel_idle has an optimization and
> > sometimes calls leave_mm().  This is a rather expensive way of
> > working around x86 Linux's fairly weak lazy mm handling.  It also
> > abuses the whole switch_mm state machine.  In particular, there's
> > no guarantee that the mm is actually lazy at the time.  The old
> > code didn't care, but the new code can oops.
> >
> > The short-term fix is to just reorder the code in leave_mm() to
> > avoid the OOPS.
>
> fwiw the reason the code is in intel_idle is to avoid tlb flush IPIs
> to idle cpus, once the cpu goes into a deep enough idle state.  In the
> current linux code, that is done by no longer having the old TLB live
> on the CPU, by switching to the neutral kernel-only set of tlbs.
>
> If your proposed changes do that (avoid the IPI/wakeup), great!
> (if not, there should be a way to do that)

My patch moves the atomic write from the intel idle path
into the tlb invalidation path, and gets rid of the IPI.

Shouldn't be too hard to get that on top of Andy's patches,
once those have settled.

https://patchwork.kernel.org/patch/9307541/
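
For readers following the thread, here is a rough userspace sketch of the general idea being discussed: instead of the idle path doing work up front so that no flush IPI is ever needed, the flushing side does one atomic write to mark a lazy CPU as stale, and the lazy CPU flushes locally when it stops being lazy. This is only a toy model under stated assumptions. It is not taken from the linked patch or from the kernel, and every name in it (tlb_state, enter_lazy, request_flush, leave_lazy, the counters) is hypothetical.

```c
/*
 * Toy model only: all names below are hypothetical, not kernel APIs.
 * "IPIs" and "flushes" are just counters so the behaviour can be observed.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4

enum tlb_state {
    TLB_ACTIVE,     /* CPU is running with the mm; stale entries matter */
    TLB_LAZY,       /* CPU is idle/lazy; stale user entries are harmless for now */
    TLB_LAZY_STALE  /* lazy, and a flush was requested in the meantime */
};

static _Atomic int cpu_tlb_state[NR_CPUS];  /* zero-initialised: TLB_ACTIVE */
static int ipis_sent[NR_CPUS];              /* plain counters, model is single-threaded */
static int local_flushes[NR_CPUS];

/* Idle path: only records that the CPU is lazy; no flush, no mm switch. */
void enter_lazy(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    atomic_store(&cpu_tlb_state[cpu], TLB_LAZY);
}

/*
 * Invalidation path: try to mark a lazy CPU as stale with one atomic
 * compare-and-swap. Returns true only if an IPI had to be "sent".
 */
bool request_flush(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    int expected = TLB_LAZY;
    if (atomic_compare_exchange_strong(&cpu_tlb_state[cpu], &expected,
                                       TLB_LAZY_STALE))
        return false;           /* was lazy: deferred, no wakeup */
    if (expected == TLB_LAZY_STALE)
        return false;           /* already marked stale */
    ipis_sent[cpu]++;           /* CPU is active: interrupt it */
    local_flushes[cpu]++;       /* the IPI handler would flush */
    return true;
}

/* Leaving lazy mode: flush locally if anything was deferred. */
void leave_lazy(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    int old = atomic_exchange(&cpu_tlb_state[cpu], TLB_ACTIVE);
    if (old == TLB_LAZY_STALE)
        local_flushes[cpu]++;
}
```

In this model an idle CPU is never interrupted: the cost moves to a compare-and-swap on the flushing side plus one deferred local flush when the CPU leaves lazy mode. The atomic exchange in leave_lazy() is what keeps the two sides from missing each other: a request that arrives before the exchange is seen as stale state, while one that arrives after it finds the CPU active and falls back to the IPI.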