From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751970Ab3COAYE (ORCPT ); Thu, 14 Mar 2013 20:24:04 -0400
Received: from mail-qe0-f54.google.com ([209.85.128.54]:45443 "EHLO
	mail-qe0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750833Ab3COAYC (ORCPT );
	Thu, 14 Mar 2013 20:24:02 -0400
MIME-Version: 1.0
References: <20130226070247.GA14094@gmail.com>
Date: Fri, 15 Mar 2013 01:24:00 +0100
Subject: Re: [GIT PULL] perf fixes
From: Stephane Eranian
To: Linus Torvalds
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Thomas Gleixner, Andrew Morton, Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Linus,

I bet if you force the affinity of your perf record to be on a CPU other
than CPU0, you will not get the crash. This is what I am seeing now.

It appears that on resume, the CPU0 hotplug callbacks for perf_events are
not invoked, leaving the DS_AREA MSR set to 0.

Can you confirm on your machine?

On Fri, Mar 15, 2013 at 12:11 AM, Stephane Eranian wrote:
> On Thu, Mar 14, 2013 at 11:53 PM, Stephane Eranian wrote:
>> On Thu, Mar 14, 2013 at 11:42 PM, Stephane Eranian wrote:
>>> On Thu, Mar 14, 2013 at 11:19 PM, Stephane Eranian wrote:
>>>> On Thu, Mar 14, 2013 at 11:17 PM, Linus Torvalds wrote:
>>>>> On Thu, Mar 14, 2013 at 3:09 PM, Stephane Eranian wrote:
>>>>>>
>>>>>> Could be related to suspend/resume. But were you running perf across
>>>>>> that resume/suspend cycle?
>>>>>
>>>>> No.
>>>>>
>>>>> In most cases I was running a perf record before and after (but not
>>>>> *while* suspending)
>>>>>
>>>>> In at least one other crash, I didn't run perf before at all, so the
>>>>> first time I used perf was after the resume.
>>>>>
>>>>> So in no cases did I actually have any perf stuff active over the
>>>>> suspend itself.
>>>>>
>>>> Ok, simpler test case then.
>>>>
>>>>>> Let's see if we can reproduce the problem on the same ChromeBook you
>>>>>> have. Don't have one myself.
>>>>>
>>>>> I don't imagine it should be about chromebook per se, because afaik
>>>>> all of pmu suspend/resume is done by the kernel, no firmware involved.
>>>>>
>>>>> So I'd assume it should happen with any IvyBridge.
>>>>>
>>>> Will try on a desktop IvyBridge too.
>>>
>>> Ok, it happens on my IVB desktop too, so I can investigate...
>>
>> It's not specific to IVB either, it hangs on my Nehalem desktop as well.
>
> Looks related to PEBS. If I drop the :pp the machine does not hang. Even
> a single :p hangs it. So it is possible something is not properly
> restored in the DS state after a resume or is corrupted by the suspend.
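
For anyone who wants to try the affinity test suggested at the top of this
message, here is a rough sketch in Python. It pins itself to a CPU other than
CPU0 and then launches perf record with a precise (PEBS) event, so the
profiled workload inherits the affinity mask. The helper names
(pick_non_boot_cpu, perf_record_cmd) and the "sleep 1" workload are made up
for illustration; it assumes Linux with perf on the PATH. It is only a way to
drive the experiment, not a fix or a diagnosis.

```python
import os
import shutil
import subprocess


def pick_non_boot_cpu(allowed):
    """Return the lowest allowed CPU other than CPU0, or None if there is none."""
    candidates = sorted(c for c in allowed if c != 0)
    return candidates[0] if candidates else None


def perf_record_cmd(precise_level=2, workload=("sleep", "1")):
    """Build a perf record command line (hypothetical helper).

    precise_level=2 gives cycles:pp, 1 gives cycles:p, and 0 gives plain
    cycles, i.e. no precise sampling requested.
    """
    event = "cycles"
    if precise_level:
        event += ":" + "p" * precise_level
    return ["perf", "record", "-e", event, "--", *workload]


def main():
    cpu = pick_non_boot_cpu(os.sched_getaffinity(0))
    if cpu is None or shutil.which("perf") is None:
        print("need perf and at least one usable CPU other than CPU0")
        return
    # Child processes inherit this mask, so perf and the workload stay off CPU0.
    os.sched_setaffinity(0, {cpu})
    subprocess.run(perf_record_cmd(), check=False)


if __name__ == "__main__":
    main()
```

Running it once after a suspend/resume cycle, and then again with the
pick_non_boot_cpu() result replaced by 0, would compare the two cases
described above. Passing precise_level=0 corresponds to dropping the :pp
modifier.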