From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752612AbbDBU5f (ORCPT ); Thu, 2 Apr 2015 16:57:35 -0400
Received: from mail-ie0-f171.google.com ([209.85.223.171]:35328 "EHLO
	mail-ie0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751702AbbDBU5d (ORCPT );
	Thu, 2 Apr 2015 16:57:33 -0400
MIME-Version: 1.0
In-Reply-To: <20150402190725.GA10570@gmail.com>
References: <20150331031536.GA9303@canonical.com>
	<20150331222327.GA12512@canonical.com>
	<20150401124336.GB12841@gmail.com>
	<20150401161047.GD12730@canonical.com>
	<551C6A48.9060805@canonical.com>
	<20150402182607.GA8896@gmail.com>
	<551D8FAF.5070805@canonical.com>
	<20150402190725.GA10570@gmail.com>
Date: Thu, 2 Apr 2015 13:57:33 -0700
X-Google-Sender-Auth: J2mV9skeYsb0hHu4eiDU6ogTpyM
Message-ID:
Subject: Re: smp_call_function_single lockups
From: Linus Torvalds
To: Ingo Molnar
Cc: Chris J Arges, Rafael David Tinoco, Peter Anvin, Jiang Liu,
	Peter Zijlstra, LKML, Jens Axboe, Frederic Weisbecker, Gema Gomez,
	"the arch/x86 maintainers"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 2, 2015 at 12:07 PM, Ingo Molnar wrote:
>
> So one possibility would be that an 'IPI was sent but lost'.

Yes, the "sent but lost" thing would certainly explain the lockups.

At the same time, that sounds like a huge hardware bug, and that's
somewhat surprising/unlikely.

That said.

> We could try the following trick: poll for completion for a couple of
> seconds (since an IPI is not held up by anything but irqs-off
> sections, it should arrive within microseconds typically - seconds of
> polling should be more than enough), and if the IPI does not arrive,
> print a warning message and re-send the IPI.

Sounds like a reasonable approach. At worst it doesn't fix anything,
and we never see any messages, and that tells us something too.

Linus
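
For illustration only, the poll-then-resend trick quoted above could be
modeled in user space roughly as follows. This is a toy Python simulation,
not kernel code and not the patch that was eventually tried: the names
`SimulatedIPI` and `call_with_resend`, the timeout value, and the resend
limit are all assumptions made up for this sketch.

```python
import time


class SimulatedIPI:
    """Hypothetical stand-in for an IPI; the first `drop_first` sends are lost."""

    def __init__(self, drop_first=0):
        self.drop_first = drop_first
        self.sends = 0
        self.done = False

    def send(self):
        self.sends += 1
        # A delivered "IPI" completes at once; a dropped one never completes.
        if self.sends > self.drop_first:
            self.done = True


def call_with_resend(ipi, timeout=0.05, max_resends=3, log=print):
    """Send, poll for completion, and warn and resend after each timeout.

    Returns the number of resends that were needed.
    Raises TimeoutError if the call never completes.
    """
    ipi.send()
    resends = 0
    while True:
        # Poll for completion until the deadline passes.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if ipi.done:
                return resends
            time.sleep(0.001)
        if resends == max_resends:
            raise TimeoutError("no completion after %d resends" % resends)
        resends += 1
        log("warning: IPI not acknowledged after %.3fs, resend #%d"
            % (timeout, resends))
        ipi.send()
```

With `SimulatedIPI(drop_first=1)` the first send is lost, one warning is
logged, and the resend completes the call. If no warning is ever logged,
the "sent but lost" theory is not what is being hit, which is the
diagnostic value pointed out in the reply above.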