Date: Sat, 24 Oct 2020 01:36:09 +0200 (CEST)
From: Michał Leszczyński
To: Jürgen Groß
Cc: xen-devel@lists.xenproject.org
Message-ID: <1398275796.814046.1603496169810.JavaMail.zimbra@nask.pl>
References: <157653679.6164.1603407559737.JavaMail.zimbra@nask.pl>
Subject: Re: BUG: credit=sched2 machine hang when using DRAKVUF

----- On 23 Oct 2020 at 6:47, Jürgen Groß jgross@suse.com wrote:

> On 23.10.20 00:59, Michał Leszczyński wrote:
>> Hello,
>>
>> when using DRAKVUF against a Windows 7 x64 DomU, the whole machine hangs after a
>> few minutes.
>>
>> The chance for a hang seems to be correlated with number of PCPUs, in this case
>> we have 14 PCPUs and hang is very easily reproducible, while on other machines
>> with 2-4 PCPUs it's very rare (but still occurring sometimes). The issue is
>> observed with the default sched=credit2 and is no longer reproducible once
>> sched=credit is set.
>
> Interesting. Can you please share some more information?
>
> Which Xen version are you using?

RELEASE-4.14

>
> Is there any additional information in the dom0 log which could be
> related to the hang (earlier WARN() splats, Oopses, Xen related
> messages, hardware failure messages, ...?

I will try to find something out next week and will come back to you.

>
> Can you please try to get backtraces of all cpus at the time of the
> hang?
>
> It would help to know which cpu was the target of the call of
> smp_call_function_single(), so a disassembly of that function would
> be needed to find that information from the dumped registers.
>
> I'm asking because I've seen a similar problem recently and I was
> rather suspecting a fifo event channel issue than the Xen scheduler,
> but your data suggests it could be the scheduler after all (if it is
> the same issue, of course).
>
>
> Juergen