From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EA61C43331 for ; Fri, 6 Sep 2019 15:33:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 56B4A20838 for ; Fri, 6 Sep 2019 15:33:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 56B4A20838 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernelpath.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57606 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i6GF7-0001og-Br for qemu-devel@archiver.kernel.org; Fri, 06 Sep 2019 11:33:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44350) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i6GCq-0008UN-PC for qemu-devel@nongnu.org; Fri, 06 Sep 2019 11:31:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i6GCh-0003m9-7H for qemu-devel@nongnu.org; Fri, 06 Sep 2019 11:31:23 -0400 Received: from indium.canonical.com ([91.189.90.7]:33406) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1i6GCg-0003i9-Oq for qemu-devel@nongnu.org; Fri, 06 Sep 2019 11:31:14 -0400 Received: from loganberry.canonical.com ([91.189.90.37]) by indium.canonical.com with esmtp (Exim 4.86_2 #2 (Debian)) id 1i6GCd-0005H9-Hi for ; Fri, 06 Sep 2019 15:31:11 +0000 Received: from loganberry.canonical.com (localhost [127.0.0.1]) by loganberry.canonical.com (Postfix) with ESMTP id 833192E80CD for ; Fri, 6 Sep 2019 15:31:11 +0000 (UTC) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Date: Fri, 06 Sep 2019 15:12:44 -0000 From: Rafael David Tinoco To: qemu-devel@nongnu.org X-Launchpad-Notification-Type: bug X-Launchpad-Bug: product=qemu; status=Confirmed; importance=Undecided; assignee=None; X-Launchpad-Bug: distribution=ubuntu; sourcepackage=qemu; component=main; status=In Progress; importance=Medium; assignee=rafaeldtinoco@kernelpath.com; X-Launchpad-Bug-Tags: qemu-img X-Launchpad-Bug-Information-Type: Public X-Launchpad-Bug-Private: no X-Launchpad-Bug-Security-Vulnerability: no X-Launchpad-Bug-Commenters: dannf jnsnow lizhengui rafaeldtinoco X-Launchpad-Bug-Reporter: dann frazier (dannf) X-Launchpad-Bug-Modifier: Rafael David Tinoco (rafaeldtinoco) References: <154327283728.15443.11625169757714443608.malonedeb@soybean.canonical.com> Message-Id: <156778276451.17577.7497183627592074422.malone@gac.canonical.com> X-Launchpad-Message-Rationale: Subscriber (QEMU) @qemu-devel-ml X-Launchpad-Message-For: qemu-devel-ml Precedence: bulk X-Generated-By: Launchpad (canonical.com); Revision="19044"; Instance="production-secrets-lazr.conf" X-Launchpad-Hash: 27f553c841ea6f5c5fc97215f260667300b56a65 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 91.189.90.7 Subject: [Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Bug 1805256 <1805256@bugs.launchpad.net> Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Hello Liz, I'll try to reproduce this issue in a Cortex-A53 aarch64 real environment (w/ 24 HW threads) AND in a virtual environment w/ lots of vCPUs... but, if it's a barrier missing - or the lack of atomicity and/or ordering in a primitive - then, I'm afraid the context switch in between vCPUs might not be the same as in real CPUs (IPIs are sent and handled differently and host kernel delays IPI delivery because of its own callbacks, before scheduling, etc...) and I could need a qemu dump from your environment. Would that be feasible ? Can you reproduce this nowadays ? This bug has aged a little, so I'm now sure! Could you provide me the dump caused by latest package available for your Ubuntu version ? This way I have the debug symbols to work with. Meanwhile, I'll be trying to reproduce on my side. -- = You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1805256 Title: qemu-img hangs on high core count ARM system Status in QEMU: Confirmed Status in qemu package in Ubuntu: In Progress Bug description: On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img frequently hangs (~50% of the time) with this command: qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2 Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This qcow2->qcow2 conversion happens to be something uvtool does every time it fetches images. Once hung, attaching gdb gives the following backtrace: (gdb) bt #0 0x0000ffffae4f8154 in __GI_ppoll (fds=3D0xaaaae8a67dc0, nfds=3D187650= 274213760, = timeout=3D, timeout@entry=3D0x0, sigmask=3D0xffffc123b= 950) at ../sysdeps/unix/sysv/linux/ppoll.c:39 #1 0x0000aaaabbefaf00 in ppoll (__ss=3D0x0, __timeout=3D0x0, __nfds=3D, = __fds=3D) at /usr/include/aarch64-linux-gnu/bits/poll2= .h:77 #2 qemu_poll_ns (fds=3D, nfds=3D, = timeout=3Dtimeout@entry=3D-1) at util/qemu-timer.c:322 #3 0x0000aaaabbefbf80 in os_host_main_loop_wait (timeout=3D-1) at util/main-loop.c:233 #4 main_loop_wait (nonblocking=3D) at util/main-loop.c:497 #5 0x0000aaaabbe2aa30 in convert_do_copy (s=3D0xffffc123bb58) at qemu-im= g.c:1980 #6 img_convert (argc=3D, argv=3D) at qemu-= img.c:2456 #7 0x0000aaaabbe2333c in main (argc=3D7, argv=3D) at qemu= -img.c:4975 Reproduced w/ latest QEMU git (@ 53744e0a182) To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions