Date: Tue, 24 Aug 2021 13:42:44 +0200
From: Igor Mammedov
To: "Dr. David Alan Gilbert"
Cc: Peter Maydell, David Hildenbrand, QEMU Developers, Peter Xu, Paolo Bonzini, Bin Meng, Philippe Mathieu-Daudé
Subject: Re: [PATCH] softmmu/physmem: Improve guest memory allocation failure error message
Message-ID: <20210824134244.39c199d2@redhat.com>

On Tue, 24 Aug 2021 09:37:54 +0100 "Dr.
David Alan Gilbert" wrote:

> * David Hildenbrand (david@redhat.com) wrote:
> > On 23.08.21 12:34, Philippe Mathieu-Daudé wrote:
> > > On 8/23/21 12:24 PM, David Hildenbrand wrote:
> > > > On 23.08.21 12:12, Philippe Mathieu-Daudé wrote:
> > > > > On 8/23/21 11:29 AM, David Hildenbrand wrote:
> > > > > > On 23.08.21 11:23, Peter Maydell wrote:
> > > > > > > On Mon, 23 Aug 2021 at 09:40, David Hildenbrand wrote:
> > > > > > > > Not opposed to printing the size, although I doubt that it will really
> > > > > > > > stop similar questions/problems getting raised.
> > > > > > >
> > > > > > > The case that triggered this was somebody thinking
> > > > > > > -m took a byte count, so it is very likely that an error message
> > > > > > > saying "you tried to allocate 38TB" would have made their
> > > > > > > mistake clear in a way that just "allocation failed" did not.
> > > > > > > It also means that if a future user asks us for help then
> > > > > > > we can look at the error message and immediately tell them
> > > > > > > the problem, rather than going "hmm, what are all the possible
> > > > > > > ways that allocation might have failed" and going off down
> > > > > > > rabbit holes like VM overcommit settings...
> > > > > >
> > > > > > We've had similar issues recently where Linux memory overcommit handling
> > > > > > rejected the allocation -- and the user was well aware of the actual
> > > > > > size. You won't be able to catch such reports, because people don't
> > > > > > understand how Linux memory overcommit handling works or was configured.
> > > > > >
> > > > > > "I have 3 GiB of free memory, why can't I create a 3 GiB VM". "I have 3
> > > > > > GiB of RAM, why can't I create a 3 GiB VM even if it won't make use of
> > > > > > all 3 GiB of memory".
> > > > > >
> > > > > > Thus my comment, it will only stop very basic usage issues.
> > > > > > And I agree
> > > > > > that looking at the error *might* help. It didn't help for the cases I
> > > > > > just described, because we need much more system information to make a
> > > > > > guess what the user error actually is.
> > > > >
> > > > > Is it possible to get the maximal overcommittable amount on Linux?
> > > >
> > > > Not reliably I think.
> > > >
> > > > In the "always" mode, there is none.
> > > >
> > > > In the "guess"/"estimate" mode, the kernel takes a guess (currently
> > > > implemented as checking if the mmap size <= total RAM + total SWAP).
> > > >     Committable = MemTotal + SwapTotal
> > > >
> > > > In the "never" mode:
> > > >     Committable = CommitLimit - Committed_AS
> > > > However, the value gets further reduced for !root applications by
> > > > /proc/sys/vm/admin_reserve_kbytes.
> > > >
> > > > Replicating these calculations in user space would be suboptimal IMHO.
> > >
> > > What about simply giving a hint about memory overcommit and displaying
> > > a link to documentation with a longer description about how to check
> > > and figure out this issue?
> >
> > That would be highly OS-specific -- for example, there is no memory
> > overcommit under Windows. Sure, we could add a Linux-specific hint,
> > indicating documentation. But I'm not sure if most end users stumbling into
> > such an error+hint would be able to make sense of memory overcommit details
> > (not to mention whether they even know what it is) :)
> >
> > You can run into memory allocation issues with many applications. Let me
> > give you a simple example:
> >
> > t480s: ~ $ dd if=/dev/zero of=/dev/null ibs=100G
> > dd: memory exhausted by input buffer of size 107374182400 bytes (100 GiB)
> >
> > So indicating the size of the failing allocation might be just good enough.
> > For the other parts it's usually just "the way the OS was configured, it
> > does not think it can allow this allocation".
>
> Does it also get complicated by the use of CGroup?

And if that's not complex enough, add NUMA node binding to the mix, which
further limits how much RAM can be allocated.

> Dave
>
> > --
> > Thanks,
> >
> > David / dhildenb
> >
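
For reference, the overcommit arithmetic quoted above can be written down
as a small Python sketch. It only mirrors the formulas as described in
this thread, not the kernel's exact logic, and it knows nothing about
cgroup limits or NUMA node binding. The function names parse_meminfo()
and committable_kib() are made up for illustration. The mode numbers are
the values of /proc/sys/vm/overcommit_memory (0 = "guess", 1 = "always",
2 = "never").

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kiB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def committable_kib(mode, meminfo, admin_reserve_kib=0, is_root=True):
    """Estimate the committable amount in kiB; None means "no limit".

    mode is the integer read from /proc/sys/vm/overcommit_memory.
    This follows the simplified description in the thread only.
    """
    if mode == 1:
        # "always": the kernel never refuses because of overcommit
        return None
    if mode == 0:
        # "guess": a single mapping is compared against RAM + swap
        return meminfo["MemTotal"] + meminfo["SwapTotal"]
    if mode == 2:
        # "never": strict accounting against the commit limit
        left = meminfo["CommitLimit"] - meminfo["Committed_AS"]
        if not is_root:
            # reduced further for !root by admin_reserve_kbytes
            left -= admin_reserve_kib
        return max(left, 0)
    raise ValueError("unknown overcommit mode: %r" % (mode,))
```

On a Linux host the inputs would come from /proc/meminfo,
/proc/sys/vm/overcommit_memory and /proc/sys/vm/admin_reserve_kbytes.
Even then the result is only a hint, which is why replicating this in
QEMU itself looks unattractive.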