From: KY Srinivasan <kys@microsoft.com>
To: Jon Stanley <jonstanley@gmail.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
"wei.liu@kernel.org" <wei.liu@kernel.org>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Subject: RE: [EXTERNAL] hv_balloon issues??
Date: Mon, 25 Jan 2021 20:51:35 +0000
Message-ID: <MWHPR2101MB0874A5A65BD03A5FB7857668A0BD9@MWHPR2101MB0874.namprd21.prod.outlook.com>
In-Reply-To: <CALY6xngo6fU7NoEgrmP_qtdz4OMQgKo9CiJno2uhtWie0ze3Rw@mail.gmail.com>
I take it that this is on a Windows Server machine. What are the Dynamic Memory settings for the VM in question?
K. Y
> -----Original Message-----
> From: Jon Stanley <jonstanley@gmail.com>
> Sent: Monday, January 25, 2021 12:20 PM
> To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> Stephen Hemminger <sthemmin@microsoft.com>; wei.liu@kernel.org;
> linux-hyperv@vger.kernel.org
> Subject: [EXTERNAL] hv_balloon issues??
>
> I'm working to make a method to install bare-metal machines with Packer
> images, and in testing (this isn't going to wind up in production on Hyper-V) I
> think I've found an issue in hv_balloon, but I'm not sure.
>
> Starting from a RHEL 8 live CD, I make a tmpfs filesystem and download a disk
> image to it. Despite having plenty of memory to do this (I was downloading a
> 5GB image onto a VM with 16GB of RAM), I got paid a visit by the OOM killer.
>
> If I turn off dynamic memory, then things work as expected. This isn't 100%
> reproducible: I tried immediately after boot and it worked. I then unmounted
> the tmpfs filesystem, waited for a kernel message saying the balloon floor
> was reached, tried again, and BOOM!
>
> The actual process that is filling the filesystem (curl) doesn't get killed (which
> makes sense I guess, since *it* isn't taking a ton of memory), and also never
> completes, presumably due to its I/O becoming blocked. Could this be caused by
> a sudden, enormous demand for memory that the hypervisor is having difficulty
> fulfilling?
> The host has plenty of memory available (63GB right now).
>
> On another note, is there a way that I'm not seeing to tell the current status of
> the balloon driver - i.e. current/max allocations? A quick look through /proc
> and /sys wasn't revealing.
>
> Also, sorry to be using a distro kernel instead of upstream.
>
> -Jon
>
> Jan 25 14:58:43 dhcp-132.rmrf.net kernel: hv_balloon: Balloon request will be partially fulfilled. Balloon floor reached.
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: tuned invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: CPU: 0 PID: 1165 Comm: tuned Not tainted 4.18.0-240.10.1.el8_3.x86_64 #1
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 11/01/2019
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Call Trace:
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: dump_stack+0x5c/0x80
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: dump_header+0x51/0x308
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: oom_kill_process.cold.28+0xb/0x10
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: out_of_memory+0x1c1/0x4b0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: __alloc_pages_slowpath+0xc24/0xd40
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: __alloc_pages_nodemask+0x245/0x280
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: filemap_fault+0x3b8/0x840
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? hrtimer_cancel+0x11/0x20
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? futex_wait+0x19a/0x210
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? xas_load+0x8/0x80
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? xas_find+0x173/0x1b0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? filemap_map_pages+0x1a3/0x380
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ext4_filemap_fault+0x2c/0x40 [ext4]
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: __do_fault+0x38/0xc0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: do_fault+0x191/0x3c0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: __handle_mm_fault+0x3e6/0x7c0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: handle_mm_fault+0xc2/0x1d0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: __do_page_fault+0x21b/0x4d0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: do_page_fault+0x32/0x110
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: ? page_fault+0x8/0x30
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: page_fault+0x1e/0x30
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RIP: 0033:0x7faf2f8c5df2
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Code: Bad RIP value.
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RSP: 002b:00007faf242629a0 EFLAGS: 00010246
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RAX: ffffffffffffff92 RBX: 00007faf24262a40 RCX: 00007faf2f8c5df2
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007faf1c002490
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: RBP: 00007faf1c002490 R08: 0000000000000000 R09: 00000000ffffffff
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: R10: 00007faf24262a40 R11: 0000000000000246 R12: 0000000000000000
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: R13: 0000000000000000 R14: 00007faf24262a40 R15: 000000003b9aca00
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Mem-Info:
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: active_anon:18180 inactive_anon:738744 isolated_anon:0
> active_file:18 inactive_file:337 isolated_file:32
> unevictable:132114 dirty:0 writeback:0 unstable:0
> slab_reclaimable:6250 slab_unreclaimable:5966
> mapped:1626 shmem:738916 pagetables:1396 bounce:0
> free:31759 free_pcp:30 free_cma:0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 active_anon:72720kB inactive_anon:2954976kB active_file:72kB inactive_file:1348kB unevictable:528456kB isolated(anon):0kB i>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA free:15908kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictabl>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: lowmem_reserve[]: 0 3845 15960 15960 15960
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA32 free:64676kB min:16264kB low:20328kB high:24392kB active_anon:1424kB inactive_anon:2489752kB active_file:28kB inactiv>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: lowmem_reserve[]: 0 0 12114 12114 12114
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 Normal free:46452kB min:51248kB low:64060kB high:76872kB active_anon:71296kB inactive_anon:465224kB active_file:4kB inactiv>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: lowmem_reserve[]: 0 0 0 0 0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = >
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 DMA32: 29*4kB (UE) 36*8kB (UE) 33*16kB (UME) 6*32kB (UE) 3*64kB (UME) 1*128kB (U) 3*256kB (UME) 2*512kB (UM) 2*1024kB (U) 3>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 Normal: 833*4kB (UME) 712*8kB (UME) 305*16kB (UME) 152*32kB (UME) 52*64kB (E) 28*128kB (UME) 15*256kB (UME) 11*512kB (UME) >
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 871413 total pagecache pages
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 0 pages in swap cache
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Swap cache stats: add 0, delete 0, find 0/0
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Free swap = 0kB
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Total swap = 0kB
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 4194027 pages RAM
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 0 pages HighMem/MovableOnly
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 91830 pages reserved
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: 0 pages hwpoisoned
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 762] 0 762 27626 1788 290816 0 0 systemd-journal
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 816] 0 816 25338 353 212992 0 -1000 systemd-udevd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 819] 0 819 15287 152 135168 0 -1000 auditd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 860] 81 860 14087 213 155648 0 -900 dbus-daemon
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 875] 995 875 29968 111 147456 0 0 chronyd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 907] 0 907 48443 510 405504 0 0 sssd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 908] 997 908 404961 1915 331776 0 0 polkitd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 913] 0 913 1085 16 53248 0 0 hypervvssd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 914] 994 914 40028 204 208896 0 0 rngd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 921] 0 921 50484 659 421888 0 0 sssd_be
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 922] 0 922 53956 395 462848 0 0 sssd_nss
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 925] 0 925 74573 5478 466944 0 0 firewalld
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 926] 0 926 24290 252 204800 0 0 systemd-logind
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 940] 0 940 116867 614 389120 0 0 NetworkManager
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 958] 0 958 23072 224 212992 0 -1000 sshd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 968] 0 968 1778 30 61440 0 0 hypervkvpd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 969] 0 969 106589 3721 450560 0 0 tuned
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 972] 0 972 9232 221 106496 0 0 crond
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 973] 0 973 10449 135 114688 0 0 rhsmcertd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1189] 0 1189 56455 509 192512 0 0 rsyslogd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1201] 0 1201 30749 215 266240 0 0 login
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1206] 0 1206 23443 331 225280 0 0 systemd
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1210] 0 1210 37531 648 299008 0 0 (sd-pam)
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1216] 0 1216 6554 154 86016 0 0 bash
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: [ 1285] 0 1285 20229 245 196608 0 0 curl
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/firewalld.service,>
> Jan 25 14:59:30 dhcp-132.rmrf.net kernel: Out of memory: Killed process 925 (firewalld) total-vm:298292kB, anon-rss:21912kB, file-rss:0kB, shmem-rss:0kB, UID:0
> Jan 25 14:59:34 dhcp-132.rmrf.net systemd[1]: firewalld.service: Main process exited, code=killed, status=9/KILL
> Jan 25 14:59:47 dhcp-132.rmrf.net systemd[1]: firewalld.service: Failed with result 'signal'.
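
For readers working through the quoted Mem-Info block, the page counters are easier to compare once converted to GiB. The sketch below is only an illustration, not part of the original exchange: the 4 kB page size is inferred from the log itself (active_anon:18180 pages is also reported as 72720kB), and the names PAGE_KB, counts, and pages_to_gib are made up for this example. It shows that the guest reports roughly 16 GiB of RAM while shmem (the tmpfs data) accounts for under 3 GiB at the time of the OOM kill. Where the remaining memory went is not stated in the log; attributing it to the balloon would be an assumption.

```python
# Page size inferred from the log: 18180 pages of active_anon == 72720kB.
PAGE_KB = 4

# Counters copied from the quoted Mem-Info block (all values are in pages).
counts = {
    "pages RAM": 4194027,
    "pages reserved": 91830,
    "shmem": 738916,
    "inactive_anon": 738744,
    "unevictable": 132114,
    "free": 31759,
}


def pages_to_gib(pages, page_kb=PAGE_KB):
    """Convert a page count to GiB (1 GiB = 1024 * 1024 kB)."""
    return pages * page_kb / (1024 * 1024)


for name, pages in counts.items():
    print(f"{name:>15}: {pages:>8} pages = {pages_to_gib(pages):6.2f} GiB")
```

The same conversion can be applied to any other counter in the dump, as long as it is one of the per-page values rather than one of the per-node lines that are already in kB.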