From: Baoquan He
Date: Thu, 14 Apr 2022 18:32:28 +0800
To: Omar Sandoval
Cc: Chris Down, linux-mm@kvack.org, kexec@lists.infradead.org, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, x86@kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2] mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore
References: <52f819991051f9b865e9ce25605509bfdbacadcd.1649277321.git.osandov@fb.com>

On 04/13/22 at 09:24am, Omar Sandoval wrote:
> On Fri, Apr 08, 2022 at 11:02:47AM +0800, Baoquan He wrote:
......
> > Since redhat mail server has issue, the body content of patch is empty
> > from my mail client. So reply here to add comment.
> >
> > As replied in v1 to Omar, I think this is a great fix. That would be
> > also great to state if this is a real issue which is breaking thing,
> > then add 'Fixes' tag and Cc stable like "Cc: # 5.17",
> > or a fantastic improvement from code inspecting.
> >
> > Concern this because in distros, e.g in our rhel8, we maintain old kernel
> > and back port necessary patches into the kernel, those patches with
> > 'Fixes' tag definitely are good candidate. This is important too to LTS
> > kernel.
> >
> > Thanks
> > Baoquan
>
> Hi, Baoquan,
>
> Sorry I missed your replies. I'll answer your questions from your first
> email.
>
> > I am wondering if this is a real issue you met, or you just found it
> > by code inspecting
>
> I hit this issue with the test suite for drgn
> (https://github.com/osandov/drgn). We run the test cases in a virtual
> machine on various kernel versions
> (https://github.com/osandov/drgn/tree/main/vmtest). Part of the test
> suite crashes the kernel to run some tests against /proc/vmcore
> (https://github.com/osandov/drgn/blob/13144eda119790cdbc11f360c15a04efdf81ae9a/setup.py#L213,
> https://github.com/osandov/drgn/blob/main/vmtest/enter_kdump.py,
> https://github.com/osandov/drgn/tree/main/tests/linux_kernel/vmcore).
> When I tried v5.18-rc1 configured with !SMP and !PREEMPT, that part of
> the test suite got stuck, which is how I found this issue.
>
> > I am wondering how your vmcore dumping is handled. Asking this because
> > we usually use makedumpfile utility
>
> In production at Facebook, we don't run drgn directly against
> /proc/vmcore. We use makedumpfile and inspect the captured file with
> drgn once we reboot.
>
> > While using makedumpfile, we use mmap which is 4M at one time by
> > default, then process the content. So the copy_oldmem_page() may only
> > be called during elfcorehdr and notes reading.
>
> We also use vmcore-dmesg
> (https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/vmcore-dmesg)
> on /proc/vmcore before calling makedumpfile. From what I can tell, that
> uses read()/pread()
> (https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/util_lib/elf_info.c),
> so it would also hit this issue.

Thanks for these details and the great patch. The situation and
motivation are clear to me now. We also use vmcore-dmesg to collect the
dmesg log before running makedumpfile. Hitting this there may be a
low-probability event, but it is worth adding a Fixes tag just in case.

>
> I'll send a v3 adding Fixes: 690467c81b1a ("mm/vmalloc: Move draining
> areas out of caller context"). I don't think a stable tag is necessary
> since this was introduced in v5.18-rc1 and hasn't been backported as far
> as I can tell.
>
> Thanks,
> Omar
>
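
To illustrate the distinction discussed in this thread between the read()/pread() path (vmcore-dmesg) and the mmap path (makedumpfile), here is a minimal userspace sketch. It is not taken from either tool. The helper names read_with_pread and read_with_mmap and the CHUNK constant are made up for illustration, and the demo runs against an ordinary temporary file, since /proc/vmcore exists only in a kdump kernel. Both helpers return the same bytes; per the thread, on /proc/vmcore only the pread-style path would go through copy_oldmem_page() for the dump data.

```python
import mmap
import os
import tempfile

# Hypothetical window size; 4 MiB mirrors the mmap size quoted in the thread.
CHUNK = 4 * 1024 * 1024


def read_with_pread(path, offset, length):
    """Copy up to `length` bytes starting at `offset` using pread(2)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while length > 0:
            data = os.pread(fd, min(length, CHUNK), offset)
            if not data:  # reached end of file
                break
            parts.append(data)
            offset += len(data)
            length -= len(data)
        return b"".join(parts)
    finally:
        os.close(fd)


def read_with_mmap(path, offset, length):
    """Return the same bytes by mapping a window of the file read-only."""
    size = os.path.getsize(path)
    length = max(0, min(length, size - offset))  # clamp to the file size
    if length == 0:
        return b""
    # The mmap offset must be a multiple of the allocation granularity,
    # so map from the aligned base and slice out the requested range.
    gran = mmap.ALLOCATIONGRANULARITY
    base = offset - (offset % gran)
    delta = offset - base
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=base) as m:
            return m[delta:delta + length]


if __name__ == "__main__":
    # Stand-in for a dump file: an ordinary temporary file with known data.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 1024)
        path = tmp.name
    try:
        same = read_with_pread(path, 5000, 70000) == read_with_mmap(path, 5000, 70000)
        print("pread and mmap returned identical data:", same)
    finally:
        os.unlink(path)
```

On a regular file the two paths are interchangeable, which is the point of the sketch: the difference raised in the thread lies in which kernel code services each access on /proc/vmcore, not in the data userspace sees.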