From: Dan Schatzberg <schatzberg.dan@gmail.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: 0day robot <lkp@intel.com>, LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, Jens Axboe <axboe@kernel.dk>,
	Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Shakeel Butt <shakeelb@google.com>, Roman Gushchin <guro@fb.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Alex Shi <alex.shi@linux.alibaba.com>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Chris Down <chris@chrisdown.name>,
	Yafang Shao <laoar.shao@gmail.com>,
	Wei Yang <richard.weiyang@gmail.com>
Subject: Re: [loop]  eaba742710: WARNING:at_kernel/workqueue.c:#check_flush_dependency
Date: Mon, 22 Mar 2021 09:47:25 -0400
Message-ID: <YFif7fEDAt6eaHDC@dschatzberg-fedora-PC0Y6AEN.dhcp.thefacebook.com>
In-Reply-To: <20210322060334.GD32426@xsang-OptiPlex-9020>

On Mon, Mar 22, 2021 at 02:03:34PM +0800, kernel test robot wrote:
> 
> 
> Greeting,
> 
> FYI, we noticed the following commit (built with gcc-9):
> 
> commit: eaba7427107045752f7454f94a40839c0880cf02 ("[PATCH 1/3] loop: Use worker per cgroup instead of kworker")
> url: https://github.com/0day-ci/linux/commits/Dan-Schatzberg/Charge-loop-device-i-o-to-issuing-cgroup/20210316-233842
> base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
> 
> in testcase: xfstests
> version: xfstests-x86_64-73c0871-1_20210318
> with following parameters:
> 
> 	disk: 4HDD
> 	fs: xfs
> 	test: generic-group-18
> 	ucode: 0xe2
> 
> test-description: xfstests is a regression test suite for xfs and other filesystems.
> test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
> 
> 
> on test machine: 4 threads Intel(R) Xeon(R) CPU E3-1225 v5 @ 3.30GHz with 16G memory
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> ... 
> [   50.428387] WARNING: CPU: 0 PID: 35 at kernel/workqueue.c:2613 check_flush_dependency (kbuild/src/consumer/kernel/workqueue.c:2613 (discriminator 9)) 
> [   50.450013] Modules linked in: loop xfs dm_mod btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c sd_mod t10_pi sg ipmi_devintf ipmi_msghandler intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal i915 intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul hp_wmi sparse_keymap intel_gtt crc32c_intel ghash_clmulni_intel mei_wdt rfkill wmi_bmof rapl drm_kms_helper ahci intel_cstate syscopyarea mei_me libahci sysfillrect sysimgblt fb_sys_fops intel_uncore serio_raw mei drm libata intel_pch_thermal ie31200_edac wmi video tpm_infineon intel_pmc_core acpi_pad ip_tables
> [   50.500731] CPU: 0 PID: 35 Comm: kworker/u8:3 Not tainted 5.12.0-rc2-00093-geaba74271070 #1
> [   50.509081] Hardware name: HP HP Z238 Microtower Workstation/8183, BIOS N51 Ver. 01.63 10/05/2017
> [   50.517963] Workqueue: loop0 loop_rootcg_workfn [loop]
> [   50.523109] RIP: 0010:check_flush_dependency (kbuild/src/consumer/kernel/workqueue.c:2613 (discriminator 9))
> ...
> [   50.625837] __flush_work (kbuild/src/consumer/kernel/workqueue.c:2669 kbuild/src/consumer/kernel/workqueue.c:3011 kbuild/src/consumer/kernel/workqueue.c:3051) 
> [   50.629418] ? __queue_work (kbuild/src/consumer/arch/x86/include/asm/paravirt.h:559 kbuild/src/consumer/arch/x86/include/asm/qspinlock.h:56 kbuild/src/consumer/include/linux/spinlock.h:212 kbuild/src/consumer/include/linux/spinlock_api_smp.h:151 kbuild/src/consumer/kernel/workqueue.c:1500) 
> [   50.633261] xfs_file_buffered_write (kbuild/src/consumer/fs/xfs/xfs_file.c:761) xfs
> [   50.638468] do_iter_readv_writev (kbuild/src/consumer/fs/read_write.c:741) 
> [   50.642833] do_iter_write (kbuild/src/consumer/fs/read_write.c:866 kbuild/src/consumer/fs/read_write.c:847) 
> [   50.646513] lo_write_bvec (kbuild/src/consumer/include/linux/fs.h:2903 kbuild/src/consumer/drivers/block/loop.c:286) loop
> [   50.650804] loop_process_work (kbuild/src/consumer/drivers/block/loop.c:307 kbuild/src/consumer/drivers/block/loop.c:630 kbuild/src/consumer/drivers/block/loop.c:2129 kbuild/src/consumer/drivers/block/loop.c:2161) loop
> [   50.655543] ? newidle_balance (kbuild/src/consumer/kernel/sched/fair.c:10635) 
> [   50.659647] process_one_work (kbuild/src/consumer/arch/x86/include/asm/jump_label.h:25 kbuild/src/consumer/include/linux/jump_label.h:200 kbuild/src/consumer/include/trace/events/workqueue.h:108 kbuild/src/consumer/kernel/workqueue.c:2280) 
> [   50.663696] worker_thread (kbuild/src/consumer/include/linux/list.h:282 kbuild/src/consumer/kernel/workqueue.c:2422) 
> [   50.667365] ? process_one_work (kbuild/src/consumer/kernel/workqueue.c:2364) 
> [   50.671568] kthread (kbuild/src/consumer/kernel/kthread.c:292) 
> [   50.674813] ? kthread_park (kbuild/src/consumer/kernel/kthread.c:245) 
> [   50.678476] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_64.S:300) 

My understanding is that this warning fires because the loop
workqueue sets WQ_MEM_RECLAIM but the XFS workqueue (m_sync_workqueue)
does not. I believe WQ_MEM_RECLAIM on the loop device is
sensible because reclaim may flush dirty writes through the loop
device. I'm not familiar with XFS, and it's not clear to me why
m_sync_workqueue (flushed from xfs_flush_inodes) wouldn't have the
same reclaim dependency. I'll keep digging, but if anyone has
insights, please let me know.
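
To make the condition concrete, here is a minimal userspace sketch of
the rule as I understand it: a worker running on a WQ_MEM_RECLAIM
workqueue that flushes a workqueue without WQ_MEM_RECLAIM trips the
warning. This is only a model. The struct, the flag value, and the
function name (model_wq, MODEL_WQ_MEM_RECLAIM, model_flush_would_warn)
are hypothetical, and the real check_flush_dependency() covers more
cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit for this model; not the kernel's definition. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Return true if a flush of `target` issued from a worker of `flusher`
 * would trip the dependency warning in this simplified model.
 * `flusher` is NULL when the caller is not a workqueue worker.
 */
static bool model_flush_would_warn(const struct model_wq *flusher,
				   const struct model_wq *target)
{
	assert(target != NULL);

	/* A target with a reclaim guarantee is always safe to wait on. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return false;

	/* A reclaim-safe worker waiting on a non-reclaim-safe queue. */
	return flusher != NULL && (flusher->flags & MODEL_WQ_MEM_RECLAIM);
}
```

In this model, the case from the trace above is a flusher with the flag
set (loop0) and a target without it (m_sync_workqueue), which returns
true. Reversing the roles, or flushing from a non-worker context,
returns false.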
