All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@hammerspace.com>
To: "djwong@kernel.org" <djwong@kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"hch@infradead.org" <hch@infradead.org>,
	"trondmy@kernel.org" <trondmy@kernel.org>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()
Date: Thu, 30 Dec 2021 22:25:14 +0000	[thread overview]
Message-ID: <a42acb06152b0ecba3e99aec38349e1f29304b1e.camel@hammerspace.com> (raw)
In-Reply-To: <05bac8cb-e36a-b043-5ac3-82c585f76bbe@kernel.dk>

On Thu, 2021-12-30 at 13:24 -0800, Jens Axboe wrote:
> On 12/30/21 11:35 AM, trondmy@kernel.org wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > 
> > We're observing the following stack trace using various kernels
> > when
> > running in the Azure cloud.
> > 
> >  watchdog: BUG: soft lockup - CPU#12 stuck for 23s!
> > [kworker/12:1:3106]
> >  Modules linked in: raid0 ipt_MASQUERADE nf_conntrack_netlink
> > xt_addrtype nft_chain_nat nf_nat br_netfilter bridge stp llc ext4
> > mbcache jbd2 overlay xt_conntrack nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nft_counter rpcrdma rdma_ucm xt_owner ib_srpt
> > nft_compat intel_rapl_msr ib_isert intel_rapl_common nf_tables
> > iscsi_target_mod isst_if_mbox_msr isst_if_common nfnetlink
> > target_core_mod nfit ib_iser libnvdimm libiscsi
> > scsi_transport_iscsi ib_umad kvm_intel ib_ipoib rdma_cm iw_cm vfat
> > ib_cm fat kvm irqbypass crct10dif_pclmul crc32_pclmul mlx5_ib
> > ghash_clmulni_intel rapl ib_uverbs ib_core i2c_piix4 pcspkr
> > hyperv_fb hv_balloon hv_utils joydev nfsd auth_rpcgss nfs_acl lockd
> > grace sunrpc ip_tables xfs libcrc32c mlx5_core mlxfw tls pci_hyperv
> > pci_hyperv_intf sd_mod t10_pi sg ata_generic hv_storvsc hv_netvsc
> > scsi_transport_fc hyperv_keyboard hid_hyperv ata_piix libata
> > crc32c_intel hv_vmbus serio_raw fuse
> >  CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-
> > 305.10.2.el8_4.x86_64 #1
> >  Hardware name: Microsoft Corporation Virtual Machine/Virtual
> > Machine, BIOS 090008  12/07/2018
> >  Workqueue: xfs-conv/md127 xfs_end_io [xfs]
> >  RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
> >  Code: 7c ff 48 29 e8 4c 39 e0 76 cf 80 0b 08 eb 8c 90 90 90 90 90
> > 90 90 90 90 90 0f 1f 44 00 00 e8 e6 db 7e ff 66 90 48 89 f7 57 9d
> > <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 07
> >  RSP: 0018:ffffac51d26dfd18 EFLAGS: 00000202 ORIG_RAX:
> > ffffffffffffff12
> >  RAX: 0000000000000001 RBX: ffffffff980085a0 RCX: dead000000000200
> >  RDX: ffffac51d3893c40 RSI: 0000000000000202 RDI: 0000000000000202
> >  RBP: 0000000000000202 R08: ffffac51d3893c40 R09: 0000000000000000
> >  R10: 00000000000000b9 R11: 00000000000004b3 R12: 0000000000000a20
> >  R13: ffffd228f3e5a200 R14: ffff963cf7f58d10 R15: ffffd228f3e5a200
> >  FS:  0000000000000000(0000) GS:ffff9625bfb00000(0000)
> > knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 00007f5035487500 CR3: 0000000432810004 CR4: 00000000003706e0
> >  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >  Call Trace:
> >   wake_up_page_bit+0x8a/0x110
> >   iomap_finish_ioend+0xd7/0x1c0
> >   iomap_finish_ioends+0x7f/0xb0
> >   xfs_end_ioend+0x6b/0x100 [xfs]
> >   ? xfs_setfilesize_ioend+0x60/0x60 [xfs]
> >   xfs_end_io+0xb9/0xe0 [xfs]
> >   process_one_work+0x1a7/0x360
> >   worker_thread+0x1fa/0x390
> >   ? create_worker+0x1a0/0x1a0
> >   kthread+0x116/0x130
> >   ? kthread_flush_work_fn+0x10/0x10
> >   ret_from_fork+0x35/0x40
> > 
> > Jens suggested adding a latency-reducing cond_resched() to the loop
> > in
> > iomap_finish_ioends().
> 
> The patch doesn't add it there though, I was suggesting:
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 71a36ae120ee..4ad2436a936a 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1078,6 +1078,7 @@ iomap_finish_ioends(struct iomap_ioend *ioend,
> int error)
>                 ioend = list_first_entry(&tmp, struct iomap_ioend,
> io_list);
>                 list_del_init(&ioend->io_list);
>                 iomap_finish_ioend(ioend, error);
> +               cond_resched();
>         }
>  }
>  EXPORT_SYMBOL_GPL(iomap_finish_ioends);
> 
> as I don't think you need it once-per-vec. But not sure if you tested
> that variant or not...
> 

Yes, we did test that variant, but were still seeing the soft lockups
on Azure, hence why I moved it into the inner loop.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



  reply	other threads:[~2021-12-30 22:25 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-30 19:35 [PATCH] iomap: Address soft lockup in iomap_finish_ioend() trondmy
2021-12-30 21:24 ` Jens Axboe
2021-12-30 22:25   ` Trond Myklebust [this message]
2021-12-30 22:27     ` Jens Axboe
2021-12-30 22:55       ` Trond Myklebust
2021-12-31  1:42 ` Matthew Wilcox
2021-12-31  6:16   ` Trond Myklebust
2022-01-01  3:55     ` Dave Chinner
2022-01-01 17:39       ` Trond Myklebust
2022-01-03 22:03         ` Dave Chinner
2022-01-04  0:04           ` Trond Myklebust
2022-01-04  1:22             ` Dave Chinner
2022-01-04  3:01               ` Trond Myklebust
2022-01-04  7:08               ` hch
2022-01-04 18:08                 ` Matthew Wilcox
2022-01-04 18:14                   ` hch
2022-01-04 19:22                     ` Darrick J. Wong
2022-01-04 21:52                       ` Dave Chinner
2022-01-04 23:12                         ` Darrick J. Wong
2022-01-05  2:10                           ` Dave Chinner
2022-01-05 13:56                             ` Brian Foster
2022-01-05 22:04                               ` Dave Chinner
2022-01-06 16:44                                 ` Brian Foster
2022-01-10  8:18                                   ` Dave Chinner
2022-01-10 17:45                                     ` Brian Foster
2022-01-10 18:11                                       ` hch
2022-01-11 14:33                                       ` Trond Myklebust
2022-01-05 13:42                           ` hch
2022-01-04 21:16                 ` Dave Chinner
2022-01-05 13:43                   ` hch
2022-01-05 22:34                     ` Dave Chinner
2022-01-05  2:09               ` Trond Myklebust
2022-01-05 20:45                 ` Trond Myklebust
2022-01-05 22:48                   ` Dave Chinner
2022-01-05 23:29                     ` Trond Myklebust
2022-01-06  0:01                     ` Darrick J. Wong
2022-01-09 23:09                       ` Dave Chinner
2022-01-06 18:36                     ` Trond Myklebust
2022-01-06 18:38                       ` Trond Myklebust
2022-01-06 20:07                       ` Brian Foster
2022-01-07  3:08                         ` Trond Myklebust
2022-01-07 15:15                           ` Brian Foster
2022-01-09 23:34                       ` Dave Chinner
2022-01-10 23:37                       ` Dave Chinner
2022-01-11  0:08                         ` Dave Chinner
2022-01-13 17:01                         ` Trond Myklebust
2022-01-17 17:24                           ` Trond Myklebust
2022-01-17 17:36                             ` Darrick J. Wong
2022-01-04 13:36         ` Brian Foster
2022-01-04 19:23           ` Darrick J. Wong
2022-01-05  2:31 ` [iomap] f5934dda54: BUG:sleeping_function_called_from_invalid_context_at_fs/iomap/buffered-io.c kernel test robot
2022-01-05  2:31   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a42acb06152b0ecba3e99aec38349e1f29304b1e.camel@hammerspace.com \
    --to=trondmy@hammerspace.com \
    --cc=axboe@kernel.dk \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.