All of lore.kernel.org
 help / color / mirror / Atom feed
From: fruggeri@arista.com (Francesco Ruggeri)
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	amir73il@gmail.com, jack@suse.cz, fruggeri@arista.com
Subject: soft lockup in fanotify_read
Date: Mon, 13 Jul 2020 19:54:17 -0700	[thread overview]
Message-ID: <20200714025417.A25EB95C0339@us180.sjc.aristanetworks.com> (raw)

We are getting this soft lockup in fanotify_read.
The reason is that this code does not seem to scale to cases where there
are big bursts of events generated by fanotify_handle_event.
fanotify_read acquires group->notification_lock for each event.
fanotify_handle_event uses the lock to add one event, which also involves
fanotify_merge, which scans the whole list trying to find an event to
merge the new one with.
In our case fanotify_read is invoked with a buffer big enough for 200
events, and what happens is that every time fanotify_read dequeues an
event and releases the lock, fanotify_handle_event adds several more,
scanning a longer and longer list. This causes fanotify_read to wait
longer and longer for the lock, and the soft lockup happens before
fanotify_read can reach 200 events.
Is it intentional for fanotify_read to acquire the lock for each event,
rather than batching together a user buffer worth of events?

Thanks,
Francesco Ruggeri

[5752801.578813] watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [fstrace:23105]
[5752801.586804] Modules linked in: ...
[5752801.586871] CPU: 15 PID: 23105 Comm: fstrace Tainted: G        W  O L    4.19.112-16802951.AroraKernel419.el7.x86_64 #1
[5752801.586872] Hardware name: Supermicro SYS-6029TP-HTR/X11DPT-PS, BIOS 3.1 04/30/2019
[5752801.586879] RIP: 0010:queued_spin_lock_slowpath+0x141/0x17c
[5752801.586881] Code: 5e c1 e9 12 83 e0 03 ff c9 48 c1 e0 04 48 63 c9 48 05 c0 14 02 00 48 03 04 cd 20 e7 e9 81 48 89 10 8b 42 08 85 c0 75 04 f3 90 <eb> f5 48 8b 0a 48 85 c9 74 03 0f 0d 09 8b 07 66 85 c0 74 93 f3 90
[5752801.586882] RSP: 0018:ffffc90027c5fd18 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[5752801.586883] RAX: 0000000000000000 RBX: ffff8883a9078ce8 RCX: 000000000000001e
[5752801.586884] RDX: ffff88df7f6e14c0 RSI: 0000000000400000 RDI: ffff88b06306e30c
[5752801.586885] RBP: ffffc90027c5fd18 R08: 0000000000000001 R09: ffffffff811e8b7e
[5752801.586886] R10: ffffc90027c5fce0 R11: 0000000000000000 R12: ffff88b06306e300
[5752801.586887] R13: 00000000ff935638 R14: ffff88b4af422600 R15: 0000000000000018
[5752801.586888] FS:  0000000000000000(0000) GS:ffff88df7f6c0000(0063) knlGS:0000000009c6a880
[5752801.586889] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[5752801.586890] CR2: 0000000009430914 CR3: 000000035bbef006 CR4: 00000000007606e0
[5752801.586891] DR0: 000000000001864c DR1: 0000000000000000 DR2: 0000000000000000
[5752801.586891] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[5752801.586892] PKRU: 55555554
[5752801.586892] Call Trace:
[5752801.586898]  do_raw_spin_lock+0x1a/0x1c
[5752801.586901]  _raw_spin_lock+0xe/0x10
[5752801.586905]  fanotify_read+0xa3/0x32d
[5752801.586912]  ? __wake_up_sync+0x12/0x12
[5752801.586916]  copy_oldmem_page+0xa9/0xa9
[5752801.586920]  ? fsnotify_perm+0x60/0x6c
[5752801.586921]  ? fsnotify_perm+0x60/0x6c
[5752801.586923]  ? security_file_permission+0x37/0x3e
[5752801.586926]  vfs_read+0xa4/0xdc
[5752801.586928]  ksys_read+0x64/0xab
[5752801.586930]  __ia32_sys_read+0x18/0x1a
[5752801.586933]  do_fast_syscall_32+0xaf/0xf6
[5752801.586935]  entry_SYSENTER_compat+0x6b/0x7a
[5752801.586936] RIP: 0023:0xf7f64c29
[5752801.586938] Code: 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 1c 24 c3 8b 34 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
[5752801.586939] RSP: 002b:00000000ff933828 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[5752801.586940] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00000000ff9348d0
[5752801.586941] RDX: 00000000000012c0 RSI: 0000000009c72da8 RDI: 00000000ff9348d0
[5752801.586941] RBP: 00000000ff935ba8 R08: 0000000000000000 R09: 0000000000000000
[5752801.586942] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[5752801.586942] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000


             reply	other threads:[~2020-07-14  2:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14  2:54 Francesco Ruggeri [this message]
2020-07-14 13:10 ` soft lockup in fanotify_read Amir Goldstein
2020-07-14 13:14 ` Jan Kara
2020-07-14 16:57   ` Francesco Ruggeri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200714025417.A25EB95C0339@us180.sjc.aristanetworks.com \
    --to=fruggeri@arista.com \
    --cc=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.