* 9pfs hangs since 4.7
@ 2016-11-24 19:50 Tuomas Tynkkynen
  2016-11-29 16:39 ` Eric Van Hensbergen
  2017-01-02  8:20 ` Tuomas Tynkkynen
  0 siblings, 2 replies; 19+ messages in thread
From: Tuomas Tynkkynen @ 2016-11-24 19:50 UTC (permalink / raw)
  To: v9fs-developer, linux-fsdevel, linux-kernel

Hi fsdevel,

I have been observing hangs when running xfstests generic/224. Curiously
enough, the test is *not* causing problems on the FS under test (I've
tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
using as the root filesystem to crap out.

How it shows up is that the test doesn't finish in time (usually
takes ~50 sec) but the hung task detector triggers for some task in
d_alloc_parallel():

[  660.701646] INFO: task 224:7800 blocked for more than 300 seconds.
[  660.702756]       Not tainted 4.9.0-rc5 #1-NixOS
[  660.703232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  660.703927] 224             D    0  7800    549 0x00000000
[  660.704501]  ffff8a82ec022800 0000000000000000 ffff8a82fc03c800 ffff8a82ff217dc0
[  660.705302]  ffff8a82d0f88c00 ffffa94a41a27b88 ffffffffaeb4ad1d ffffa94a41a27b78
[  660.706125]  ffffffffae800fc6 ffff8a82fbd90f08 ffff8a82d0f88c00 ffff8a82fbfd5418
[  660.706924] Call Trace:
[  660.707185]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[  660.707751]  [<ffffffffae800fc6>] ? __d_alloc+0x126/0x1e0
[  660.708304]  [<ffffffffaeb4b206>] schedule+0x36/0x80
[  660.708841]  [<ffffffffae801937>] d_alloc_parallel+0x3a7/0x480
[  660.709454]  [<ffffffffae697800>] ? wake_up_q+0x70/0x70
[  660.710007]  [<ffffffffae7f37e3>] lookup_slow+0x73/0x140
[  660.710572]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
[  660.711167]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
[  660.711747]  [<ffffffffae808ca4>] ? mntput+0x24/0x40
[  660.716962]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
[  660.717581]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
[  660.718194]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
[  660.719037]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
[  660.719801]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
[  660.720492]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
[  660.721206]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
[  660.721980]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
[  660.722732]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
[  660.723702]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

SysRq-T is full of things stuck inside p9_client_rpc like:

[  271.703598] bash            S    0   100     96 0x00000000
[  271.703968]  ffff8a82ff824800 0000000000000000 ffff8a82faee4800 ffff8a82ff217dc0
[  271.704486]  ffff8a82fb946c00 ffffa94a404ebae8 ffffffffaeb4ad1d ffff8a82fb9fc058
[  271.705024]  ffffa94a404ebb10 ffffffffae8f21f9 ffff8a82fb946c00 ffff8a82fbbba000
[  271.705542] Call Trace:
[  271.705715]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[  271.706079]  [<ffffffffae8f21f9>] ? idr_get_empty_slot+0x199/0x3b0
[  271.706489]  [<ffffffffaeb4b206>] schedule+0x36/0x80
[  271.706825]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
[  271.707239]  [<ffffffffae8f2497>] ? idr_alloc+0x87/0x100
[  271.707596]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
[  271.708043]  [<ffffffffc01b2247>] p9_client_walk+0x77/0x200 [9pnet]
[  271.708459]  [<ffffffffc01ccae9>] v9fs_vfs_lookup.part.16+0x59/0x120 [9p]
[  271.708912]  [<ffffffffc01ccbcf>] v9fs_vfs_lookup+0x1f/0x30 [9p]
[  271.709308]  [<ffffffffae7f3806>] lookup_slow+0x96/0x140
[  271.709664]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
[  271.710036]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
[  271.710390]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
[  271.710763]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
[  271.711136]  [<ffffffffae7e091e>] ? mem_cgroup_commit_charge+0x7e/0x4a0
[  271.711581]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
[  271.711977]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
[  271.712349]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
[  271.712726]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
[  271.713110]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
[  271.713454]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
[  271.713810]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
[  271.714150]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

[  271.729022] sleep           S    0   218    216 0x00000002
[  271.729391]  ffff8a82fb990800 0000000000000000 ffff8a82fc0d8000 ffff8a82ff317dc0
[  271.729915]  ffff8a82fbbec800 ffffa94a404f3cf8 ffffffffaeb4ad1d ffff8a82fb9fc058
[  271.730426]  ffffec95c1ee08c0 0000000000000001 ffff8a82fbbec800 ffff8a82fbbba000
[  271.730950] Call Trace:
[  271.731115]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[  271.731479]  [<ffffffffaeb4b206>] schedule+0x36/0x80
[  271.731814]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
[  271.732226]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
[  271.732649]  [<ffffffffc01b2158>] p9_client_clunk+0x38/0xb0 [9pnet]
[  271.733061]  [<ffffffffc01cf38a>] v9fs_dir_release+0x1a/0x30 [9p]
[  271.733494]  [<ffffffffae7e964f>] __fput+0xdf/0x1f0
[  271.733844]  [<ffffffffae7e979e>] ____fput+0xe/0x10
[  271.734176]  [<ffffffffae68b5be>] task_work_run+0x7e/0xa0
[  271.734532]  [<ffffffffae672559>] do_exit+0x2b9/0xad0
[  271.734888]  [<ffffffffae65cc67>] ? __do_page_fault+0x287/0x4b0
[  271.735276]  [<ffffffffae672df3>] do_group_exit+0x43/0xb0
[  271.735639]  [<ffffffffae672e74>] SyS_exit_group+0x14/0x20
[  271.736002]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

Full dmesgs are available from:
https://gist.githubusercontent.com/dezgeg/31c2a50a1ce82e4284f6c9e617e7eba8/raw/e5ed6e62c7a1a5234d9316563154e530a2e95586/dmesg (shorter) 
https://gist.githubusercontent.com/dezgeg/6989a0746ba8c000324455f473dc58e9/raw/4d7c6c58de88ef9d0367147c4ef1d990cfb267ce/dmesg (longer)

This typically reproduces quite fast, maybe half an hour or so. 4.7,
4.8.10 and 4.9-rc5 all are affected.

This happens in a 2-core QEMU+KVM vm with 2GB RAM using its
internal 9p server
(-virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store).

Any ideas for further debugging?


* Re: 9pfs hangs since 4.7
  2016-11-24 19:50 9pfs hangs since 4.7 Tuomas Tynkkynen
@ 2016-11-29 16:39 ` Eric Van Hensbergen
  2016-12-02 20:11   ` Tuomas Tynkkynen
  2017-01-02  8:20 ` Tuomas Tynkkynen
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Van Hensbergen @ 2016-11-29 16:39 UTC (permalink / raw)
  To: Tuomas Tynkkynen; +Cc: V9FS Developers, Linux FS Devel, linux-kernel

Any idea of what xfstests is doing at this point in time?  I'd be a
bit worried about some sort of loop in the namespace since it seems to
be in path traversal.  Could also be some sort of resource leak or
fragmentation; I'll admit that many of the regression tests we do are
fairly short in duration.  Another approach would be to look at doing
this with a different server (over a network link instead of virtio)
to isolate it as a client versus server side problem (although from
the looks of things this does seem to be a client issue).

On Thu, Nov 24, 2016 at 1:50 PM, Tuomas Tynkkynen <tuomas@tuxera.com> wrote:
> Hi fsdevel,
>
> I have been observing hangs when running xfstests generic/224. Curiously
> enough, the test is *not* causing problems on the FS under test (I've
> tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
> using as the root filesystem to crap out.
>
> How it shows up is that the test doesn't finish in time (usually
> takes ~50 sec) but the hung task detector triggers for some task in
> d_alloc_parallel():
>
> [  660.701646] INFO: task 224:7800 blocked for more than 300 seconds.
> [  660.702756]       Not tainted 4.9.0-rc5 #1-NixOS
> [  660.703232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  660.703927] 224             D    0  7800    549 0x00000000
> [  660.704501]  ffff8a82ec022800 0000000000000000 ffff8a82fc03c800 ffff8a82ff217dc0
> [  660.705302]  ffff8a82d0f88c00 ffffa94a41a27b88 ffffffffaeb4ad1d ffffa94a41a27b78
> [  660.706125]  ffffffffae800fc6 ffff8a82fbd90f08 ffff8a82d0f88c00 ffff8a82fbfd5418
> [  660.706924] Call Trace:
> [  660.707185]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  660.707751]  [<ffffffffae800fc6>] ? __d_alloc+0x126/0x1e0
> [  660.708304]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  660.708841]  [<ffffffffae801937>] d_alloc_parallel+0x3a7/0x480
> [  660.709454]  [<ffffffffae697800>] ? wake_up_q+0x70/0x70
> [  660.710007]  [<ffffffffae7f37e3>] lookup_slow+0x73/0x140
> [  660.710572]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
> [  660.711167]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
> [  660.711747]  [<ffffffffae808ca4>] ? mntput+0x24/0x40
> [  660.716962]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
> [  660.717581]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
> [  660.718194]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
> [  660.719037]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
> [  660.719801]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
> [  660.720492]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
> [  660.721206]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
> [  660.721980]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
> [  660.722732]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
> [  660.723702]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
>
> SysRq-T is full of things stuck inside p9_client_rpc like:
>
> [  271.703598] bash            S    0   100     96 0x00000000
> [  271.703968]  ffff8a82ff824800 0000000000000000 ffff8a82faee4800 ffff8a82ff217dc0
> [  271.704486]  ffff8a82fb946c00 ffffa94a404ebae8 ffffffffaeb4ad1d ffff8a82fb9fc058
> [  271.705024]  ffffa94a404ebb10 ffffffffae8f21f9 ffff8a82fb946c00 ffff8a82fbbba000
> [  271.705542] Call Trace:
> [  271.705715]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  271.706079]  [<ffffffffae8f21f9>] ? idr_get_empty_slot+0x199/0x3b0
> [  271.706489]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  271.706825]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.707239]  [<ffffffffae8f2497>] ? idr_alloc+0x87/0x100
> [  271.707596]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
> [  271.708043]  [<ffffffffc01b2247>] p9_client_walk+0x77/0x200 [9pnet]
> [  271.708459]  [<ffffffffc01ccae9>] v9fs_vfs_lookup.part.16+0x59/0x120 [9p]
> [  271.708912]  [<ffffffffc01ccbcf>] v9fs_vfs_lookup+0x1f/0x30 [9p]
> [  271.709308]  [<ffffffffae7f3806>] lookup_slow+0x96/0x140
> [  271.709664]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
> [  271.710036]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
> [  271.710390]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
> [  271.710763]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
> [  271.711136]  [<ffffffffae7e091e>] ? mem_cgroup_commit_charge+0x7e/0x4a0
> [  271.711581]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
> [  271.711977]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
> [  271.712349]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
> [  271.712726]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
> [  271.713110]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
> [  271.713454]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
> [  271.713810]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
> [  271.714150]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
>
> [  271.729022] sleep           S    0   218    216 0x00000002
> [  271.729391]  ffff8a82fb990800 0000000000000000 ffff8a82fc0d8000 ffff8a82ff317dc0
> [  271.729915]  ffff8a82fbbec800 ffffa94a404f3cf8 ffffffffaeb4ad1d ffff8a82fb9fc058
> [  271.730426]  ffffec95c1ee08c0 0000000000000001 ffff8a82fbbec800 ffff8a82fbbba000
> [  271.730950] Call Trace:
> [  271.731115]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  271.731479]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  271.731814]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.732226]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
> [  271.732649]  [<ffffffffc01b2158>] p9_client_clunk+0x38/0xb0 [9pnet]
> [  271.733061]  [<ffffffffc01cf38a>] v9fs_dir_release+0x1a/0x30 [9p]
> [  271.733494]  [<ffffffffae7e964f>] __fput+0xdf/0x1f0
> [  271.733844]  [<ffffffffae7e979e>] ____fput+0xe/0x10
> [  271.734176]  [<ffffffffae68b5be>] task_work_run+0x7e/0xa0
> [  271.734532]  [<ffffffffae672559>] do_exit+0x2b9/0xad0
> [  271.734888]  [<ffffffffae65cc67>] ? __do_page_fault+0x287/0x4b0
> [  271.735276]  [<ffffffffae672df3>] do_group_exit+0x43/0xb0
> [  271.735639]  [<ffffffffae672e74>] SyS_exit_group+0x14/0x20
> [  271.736002]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
>
> Full dmesgs is available from:
> https://gist.githubusercontent.com/dezgeg/31c2a50a1ce82e4284f6c9e617e7eba8/raw/e5ed6e62c7a1a5234d9316563154e530a2e95586/dmesg (shorter)
> https://gist.githubusercontent.com/dezgeg/6989a0746ba8c000324455f473dc58e9/raw/4d7c6c58de88ef9d0367147c4ef1d990cfb267ce/dmesg (longer)
>
> This typically reproduces quite fast, maybe half an hour or so. 4.7,
> 4.8.10 and 4.9-rc5 all are affected.
>
> This happens in a 2-core QEMU+KVM vm with 2GB RAM using its
> internal 9p server
> (-virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store).
>
> Any ideas for further debugging?


* Re: 9pfs hangs since 4.7
  2016-11-29 16:39 ` Eric Van Hensbergen
@ 2016-12-02 20:11   ` Tuomas Tynkkynen
  0 siblings, 0 replies; 19+ messages in thread
From: Tuomas Tynkkynen @ 2016-12-02 20:11 UTC (permalink / raw)
  To: Eric Van Hensbergen; +Cc: V9FS Developers, Linux FS Devel, linux-kernel

On Tue, 29 Nov 2016 10:39:39 -0600
Eric Van Hensbergen <ericvh@gmail.com> wrote:

> Any idea of what xfstests is doing at this point in time?  I'd be a
> bit worried about some sort of loop in the namespace since it seems to
> be in path traversal.  Could also be some sort of resource leak or
> fragmentation, I'll admit that many of the regression tests we do are
> fairly short in duration.  Another approach would be to look at doing
> this with a different server (over a network link instead of virtio)
> to isolate it as a client versus server side problem (although from
> the looks of things this does seem to be a client issue).

The xfstests part where it hangs is either of these loops:

FILES=1000
for i in `seq 0 1 $FILES`; do
        (
                sleep 5
                xfs_io -f -c "truncate 10485760" $SCRATCH_MNT/testfile.$i
                dd if=/dev/zero of=$SCRATCH_MNT/testfile.$i bs=4k conv=notrunc
        ) > /dev/null 2>&1 &
done
wait

for i in `seq 0 1 $FILES`; do
        dd of=/dev/null if=$SCRATCH_MNT/testfile.$i bs=512k iflag=direct > /dev/null 2>&1 &
done
wait

So all that's happening on the 9p mount is a bunch of reads+opens
of the binaries (sleep, xfs_io, dd) and their .so dependencies
(which apparently includes some readlinks as well).

I also tried building QEMU with tracing support enabled and
according to its own 9p event log the server did end up replying
to each client request (i.e. each v9fs_foo with a given tag was
matched up with a v9fs_foo_return or a v9fs_rerror)...
so yes, looking more like a client problem.


* Re: 9pfs hangs since 4.7
  2016-11-24 19:50 9pfs hangs since 4.7 Tuomas Tynkkynen
  2016-11-29 16:39 ` Eric Van Hensbergen
@ 2017-01-02  8:20 ` Tuomas Tynkkynen
  2017-01-02 16:23   ` Al Viro
  1 sibling, 1 reply; 19+ messages in thread
From: Tuomas Tynkkynen @ 2017-01-02  8:20 UTC (permalink / raw)
  To: v9fs-developer, linux-fsdevel, linux-kernel, Al Viro

Hi fsdevel,

I tracked this problem down a bit and it seems that it started happening
after the VFS merge in 4.7-rc1: 7f427d3a6029331304f91ef4d7cf646f054216d2:

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
    
    Pull parallel filesystem directory handling update from Al Viro.

Al, do you have any ideas about this problem with 9pfs I've been observing
(quoted below in full)? Many thanks in advance!

On Thu, 24 Nov 2016 21:50:23 +0200
Tuomas Tynkkynen <tuomas@tuxera.com> wrote:

> Hi fsdevel,
> 
> I have been observing hangs when running xfstests generic/224. Curiously
> enough, the test is *not* causing problems on the FS under test (I've
> tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
> using as the root filesystem to crap out.
> 
> How it shows up is that the test doesn't finish in time (usually
> takes ~50 sec) but the hung task detector triggers for some task in
> d_alloc_parallel():
> 
> [  660.701646] INFO: task 224:7800 blocked for more than 300 seconds.
> [  660.702756]       Not tainted 4.9.0-rc5 #1-NixOS
> [  660.703232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  660.703927] 224             D    0  7800    549 0x00000000
> [  660.704501]  ffff8a82ec022800 0000000000000000 ffff8a82fc03c800 ffff8a82ff217dc0
> [  660.705302]  ffff8a82d0f88c00 ffffa94a41a27b88 ffffffffaeb4ad1d ffffa94a41a27b78
> [  660.706125]  ffffffffae800fc6 ffff8a82fbd90f08 ffff8a82d0f88c00 ffff8a82fbfd5418
> [  660.706924] Call Trace:
> [  660.707185]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  660.707751]  [<ffffffffae800fc6>] ? __d_alloc+0x126/0x1e0
> [  660.708304]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  660.708841]  [<ffffffffae801937>] d_alloc_parallel+0x3a7/0x480
> [  660.709454]  [<ffffffffae697800>] ? wake_up_q+0x70/0x70
> [  660.710007]  [<ffffffffae7f37e3>] lookup_slow+0x73/0x140
> [  660.710572]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
> [  660.711167]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
> [  660.711747]  [<ffffffffae808ca4>] ? mntput+0x24/0x40
> [  660.716962]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
> [  660.717581]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
> [  660.718194]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
> [  660.719037]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
> [  660.719801]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
> [  660.720492]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
> [  660.721206]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
> [  660.721980]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
> [  660.722732]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
> [  660.723702]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
> 
> SysRq-T is full of things stuck inside p9_client_rpc like:
> 
> [  271.703598] bash            S    0   100     96 0x00000000
> [  271.703968]  ffff8a82ff824800 0000000000000000 ffff8a82faee4800 ffff8a82ff217dc0
> [  271.704486]  ffff8a82fb946c00 ffffa94a404ebae8 ffffffffaeb4ad1d ffff8a82fb9fc058
> [  271.705024]  ffffa94a404ebb10 ffffffffae8f21f9 ffff8a82fb946c00 ffff8a82fbbba000
> [  271.705542] Call Trace:
> [  271.705715]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  271.706079]  [<ffffffffae8f21f9>] ? idr_get_empty_slot+0x199/0x3b0
> [  271.706489]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  271.706825]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.707239]  [<ffffffffae8f2497>] ? idr_alloc+0x87/0x100
> [  271.707596]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
> [  271.708043]  [<ffffffffc01b2247>] p9_client_walk+0x77/0x200 [9pnet]
> [  271.708459]  [<ffffffffc01ccae9>] v9fs_vfs_lookup.part.16+0x59/0x120 [9p]
> [  271.708912]  [<ffffffffc01ccbcf>] v9fs_vfs_lookup+0x1f/0x30 [9p]
> [  271.709308]  [<ffffffffae7f3806>] lookup_slow+0x96/0x140
> [  271.709664]  [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
> [  271.710036]  [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
> [  271.710390]  [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
> [  271.710763]  [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
> [  271.711136]  [<ffffffffae7e091e>] ? mem_cgroup_commit_charge+0x7e/0x4a0
> [  271.711581]  [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
> [  271.711977]  [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
> [  271.712349]  [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
> [  271.712726]  [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
> [  271.713110]  [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
> [  271.713454]  [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
> [  271.713810]  [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
> [  271.714150]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
> 
> [  271.729022] sleep           S    0   218    216 0x00000002
> [  271.729391]  ffff8a82fb990800 0000000000000000 ffff8a82fc0d8000 ffff8a82ff317dc0
> [  271.729915]  ffff8a82fbbec800 ffffa94a404f3cf8 ffffffffaeb4ad1d ffff8a82fb9fc058
> [  271.730426]  ffffec95c1ee08c0 0000000000000001 ffff8a82fbbec800 ffff8a82fbbba000
> [  271.730950] Call Trace:
> [  271.731115]  [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
> [  271.731479]  [<ffffffffaeb4b206>] schedule+0x36/0x80
> [  271.731814]  [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
> [  271.732226]  [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
> [  271.732649]  [<ffffffffc01b2158>] p9_client_clunk+0x38/0xb0 [9pnet]
> [  271.733061]  [<ffffffffc01cf38a>] v9fs_dir_release+0x1a/0x30 [9p]
> [  271.733494]  [<ffffffffae7e964f>] __fput+0xdf/0x1f0
> [  271.733844]  [<ffffffffae7e979e>] ____fput+0xe/0x10
> [  271.734176]  [<ffffffffae68b5be>] task_work_run+0x7e/0xa0
> [  271.734532]  [<ffffffffae672559>] do_exit+0x2b9/0xad0
> [  271.734888]  [<ffffffffae65cc67>] ? __do_page_fault+0x287/0x4b0
> [  271.735276]  [<ffffffffae672df3>] do_group_exit+0x43/0xb0
> [  271.735639]  [<ffffffffae672e74>] SyS_exit_group+0x14/0x20
> [  271.736002]  [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
> 
> Full dmesgs is available from:
> https://gist.githubusercontent.com/dezgeg/31c2a50a1ce82e4284f6c9e617e7eba8/raw/e5ed6e62c7a1a5234d9316563154e530a2e95586/dmesg (shorter) 
> https://gist.githubusercontent.com/dezgeg/6989a0746ba8c000324455f473dc58e9/raw/4d7c6c58de88ef9d0367147c4ef1d990cfb267ce/dmesg (longer)
> 
> This typically reproduces quite fast, maybe half an hour or so. 4.7,
> 4.8.10 and 4.9-rc5 all are affected.
> 
> This happens in a 2-core QEMU+KVM vm with 2GB RAM using its
> internal 9p server
> (-virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store).
> 
> Any ideas for further debugging?


* Re: 9pfs hangs since 4.7
  2017-01-02  8:20 ` Tuomas Tynkkynen
@ 2017-01-02 16:23   ` Al Viro
  2017-01-03 23:34     ` Tuomas Tynkkynen
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-02 16:23 UTC (permalink / raw)
  To: Tuomas Tynkkynen; +Cc: v9fs-developer, linux-fsdevel, linux-kernel

On Mon, Jan 02, 2017 at 10:20:35AM +0200, Tuomas Tynkkynen wrote:
> Hi fsdevel,
> 
> I tracked this problem down a bit and it seems that it started happening
> after the VFS merge in 4.7-rc1: 7f427d3a6029331304f91ef4d7cf646f054216d2:
> 
>     Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
>     
>     Pull parallel filesystem directory handling update from Al Viro.
> 
> Al, do you have any ideas about this problem with 9pfs I've been observing
> (quoted below in full)? Many thanks in advance!

d_alloc_parallel() is basically waiting for another lookup on the same
name in the same directory to finish.  In earlier kernels it would've
been waiting on ->i_mutex of the parent, now - on the waitqueue pointed to
by ->d_wait of the in-lookup dentry.

p9_client_rpc ones are more interesting; the question is what they are
waiting for.  Might be waiting for a reply from the server, might be - for
a tag to get freed.  We had at least one bug with tag lifetimes, BTW
(see commit a84b69cb6e0a for details) and that kind of crap can end up
with responses from server getting lost on client.

What I'd like to see is a log of 9p traffic in those; to hell with the
payloads, just the type and tag of each message (octets at offsets
4, 5 and 6) plus otag (octets at offsets 7 and 8) when the type is equal to
108 (message[4] == TFLUSH).  That's 3 bytes for most of the messages and
once in a while 2 more.  Should be easier to do dumping on the qemu side
of things; in principle, there are tracing mechanisms in there (grep for
trace_v9fs_ in hw/9pfs/virtio-9p.c), but when I had to do that kind of
work, I simply did manual dumping (with write(2), nothing fancier
than that) in handle_9p_output() (out.id and out.tag_le for type and tag
of incoming messages), v9fs_flush() (tag for otag of TFLUSH messages) and
complete_pdu() (id and pdu->tag for type and tag of replies).
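
In concrete terms, the bytes in question look like this (an illustrative
decoder only, not tied to any particular codebase - every 9P message starts
with size[4] little-endian, type[1], tag[2], and a TFLUSH additionally
carries oldtag[2] right after the tag):

#include <stdint.h>
#include <stdio.h>

static void dump_9p_header(const uint8_t *buf, size_t len)
{
    if (len < 7)
        return;
    uint8_t type = buf[4];
    uint16_t tag = buf[5] | (buf[6] << 8);

    if (type == 108 && len >= 9) {        /* TFLUSH: also print oldtag */
        uint16_t oldtag = buf[7] | (buf[8] << 8);
        fprintf(stderr, "type %u tag %04x oldtag %04x\n",
                (unsigned)type, (unsigned)tag, (unsigned)oldtag);
    } else {
        fprintf(stderr, "type %u tag %04x\n", (unsigned)type, (unsigned)tag);
    }
}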


* Re: 9pfs hangs since 4.7
  2017-01-02 16:23   ` Al Viro
@ 2017-01-03 23:34     ` Tuomas Tynkkynen
  2017-01-04  1:47       ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Tuomas Tynkkynen @ 2017-01-03 23:34 UTC (permalink / raw)
  To: Al Viro; +Cc: v9fs-developer, linux-fsdevel, linux-kernel

On Mon, 2 Jan 2017 16:23:09 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:


> 
> What I'd like to see is a log of 9p traffic in those; to hell with the
> payloads, just the type and tag of from each message [...]

Thanks for the suggestions. With the following patch to QEMU:

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index aea7e9d..8a6b426 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -662,6 +662,7 @@ static void coroutine_fn pdu_complete(V9fsPDU *pdu, ssize_t len)
 
     /* fill out the header */
     pdu_marshal(pdu, 0, "dbw", (int32_t)len, id, pdu->tag);
+    fprintf(stderr, "complete %d %04x\n", id, pdu->tag);
 
     /* keep these in sync */
     pdu->size = len;
@@ -2347,6 +2348,7 @@ static void v9fs_flush(void *opaque)
         return;
     }
     trace_v9fs_flush(pdu->tag, pdu->id, tag);
+    fprintf(stderr, "flush %04x %04x\n", tag, pdu->tag);
 
     QLIST_FOREACH(cancel_pdu, &s->active_list, next) {
         if (cancel_pdu->tag == tag) {
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 1782e4a..6a5ac04 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -76,6 +76,7 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 
         pdu->id = out.id;
         pdu->tag = le16_to_cpu(out.tag_le);
+        fprintf(stderr, "out %d %04x\n", pdu->id, pdu->tag);
 
         qemu_co_queue_init(&pdu->complete);
         pdu_submit(pdu);

I got these logs from the server & client with 9p tracepoints enabled:

https://gist.githubusercontent.com/dezgeg/02447100b3182167403099fe7de4d941/raw/3772e408ddf586fb662ac9148fc10943529a6b99/dmesg%2520with%25209p%2520trace
https://gist.githubusercontent.com/dezgeg/e1e0c7f354042e1d9bdf7e9135934a65/raw/3a0e3b4f7a5229fd0be032c6839b578d47a21ce4/qemu.log


* Re: 9pfs hangs since 4.7
  2017-01-03 23:34     ` Tuomas Tynkkynen
@ 2017-01-04  1:47       ` Al Viro
  2017-01-04 20:04         ` Tuomas Tynkkynen
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-04  1:47 UTC (permalink / raw)
  To: Tuomas Tynkkynen; +Cc: v9fs-developer, linux-fsdevel, linux-kernel

On Wed, Jan 04, 2017 at 01:34:50AM +0200, Tuomas Tynkkynen wrote:

> I got these logs from the server & client with 9p tracepoints enabled:
> 
> https://gist.githubusercontent.com/dezgeg/02447100b3182167403099fe7de4d941/raw/3772e408ddf586fb662ac9148fc10943529a6b99/dmesg%2520with%25209p%2520trace
> https://gist.githubusercontent.com/dezgeg/e1e0c7f354042e1d9bdf7e9135934a65/raw/3a0e3b4f7a5229fd0be032c6839b578d47a21ce4/qemu.log
> 

Lovely...

27349:out 8 0001 		TSTATFS, tag 1
27350:out 12 0001		TLOPEN, tag 1
27351:complete 9 0001		RSTATFS, tag 1
27352:complete 13 0001		RLOPEN, tag 1

27677:out 8 0001		TSTATFS, tag 1
27678:out 12 0001		TLOPEN, tag 1
27679:complete 9 0001		RSTATFS, tag 1
27680:complete 13 0001		RLOPEN, tag 1

29667:out 8 0001		TSTATFS, tag 1
29668:out 110 0001		TWALK, tag 1
29669:complete 9 0001		RSTATFS, tag 1
29670:complete 7 0001		RLERROR, tag 1

42317:out 110 0001		TWALK, tag 1
42318:out 8 0001		TSTATFS, tag 1
42319:complete 9 0001		RSTATFS, tag 1
42320:complete 7 0001		RERROR, tag 1

Those are outright protocol violations: a tag can be reused only after either
a reply bearing the same tag has arrived *or* a TFLUSH for that tag has been
issued and a successful (type == RFLUSH) reply bearing the tag of that TFLUSH
has arrived.  Your log doesn't contain any TFLUSH (108) at all, so it should've
been plain and simple "no reuse until server sends a reply with matching tag".

Otherwise the dump looks sane.  In the end all issued requests had been
served, so it's not as if the client had been waiting for a free tag or for
a response to be produced by the server.

Interesting...  dmesg doesn't seem to contain the beginning of the 9P trace -
only 89858 out of 108818 in the dump, even though it claims to have lost
only 4445 events.  The last exchanges in the trace are
> P9_TLOPEN tag  90 err 0
> P9_TLOPEN tag  25 err 0
> P9_TLOPEN tag  15 err 0
> P9_TLOPEN tag  104 err 0
> P9_TLOPEN tag  102 err 0
> P9_TLOPEN tag  91 err 0
> P9_TLOPEN tag  12 err 0
< P9_TREADLINK tag  12

which should've been
complete 13 005a
complete 13 0019
complete 13 000f
complete 13 0068
complete 13 0066
complete 13 005b
out 22 000c

and the only plausible match for that is

108784:complete 13 005a
108798:complete 13 000f
108799:complete 13 0019
108800:complete 13 0068
108801:complete 13 0066
108802:complete 13 005b

but there's a bunch of replies in between that hadn't been logged *and* such
a TREADLINK request isn't seen after
108341:out 22 000c

... and sure enough, there's a whole lot of processes stuck in readlink.

Interesting...  Which version of qemu/what arguments are you using there?


* Re: 9pfs hangs since 4.7
  2017-01-04  1:47       ` Al Viro
@ 2017-01-04 20:04         ` Tuomas Tynkkynen
  2017-01-04 23:01           ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Tuomas Tynkkynen @ 2017-01-04 20:04 UTC (permalink / raw)
  To: Al Viro; +Cc: v9fs-developer, linux-fsdevel, linux-kernel

On Wed, 4 Jan 2017 01:47:53 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Wed, Jan 04, 2017 at 01:34:50AM +0200, Tuomas Tynkkynen wrote:
> 
> > I got these logs from the server & client with 9p tracepoints enabled:
> > 
> > https://gist.githubusercontent.com/dezgeg/02447100b3182167403099fe7de4d941/raw/3772e408ddf586fb662ac9148fc10943529a6b99/dmesg%2520with%25209p%2520trace
> > https://gist.githubusercontent.com/dezgeg/e1e0c7f354042e1d9bdf7e9135934a65/raw/3a0e3b4f7a5229fd0be032c6839b578d47a21ce4/qemu.log
> >   
> 
> Lovely...
> 
> 27349:out 8 0001 		TSTATFS, tag 1
> 27350:out 12 0001		TLOPEN, tag 1
> 27351:complete 9 0001		RSTATFS, tag 1
> 27352:complete 13 0001		RLOPEN, tag 1
> 
> 27677:out 8 0001		TSTATFS, tag 1
> 27678:out 12 0001		TLOPEN, tag 1
> 27679:complete 9 0001		RSTATFS, tag 1
> 27680:complete 13 0001		RLOPEN, tag 1
> 
> 29667:out 8 0001		TSTATFS, tag 1
> 29668:out 110 0001		TWALK, tag 1
> 29669:complete 9 0001		RSTATFS, tag 1
> 29670:complete 7 0001		RLERROR, tag 1
> 
> 42317:out 110 0001		TWALK, tag 1
> 42318:out 8 0001		TSTATFS, tag 1
> 42319:complete 9 0001		RSTATFS, tag 1
> 42320:complete 7 0001		RERROR, tag 1
> 
> Those are outright protocol violations: tag can be reused only after either
> a reply bearing the same tag has arrived *or* TFLUSH for that tag had been
> issued and successful (type == RFLUSH) reply bearing the tag of that TFLUSH
> has arrived.  Your log doesn't contain any TFLUSH (108) at all, so it should've
> been plain and simple "no reuse until server sends a reply with matching tag".
> 

Argh, I had completely forgotten there is another 9p mount involved, so the log
is mixed between the two mounts. Apologies, there is now a new trace attached
with the V9fsState pointer dumped in the mix as well.

> Otherwise the the dump looks sane.  In the end all issued requests had been
> served, so it's not as if the client had been waiting for a free tag or for
> a response to be produced by the server.
> 
> Interesting...  dmesg doesn't seem to contain the beginning of the 9P trace -
> only 89858 out of 108818 in the dump, even though it claims to have lost
> only 4445 events.  [...]

Here are logs that should be complete this time:

https://gist.githubusercontent.com/dezgeg/08629d4c8ca79da794bc087e5951e518/raw/a1a82b9bc24e5282c82beb43a9dc91974ffcf75a/9p.qemu.log
https://gist.githubusercontent.com/dezgeg/1d5f1cc0647e336c59989fc62780eb2e/raw/d7623755a895b0441c608ddba366d20bbf47ab0b/9p.dmesg.log

> 
> Interesting...  Which version of qemu/what arguments are you using there?

This is QEMU 2.7.0, with these on the server side:
    -virtfs local,path=/nix/store,security_model=none,mount_tag=store
    -virtfs local,path=$TMPDIR/xchg,security_model=none,mount_tag=xchg

and on the client side:

store on /nix/store type 9p (rw,relatime,dirsync,trans=virtio,version=9p2000.L,cache=loose)
xchg on /tmp/xchg type 9p (rw,relatime,dirsync,trans=virtio,version=9p2000.L,cache=loose)

'store' is the one receiving the hammering, 'xchg' I think is mostly sitting idle.


* Re: 9pfs hangs since 4.7
  2017-01-04 20:04         ` Tuomas Tynkkynen
@ 2017-01-04 23:01           ` Al Viro
  2017-01-06 13:52             ` [V9fs-developer] " Greg Kurz
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-04 23:01 UTC (permalink / raw)
  To: Tuomas Tynkkynen; +Cc: v9fs-developer, linux-fsdevel, linux-kernel

> Here's logs that should be complete this time:
> 
> https://gist.githubusercontent.com/dezgeg/08629d4c8ca79da794bc087e5951e518/raw/a1a82b9bc24e5282c82beb43a9dc91974ffcf75a/9p.qemu.log
> https://gist.githubusercontent.com/dezgeg/1d5f1cc0647e336c59989fc62780eb2e/raw/d7623755a895b0441c608ddba366d20bbf47ab0b/9p.dmesg.log

Fun.  All requests prior to
[  360.110282]       dd-1899    1.... 18497262us : 9p_client_req: client 18446612134390128640 request P9_TWALK tag  25
line in dmesg had been seen by the server; all requests starting with that
one had not.  Replies to earlier requests kept arriving just fine.

From the server side, everything looks nice and sane - it has processed
all requests it had seen, and aside from a slight difference in the arrival
order the server-side and client-side logs match, except for the last 26
requests the client claims to have sent and the server has never seen.

All traffic for the other client (there had been less of it) had ceased long
before that point, so we can't really tell if it's just this client that
got buggered.  Interesting...

Looking at the tracepoints, those requests got through p9_client_prepare_req();
we have no idea whether they got through p9_virtio_request().  OTOH, AFAICS
nothing had been sleeping in there...

FWIW, it might be interesting to try
	WARN_ON(!virtqueue_kick(chan->vq));
in p9_virtio_request() (instead of blind virtqueue_kick(chan->vq)) and see
if it triggers.  Incidentally, it looks like p9_virtio_request() ought to
return an error if that happens...
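
A rough sketch of what that would look like in net/9p/trans_virtio.c (the
surrounding code differs a bit between kernel versions, so take this as an
illustration rather than a patch):

	/* at the end of p9_virtio_request(), after the buffers have been
	 * added to the ring: */
	if (WARN_ON(!virtqueue_kick(chan->vq))) {
		spin_unlock_irqrestore(&chan->lock, flags);
		return -EIO;	/* don't pretend the request went out */
	}
	spin_unlock_irqrestore(&chan->lock, flags);
	return 0;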


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-04 23:01           ` Al Viro
@ 2017-01-06 13:52             ` Greg Kurz
  2017-01-07  6:26               ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Greg Kurz @ 2017-01-06 13:52 UTC (permalink / raw)
  To: Al Viro; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Wed, 4 Jan 2017 23:01:01 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> > Here's logs that should be complete this time:
> > 
> > https://gist.githubusercontent.com/dezgeg/08629d4c8ca79da794bc087e5951e518/raw/a1a82b9bc24e5282c82beb43a9dc91974ffcf75a/9p.qemu.log
> > https://gist.githubusercontent.com/dezgeg/1d5f1cc0647e336c59989fc62780eb2e/raw/d7623755a895b0441c608ddba366d20bbf47ab0b/9p.dmesg.log  
> 
> Fun.  All requests prior to
> [  360.110282]       dd-1899    1.... 18497262us : 9p_client_req: client 18446612134390128640 request P9_TWALK tag  25
> line in dmesg had been seen by the servers; all requests starting with that
> one had not.  Replies to earlier requests kept arriving just fine.
> 

Looking at the tag numbers, I think we're hitting the hardcoded limit of 128
simultaneous requests in QEMU (which doesn't produce any error; new requests
are silently dropped).

Tuomas, can you change MAX_REQ to some higher value (< 65535, since the tag is
2 bytes and 0xffff is reserved) to confirm?
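
For the experiment, the limit is a hardcoded define somewhere under hw/9pfs/
(the exact header varies between QEMU versions), so the change would be just
something like:

    #define MAX_REQ         1024    /* was 128 */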

> >From the server side, everything looks nice and sane - it has processed  
> all requests it had seen, and aside of slight difference in the arrival
> order server-side and client-side logs match, except for the last 26
> requests the client claims to have sent and the server has never seen.
> 
> All traffic for another client (there had been less of it) has ceased long
> before that point, so we can't really tell if it's just this client that
> got buggered.  Interesting...
> 
> Looking at the tracepoints, those requests got through p9_client_prepare_req();
> we have no idea whether it got through p9_virtio_request().  OTOH, AFAICS
> nothing had been sleeping in there...
> 
> FWIW, it might be interesting to try
> 	WARN_ON(!virtqueue_kick(chan->vq));
> in p9_virtio_request() (instead of blind virtqueue_kick(chan->vq)) and see
> if it triggers.  Incidentally, it looks like p9_virtio_request() ought to
> return an error if that happens...


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-06 13:52             ` [V9fs-developer] " Greg Kurz
@ 2017-01-07  6:26               ` Al Viro
  2017-01-07 15:10                 ` Greg Kurz
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-07  6:26 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Fri, Jan 06, 2017 at 02:52:35PM +0100, Greg Kurz wrote:

> Looking at the tag numbers, I think we're hitting the hardcoded limit of 128
> simultaneous requests in QEMU (which doesn't produce any error, new requests
> are silently dropped).
> 
> Tuomas, can you change MAX_REQ to some higher value (< 65535 since tag is
> 2-byte and 0xffff is reserved) to confirm ?

Huh?

Just how is a client supposed to cope with that behaviour?  9P is not
SunRPC - there's a reason why it doesn't live on top of UDP.  Sure, it's
datagram-oriented, but it really wants reliable transport...

Setting the ring size at MAX_REQ is fine; that'll give you ENOSPC on
attempt to put a request there, and p9_virtio_request() will wait for
things to clear, but if you've accepted a request, that's bloody it -
you really should go and handle it.

How does it happen, anyway?  qemu-side, I mean...  Does it move the buffer
to the used ring as soon as it has fetched the request?  AFAICS, it doesn't -
virtqueue_push() is called just before pdu_free(); we might get complications
in case of TFLUSH handling: a queue with MAX_REQ-1 requests submitted, TFLUSH
arrives, cancel_pdu is found and ->cancelled is set on it, then v9fs_flush()
waits for it to complete.  Once the damn thing is done, the buffer is released by
virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
up.  In the meantime, another request arrives into the slot freed by
that virtqueue_push() and we are out of pdus.

So it could happen, and the things might get unpleasant to some extent, but...
no TFLUSH had been present in all that traffic.  And none of the stuck
processes had been spinning in p9_virtio_request(), so they *did* find
ring slots...


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07  6:26               ` Al Viro
@ 2017-01-07 15:10                 ` Greg Kurz
  2017-01-07 17:19                   ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Greg Kurz @ 2017-01-07 15:10 UTC (permalink / raw)
  To: Al Viro; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, 7 Jan 2017 06:26:47 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Fri, Jan 06, 2017 at 02:52:35PM +0100, Greg Kurz wrote:
> 
> > Looking at the tag numbers, I think we're hitting the hardcoded limit of 128
> > simultaneous requests in QEMU (which doesn't produce any error, new requests
> > are silently dropped).
> > 
> > Tuomas, can you change MAX_REQ to some higher value (< 65535 since tag is
> > 2-byte and 0xffff is reserved) to confirm ?  
> 
> Huh?
> 
> Just how is a client supposed to cope with that behaviour?  9P is not
> SunRPC - there's a reason why it doesn't live on top of UDP.  Sure, it's
> datagram-oriented, but it really wants reliable transport...
> 
> Setting the ring size at MAX_REQ is fine; that'll give you ENOSPC on
> attempt to put a request there, and p9_virtio_request() will wait for
> things to clear, but if you've accepted a request, that's bloody it -
> you really should go and handle it.
> 

Yes, you're right, and "dropped" in my previous mail actually meant "not accepted"
(virtqueue_pop() not called)... sorry for the confusion. :-\

> How does it happen, anyway?  qemu-side, I mean...  Does it move the buffer
> to used ring as soon as it has fetched the request?  AFAICS, it doesn't -
> virtqueue_push() is called just before pdu_free(); we might get complications
> in case of TFLUSH handling (queue with MAX_REQ-1 requests submitted, TFLUSH
> arrives, cancel_pdu is found and ->cancelled is set on it, then v9fs_flush()
> waits for it to complete.  Once the damn thing is done, buffer is released by
> virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> up.  In the meanwhile, another request arrives into the slot of freed by
> that virtqueue_push() and we are out of pdus.
> 

Indeed. Even if this doesn't seem to be the problem here, I guess this should
be fixed.

> So it could happen, and the things might get unpleasant to some extent, but...
> no TFLUSH had been present in all that traffic.  And none of the stuck
> processes had been spinning in p9_virtio_request(), so they *did* find
> ring slots...

So we're back to your previous proposal of checking if virtqueue_kick() returned
false...


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07 15:10                 ` Greg Kurz
@ 2017-01-07 17:19                   ` Al Viro
  2017-01-07 18:15                     ` Al Viro
                                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Al Viro @ 2017-01-07 17:19 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, Jan 07, 2017 at 04:10:45PM +0100, Greg Kurz wrote:
> > virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> > up.  In the meanwhile, another request arrives into the slot of freed by
> > that virtqueue_push() and we are out of pdus.
> > 
> 
> Indeed. Even if this doesn't seem to be the problem here, I guess this should
> be fixed.

	FWIW, there's something that looks like an off-by-one in
v9fs_device_realize_common():
    /* initialize pdu allocator */
    QLIST_INIT(&s->free_list);
    QLIST_INIT(&s->active_list);
    for (i = 0; i < (MAX_REQ - 1); i++) {
        QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
        s->pdus[i].s = s;
        s->pdus[i].idx = i;
    }

It had been there since the original merge of 9p support into qemu - that code
had moved around a bit, but it had never inserted s->pdus[MAX_REQ - 1] into
the free list.  So your scenario with failing pdu_alloc() is still possible.
In that log the total number of pending requests reached 128 for the
first time right when the requests stopped being handled, and even
though it dropped below that shortly after, extra requests being put
into the queue were not processed at all...
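
In other words, the loop would presumably need to cover all MAX_REQ pdus;
a trivial sketch based on the code quoted above:

    for (i = 0; i < MAX_REQ; i++) {
        QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
        s->pdus[i].s = s;
        s->pdus[i].idx = i;
    }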

I'm not familiar enough with qemu guts to tell if that's a plausible scenario,
though... shouldn't subsequent queue insertions (after enough slots had been
released) simply trigger virtio_queue_notify_vq() again?  It *is* a bug
(if we get a burst filling a previously empty queue all at once, there won't
be any slots becoming freed), but that's obviously not the case here -
slots were getting freed, after all.


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07 17:19                   ` Al Viro
@ 2017-01-07 18:15                     ` Al Viro
  2017-01-08  5:46                       ` Al Viro
  2017-01-09 18:29                     ` Greg Kurz
  2017-01-09 18:39                     ` Tuomas Tynkkynen
  2 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-07 18:15 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, Jan 07, 2017 at 05:19:10PM +0000, Al Viro wrote:

> released) simply trigger virtio_queue_notify_vq() again?  It *is* a bug
> (if we get a burst filling a previously empty queue all at once, there won't
> be any slots becoming freed

Umm...  Nope, even in that scenario we'll have all requests except the last
one processed.  I'm trying to put together a simpler reproducer, but...
no luck so far.


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07 18:15                     ` Al Viro
@ 2017-01-08  5:46                       ` Al Viro
  2017-01-10  8:16                         ` Greg Kurz
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2017-01-08  5:46 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, Jan 07, 2017 at 06:15:23PM +0000, Al Viro wrote:
> On Sat, Jan 07, 2017 at 05:19:10PM +0000, Al Viro wrote:
> 
> > released) simply trigger virtio_queue_notify_vq() again?  It *is* a bug
> > (if we get a burst filling a previously empty queue all at once, there won't
> > be any slots becoming freed
> 
> Umm...  Nope, even in that scenario we'll have all requests except the last
> one processed.  I'm trying to put together a simpler reproducer, but...
> no luck so far.

FWIW, here's another fun bug in qemu 9pfs: client may issue multiple
Tflush on the same pending request.  Server must not reply to them
out of order and it must reply to at least the last one.  If client has
sent more than one Tflush, it must treat an arrival of Rflush to one of
those as if it has received an Rflush to each of the earlier Tflush as
well (IOW, *their* tags are released).  Client may not reuse the tag of
victim request until it has received Rflush for the last Tflush it has
sent.

Linux kernel-side 9p client doesn't issue such multiple Tflush, but the
protocol allows that.

qemu server does not guarantee in-order replies to multiple Tflush; worse,
while it is likely to reply only to one of those, it may very well be
the _first_ one.  Subsequent ones will be lost; moreover, it'll leak
both coroutines and ring slots.

AFAICS, a clean way to deal with that would be to handle Tflush synchronously,
right in pdu_submit():
	if pdu->type == Tflush
		look the victim request up.
		if !victim || victim == pdu	// [*]
			send reply and free pdu immediately.
		if victim->type == Tflush	// [**]
			pdu->victim = victim->victim
		else
			pdu->victim = victim
			mark the victim cancelled
		add pdu to the tail of pdu->victim->flushes
and let pdu_complete(pdu) send a reply to each request in pdu->flushes
as well (freeing them as we go)
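
A rough C rendering of the above, using only the structures visible earlier
in the thread (the ->victim, ->flushes and ->flush_next fields are
hypothetical and would have to be added to V9fsPDU; pdu_complete() is
assumed to then walk ->flushes and reply to each entry):

/* sketch only - not against any particular qemu tree */
static void pdu_submit_flush(V9fsState *s, V9fsPDU *pdu, uint16_t oldtag)
{
    V9fsPDU *victim = NULL, *p;

    QLIST_FOREACH(p, &s->active_list, next) {
        if (p->tag == oldtag) {
            victim = p;
            break;
        }
    }
    if (!victim || victim == pdu) {
        pdu_complete(pdu, 7);           /* bare Rflush, free pdu right away */
        return;
    }
    if (victim->id == 108 /* Tflush */) {
        pdu->victim = victim->victim;   /* chain to the original victim */
    } else {
        pdu->victim = victim;
        victim->cancelled = 1;
    }
    QTAILQ_INSERT_TAIL(&pdu->victim->flushes, pdu, flush_next);
}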

Some locking might be needed to avoid the races between pdu_submit() and
pdu_complete(), but it's not going to be a wide critical area.  I'm not
familiar with qemu locking primitives, sorry; what's providing an exclusion
between QLIST_INSERT_HEAD in pdu_alloc() (from handle_9p_output())
and QLIST_REMOVE in pdu_free() (from pdu_complete())?  Looks like the same
thing ought to suffice here...

[*] a cute way to hurt the current qemu server, BTW - coroutine that waits for
itself to complete...

[**] Tflush to Tflush is another fun corner case - it does *not* do anything
to the earlier Tflush, but reply to it must follow that to original Tflush
to avoid tag reuse problems.  Note that this is not the "multiple Tflush"
case mentioned above - those would all have oldtag pointing to the same
request; they are not chained and unlike this one can happen with legitimate
clients.  Tflush to Tflush, OTOH, is not something a sane client would do.


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07 17:19                   ` Al Viro
  2017-01-07 18:15                     ` Al Viro
@ 2017-01-09 18:29                     ` Greg Kurz
  2017-01-09 18:39                     ` Tuomas Tynkkynen
  2 siblings, 0 replies; 19+ messages in thread
From: Greg Kurz @ 2017-01-09 18:29 UTC (permalink / raw)
  To: Al Viro; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, 7 Jan 2017 17:19:10 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Sat, Jan 07, 2017 at 04:10:45PM +0100, Greg Kurz wrote:
> > > virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> > > up.  In the meanwhile, another request arrives into the slot of freed by
> > > that virtqueue_push() and we are out of pdus.
> > >   
> > 
> > Indeed. Even if this doesn't seem to be the problem here, I guess this should
> > be fixed.  
> 
> 	FWIW, there's something that looks like an off-by-one in
> v9fs_device_realize_common():
>     /* initialize pdu allocator */
>     QLIST_INIT(&s->free_list);
>     QLIST_INIT(&s->active_list);
>     for (i = 0; i < (MAX_REQ - 1); i++) {
>         QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
>         s->pdus[i].s = s;
>         s->pdus[i].idx = i;
>     }
> 
> Had been there since the original merge of 9p support into qemu - that code
> had moved around a bit, but it had never inserted s->pdus[MAX_REQ - 1] into
> free list.  So your scenario with failing pdu_alloc() is still possible.

Indeed, this (MAX_REQ - 1) thing looks wrong. Thanks for pointing that out.

> In that log the total amount of pending requests has reached 128 for the
> first time right when the requests had stopped being handled and even
> though it had dropped below that shortly after, extra requests being put
> into queue had not been processed at all...
> 
> I'm not familiar with qemu guts enough to tell if that's a plausible scenario,
> though... shouldn't subsequent queue insertions (after enough slots had been
> released) simply trigger virtio_queue_notify_vq() again?  It *is* a bug
> (if we get a burst filling a previously empty queue all at once, there won't
> be any slots becoming freed), but that's obviously not the case here -
> slots were getting freed, after all.


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-07 17:19                   ` Al Viro
  2017-01-07 18:15                     ` Al Viro
  2017-01-09 18:29                     ` Greg Kurz
@ 2017-01-09 18:39                     ` Tuomas Tynkkynen
  2017-01-09 20:05                       ` Al Viro
  2 siblings, 1 reply; 19+ messages in thread
From: Tuomas Tynkkynen @ 2017-01-09 18:39 UTC (permalink / raw)
  To: Al Viro; +Cc: Greg Kurz, linux-fsdevel, v9fs-developer, linux-kernel

On Sat, 7 Jan 2017 17:19:10 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Sat, Jan 07, 2017 at 04:10:45PM +0100, Greg Kurz wrote:
> > > virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> > > up.  In the meanwhile, another request arrives into the slot of freed by
> > > that virtqueue_push() and we are out of pdus.
> > >   
> > 
> > Indeed. Even if this doesn't seem to be the problem here, I guess this should
> > be fixed.  
> 
> 	FWIW, there's something that looks like an off-by-one in
> v9fs_device_realize_common():
>     /* initialize pdu allocator */
>     QLIST_INIT(&s->free_list);
>     QLIST_INIT(&s->active_list);
>     for (i = 0; i < (MAX_REQ - 1); i++) {
>         QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
>         s->pdus[i].s = s;
>         s->pdus[i].idx = i;
>     }
> 
> Had been there since the original merge of 9p support into qemu - that code
> had moved around a bit, but it had never inserted s->pdus[MAX_REQ - 1] into
> free list.  So your scenario with failing pdu_alloc() is still possible.
> In that log the total amount of pending requests has reached 128 for the
> first time right when the requests had stopped being handled and even
> though it had dropped below that shortly after, extra requests being put
> into queue had not been processed at all...
> 

Yes, this does seem to be related to this off-by-one, or otherwise MAX_REQ-related!
- Bumping MAX_REQ up to 1024 makes the hang go away (on 4.7).
- Dropping it to 64 makes the same hang happen on kernels where it worked
  before (I tried 4.4.x).
- Doing s/(MAX_REQ - 1)/MAX_REQ/ makes the hang go away.

I tested QEMU 2.8.0 as well and the behaviour is the same there.

Here are the logs for 4.4 hanging with MAX_REQ == 64 without the loop
condition changed:

https://gist.githubusercontent.com/dezgeg/b5f0b7f8a0f3d8b6acb1566d7edcb2f0/raw/00241426890ea28d844986243c3b706881432fb4/9p-44.dmesg.log
https://gist.githubusercontent.com/dezgeg/2bffe6c0271c4c9c382ac6363ce1864b/raw/92329eaef38305f82090de5dde3c944561afa372/9p-44.qemu.log


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-09 18:39                     ` Tuomas Tynkkynen
@ 2017-01-09 20:05                       ` Al Viro
  0 siblings, 0 replies; 19+ messages in thread
From: Al Viro @ 2017-01-09 20:05 UTC (permalink / raw)
  To: Tuomas Tynkkynen; +Cc: Greg Kurz, linux-fsdevel, v9fs-developer, linux-kernel

On Mon, Jan 09, 2017 at 08:39:31PM +0200, Tuomas Tynkkynen wrote:

> Yes, this does seem to be related to this or otherwise MAX_REQ related!
> - Bumping MAX_REQ up to 1024 makes the hang go away (on 4.7).
> - Dropping it to 64 makes the same hang happen on kernels where it worked
>   before (I tried 4.4.x).
> - Doing s/(MAX_REQ - 1)/MAX_REQ/ makes the hang go away.

Note that it's still possible to trigger the same situation with that
off-by-one taken care of; if the client sends 64 Treadlink and 64 Tflush
(one for each of those), then follows up with another pile of Treadlink (feeding
them in as soon as free slots appear), the server can end up with failing
pdu_alloc() - completion of readlink will release its slot immediately,
but pdu won't get freed until the corresponding flush has gotten the
CPU.

I'm _not_ familiar with scheduling in qemu and I don't quite understand
the mechanism of getting from "handle_9p_output() bailed out with some
requests still not processed" to "no further requests get processed", so
it might be that for some reason triggering the former as described above
won't escalate to the latter, but I wouldn't count upon that.

Another thing I'm very certain about is that 9 0 0 0 108 1 0 1 0 sent by a
broken client (Tflush tag = 1 oldtag = 1) will do nasty things to the qemu
server.  v9fs_flush() will try to find the pdu of the request to cancel,
find its own argument, put itself on its ->complete and yield the CPU, expecting
to get back once the victim gets through to pdu_complete().  Since the
victim is itself...

AFAICS, the things a client can expect wrt Tflush handling are
	* in no case should Rflush be sent before the reply to the request
it's trying to cancel
	* the only case when the server _may_ not send Rflush is the arrival
of more than one Tflush with the same oldtag; in that case it is allowed
to suppress replies to the earlier ones.  If they are not suppressed, replies
should come in the order of Tflush arrivals.
	* if a reply to Tflush is sent (see above), it must be Rflush.
	* multiple Tflush with the same oldtag are allowed; the Linux kernel
client does not issue those, but other clients might.  As a matter of
fact, the Plan 9 kernel client *does* issue those.
	* Tflush to Tflush is a no-op; it still needs a reply, and ordering
constraints apply (it can't be sent before the reply to the Tflush it's been
referring to, which, in turn, can't be sent before the reply to the request
the first Tflush refers to).  Normally such requests are not sent, but
in principle they are allowed.
	* Tflush to a request that isn't being processed should be
answered immediately.  The same goes for a Tflush referring to itself.
The former is not an error (we might have already sent a reply), but
the latter might be worth a loud warning - clients are definitely not
supposed to do that.  It still needs Rflush in response - Rerror is not
allowed.


* Re: [V9fs-developer] 9pfs hangs since 4.7
  2017-01-08  5:46                       ` Al Viro
@ 2017-01-10  8:16                         ` Greg Kurz
  0 siblings, 0 replies; 19+ messages in thread
From: Greg Kurz @ 2017-01-10  8:16 UTC (permalink / raw)
  To: Al Viro; +Cc: Tuomas Tynkkynen, linux-fsdevel, v9fs-developer, linux-kernel

On Sun, 8 Jan 2017 05:46:39 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Sat, Jan 07, 2017 at 06:15:23PM +0000, Al Viro wrote:
> > On Sat, Jan 07, 2017 at 05:19:10PM +0000, Al Viro wrote:
> >   
> > > released) simply trigger virtio_queue_notify_vq() again?  It *is* a bug
> > > (if we get a burst filling a previously empty queue all at once, there won't
> > > be any slots becoming freed  
> > 
> > Umm...  Nope, even in that scenario we'll have all requests except the last
> > one processed.  I'm trying to put together a simpler reproducer, but...
> > no luck so far.  
> 
> FWIW, here's another fun bug in qemu 9pfs: client may issue multiple
> Tflush on the same pending request.  Server must not reply to them
> out of order and it must reply to at least the last one.  If client has
> sent more than one Tflush, it must treat an arrival of Rflush to one of
> those as if it has received an Rflush to each of the earlier Tflush as
> well (IOW, *their* tags are released).  Client may not reuse the tag of
> victim request until it has received Rflush for the last Tflush it has
> sent.
> 
> Linux kernel-side 9p client doesn't issue such multiple Tflush, but the
> protocol allows that.
> 
> qemu server does not guarantee in-order replies to multiple Tflush; worse,
> while it is likely to reply only to one of those, it may very well be
> the _first_ one.  Subsequent ones will be lost; moreover, it'll leak
> both coroutines and ring slots.
> 

Yeah, I'm aware of that; I had started to work on a fix but it's low
priority on my TODO list since linux guests don't do that... and I got
scheduled.

> AFAICS, a clean way to deal with that would be to handle Tflush synchronously,
> right in pdu_submit():
> 	if pdu->type == Tflush
> 		look the victim request up.
> 		if !victim || victim == pdu	// [*]
> 			send reply and free pdu immediately.
> 		if victim->type == Tflush	// [**]
> 			pdu->victim = victim->victim
> 		else
> 			pdu->victim = victim
> 			mark the victim cancelled
> 		add pdu to the tail of pdu->victim->flushes
> and let pdu_complete(pdu) send a reply to each request in pdu->flushes
> as well (freeing them as we go)
> 

I had kept the asynchronous way and it resulted in quite convoluted code.
I'll give it a try as per your suggestion.

> Some locking might be needed to avoid the races between pdu_submit() and
> pdu_complete(), but it's not going to be a wide critical area.  I'm not
> familiar with qemu locking primitives, sorry; what's providing an exclusion
> between QLIST_INSERT_HEAD in pdu_alloc() (from handle_9p_output())
> and QLIST_REMOVE in pdu_free() (from pdu_complete())?  Looks like the same
> thing ought to suffice here...
> 

This code runs in coroutines (stack switching, see include/qemu/coroutine.h for
details): it is serialized.

> [*] a cute way to hurt the current qemu server, BTW - coroutine that waits for
> itself to complete...
> 

Indeed :)

> [**] Tflush to Tflush is another fun corner case - it does *not* do anything
> to the earlier Tflush, but reply to it must follow that to original Tflush
> to avoid tag reuse problems.  Note that this is not the "multiple Tflush"
> case mentioned above - those would all have oldtag pointing to the same
> request; they are not chained and unlike this one can happen with legitimate
> clients.  Tflush to Tflush, OTOH, is not something a sane client would do.

I agree that I don't see when a client would want to do that, but FWIW it is
mentioned in the last paragraph of http://man.cat-v.org/plan_9/5/flush :)

Thanks for your valuable suggestions, Al!

Cheers.

--
Greg


