All of lore.kernel.org
 help / color / mirror / Atom feed
* xarray, fault injection and syzkaller
@ 2022-11-03 19:09 Jason Gunthorpe
  2022-11-03 20:00 ` Matthew Wilcox
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-03 19:09 UTC (permalink / raw)
  To: Akinobu Mita, Matthew Wilcox, linux-fsdevel, syzkaller-bugs

Hi All,

I wonder if anyone has some thoughts on this - I have spent some time
setting up syzkaller for a new subsystem and I've noticed that nth
fault injection does not reliably cause things like xa_store() to
fail.

It seems the basic reason is that xarray will usually do two
allocations, one in an atomic context which fault injection does
reliably fail, but then it almost always follows up with a second
allocation in a non-atomic context that doesn't fail because nth has
become 0.

This reduces the available coverage that syzkaller can achieve by
randomizing fault injections. It does very rarely provoke a failure,
which I guess is because the atomic allocation fails naturally
sometimes with low probability and the nth takes out the non-atomic
allocation.. But it is rare and very annoying to reproduce.

Does anyone have some thoughts on what is an appropriate way to cope
with this? It seems like some sort of general problem with these sorts
of fallback allocations, so perhaps a GFP_ flag of some sort that
causes allows the fault injection to fail but not decrement nth?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-03 19:09 xarray, fault injection and syzkaller Jason Gunthorpe
@ 2022-11-03 20:00 ` Matthew Wilcox
  2022-11-03 20:07   ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Matthew Wilcox @ 2022-11-03 20:00 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Akinobu Mita, linux-fsdevel, syzkaller-bugs

On Thu, Nov 03, 2022 at 04:09:04PM -0300, Jason Gunthorpe wrote:
> Hi All,
> 
> I wonder if anyone has some thoughts on this - I have spent some time
> setting up syzkaller for a new subsystem and I've noticed that nth
> fault injection does not reliably cause things like xa_store() to
> fail.
> 
> It seems the basic reason is that xarray will usually do two
> allocations, one in an atomic context which fault injection does
> reliably fail, but then it almost always follows up with a second
> allocation in a non-atomic context that doesn't fail because nth has
> become 0.

Hahaha.  I didn't intentionally set out to thwart memory allocation
fault injection.  Realistically, do we want it to fail though?
GFP_KERNEL allocations of small sizes are supposed to never fail.
(for those not aware, node allocations are 576 bytes; typically the slab
allocator bundles 28 of them into an order-2 allocation).

I think a simple solution if we really do want to make allocations fail
is to switch error injection from "fail one allocation per N" to "fail
M allocations per N".  eg, 7 allocations succeed, 3 allocations fail,
7 succeed, 3 fail, ...  It's more realistic because you do tend to see
memory allocation failures come in bursts.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-03 20:00 ` Matthew Wilcox
@ 2022-11-03 20:07   ` Jason Gunthorpe
  2022-11-04  0:11     ` Dmitry Vyukov
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-03 20:07 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Akinobu Mita, linux-fsdevel, syzkaller-bugs

On Thu, Nov 03, 2022 at 08:00:25PM +0000, Matthew Wilcox wrote:
> On Thu, Nov 03, 2022 at 04:09:04PM -0300, Jason Gunthorpe wrote:
> > Hi All,
> > 
> > I wonder if anyone has some thoughts on this - I have spent some time
> > setting up syzkaller for a new subsystem and I've noticed that nth
> > fault injection does not reliably cause things like xa_store() to
> > fail.
> > 
> > It seems the basic reason is that xarray will usually do two
> > allocations, one in an atomic context which fault injection does
> > reliably fail, but then it almost always follows up with a second
> > allocation in a non-atomic context that doesn't fail because nth has
> > become 0.
> 
> Hahaha.  I didn't intentionally set out to thwart memory allocation
> fault injection.  Realistically, do we want it to fail though?
> GFP_KERNEL allocations of small sizes are supposed to never fail.
> (for those not aware, node allocations are 576 bytes; typically the slab
> allocator bundles 28 of them into an order-2 allocation).

I don't know, I have code to handle these failures, I want to test it
:)

> I think a simple solution if we really do want to make allocations fail
> is to switch error injection from "fail one allocation per N" to "fail
> M allocations per N".  eg, 7 allocations succeed, 3 allocations fail,
> 7 succeed, 3 fail, ...  It's more realistic because you do tend to see
> memory allocation failures come in bursts.

The systemic testing I've setup just walks nth through the entire
range until it no longer triggers. This hits every injection point and
checks the failure path of it. This is also what syzkaller does
automatically from what I can tell

If we make it probabilistic it is harder to reliably trigger these
fault points.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-03 20:07   ` Jason Gunthorpe
@ 2022-11-04  0:11     ` Dmitry Vyukov
  2022-11-04  0:21       ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-04  0:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

On Thu, 3 Nov 2022 at 13:07, 'Jason Gunthorpe' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Thu, Nov 03, 2022 at 08:00:25PM +0000, Matthew Wilcox wrote:
> > On Thu, Nov 03, 2022 at 04:09:04PM -0300, Jason Gunthorpe wrote:
> > > Hi All,
> > >
> > > I wonder if anyone has some thoughts on this - I have spent some time
> > > setting up syzkaller for a new subsystem and I've noticed that nth
> > > fault injection does not reliably cause things like xa_store() to
> > > fail.

Hi Jason, Matthew,

Interesting. Where exactly is that kmalloc sequence? xa_store() itself
does not have any allocations:
https://elixir.bootlin.com/linux/v6.1-rc3/source/lib/xarray.c#L1577

Do we know how common/useful such an allocation pattern is?
If it's common/useful, then it can be turned into a single kmalloc()
with some special flag that will try both allocation modes at once.

Potentially fail-nth interface can be extended to accept a set of
sites, e.g. "5,7" or, "5-100".
Though, not sure what the systematic enumeration should be then if we
go beyond "every single site on its own"... but I guess we can figure
it out.



> > > It seems the basic reason is that xarray will usually do two
> > > allocations, one in an atomic context which fault injection does
> > > reliably fail, but then it almost always follows up with a second
> > > allocation in a non-atomic context that doesn't fail because nth has
> > > become 0.
> >
> > Hahaha.  I didn't intentionally set out to thwart memory allocation
> > fault injection.  Realistically, do we want it to fail though?
> > GFP_KERNEL allocations of small sizes are supposed to never fail.
> > (for those not aware, node allocations are 576 bytes; typically the slab
> > allocator bundles 28 of them into an order-2 allocation).

I hear this statement periodically. But I can't understand its status.
We discussed it recently here:
https://lore.kernel.org/all/CACT4Y+Y_kg1J00iBL=sMr5AP7U4RXuBizusvQG52few2NcJ6dg@mail.gmail.com/

Likely/unlikely is not what matters, right? It's only: can it fail at
all or not?
If some allocation can't fail 100% and we want to rely on this in
future (not just a current implementation detail), then I think we
need to (1) make fault injection to not fail them, add a BUG_ON in the
allocation to panic if they do fail, (3) maybe start removing error
handling code (since having buggy/untested/confusing code is not
useful).


> I don't know, I have code to handle these failures, I want to test it
> :)
>
> > I think a simple solution if we really do want to make allocations fail
> > is to switch error injection from "fail one allocation per N" to "fail
> > M allocations per N".  eg, 7 allocations succeed, 3 allocations fail,
> > 7 succeed, 3 fail, ...  It's more realistic because you do tend to see
> > memory allocation failures come in bursts.
>
> The systemic testing I've setup just walks nth through the entire
> range until it no longer triggers. This hits every injection point and
> checks the failure path of it. This is also what syzkaller does
> automatically from what I can tell
>
> If we make it probabilistic it is harder to reliably trigger these
> fault points.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04  0:11     ` Dmitry Vyukov
@ 2022-11-04  0:21       ` Jason Gunthorpe
  2022-11-04 17:47         ` Dmitry Vyukov
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-04  0:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

On Thu, Nov 03, 2022 at 05:11:04PM -0700, Dmitry Vyukov wrote:
> On Thu, 3 Nov 2022 at 13:07, 'Jason Gunthorpe' via syzkaller-bugs
> <syzkaller-bugs@googlegroups.com> wrote:
> >
> > On Thu, Nov 03, 2022 at 08:00:25PM +0000, Matthew Wilcox wrote:
> > > On Thu, Nov 03, 2022 at 04:09:04PM -0300, Jason Gunthorpe wrote:
> > > > Hi All,
> > > >
> > > > I wonder if anyone has some thoughts on this - I have spent some time
> > > > setting up syzkaller for a new subsystem and I've noticed that nth
> > > > fault injection does not reliably cause things like xa_store() to
> > > > fail.
> 
> Hi Jason, Matthew,
> 
> Interesting. Where exactly is that kmalloc sequence? xa_store() itself
> does not have any allocations:
> https://elixir.bootlin.com/linux/v6.1-rc3/source/lib/xarray.c#L1577

The first effort is this call chain

__xa_store()
  xas_store()
    xas_create()
     xas_alloc()
      kmem_cache_alloc_lru(GFP_NOWAIT | __GFP_NOWARN)

If that fails then __xa_store() will do:

__xa_store()
  __xas_nomem()
      xas_unlock_type(xas, lock_type);
      kmem_cache_alloc_lru(GFP_KERNEL);
      xas_lock_type(xas, lock_type);

They key point being that the retry is structured in a way that allows
dropping the spinlocks that are forcing the first allocation to be
atomic.

> Do we know how common/useful such an allocation pattern is?

I have coded something like this a few times, in my cases it is
usually something like: try to allocate a big chunk of memory hoping
for a huge page, then fall back to a smaller allocation

Most likely the key consideration is that the callsites are using
GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
NOWARN case assuming that another allocation attempt will closely
follow?

> If it's common/useful, then it can be turned into a single kmalloc()
> with some special flag that will try both allocation modes at once.

A single call doesn't really suit the use cases..

> Potentially fail-nth interface can be extended to accept a set of
> sites, e.g. "5,7" or, "5-100".

For my purposes this is possibly Ok, you'd just set N->large and step
N to naively cover the error paths.

However, this would also have to fix the obnoxious behavior of fail
nth where it fails its own copy_from_user on its write system call -
meaning there would be no way to turn it off.

> > > Hahaha.  I didn't intentionally set out to thwart memory allocation
> > > fault injection.  Realistically, do we want it to fail though?
> > > GFP_KERNEL allocations of small sizes are supposed to never fail.
> > > (for those not aware, node allocations are 576 bytes; typically the slab
> > > allocator bundles 28 of them into an order-2 allocation).
> 
> I hear this statement periodically. But I can't understand its
> status. We discussed it recently here:

I was thinking about this after, and at least for what I am doing it
doesn't apply. All the allocations here are GFP_KERNEL_ACCOUNT and the
cgroup can definitely reject any allocation at any time even if it is
"small"

So I can't ignore allocation failures as something that is unlikely. A
hostile userspace contained in a cgroup sandbox can reliably trigger
them at will.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04  0:21       ` Jason Gunthorpe
@ 2022-11-04 17:47         ` Dmitry Vyukov
  2022-11-04 18:03           ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-04 17:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

 iOn Thu, 3 Nov 2022 at 17:21, 'Jason Gunthorpe' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Thu, Nov 03, 2022 at 05:11:04PM -0700, Dmitry Vyukov wrote:
> > On Thu, 3 Nov 2022 at 13:07, 'Jason Gunthorpe' via syzkaller-bugs
> > <syzkaller-bugs@googlegroups.com> wrote:
> > >
> > > On Thu, Nov 03, 2022 at 08:00:25PM +0000, Matthew Wilcox wrote:
> > > > On Thu, Nov 03, 2022 at 04:09:04PM -0300, Jason Gunthorpe wrote:
> > > > > Hi All,
> > > > >
> > > > > I wonder if anyone has some thoughts on this - I have spent some time
> > > > > setting up syzkaller for a new subsystem and I've noticed that nth
> > > > > fault injection does not reliably cause things like xa_store() to
> > > > > fail.
> >
> > Hi Jason, Matthew,
> >
> > Interesting. Where exactly is that kmalloc sequence? xa_store() itself
> > does not have any allocations:
> > https://elixir.bootlin.com/linux/v6.1-rc3/source/lib/xarray.c#L1577
>
> The first effort is this call chain
>
> __xa_store()
>   xas_store()
>     xas_create()
>      xas_alloc()
>       kmem_cache_alloc_lru(GFP_NOWAIT | __GFP_NOWARN)
>
> If that fails then __xa_store() will do:
>
> __xa_store()
>   __xas_nomem()
>       xas_unlock_type(xas, lock_type);
>       kmem_cache_alloc_lru(GFP_KERNEL);
>       xas_lock_type(xas, lock_type);
>
> They key point being that the retry is structured in a way that allows
> dropping the spinlocks that are forcing the first allocation to be
> atomic.

I see. Yes, as you note below, this cannot be folded into a single
allocation call.

> > Do we know how common/useful such an allocation pattern is?
>
> I have coded something like this a few times, in my cases it is
> usually something like: try to allocate a big chunk of memory hoping
> for a huge page, then fall back to a smaller allocation
>
> Most likely the key consideration is that the callsites are using
> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> NOWARN case assuming that another allocation attempt will closely
> follow?

GFP_NOWARN is also extensively used for allocations with
user-controlled size, e.g.:
https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451

That's different and these allocations are usually not repeated.
So looking at GFP_NOWARN does not look like the right thing to do.


> > If it's common/useful, then it can be turned into a single kmalloc()
> > with some special flag that will try both allocation modes at once.
>
> A single call doesn't really suit the use cases..
>
> > Potentially fail-nth interface can be extended to accept a set of
> > sites, e.g. "5,7" or, "5-100".
>
> For my purposes this is possibly Ok, you'd just set N->large and step
> N to naively cover the error paths.

Filed https://bugzilla.kernel.org/show_bug.cgi?id=216661 for this.

> However, this would also have to fix the obnoxious behavior of fail
> nth where it fails its own copy_from_user on its write system call -
> meaning there would be no way to turn it off.

Oh, interesting. We added failing of copy_from/to_user later and did
not consider such interaction.
Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.

> > > > Hahaha.  I didn't intentionally set out to thwart memory allocation
> > > > fault injection.  Realistically, do we want it to fail though?
> > > > GFP_KERNEL allocations of small sizes are supposed to never fail.
> > > > (for those not aware, node allocations are 576 bytes; typically the slab
> > > > allocator bundles 28 of them into an order-2 allocation).
> >
> > I hear this statement periodically. But I can't understand its
> > status. We discussed it recently here:
>
> I was thinking about this after, and at least for what I am doing it
> doesn't apply. All the allocations here are GFP_KERNEL_ACCOUNT and the
> cgroup can definitely reject any allocation at any time even if it is
> "small"
>
> So I can't ignore allocation failures as something that is unlikely. A
> hostile userspace contained in a cgroup sandbox can reliably trigger
> them at will.
>
> Jason
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/Y2RbCUdEY2syxRLW%40nvidia.com.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 17:47         ` Dmitry Vyukov
@ 2022-11-04 18:03           ` Jason Gunthorpe
  2022-11-04 18:21             ` Dmitry Vyukov
  2022-11-04 18:42             ` xarray, fault injection and syzkaller Dmitry Vyukov
  0 siblings, 2 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-04 18:03 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
> > > Do we know how common/useful such an allocation pattern is?
> >
> > I have coded something like this a few times, in my cases it is
> > usually something like: try to allocate a big chunk of memory hoping
> > for a huge page, then fall back to a smaller allocation
> >
> > Most likely the key consideration is that the callsites are using
> > GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> > NOWARN case assuming that another allocation attempt will closely
> > follow?
> 
> GFP_NOWARN is also extensively used for allocations with
> user-controlled size, e.g.:
> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
> 
> That's different and these allocations are usually not repeated.
> So looking at GFP_NOWARN does not look like the right thing to do.

This may be the best option then, arguably perhaps even more
"realistic" than normal fail_nth as in a real system if this stuff
starts failing there is a good chance things from then on will fail
too during the error cleanup.

> > However, this would also have to fix the obnoxious behavior of fail
> > nth where it fails its own copy_from_user on its write system call -
> > meaning there would be no way to turn it off.
> 
> Oh, interesting. We added failing of copy_from/to_user later and did
> not consider such interaction.
> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.

Oh, I will tell you the other two bugish things I noticed

__should_failslab() has this:

	if (gfpflags & __GFP_NOWARN)
		failslab.attr.no_warn = true;

	return should_fail(&failslab.attr, s->object_size);

Which always permanently turns off no_warn for slab during early
boot. This is why syzkaller reports are so confusing. They trigger a
slab fault injection, which in all other cases gives a notification
backtrace, but in slab cases there is no hint about the fault
injection in the log.

Once that is fixed we can quickly explain why the socketpair() example
in the docs shows success ret codes in the middle of the sweep when
run on syzkaller kernels

fail_nth interacts badly with other kernel features typically enabled
in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:

[   18.499559] FAULT_INJECTION: forcing a failure.
[   18.499559] name failslab, interval 1, probability 0, space 0, times 0
[   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
[   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[   18.499971] Call Trace:
[   18.500010]  <TASK>
[   18.500048]  show_stack+0x3d/0x3f
[   18.500114]  dump_stack_lvl+0x92/0xbd
[   18.500171]  dump_stack+0x15/0x17
[   18.500232]  should_fail.cold+0x5/0xa
[   18.500291]  __should_failslab+0xb6/0x100
[   18.500349]  should_failslab+0x9/0x20
[   18.500416]  kmem_cache_alloc+0x64/0x4e0
[   18.500477]  ? __create_object+0x40/0xc50
[   18.500539]  __create_object+0x40/0xc50
[   18.500620]  ? kasan_poison+0x3a/0x50
[   18.500690]  ? kasan_unpoison+0x28/0x50
[***18.500753]  kmemleak_alloc+0x24/0x30 
[   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
[   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
[   18.500993]  kmalloc_trace+0x26/0x110
[   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]

Which has the consequence of syzkaller wasting half its fail_nth
effort because it is triggering failures in hidden instrumentation
that has no impact on the main code path.

Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
cases.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:03           ` Jason Gunthorpe
@ 2022-11-04 18:21             ` Dmitry Vyukov
  2022-11-04 18:30               ` Jason Gunthorpe
  2022-11-04 22:43               ` Qi Zheng
  2022-11-04 18:42             ` xarray, fault injection and syzkaller Dmitry Vyukov
  1 sibling, 2 replies; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-04 18:21 UTC (permalink / raw)
  To: Jason Gunthorpe, zhengqi.arch
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
> > > > Do we know how common/useful such an allocation pattern is?
> > >
> > > I have coded something like this a few times, in my cases it is
> > > usually something like: try to allocate a big chunk of memory hoping
> > > for a huge page, then fall back to a smaller allocation
> > >
> > > Most likely the key consideration is that the callsites are using
> > > GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> > > NOWARN case assuming that another allocation attempt will closely
> > > follow?
> >
> > GFP_NOWARN is also extensively used for allocations with
> > user-controlled size, e.g.:
> > https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
> >
> > That's different and these allocations are usually not repeated.
> > So looking at GFP_NOWARN does not look like the right thing to do.
>
> This may be the best option then, arguably perhaps even more
> "realistic" than normal fail_nth as in a real system if this stuff
> starts failing there is a good chance things from then on will fail
> too during the error cleanup.
>
> > > However, this would also have to fix the obnoxious behavior of fail
> > > nth where it fails its own copy_from_user on its write system call -
> > > meaning there would be no way to turn it off.
> >
> > Oh, interesting. We added failing of copy_from/to_user later and did
> > not consider such interaction.
> > Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>
> Oh, I will tell you the other two bugish things I noticed
>
> __should_failslab() has this:
>
>         if (gfpflags & __GFP_NOWARN)
>                 failslab.attr.no_warn = true;
>
>         return should_fail(&failslab.attr, s->object_size);
>
> Which always permanently turns off no_warn for slab during early
> boot. This is why syzkaller reports are so confusing. They trigger a
> slab fault injection, which in all other cases gives a notification
> backtrace, but in slab cases there is no hint about the fault
> injection in the log.

Ouch, this looks like a bug in:

commit 3f913fc5f9745613088d3c569778c9813ab9c129
Author: Qi Zheng <zhengqi.arch@bytedance.com>
Date:   Thu May 19 14:08:55 2022 -0700
     mm: fix missing handler for __GFP_NOWARN

+Qi could you please fix it?

At the very least the local gfpflags should not alter the global
failslab.attr that is persistent and shared by all tasks.

But I am not sure if we really don't want to issue the fault injection
stack in this case. It's not a WARNING, it's merely an information
message. It looks useful in all cases, even with GFP_NOWARN. Why
should it be suppressed?


> Once that is fixed we can quickly explain why the socketpair() example
> in the docs shows success ret codes in the middle of the sweep when
> run on syzkaller kernels
>
> fail_nth interacts badly with other kernel features typically enabled
> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>
> [   18.499559] FAULT_INJECTION: forcing a failure.
> [   18.499559] name failslab, interval 1, probability 0, space 0, times 0
> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [   18.499971] Call Trace:
> [   18.500010]  <TASK>
> [   18.500048]  show_stack+0x3d/0x3f
> [   18.500114]  dump_stack_lvl+0x92/0xbd
> [   18.500171]  dump_stack+0x15/0x17
> [   18.500232]  should_fail.cold+0x5/0xa
> [   18.500291]  __should_failslab+0xb6/0x100
> [   18.500349]  should_failslab+0x9/0x20
> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
> [   18.500477]  ? __create_object+0x40/0xc50
> [   18.500539]  __create_object+0x40/0xc50
> [   18.500620]  ? kasan_poison+0x3a/0x50
> [   18.500690]  ? kasan_unpoison+0x28/0x50
> [***18.500753]  kmemleak_alloc+0x24/0x30
> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
> [   18.500993]  kmalloc_trace+0x26/0x110
> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>
> Which has the consequence of syzkaller wasting half its fail_nth
> effort because it is triggering failures in hidden instrumentation
> that has no impact on the main code path.
>
> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
> cases.
>
> Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:21             ` Dmitry Vyukov
@ 2022-11-04 18:30               ` Jason Gunthorpe
  2022-11-04 18:38                 ` Dmitry Vyukov
  2022-11-04 22:43               ` Qi Zheng
  1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-04 18:30 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: zhengqi.arch, Matthew Wilcox, Akinobu Mita, linux-fsdevel,
	syzkaller-bugs, syzkaller

On Fri, Nov 04, 2022 at 11:21:21AM -0700, Dmitry Vyukov wrote:

> But I am not sure if we really don't want to issue the fault injection
> stack in this case. It's not a WARNING, it's merely an information
> message. It looks useful in all cases, even with GFP_NOWARN. Why
> should it be suppressed?

I think it is fine to suppress it for *this call* but the bug turns it
off forever more

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:30               ` Jason Gunthorpe
@ 2022-11-04 18:38                 ` Dmitry Vyukov
  2022-11-04 18:45                   ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-04 18:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: zhengqi.arch, Matthew Wilcox, Akinobu Mita, linux-fsdevel,
	syzkaller-bugs, syzkaller

On Fri, 4 Nov 2022 at 11:30, Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Fri, Nov 04, 2022 at 11:21:21AM -0700, Dmitry Vyukov wrote:
>
> > But I am not sure if we really don't want to issue the fault injection
> > stack in this case. It's not a WARNING, it's merely an information
> > message. It looks useful in all cases, even with GFP_NOWARN. Why
> > should it be suppressed?
>
> I think it is fine to suppress it for *this call* but the bug turns it
> off forever more

Is it just "fine", or "good"? I agree it's probably "fine", but
wouldn't it be better to not suppress it?

The message fault injection prints is not a warning, and the
allocation failed due to fault injection. That may trigger subsequent
bugs just as any other case of fault injection. Why don't we want to
see the info message in this particular case? NOWARN looks orthogonal
to this, it's about normal slab allocation failures.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:03           ` Jason Gunthorpe
  2022-11-04 18:21             ` Dmitry Vyukov
@ 2022-11-04 18:42             ` Dmitry Vyukov
  1 sibling, 0 replies; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-04 18:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller

On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
> > > > Do we know how common/useful such an allocation pattern is?
> > >
> > > I have coded something like this a few times, in my cases it is
> > > usually something like: try to allocate a big chunk of memory hoping
> > > for a huge page, then fall back to a smaller allocation
> > >
> > > Most likely the key consideration is that the callsites are using
> > > GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> > > NOWARN case assuming that another allocation attempt will closely
> > > follow?
> >
> > GFP_NOWARN is also extensively used for allocations with
> > user-controlled size, e.g.:
> > https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
> >
> > That's different and these allocations are usually not repeated.
> > So looking at GFP_NOWARN does not look like the right thing to do.
>
> This may be the best option then, arguably perhaps even more
> "realistic" than normal fail_nth as in a real system if this stuff
> starts failing there is a good chance things from then on will fail
> too during the error cleanup.
>
> > > However, this would also have to fix the obnoxious behavior of fail
> > > nth where it fails its own copy_from_user on its write system call -
> > > meaning there would be no way to turn it off.
> >
> > Oh, interesting. We added failing of copy_from/to_user later and did
> > not consider such interaction.
> > Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>
> Oh, I will tell you the other two bugish things I noticed
>
> __should_failslab() has this:
>
>         if (gfpflags & __GFP_NOWARN)
>                 failslab.attr.no_warn = true;
>
>         return should_fail(&failslab.attr, s->object_size);
>
> Which always permanently turns off no_warn for slab during early
> boot. This is why syzkaller reports are so confusing. They trigger a
> slab fault injection, which in all other cases gives a notification
> backtrace, but in slab cases there is no hint about the fault
> injection in the log.
>
> Once that is fixed we can quickly explain why the socketpair() example
> in the docs shows success ret codes in the middle of the sweep when
> run on syzkaller kernels
>
> fail_nth interacts badly with other kernel features typically enabled
> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>
> [   18.499559] FAULT_INJECTION: forcing a failure.
> [   18.499559] name failslab, interval 1, probability 0, space 0, times 0
> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [   18.499971] Call Trace:
> [   18.500010]  <TASK>
> [   18.500048]  show_stack+0x3d/0x3f
> [   18.500114]  dump_stack_lvl+0x92/0xbd
> [   18.500171]  dump_stack+0x15/0x17
> [   18.500232]  should_fail.cold+0x5/0xa
> [   18.500291]  __should_failslab+0xb6/0x100
> [   18.500349]  should_failslab+0x9/0x20
> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
> [   18.500477]  ? __create_object+0x40/0xc50
> [   18.500539]  __create_object+0x40/0xc50
> [   18.500620]  ? kasan_poison+0x3a/0x50
> [   18.500690]  ? kasan_unpoison+0x28/0x50
> [***18.500753]  kmemleak_alloc+0x24/0x30
> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
> [   18.500993]  kmalloc_trace+0x26/0x110
> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>
> Which has the consequence of syzkaller wasting half its fail_nth
> effort because it is triggering failures in hidden instrumentation
> that has no impact on the main code path.
>
> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
> cases.

Filed https://bugzilla.kernel.org/show_bug.cgi?id=216663
We could add GFP_NOFAULT. But now I am thinking if using the existing
GFP_NOFAIL is actually the right thing to do in these cases...
(details are in the bug).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:38                 ` Dmitry Vyukov
@ 2022-11-04 18:45                   ` Jason Gunthorpe
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-04 18:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: zhengqi.arch, Matthew Wilcox, Akinobu Mita, linux-fsdevel,
	syzkaller-bugs, syzkaller

On Fri, Nov 04, 2022 at 11:38:48AM -0700, Dmitry Vyukov wrote:
> On Fri, 4 Nov 2022 at 11:30, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > On Fri, Nov 04, 2022 at 11:21:21AM -0700, Dmitry Vyukov wrote:
> >
> > > But I am not sure if we really don't want to issue the fault injection
> > > stack in this case. It's not a WARNING, it's merely an information
> > > message. It looks useful in all cases, even with GFP_NOWARN. Why
> > > should it be suppressed?
> >
> > I think it is fine to suppress it for *this call* but the bug turns it
> > off forever more
> 
> Is it just "fine", or "good"? I agree it's probably "fine", but
> wouldn't it be better to not suppress it?

It seems sensible to me that fail*/verbosity=2 should *always* print
if a fault has been injected.

I don't know why someone thought this one deserves to be shut down.

GFP_NOWARN is about the caller indicating that it expects and will
handle an allocation failure (eg it is asking for big regions and has
fallbacks) so we shouldn't print the general OOM warning.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 18:21             ` Dmitry Vyukov
  2022-11-04 18:30               ` Jason Gunthorpe
@ 2022-11-04 22:43               ` Qi Zheng
  2022-11-05 12:16                 ` Qi Zheng
  1 sibling, 1 reply; 30+ messages in thread
From: Qi Zheng @ 2022-11-04 22:43 UTC (permalink / raw)
  To: Dmitry Vyukov, Jason Gunthorpe
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller



On 2022/11/5 02:21, Dmitry Vyukov wrote:
> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
>>>>> Do we know how common/useful such an allocation pattern is?
>>>>
>>>> I have coded something like this a few times, in my cases it is
>>>> usually something like: try to allocate a big chunk of memory hoping
>>>> for a huge page, then fall back to a smaller allocation
>>>>
>>>> Most likely the key consideration is that the callsites are using
>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
>>>> NOWARN case assuming that another allocation attempt will closely
>>>> follow?
>>>
>>> GFP_NOWARN is also extensively used for allocations with
>>> user-controlled size, e.g.:
>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
>>>
>>> That's different and these allocations are usually not repeated.
>>> So looking at GFP_NOWARN does not look like the right thing to do.
>>
>> This may be the best option then, arguably perhaps even more
>> "realistic" than normal fail_nth as in a real system if this stuff
>> starts failing there is a good chance things from then on will fail
>> too during the error cleanup.
>>
>>>> However, this would also have to fix the obnoxious behavior of fail
>>>> nth where it fails its own copy_from_user on its write system call -
>>>> meaning there would be no way to turn it off.
>>>
>>> Oh, interesting. We added failing of copy_from/to_user later and did
>>> not consider such interaction.
>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>>
>> Oh, I will tell you the other two bugish things I noticed
>>
>> __should_failslab() has this:
>>
>>          if (gfpflags & __GFP_NOWARN)
>>                  failslab.attr.no_warn = true;
>>
>>          return should_fail(&failslab.attr, s->object_size);
>>
>> Which always permanently turns off no_warn for slab during early
>> boot. This is why syzkaller reports are so confusing. They trigger a
>> slab fault injection, which in all other cases gives a notification
>> backtrace, but in slab cases there is no hint about the fault
>> injection in the log.
> 
> Ouch, this looks like a bug in:
> 
> commit 3f913fc5f9745613088d3c569778c9813ab9c129
> Author: Qi Zheng <zhengqi.arch@bytedance.com>
> Date:   Thu May 19 14:08:55 2022 -0700
>       mm: fix missing handler for __GFP_NOWARN
> 
> +Qi could you please fix it?
> 
> At the very least the local gfpflags should not alter the global
> failslab.attr that is persistent and shared by all tasks.

Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it.

But a warning should not be printed for callers that currently specify
__GFP_NOWARN, because that could lead to deadlocks, such as the deadlock
case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by 
calling printk() under tty_port->lock").

Thanks,
Qi

> 
> But I am not sure if we really don't want to issue the fault injection
> stack in this case. It's not a WARNING, it's merely an information
> message. It looks useful in all cases, even with GFP_NOWARN. Why
> should it be suppressed?
> 
> 
>> Once that is fixed we can quickly explain why the socketpair() example
>> in the docs shows success ret codes in the middle of the sweep when
>> run on syzkaller kernels
>>
>> fail_nth interacts badly with other kernel features typically enabled
>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>>
>> [   18.499559] FAULT_INJECTION: forcing a failure.
>> [   18.499559] name failslab, interval 1, probability 0, space 0, times 0
>> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
>> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>> [   18.499971] Call Trace:
>> [   18.500010]  <TASK>
>> [   18.500048]  show_stack+0x3d/0x3f
>> [   18.500114]  dump_stack_lvl+0x92/0xbd
>> [   18.500171]  dump_stack+0x15/0x17
>> [   18.500232]  should_fail.cold+0x5/0xa
>> [   18.500291]  __should_failslab+0xb6/0x100
>> [   18.500349]  should_failslab+0x9/0x20
>> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
>> [   18.500477]  ? __create_object+0x40/0xc50
>> [   18.500539]  __create_object+0x40/0xc50
>> [   18.500620]  ? kasan_poison+0x3a/0x50
>> [   18.500690]  ? kasan_unpoison+0x28/0x50
>> [***18.500753]  kmemleak_alloc+0x24/0x30
>> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
>> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
>> [   18.500993]  kmalloc_trace+0x26/0x110
>> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>
>> Which has the consequence of syzkaller wasting half its fail_nth
>> effort because it is triggering failures in hidden instrumentation
>> that has no impact on the main code path.
>>
>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
>> cases.
>>
>> Jason

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-04 22:43               ` Qi Zheng
@ 2022-11-05 12:16                 ` Qi Zheng
  2022-11-06 17:36                   ` Dmitry Vyukov
  0 siblings, 1 reply; 30+ messages in thread
From: Qi Zheng @ 2022-11-05 12:16 UTC (permalink / raw)
  To: Dmitry Vyukov, Jason Gunthorpe
  Cc: Matthew Wilcox, Akinobu Mita, linux-fsdevel, syzkaller-bugs, syzkaller



On 2022/11/5 06:43, Qi Zheng wrote:
> 
> 
> On 2022/11/5 02:21, Dmitry Vyukov wrote:
>> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>
>>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
>>>>>> Do we know how common/useful such an allocation pattern is?
>>>>>
>>>>> I have coded something like this a few times, in my cases it is
>>>>> usually something like: try to allocate a big chunk of memory hoping
>>>>> for a huge page, then fall back to a smaller allocation
>>>>>
>>>>> Most likely the key consideration is that the callsites are using
>>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
>>>>> NOWARN case assuming that another allocation attempt will closely
>>>>> follow?
>>>>
>>>> GFP_NOWARN is also extensively used for allocations with
>>>> user-controlled size, e.g.:
>>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
>>>>
>>>> That's different and these allocations are usually not repeated.
>>>> So looking at GFP_NOWARN does not look like the right thing to do.
>>>
>>> This may be the best option then, arguably perhaps even more
>>> "realistic" than normal fail_nth as in a real system if this stuff
>>> starts failing there is a good chance things from then on will fail
>>> too during the error cleanup.
>>>
>>>>> However, this would also have to fix the obnoxious behavior of fail
>>>>> nth where it fails its own copy_from_user on its write system call -
>>>>> meaning there would be no way to turn it off.
>>>>
>>>> Oh, interesting. We added failing of copy_from/to_user later and did
>>>> not consider such interaction.
>>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>>>
>>> Oh, I will tell you the other two bugish things I noticed
>>>
>>> __should_failslab() has this:
>>>
>>>          if (gfpflags & __GFP_NOWARN)
>>>                  failslab.attr.no_warn = true;
>>>
>>>          return should_fail(&failslab.attr, s->object_size);
>>>
>>> Which always permanently turns off no_warn for slab during early
>>> boot. This is why syzkaller reports are so confusing. They trigger a
>>> slab fault injection, which in all other cases gives a notification
>>> backtrace, but in slab cases there is no hint about the fault
>>> injection in the log.
>>
>> Ouch, this looks like a bug in:
>>
>> commit 3f913fc5f9745613088d3c569778c9813ab9c129
>> Author: Qi Zheng <zhengqi.arch@bytedance.com>
>> Date:   Thu May 19 14:08:55 2022 -0700
>>       mm: fix missing handler for __GFP_NOWARN
>>
>> +Qi could you please fix it?
>>
>> At the very least the local gfpflags should not alter the global
>> failslab.attr that is persistent and shared by all tasks.
> 
> Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it.

How about the following changes? If it's ok, I will send this fix patch.
Thanks. :)

diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
index 9f6e25467844..b61a3fb7a2a3 100644
--- a/include/linux/fault-inject.h
+++ b/include/linux/fault-inject.h
@@ -20,7 +20,6 @@ struct fault_attr {
         atomic_t space;
         unsigned long verbose;
         bool task_filter;
-       bool no_warn;
         unsigned long stacktrace_depth;
         unsigned long require_start;
         unsigned long require_end;
@@ -40,12 +39,12 @@ struct fault_attr {
                 .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,       \
                 .verbose = 2,                                           \
                 .dname = NULL,                                          \
-               .no_warn = false,                                       \
         }

  #define DECLARE_FAULT_ATTR(name) struct fault_attr name = 
FAULT_ATTR_INITIALIZER
  int setup_fault_attr(struct fault_attr *attr, char *str);
  bool should_fail(struct fault_attr *attr, ssize_t size);
+bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t 
gfpflags);

  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS

diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 4b8fafce415c..95af50832770 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);

  static void fail_dump(struct fault_attr *attr)
  {
-       if (attr->no_warn)
-               return;
-
         if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
                 printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
                        "name %pd, interval %lu, probability %lu, "
@@ -98,12 +95,7 @@ static inline bool fail_stacktrace(struct fault_attr 
*attr)

  #endif /* CONFIG_FAULT_INJECTION_STACKTRACE_FILTER */

-/*
- * This code is stolen from failmalloc-1.0
- * http://www.nongnu.org/failmalloc/
- */
-
-bool should_fail(struct fault_attr *attr, ssize_t size)
+bool should_fail_check(struct fault_attr *attr, ssize_t size)
  {
         bool stack_checked = false;

@@ -118,7 +110,7 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
                         fail_nth--;
                         WRITE_ONCE(current->fail_nth, fail_nth);
                         if (!fail_nth)
-                               goto fail;
+                               return true;

                         return false;
                 }
@@ -151,7 +143,19 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
         if (attr->probability <= get_random_u32_below(100))
                 return false;

-fail:
+       return true;
+}
+
+/*
+ * This code is stolen from failmalloc-1.0
+ * http://www.nongnu.org/failmalloc/
+ */
+
+bool should_fail(struct fault_attr *attr, ssize_t size)
+{
+       if (!should_fail_check(attr, size))
+               return false;
+
         fail_dump(attr);

         if (atomic_read(&attr->times) != -1)
@@ -161,6 +165,21 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
  }
  EXPORT_SYMBOL_GPL(should_fail);

+bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t gfpflags)
+{
+       if (!should_fail_check(attr, size))
+               return false;
+
+       if (!(gfpflags & __GFP_NOWARN))
+               fail_dump(attr);
+
+       if (atomic_read(&attr->times) != -1)
+               atomic_dec_not_zero(&attr->times);
+
+       return true;
+}
+EXPORT_SYMBOL_GPL(should_fail_gfp);
+
  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS

  static int debugfs_ul_set(void *data, u64 val)
diff --git a/mm/failslab.c b/mm/failslab.c
index 58df9789f1d2..21338b256791 100644
--- a/mm/failslab.c
+++ b/mm/failslab.c
@@ -30,10 +30,7 @@ bool __should_failslab(struct kmem_cache *s, gfp_t 
gfpflags)
         if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
                 return false;

-       if (gfpflags & __GFP_NOWARN)
-               failslab.attr.no_warn = true;
-
-       return should_fail(&failslab.attr, s->object_size);
+       return should_fail_gfp(&failslab.attr, s->object_size, gfpflags);
  }

  static int __init setup_failslab(char *str)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7192ded44ad0..4e70b5599ada 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3912,10 +3912,7 @@ static bool __should_fail_alloc_page(gfp_t 
gfp_mask, unsigned int order)
                         (gfp_mask & __GFP_DIRECT_RECLAIM))
                 return false;

-       if (gfp_mask & __GFP_NOWARN)
-               fail_page_alloc.attr.no_warn = true;
-
-       return should_fail(&fail_page_alloc.attr, 1 << order);
+       return should_fail_gfp(&fail_page_alloc.attr, 1 << order, gfp_mask);
  }

  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS

> 
> But a warning should not be printed for callers that currently specify
> __GFP_NOWARN, because that could lead to deadlocks, such as the deadlock
> case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by 
> calling printk() under tty_port->lock").
> 
> Thanks,
> Qi
> 
>>
>> But I am not sure if we really don't want to issue the fault injection
>> stack in this case. It's not a WARNING, it's merely an information
>> message. It looks useful in all cases, even with GFP_NOWARN. Why
>> should it be suppressed?
>>
>>
>>> Once that is fixed we can quickly explain why the socketpair() example
>>> in the docs shows success ret codes in the middle of the sweep when
>>> run on syzkaller kernels
>>>
>>> fail_nth interacts badly with other kernel features typically enabled
>>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>>>
>>> [   18.499559] FAULT_INJECTION: forcing a failure.
>>> [   18.499559] name failslab, interval 1, probability 0, space 0, 
>>> times 0
>>> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 
>>> 6.1.0-rc3+ #34
>>> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>>> BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>>> [   18.499971] Call Trace:
>>> [   18.500010]  <TASK>
>>> [   18.500048]  show_stack+0x3d/0x3f
>>> [   18.500114]  dump_stack_lvl+0x92/0xbd
>>> [   18.500171]  dump_stack+0x15/0x17
>>> [   18.500232]  should_fail.cold+0x5/0xa
>>> [   18.500291]  __should_failslab+0xb6/0x100
>>> [   18.500349]  should_failslab+0x9/0x20
>>> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
>>> [   18.500477]  ? __create_object+0x40/0xc50
>>> [   18.500539]  __create_object+0x40/0xc50
>>> [   18.500620]  ? kasan_poison+0x3a/0x50
>>> [   18.500690]  ? kasan_unpoison+0x28/0x50
>>> [***18.500753]  kmemleak_alloc+0x24/0x30
>>> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
>>> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>> [   18.500993]  kmalloc_trace+0x26/0x110
>>> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>>
>>> Which has the consequence of syzkaller wasting half its fail_nth
>>> effort because it is triggering failures in hidden instrumentation
>>> that has no impact on the main code path.
>>>
>>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
>>> cases.
>>>
>>> Jason
> 

-- 
Thanks,
Qi

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-05 12:16                 ` Qi Zheng
@ 2022-11-06 17:36                   ` Dmitry Vyukov
  2022-11-07  2:13                     ` Qi Zheng
  2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
  0 siblings, 2 replies; 30+ messages in thread
From: Dmitry Vyukov @ 2022-11-06 17:36 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Jason Gunthorpe, Matthew Wilcox, Akinobu Mita, linux-fsdevel,
	syzkaller-bugs, syzkaller

On Sat, 5 Nov 2022 at 05:16, Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> > On 2022/11/5 02:21, Dmitry Vyukov wrote:
> >> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>>
> >>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
> >>>>>> Do we know how common/useful such an allocation pattern is?
> >>>>>
> >>>>> I have coded something like this a few times, in my cases it is
> >>>>> usually something like: try to allocate a big chunk of memory hoping
> >>>>> for a huge page, then fall back to a smaller allocation
> >>>>>
> >>>>> Most likely the key consideration is that the callsites are using
> >>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> >>>>> NOWARN case assuming that another allocation attempt will closely
> >>>>> follow?
> >>>>
> >>>> GFP_NOWARN is also extensively used for allocations with
> >>>> user-controlled size, e.g.:
> >>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
> >>>>
> >>>> That's different and these allocations are usually not repeated.
> >>>> So looking at GFP_NOWARN does not look like the right thing to do.
> >>>
> >>> This may be the best option then, arguably perhaps even more
> >>> "realistic" than normal fail_nth as in a real system if this stuff
> >>> starts failing there is a good chance things from then on will fail
> >>> too during the error cleanup.
> >>>
> >>>>> However, this would also have to fix the obnoxious behavior of fail
> >>>>> nth where it fails its own copy_from_user on its write system call -
> >>>>> meaning there would be no way to turn it off.
> >>>>
> >>>> Oh, interesting. We added failing of copy_from/to_user later and did
> >>>> not consider such interaction.
> >>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
> >>>
> >>> Oh, I will tell you the other two bugish things I noticed
> >>>
> >>> __should_failslab() has this:
> >>>
> >>>          if (gfpflags & __GFP_NOWARN)
> >>>                  failslab.attr.no_warn = true;
> >>>
> >>>          return should_fail(&failslab.attr, s->object_size);
> >>>
> >>> Which always permanently turns off no_warn for slab during early
> >>> boot. This is why syzkaller reports are so confusing. They trigger a
> >>> slab fault injection, which in all other cases gives a notification
> >>> backtrace, but in slab cases there is no hint about the fault
> >>> injection in the log.
> >>
> >> Ouch, this looks like a bug in:
> >>
> >> commit 3f913fc5f9745613088d3c569778c9813ab9c129
> >> Author: Qi Zheng <zhengqi.arch@bytedance.com>
> >> Date:   Thu May 19 14:08:55 2022 -0700
> >>       mm: fix missing handler for __GFP_NOWARN
> >>
> >> +Qi could you please fix it?
> >>
> >> At the very least the local gfpflags should not alter the global
> >> failslab.attr that is persistent and shared by all tasks.
> >
> > Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it.
>
> How about the following changes? If it's ok, I will send this fix patch.
> Thanks. :)

I think the interface design question is mostly to Akinobu as fault
injection maintainer.

> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
> index 9f6e25467844..b61a3fb7a2a3 100644
> --- a/include/linux/fault-inject.h
> +++ b/include/linux/fault-inject.h
> @@ -20,7 +20,6 @@ struct fault_attr {
>          atomic_t space;
>          unsigned long verbose;
>          bool task_filter;
> -       bool no_warn;
>          unsigned long stacktrace_depth;
>          unsigned long require_start;
>          unsigned long require_end;
> @@ -40,12 +39,12 @@ struct fault_attr {
>                  .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,       \
>                  .verbose = 2,                                           \
>                  .dname = NULL,                                          \
> -               .no_warn = false,                                       \
>          }
>
>   #define DECLARE_FAULT_ATTR(name) struct fault_attr name =
> FAULT_ATTR_INITIALIZER
>   int setup_fault_attr(struct fault_attr *attr, char *str);
>   bool should_fail(struct fault_attr *attr, ssize_t size);
> +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t
> gfpflags);
>
>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>
> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
> index 4b8fafce415c..95af50832770 100644
> --- a/lib/fault-inject.c
> +++ b/lib/fault-inject.c
> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>
>   static void fail_dump(struct fault_attr *attr)
>   {
> -       if (attr->no_warn)
> -               return;
> -
>          if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>                  printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>                         "name %pd, interval %lu, probability %lu, "
> @@ -98,12 +95,7 @@ static inline bool fail_stacktrace(struct fault_attr
> *attr)
>
>   #endif /* CONFIG_FAULT_INJECTION_STACKTRACE_FILTER */
>
> -/*
> - * This code is stolen from failmalloc-1.0
> - * http://www.nongnu.org/failmalloc/
> - */
> -
> -bool should_fail(struct fault_attr *attr, ssize_t size)
> +bool should_fail_check(struct fault_attr *attr, ssize_t size)
>   {
>          bool stack_checked = false;
>
> @@ -118,7 +110,7 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>                          fail_nth--;
>                          WRITE_ONCE(current->fail_nth, fail_nth);
>                          if (!fail_nth)
> -                               goto fail;
> +                               return true;
>
>                          return false;
>                  }
> @@ -151,7 +143,19 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>          if (attr->probability <= get_random_u32_below(100))
>                  return false;
>
> -fail:
> +       return true;
> +}
> +
> +/*
> + * This code is stolen from failmalloc-1.0
> + * http://www.nongnu.org/failmalloc/
> + */
> +
> +bool should_fail(struct fault_attr *attr, ssize_t size)
> +{
> +       if (!should_fail_check(attr, size))
> +               return false;
> +
>          fail_dump(attr);
>
>          if (atomic_read(&attr->times) != -1)
> @@ -161,6 +165,21 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>   }
>   EXPORT_SYMBOL_GPL(should_fail);
>
> +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t gfpflags)
> +{
> +       if (!should_fail_check(attr, size))
> +               return false;
> +
> +       if (!(gfpflags & __GFP_NOWARN))
> +               fail_dump(attr);
> +
> +       if (atomic_read(&attr->times) != -1)
> +               atomic_dec_not_zero(&attr->times);
> +
> +       return true;
> +}
> +EXPORT_SYMBOL_GPL(should_fail_gfp);

should_fail/should_fail_gfp duplicate some code + gfp is slab-specific
(while clumsy to use for other fault injection types + we won't be
able to pass any runtime flags that are not present in gfp). So I
would go either with:

bool should_fail(struct fault_attr *attr, ssize_t size, bool nowarn);

or even more extensible interface would be:

enum fault_flags {
  fault_nowarn = 1 << 0,
};

bool should_fail(struct fault_attr *attr, ssize_t size, fault_flags flags);

And if we don't want to change all callers to avoid code duplication:

bool should_fail_ex(struct fault_attr *attr, ssize_t size, fault_flags flags);
bool should_fail(struct fault_attr *attr, ssize_t size) {
  return should_fail_ex(attr, size, 0);
}



>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>
>   static int debugfs_ul_set(void *data, u64 val)
> diff --git a/mm/failslab.c b/mm/failslab.c
> index 58df9789f1d2..21338b256791 100644
> --- a/mm/failslab.c
> +++ b/mm/failslab.c
> @@ -30,10 +30,7 @@ bool __should_failslab(struct kmem_cache *s, gfp_t
> gfpflags)
>          if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>                  return false;
>
> -       if (gfpflags & __GFP_NOWARN)
> -               failslab.attr.no_warn = true;
> -
> -       return should_fail(&failslab.attr, s->object_size);
> +       return should_fail_gfp(&failslab.attr, s->object_size, gfpflags);
>   }
>
>   static int __init setup_failslab(char *str)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7192ded44ad0..4e70b5599ada 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3912,10 +3912,7 @@ static bool __should_fail_alloc_page(gfp_t
> gfp_mask, unsigned int order)
>                          (gfp_mask & __GFP_DIRECT_RECLAIM))
>                  return false;
>
> -       if (gfp_mask & __GFP_NOWARN)
> -               fail_page_alloc.attr.no_warn = true;
> -
> -       return should_fail(&fail_page_alloc.attr, 1 << order);
> +       return should_fail_gfp(&fail_page_alloc.attr, 1 << order, gfp_mask);
>   }
>
>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>
> >
> > But a warning should not be printed for callers that currently specify
> > __GFP_NOWARN, because that could lead to deadlocks, such as the deadlock
> > case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by
> > calling printk() under tty_port->lock").
> >
> > Thanks,
> > Qi
> >
> >>
> >> But I am not sure if we really don't want to issue the fault injection
> >> stack in this case. It's not a WARNING, it's merely an information
> >> message. It looks useful in all cases, even with GFP_NOWARN. Why
> >> should it be suppressed?
> >>
> >>
> >>> Once that is fixed we can quickly explain why the socketpair() example
> >>> in the docs shows success ret codes in the middle of the sweep when
> >>> run on syzkaller kernels
> >>>
> >>> fail_nth interacts badly with other kernel features typically enabled
> >>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
> >>>
> >>> [   18.499559] FAULT_INJECTION: forcing a failure.
> >>> [   18.499559] name failslab, interval 1, probability 0, space 0,
> >>> times 0
> >>> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted
> >>> 6.1.0-rc3+ #34
> >>> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> >>> BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> >>> [   18.499971] Call Trace:
> >>> [   18.500010]  <TASK>
> >>> [   18.500048]  show_stack+0x3d/0x3f
> >>> [   18.500114]  dump_stack_lvl+0x92/0xbd
> >>> [   18.500171]  dump_stack+0x15/0x17
> >>> [   18.500232]  should_fail.cold+0x5/0xa
> >>> [   18.500291]  __should_failslab+0xb6/0x100
> >>> [   18.500349]  should_failslab+0x9/0x20
> >>> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
> >>> [   18.500477]  ? __create_object+0x40/0xc50
> >>> [   18.500539]  __create_object+0x40/0xc50
> >>> [   18.500620]  ? kasan_poison+0x3a/0x50
> >>> [   18.500690]  ? kasan_unpoison+0x28/0x50
> >>> [***18.500753]  kmemleak_alloc+0x24/0x30
> >>> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
> >>> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
> >>> [   18.500993]  kmalloc_trace+0x26/0x110
> >>> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
> >>>
> >>> Which has the consequence of syzkaller wasting half its fail_nth
> >>> effort because it is triggering failures in hidden instrumentation
> >>> that has no impact on the main code path.
> >>>
> >>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
> >>> cases.
> >>>
> >>> Jason
> >
>
> --
> Thanks,
> Qi
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/be6a67b0-479f-db0a-fa69-764713135d70%40bytedance.com.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: xarray, fault injection and syzkaller
  2022-11-06 17:36                   ` Dmitry Vyukov
@ 2022-11-07  2:13                     ` Qi Zheng
  2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
  1 sibling, 0 replies; 30+ messages in thread
From: Qi Zheng @ 2022-11-07  2:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jason Gunthorpe, Matthew Wilcox, Akinobu Mita, linux-fsdevel,
	syzkaller-bugs, syzkaller



On 2022/11/7 01:36, Dmitry Vyukov wrote:
> On Sat, 5 Nov 2022 at 05:16, Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>>> On 2022/11/5 02:21, Dmitry Vyukov wrote:
>>>> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>
>>>>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
>>>>>>>> Do we know how common/useful such an allocation pattern is?
>>>>>>>
>>>>>>> I have coded something like this a few times, in my cases it is
>>>>>>> usually something like: try to allocate a big chunk of memory hoping
>>>>>>> for a huge page, then fall back to a smaller allocation
>>>>>>>
>>>>>>> Most likely the key consideration is that the callsites are using
>>>>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
>>>>>>> NOWARN case assuming that another allocation attempt will closely
>>>>>>> follow?
>>>>>>
>>>>>> GFP_NOWARN is also extensively used for allocations with
>>>>>> user-controlled size, e.g.:
>>>>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
>>>>>>
>>>>>> That's different and these allocations are usually not repeated.
>>>>>> So looking at GFP_NOWARN does not look like the right thing to do.
>>>>>
>>>>> This may be the best option then, arguably perhaps even more
>>>>> "realistic" than normal fail_nth as in a real system if this stuff
>>>>> starts failing there is a good chance things from then on will fail
>>>>> too during the error cleanup.
>>>>>
>>>>>>> However, this would also have to fix the obnoxious behavior of fail
>>>>>>> nth where it fails its own copy_from_user on its write system call -
>>>>>>> meaning there would be no way to turn it off.
>>>>>>
>>>>>> Oh, interesting. We added failing of copy_from/to_user later and did
>>>>>> not consider such interaction.
>>>>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>>>>>
>>>>> Oh, I will tell you the other two bugish things I noticed
>>>>>
>>>>> __should_failslab() has this:
>>>>>
>>>>>           if (gfpflags & __GFP_NOWARN)
>>>>>                   failslab.attr.no_warn = true;
>>>>>
>>>>>           return should_fail(&failslab.attr, s->object_size);
>>>>>
>>>>> Which always permanently turns off no_warn for slab during early
>>>>> boot. This is why syzkaller reports are so confusing. They trigger a
>>>>> slab fault injection, which in all other cases gives a notification
>>>>> backtrace, but in slab cases there is no hint about the fault
>>>>> injection in the log.
>>>>
>>>> Ouch, this looks like a bug in:
>>>>
>>>> commit 3f913fc5f9745613088d3c569778c9813ab9c129
>>>> Author: Qi Zheng <zhengqi.arch@bytedance.com>
>>>> Date:   Thu May 19 14:08:55 2022 -0700
>>>>        mm: fix missing handler for __GFP_NOWARN
>>>>
>>>> +Qi could you please fix it?
>>>>
>>>> At the very least the local gfpflags should not alter the global
>>>> failslab.attr that is persistent and shared by all tasks.
>>>
>>> Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it.
>>
>> How about the following changes? If it's ok, I will send this fix patch.
>> Thanks. :)
> 
> I think the interface design question is mostly to Akinobu as fault
> injection maintainer.
> 
>> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
>> index 9f6e25467844..b61a3fb7a2a3 100644
>> --- a/include/linux/fault-inject.h
>> +++ b/include/linux/fault-inject.h
>> @@ -20,7 +20,6 @@ struct fault_attr {
>>           atomic_t space;
>>           unsigned long verbose;
>>           bool task_filter;
>> -       bool no_warn;
>>           unsigned long stacktrace_depth;
>>           unsigned long require_start;
>>           unsigned long require_end;
>> @@ -40,12 +39,12 @@ struct fault_attr {
>>                   .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,       \
>>                   .verbose = 2,                                           \
>>                   .dname = NULL,                                          \
>> -               .no_warn = false,                                       \
>>           }
>>
>>    #define DECLARE_FAULT_ATTR(name) struct fault_attr name =
>> FAULT_ATTR_INITIALIZER
>>    int setup_fault_attr(struct fault_attr *attr, char *str);
>>    bool should_fail(struct fault_attr *attr, ssize_t size);
>> +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t
>> gfpflags);
>>
>>    #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>
>> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
>> index 4b8fafce415c..95af50832770 100644
>> --- a/lib/fault-inject.c
>> +++ b/lib/fault-inject.c
>> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>>
>>    static void fail_dump(struct fault_attr *attr)
>>    {
>> -       if (attr->no_warn)
>> -               return;
>> -
>>           if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>>                   printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>>                          "name %pd, interval %lu, probability %lu, "
>> @@ -98,12 +95,7 @@ static inline bool fail_stacktrace(struct fault_attr
>> *attr)
>>
>>    #endif /* CONFIG_FAULT_INJECTION_STACKTRACE_FILTER */
>>
>> -/*
>> - * This code is stolen from failmalloc-1.0
>> - * http://www.nongnu.org/failmalloc/
>> - */
>> -
>> -bool should_fail(struct fault_attr *attr, ssize_t size)
>> +bool should_fail_check(struct fault_attr *attr, ssize_t size)
>>    {
>>           bool stack_checked = false;
>>
>> @@ -118,7 +110,7 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>                           fail_nth--;
>>                           WRITE_ONCE(current->fail_nth, fail_nth);
>>                           if (!fail_nth)
>> -                               goto fail;
>> +                               return true;
>>
>>                           return false;
>>                   }
>> @@ -151,7 +143,19 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>           if (attr->probability <= get_random_u32_below(100))
>>                   return false;
>>
>> -fail:
>> +       return true;
>> +}
>> +
>> +/*
>> + * This code is stolen from failmalloc-1.0
>> + * http://www.nongnu.org/failmalloc/
>> + */
>> +
>> +bool should_fail(struct fault_attr *attr, ssize_t size)
>> +{
>> +       if (!should_fail_check(attr, size))
>> +               return false;
>> +
>>           fail_dump(attr);
>>
>>           if (atomic_read(&attr->times) != -1)
>> @@ -161,6 +165,21 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>    }
>>    EXPORT_SYMBOL_GPL(should_fail);
>>
>> +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t gfpflags)
>> +{
>> +       if (!should_fail_check(attr, size))
>> +               return false;
>> +
>> +       if (!(gfpflags & __GFP_NOWARN))
>> +               fail_dump(attr);
>> +
>> +       if (atomic_read(&attr->times) != -1)
>> +               atomic_dec_not_zero(&attr->times);
>> +
>> +       return true;
>> +}
>> +EXPORT_SYMBOL_GPL(should_fail_gfp);
> 
> should_fail/should_fail_gfp duplicate some code + gfp is slab-specific
> (while clumsy to use for other fault injection types + we won't be
> able to pass any runtime flags that are not present in gfp). So I
> would go either with:
> 
> bool should_fail(struct fault_attr *attr, ssize_t size, bool nowarn);
> 
> or even more extensible interface would be:
> 
> enum fault_flags {
>    fault_nowarn = 1 << 0,
> };
> 
> bool should_fail(struct fault_attr *attr, ssize_t size, fault_flags flags);
> 
> And if we don't want to change all callers to avoid code duplication:
> 
> bool should_fail_ex(struct fault_attr *attr, ssize_t size, fault_flags flags);
> bool should_fail(struct fault_attr *attr, ssize_t size) {
>    return should_fail_ex(attr, size, 0);
> }

This looks better to me, I will revise and resend this fix patch later.

Thanks,
Qi

> 
> 
> 
>>    #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>
>>    static int debugfs_ul_set(void *data, u64 val)
>> diff --git a/mm/failslab.c b/mm/failslab.c
>> index 58df9789f1d2..21338b256791 100644
>> --- a/mm/failslab.c
>> +++ b/mm/failslab.c
>> @@ -30,10 +30,7 @@ bool __should_failslab(struct kmem_cache *s, gfp_t
>> gfpflags)
>>           if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>>                   return false;
>>
>> -       if (gfpflags & __GFP_NOWARN)
>> -               failslab.attr.no_warn = true;
>> -
>> -       return should_fail(&failslab.attr, s->object_size);
>> +       return should_fail_gfp(&failslab.attr, s->object_size, gfpflags);
>>    }
>>
>>    static int __init setup_failslab(char *str)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 7192ded44ad0..4e70b5599ada 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3912,10 +3912,7 @@ static bool __should_fail_alloc_page(gfp_t
>> gfp_mask, unsigned int order)
>>                           (gfp_mask & __GFP_DIRECT_RECLAIM))
>>                   return false;
>>
>> -       if (gfp_mask & __GFP_NOWARN)
>> -               fail_page_alloc.attr.no_warn = true;
>> -
>> -       return should_fail(&fail_page_alloc.attr, 1 << order);
>> +       return should_fail_gfp(&fail_page_alloc.attr, 1 << order, gfp_mask);
>>    }
>>
>>    #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>
>>>
>>> But a warning should not be printed for callers that currently specify
>>> __GFP_NOWARN, because that could lead to deadlocks, such as the deadlock
>>> case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by
>>> calling printk() under tty_port->lock").
>>>
>>> Thanks,
>>> Qi
>>>
>>>>
>>>> But I am not sure if we really don't want to issue the fault injection
>>>> stack in this case. It's not a WARNING, it's merely an information
>>>> message. It looks useful in all cases, even with GFP_NOWARN. Why
>>>> should it be suppressed?
>>>>
>>>>
>>>>> Once that is fixed we can quickly explain why the socketpair() example
>>>>> in the docs shows success ret codes in the middle of the sweep when
>>>>> run on syzkaller kernels
>>>>>
>>>>> fail_nth interacts badly with other kernel features typically enabled
>>>>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>>>>>
>>>>> [   18.499559] FAULT_INJECTION: forcing a failure.
>>>>> [   18.499559] name failslab, interval 1, probability 0, space 0,
>>>>> times 0
>>>>> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted
>>>>> 6.1.0-rc3+ #34
>>>>> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
>>>>> BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>>>>> [   18.499971] Call Trace:
>>>>> [   18.500010]  <TASK>
>>>>> [   18.500048]  show_stack+0x3d/0x3f
>>>>> [   18.500114]  dump_stack_lvl+0x92/0xbd
>>>>> [   18.500171]  dump_stack+0x15/0x17
>>>>> [   18.500232]  should_fail.cold+0x5/0xa
>>>>> [   18.500291]  __should_failslab+0xb6/0x100
>>>>> [   18.500349]  should_failslab+0x9/0x20
>>>>> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
>>>>> [   18.500477]  ? __create_object+0x40/0xc50
>>>>> [   18.500539]  __create_object+0x40/0xc50
>>>>> [   18.500620]  ? kasan_poison+0x3a/0x50
>>>>> [   18.500690]  ? kasan_unpoison+0x28/0x50
>>>>> [***18.500753]  kmemleak_alloc+0x24/0x30
>>>>> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
>>>>> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>>>> [   18.500993]  kmalloc_trace+0x26/0x110
>>>>> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>>>>
>>>>> Which has the consequence of syzkaller wasting half its fail_nth
>>>>> effort because it is triggering failures in hidden instrumentation
>>>>> that has no impact on the main code path.
>>>>>
>>>>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
>>>>> cases.
>>>>>
>>>>> Jason
>>>
>>
>> --
>> Thanks,
>> Qi
>>
>> --
>> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/be6a67b0-479f-db0a-fa69-764713135d70%40bytedance.com.

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-06 17:36                   ` Dmitry Vyukov
  2022-11-07  2:13                     ` Qi Zheng
@ 2022-11-07  3:31                     ` Qi Zheng
  2022-11-07 12:42                       ` Jason Gunthorpe
  1 sibling, 1 reply; 30+ messages in thread
From: Qi Zheng @ 2022-11-07  3:31 UTC (permalink / raw)
  To: dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, Qi Zheng, stable

When we specify __GFP_NOWARN, we only expect that no warnings
will be issued for current caller. But in the __should_failslab()
and __should_fail_alloc_page(), the local GFP flags alter the
global {failslab|fail_page_alloc}.attr, which is persistent and
shared by all tasks. This is not what we expected, let's fix it.

Cc: stable@vger.kernel.org
Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/fault-inject.h |  7 +++++--
 lib/fault-inject.c           | 14 +++++++++-----
 mm/failslab.c                |  6 ++++--
 mm/page_alloc.c              |  6 ++++--
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
index 9f6e25467844..444236dadcf0 100644
--- a/include/linux/fault-inject.h
+++ b/include/linux/fault-inject.h
@@ -20,7 +20,6 @@ struct fault_attr {
 	atomic_t space;
 	unsigned long verbose;
 	bool task_filter;
-	bool no_warn;
 	unsigned long stacktrace_depth;
 	unsigned long require_start;
 	unsigned long require_end;
@@ -32,6 +31,10 @@ struct fault_attr {
 	struct dentry *dname;
 };
 
+enum fault_flags {
+	FAULT_NOWARN =	1 << 0,
+};
+
 #define FAULT_ATTR_INITIALIZER {					\
 		.interval = 1,						\
 		.times = ATOMIC_INIT(1),				\
@@ -40,11 +43,11 @@ struct fault_attr {
 		.ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,	\
 		.verbose = 2,						\
 		.dname = NULL,						\
-		.no_warn = false,					\
 	}
 
 #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
 int setup_fault_attr(struct fault_attr *attr, char *str);
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
 bool should_fail(struct fault_attr *attr, ssize_t size);
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 4b8fafce415c..5971f7c3e49e 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
 
 static void fail_dump(struct fault_attr *attr)
 {
-	if (attr->no_warn)
-		return;
-
 	if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
 		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
 		       "name %pd, interval %lu, probability %lu, "
@@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
  * http://www.nongnu.org/failmalloc/
  */
 
-bool should_fail(struct fault_attr *attr, ssize_t size)
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
 {
 	bool stack_checked = false;
 
@@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
 		return false;
 
 fail:
-	fail_dump(attr);
+	if (!(flags & FAULT_NOWARN))
+		fail_dump(attr);
 
 	if (atomic_read(&attr->times) != -1)
 		atomic_dec_not_zero(&attr->times);
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(should_fail_ex);
+
+bool should_fail(struct fault_attr *attr, ssize_t size)
+{
+	return should_fail_ex(attr, size, 0);
+}
 EXPORT_SYMBOL_GPL(should_fail);
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/mm/failslab.c b/mm/failslab.c
index 58df9789f1d2..fc046f26606c 100644
--- a/mm/failslab.c
+++ b/mm/failslab.c
@@ -16,6 +16,8 @@ static struct {
 
 bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
+	int flags = 0;
+
 	/* No fault-injection for bootstrap cache */
 	if (unlikely(s == kmem_cache))
 		return false;
@@ -31,9 +33,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 		return false;
 
 	if (gfpflags & __GFP_NOWARN)
-		failslab.attr.no_warn = true;
+		flags |= FAULT_NOWARN;
 
-	return should_fail(&failslab.attr, s->object_size);
+	return should_fail_ex(&failslab.attr, s->object_size, flags);
 }
 
 static int __init setup_failslab(char *str)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7192ded44ad0..e537d3a950a4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
 
 static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 {
+	int flags = 0;
+
 	if (order < fail_page_alloc.min_order)
 		return false;
 	if (gfp_mask & __GFP_NOFAIL)
@@ -3913,9 +3915,9 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 		return false;
 
 	if (gfp_mask & __GFP_NOWARN)
-		fail_page_alloc.attr.no_warn = true;
+		flags |= FAULT_NOWARN;
 
-	return should_fail(&fail_page_alloc.attr, 1 << order);
+	return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
 }
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
@ 2022-11-07 12:42                       ` Jason Gunthorpe
  2022-11-07 15:05                         ` Qi Zheng
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-07 12:42 UTC (permalink / raw)
  To: Qi Zheng
  Cc: dvyukov, willy, akinobu.mita, akpm, linux-kernel, linux-mm,
	linux-fsdevel, stable

On Mon, Nov 07, 2022 at 11:31:09AM +0800, Qi Zheng wrote:

> @@ -31,9 +33,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>  		return false;
>  
>  	if (gfpflags & __GFP_NOWARN)
> -		failslab.attr.no_warn = true;
> +		flags |= FAULT_NOWARN;

You should add a comment here about why this is required, to avoid
deadlocking printk

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-07 12:42                       ` Jason Gunthorpe
@ 2022-11-07 15:05                         ` Qi Zheng
  2022-11-07 16:26                           ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Qi Zheng @ 2022-11-07 15:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dvyukov, willy, akinobu.mita, akpm, linux-kernel, linux-mm,
	linux-fsdevel, stable



On 2022/11/7 20:42, Jason Gunthorpe wrote:
> On Mon, Nov 07, 2022 at 11:31:09AM +0800, Qi Zheng wrote:
> 
>> @@ -31,9 +33,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>   		return false;
>>   
>>   	if (gfpflags & __GFP_NOWARN)
>> -		failslab.attr.no_warn = true;
>> +		flags |= FAULT_NOWARN;
> 
> You should add a comment here about why this is required, to avoid
> deadlocking printk

I think this comment should be placed where __GFP_NOWARN is specified
instead of here. What do you think? :)

Thanks,
Qi

> 
> Jason

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-07 15:05                         ` Qi Zheng
@ 2022-11-07 16:26                           ` Jason Gunthorpe
  2022-11-08  2:47                             ` Qi Zheng
  2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
  0 siblings, 2 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-07 16:26 UTC (permalink / raw)
  To: Qi Zheng
  Cc: dvyukov, willy, akinobu.mita, akpm, linux-kernel, linux-mm,
	linux-fsdevel, stable

On Mon, Nov 07, 2022 at 11:05:42PM +0800, Qi Zheng wrote:
> 
> 
> On 2022/11/7 20:42, Jason Gunthorpe wrote:
> > On Mon, Nov 07, 2022 at 11:31:09AM +0800, Qi Zheng wrote:
> > 
> > > @@ -31,9 +33,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
> > >   		return false;
> > >   	if (gfpflags & __GFP_NOWARN)
> > > -		failslab.attr.no_warn = true;
> > > +		flags |= FAULT_NOWARN;
> > 
> > You should add a comment here about why this is required, to avoid
> > deadlocking printk
> 
> I think this comment should be placed where __GFP_NOWARN is specified
> instead of here. What do you think? :)

NOWARN is clear what it does, it is this specifically that is very
subtle about avoiding deadlock aginst allocations triggered by
printk/etc code.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-07 16:26                           ` Jason Gunthorpe
@ 2022-11-08  2:47                             ` Qi Zheng
  2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
  1 sibling, 0 replies; 30+ messages in thread
From: Qi Zheng @ 2022-11-08  2:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dvyukov, willy, akinobu.mita, akpm, linux-kernel, linux-mm,
	linux-fsdevel, stable



On 2022/11/8 00:26, Jason Gunthorpe wrote:
> On Mon, Nov 07, 2022 at 11:05:42PM +0800, Qi Zheng wrote:
>>
>>
>> On 2022/11/7 20:42, Jason Gunthorpe wrote:
>>> On Mon, Nov 07, 2022 at 11:31:09AM +0800, Qi Zheng wrote:
>>>
>>>> @@ -31,9 +33,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>>>    		return false;
>>>>    	if (gfpflags & __GFP_NOWARN)
>>>> -		failslab.attr.no_warn = true;
>>>> +		flags |= FAULT_NOWARN;
>>>
>>> You should add a comment here about why this is required, to avoid
>>> deadlocking printk
>>
>> I think this comment should be placed where __GFP_NOWARN is specified
>> instead of here. What do you think? :)
> 
> NOWARN is clear what it does, it is this specifically that is very
> subtle about avoiding deadlock aginst allocations triggered by
> printk/etc code.

Oh, maybe I understand your concern. Some people may think that this
is just a print of fault injection information, not a warning. I'll
add a comment explaining why in some cases there must be no printing.

Thanks,
Qi

> 
> Jason

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-07 16:26                           ` Jason Gunthorpe
  2022-11-08  2:47                             ` Qi Zheng
@ 2022-11-08  3:52                             ` Qi Zheng
  2022-11-08  8:44                               ` Wei Yongjun
  2022-11-08 17:36                               ` Akinobu Mita
  1 sibling, 2 replies; 30+ messages in thread
From: Qi Zheng @ 2022-11-08  3:52 UTC (permalink / raw)
  To: dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, Qi Zheng, stable

When we specify __GFP_NOWARN, we only expect that no warnings
will be issued for current caller. But in the __should_failslab()
and __should_fail_alloc_page(), the local GFP flags alter the
global {failslab|fail_page_alloc}.attr, which is persistent and
shared by all tasks. This is not what we expected, let's fix it.

Cc: stable@vger.kernel.org
Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/

 Changelog in v1 -> v2:
  - add comment for __should_failslab() and __should_fail_alloc_page()
    (suggested by Jason)

 include/linux/fault-inject.h |  7 +++++--
 lib/fault-inject.c           | 14 +++++++++-----
 mm/failslab.c                | 12 ++++++++++--
 mm/page_alloc.c              |  7 +++++--
 4 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
index 9f6e25467844..444236dadcf0 100644
--- a/include/linux/fault-inject.h
+++ b/include/linux/fault-inject.h
@@ -20,7 +20,6 @@ struct fault_attr {
 	atomic_t space;
 	unsigned long verbose;
 	bool task_filter;
-	bool no_warn;
 	unsigned long stacktrace_depth;
 	unsigned long require_start;
 	unsigned long require_end;
@@ -32,6 +31,10 @@ struct fault_attr {
 	struct dentry *dname;
 };
 
+enum fault_flags {
+	FAULT_NOWARN =	1 << 0,
+};
+
 #define FAULT_ATTR_INITIALIZER {					\
 		.interval = 1,						\
 		.times = ATOMIC_INIT(1),				\
@@ -40,11 +43,11 @@ struct fault_attr {
 		.ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,	\
 		.verbose = 2,						\
 		.dname = NULL,						\
-		.no_warn = false,					\
 	}
 
 #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
 int setup_fault_attr(struct fault_attr *attr, char *str);
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
 bool should_fail(struct fault_attr *attr, ssize_t size);
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 4b8fafce415c..5971f7c3e49e 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
 
 static void fail_dump(struct fault_attr *attr)
 {
-	if (attr->no_warn)
-		return;
-
 	if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
 		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
 		       "name %pd, interval %lu, probability %lu, "
@@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
  * http://www.nongnu.org/failmalloc/
  */
 
-bool should_fail(struct fault_attr *attr, ssize_t size)
+bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
 {
 	bool stack_checked = false;
 
@@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
 		return false;
 
 fail:
-	fail_dump(attr);
+	if (!(flags & FAULT_NOWARN))
+		fail_dump(attr);
 
 	if (atomic_read(&attr->times) != -1)
 		atomic_dec_not_zero(&attr->times);
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(should_fail_ex);
+
+bool should_fail(struct fault_attr *attr, ssize_t size)
+{
+	return should_fail_ex(attr, size, 0);
+}
 EXPORT_SYMBOL_GPL(should_fail);
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/mm/failslab.c b/mm/failslab.c
index 58df9789f1d2..ffc420c0e767 100644
--- a/mm/failslab.c
+++ b/mm/failslab.c
@@ -16,6 +16,8 @@ static struct {
 
 bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
+	int flags = 0;
+
 	/* No fault-injection for bootstrap cache */
 	if (unlikely(s == kmem_cache))
 		return false;
@@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
 		return false;
 
+	/*
+	 * In some cases, it expects to specify __GFP_NOWARN
+	 * to avoid printing any information(not just a warning),
+	 * thus avoiding deadlocks. See commit 6b9dbedbe349 for
+	 * details.
+	 */
 	if (gfpflags & __GFP_NOWARN)
-		failslab.attr.no_warn = true;
+		flags |= FAULT_NOWARN;
 
-	return should_fail(&failslab.attr, s->object_size);
+	return should_fail_ex(&failslab.attr, s->object_size, flags);
 }
 
 static int __init setup_failslab(char *str)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7192ded44ad0..cb6fe715d983 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
 
 static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 {
+	int flags = 0;
+
 	if (order < fail_page_alloc.min_order)
 		return false;
 	if (gfp_mask & __GFP_NOFAIL)
@@ -3912,10 +3914,11 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 			(gfp_mask & __GFP_DIRECT_RECLAIM))
 		return false;
 
+	/* See comment in __should_failslab() */
 	if (gfp_mask & __GFP_NOWARN)
-		fail_page_alloc.attr.no_warn = true;
+		flags |= FAULT_NOWARN;
 
-	return should_fail(&fail_page_alloc.attr, 1 << order);
+	return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
 }
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
@ 2022-11-08  8:44                               ` Wei Yongjun
  2022-11-08  8:58                                 ` Qi Zheng
  2022-11-08 17:36                               ` Akinobu Mita
  1 sibling, 1 reply; 30+ messages in thread
From: Wei Yongjun @ 2022-11-08  8:44 UTC (permalink / raw)
  To: Qi Zheng, dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, stable

Hi Zheng Qi,

On 2022/11/8 11:52, Qi Zheng wrote:
> When we specify __GFP_NOWARN, we only expect that no warnings
> will be issued for current caller. But in the __should_failslab()
> and __should_fail_alloc_page(), the local GFP flags alter the
> global {failslab|fail_page_alloc}.attr, which is persistent and
> shared by all tasks. This is not what we expected, let's fix it.
> 
> Cc: stable@vger.kernel.org
> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
>  v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
> 
>  Changelog in v1 -> v2:
>   - add comment for __should_failslab() and __should_fail_alloc_page()
>     (suggested by Jason)
> 
>  include/linux/fault-inject.h |  7 +++++--
>  lib/fault-inject.c           | 14 +++++++++-----
>  mm/failslab.c                | 12 ++++++++++--
>  mm/page_alloc.c              |  7 +++++--
>  4 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
> index 9f6e25467844..444236dadcf0 100644
> --- a/include/linux/fault-inject.h
> +++ b/include/linux/fault-inject.h
> @@ -20,7 +20,6 @@ struct fault_attr {
>  	atomic_t space;
>  	unsigned long verbose;
>  	bool task_filter;
> -	bool no_warn;
>  	unsigned long stacktrace_depth;
>  	unsigned long require_start;
>  	unsigned long require_end;
> @@ -32,6 +31,10 @@ struct fault_attr {
>  	struct dentry *dname;
>  };
>  
> +enum fault_flags {
> +	FAULT_NOWARN =	1 << 0,
> +};
> +
>  #define FAULT_ATTR_INITIALIZER {					\
>  		.interval = 1,						\
>  		.times = ATOMIC_INIT(1),				\
> @@ -40,11 +43,11 @@ struct fault_attr {
>  		.ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,	\
>  		.verbose = 2,						\
>  		.dname = NULL,						\
> -		.no_warn = false,					\

How about keep no_warn attr as it be, and export it to user?

When testing with fault injection, and each fault will print an backtrace.
but not all of the testsuit can tell us which one is fault injection
message or other is a real warning/crash like syzkaller do.

In my case, to make things simple, we usually used a regex to detect whether
wanring/error happend. So we disabled the slab/page fault warning message by
default, and only enable it when debug real issue.

Regards,


>  	}
>  
>  #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
>  int setup_fault_attr(struct fault_attr *attr, char *str);
> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
>  bool should_fail(struct fault_attr *attr, ssize_t size);
>  
>  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
> index 4b8fafce415c..5971f7c3e49e 100644
> --- a/lib/fault-inject.c
> +++ b/lib/fault-inject.c
> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>  
>  static void fail_dump(struct fault_attr *attr)
>  {
> -	if (attr->no_warn)
> -		return;
> -
>  	if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>  		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>  		       "name %pd, interval %lu, probability %lu, "
> @@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
>   * http://www.nongnu.org/failmalloc/
>   */
>  
> -bool should_fail(struct fault_attr *attr, ssize_t size)
> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
>  {
>  	bool stack_checked = false;
>  
> @@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>  		return false;
>  
>  fail:
> -	fail_dump(attr);
> +	if (!(flags & FAULT_NOWARN))
> +		fail_dump(attr);
>  
>  	if (atomic_read(&attr->times) != -1)
>  		atomic_dec_not_zero(&attr->times);
>  
>  	return true;
>  }
> +EXPORT_SYMBOL_GPL(should_fail_ex);
> +
> +bool should_fail(struct fault_attr *attr, ssize_t size)
> +{
> +	return should_fail_ex(attr, size, 0);
> +}
>  EXPORT_SYMBOL_GPL(should_fail);
>  
>  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
> diff --git a/mm/failslab.c b/mm/failslab.c
> index 58df9789f1d2..ffc420c0e767 100644
> --- a/mm/failslab.c
> +++ b/mm/failslab.c
> @@ -16,6 +16,8 @@ static struct {
>  
>  bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>  {
> +	int flags = 0;
> +
>  	/* No fault-injection for bootstrap cache */
>  	if (unlikely(s == kmem_cache))
>  		return false;
> @@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>  	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>  		return false;
>  
> +	/*
> +	 * In some cases, it expects to specify __GFP_NOWARN
> +	 * to avoid printing any information(not just a warning),
> +	 * thus avoiding deadlocks. See commit 6b9dbedbe349 for
> +	 * details.
> +	 */
>  	if (gfpflags & __GFP_NOWARN)
> -		failslab.attr.no_warn = true;
> +		flags |= FAULT_NOWARN;
>  
> -	return should_fail(&failslab.attr, s->object_size);
> +	return should_fail_ex(&failslab.attr, s->object_size, flags);
>  }
>  
>  static int __init setup_failslab(char *str)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7192ded44ad0..cb6fe715d983 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
>  
>  static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>  {
> +	int flags = 0;
> +
>  	if (order < fail_page_alloc.min_order)
>  		return false;
>  	if (gfp_mask & __GFP_NOFAIL)
> @@ -3912,10 +3914,11 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>  			(gfp_mask & __GFP_DIRECT_RECLAIM))
>  		return false;
>  
> +	/* See comment in __should_failslab() */
>  	if (gfp_mask & __GFP_NOWARN)
> -		fail_page_alloc.attr.no_warn = true;
> +		flags |= FAULT_NOWARN;
>  
> -	return should_fail(&fail_page_alloc.attr, 1 << order);
> +	return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
>  }
>  
>  #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  8:44                               ` Wei Yongjun
@ 2022-11-08  8:58                                 ` Qi Zheng
  2022-11-08  9:32                                   ` Wei Yongjun
  0 siblings, 1 reply; 30+ messages in thread
From: Qi Zheng @ 2022-11-08  8:58 UTC (permalink / raw)
  To: Wei Yongjun, dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, stable



On 2022/11/8 16:44, Wei Yongjun wrote:
> Hi Zheng Qi,
> 
> On 2022/11/8 11:52, Qi Zheng wrote:
>> When we specify __GFP_NOWARN, we only expect that no warnings
>> will be issued for current caller. But in the __should_failslab()
>> and __should_fail_alloc_page(), the local GFP flags alter the
>> global {failslab|fail_page_alloc}.attr, which is persistent and
>> shared by all tasks. This is not what we expected, let's fix it.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> ---
>>   v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
>>
>>   Changelog in v1 -> v2:
>>    - add comment for __should_failslab() and __should_fail_alloc_page()
>>      (suggested by Jason)
>>
>>   include/linux/fault-inject.h |  7 +++++--
>>   lib/fault-inject.c           | 14 +++++++++-----
>>   mm/failslab.c                | 12 ++++++++++--
>>   mm/page_alloc.c              |  7 +++++--
>>   4 files changed, 29 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
>> index 9f6e25467844..444236dadcf0 100644
>> --- a/include/linux/fault-inject.h
>> +++ b/include/linux/fault-inject.h
>> @@ -20,7 +20,6 @@ struct fault_attr {
>>   	atomic_t space;
>>   	unsigned long verbose;
>>   	bool task_filter;
>> -	bool no_warn;
>>   	unsigned long stacktrace_depth;
>>   	unsigned long require_start;
>>   	unsigned long require_end;
>> @@ -32,6 +31,10 @@ struct fault_attr {
>>   	struct dentry *dname;
>>   };
>>   
>> +enum fault_flags {
>> +	FAULT_NOWARN =	1 << 0,
>> +};
>> +
>>   #define FAULT_ATTR_INITIALIZER {					\
>>   		.interval = 1,						\
>>   		.times = ATOMIC_INIT(1),				\
>> @@ -40,11 +43,11 @@ struct fault_attr {
>>   		.ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,	\
>>   		.verbose = 2,						\
>>   		.dname = NULL,						\
>> -		.no_warn = false,					\
> 
> How about keep no_warn attr as it be, and export it to user?
> 
> When testing with fault injection, and each fault will print an backtrace.
> but not all of the testsuit can tell us which one is fault injection
> message or other is a real warning/crash like syzkaller do.
> 
> In my case, to make things simple, we usually used a regex to detect whether
> wanring/error happend. So we disabled the slab/page fault warning message by
> default, and only enable it when debug real issue.

So you want to set/clear this no_warn attr through the procfs or sysfs
interface, so that you can easily disable/enable the slab/page fault
warning message from the user mode. Right?

Seems reasonable to me. Anyone else has an opinion on this? If it is
really needed, I can do it later.

Thanks,
Qi

> 
> Regards,
> 
> 
>>   	}
>>   
>>   #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
>>   int setup_fault_attr(struct fault_attr *attr, char *str);
>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
>>   bool should_fail(struct fault_attr *attr, ssize_t size);
>>   
>>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
>> index 4b8fafce415c..5971f7c3e49e 100644
>> --- a/lib/fault-inject.c
>> +++ b/lib/fault-inject.c
>> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>>   
>>   static void fail_dump(struct fault_attr *attr)
>>   {
>> -	if (attr->no_warn)
>> -		return;
>> -
>>   	if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>>   		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>>   		       "name %pd, interval %lu, probability %lu, "
>> @@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
>>    * http://www.nongnu.org/failmalloc/
>>    */
>>   
>> -bool should_fail(struct fault_attr *attr, ssize_t size)
>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
>>   {
>>   	bool stack_checked = false;
>>   
>> @@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>   		return false;
>>   
>>   fail:
>> -	fail_dump(attr);
>> +	if (!(flags & FAULT_NOWARN))
>> +		fail_dump(attr);
>>   
>>   	if (atomic_read(&attr->times) != -1)
>>   		atomic_dec_not_zero(&attr->times);
>>   
>>   	return true;
>>   }
>> +EXPORT_SYMBOL_GPL(should_fail_ex);
>> +
>> +bool should_fail(struct fault_attr *attr, ssize_t size)
>> +{
>> +	return should_fail_ex(attr, size, 0);
>> +}
>>   EXPORT_SYMBOL_GPL(should_fail);
>>   
>>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>> diff --git a/mm/failslab.c b/mm/failslab.c
>> index 58df9789f1d2..ffc420c0e767 100644
>> --- a/mm/failslab.c
>> +++ b/mm/failslab.c
>> @@ -16,6 +16,8 @@ static struct {
>>   
>>   bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>   {
>> +	int flags = 0;
>> +
>>   	/* No fault-injection for bootstrap cache */
>>   	if (unlikely(s == kmem_cache))
>>   		return false;
>> @@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>   	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>>   		return false;
>>   
>> +	/*
>> +	 * In some cases, it expects to specify __GFP_NOWARN
>> +	 * to avoid printing any information(not just a warning),
>> +	 * thus avoiding deadlocks. See commit 6b9dbedbe349 for
>> +	 * details.
>> +	 */
>>   	if (gfpflags & __GFP_NOWARN)
>> -		failslab.attr.no_warn = true;
>> +		flags |= FAULT_NOWARN;
>>   
>> -	return should_fail(&failslab.attr, s->object_size);
>> +	return should_fail_ex(&failslab.attr, s->object_size, flags);
>>   }
>>   
>>   static int __init setup_failslab(char *str)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 7192ded44ad0..cb6fe715d983 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
>>   
>>   static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>   {
>> +	int flags = 0;
>> +
>>   	if (order < fail_page_alloc.min_order)
>>   		return false;
>>   	if (gfp_mask & __GFP_NOFAIL)
>> @@ -3912,10 +3914,11 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>   			(gfp_mask & __GFP_DIRECT_RECLAIM))
>>   		return false;
>>   
>> +	/* See comment in __should_failslab() */
>>   	if (gfp_mask & __GFP_NOWARN)
>> -		fail_page_alloc.attr.no_warn = true;
>> +		flags |= FAULT_NOWARN;
>>   
>> -	return should_fail(&fail_page_alloc.attr, 1 << order);
>> +	return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
>>   }
>>   
>>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  8:58                                 ` Qi Zheng
@ 2022-11-08  9:32                                   ` Wei Yongjun
  2022-11-08  9:45                                     ` Qi Zheng
  2022-11-08 12:04                                     ` Jason Gunthorpe
  0 siblings, 2 replies; 30+ messages in thread
From: Wei Yongjun @ 2022-11-08  9:32 UTC (permalink / raw)
  To: Qi Zheng, dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, stable



On 2022/11/8 16:58, Qi Zheng wrote:
> 
> 
> On 2022/11/8 16:44, Wei Yongjun wrote:
>> Hi Zheng Qi,
>>
>> On 2022/11/8 11:52, Qi Zheng wrote:
>>> When we specify __GFP_NOWARN, we only expect that no warnings
>>> will be issued for current caller. But in the __should_failslab()
>>> and __should_fail_alloc_page(), the local GFP flags alter the
>>> global {failslab|fail_page_alloc}.attr, which is persistent and
>>> shared by all tasks. This is not what we expected, let's fix it.
>>>
>>> Cc: stable@vger.kernel.org
>>> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>> ---
>>>   v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
>>>
>>>   Changelog in v1 -> v2:
>>>    - add comment for __should_failslab() and __should_fail_alloc_page()
>>>      (suggested by Jason)
>>>
>>>   include/linux/fault-inject.h |  7 +++++--
>>>   lib/fault-inject.c           | 14 +++++++++-----
>>>   mm/failslab.c                | 12 ++++++++++--
>>>   mm/page_alloc.c              |  7 +++++--
>>>   4 files changed, 29 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
>>> index 9f6e25467844..444236dadcf0 100644
>>> --- a/include/linux/fault-inject.h
>>> +++ b/include/linux/fault-inject.h
>>> @@ -20,7 +20,6 @@ struct fault_attr {
>>>       atomic_t space;
>>>       unsigned long verbose;
>>>       bool task_filter;
>>> -    bool no_warn;
>>>       unsigned long stacktrace_depth;
>>>       unsigned long require_start;
>>>       unsigned long require_end;
>>> @@ -32,6 +31,10 @@ struct fault_attr {
>>>       struct dentry *dname;
>>>   };
>>>   +enum fault_flags {
>>> +    FAULT_NOWARN =    1 << 0,
>>> +};
>>> +
>>>   #define FAULT_ATTR_INITIALIZER {                    \
>>>           .interval = 1,                        \
>>>           .times = ATOMIC_INIT(1),                \
>>> @@ -40,11 +43,11 @@ struct fault_attr {
>>>           .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,    \
>>>           .verbose = 2,                        \
>>>           .dname = NULL,                        \
>>> -        .no_warn = false,                    \
>>
>> How about keep no_warn attr as it be, and export it to user?
>>
>> When testing with fault injection, and each fault will print an backtrace.
>> but not all of the testsuit can tell us which one is fault injection
>> message or other is a real warning/crash like syzkaller do.
>>
>> In my case, to make things simple, we usually used a regex to detect whether
>> wanring/error happend. So we disabled the slab/page fault warning message by
>> default, and only enable it when debug real issue.
> 
> So you want to set/clear this no_warn attr through the procfs or sysfs
> interface, so that you can easily disable/enable the slab/page fault
> warning message from the user mode. Right?

Yes, just like:

echo 1 > /sys/kernel/debug/failslab/no_warn  #disable message
echo 0 > /sys/kernel/debug/failslab/no_warn  #enable message

Regards
Wei Yongjun

> 
> Seems reasonable to me. Anyone else has an opinion on this? If it is
> really needed, I can do it later.
> 
> Thanks,
> Qi
> 
>>
>> Regards,
>>
>>
>>>       }
>>>     #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
>>>   int setup_fault_attr(struct fault_attr *attr, char *str);
>>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
>>>   bool should_fail(struct fault_attr *attr, ssize_t size);
>>>     #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
>>> index 4b8fafce415c..5971f7c3e49e 100644
>>> --- a/lib/fault-inject.c
>>> +++ b/lib/fault-inject.c
>>> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>>>     static void fail_dump(struct fault_attr *attr)
>>>   {
>>> -    if (attr->no_warn)
>>> -        return;
>>> -
>>>       if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>>>           printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>>>                  "name %pd, interval %lu, probability %lu, "
>>> @@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
>>>    * http://www.nongnu.org/failmalloc/
>>>    */
>>>   -bool should_fail(struct fault_attr *attr, ssize_t size)
>>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
>>>   {
>>>       bool stack_checked = false;
>>>   @@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>>           return false;
>>>     fail:
>>> -    fail_dump(attr);
>>> +    if (!(flags & FAULT_NOWARN))
>>> +        fail_dump(attr);
>>>         if (atomic_read(&attr->times) != -1)
>>>           atomic_dec_not_zero(&attr->times);
>>>         return true;
>>>   }
>>> +EXPORT_SYMBOL_GPL(should_fail_ex);
>>> +
>>> +bool should_fail(struct fault_attr *attr, ssize_t size)
>>> +{
>>> +    return should_fail_ex(attr, size, 0);
>>> +}
>>>   EXPORT_SYMBOL_GPL(should_fail);
>>>     #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>> diff --git a/mm/failslab.c b/mm/failslab.c
>>> index 58df9789f1d2..ffc420c0e767 100644
>>> --- a/mm/failslab.c
>>> +++ b/mm/failslab.c
>>> @@ -16,6 +16,8 @@ static struct {
>>>     bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>>   {
>>> +    int flags = 0;
>>> +
>>>       /* No fault-injection for bootstrap cache */
>>>       if (unlikely(s == kmem_cache))
>>>           return false;
>>> @@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>>       if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>>>           return false;
>>>   +    /*
>>> +     * In some cases, it expects to specify __GFP_NOWARN
>>> +     * to avoid printing any information(not just a warning),
>>> +     * thus avoiding deadlocks. See commit 6b9dbedbe349 for
>>> +     * details.
>>> +     */
>>>       if (gfpflags & __GFP_NOWARN)
>>> -        failslab.attr.no_warn = true;
>>> +        flags |= FAULT_NOWARN;
>>>   -    return should_fail(&failslab.attr, s->object_size);
>>> +    return should_fail_ex(&failslab.attr, s->object_size, flags);
>>>   }
>>>     static int __init setup_failslab(char *str)
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 7192ded44ad0..cb6fe715d983 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
>>>     static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>>   {
>>> +    int flags = 0;
>>> +
>>>       if (order < fail_page_alloc.min_order)
>>>           return false;
>>>       if (gfp_mask & __GFP_NOFAIL)
>>> @@ -3912,10 +3914,11 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>>               (gfp_mask & __GFP_DIRECT_RECLAIM))
>>>           return false;
>>>   +    /* See comment in __should_failslab() */
>>>       if (gfp_mask & __GFP_NOWARN)
>>> -        fail_page_alloc.attr.no_warn = true;
>>> +        flags |= FAULT_NOWARN;
>>>   -    return should_fail(&fail_page_alloc.attr, 1 << order);
>>> +    return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
>>>   }
>>>     #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  9:32                                   ` Wei Yongjun
@ 2022-11-08  9:45                                     ` Qi Zheng
  2022-11-08 12:04                                     ` Jason Gunthorpe
  1 sibling, 0 replies; 30+ messages in thread
From: Qi Zheng @ 2022-11-08  9:45 UTC (permalink / raw)
  To: Wei Yongjun, dvyukov, jgg, willy, akinobu.mita
  Cc: akpm, linux-kernel, linux-mm, linux-fsdevel, stable



On 2022/11/8 17:32, Wei Yongjun wrote:
> 
> 
> On 2022/11/8 16:58, Qi Zheng wrote:
>>
>>
>> On 2022/11/8 16:44, Wei Yongjun wrote:
>>> Hi Zheng Qi,
>>>
>>> On 2022/11/8 11:52, Qi Zheng wrote:
>>>> When we specify __GFP_NOWARN, we only expect that no warnings
>>>> will be issued for current caller. But in the __should_failslab()
>>>> and __should_fail_alloc_page(), the local GFP flags alter the
>>>> global {failslab|fail_page_alloc}.attr, which is persistent and
>>>> shared by all tasks. This is not what we expected, let's fix it.
>>>>
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
>>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>> ---
>>>>    v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
>>>>
>>>>    Changelog in v1 -> v2:
>>>>     - add comment for __should_failslab() and __should_fail_alloc_page()
>>>>       (suggested by Jason)
>>>>
>>>>    include/linux/fault-inject.h |  7 +++++--
>>>>    lib/fault-inject.c           | 14 +++++++++-----
>>>>    mm/failslab.c                | 12 ++++++++++--
>>>>    mm/page_alloc.c              |  7 +++++--
>>>>    4 files changed, 29 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
>>>> index 9f6e25467844..444236dadcf0 100644
>>>> --- a/include/linux/fault-inject.h
>>>> +++ b/include/linux/fault-inject.h
>>>> @@ -20,7 +20,6 @@ struct fault_attr {
>>>>        atomic_t space;
>>>>        unsigned long verbose;
>>>>        bool task_filter;
>>>> -    bool no_warn;
>>>>        unsigned long stacktrace_depth;
>>>>        unsigned long require_start;
>>>>        unsigned long require_end;
>>>> @@ -32,6 +31,10 @@ struct fault_attr {
>>>>        struct dentry *dname;
>>>>    };
>>>>    +enum fault_flags {
>>>> +    FAULT_NOWARN =    1 << 0,
>>>> +};
>>>> +
>>>>    #define FAULT_ATTR_INITIALIZER {                    \
>>>>            .interval = 1,                        \
>>>>            .times = ATOMIC_INIT(1),                \
>>>> @@ -40,11 +43,11 @@ struct fault_attr {
>>>>            .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED,    \
>>>>            .verbose = 2,                        \
>>>>            .dname = NULL,                        \
>>>> -        .no_warn = false,                    \
>>>
>>> How about keep no_warn attr as it be, and export it to user?
>>>
>>> When testing with fault injection, and each fault will print an backtrace.
>>> but not all of the testsuit can tell us which one is fault injection
>>> message or other is a real warning/crash like syzkaller do.
>>>
>>> In my case, to make things simple, we usually used a regex to detect whether
>>> wanring/error happend. So we disabled the slab/page fault warning message by
>>> default, and only enable it when debug real issue.
>>
>> So you want to set/clear this no_warn attr through the procfs or sysfs
>> interface, so that you can easily disable/enable the slab/page fault
>> warning message from the user mode. Right?
> 
> Yes, just like:
> 
> echo 1 > /sys/kernel/debug/failslab/no_warn  #disable message
> echo 0 > /sys/kernel/debug/failslab/no_warn  #enable message

Got it. Let's wait for the other people's comments and suggestions. :)

> 
> Regards
> Wei Yongjun
> 
>>
>> Seems reasonable to me. Anyone else has an opinion on this? If it is
>> really needed, I can do it later.
>>
>> Thanks,
>> Qi
>>
>>>
>>> Regards,
>>>
>>>
>>>>        }
>>>>      #define DECLARE_FAULT_ATTR(name) struct fault_attr name = FAULT_ATTR_INITIALIZER
>>>>    int setup_fault_attr(struct fault_attr *attr, char *str);
>>>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags);
>>>>    bool should_fail(struct fault_attr *attr, ssize_t size);
>>>>      #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>>> diff --git a/lib/fault-inject.c b/lib/fault-inject.c
>>>> index 4b8fafce415c..5971f7c3e49e 100644
>>>> --- a/lib/fault-inject.c
>>>> +++ b/lib/fault-inject.c
>>>> @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr);
>>>>      static void fail_dump(struct fault_attr *attr)
>>>>    {
>>>> -    if (attr->no_warn)
>>>> -        return;
>>>> -
>>>>        if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
>>>>            printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
>>>>                   "name %pd, interval %lu, probability %lu, "
>>>> @@ -103,7 +100,7 @@ static inline bool fail_stacktrace(struct fault_attr *attr)
>>>>     * http://www.nongnu.org/failmalloc/
>>>>     */
>>>>    -bool should_fail(struct fault_attr *attr, ssize_t size)
>>>> +bool should_fail_ex(struct fault_attr *attr, ssize_t size, int flags)
>>>>    {
>>>>        bool stack_checked = false;
>>>>    @@ -152,13 +149,20 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
>>>>            return false;
>>>>      fail:
>>>> -    fail_dump(attr);
>>>> +    if (!(flags & FAULT_NOWARN))
>>>> +        fail_dump(attr);
>>>>          if (atomic_read(&attr->times) != -1)
>>>>            atomic_dec_not_zero(&attr->times);
>>>>          return true;
>>>>    }
>>>> +EXPORT_SYMBOL_GPL(should_fail_ex);
>>>> +
>>>> +bool should_fail(struct fault_attr *attr, ssize_t size)
>>>> +{
>>>> +    return should_fail_ex(attr, size, 0);
>>>> +}
>>>>    EXPORT_SYMBOL_GPL(should_fail);
>>>>      #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>>> diff --git a/mm/failslab.c b/mm/failslab.c
>>>> index 58df9789f1d2..ffc420c0e767 100644
>>>> --- a/mm/failslab.c
>>>> +++ b/mm/failslab.c
>>>> @@ -16,6 +16,8 @@ static struct {
>>>>      bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>>>    {
>>>> +    int flags = 0;
>>>> +
>>>>        /* No fault-injection for bootstrap cache */
>>>>        if (unlikely(s == kmem_cache))
>>>>            return false;
>>>> @@ -30,10 +32,16 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>>>>        if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
>>>>            return false;
>>>>    +    /*
>>>> +     * In some cases, it expects to specify __GFP_NOWARN
>>>> +     * to avoid printing any information(not just a warning),
>>>> +     * thus avoiding deadlocks. See commit 6b9dbedbe349 for
>>>> +     * details.
>>>> +     */
>>>>        if (gfpflags & __GFP_NOWARN)
>>>> -        failslab.attr.no_warn = true;
>>>> +        flags |= FAULT_NOWARN;
>>>>    -    return should_fail(&failslab.attr, s->object_size);
>>>> +    return should_fail_ex(&failslab.attr, s->object_size, flags);
>>>>    }
>>>>      static int __init setup_failslab(char *str)
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 7192ded44ad0..cb6fe715d983 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -3902,6 +3902,8 @@ __setup("fail_page_alloc=", setup_fail_page_alloc);
>>>>      static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>>>    {
>>>> +    int flags = 0;
>>>> +
>>>>        if (order < fail_page_alloc.min_order)
>>>>            return false;
>>>>        if (gfp_mask & __GFP_NOFAIL)
>>>> @@ -3912,10 +3914,11 @@ static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
>>>>                (gfp_mask & __GFP_DIRECT_RECLAIM))
>>>>            return false;
>>>>    +    /* See comment in __should_failslab() */
>>>>        if (gfp_mask & __GFP_NOWARN)
>>>> -        fail_page_alloc.attr.no_warn = true;
>>>> +        flags |= FAULT_NOWARN;
>>>>    -    return should_fail(&fail_page_alloc.attr, 1 << order);
>>>> +    return should_fail_ex(&fail_page_alloc.attr, 1 << order, flags);
>>>>    }
>>>>      #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
>>

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  9:32                                   ` Wei Yongjun
  2022-11-08  9:45                                     ` Qi Zheng
@ 2022-11-08 12:04                                     ` Jason Gunthorpe
  2022-11-09  3:57                                       ` Wei Yongjun
  1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2022-11-08 12:04 UTC (permalink / raw)
  To: Wei Yongjun
  Cc: Qi Zheng, dvyukov, willy, akinobu.mita, akpm, linux-kernel,
	linux-mm, linux-fsdevel, stable

On Tue, Nov 08, 2022 at 05:32:52PM +0800, Wei Yongjun wrote:
> > So you want to set/clear this no_warn attr through the procfs or sysfs
> > interface, so that you can easily disable/enable the slab/page fault
> > warning message from the user mode. Right?
> 
> Yes, just like:
> 
> echo 1 > /sys/kernel/debug/failslab/no_warn  #disable message
> echo 0 > /sys/kernel/debug/failslab/no_warn  #enable message

You can already do that:

 echo 0 > /sys/kernel/debug/failslab/verbose  #disable message

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
  2022-11-08  8:44                               ` Wei Yongjun
@ 2022-11-08 17:36                               ` Akinobu Mita
  2022-11-14  3:59                                 ` Qi Zheng
  1 sibling, 1 reply; 30+ messages in thread
From: Akinobu Mita @ 2022-11-08 17:36 UTC (permalink / raw)
  To: Qi Zheng
  Cc: dvyukov, jgg, willy, akpm, linux-kernel, linux-mm, linux-fsdevel, stable

2022年11月8日(火) 12:52 Qi Zheng <zhengqi.arch@bytedance.com>:
>
> When we specify __GFP_NOWARN, we only expect that no warnings
> will be issued for current caller. But in the __should_failslab()
> and __should_fail_alloc_page(), the local GFP flags alter the
> global {failslab|fail_page_alloc}.attr, which is persistent and
> shared by all tasks. This is not what we expected, let's fix it.
>
> Cc: stable@vger.kernel.org
> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
>  v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
>
>  Changelog in v1 -> v2:
>   - add comment for __should_failslab() and __should_fail_alloc_page()
>     (suggested by Jason)

Looks good.

Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08 12:04                                     ` Jason Gunthorpe
@ 2022-11-09  3:57                                       ` Wei Yongjun
  0 siblings, 0 replies; 30+ messages in thread
From: Wei Yongjun @ 2022-11-09  3:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Qi Zheng, dvyukov, willy, akinobu.mita, akpm, linux-kernel,
	linux-mm, linux-fsdevel, stable



On 2022/11/8 20:04, Jason Gunthorpe wrote:
> On Tue, Nov 08, 2022 at 05:32:52PM +0800, Wei Yongjun wrote:
>>> So you want to set/clear this no_warn attr through the procfs or sysfs
>>> interface, so that you can easily disable/enable the slab/page fault
>>> warning message from the user mode. Right?
>>
>> Yes, just like:
>>
>> echo 1 > /sys/kernel/debug/failslab/no_warn  #disable message
>> echo 0 > /sys/kernel/debug/failslab/no_warn  #enable message
> 
> You can already do that:
> 
>  echo 0 > /sys/kernel/debug/failslab/verbose  #disable message

Got it, thanks.

Wei Yongjun


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  2022-11-08 17:36                               ` Akinobu Mita
@ 2022-11-14  3:59                                 ` Qi Zheng
  0 siblings, 0 replies; 30+ messages in thread
From: Qi Zheng @ 2022-11-14  3:59 UTC (permalink / raw)
  To: Akinobu Mita, Andrew Morton
  Cc: dvyukov, jgg, willy, linux-kernel, linux-mm, linux-fsdevel, stable



On 2022/11/9 01:36, Akinobu Mita wrote:
> 2022年11月8日(火) 12:52 Qi Zheng <zhengqi.arch@bytedance.com>:
>>
>> When we specify __GFP_NOWARN, we only expect that no warnings
>> will be issued for current caller. But in the __should_failslab()
>> and __should_fail_alloc_page(), the local GFP flags alter the
>> global {failslab|fail_page_alloc}.attr, which is persistent and
>> shared by all tasks. This is not what we expected, let's fix it.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 3f913fc5f974 ("mm: fix missing handler for __GFP_NOWARN")
>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> ---
>>   v1: https://lore.kernel.org/lkml/20221107033109.59709-1-zhengqi.arch@bytedance.com/
>>
>>   Changelog in v1 -> v2:
>>    - add comment for __should_failslab() and __should_fail_alloc_page()
>>      (suggested by Jason)
> 
> Looks good.
> 
> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>

Thanks. And hi Andrew, seems no action left for me, can this patch
be applied to mm-unstable tree for testing? :)

-- 
Thanks,
Qi

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-11-14  4:00 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-03 19:09 xarray, fault injection and syzkaller Jason Gunthorpe
2022-11-03 20:00 ` Matthew Wilcox
2022-11-03 20:07   ` Jason Gunthorpe
2022-11-04  0:11     ` Dmitry Vyukov
2022-11-04  0:21       ` Jason Gunthorpe
2022-11-04 17:47         ` Dmitry Vyukov
2022-11-04 18:03           ` Jason Gunthorpe
2022-11-04 18:21             ` Dmitry Vyukov
2022-11-04 18:30               ` Jason Gunthorpe
2022-11-04 18:38                 ` Dmitry Vyukov
2022-11-04 18:45                   ` Jason Gunthorpe
2022-11-04 22:43               ` Qi Zheng
2022-11-05 12:16                 ` Qi Zheng
2022-11-06 17:36                   ` Dmitry Vyukov
2022-11-07  2:13                     ` Qi Zheng
2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
2022-11-07 12:42                       ` Jason Gunthorpe
2022-11-07 15:05                         ` Qi Zheng
2022-11-07 16:26                           ` Jason Gunthorpe
2022-11-08  2:47                             ` Qi Zheng
2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
2022-11-08  8:44                               ` Wei Yongjun
2022-11-08  8:58                                 ` Qi Zheng
2022-11-08  9:32                                   ` Wei Yongjun
2022-11-08  9:45                                     ` Qi Zheng
2022-11-08 12:04                                     ` Jason Gunthorpe
2022-11-09  3:57                                       ` Wei Yongjun
2022-11-08 17:36                               ` Akinobu Mita
2022-11-14  3:59                                 ` Qi Zheng
2022-11-04 18:42             ` xarray, fault injection and syzkaller Dmitry Vyukov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.