All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Michal Hocko <mhocko@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"David Rientjes" <rientjes@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Masahiro Yamada" <yamada.masahiro@socionext.com>,
	"Wei Wang" <wvw@google.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Jann Horn" <jannh@google.com>, "Feng Tang" <feng.tang@intel.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Date: Thu, 15 Aug 2019 17:13:23 -0300	[thread overview]
Message-ID: <20190815201323.GU21596@ziepe.ca> (raw)
In-Reply-To: <20190815193526.GT9477@dhcp22.suse.cz>

On Thu, Aug 15, 2019 at 09:35:26PM +0200, Michal Hocko wrote:

> > The last detail is I'm still unclear what a GFP flags a blockable
> > invalidate_range_start() should use. Is GFP_KERNEL OK?
> 
> I hope I will not make this muddy again ;)
> invalidate_range_start in the blockable mode can use/depend on any sleepable
> allocation allowed in the context it is called from. 

'in the context is is called from' is the magic phrase, as
invalidate_range_start is called while holding several different mm
related locks. I know at least write mmap_sem and i_mmap_rwsem
(write?)

Can GFP_KERNEL be called while holding those locks?

This is the question of indirect dependency on reclaim via locks you
raised earlier.

> So in other words it is no different from any other function in the
> kernel that calls into allocator. As the API is missing gfp context
> then I hope it is not called from any restricted contexts (except
> from the oom which we have !blockable for).

Yes, the callers are exactly my concern.
 
> > Lockdep has
> > complained on that in past due to fs_reclaim - how do you know if it
> > is a false positive?
> 
> I would have to see the specific lockdep splat.

See below. I found it when trying to understand why the registration
of the mmu notififer was so oddly coded.

The situation was:

  down_write(&mm->mmap_sem);
  mm_take_all_locks(mm);
  kmalloc(GFP_KERNEL);  <--- lockdep warning

I understood Daniel said he saw this directly on a recent kernel when
working with his lockdep patch?

Checking myself, on todays kernel I see a call chain:

shrink_all_memory
  fs_reclaim_acquire(sc.gfp_mask);
  [..]
  do_try_to_free_pages
   shrink_zones
    shrink_node
     shrink_node_memcg
      shrink_list
       shrink_active_list
        page_referenced
         rmap_walk
          rmap_walk_file
           i_mmap_lock_read
            down_read(i_mmap_rwsem)

So it is possible that the down_read() above will block on
i_mmap_rwsem being held in the caller of invalidate_range_start which
is doing kmalloc(GPF_KERNEL).

Is this OK? The lockdep annotation says no..

Jason

commit 35cfa2b0b491c37e23527822bf365610dbb188e5
Author: Gavin Shan <shangw@linux.vnet.ibm.com>
Date:   Thu Oct 25 13:38:01 2012 -0700

    mm/mmu_notifier: allocate mmu_notifier in advance
    
    While allocating mmu_notifier with parameter GFP_KERNEL, swap would start
    to work in case of tight available memory.  Eventually, that would lead to
    a deadlock while the swap deamon swaps anonymous pages.  It was caused by
    commit e0f3c3f78da29b ("mm/mmu_notifier: init notifier if necessary").
    
      =================================
      [ INFO: inconsistent lock state ]
      3.7.0-rc1+ #518 Not tainted
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/35 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&mapping->i_mmap_mutex){+.+.?.}, at: page_referenced+0x9c/0x2e0
      {RECLAIM_FS-ON-W} state was registered at:
         mark_held_locks+0x86/0x150
         lockdep_trace_alloc+0x67/0xc0
         kmem_cache_alloc_trace+0x33/0x230
         do_mmu_notifier_register+0x87/0x180
         mmu_notifier_register+0x13/0x20
         kvm_dev_ioctl+0x428/0x510
         do_vfs_ioctl+0x98/0x570
         sys_ioctl+0x91/0xb0
         system_call_fastpath+0x16/0x1b

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Michal Hocko <mhocko@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"David Rientjes" <rientjes@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Masahiro Yamada" <yamada.masahiro@socionext.com>,
	"Wei Wang" <wvw@google.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Jann Horn" <jannh@google.com>, "Feng Tang" <feng.tang@intel.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Daniel Vetter" <daniel.vetter@i>
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Date: Thu, 15 Aug 2019 17:13:23 -0300	[thread overview]
Message-ID: <20190815201323.GU21596@ziepe.ca> (raw)
In-Reply-To: <20190815193526.GT9477@dhcp22.suse.cz>

On Thu, Aug 15, 2019 at 09:35:26PM +0200, Michal Hocko wrote:

> > The last detail is I'm still unclear what a GFP flags a blockable
> > invalidate_range_start() should use. Is GFP_KERNEL OK?
> 
> I hope I will not make this muddy again ;)
> invalidate_range_start in the blockable mode can use/depend on any sleepable
> allocation allowed in the context it is called from. 

'in the context is is called from' is the magic phrase, as
invalidate_range_start is called while holding several different mm
related locks. I know at least write mmap_sem and i_mmap_rwsem
(write?)

Can GFP_KERNEL be called while holding those locks?

This is the question of indirect dependency on reclaim via locks you
raised earlier.

> So in other words it is no different from any other function in the
> kernel that calls into allocator. As the API is missing gfp context
> then I hope it is not called from any restricted contexts (except
> from the oom which we have !blockable for).

Yes, the callers are exactly my concern.
 
> > Lockdep has
> > complained on that in past due to fs_reclaim - how do you know if it
> > is a false positive?
> 
> I would have to see the specific lockdep splat.

See below. I found it when trying to understand why the registration
of the mmu notififer was so oddly coded.

The situation was:

  down_write(&mm->mmap_sem);
  mm_take_all_locks(mm);
  kmalloc(GFP_KERNEL);  <--- lockdep warning

I understood Daniel said he saw this directly on a recent kernel when
working with his lockdep patch?

Checking myself, on todays kernel I see a call chain:

shrink_all_memory
  fs_reclaim_acquire(sc.gfp_mask);
  [..]
  do_try_to_free_pages
   shrink_zones
    shrink_node
     shrink_node_memcg
      shrink_list
       shrink_active_list
        page_referenced
         rmap_walk
          rmap_walk_file
           i_mmap_lock_read
            down_read(i_mmap_rwsem)

So it is possible that the down_read() above will block on
i_mmap_rwsem being held in the caller of invalidate_range_start which
is doing kmalloc(GPF_KERNEL).

Is this OK? The lockdep annotation says no..

Jason

commit 35cfa2b0b491c37e23527822bf365610dbb188e5
Author: Gavin Shan <shangw@linux.vnet.ibm.com>
Date:   Thu Oct 25 13:38:01 2012 -0700

    mm/mmu_notifier: allocate mmu_notifier in advance
    
    While allocating mmu_notifier with parameter GFP_KERNEL, swap would start
    to work in case of tight available memory.  Eventually, that would lead to
    a deadlock while the swap deamon swaps anonymous pages.  It was caused by
    commit e0f3c3f78da29b ("mm/mmu_notifier: init notifier if necessary").
    
      =================================
      [ INFO: inconsistent lock state ]
      3.7.0-rc1+ #518 Not tainted
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/35 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&mapping->i_mmap_mutex){+.+.?.}, at: page_referenced+0x9c/0x2e0
      {RECLAIM_FS-ON-W} state was registered at:
         mark_held_locks+0x86/0x150
         lockdep_trace_alloc+0x67/0xc0
         kmem_cache_alloc_trace+0x33/0x230
         do_mmu_notifier_register+0x87/0x180
         mmu_notifier_register+0x13/0x20
         kvm_dev_ioctl+0x428/0x510
         do_vfs_ioctl+0x98/0x570
         sys_ioctl+0x91/0xb0
         system_call_fastpath+0x16/0x1b

  reply	other threads:[~2019-08-15 20:13 UTC|newest]

Thread overview: 130+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-14 20:20 [PATCH 0/5] hmm & mmu_notifier debug/lockdep annotations Daniel Vetter
2019-08-14 20:20 ` [PATCH 1/5] mm: Check if mmu notifier callbacks are allowed to fail Daniel Vetter
2019-08-14 22:14   ` Andrew Morton
2019-08-14 23:22     ` Jason Gunthorpe
2019-08-14 23:34     ` Ralph Campbell
2019-08-16 17:19   ` Jason Gunthorpe
2019-08-14 20:20 ` [PATCH 2/5] kernel.h: Add non_block_start/end() Daniel Vetter
2019-08-14 20:20   ` Daniel Vetter
2019-08-14 20:45   ` Andrew Morton
2019-08-14 20:45     ` Andrew Morton
2019-08-15  6:52     ` Daniel Vetter
2019-08-15  6:52       ` Daniel Vetter
2019-08-15  8:44     ` Michal Hocko
2019-08-15  8:44       ` Michal Hocko
2019-08-15 13:04       ` Jason Gunthorpe
2019-08-15 13:04         ` Jason Gunthorpe
2019-08-15 13:12         ` Daniel Vetter
2019-08-15 13:12           ` Daniel Vetter
2019-08-15 14:37           ` Jason Gunthorpe
2019-08-15 14:37             ` Jason Gunthorpe
2019-08-15 14:43             ` Daniel Vetter
2019-08-15 14:43               ` Daniel Vetter
2019-08-15 15:10               ` Jason Gunthorpe
2019-08-15 15:10                 ` Jason Gunthorpe
2019-08-15 16:25                 ` Daniel Vetter
2019-08-15 16:25                   ` Daniel Vetter
2019-08-15 17:35                   ` Jason Gunthorpe
2019-08-15 17:35                     ` Jason Gunthorpe
2019-08-15 17:39                     ` Jerome Glisse
2019-08-15 17:39                       ` Jerome Glisse
2019-08-15 18:01                       ` Jason Gunthorpe
2019-08-15 18:01                         ` Jason Gunthorpe
2019-08-15 18:27                         ` Jerome Glisse
2019-08-15 18:27                           ` Jerome Glisse
2019-08-15 18:57                           ` Jason Gunthorpe
2019-08-15 18:57                             ` Jason Gunthorpe
2019-08-15 16:32                 ` Jerome Glisse
2019-08-15 16:32                   ` Jerome Glisse
2019-08-15 17:16                   ` Jason Gunthorpe
2019-08-15 17:16                     ` Jason Gunthorpe
2019-08-15 17:21                     ` Daniel Vetter
2019-08-15 17:21                       ` Daniel Vetter
2019-08-15 17:35                       ` Jerome Glisse
2019-08-15 17:35                         ` Jerome Glisse
2019-08-15 13:24         ` Michal Hocko
2019-08-15 13:24           ` Michal Hocko
2019-08-15 22:15       ` Andrew Morton
2019-08-15 22:15         ` Andrew Morton
2019-08-16  8:24         ` Michal Hocko
2019-08-16  8:24           ` Michal Hocko
2019-08-14 23:58   ` Jason Gunthorpe
2019-08-14 23:58     ` Jason Gunthorpe
2019-08-15  6:58     ` Daniel Vetter
2019-08-15  6:58       ` Daniel Vetter
2019-08-15 12:23       ` Jason Gunthorpe
2019-08-15 12:23         ` Jason Gunthorpe
2019-08-15 13:21         ` Michal Hocko
2019-08-15 13:21           ` Michal Hocko
2019-08-15 14:12           ` Jason Gunthorpe
2019-08-15 14:12             ` Jason Gunthorpe
2019-08-15 16:00             ` Michal Hocko
2019-08-15 16:00               ` Michal Hocko
2019-08-15 16:56               ` Jason Gunthorpe
2019-08-15 16:56                 ` Jason Gunthorpe
2019-08-15 17:11                 ` Jerome Glisse
2019-08-15 17:17                   ` Jason Gunthorpe
2019-08-15 17:42                 ` Michal Hocko
2019-08-15 17:42                   ` Michal Hocko
2019-08-15 17:57                   ` Jerome Glisse
2019-08-15 18:24                   ` Jason Gunthorpe
2019-08-15 18:24                     ` Jason Gunthorpe
2019-08-15 19:05                     ` Michal Hocko
2019-08-15 19:05                       ` Michal Hocko
2019-08-15 19:18                       ` Jason Gunthorpe
2019-08-15 19:18                         ` Jason Gunthorpe
2019-08-15 19:35                         ` Michal Hocko
2019-08-15 19:35                           ` Michal Hocko
2019-08-15 20:13                           ` Jason Gunthorpe [this message]
2019-08-15 20:13                             ` Jason Gunthorpe
2019-08-16  8:10                             ` Michal Hocko
2019-08-16  8:10                               ` Michal Hocko
2019-08-16 12:19                               ` Jason Gunthorpe
2019-08-16 12:19                                 ` Jason Gunthorpe
2019-08-16 12:26                                 ` Michal Hocko
2019-08-16 12:26                                   ` Michal Hocko
2019-08-16 14:31                                   ` Jason Gunthorpe
2019-08-16 14:31                                     ` Jason Gunthorpe
2019-08-16 15:05                                     ` Jerome Glisse
2019-08-16 15:05                                       ` Jerome Glisse
2019-08-20  8:18                                     ` Michal Hocko
2019-08-20  8:18                                       ` Michal Hocko
2019-08-15 20:16                           ` [Intel-gfx] " Daniel Vetter
2019-08-15 20:16                             ` Daniel Vetter
2019-08-15 20:27                             ` Jason Gunthorpe
2019-08-15 20:27                               ` Jason Gunthorpe
2019-08-15 20:49                               ` Daniel Vetter
2019-08-15 20:49                                 ` Daniel Vetter
2019-08-16  1:00                                 ` Jason Gunthorpe
2019-08-16  1:00                                   ` Jason Gunthorpe
2019-08-16  6:20                                   ` Daniel Vetter
2019-08-16  6:20                                     ` Daniel Vetter
2019-08-16 12:12                                     ` Jason Gunthorpe
2019-08-16 12:12                                       ` Jason Gunthorpe
2019-08-16 14:11                                       ` Daniel Vetter
2019-08-16 14:11                                         ` Daniel Vetter
2019-08-16 14:38                                         ` Jason Gunthorpe
2019-08-16 14:38                                           ` Jason Gunthorpe
2019-08-16 16:36                                           ` Daniel Vetter
2019-08-16 16:36                                             ` Daniel Vetter
2019-08-16 16:54                                             ` Jason Gunthorpe
2019-08-16 16:54                                               ` Jason Gunthorpe
2019-08-16  8:27                             ` Michal Hocko
2019-08-16  8:27                               ` Michal Hocko
2019-08-14 20:20 ` [PATCH 3/5] mm, notifier: Catch sleeping/blocking for !blockable Daniel Vetter
2019-08-15  0:00   ` Jason Gunthorpe
2019-08-15  7:02     ` Daniel Vetter
2019-08-15 12:35       ` Jason Gunthorpe
2019-08-17 16:09         ` Daniel Vetter
2019-08-17 16:09           ` Daniel Vetter
2019-08-14 20:20 ` [PATCH 4/5] mm, notifier: Add a lockdep map for invalidate_range_start Daniel Vetter
2019-08-15  0:09   ` Jason Gunthorpe
2019-08-15  7:10     ` Daniel Vetter
2019-08-15  7:10       ` Daniel Vetter
2019-08-15 12:53       ` Jason Gunthorpe
2019-08-14 20:20 ` [PATCH 5/5] mm/hmm: WARN on illegal ->sync_cpu_device_pagetables errors Daniel Vetter
2019-08-15  0:11   ` Jason Gunthorpe
2019-08-15  7:14     ` Daniel Vetter
2019-08-15  7:14       ` Daniel Vetter
2019-08-14 21:29 ` ✗ Fi.CI.CHECKPATCH: warning for hmm & mmu_notifier debug/lockdep annotations Patchwork
2019-08-14 21:56 ` ✓ Fi.CI.BAT: success " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190815201323.GU21596@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=christian.koenig@amd.com \
    --cc=daniel.vetter@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=feng.tang@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jannh@google.com \
    --cc=jglisse@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=tglx@linutronix.de \
    --cc=wvw@google.com \
    --cc=yamada.masahiro@socionext.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.