[v3,0/3] mm: kmemleak: Use a memory pool for kmemleak object allocations
mbox series

Message ID 20190812160642.52134-1-catalin.marinas@arm.com
Headers show
Series
  • mm: kmemleak: Use a memory pool for kmemleak object allocations
Related show

Message

Catalin Marinas Aug. 12, 2019, 4:06 p.m. UTC
Following the discussions on v2 of this patch(set) [1], this series
takes slightly different approach:

- it implements its own simple memory pool that does not rely on the
  slab allocator

- drops the early log buffer logic entirely since it can now allocate
  metadata from the memory pool directly before kmemleak is fully
  initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
  CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
  tool robust when such allocations fail as scan areas are rather an
  optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

Catalin Marinas (3):
  mm: kmemleak: Make the tool tolerant to struct scan_area allocation
    failures
  mm: kmemleak: Simple memory allocation pool for kmemleak objects
  mm: kmemleak: Use the memory pool for early allocations

 init/main.c       |   2 +-
 lib/Kconfig.debug |  11 +-
 mm/kmemleak.c     | 325 ++++++++++++----------------------------------
 3 files changed, 91 insertions(+), 247 deletions(-)

Comments

Andrew Morton Aug. 12, 2019, 9:07 p.m. UTC | #1
On Mon, 12 Aug 2019 17:06:39 +0100 Catalin Marinas <catalin.marinas@arm.com> wrote:

> Following the discussions on v2 of this patch(set) [1], this series
> takes slightly different approach:
> 
> - it implements its own simple memory pool that does not rely on the
>   slab allocator
> 
> - drops the early log buffer logic entirely since it can now allocate
>   metadata from the memory pool directly before kmemleak is fully
>   initialised
> 
> - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
>   CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> 
> - moves the kmemleak_init() call earlier (mm_init())
> 
> - to avoid a separate memory pool for struct scan_area, it makes the
>   tool robust when such allocations fail as scan areas are rather an
>   optimisation
> 
> [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

Using the term "memory pool" is a little unfortunate, but better than
using "mempool"!

The changelog doesn't answer the very first question: why not use
mempools.  Please send along a paragraph which explains this decision.
Catalin Marinas Aug. 13, 2019, 9:40 a.m. UTC | #2
On Mon, Aug 12, 2019 at 02:07:30PM -0700, Andrew Morton wrote:
> On Mon, 12 Aug 2019 17:06:39 +0100 Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> > Following the discussions on v2 of this patch(set) [1], this series
> > takes slightly different approach:
> > 
> > - it implements its own simple memory pool that does not rely on the
> >   slab allocator
> > 
> > - drops the early log buffer logic entirely since it can now allocate
> >   metadata from the memory pool directly before kmemleak is fully
> >   initialised
> > 
> > - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
> >   CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> > 
> > - moves the kmemleak_init() call earlier (mm_init())
> > 
> > - to avoid a separate memory pool for struct scan_area, it makes the
> >   tool robust when such allocations fail as scan areas are rather an
> >   optimisation
> > 
> > [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
> 
> Using the term "memory pool" is a little unfortunate, but better than
> using "mempool"!

I agree, it could have been more inspired. What about "metadata pool"
(together with function name updates etc.)? Happy to send a v4.

> The changelog doesn't answer the very first question: why not use
> mempools.  Please send along a paragraph which explains this decision.

I posted one in reply to the patch where the changelog should be
updated.

Thanks.
Catalin Marinas Dec. 3, 2019, 4:08 p.m. UTC | #3
On Tue, Dec 03, 2019 at 03:51:50PM +0000, Noam Stolero wrote:
> On 8/12/2019 7:06 PM, Catalin Marinas wrote:
> >     Following the discussions on v2 of this patch(set) [1], this series
> >     takes slightly different approach:
> > 
> >     - it implements its own simple memory pool that does not rely on the
> >       slab allocator
> > 
> >     - drops the early log buffer logic entirely since it can now allocate
> >       metadata from the memory pool directly before kmemleak is fully
> >       initialised
> > 
> >     - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
> >       CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> > 
> >     - moves the kmemleak_init() call earlier (mm_init())
> > 
> >     - to avoid a separate memory pool for struct scan_area, it makes the
> >       tool robust when such allocations fail as scan areas are rather an
> >       optimisation
> > 
> >     [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
> > 
> >     Catalin Marinas (3):
> >       mm: kmemleak: Make the tool tolerant to struct scan_area allocation
> >         failures
> >       mm: kmemleak: Simple memory allocation pool for kmemleak objects
> >       mm: kmemleak: Use the memory pool for early allocations
> > 
> >      init/main.c       |   2 +-
> >      lib/Kconfig.debug |  11 +-
> >      mm/kmemleak.c     | 325 ++++++++++++----------------------------------
> >      3 files changed, 91 insertions(+), 247 deletions(-)
> 
> We observe severe degradation in our network performance affecting all
> of our NICs. The degradation is directly linked to this patch.
> 
> What we run:
> Simple Iperf TCP loopback with 8 streams on ConnectX5-100GbE.
> Since it's a loopback test, traffic goes from the socket through the IP
> stack and back to the socket, without going through the NIC driver.

Something similar was reported before. Can you try commit 2abd839aa7e6
("kmemleak: Do not corrupt the object_list during clean-up") and see if
it fixes the problem for you? It was merged in 5.4-rc4.
Noam Stolero Dec. 5, 2019, 4:16 p.m. UTC | #4
On 12/3/2019 6:08 PM, Catalin Marinas wrote:
> On Tue, Dec 03, 2019 at 03:51:50PM +0000, Noam Stolero wrote:
>> On 8/12/2019 7:06 PM, Catalin Marinas wrote:
>>>      Following the discussions on v2 of this patch(set) [1], this series
>>>      takes slightly different approach:
>>>
>>>      - it implements its own simple memory pool that does not rely on the
>>>        slab allocator
>>>
>>>      - drops the early log buffer logic entirely since it can now allocate
>>>        metadata from the memory pool directly before kmemleak is fully
>>>        initialised
>>>
>>>      - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
>>>        CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
>>>
>>>      - moves the kmemleak_init() call earlier (mm_init())
>>>
>>>      - to avoid a separate memory pool for struct scan_area, it makes the
>>>        tool robust when such allocations fail as scan areas are rather an
>>>        optimisation
>>>
>>>      [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
>>>
>>>      Catalin Marinas (3):
>>>        mm: kmemleak: Make the tool tolerant to struct scan_area allocation
>>>          failures
>>>        mm: kmemleak: Simple memory allocation pool for kmemleak objects
>>>        mm: kmemleak: Use the memory pool for early allocations
>>>
>>>       init/main.c       |   2 +-
>>>       lib/Kconfig.debug |  11 +-
>>>       mm/kmemleak.c     | 325 ++++++++++++----------------------------------
>>>       3 files changed, 91 insertions(+), 247 deletions(-)
>> We observe severe degradation in our network performance affecting all
>> of our NICs. The degradation is directly linked to this patch.
>>
>> What we run:
>> Simple Iperf TCP loopback with 8 streams on ConnectX5-100GbE.
>> Since it's a loopback test, traffic goes from the socket through the IP
>> stack and back to the socket, without going through the NIC driver.
> Something similar was reported before. Can you try commit 2abd839aa7e6
> ("kmemleak: Do not corrupt the object_list during clean-up") and see if
> it fixes the problem for you? It was merged in 5.4-rc4.

I've tested this commit, as well as 5.4.0-rc6 and 5.4.0 GA versions. We 
don't see the issue I've mentioned.

Thank you for the quick response and the assistance.