* Re: [PATCH 0/6 v3] kvmalloc
@ 2017-01-25 18:14 ` Alexei Starovoitov
From: Alexei Starovoitov @ 2017-01-25 18:14 UTC
  To: Michal Hocko
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	linux-mm, LKML, Daniel Borkmann, netdev

On Wed, Jan 25, 2017 at 5:21 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 25-01-17 14:10:06, Michal Hocko wrote:
>> On Tue 24-01-17 11:17:21, Alexei Starovoitov wrote:
>> > On Tue, Jan 24, 2017 at 04:17:52PM +0100, Michal Hocko wrote:
>> > > On Thu 12-01-17 16:37:11, Michal Hocko wrote:
>> > > > Hi,
>> > > > this has been previously posted as a single patch [1] but later on more
>> > > > was built on top. It turned out that there are users who would like to have
>> > > > the __GFP_REPEAT semantic. This is currently implemented for costly >64kB
>> > > > requests. Doing the same for smaller requests would require redefining
>> > > > the __GFP_REPEAT semantic in the page allocator, which is out of scope of
>> > > > this series.
>> > > >
>> > > > There are many open coded kmalloc with vmalloc fallback instances in
>> > > > the tree.  Most of them are not careful enough or simply do not care
>> > > > about the underlying semantic of the kmalloc/page allocator, which means
>> > > > that a) some vmalloc fallbacks are basically unreachable because the
>> > > > kmalloc part will keep retrying until it succeeds, and b) the page
>> > > > allocator can invoke really disruptive steps like the OOM killer to move
>> > > > forward, which doesn't sound appropriate when we consider that the vmalloc
>> > > > fallback is available.
>> > > >
>> > > > As can be seen, implementing kvmalloc requires quite an intimate
>> > > > knowledge of the page allocator and the memory reclaim internals, which
>> > > > strongly suggests that a helper should be implemented in the memory
>> > > > subsystem proper.
>> > > >
>> > > > Most callers I could find have been converted to use the helper instead.
>> > > > This is patch 5. There are some more relying on __GFP_REPEAT in the
>> > > > networking stack which I have converted as well, but considering we do
>> > > > not have support for __GFP_REPEAT for requests smaller than 64kB I
>> > > > have marked it RFC.
>> > >
>> > > Are there any more comments? I would really appreciate hearing from
>> > > networking folks before I resubmit the series.
>> >
>> > while this patchset was baking the bpf side switched to use bpf_map_area_alloc()
>> > which fixes the issue with missing __GFP_NORETRY that we had to fix quickly.
>> > See commit d407bd25a204 ("bpf: don't trigger OOM killer under pressure with map alloc")
>> > it covers all kmalloc/vmalloc pairs instead of just one place as in this set.
>> > So please rebase and switch bpf_map_area_alloc() to use kvmalloc().
>>
>> OK, will do. Thanks for the heads up.
>
> Just for the record, I will fold the following into the patch 1
> ---
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 19b6129eab23..8697f43cf93c 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -53,21 +53,7 @@ void bpf_register_map_type(struct bpf_map_type_list *tl)
>
>  void *bpf_map_area_alloc(size_t size)
>  {
> -       /* We definitely need __GFP_NORETRY, so OOM killer doesn't
> -        * trigger under memory pressure as we really just want to
> -        * fail instead.
> -        */
> -       const gfp_t flags = __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO;
> -       void *area;
> -
> -       if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -               area = kmalloc(size, GFP_USER | flags);
> -               if (area != NULL)
> -                       return area;
> -       }
> -
> -       return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM | flags,
> -                        PAGE_KERNEL);
> +       return kvzalloc(size, GFP_USER);
>  }
>
>  void bpf_map_area_free(void *area)

Looks fine by me.
Daniel, thoughts?
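
For readers following the hunk above: the size check being removed
compares the request against PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER. Below
is a small worked example of that arithmetic. It assumes 4 KiB pages and
PAGE_ALLOC_COSTLY_ORDER == 3 (its value in mainline, as far as I know);
the EX_* and ex_* names are made up for this sketch and are not kernel
identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; other configurations use other page sizes. */
#define EX_PAGE_SIZE 4096UL
/* Assumption: mirrors the mainline value of PAGE_ALLOC_COSTLY_ORDER. */
#define EX_PAGE_ALLOC_COSTLY_ORDER 3

/* Largest request, in bytes, for which the removed code tried kmalloc first. */
static size_t ex_kmalloc_cutoff(void)
{
	return EX_PAGE_SIZE << EX_PAGE_ALLOC_COSTLY_ORDER;
}

/* Same comparison as the removed "size <= (PAGE_SIZE << ...)" test. */
static bool ex_tries_kmalloc_first(size_t size)
{
	return size <= ex_kmalloc_cutoff();
}
```

Under these assumptions the cutoff is 4096 << 3 = 32768 bytes, so a
32 KiB request would have gone to kmalloc first while anything larger
would have gone straight to __vmalloc.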

* [PATCH 0/6 v3] kvmalloc
@ 2017-01-30  9:49 ` Michal Hocko
From: Michal Hocko @ 2017-01-30  9:49 UTC
  To: Andrew Morton
  Cc: Vlastimil Babka, David Rientjes, Mel Gorman, Johannes Weiner,
	Al Viro, linux-mm, LKML, Alexei Starovoitov, Andreas Dilger,
	Andreas Dilger, Andrey Konovalov, Anton Vorontsov, Ben Skeggs,
	Boris Ostrovsky, Christian Borntraeger, Colin Cross,
	Daniel Borkmann, Dan Williams, David Sterba, Eric Dumazet,
	Eric Dumazet, Hariprasad S, Heiko Carstens, Herbert Xu,
	Ilya Dryomov, John Hubbard, Kees Cook, Kent Overstreet,
	Marcelo Ricardo Leitner, Martin Schwidefsky, Michael S. Tsirkin,
	Michal Hocko, Mike Snitzer, Mikulas Patocka, Oleg Drokin,
	Pablo Neira Ayuso, Rafael J. Wysocki, Santosh Raspatur,
	Tariq Toukan, Tom Herbert, Tony Luck, Yan, Zheng, Yishai Hadas

Hi,
this has been previously posted here [1] and it received quite some
feedback. As a result the number of patches has grown again. We are at
9 patches right now. I have rebased the series on top of the current
next-20170130. There were some changes since the last posting, namely
a7f6c1b63b86 ("AppArmor: Use GFP_KERNEL for __aa_kvmalloc().") which
dropped GFP_NOIO from __aa_kvmalloc and d407bd25a204 ("bpf: don't
trigger OOM killer under pressure with map alloc") which has created a
kvmalloc alternative for bpf code. Both have been changed to use the mm
kvmalloc but it is worth noting this dependency during the merge window.

I hope there are no further obstacles to having this merged into the mmotm
tree so that it can go in during the next merge window.

Original cover:

There are many open coded kmalloc with vmalloc fallback instances in
the tree.  Most of them are not careful enough or simply do not care
about the underlying semantic of the kmalloc/page allocator, which means
that a) some vmalloc fallbacks are basically unreachable because the
kmalloc part will keep retrying until it succeeds, and b) the page
allocator can invoke really disruptive steps like the OOM killer to move
forward, which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As can be seen, implementing kvmalloc requires quite an intimate
knowledge of the page allocator and the memory reclaim internals, which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers I could find have been converted to use the helper
instead.  This is patch 5. There are some more relying on __GFP_REPEAT
in the networking stack which I have converted as well, and Eric Dumazet
was not opposed [2] to converting them.
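
To make the fallback pattern described above concrete, here is a minimal
userspace sketch in C. It is not the kernel helper: every sim_* name, the
largest_free_run parameter, and the use of malloc() for both backends are
assumptions made purely for illustration. The point it tries to show is
the ordering: the contiguous attempt has to fail fast rather than retry,
otherwise the fallback can never be reached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Which simulated backend satisfied the request. */
enum sim_backend { SIM_NONE, SIM_CONTIG, SIM_VIRT };

struct sim_buf {
	void *ptr;
	enum sim_backend backend;
};

/*
 * Stand-in for a contiguous allocation that gives up immediately:
 * it fails when the request exceeds the largest free contiguous run
 * (a made-up parameter modelling fragmentation) instead of retrying.
 */
static void *sim_contig_alloc(size_t size, size_t largest_free_run)
{
	if (size == 0 || size > largest_free_run)
		return NULL;
	return malloc(size);
}

/* Stand-in for the virtually contiguous fallback; fragmentation is ignored. */
static void *sim_virt_alloc(size_t size)
{
	return size ? malloc(size) : NULL;
}

/* Try the cheap contiguous path once, then fall back. */
static struct sim_buf sim_kvmalloc(size_t size, size_t largest_free_run)
{
	struct sim_buf buf = { NULL, SIM_NONE };

	buf.ptr = sim_contig_alloc(size, largest_free_run);
	if (buf.ptr) {
		buf.backend = SIM_CONTIG;
		return buf;
	}

	buf.ptr = sim_virt_alloc(size);
	if (buf.ptr)
		buf.backend = SIM_VIRT;
	return buf;
}

/* One free routine regardless of which backend was used. */
static void sim_kvfree(struct sim_buf *buf)
{
	assert(buf != NULL);
	free(buf->ptr);
	buf->ptr = NULL;
	buf->backend = SIM_NONE;
}
```

With a simulated largest free run of 4096 bytes, a 1024 byte request is
served by the contiguous path and a 65536 byte request by the fallback;
callers release either through the same sim_kvfree().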

[1] http://lkml.kernel.org/r/20170112153717.28943-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com

Michal Hocko (9):
      mm: introduce kv[mz]alloc helpers
      mm: support __GFP_REPEAT in kvmalloc_node for >32kB
      rhashtable: simplify a strange allocation pattern
      ila: simplify a strange allocation pattern
      treewide: use kv[mz]alloc* rather than opencoded variants
      net: use kvmalloc with __GFP_REPEAT rather than open coded variant
      md: use kvmalloc rather than opencoded variant
      bcache: use kvmalloc
      net, bpf: use kvzalloc helper

 arch/s390/kvm/kvm-s390.c                           | 10 +---
 arch/x86/kvm/lapic.c                               |  4 +-
 arch/x86/kvm/page_track.c                          |  4 +-
 arch/x86/kvm/x86.c                                 |  4 +-
 crypto/lzo.c                                       |  4 +-
 drivers/acpi/apei/erst.c                           |  8 +--
 drivers/char/agp/generic.c                         |  8 +--
 drivers/gpu/drm/nouveau/nouveau_gem.c              |  4 +-
 drivers/md/bcache/super.c                          |  8 +--
 drivers/md/bcache/util.h                           | 12 +----
 drivers/md/dm-ioctl.c                              | 13 ++---
 drivers/md/dm-stats.c                              |  7 +--
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_defs.h    |  3 --
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c | 29 ++---------
 drivers/net/ethernet/chelsio/cxgb3/l2t.c           |  8 +--
 drivers/net/ethernet/chelsio/cxgb3/l2t.h           |  1 -
 drivers/net/ethernet/chelsio/cxgb4/clip_tbl.c      | 12 ++---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |  3 --
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 10 ++--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c |  8 +--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    | 31 ++----------
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c  | 13 +++--
 drivers/net/ethernet/chelsio/cxgb4/l2t.c           |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/sched.c         | 12 ++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c         |  9 ++--
 drivers/net/ethernet/mellanox/mlx4/mr.c            |  9 ++--
 drivers/nvdimm/dimm_devs.c                         |  5 +-
 .../staging/lustre/lnet/libcfs/linux/linux-mem.c   | 11 +----
 drivers/vhost/net.c                                |  9 ++--
 drivers/vhost/vhost.c                              | 15 ++----
 drivers/vhost/vsock.c                              |  9 ++--
 drivers/xen/evtchn.c                               | 14 +-----
 fs/btrfs/ctree.c                                   |  9 ++--
 fs/btrfs/ioctl.c                                   |  9 ++--
 fs/btrfs/send.c                                    | 27 ++++------
 fs/ceph/file.c                                     |  9 ++--
 fs/ext4/mballoc.c                                  |  2 +-
 fs/ext4/super.c                                    |  4 +-
 fs/f2fs/f2fs.h                                     | 20 --------
 fs/f2fs/file.c                                     |  4 +-
 fs/f2fs/segment.c                                  | 14 +++---
 fs/select.c                                        |  5 +-
 fs/seq_file.c                                      | 16 +-----
 fs/xattr.c                                         | 27 ++++------
 include/linux/kvm_host.h                           |  2 -
 include/linux/mlx5/driver.h                        |  7 +--
 include/linux/mm.h                                 | 22 +++++++++
 include/linux/vmalloc.h                            |  1 +
 ipc/util.c                                         |  7 +--
 kernel/bpf/syscall.c                               | 19 ++------
 lib/iov_iter.c                                     |  5 +-
 lib/rhashtable.c                                   | 13 ++---
 mm/frame_vector.c                                  |  5 +-
 mm/nommu.c                                         |  5 ++
 mm/util.c                                          | 57 ++++++++++++++++++++++
 mm/vmalloc.c                                       |  9 +++-
 net/core/dev.c                                     | 24 ++++-----
 net/ipv4/inet_hashtables.c                         |  6 +--
 net/ipv4/tcp_metrics.c                             |  5 +-
 net/ipv6/ila/ila_xlat.c                            |  8 +--
 net/mpls/af_mpls.c                                 |  5 +-
 net/netfilter/x_tables.c                           | 37 ++++----------
 net/netfilter/xt_recent.c                          |  5 +-
 net/sched/sch_choke.c                              |  5 +-
 net/sched/sch_fq.c                                 | 12 +----
 net/sched/sch_fq_codel.c                           | 26 +++-------
 net/sched/sch_hhf.c                                | 33 ++++---------
 net/sched/sch_netem.c                              |  6 +--
 net/sched/sch_sfq.c                                |  6 +--
 security/apparmor/apparmorfs.c                     |  2 +-
 security/apparmor/include/lib.h                    | 11 -----
 security/apparmor/lib.c                            | 30 ------------
 security/apparmor/match.c                          |  2 +-
 security/apparmor/policy_unpack.c                  |  2 +-
 security/keys/keyctl.c                             | 22 +++------
 virt/kvm/kvm_main.c                                | 18 ++-----
 76 files changed, 279 insertions(+), 583 deletions(-)

* [PATCH 0/6 v3] kvmalloc
@ 2017-01-12 15:37 ` Michal Hocko
From: Michal Hocko @ 2017-01-12 15:37 UTC
  To: Andrew Morton
  Cc: Vlastimil Babka, David Rientjes, Mel Gorman, Johannes Weiner,
	Al Viro, linux-mm, LKML, Alexei Starovoitov, Anatoly Stepanov,
	Andreas Dilger, Andreas Dilger, Anton Vorontsov, Ben Skeggs,
	Boris Ostrovsky, Colin Cross, Dan Williams, David Sterba,
	Eric Dumazet, Eric Dumazet, Hariprasad S, Heiko Carstens,
	Herbert Xu, Ilya Dryomov, Kees Cook, Kent Overstreet,
	Martin Schwidefsky, Michael S. Tsirkin, Michal Hocko,
	Mike Snitzer, Oleg Drokin, Paolo Bonzini, Rafael J. Wysocki,
	Santosh Raspatur, Tariq Toukan, Theodore Ts'o, Tom Herbert,
	Tony Luck, Yan, Zheng, Yishai Hadas

Hi,
this has been previously posted as a single patch [1] but later on more
was built on top. It turned out that there are users who would like to have
the __GFP_REPEAT semantic. This is currently implemented for costly >64kB
requests. Doing the same for smaller requests would require redefining
the __GFP_REPEAT semantic in the page allocator, which is out of scope of
this series.

There are many open coded kmalloc with vmalloc fallback instances in
the tree.  Most of them are not careful enough or simply do not care
about the underlying semantic of the kmalloc/page allocator, which means
that a) some vmalloc fallbacks are basically unreachable because the
kmalloc part will keep retrying until it succeeds, and b) the page
allocator can invoke really disruptive steps like the OOM killer to move
forward, which doesn't sound appropriate when we consider that the vmalloc
fallback is available.

As can be seen, implementing kvmalloc requires quite an intimate
knowledge of the page allocator and the memory reclaim internals, which
strongly suggests that a helper should be implemented in the memory
subsystem proper.

Most callers I could find have been converted to use the helper instead.
This is patch 5. There are some more relying on __GFP_REPEAT in the
networking stack which I have converted as well, but considering we do
not have support for __GFP_REPEAT for requests smaller than 64kB I
have marked it RFC.

[1] http://lkml.kernel.org/r/20170102133700.1734-1-mhocko@kernel.org


end of thread, other threads:[~2017-02-05 10:23 UTC | newest]

Thread overview: 67+ messages
2017-01-25 18:14 [PATCH 0/6 v3] kvmalloc Alexei Starovoitov
2017-01-25 18:14 ` Alexei Starovoitov
2017-01-25 20:16 ` Daniel Borkmann
2017-01-25 20:16   ` Daniel Borkmann
2017-01-26  7:43   ` Michal Hocko
2017-01-26  7:43     ` Michal Hocko
2017-01-26  9:36     ` Daniel Borkmann
2017-01-26  9:36       ` Daniel Borkmann
2017-01-26  9:48       ` David Laight
2017-01-26  9:48         ` David Laight
2017-01-26 10:08       ` Michal Hocko
2017-01-26 10:08         ` Michal Hocko
2017-01-26 10:32         ` Michal Hocko
2017-01-26 10:32           ` Michal Hocko
2017-01-26 11:04           ` Daniel Borkmann
2017-01-26 11:04             ` Daniel Borkmann
2017-01-26 11:49             ` Michal Hocko
2017-01-26 11:49               ` Michal Hocko
2017-01-26 12:14           ` Joe Perches
2017-01-26 12:14             ` Joe Perches
2017-01-26 12:27             ` Michal Hocko
2017-01-26 12:27               ` Michal Hocko
2017-01-26 11:33         ` Daniel Borkmann
2017-01-26 11:33           ` Daniel Borkmann
2017-01-26 11:58           ` Michal Hocko
2017-01-26 11:58             ` Michal Hocko
2017-01-26 13:10             ` Daniel Borkmann
2017-01-26 13:10               ` Daniel Borkmann
2017-01-26 13:40               ` Michal Hocko
2017-01-26 13:40                 ` Michal Hocko
2017-01-26 14:13                 ` Michal Hocko
2017-01-26 14:13                   ` Michal Hocko
2017-01-26 14:13                   ` Michal Hocko
2017-01-26 14:37                   ` [PATCH] net, bpf: use kvzalloc helper kbuild test robot
2017-01-26 14:58                   ` kbuild test robot
2017-01-26 20:34                 ` [PATCH 0/6 v3] kvmalloc Daniel Borkmann
2017-01-26 20:34                   ` Daniel Borkmann
2017-01-27 10:05                   ` Michal Hocko
2017-01-27 10:05                     ` Michal Hocko
2017-01-27 20:12                     ` Daniel Borkmann
2017-01-27 20:12                       ` Daniel Borkmann
2017-01-30  7:56                       ` Michal Hocko
2017-01-30  7:56                         ` Michal Hocko
2017-01-30 16:15                         ` Daniel Borkmann
2017-01-30 16:15                           ` Daniel Borkmann
2017-01-30 16:28                           ` Michal Hocko
2017-01-30 16:28                             ` Michal Hocko
2017-01-30 16:45                             ` Daniel Borkmann
2017-01-30 16:45                               ` Daniel Borkmann
  -- strict thread matches above, loose matches on Subject: below --
2017-01-30  9:49 Michal Hocko
2017-01-30  9:49 ` Michal Hocko
2017-02-05 10:23 ` Michal Hocko
2017-02-05 10:23   ` Michal Hocko
2017-01-12 15:37 Michal Hocko
2017-01-12 15:37 ` Michal Hocko
2017-01-24 15:17 ` Michal Hocko
2017-01-24 15:17   ` Michal Hocko
2017-01-24 16:00   ` Eric Dumazet
2017-01-24 16:00     ` Eric Dumazet
2017-01-25 13:10     ` Michal Hocko
2017-01-25 13:10       ` Michal Hocko
2017-01-24 19:17   ` Alexei Starovoitov
2017-01-24 19:17     ` Alexei Starovoitov
2017-01-25 13:10     ` Michal Hocko
2017-01-25 13:10       ` Michal Hocko
2017-01-25 13:21       ` Michal Hocko
2017-01-25 13:21         ` Michal Hocko
