All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	 Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	 Martin KaFai Lau <martin.lau@kernel.org>,
	David Vernet <void@manifault.com>, Tejun Heo <tj@kernel.org>,
	 Raj Sahu <rjsu26@vt.edu>, Dan Williams <djwillia@vt.edu>,
	Rishabh Iyer <rishabh.iyer@epfl.ch>,
	 Sanidhya Kashyap <sanidhya.kashyap@epfl.ch>
Subject: Re: [RFC PATCH v1 00/14] Exceptions - Resource Cleanup
Date: Mon, 18 Mar 2024 06:40:09 +0100	[thread overview]
Message-ID: <CAP01T74kov3JKJFvpaqz6CjEkzBMbOfDLc6Xjg_n5=g9osApjA@mail.gmail.com> (raw)
In-Reply-To: <94ee37372c90c28980246ab803dffb3d2b63be35.camel@gmail.com>

On Thu, 14 Mar 2024 at 12:08, Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Hi Kumar,
>

Hello Eduard,

> Sorry for delayed response. Theoretical possibility that frame merge
> might fail because of some e.g. clang version changes bugs me,
> so I thought a bit on what alternatives are there.

I spent some time this weekend addressing feedback, and prepping the
next version.
As you point out (and as I explained in one of the replies to you),
the possibility of the
same stack slot / register containing a conflicting resource type can
cause merge problems.
I have had a similar thought to what you describe below (emitting
resource hints in such cases) to address this,
but I slept over your idea and have a few thoughts. Please let me know
what you think.

> For the sake of discussion, what do you think about runtime-based
> approach:
> - for each throwing sub-program reserve a stack region where
>   references to currently acquired resources (and their types)
>   would be stored;
> - upon a call to something that acquires resource push pointer/type
>   pair to this region;
> - upon a call to something that releases resource delete pointer/type
>   pair from this region;
> - when bpf_throw() is executed, walk this region for each active frame
>   and execute corresponding destructors.
> - (alternatively, reserve one region for all frames).
>

I went over this option early on but rejected it due to the typical
downsides you observed.
The maximum stack usage due to this will be peak resource acquisition
during the runtime of a subprogram (N) x 16 bytes.
The maximum will be the sum of all subprogs in a call chain.
This seems a bit problematic to me, given this can basically greatly
exceed typical program stack usage. Compared to
extra stack slot usage by bpf_loop inlining, and may_goto instruction,
this appears to be too much.
We can mitigate it by using a more space efficient encoding to
identify resources, but the core problem still remains.
Apart from that, there's the overhead to know zero of this stack
portion on entry every time.

I just think this is all unnecessary especially when most of the time
the exception is not going to be thrown anyway,
so it's all cost incurred for a case that will practically never
happen in a correct program.

> Technically, this could be implemented as follows:
> - during main verification pass:
>   - when verifier processes a call to something that acquires resource,
>     mark call instruction and resource type in insn_aux_data data;
>   - same for processing of a call to something that releases resource;
>   - keep track of a maximal number of simultaneously acquired resources;
> - after main verification pass:
>   - bump stack size for each subprogram by amount, necessary to hold
>     the acquired resource table and assume that this table is at the
>     end of the subprogram stack;
>   - after each acquire call, insert a kfunc call that would add
>     resource reference to the table;
>   - after each release call, insert a kfunc call that would remove
>     resource reference from the table.
>
> On a surface it appears that this approach has a few advantages:
> - seems simpler than frame descriptors tracking and merging
>   (but only implementation would tell if this is so);
> - should be resilient to program compilation changes;
> - abort is possible at any program point, which might be interesting for:
>   - cancelable BPF programs (where abort might be needed in the middle
>     of the e.g. loop);

Let us discuss cancellation in another set that I plan to send in some
time after this one,
but I think aborting arbitrarily at any point of the program is not
the way to go. I have also
reviewed literature on how this happens in other language runtimes,
and the broad consensus
is to have explicit cancellation points in programs, and only allow
cancellation for them.
Synchronous and asynchronous is irrelevant to the mechanism, but
usually it is at explicit program
points.

Not only can you encounter a program executing in the middle of a
loop, you can interrupt it within a kfunc,
in such a case, you cannot really abort it immediately, as that may
not be safe. It is thus better to designate
cancellation points in the program, which are visited even in cases
where the program may possibly be stuck
for a long time (like iterator next kfunc) and only do cancellation
when executing them.

Anyway, let's continue this discussion once I send the RFC out for
cancellation. Will try to also include a PoC
for arena fault aborting.

>   - arenas, where it might be desirable to stop the program after e.g.
>     faulting arena access.
>
> The obvious disadvantage is incurred runtime cost.
> On the other hand, it might be not that big.
>
> What do you think?

Let's go back to the problem we can encounter with frame descriptors.
The case of multiple paths having distinct resource types in a common
stack slot.

What I would do is only force a spill of the specific resource type
(right after the pc acquires it)
only if their stack slots are unmergeable. Since we go over the stack
first, any conflicts in the register
types will be resolved since all states for the ref_obj_id will be
erased. For conflicts in registers, we'd
do something similar. A new 8-byte entry would be reserved for the pc
after max_stack_depth and
all these entries will be zeroed upon entry. A similar thing could be
done by the compiler itself and the
verifier wouldn't have to do all this, but for now let's just do it in
the verifier.

In this way, when we encounter an unlikely case where the same slot
has conflicting entries, we'll erase
state by storing its entry in the stack slot reserved for the pc, and
clear its state from the verifier state,
allowing all respective entries for the same resource to now merge
easily. The same can be repeated
for each resource if all of them conflict.

So far playing around with this series applying it to program
cancellation and sched-ext cases, I haven't
encountered merging issues, so it seems like an edge case that is
unlikely to often occur in practice. But when it does,
it can be resolved by the logic above. This should address all issues
and should be resilient to changes triggered by different clang
versions.

Let me know what you think.

>
> Thanks,
> Eduard

      reply	other threads:[~2024-03-18  5:40 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-01  4:20 [RFC PATCH v1 00/14] Exceptions - Resource Cleanup Kumar Kartikeya Dwivedi
2024-02-01  4:20 ` [RFC PATCH v1 01/14] bpf: Mark subprogs as throw reachable before do_check pass Kumar Kartikeya Dwivedi
2024-02-12 19:35   ` David Vernet
2024-02-12 22:28     ` Kumar Kartikeya Dwivedi
2024-02-15  1:01   ` Eduard Zingerman
2024-02-16 21:34     ` Kumar Kartikeya Dwivedi
2024-02-01  4:20 ` [RFC PATCH v1 02/14] bpf: Process global subprog's exception propagation Kumar Kartikeya Dwivedi
2024-02-15  1:10   ` Eduard Zingerman
2024-02-16 21:50     ` Kumar Kartikeya Dwivedi
2024-02-17 14:04       ` Eduard Zingerman
2024-02-01  4:20 ` [RFC PATCH v1 03/14] selftests/bpf: Add test for throwing global subprog with acquired refs Kumar Kartikeya Dwivedi
2024-02-15  1:10   ` Eduard Zingerman
2024-02-01  4:20 ` [RFC PATCH v1 04/14] bpf: Refactor check_pseudo_btf_id's BTF reference bump Kumar Kartikeya Dwivedi
2024-02-15  1:11   ` Eduard Zingerman
2024-02-16 21:50     ` Kumar Kartikeya Dwivedi
2024-02-01  4:21 ` [RFC PATCH v1 05/14] bpf: Implement BPF exception frame descriptor generation Kumar Kartikeya Dwivedi
2024-02-15 18:24   ` Eduard Zingerman
2024-02-16 11:23     ` Eduard Zingerman
2024-02-16 22:06       ` Kumar Kartikeya Dwivedi
2024-02-17 17:14         ` Eduard Zingerman
2024-02-20 21:58           ` Kumar Kartikeya Dwivedi
2024-02-16 22:24     ` Kumar Kartikeya Dwivedi
2024-02-01  4:21 ` [RFC PATCH v1 06/14] bpf: Adjust frame descriptor pc on instruction patching Kumar Kartikeya Dwivedi
2024-02-15 16:31   ` Eduard Zingerman
2024-02-16 21:52     ` Kumar Kartikeya Dwivedi
2024-02-17 14:08       ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 07/14] bpf: Use hidden subprog trampoline for bpf_throw Kumar Kartikeya Dwivedi
2024-02-15 22:11   ` Eduard Zingerman
2024-02-16 21:59     ` Kumar Kartikeya Dwivedi
2024-02-17 14:22       ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 08/14] bpf: Compute used callee saved registers for subprogs Kumar Kartikeya Dwivedi
2024-02-15 22:12   ` Eduard Zingerman
2024-02-16 22:02     ` Kumar Kartikeya Dwivedi
2024-02-17 14:26       ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 09/14] bpf, x86: Fix up pc offsets for frame descriptor entries Kumar Kartikeya Dwivedi
2024-02-15 22:12   ` Eduard Zingerman
2024-02-16 13:33     ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 10/14] bpf, x86: Implement runtime resource cleanup for exceptions Kumar Kartikeya Dwivedi
2024-02-03  9:02   ` kernel test robot
2024-02-16 12:02   ` Eduard Zingerman
2024-02-16 22:28     ` Kumar Kartikeya Dwivedi
2024-02-19 12:01       ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 11/14] bpf: Release references in verifier state when throwing exceptions Kumar Kartikeya Dwivedi
2024-02-16 12:21   ` Eduard Zingerman
2024-02-01  4:21 ` [RFC PATCH v1 12/14] bpf: Register cleanup dtors for runtime unwinding Kumar Kartikeya Dwivedi
2024-02-01  4:21 ` [RFC PATCH v1 13/14] bpf: Make bpf_throw available to all program types Kumar Kartikeya Dwivedi
2024-02-01  4:21 ` [RFC PATCH v1 14/14] selftests/bpf: Add tests for exceptions runtime cleanup Kumar Kartikeya Dwivedi
2024-02-12 20:53   ` David Vernet
2024-02-12 22:43     ` Kumar Kartikeya Dwivedi
2024-02-13 19:33       ` David Vernet
2024-02-13 20:51         ` Kumar Kartikeya Dwivedi
2024-03-14 11:08 ` [RFC PATCH v1 00/14] Exceptions - Resource Cleanup Eduard Zingerman
2024-03-18  5:40   ` Kumar Kartikeya Dwivedi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP01T74kov3JKJFvpaqz6CjEkzBMbOfDLc6Xjg_n5=g9osApjA@mail.gmail.com' \
    --to=memxor@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=djwillia@vt.edu \
    --cc=eddyz87@gmail.com \
    --cc=martin.lau@kernel.org \
    --cc=rishabh.iyer@epfl.ch \
    --cc=rjsu26@vt.edu \
    --cc=sanidhya.kashyap@epfl.ch \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.