linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nadav Amit <namit@vmware.com>
To: Edward Cree <ecree@solarflare.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH v2 0/4] Static calls
Date: Wed, 12 Dec 2018 18:14:00 +0000	[thread overview]
Message-ID: <294E22E9-7577-4716-A531-CBFE628789C3@vmware.com> (raw)
In-Reply-To: <899194d1-9777-71ed-70db-212d2983a400@solarflare.com>

> On Dec 12, 2018, at 9:11 AM, Edward Cree <ecree@solarflare.com> wrote:
> 
> On 12/12/18 05:59, Nadav Amit wrote:
>> Thanks for cc’ing me. (I didn’t know about the other patch-sets.)
> Well in my case, that's because I haven't posted any yet.  (Will follow up
>  shortly with what I currently have, though it's not pretty.)
> 
> Looking at your patches, it seems you've got a much more developed learning
>  mechanism.  Mine on the other hand is brutally simple but runs continuously
>  (i.e. after we patch we immediately enter the next 'relearning' phase);
>  since it never does anything but prod a handful of percpu variables, this
>  shouldn't be too costly.
> 
> Also, you've got the macrology for making all indirect calls use this,
>  whereas at present I just have an open-coded instance on a single call site
>  (I went with deliver_skb in the networking stack).
> 
> So I think where we probably want to go from here is:
>  1) get Josh's static_calls in.  AIUI Linus seems to prefer the out-of-line
>     approach; I'd say ditch the inline version (at least for now).
>  2) build a relpolines patch series that uses
>    i) static_calls for the text-patching part
>   ii) as much of Nadav's macrology as is applicable
>  iii) either my or Nadav's learning mechanism; we can experiment with both,
>       bikeshed it incessantly etc.
> 
> Seem reasonable?

Mostly yes. I have a few reservations (and let’s call them optpolines from
now on, since Josh disliked the previous name).

First, I still have to address the issues that Josh raised before, and try
to use gcc plugin instead of (most) of the macros. Specifically, I need to
bring back (from my PoC code) the part that sets multiple targets.

Second, (2i) is not very intuitive for me. Using the out-of-line static
calls seems to me as less performant than the inline (potentially, I didn’t
check).

Anyhow, the use of out-of-line static calls seems to me as
counter-intuitive. I think (didn’t measure) that it may add more overhead
than it saves due to the additional call, ret, and so on - at least if
retpolines are not used. For multiple targets it may be useful in saving
some memory if the outline block is dynamically allocated (as I did in my
yet unpublished code). But that’s not how it’s done in Josh’s code.

If we talk about inline implementation there is a different problem that
prevents me of using Josh’s static-calls as-is. I tried to avoid reading to
compared target from memory and therefore used an immediate. This should
prevent data cache misses and even when the data is available is faster by
one cycle. But it requires the patching of both the “cmp %target-reg, imm”
and “call rel-target” to be patched “atomically”. So the static-calls
mechanism wouldn’t be sufficient.

Based on Josh’s previous feedback, I thought of improving the learning using
some hysteresis. Anyhow, note that there are quite a few cases in which you
wouldn’t want optpolines. The question is whether in general it would be an
opt-in or opt-out mechanism.

Let me know what you think.

BTW: When it comes to deliver_skb, you have packet_type as an identifier.
You can use it directly or through an indirection table to figure the
target. Here’s a chunk of assembly magic that I used in a similar case:

.macro _call_table val:req bit:req max:req val1:req bit1:req
call_table_\val\()_\bit\():
        test $(1 << \bit), %al
.if \val1 + (1 << \bit1) >= \max
        jnz syscall_relpoline_\val1
        jmp syscall_relpoline_\val
.else
        jnz call_table_\val1\()_\bit1

        # fall-through to no carry, val unchange, going to next bit
        call_table \val,\bit1,\max
        call_table \val1,\bit1,\max
.endif
.endm

.macro call_table val:req bit:req max:req
.altmacro
        _call_table \val,\bit,\max,%(\val + (1 << \bit)),%(\bit + 1)
.noaltmacro
.endm

ENTRY(direct_syscall)
        mov %esi, %eax
        call_table val=0 bit=0 max=16
ENDPROC(direct_syscall)

  parent reply	other threads:[~2018-12-12 18:14 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-26 13:54 [PATCH v2 0/4] Static calls Josh Poimboeuf
2018-11-26 13:54 ` [PATCH v2 1/4] compiler.h: Make __ADDRESSABLE() symbol truly unique Josh Poimboeuf
2018-11-27  8:49   ` Ard Biesheuvel
2018-11-26 13:54 ` [PATCH v2 2/4] static_call: Add static call infrastructure Josh Poimboeuf
2018-11-26 13:54 ` [PATCH v2 3/4] x86/static_call: Add out-of-line static call implementation Josh Poimboeuf
2018-11-26 15:43   ` Peter Zijlstra
2018-11-26 16:19     ` Steven Rostedt
2018-11-26 13:55 ` [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64 Josh Poimboeuf
2018-11-26 16:02   ` Peter Zijlstra
2018-11-26 17:10     ` Josh Poimboeuf
2018-11-26 17:56       ` Josh Poimboeuf
2018-11-26 20:00         ` Peter Zijlstra
2018-11-26 20:08         ` Peter Zijlstra
2018-11-26 21:26           ` Josh Poimboeuf
2018-11-27  8:43             ` Peter Zijlstra
2018-11-27  8:50               ` Peter Zijlstra
2018-11-29  6:05               ` Andy Lutomirski
2018-11-29  9:42                 ` Peter Zijlstra
2018-11-29 13:11                   ` Josh Poimboeuf
2018-11-29 13:37                   ` Andy Lutomirski
2018-11-29 14:38                     ` Peter Zijlstra
2018-11-29 14:42                       ` Jiri Kosina
2018-11-29 16:33                       ` Josh Poimboeuf
2018-11-29 16:49                         ` Peter Zijlstra
2018-11-29 16:59                           ` Andy Lutomirski
2018-11-29 17:10                             ` Josh Poimboeuf
2018-11-29 22:01                               ` Peter Zijlstra
2018-11-29 22:14                                 ` Josh Poimboeuf
2018-11-29 22:22                                   ` Peter Zijlstra
2018-11-29 22:25                                     ` Andy Lutomirski
2018-11-29 22:30                                       ` Josh Poimboeuf
2018-11-29 17:15                             ` Peter Zijlstra
2018-11-29 17:20                               ` Steven Rostedt
2018-11-29 17:21                                 ` Steven Rostedt
2018-11-29 17:41                                   ` Andy Lutomirski
2018-11-29 17:45                                     ` Josh Poimboeuf
2018-11-29 17:52                                       ` Andy Lutomirski
2018-11-29 17:49                                     ` Steven Rostedt
2018-11-29 18:37                               ` Josh Poimboeuf
2018-11-29 16:50                         ` Linus Torvalds
2018-11-29 16:55                           ` Steven Rostedt
2018-11-29 17:02                           ` Andy Lutomirski
2018-11-29 17:07                             ` Peter Zijlstra
2018-11-29 17:31                               ` Andy Lutomirski
2018-11-29 17:35                                 ` Jiri Kosina
2018-11-29 17:13                             ` Steven Rostedt
2018-11-29 17:35                               ` Linus Torvalds
2018-11-29 17:44                                 ` Steven Rostedt
2018-11-29 17:50                                   ` Linus Torvalds
2018-11-29 17:54                                     ` Linus Torvalds
2018-11-29 17:58                                     ` Steven Rostedt
2018-11-29 18:23                                       ` Linus Torvalds
2018-11-29 18:47                                         ` Steven Rostedt
2018-11-29 18:58                                           ` Linus Torvalds
2018-11-29 19:08                                             ` Linus Torvalds
2018-11-29 19:11                                               ` Linus Torvalds
2018-12-10 23:58                                                 ` Pavel Machek
2018-12-11  1:43                                                   ` Linus Torvalds
2018-11-29 19:12                                               ` Steven Rostedt
2018-11-29 19:27                                               ` Andy Lutomirski
2018-11-29 20:24                                                 ` Josh Poimboeuf
2018-11-29 22:17                                                   ` Josh Poimboeuf
2018-11-29 23:04                                                   ` Linus Torvalds
2018-11-30 16:27                                                     ` Josh Poimboeuf
2018-12-11  9:41                                                       ` David Laight
2018-12-11 17:19                                                         ` Josh Poimboeuf
2018-12-12 18:29                                                     ` Josh Poimboeuf
2018-11-30 16:42                                                   ` Andy Lutomirski
2018-11-30 18:39                                                     ` Josh Poimboeuf
2018-11-30 19:45                                                       ` Linus Torvalds
2018-11-30 20:18                                                         ` Andy Lutomirski
2018-11-30 20:28                                                           ` Steven Rostedt
2018-11-30 20:59                                                             ` Andy Lutomirski
2018-11-30 21:01                                                               ` Steven Rostedt
2018-11-30 21:13                                                               ` Jiri Kosina
2018-11-30 21:10                                                           ` Josh Poimboeuf
2018-11-29 19:16                                             ` Steven Rostedt
2018-11-29 19:22                                               ` Josh Poimboeuf
2018-11-29 19:27                                                 ` Steven Rostedt
2018-11-30 22:16                                                 ` Rasmus Villemoes
2018-11-30 22:24                                                   ` Josh Poimboeuf
2018-11-29 19:24                                               ` Linus Torvalds
2018-11-29 19:28                                                 ` Andy Lutomirski
2018-11-29 19:31                                                 ` Steven Rostedt
2018-11-29 20:12                                             ` Josh Poimboeuf
2018-11-29 18:00                                     ` Andy Lutomirski
2018-11-29 18:42                                       ` Linus Torvalds
2018-11-29 18:55                                       ` Steven Rostedt
2018-11-29 17:29                             ` Linus Torvalds
2018-11-29 17:35                               ` Andy Lutomirski
2018-11-26 18:28       ` Andy Lutomirski
2018-11-26 20:14         ` Josh Poimboeuf
2018-11-27  8:46           ` Peter Zijlstra
2018-11-26 16:08   ` Peter Zijlstra
2018-11-26 16:11     ` Ard Biesheuvel
2018-11-26 16:33       ` Andy Lutomirski
2018-11-26 16:39       ` Peter Zijlstra
2018-11-26 16:44         ` Josh Poimboeuf
2018-11-26 14:01 ` [PATCH v2 0/4] Static calls Josh Poimboeuf
2018-11-26 20:54 ` Steven Rostedt
2018-11-26 22:24   ` Josh Poimboeuf
2018-11-26 22:53     ` Steven Rostedt
2018-12-04 23:08 ` Steven Rostedt
2018-12-04 23:41   ` Andy Lutomirski
2018-12-05 15:04     ` Josh Poimboeuf
2018-12-05 23:36       ` Andy Lutomirski
2018-12-07 16:06 ` Edward Cree
2018-12-07 16:49   ` Edward Cree
2018-12-11 18:05   ` Josh Poimboeuf
2018-12-12  5:59     ` Nadav Amit
2018-12-12 17:11       ` Edward Cree
2018-12-12 17:47         ` [RFC/WIP PATCH 0/2] dynamic calls Edward Cree
2018-12-12 17:50           ` [RFC PATCH 1/2] static_call: fix out-of-line static call implementation Edward Cree
2018-12-12 17:52           ` [RFC PATCH 2/2] net: core: rather hacky PoC implementation of dynamic calls Edward Cree
2018-12-12 18:14         ` Nadav Amit [this message]
2018-12-12 18:33           ` [PATCH v2 0/4] Static calls Edward Cree
2018-12-12 21:15             ` Nadav Amit
2018-12-12 21:36               ` Edward Cree
2018-12-12 21:45                 ` Nadav Amit
2018-12-10 23:57 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=294E22E9-7577-4716-A531-CBFE628789C3@vmware.com \
    --to=namit@vmware.com \
    --cc=ecree@solarflare.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).