linux-kernel.vger.kernel.org archive mirror
* Why is text_mutex used in jump_label_transform for x86_64
@ 2020-03-19 13:49 chengjian (D)
  2020-03-20 10:27 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: chengjian (D) @ 2020-03-19 13:49 UTC
  To: andrew.murray, bristot, jakub.kicinski, Kees Cook
  Cc: x86, linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang, chengjian (D)

Hi everyone,

Sorry to disturb you. I have a question about jump_label and am a bit
confused by the code.

I noticed that x86_64 takes text_mutex in this function, but other
architectures do not.

In arch/x86/kernel/jump_label.c:

         static void __ref jump_label_transform(struct jump_entry *entry,
                                                enum jump_label_type type,
                                                int init)
         {
                 mutex_lock(&text_mutex);
                 __jump_label_transform(entry, type, init);
                 mutex_unlock(&text_mutex);
         }

in arch/arm64/kernel/jump_label.c

         void arch_jump_label_transform(struct jump_entry *entry,
                                        enum jump_label_type type)
         {
                 void *addr = (void *)jump_entry_code(entry);
                 u32 insn;

                 if (type == JUMP_LABEL_JMP) {
                         insn = aarch64_insn_gen_branch_imm(jump_entry_code(entry),
                                                            jump_entry_target(entry),
                                                            AARCH64_INSN_BRANCH_NOLINK);
                 } else {
                         insn = aarch64_insn_gen_nop();
                 }

                 aarch64_insn_patch_text_nosync(addr, insn);
         }


Is there something wrong with the x86 code, or is the locking missing on
the other architectures?


Thanks

---- Cheng Jian







* Re: Why is text_mutex used in jump_label_transform for x86_64
  2020-03-19 13:49 Why is text_mutex used in jump_label_transform for x86_64 chengjian (D)
@ 2020-03-20 10:27 ` Peter Zijlstra
  2020-04-06  8:39   ` chengjian (D)
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2020-03-20 10:27 UTC
  To: chengjian (D)
  Cc: andrew.murray, bristot, jakub.kicinski, Kees Cook, x86,
	linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang

On Thu, Mar 19, 2020 at 09:49:04PM +0800, chengjian (D) wrote:
> Hi,everyone
> 
> I'm sorry to disturb you. I have a problem about jump_label, and a bit
> confused about the code
> 
> I noticed that text_mutex is used in this function under x86_64
> architecture,
> but other architectures do not.
> 
> in arch/x86/kernel/jump_label.c
>         static void __ref jump_label_transform(struct jump_entry *entry,
>              enum jump_label_type type,
>              int init)
>         {
>          mutex_lock(&text_mutex);
>          __jump_label_transform(entry, type, init);
>          mutex_unlock(&text_mutex);
> 
> in arch/arm64/kernel/jump_label.c
> 
>         void arch_jump_label_transform(struct jump_entry *entry,
>                                        enum jump_label_type type)
>         {
>                 void *addr = (void *)jump_entry_code(entry);
>                 u32 insn;
> 
>                 if (type == JUMP_LABEL_JMP) {
>                         insn = aarch64_insn_gen_branch_imm(jump_entry_code(entry),
>                                                            jump_entry_target(entry),
>                                                            AARCH64_INSN_BRANCH_NOLINK);
>                 } else {
>                         insn = aarch64_insn_gen_nop();
>                 }
> 
>                 aarch64_insn_patch_text_nosync(addr, insn);
>         }
> 
> 
> Is there anything wrong with x86
> 
> or
> 
> is this missing for other architectures?

It depends on the architecture details of how self-modifying code works.
In particular, x86 is a variable instruction length architecture and
needs extreme care -- its implementation requires there only be a
single text modifier at any one time, hence the use of text_mutex.

ARM64 OTOH is, like most RISC based architectures, a fixed width
instruction architecture. And in particular it can re-write certain
(branch) instructions with impunity (see their
aarch64_insn_patch_text_nosync()). Which is why they don't need
additional serialization.
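
To make the variable-length point concrete, here is a minimal userspace
sketch (an illustration, not kernel code) of the two byte sequences that a
jump label site would toggle between, assuming a 5-byte patch site. The
helper names x86_encode_jmp32() and x86_bytes_changed() are hypothetical;
the encodings assumed are 0xe9 followed by a 32-bit displacement for the
jump, and 0f 1f 44 00 00 as one common 5-byte NOP.

```c
#include <assert.h>
#include <stdint.h>

#define X86_JMP32_SIZE 5 /* one opcode byte + 32-bit displacement */

/* One common 5-byte NOP encoding: nopl 0x0(%rax,%rax,1). */
static const uint8_t x86_nop5[X86_JMP32_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: encode "jmp rel32" placed at address 'from' and
 * branching to 'to'. The displacement is relative to the address of the
 * instruction that follows the jump.
 */
static void x86_encode_jmp32(uint64_t from, uint64_t to,
			     uint8_t out[X86_JMP32_SIZE])
{
	int64_t rel = (int64_t)(to - (from + X86_JMP32_SIZE));
	uint32_t disp;
	int i;

	/* The target must be reachable with a signed 32-bit displacement. */
	assert(rel >= INT32_MIN && rel <= INT32_MAX);
	disp = (uint32_t)(int32_t)rel;

	out[0] = 0xe9;
	for (i = 0; i < 4; i++)	/* little-endian displacement */
		out[1 + i] = (uint8_t)((disp >> (8 * i)) & 0xff);
}

/* Hypothetical helper: count the byte positions where two encodings differ. */
static int x86_bytes_changed(const uint8_t *a, const uint8_t *b)
{
	int i, n = 0;

	for (i = 0; i < X86_JMP32_SIZE; i++)
		n += (a[i] != b[i]);
	return n;
}
```

With from = 0x1000 and to = 0x1010 the jump encodes as e9 0b 00 00 00,
which differs from the NOP above in three bytes. The switch therefore
cannot be done as a single one-byte store, which is the kind of multi-step
rewrite that text_mutex serializes.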


* Re: Why is text_mutex used in jump_label_transform for x86_64
  2020-03-20 10:27 ` Peter Zijlstra
@ 2020-04-06  8:39   ` chengjian (D)
  2020-04-06  9:15     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: chengjian (D) @ 2020-04-06  8:39 UTC
  To: Peter Zijlstra
  Cc: andrew.murray, bristot, jakub.kicinski, Kees Cook, x86,
	linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang, chengjian (D)


On 2020/3/20 18:27, Peter Zijlstra wrote:
> It depends on the architecture details of how self-modifying code works.
> In particular, x86 is a variable instruction length architecture and
> needs extreme care -- its implementation requires there only be a
> single text modifier at any one time, hence the use of text_mutex.
>
> ARM64 OTOH is, like most RISC based architectures, a fixed width
> instruction architecture. And in particular it can re-write certain
> (branch) instructions with impunity (see their
> aarch64_insn_patch_text_nosync()). Which is why they don't need
> additional serialization.

Hi, Peter

Thank you very much for your reply.

x86 has variable-length instructions, and only a single-byte modification
of an instruction can be regarded as atomic, so we must be very careful
when modifying instructions concurrently.

On other architectures such as ARM64, the modification of some
instructions can be considered atomic (e.g. nop -> jmp/b). These are the
instructions that can be executed by one thread of execution while they
are being modified by another thread of execution, without requiring
explicit synchronization.

In ARM64 Architecture Reference Manual, I find that:
     Concurrent modification and execution of instructions can lead to
     the resulting instruction performing any behavior that can be
     achieved by executing any sequence of instructions that can be
     executed from the same Exception level, except where each of the
     instruction before modification and the instruction after
     modification is one of a B, BL, BRK, HVC, ISB, NOP, SMC, or SVC
     instruction.

     For the B, BL, BRK, HVC, ISB, NOP, SMC, and SVC instructions the
     architecture guarantees that, after modification of the
     instruction, behavior is consistent with execution of either:
     • The instruction originally fetched.
     • A fetch of the modified instruction.

So we can safely modify jump_label for ARM64 (from NOP to B or from B to
NOP).
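
As a rough illustration of why this case is simple, the following
userspace sketch (not the kernel implementation) encodes the two
instructions involved as single 32-bit words and swaps them with one
aligned store. The names a64_encode_b() and a64_patch_slot() are
hypothetical. The real aarch64_insn_patch_text_nosync() works on kernel
text and also performs instruction-cache maintenance, which this model
leaves out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define A64_NOP 0xd503201fu /* AArch64 NOP */

/*
 * Hypothetical helper: encode an unconditional "B target" located at 'pc'.
 * The instruction holds a signed 26-bit word offset, so the target must be
 * 4-byte aligned relative to pc and within +/-128 MiB.
 */
static uint32_t a64_encode_b(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	assert(((uint64_t)offset & 3) == 0);
	assert(offset >= -(INT64_C(1) << 27) && offset < (INT64_C(1) << 27));

	/* Shift as unsigned, then keep the low 26 bits (imm26). */
	return 0x14000000u | ((uint32_t)((uint64_t)offset >> 2) & 0x03ffffffu);
}

/*
 * Model of a patch site: one naturally aligned 32-bit slot. Old and new
 * values are both whole instructions, so a reader sees one or the other.
 */
static void a64_patch_slot(_Atomic uint32_t *slot, uint32_t insn)
{
	atomic_store_explicit(slot, insn, memory_order_release);
}
```

For example, a64_encode_b(0x1000, 0x1008) yields 0x14000002, and a
backward branch from 0x1008 to 0x1000 yields 0x17fffffe. Either value
replaces A64_NOP in a single store.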

Is my understanding correct?



Thank You

     -- Cheng Jian




* Re: Why is text_mutex used in jump_label_transform for x86_64
  2020-04-06  8:39   ` chengjian (D)
@ 2020-04-06  9:15     ` Peter Zijlstra
  2020-04-06 14:10       ` Will Deacon
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2020-04-06  9:15 UTC
  To: chengjian (D)
  Cc: andrew.murray, bristot, jakub.kicinski, Kees Cook, x86,
	linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang, Will Deacon

On Mon, Apr 06, 2020 at 04:39:11PM +0800, chengjian (D) wrote:
> 
> On 2020/3/20 18:27, Peter Zijlstra wrote:
> > It depends on the architecture details of how self-modifying code works.
> > In particular, x86 is a variable instruction length architecture and
> > needs extreme care -- its implementation requires there only be a
> > single text modifier at any one time, hence the use of text_mutex.
> > 
> > ARM64 OTOH is, like most RISC based architectures, a fixed width
> > instruction architecture. And in particular it can re-write certain
> > (branch) instructions with impunity (see their
> > aarch64_insn_patch_text_nosync()). Which is why they don't need
> > additional serialization.
> 
> Hi, Peter
> 
> Thank you very much for your reply.
> 
> X86 is a variable-length instruction, only one byte modification of the
> instruction
> can be regarded as atomic. so we must be very careful when modifying
> instructions
> concurrently.

Close enough.

> For other architectures such as ARM64, the modification of some instructions
> can be
> considered atomic, (Eg. nop -> jmp/b). The set of instructions that can be
> executed
> by one thread of execution as they are being modified by another thread of
> execution
> without requiring explicit synchronization.
> 
> In ARM64 Architecture Reference Manual, I find that:
>     Concurrent modification and execution of instructions can lead to the
> resulting instruction performing any behavior
>     that can be achieved by executing any sequence of instructions that can
> be executed from the same Exception level,
>     except where each of the instruction before modification and the
> instruction after modification is one of a B, BL, BRK,
>     HVC, ISB, NOP, SMC, or SVC instruction.
>     For the B, BL, BRK, HVC, ISB, NOP, SMC, and SVC instructions the
> architecture guarantees that, after modification of the
>     instruction, behavior is consistent with execution of either:
>     • The instruction originally fetched.
>     • A fetch of the modified instruction
> 
> So we can safely modify jump_label for ARM64 (from NOP to B or from B to
> NOP).
> 
> Is my understanding correct?

I think so; but I'm really not much of an ARM64 person. FWIW I think I
remember Will saying the same is true of ARM (32bit) and they could
implement the same optimization, but so far nobody has bothered doing
so. But please, ask an ARM64 maintainer and don't take my word for this.



* Re: Why is text_mutex used in jump_label_transform for x86_64
  2020-04-06  9:15     ` Peter Zijlstra
@ 2020-04-06 14:10       ` Will Deacon
  2020-04-08  1:17         ` chengjian (D)
  0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2020-04-06 14:10 UTC
  To: Peter Zijlstra
  Cc: chengjian (D),
	andrew.murray, bristot, jakub.kicinski, Kees Cook, x86,
	linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang

On Mon, Apr 06, 2020 at 11:15:51AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 06, 2020 at 04:39:11PM +0800, chengjian (D) wrote:
> > 
> > On 2020/3/20 18:27, Peter Zijlstra wrote:
> > > It depends on the architecture details of how self-modifying code works.
> > > In particular, x86 is a variable instruction length architecture and
> > > needs extreme care -- its implementation requires there only be a
> > > single text modifier at any one time, hence the use of text_mutex.
> > > 
> > > ARM64 OTOH is, like most RISC based architectures, a fixed width
> > > instruction architecture. And in particular it can re-write certain
> > > (branch) instructions with impunity (see their
> > > aarch64_insn_patch_text_nosync()). Which is why they don't need
> > > additional serialization.
> > 
> > Hi, Peter
> > 
> > Thank you very much for your reply.
> > 
> > X86 is a variable-length instruction, only one byte modification of the
> > instruction
> > can be regarded as atomic. so we must be very careful when modifying
> > instructions
> > concurrently.
> 
> Close enough.
> 
> > For other architectures such as ARM64, the modification of some instructions
> > can be
> > considered atomic, (Eg. nop -> jmp/b). The set of instructions that can be
> > executed
> > by one thread of execution as they are being modified by another thread of
> > execution
> > without requiring explicit synchronization.
> > 
> > In ARM64 Architecture Reference Manual, I find that:
> >     Concurrent modification and execution of instructions can lead to the
> > resulting instruction performing any behavior
> >     that can be achieved by executing any sequence of instructions that can
> > be executed from the same Exception level,
> >     except where each of the instruction before modification and the
> > instruction after modification is one of a B, BL, BRK,
> >     HVC, ISB, NOP, SMC, or SVC instruction.
> >     For the B, BL, BRK, HVC, ISB, NOP, SMC, and SVC instructions the
> > architecture guarantees that, after modification of the
> >     instruction, behavior is consistent with execution of either:
> >     • The instruction originally fetched.
> >     • A fetch of the modified instruction
> > 
> > So we can safely modify jump_label for ARM64 (from NOP to B or from B to
> > NOP).
> > 
> > Is my understanding correct?
> 
> I think so; but I'm really not much of an ARM64 person. FWIW I think I
> remember Will saying the same is true of ARM (32bit) and they could
> implement the same optimization, but so far nobody has bothered doing
> so. But please, ask an ARM64 maintainer and don't take my word for this.

On 32-bit there are complications with Thumb-2 instructions where you can
have a mixture of 16-bit and 32-bit encodings, so you have to be pretty
careful there.
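
One way to see the complication is the small userspace sketch below. It
assumes the standard Thumb-2 rule that a first halfword whose top five
bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction, and that
anything else is a complete 16-bit instruction. The function names are
hypothetical and this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return non-zero if 'hw1' is the first halfword of a 32-bit encoding. */
static int thumb_is_32bit(uint16_t hw1)
{
	unsigned int top5 = hw1 >> 11;

	return top5 == 0x1d || top5 == 0x1e || top5 == 0x1f;
}

/*
 * Walk a buffer of 'n' halfwords and count the instructions in it.
 * Instruction boundaries depend on the contents, unlike on AArch64 where
 * every instruction is one 32-bit word.
 */
static size_t thumb_count_insns(const uint16_t *hw, size_t n)
{
	size_t i = 0, count = 0;

	while (i < n) {
		size_t len = thumb_is_32bit(hw[i]) ? 2 : 1;

		/* A 32-bit encoding must not run past the end of the buffer. */
		assert(i + len <= n);
		i += len;
		count++;
	}
	return count;
}
```

For instance, the halfwords bf00, f3af, 8000, bf00 decode as a 16-bit NOP,
a 32-bit NOP.W and another 16-bit NOP: three instructions in four
halfwords. A patch site may therefore be either 2 or 4 bytes wide.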

For arm64, we have aarch64_insn_patch_text_nosync() which we use to toggle
jump labels.

Will


* Re: Why is text_mutex used in jump_label_transform for x86_64
  2020-04-06 14:10       ` Will Deacon
@ 2020-04-08  1:17         ` chengjian (D)
  0 siblings, 0 replies; 6+ messages in thread
From: chengjian (D) @ 2020-04-08  1:17 UTC
  To: Will Deacon, Peter Zijlstra
  Cc: andrew.murray, bristot, jakub.kicinski, Kees Cook, x86,
	linux-kernel, linux-arm-kernel, Xiexiuqi (Xie XiuQi),
	Li Bin, bobo.shaobowang, chengjian (D)


On 2020/4/6 22:10, Will Deacon wrote:
> On Mon, Apr 06, 2020 at 11:15:51AM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 06, 2020 at 04:39:11PM +0800, chengjian (D) wrote:
>>> On 2020/3/20 18:27, Peter Zijlstra wrote:
>>>> It depends on the architecture details of how self-modifying code works.
>>>> In particular, x86 is a variable instruction length architecture and
>>>> needs extreme care -- its implementation requires there only be a
>>>> single text modifier at any one time, hence the use of text_mutex.
>>>>
>>>> ARM64 OTOH is, like most RISC based architectures, a fixed width
>>>> instruction architecture. And in particular it can re-write certain
>>>> (branch) instructions with impunity (see their
>>>> aarch64_insn_patch_text_nosync()). Which is why they don't need
>>>> additional serialization.
>>> Hi, Peter
>>>
>>> Thank you very much for your reply.
>>>
>>> X86 is a variable-length instruction, only one byte modification of the
>>> instruction
>>> can be regarded as atomic. so we must be very careful when modifying
>>> instructions
>>> concurrently.
>> Close enough.
>>
>>> For other architectures such as ARM64, the modification of some instructions
>>> can be
>>> considered atomic, (Eg. nop -> jmp/b). The set of instructions that can be
>>> executed
>>> by one thread of execution as they are being modified by another thread of
>>> execution
>>> without requiring explicit synchronization.
>>>
>>> In ARM64 Architecture Reference Manual, I find that:
>>>      Concurrent modification and execution of instructions can lead to the
>>> resulting instruction performing any behavior
>>>      that can be achieved by executing any sequence of instructions that can
>>> be executed from the same Exception level,
>>>      except where each of the instruction before modification and the
>>> instruction after modification is one of a B, BL, BRK,
>>>      HVC, ISB, NOP, SMC, or SVC instruction.
>>>      For the B, BL, BRK, HVC, ISB, NOP, SMC, and SVC instructions the
>>> architecture guarantees that, after modification of the
>>>      instruction, behavior is consistent with execution of either:
>>>      • The instruction originally fetched.
>>>      • A fetch of the modified instruction
>>>
>>> So we can safely modify jump_label for ARM64 (from NOP to B or from B to
>>> NOP).
>>>
>>> Is my understanding correct?
>> I think so; but I'm really not much of an ARM64 person. FWIW I think I
>> remember Will saying the same is true of ARM (32bit) and they could
>> implement the same optimization, but so far nobody has bothered doing
>> so. But please, ask an ARM64 maintainer and don't take my word for this.
> On 32-bit there are complications with Thumb-2 instructions where you can
> have a mixture of 16-bit and 32-bit encodings, so you have to be pretty
> careful there.
>
> For arm64, we have aarch64_insn_patch_text_nosync() which we use to toggle
> jump labels.
>
> Will
>
> .


Hi, Peter and Will

     I understand now.

     I truly appreciate your timely help.


     Thanks a lot.

     -- Cheng Jian




