linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG][-tip] kprobes on module functions hits kernel BUG in text_poke on x86-32
@ 2009-04-04 14:34 Masami Hiramatsu
  2009-04-04 15:42 ` Mathieu Desnoyers
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-04 14:34 UTC (permalink / raw)
  To: Ingo Molnar, Mathieu Desnoyers, Ananth N Mavinakayanahalli
  Cc: LKML, systemtap-ml

Hi,

I found text_poke() problem on x86-32 with the latest-tip tree.
When I put a kprobe on a module function, text_poke() hit a BUG.

This bug can be reproduced on x86-32, but not on x86-64.
And inserting kprobes on a kernel-core function is OK.

Thank you,

------------[ cut here ]------------
kernel BUG at /home/mhiramat/ksrc/linux-2.6-tip/arch/x86/kernel/alternative.c:543!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/net/eth0/broadcast
Modules linked in: probe_bench(+) netconsole configfs sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 dm_mirror
dm_region_hash dm_log dm_multipath uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore dcdbas
i2c_nforce2 pcspkr tg3 snd_page_alloc rtc_cmos rtc_core i2c_core libphy rtc_lib ata_generic pata_acpi sata_nv [last
unloaded: scsi_wait_scan]

Pid: 5411, comm: insmod Not tainted (2.6.29-tip #8) OptiPlex 740
EIP: 0060:[<c06e97ef>] EFLAGS: 00210893 CPU: 0
EIP is at text_poke+0x168/0x1a4
EAX: 00040f55 EBX: 00020800 ECX: f45d8f03 EDX: 00000000
ESI: f45d8f04 EDI: ffc58001 EBP: f45d8ef8 ESP: f45d8ed8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process insmod (pid: 5411, ti=f45d8000 task=f4490000 task.ti=f45d8000)
Stack:
 00000001 f45d8f03 f8464000 00200286 00000000 00000000 00000000 f846492c
 f45d8f04 c06ea30d cc46492c f45d8f30 c06eb4bb 000000d8 f8464934 f84647e4
 f84647e4 00936fc0 f45d8f30 f84647e4 fffffffc 00936fc0 f45d8f40 f8464110
Call Trace:
 [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
 [<c06ea30d>] ? arch_arm_kprobe+0x1a/0x1c
 [<c06eb4bb>] ? register_kprobe+0x3b8/0x40a
 [<f8464110>] ? install_probe+0x31/0x13d [probe_bench]
 [<c040304f>] ? do_one_initcall+0x4a/0x11a
 [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
 [<c044b8fe>] ? up_read+0x16/0x2c
 [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
 [<c045e5e4>] ? sys_init_module+0x89/0x18c
 [<c0407b44>] ? sysenter_do_call+0x12/0x38
Code: 00 00 6a 00 ff 15 04 83 85 c0 58 5a e8 86 82 d3 ff 90 b8 01 00 00 00 0f a2 31 d2 eb 13 8b 4d e8 8a 04 11 8b 4d e4
3a 04 11 74 04 <0f> 0b eb fe 42 3b 55 e0 72 e8 f7 45 ec 00 02 00 00 75 10 8b 45
EIP: [<c06e97ef>] text_poke+0x168/0x1a4 SS:ESP 0068:f45d8ed8
---[ end trace 12f1ca8c7f7964a0 ]---
Kernel panic - not syncing: Fatal exception
Pid: 5411, comm: insmod Tainted: G      D    2.6.29-tip #8
Call Trace:
 [<c06e6081>] ? printk+0xf/0x16
 [<c06e5fc8>] panic+0x39/0xe3
 [<c06e9526>] oops_end+0x96/0xa5
 [<c040a486>] die+0x54/0x5a
 [<c06e8e43>] do_trap+0x89/0xa2
 [<c0408bbe>] ? do_invalid_op+0x0/0x7b
 [<c0408c2f>] do_invalid_op+0x71/0x7b
 [<c06e97ef>] ? text_poke+0x168/0x1a4
 [<c045567d>] ? mark_lock+0x1e/0x1e0
 [<c042672c>] ? set_pte_vaddr+0xac/0xcf
 [<c054d6fc>] ? trace_hardirqs_off_thunk+0xc/0x10
 [<c06e8bfa>] error_code+0x72/0x78
 [<c0408bbe>] ? do_invalid_op+0x0/0x7b
 [<c06e97ef>] ? text_poke+0x168/0x1a4
 [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
 [<c06ea30d>] arch_arm_kprobe+0x1a/0x1c
 [<c06eb4bb>] register_kprobe+0x3b8/0x40a
 [<f8464110>] install_probe+0x31/0x13d [probe_bench]
 [<c040304f>] do_one_initcall+0x4a/0x11a
 [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
 [<c044b8fe>] ? up_read+0x16/0x2c
 [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
 [<c045e5e4>] sys_init_module+0x89/0x18c
 [<c0407b44>] sysenter_do_call+0x12/0x38
Rebooting in 5 seconds..

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][-tip] kprobes on module functions hits kernel BUG in text_poke on x86-32
  2009-04-04 14:34 [BUG][-tip] kprobes on module functions hits kernel BUG in text_poke on x86-32 Masami Hiramatsu
@ 2009-04-04 15:42 ` Mathieu Desnoyers
  2009-04-04 18:28   ` Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-04-04 15:42 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Hi,
> 
> I found text_poke() problem on x86-32 with the latest-tip tree.
> When I put a kprobe on a module function, text_poke() hit a BUG.
> 
> This bug can be reproduced on x86-32, but not on x86-64.
> And inserting kprobes on a kernel-core function is OK.
> 
> Thank you,
> 

Hi Masami,

OK, so text_poke safety net saves the day :)

Basically, what we have here is the BUG_ON I have put :

       for (i = 0; i < len; i++)
                BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);

Which checks that the modification is really preceivable in the kernel
code, triggers this bug. Only for modules you say.

It might not be this, but.. let's try something simple (this could be
completely unrelated, but won't take long to test): can you try to add a
vmalloc_sync_all() at the beginning of text_poke ? This would make sure
that lazily-populated TLB entries, which include module code and data on
x86, will be populated. I wonder if we hit this problem because
vmalloc_to_page would be returning a mapping to a yet unpopulated TLB
entry, if it is ever possible.

If that's not this, then I guess we have some problem with setting a
fixmap to a page returned by vmalloc on x86 32.

Mathieu


> ------------[ cut here ]------------
> kernel BUG at /home/mhiramat/ksrc/linux-2.6-tip/arch/x86/kernel/alternative.c:543!
> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/net/eth0/broadcast
> Modules linked in: probe_bench(+) netconsole configfs sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
> nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 dm_mirror
> dm_region_hash dm_log dm_multipath uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy
> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore dcdbas
> i2c_nforce2 pcspkr tg3 snd_page_alloc rtc_cmos rtc_core i2c_core libphy rtc_lib ata_generic pata_acpi sata_nv [last
> unloaded: scsi_wait_scan]
> 
> Pid: 5411, comm: insmod Not tainted (2.6.29-tip #8) OptiPlex 740
> EIP: 0060:[<c06e97ef>] EFLAGS: 00210893 CPU: 0
> EIP is at text_poke+0x168/0x1a4
> EAX: 00040f55 EBX: 00020800 ECX: f45d8f03 EDX: 00000000
> ESI: f45d8f04 EDI: ffc58001 EBP: f45d8ef8 ESP: f45d8ed8
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process insmod (pid: 5411, ti=f45d8000 task=f4490000 task.ti=f45d8000)
> Stack:
>  00000001 f45d8f03 f8464000 00200286 00000000 00000000 00000000 f846492c
>  f45d8f04 c06ea30d cc46492c f45d8f30 c06eb4bb 000000d8 f8464934 f84647e4
>  f84647e4 00936fc0 f45d8f30 f84647e4 fffffffc 00936fc0 f45d8f40 f8464110
> Call Trace:
>  [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
>  [<c06ea30d>] ? arch_arm_kprobe+0x1a/0x1c
>  [<c06eb4bb>] ? register_kprobe+0x3b8/0x40a
>  [<f8464110>] ? install_probe+0x31/0x13d [probe_bench]
>  [<c040304f>] ? do_one_initcall+0x4a/0x11a
>  [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
>  [<c044b8fe>] ? up_read+0x16/0x2c
>  [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
>  [<c045e5e4>] ? sys_init_module+0x89/0x18c
>  [<c0407b44>] ? sysenter_do_call+0x12/0x38
> Code: 00 00 6a 00 ff 15 04 83 85 c0 58 5a e8 86 82 d3 ff 90 b8 01 00 00 00 0f a2 31 d2 eb 13 8b 4d e8 8a 04 11 8b 4d e4
> 3a 04 11 74 04 <0f> 0b eb fe 42 3b 55 e0 72 e8 f7 45 ec 00 02 00 00 75 10 8b 45
> EIP: [<c06e97ef>] text_poke+0x168/0x1a4 SS:ESP 0068:f45d8ed8
> ---[ end trace 12f1ca8c7f7964a0 ]---
> Kernel panic - not syncing: Fatal exception
> Pid: 5411, comm: insmod Tainted: G      D    2.6.29-tip #8
> Call Trace:
>  [<c06e6081>] ? printk+0xf/0x16
>  [<c06e5fc8>] panic+0x39/0xe3
>  [<c06e9526>] oops_end+0x96/0xa5
>  [<c040a486>] die+0x54/0x5a
>  [<c06e8e43>] do_trap+0x89/0xa2
>  [<c0408bbe>] ? do_invalid_op+0x0/0x7b
>  [<c0408c2f>] do_invalid_op+0x71/0x7b
>  [<c06e97ef>] ? text_poke+0x168/0x1a4
>  [<c045567d>] ? mark_lock+0x1e/0x1e0
>  [<c042672c>] ? set_pte_vaddr+0xac/0xcf
>  [<c054d6fc>] ? trace_hardirqs_off_thunk+0xc/0x10
>  [<c06e8bfa>] error_code+0x72/0x78
>  [<c0408bbe>] ? do_invalid_op+0x0/0x7b
>  [<c06e97ef>] ? text_poke+0x168/0x1a4
>  [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
>  [<c06ea30d>] arch_arm_kprobe+0x1a/0x1c
>  [<c06eb4bb>] register_kprobe+0x3b8/0x40a
>  [<f8464110>] install_probe+0x31/0x13d [probe_bench]
>  [<c040304f>] do_one_initcall+0x4a/0x11a
>  [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
>  [<c044b8fe>] ? up_read+0x16/0x2c
>  [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
>  [<c045e5e4>] sys_init_module+0x89/0x18c
>  [<c0407b44>] sysenter_do_call+0x12/0x38
> Rebooting in 5 seconds..
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][-tip] kprobes on module functions hits kernel BUG in  text_poke on x86-32
  2009-04-04 15:42 ` Mathieu Desnoyers
@ 2009-04-04 18:28   ` Masami Hiramatsu
  2009-04-04 19:04     ` Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-04 18:28 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Hi,
>>
>> I found text_poke() problem on x86-32 with the latest-tip tree.
>> When I put a kprobe on a module function, text_poke() hit a BUG.
>>
>> This bug can be reproduced on x86-32, but not on x86-64.
>> And inserting kprobes on a kernel-core function is OK.
>>
>> Thank you,
>>
> 
> Hi Masami,
> 
> OK, so text_poke safety net saves the day :)
> 
> Basically, what we have here is the BUG_ON I have put :
> 
>        for (i = 0; i < len; i++)
>                 BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
> 
> Which checks that the modification is really preceivable in the kernel
> code, triggers this bug. Only for modules you say.
> 
> It might not be this, but.. let's try something simple (this could be
> completely unrelated, but won't take long to test): can you try to add a
> vmalloc_sync_all() at the beginning of text_poke ? This would make sure
> that lazily-populated TLB entries, which include module code and data on
> x86, will be populated. I wonder if we hit this problem because
> vmalloc_to_page would be returning a mapping to a yet unpopulated TLB
> entry, if it is ever possible.

Hmm, from the result of my test, vmalloc_sync_all() didn't change anything...

> If that's not this, then I guess we have some problem with setting a
> fixmap to a page returned by vmalloc on x86 32.

I investigate it a bit deeper. I compared fixmap's page* and original
which vmalloc_to_page returns(because vmalloc_to_page just decode
current pagetable).

I added a printk right after set_fixmap, which shows below message.

fixmap:<fixmap's vaddr>:<page* by vmalloc_to_page>, \
orig:<original vaddr>:<page* passed to fixmap>

When I probe a module address, I got this;
fixmap:ffc58000:c1db1ae4, orig:f84a1000:c59b1ae4

on the other hand, when probing a kernel addrees, I got this;
fixmap:ffc58000:c129e01c, orig:c048946a:c129e01c

I guess this means that set_fixmap didn't set a correct page or
page_to_phys() returned incorrect phys address.

Thank you,

> 
> Mathieu
> 
> 
>> ------------[ cut here ]------------
>> kernel BUG at /home/mhiramat/ksrc/linux-2.6-tip/arch/x86/kernel/alternative.c:543!
>> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>> last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/net/eth0/broadcast
>> Modules linked in: probe_bench(+) netconsole configfs sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
>> nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 dm_mirror
>> dm_region_hash dm_log dm_multipath uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy
>> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore dcdbas
>> i2c_nforce2 pcspkr tg3 snd_page_alloc rtc_cmos rtc_core i2c_core libphy rtc_lib ata_generic pata_acpi sata_nv [last
>> unloaded: scsi_wait_scan]
>>
>> Pid: 5411, comm: insmod Not tainted (2.6.29-tip #8) OptiPlex 740
>> EIP: 0060:[<c06e97ef>] EFLAGS: 00210893 CPU: 0
>> EIP is at text_poke+0x168/0x1a4
>> EAX: 00040f55 EBX: 00020800 ECX: f45d8f03 EDX: 00000000
>> ESI: f45d8f04 EDI: ffc58001 EBP: f45d8ef8 ESP: f45d8ed8
>>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
>> Process insmod (pid: 5411, ti=f45d8000 task=f4490000 task.ti=f45d8000)
>> Stack:
>>  00000001 f45d8f03 f8464000 00200286 00000000 00000000 00000000 f846492c
>>  f45d8f04 c06ea30d cc46492c f45d8f30 c06eb4bb 000000d8 f8464934 f84647e4
>>  f84647e4 00936fc0 f45d8f30 f84647e4 fffffffc 00936fc0 f45d8f40 f8464110
>> Call Trace:
>>  [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
>>  [<c06ea30d>] ? arch_arm_kprobe+0x1a/0x1c
>>  [<c06eb4bb>] ? register_kprobe+0x3b8/0x40a
>>  [<f8464110>] ? install_probe+0x31/0x13d [probe_bench]
>>  [<c040304f>] ? do_one_initcall+0x4a/0x11a
>>  [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
>>  [<c044b8fe>] ? up_read+0x16/0x2c
>>  [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
>>  [<c045e5e4>] ? sys_init_module+0x89/0x18c
>>  [<c0407b44>] ? sysenter_do_call+0x12/0x38
>> Code: 00 00 6a 00 ff 15 04 83 85 c0 58 5a e8 86 82 d3 ff 90 b8 01 00 00 00 0f a2 31 d2 eb 13 8b 4d e8 8a 04 11 8b 4d e4
>> 3a 04 11 74 04 <0f> 0b eb fe 42 3b 55 e0 72 e8 f7 45 ec 00 02 00 00 75 10 8b 45
>> EIP: [<c06e97ef>] text_poke+0x168/0x1a4 SS:ESP 0068:f45d8ed8
>> ---[ end trace 12f1ca8c7f7964a0 ]---
>> Kernel panic - not syncing: Fatal exception
>> Pid: 5411, comm: insmod Tainted: G      D    2.6.29-tip #8
>> Call Trace:
>>  [<c06e6081>] ? printk+0xf/0x16
>>  [<c06e5fc8>] panic+0x39/0xe3
>>  [<c06e9526>] oops_end+0x96/0xa5
>>  [<c040a486>] die+0x54/0x5a
>>  [<c06e8e43>] do_trap+0x89/0xa2
>>  [<c0408bbe>] ? do_invalid_op+0x0/0x7b
>>  [<c0408c2f>] do_invalid_op+0x71/0x7b
>>  [<c06e97ef>] ? text_poke+0x168/0x1a4
>>  [<c045567d>] ? mark_lock+0x1e/0x1e0
>>  [<c042672c>] ? set_pte_vaddr+0xac/0xcf
>>  [<c054d6fc>] ? trace_hardirqs_off_thunk+0xc/0x10
>>  [<c06e8bfa>] error_code+0x72/0x78
>>  [<c0408bbe>] ? do_invalid_op+0x0/0x7b
>>  [<c06e97ef>] ? text_poke+0x168/0x1a4
>>  [<f8464000>] ? dummy_function+0x0/0xb [probe_bench]
>>  [<c06ea30d>] arch_arm_kprobe+0x1a/0x1c
>>  [<c06eb4bb>] register_kprobe+0x3b8/0x40a
>>  [<f8464110>] install_probe+0x31/0x13d [probe_bench]
>>  [<c040304f>] do_one_initcall+0x4a/0x11a
>>  [<f84640df>] ? install_probe+0x0/0x13d [probe_bench]
>>  [<c044b8fe>] ? up_read+0x16/0x2c
>>  [<c044c14e>] ? __blocking_notifier_call_chain+0x40/0x4c
>>  [<c045e5e4>] sys_init_module+0x89/0x18c
>>  [<c0407b44>] sysenter_do_call+0x12/0x38
>> Rebooting in 5 seconds..
>>
>> -- 
>> Masami Hiramatsu
>>
>> Software Engineer
>> Hitachi Computer Products (America) Inc.
>> Software Solutions Division
>>
>> e-mail: mhiramat@redhat.com
>>
>>
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][-tip] kprobes on module functions hits kernel BUG in  text_poke on x86-32
  2009-04-04 18:28   ` Masami Hiramatsu
@ 2009-04-04 19:04     ` Masami Hiramatsu
  2009-04-05  3:46       ` Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-04 19:04 UTC (permalink / raw)
  To: Mathieu Desnoyers, Ingo Molnar
  Cc: Ananth N Mavinakayanahalli, LKML, systemtap-ml

Masami Hiramatsu wrote:
> Mathieu Desnoyers wrote:
>> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>>> Hi,
>>>
>>> I found text_poke() problem on x86-32 with the latest-tip tree.
>>> When I put a kprobe on a module function, text_poke() hit a BUG.
>>>
>>> This bug can be reproduced on x86-32, but not on x86-64.
>>> And inserting kprobes on a kernel-core function is OK.
>>>
>>> Thank you,
>>>
>> Hi Masami,
>>
>> OK, so text_poke safety net saves the day :)
>>
>> Basically, what we have here is the BUG_ON I have put :
>>
>>        for (i = 0; i < len; i++)
>>                 BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
>>
>> Which checks that the modification is really preceivable in the kernel
>> code, triggers this bug. Only for modules you say.
>>
>> It might not be this, but.. let's try something simple (this could be
>> completely unrelated, but won't take long to test): can you try to add a
>> vmalloc_sync_all() at the beginning of text_poke ? This would make sure
>> that lazily-populated TLB entries, which include module code and data on
>> x86, will be populated. I wonder if we hit this problem because
>> vmalloc_to_page would be returning a mapping to a yet unpopulated TLB
>> entry, if it is ever possible.
> 
> Hmm, from the result of my test, vmalloc_sync_all() didn't change anything...
> 
>> If that's not this, then I guess we have some problem with setting a
>> fixmap to a page returned by vmalloc on x86 32.
> 
> I investigate it a bit deeper. I compared fixmap's page* and original
> which vmalloc_to_page returns(because vmalloc_to_page just decode
> current pagetable).
> 
> I added a printk right after set_fixmap, which shows below message.
> 
> fixmap:<fixmap's vaddr>:<page* by vmalloc_to_page>, \
> orig:<original vaddr>:<page* passed to fixmap>
> 
> When I probe a module address, I got this;
> fixmap:ffc58000:c1db1ae4, orig:f84a1000:c59b1ae4
> 
> on the other hand, when probing a kernel addrees, I got this;
> fixmap:ffc58000:c129e01c, orig:c048946a:c129e01c
> 
> I guess this means that set_fixmap didn't set a correct page or
> page_to_phys() returned incorrect phys address.

Oops, I found a funny truth,

fixmap:ffc58000:c1db1670, orig:f83fb000:c59b1670
orig page c59b1670, phys 12f8a4000
fixmap page c1db1670, phys 2f8a4000

page means (struct page *) value, and phys means
its physical address.

You can see set_fixmap() cuts higher bits than 32 bit in fixmap.h

----
void native_set_fixmap(enum fixed_addresses idx,
                       unsigned long phys, pgprot_t flags);

#ifndef CONFIG_PARAVIRT
static inline void __set_fixmap(enum fixed_addresses idx,
                                unsigned long phys, pgprot_t flags)
{
        native_set_fixmap(idx, phys, flags);
}
#endif
---

in x86-64, unsigned long is 64bit, so it can handle highmem. So,
this problem never happens on x86-64.

Ingo, would you think we can expand phys to unsigned long long or
somesush in fixmap.h?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][-tip] kprobes on module functions hits kernel BUG in  text_poke on x86-32
  2009-04-04 19:04     ` Masami Hiramatsu
@ 2009-04-05  3:46       ` Masami Hiramatsu
  2009-04-05  3:49         ` Masami Hiramatsu
  2009-04-06 17:11         ` [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages Masami Hiramatsu
  0 siblings, 2 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-05  3:46 UTC (permalink / raw)
  To: Mathieu Desnoyers, Ingo Molnar
  Cc: Ananth N Mavinakayanahalli, LKML, systemtap-ml

Masami Hiramatsu wrote:
> Masami Hiramatsu wrote:
>> Mathieu Desnoyers wrote:
>>> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>>>> Hi,
>>>>
>>>> I found text_poke() problem on x86-32 with the latest-tip tree.
>>>> When I put a kprobe on a module function, text_poke() hit a BUG.
>>>>
>>>> This bug can be reproduced on x86-32, but not on x86-64.
>>>> And inserting kprobes on a kernel-core function is OK.
>>>>
>>>> Thank you,
>>>>
>>> Hi Masami,
>>>
>>> OK, so text_poke safety net saves the day :)
>>>
>>> Basically, what we have here is the BUG_ON I have put :
>>>
>>>        for (i = 0; i < len; i++)
>>>                 BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
>>>
>>> Which checks that the modification is really preceivable in the kernel
>>> code, triggers this bug. Only for modules you say.
>>>
>>> It might not be this, but.. let's try something simple (this could be
>>> completely unrelated, but won't take long to test): can you try to add a
>>> vmalloc_sync_all() at the beginning of text_poke ? This would make sure
>>> that lazily-populated TLB entries, which include module code and data on
>>> x86, will be populated. I wonder if we hit this problem because
>>> vmalloc_to_page would be returning a mapping to a yet unpopulated TLB
>>> entry, if it is ever possible.
>> Hmm, from the result of my test, vmalloc_sync_all() didn't change anything...
>>
>>> If that's not this, then I guess we have some problem with setting a
>>> fixmap to a page returned by vmalloc on x86 32.

Hmm, ok. AFAICS, fixmap is only for lowmem, and pkmap is only for highmem.

So, I think we have some options;

A) Separate text_poke into __text_poke and __text_poke_highmem. And
  use pkmap_atomic in __text_poke_highmem. This way doesn't require
  any additional change except adding KM_TEXT_POKE0/1 in km_type.

B) Add set_fixmap_page and use it in text_poke. This will require
  changes in paravirt_ops and pgtable.c. We need to ensure there is
  no side effects.

C) Change pkmap_atomic_prot to map lowmem only if the page's pgprot
  is different from user specified pgprot. And use it instead of
  fixmap. This also requires KM_TEXT_POKE0/1, however we can
  remove FIX_TEXT_POKE0/1.

etc...

I think A) is for short-term solution. I guess it will be acceptable
for next release. But for long-term, C) might be better.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][-tip] kprobes on module functions hits kernel BUG in  text_poke on x86-32
  2009-04-05  3:46       ` Masami Hiramatsu
@ 2009-04-05  3:49         ` Masami Hiramatsu
  2009-04-06 17:11         ` [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages Masami Hiramatsu
  1 sibling, 0 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-05  3:49 UTC (permalink / raw)
  To: Mathieu Desnoyers, Ingo Molnar
  Cc: Ananth N Mavinakayanahalli, LKML, systemtap-ml

Masami Hiramatsu wrote:
> Hmm, ok. AFAICS, fixmap is only for lowmem, and pkmap is only for highmem.

Oops, I mean kmap...


> So, I think we have some options;
> 
> A) Separate text_poke into __text_poke and __text_poke_highmem. And
>   use pkmap_atomic in __text_poke_highmem. This way doesn't require
>   any additional change except adding KM_TEXT_POKE0/1 in km_type.
> 
> B) Add set_fixmap_page and use it in text_poke. This will require
>   changes in paravirt_ops and pgtable.c. We need to ensure there is
>   no side effects.
> 
> C) Change pkmap_atomic_prot to map lowmem only if the page's pgprot
>   is different from user specified pgprot. And use it instead of
>   fixmap. This also requires KM_TEXT_POKE0/1, however we can
>   remove FIX_TEXT_POKE0/1.

same, s/pkmap/kmap/g


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-05  3:46       ` Masami Hiramatsu
  2009-04-05  3:49         ` Masami Hiramatsu
@ 2009-04-06 17:11         ` Masami Hiramatsu
  2009-04-06 17:32           ` Mathieu Desnoyers
  2009-04-08 12:31           ` Ingo Molnar
  1 sibling, 2 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-06 17:11 UTC (permalink / raw)
  To: Mathieu Desnoyers, Ingo Molnar
  Cc: Ananth N Mavinakayanahalli, LKML, systemtap-ml

Fix a bug in text_poke to handle highmem pages, because module
text pages are possible to be highmem pages on x86-32.
In that case, since fixmap can't handle those pages, text_poke
uses kmap_atomic.

Moreover, module_text pages will be discontinuous and it is
possible that one page is lowmem and another page is highmem,
because it is allocated by vmalloc. To handle this situation,
text_poke splits poke area into two pieces and write the areas
page by page.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Ingo Molnar <mingo@elte.hu>
---

 Hi Ingo and Mathieu,

 Masami Hiramatsu wrote:
 > A) Separate text_poke into __text_poke and __text_poke_highmem. And
 >   use pkmap_atomic in __text_poke_highmem. This way doesn't require
 >   any additional change except adding KM_TEXT_POKE0/1 in km_type.

 Here is A) patch.

 Thank you,

 arch/x86/include/asm/fixmap.h     |    3 +-
 arch/x86/include/asm/kmap_types.h |    3 ++
 arch/x86/kernel/alternative.c     |   45 ++++++++++++++++++++++++-------------
 3 files changed, 32 insertions(+), 19 deletions(-)


diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 81937a5..6c59d4a 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -111,8 +111,7 @@ enum fixed_addresses {
 #ifdef CONFIG_PARAVIRT
 	FIX_PARAVIRT_BOOTMAP,
 #endif
-	FIX_TEXT_POKE0,	/* reserve 2 pages for text_poke() */
-	FIX_TEXT_POKE1,
+	FIX_TEXT_POKE,	/* reserve 1 page for text_poke() */
 	__end_of_permanent_fixed_addresses,
 #ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
 	FIX_OHCI1394_BASE,
diff --git a/arch/x86/include/asm/kmap_types.h b/arch/x86/include/asm/kmap_types.h
index 5759c16..e48f11f 100644
--- a/arch/x86/include/asm/kmap_types.h
+++ b/arch/x86/include/asm/kmap_types.h
@@ -21,7 +21,8 @@ D(9)	KM_IRQ0,
 D(10)	KM_IRQ1,
 D(11)	KM_SOFTIRQ0,
 D(12)	KM_SOFTIRQ1,
-D(13)	KM_TYPE_NR
+D(13)	KM_TEXT_POKE,
+D(14)	KM_TYPE_NR
 };

 #undef D
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index f576587..f4487c9 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -6,6 +6,7 @@
 #include <linux/mm.h>
 #include <linux/vmalloc.h>
 #include <linux/memory.h>
+#include <linux/highmem.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
 #include <asm/pgtable.h>
@@ -514,27 +515,39 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
 {
 	unsigned long flags;
 	char *vaddr;
-	struct page *pages[2];
+	struct page *page;
 	int i;
+	unsigned long endp = ((unsigned long)addr + len) & PAGE_MASK;

-	if (!core_kernel_text((unsigned long)addr)) {
-		pages[0] = vmalloc_to_page(addr);
-		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
-	} else {
-		pages[0] = virt_to_page(addr);
-		WARN_ON(!PageReserved(pages[0]));
-		pages[1] = virt_to_page(addr + PAGE_SIZE);
+	/*
+	 * If the written range covers 2 pages, we'll split it, because
+	 * vmalloc pages are not always continuous -- e.g. 1st page is
+	 * lowmem and 2nd page is highmem.
+	 */
+	if (((unsigned long)addr & PAGE_MASK) != endp) {
+		text_poke(addr, opcode, endp - (unsigned long)addr);
+		addr =  (void *)endp;
+		opcode = (char *)opcode + (endp - (unsigned long)addr);
+		len -= endp - (unsigned long)addr;
 	}
-	BUG_ON(!pages[0]);
+
+	if (!core_kernel_text((unsigned long)addr))
+		page = vmalloc_to_page(addr);
+	else
+		page = virt_to_page(addr);
+	BUG_ON(!page);
 	local_irq_save(flags);
-	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
-	if (pages[1])
-		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
-	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
+	if (PageHighMem(page))
+		vaddr = kmap_atomic(page, KM_TEXT_POKE);
+	else {
+		set_fixmap(FIX_TEXT_POKE, page_to_phys(page));
+		vaddr = (char *)fix_to_virt(FIX_TEXT_POKE);
+	}
 	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
-	clear_fixmap(FIX_TEXT_POKE0);
-	if (pages[1])
-		clear_fixmap(FIX_TEXT_POKE1);
+	if (PageHighMem(page))
+		kunmap_atomic(vaddr, KM_TEXT_POKE);
+	else
+		clear_fixmap(FIX_TEXT_POKE);
 	local_flush_tlb();
 	sync_core();
 	/* Could also do a CLFLUSH here to speed up CPU recovery; but

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-06 17:11         ` [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages Masami Hiramatsu
@ 2009-04-06 17:32           ` Mathieu Desnoyers
  2009-04-06 17:44             ` Masami Hiramatsu
  2009-04-08 12:31           ` Ingo Molnar
  1 sibling, 1 reply; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-04-06 17:32 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Fix a bug in text_poke to handle highmem pages, because module
> text pages are possible to be highmem pages on x86-32.
> In that case, since fixmap can't handle those pages, text_poke
> uses kmap_atomic.
> 

Hrm, can you remind me what would be the downside of using kmap_atomic
in every scenarios (highmem and non-highmem) then ?

I would try to avoid "special cases" as much as possible, because they
just make problems harder to reproduce.

The idea would be to either add fixmap highmem support, or to simply use
kmap_atomic() for all cases until we add fixmap highmem support.

Mathieu

> Moreover, module_text pages will be discontinuous and it is
> possible that one page is lowmem and another page is highmem,
> because it is allocated by vmalloc. To handle this situation,
> text_poke splits poke area into two pieces and write the areas
> page by page.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
> 
>  Hi Ingo and Mathieu,
> 
>  Masami Hiramatsu wrote:
>  > A) Separate text_poke into __text_poke and __text_poke_highmem. And
>  >   use pkmap_atomic in __text_poke_highmem. This way doesn't require
>  >   any additional change except adding KM_TEXT_POKE0/1 in km_type.
> 
>  Here is A) patch.
> 
>  Thank you,
> 
>  arch/x86/include/asm/fixmap.h     |    3 +-
>  arch/x86/include/asm/kmap_types.h |    3 ++
>  arch/x86/kernel/alternative.c     |   45 ++++++++++++++++++++++++-------------
>  3 files changed, 32 insertions(+), 19 deletions(-)
> 
> 
> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index 81937a5..6c59d4a 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -111,8 +111,7 @@ enum fixed_addresses {
>  #ifdef CONFIG_PARAVIRT
>  	FIX_PARAVIRT_BOOTMAP,
>  #endif
> -	FIX_TEXT_POKE0,	/* reserve 2 pages for text_poke() */
> -	FIX_TEXT_POKE1,
> +	FIX_TEXT_POKE,	/* reserve 1 page for text_poke() */
>  	__end_of_permanent_fixed_addresses,
>  #ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
>  	FIX_OHCI1394_BASE,
> diff --git a/arch/x86/include/asm/kmap_types.h b/arch/x86/include/asm/kmap_types.h
> index 5759c16..e48f11f 100644
> --- a/arch/x86/include/asm/kmap_types.h
> +++ b/arch/x86/include/asm/kmap_types.h
> @@ -21,7 +21,8 @@ D(9)	KM_IRQ0,
>  D(10)	KM_IRQ1,
>  D(11)	KM_SOFTIRQ0,
>  D(12)	KM_SOFTIRQ1,
> -D(13)	KM_TYPE_NR
> +D(13)	KM_TEXT_POKE,
> +D(14)	KM_TYPE_NR
>  };
> 
>  #undef D
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index f576587..f4487c9 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -6,6 +6,7 @@
>  #include <linux/mm.h>
>  #include <linux/vmalloc.h>
>  #include <linux/memory.h>
> +#include <linux/highmem.h>
>  #include <asm/alternative.h>
>  #include <asm/sections.h>
>  #include <asm/pgtable.h>
> @@ -514,27 +515,39 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>  {
>  	unsigned long flags;
>  	char *vaddr;
> -	struct page *pages[2];
> +	struct page *page;
>  	int i;
> +	unsigned long endp = ((unsigned long)addr + len) & PAGE_MASK;
> 
> -	if (!core_kernel_text((unsigned long)addr)) {
> -		pages[0] = vmalloc_to_page(addr);
> -		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
> -	} else {
> -		pages[0] = virt_to_page(addr);
> -		WARN_ON(!PageReserved(pages[0]));
> -		pages[1] = virt_to_page(addr + PAGE_SIZE);
> +	/*
> +	 * If the written range covers 2 pages, we'll split it, because
> +	 * vmalloc pages are not always continuous -- e.g. 1st page is
> +	 * lowmem and 2nd page is highmem.
> +	 */
> +	if (((unsigned long)addr & PAGE_MASK) != endp) {
> +		text_poke(addr, opcode, endp - (unsigned long)addr);
> +		addr =  (void *)endp;
> +		opcode = (char *)opcode + (endp - (unsigned long)addr);
> +		len -= endp - (unsigned long)addr;
>  	}
> -	BUG_ON(!pages[0]);
> +
> +	if (!core_kernel_text((unsigned long)addr))
> +		page = vmalloc_to_page(addr);
> +	else
> +		page = virt_to_page(addr);
> +	BUG_ON(!page);
>  	local_irq_save(flags);
> -	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> -	if (pages[1])
> -		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> +	if (PageHighMem(page))
> +		vaddr = kmap_atomic(page, KM_TEXT_POKE);
> +	else {
> +		set_fixmap(FIX_TEXT_POKE, page_to_phys(page));
> +		vaddr = (char *)fix_to_virt(FIX_TEXT_POKE);
> +	}
>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> -	clear_fixmap(FIX_TEXT_POKE0);
> -	if (pages[1])
> -		clear_fixmap(FIX_TEXT_POKE1);
> +	if (PageHighMem(page))
> +		kunmap_atomic(vaddr, KM_TEXT_POKE);
> +	else
> +		clear_fixmap(FIX_TEXT_POKE);
>  	local_flush_tlb();
>  	sync_core();
>  	/* Could also do a CLFLUSH here to speed up CPU recovery; but
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-06 17:32           ` Mathieu Desnoyers
@ 2009-04-06 17:44             ` Masami Hiramatsu
  2009-04-06 17:58               ` Mathieu Desnoyers
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-06 17:44 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Fix a bug in text_poke to handle highmem pages, because module
>> text pages are possible to be highmem pages on x86-32.
>> In that case, since fixmap can't handle those pages, text_poke
>> uses kmap_atomic.
>>
> 
> Hrm, can you remind me what would be the downside of using kmap_atomic
> in every scenarios (highmem and non-highmem) then ?

kmap_atomic can handle only highmem pages. If you passes lowmem pages,
it returns just original vaddr of it. (because kmap is only for
highmem support)

> 
> I would try to avoid "special cases" as much as possible, because they
> just make problems harder to reproduce.

Actually, this bug is a special case because it happens only on PAE kernel.

> 
> The idea would be to either add fixmap highmem support, or to simply use
> kmap_atomic() for all cases until we add fixmap highmem support.
> 


Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-06 17:44             ` Masami Hiramatsu
@ 2009-04-06 17:58               ` Mathieu Desnoyers
  2009-04-06 20:23                 ` Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-04-06 17:58 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> >> Fix a bug in text_poke to handle highmem pages, because module
> >> text pages are possible to be highmem pages on x86-32.
> >> In that case, since fixmap can't handle those pages, text_poke
> >> uses kmap_atomic.
> >>
> > 
> > Hrm, can you remind me what would be the downside of using kmap_atomic
> > in every scenarios (highmem and non-highmem) then ?
> 
> kmap_atomic can handle only highmem pages. If you passes lowmem pages,
> it returns just original vaddr of it. (because kmap is only for
> highmem support)
> 

OK, and if we are doing a second kmap_atomic() of a module text page
which is already mapped, does the second kmap_atomic return the vaddr of
the page originally mapped or is it creating a second mapping ?

Because if we ever decide to enforce read-only page mapping for module
text pages, touching highmem pages too, we will run into real trouble if
those happen to be the same page.

Mathieu

> > 
> > I would try to avoid "special cases" as much as possible, because they
> > just make problems harder to reproduce.
> 
> Actually, this bug is a special case because it happens only on PAE kernel.
> 
> > 
> > The idea would be to either add fixmap highmem support, or to simply use
> > kmap_atomic() for all cases until we add fixmap highmem support.
> > 
> 
> 
> Thank you,
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-06 17:58               ` Mathieu Desnoyers
@ 2009-04-06 20:23                 ` Masami Hiramatsu
  0 siblings, 0 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-06 20:23 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Ananth N Mavinakayanahalli, LKML, systemtap-ml



Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Mathieu Desnoyers wrote:
>>> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>>>> Fix a bug in text_poke to handle highmem pages, because module
>>>> text pages are possible to be highmem pages on x86-32.
>>>> In that case, since fixmap can't handle those pages, text_poke
>>>> uses kmap_atomic.
>>>>
>>> Hrm, can you remind me what would be the downside of using kmap_atomic
>>> in every scenarios (highmem and non-highmem) then ?
>> kmap_atomic can handle only highmem pages. If you passes lowmem pages,
>> it returns just original vaddr of it. (because kmap is only for
>> highmem support)
>>
> 
> OK, and if we are doing a second kmap_atomic() of a module text page
> which is already mapped, does the second kmap_atomic return the vaddr of
> the page originally mapped or is it creating a second mapping ?

Good point, kmap_atomic creates a second mapping if the page is highmem.

in arch/x86/mm/highmem_32.c:

void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
{
	...
        if (!PageHighMem(page))
                return page_address(page);
	...
}

> Because if we ever decide to enforce read-only page mapping for module
> text pages, touching highmem pages too, we will run into real trouble if
> those happen to be the same page.

So, as long as the page is in highmem, we can create a second mapping by
using kmap_atomic.

Thank you,

> 
> Mathieu
> 
>>> I would try to avoid "special cases" as much as possible, because they
>>> just make problems harder to reproduce.
>> Actually, this bug is a special case because it happens only on PAE kernel.
>>
>>> The idea would be to either add fixmap highmem support, or to simply use
>>> kmap_atomic() for all cases until we add fixmap highmem support.
>>>
>>
>> Thank you,
>>
>> -- 
>> Masami Hiramatsu
>>
>> Software Engineer
>> Hitachi Computer Products (America) Inc.
>> Software Solutions Division
>>
>> e-mail: mhiramat@redhat.com
>>
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-06 17:11         ` [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages Masami Hiramatsu
  2009-04-06 17:32           ` Mathieu Desnoyers
@ 2009-04-08 12:31           ` Ingo Molnar
  2009-04-08 14:57             ` Masami Hiramatsu
  1 sibling, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2009-04-08 12:31 UTC (permalink / raw)
  To: Masami Hiramatsu, Andrew Morton
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, LKML, systemtap-ml


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> @@ -514,27 +515,39 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>  {
>  	unsigned long flags;
>  	char *vaddr;
> -	struct page *pages[2];
> +	struct page *page;
>  	int i;
> +	unsigned long endp = ((unsigned long)addr + len) & PAGE_MASK;
> 
> -	if (!core_kernel_text((unsigned long)addr)) {
> -		pages[0] = vmalloc_to_page(addr);
> -		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
> -	} else {
> -		pages[0] = virt_to_page(addr);
> -		WARN_ON(!PageReserved(pages[0]));
> -		pages[1] = virt_to_page(addr + PAGE_SIZE);
> +	/*
> +	 * If the written range covers 2 pages, we'll split it, because
> +	 * vmalloc pages are not always continuous -- e.g. 1st page is
> +	 * lowmem and 2nd page is highmem.
> +	 */
> +	if (((unsigned long)addr & PAGE_MASK) != endp) {
> +		text_poke(addr, opcode, endp - (unsigned long)addr);
> +		addr =  (void *)endp;
> +		opcode = (char *)opcode + (endp - (unsigned long)addr);
> +		len -= endp - (unsigned long)addr;
>  	}
> -	BUG_ON(!pages[0]);
> +
> +	if (!core_kernel_text((unsigned long)addr))
> +		page = vmalloc_to_page(addr);
> +	else
> +		page = virt_to_page(addr);

hm, the bug is upstream now. And your fix turns a 
supposed-to-be-simpler kmap based patching thing back into something 
fragile looking again. We might be better off with a revert - or we 
do a real clean patch.

Firstly, that core_kernel_text() distinction above looks 
artificially open-coded - dont we have a proper, generic 
"look-up-the-page" variant in the MM somewhere?

> +	BUG_ON(!page);
>  	local_irq_save(flags);
> -	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> -	if (pages[1])
> -		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> +	if (PageHighMem(page))
> +		vaddr = kmap_atomic(page, KM_TEXT_POKE);
> +	else {
> +		set_fixmap(FIX_TEXT_POKE, page_to_phys(page));
> +		vaddr = (char *)fix_to_virt(FIX_TEXT_POKE);
> +	}

that too looks artificially complex. Why cannot we kmap lowmem pages 
too? If the API isnt available on !HIGHMEM kernels .. then the 
solution is to make it available, not to branch our way around it.

>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> -	clear_fixmap(FIX_TEXT_POKE0);
> -	if (pages[1])
> -		clear_fixmap(FIX_TEXT_POKE1);
> +	if (PageHighMem(page))
> +		kunmap_atomic(vaddr, KM_TEXT_POKE);
> +	else
> +		clear_fixmap(FIX_TEXT_POKE);

ditto.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-08 12:31           ` Ingo Molnar
@ 2009-04-08 14:57             ` Masami Hiramatsu
  2009-04-08 14:59               ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-08 14:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Mathieu Desnoyers, Ananth N Mavinakayanahalli,
	LKML, systemtap-ml

Ingo Molnar wrote:
> * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> 
>> @@ -514,27 +515,39 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>>  {
>>  	unsigned long flags;
>>  	char *vaddr;
>> -	struct page *pages[2];
>> +	struct page *page;
>>  	int i;
>> +	unsigned long endp = ((unsigned long)addr + len) & PAGE_MASK;
>>
>> -	if (!core_kernel_text((unsigned long)addr)) {
>> -		pages[0] = vmalloc_to_page(addr);
>> -		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
>> -	} else {
>> -		pages[0] = virt_to_page(addr);
>> -		WARN_ON(!PageReserved(pages[0]));
>> -		pages[1] = virt_to_page(addr + PAGE_SIZE);
>> +	/*
>> +	 * If the written range covers 2 pages, we'll split it, because
>> +	 * vmalloc pages are not always continuous -- e.g. 1st page is
>> +	 * lowmem and 2nd page is highmem.
>> +	 */
>> +	if (((unsigned long)addr & PAGE_MASK) != endp) {
>> +		text_poke(addr, opcode, endp - (unsigned long)addr);
>> +		addr =  (void *)endp;
>> +		opcode = (char *)opcode + (endp - (unsigned long)addr);
>> +		len -= endp - (unsigned long)addr;
>>  	}
>> -	BUG_ON(!pages[0]);
>> +
>> +	if (!core_kernel_text((unsigned long)addr))
>> +		page = vmalloc_to_page(addr);
>> +	else
>> +		page = virt_to_page(addr);
> 
> hm, the bug is upstream now. And your fix turns a 
> supposed-to-be-simpler kmap based patching thing back into something 
> fragile looking again. We might be better off with a revert - or we 
> do a real clean patch.
> 
> Firstly, that core_kernel_text() distinction above looks 
> artificially open-coded - dont we have a proper, generic 
> "look-up-the-page" variant in the MM somewhere?

Actually, vmalloc_to_page() is generic one. It decodes
the kernel page table directly to find struct page *.
virt_to_page() is just a short-cut api.

> 
>> +	BUG_ON(!page);
>>  	local_irq_save(flags);
>> -	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
>> -	if (pages[1])
>> -		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
>> -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
>> +	if (PageHighMem(page))
>> +		vaddr = kmap_atomic(page, KM_TEXT_POKE);
>> +	else {
>> +		set_fixmap(FIX_TEXT_POKE, page_to_phys(page));
>> +		vaddr = (char *)fix_to_virt(FIX_TEXT_POKE);
>> +	}
> 
> that too looks artificially complex. Why cannot we kmap lowmem pages 
> too? If the API isnt available on !HIGHMEM kernels .. then the 
> solution is to make it available, not to branch our way around it.

Hmm, why don't we enhance fixmap to handle highmem pages?
(e.g. adding set_fixmap_page())
Since kmap is only for highmem kernels, I think changing it will effects
more users...

Thank you,

> 
>>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>> -	clear_fixmap(FIX_TEXT_POKE0);
>> -	if (pages[1])
>> -		clear_fixmap(FIX_TEXT_POKE1);
>> +	if (PageHighMem(page))
>> +		kunmap_atomic(vaddr, KM_TEXT_POKE);
>> +	else
>> +		clear_fixmap(FIX_TEXT_POKE);
> 
> ditto.
> 
> Thanks,
> 
> 	Ingo

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages
  2009-04-08 14:57             ` Masami Hiramatsu
@ 2009-04-08 14:59               ` Ingo Molnar
  2009-04-09 17:55                 ` [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2009-04-08 14:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrew Morton, Mathieu Desnoyers, Ananth N Mavinakayanahalli,
	LKML, systemtap-ml


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Ingo Molnar wrote:
> > * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> > 
> >> @@ -514,27 +515,39 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> >>  {
> >>  	unsigned long flags;
> >>  	char *vaddr;
> >> -	struct page *pages[2];
> >> +	struct page *page;
> >>  	int i;
> >> +	unsigned long endp = ((unsigned long)addr + len) & PAGE_MASK;
> >>
> >> -	if (!core_kernel_text((unsigned long)addr)) {
> >> -		pages[0] = vmalloc_to_page(addr);
> >> -		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
> >> -	} else {
> >> -		pages[0] = virt_to_page(addr);
> >> -		WARN_ON(!PageReserved(pages[0]));
> >> -		pages[1] = virt_to_page(addr + PAGE_SIZE);
> >> +	/*
> >> +	 * If the written range covers 2 pages, we'll split it, because
> >> +	 * vmalloc pages are not always continuous -- e.g. 1st page is
> >> +	 * lowmem and 2nd page is highmem.
> >> +	 */
> >> +	if (((unsigned long)addr & PAGE_MASK) != endp) {
> >> +		text_poke(addr, opcode, endp - (unsigned long)addr);
> >> +		addr =  (void *)endp;
> >> +		opcode = (char *)opcode + (endp - (unsigned long)addr);
> >> +		len -= endp - (unsigned long)addr;
> >>  	}
> >> -	BUG_ON(!pages[0]);
> >> +
> >> +	if (!core_kernel_text((unsigned long)addr))
> >> +		page = vmalloc_to_page(addr);
> >> +	else
> >> +		page = virt_to_page(addr);
> > 
> > hm, the bug is upstream now. And your fix turns a 
> > supposed-to-be-simpler kmap based patching thing back into something 
> > fragile looking again. We might be better off with a revert - or we 
> > do a real clean patch.
> > 
> > Firstly, that core_kernel_text() distinction above looks 
> > artificially open-coded - dont we have a proper, generic 
> > "look-up-the-page" variant in the MM somewhere?
> 
> Actually, vmalloc_to_page() is generic one. It decodes
> the kernel page table directly to find struct page *.
> virt_to_page() is just a short-cut api.
> 
> > 
> >> +	BUG_ON(!page);
> >>  	local_irq_save(flags);
> >> -	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> >> -	if (pages[1])
> >> -		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> >> -	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> >> +	if (PageHighMem(page))
> >> +		vaddr = kmap_atomic(page, KM_TEXT_POKE);
> >> +	else {
> >> +		set_fixmap(FIX_TEXT_POKE, page_to_phys(page));
> >> +		vaddr = (char *)fix_to_virt(FIX_TEXT_POKE);
> >> +	}
> > 
> > that too looks artificially complex. Why cannot we kmap lowmem pages 
> > too? If the API isnt available on !HIGHMEM kernels .. then the 
> > solution is to make it available, not to branch our way around it.
> 
> Hmm, why don't we enhance fixmap to handle highmem pages?
> (e.g. adding set_fixmap_page())
> Since kmap is only for highmem kernels, I think changing it will effects
> more users...
> 
> Thank you,

Sure, whichever way looks more logical to you - my only condition is 
that it should result in a clean patch! :-)

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t
  2009-04-08 14:59               ` Ingo Molnar
@ 2009-04-09 17:55                 ` Masami Hiramatsu
  2009-04-09 18:46                   ` Mathieu Desnoyers
                                     ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-09 17:55 UTC (permalink / raw)
  To: Ingo Molnar, Linus Torvalds
  Cc: Andrew Morton, Mathieu Desnoyers, Ananth N Mavinakayanahalli,
	LKML, systemtap-ml, Gary Hade

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---

 arch/x86/include/asm/fixmap.h   |    4 ++--
 arch/x86/include/asm/paravirt.h |    4 ++--
 arch/x86/mm/pgtable.c           |    3 ++-
 arch/x86/xen/mmu.c              |    2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)


diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 81937a5..2d81af3 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -151,11 +151,11 @@ extern pte_t *pkmap_page_table;

 void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
 void native_set_fixmap(enum fixed_addresses idx,
-		       unsigned long phys, pgprot_t flags);
+		       phys_addr_t phys, pgprot_t flags);

 #ifndef CONFIG_PARAVIRT
 static inline void __set_fixmap(enum fixed_addresses idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	native_set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 7727aa8..378e369 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -347,7 +347,7 @@ struct pv_mmu_ops {
 	/* Sometimes the physical address is a pfn, and sometimes its
 	   an mfn.  We can tell which is which from the index. */
 	void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx,
-			   unsigned long phys, pgprot_t flags);
+			   phys_addr_t phys, pgprot_t flags);
 };

 struct raw_spinlock;
@@ -1432,7 +1432,7 @@ static inline void arch_leave_lazy_mmu_mode(void)
 void arch_flush_lazy_mmu_mode(void);

 static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	pv_mmu_ops.set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7a4d6ee..8e43bdd 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -347,7 +347,8 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
 	fixmaps_set++;
 }

-void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
+void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
+		       pgprot_t flags)
 {
 	__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
 }
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index db3802f..2a81838 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1750,7 +1750,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 }
 #endif	/* CONFIG_X86_64 */

-static void xen_set_fixmap(unsigned idx, unsigned long phys, pgprot_t prot)
+static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
 {
 	pte_t pte;


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t
  2009-04-09 17:55                 ` [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t Masami Hiramatsu
@ 2009-04-09 18:46                   ` Mathieu Desnoyers
  2009-04-09 21:52                     ` Masami Hiramatsu
  2009-04-10 14:06                   ` [tip:x86/urgent] " Masami Hiramatsu
  2009-04-10 18:30                   ` Masami Hiramatsu
  2 siblings, 1 reply; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-04-09 18:46 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton,
	Ananth N Mavinakayanahalli, LKML, systemtap-ml, Gary Hade

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Use phys_addr_t for receiving a physical address argument
> instead of unsigned long. This allows fixmap to handle
> pages higher than 4GB on x86-32.
> 

You might also want to update arch/x86/mm/ioremap.c:early_set_fixmap()
and __early_set_fixmap().

Otherwise it looks good.

Thanks,

Mathieu

> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> ---
> 
>  arch/x86/include/asm/fixmap.h   |    4 ++--
>  arch/x86/include/asm/paravirt.h |    4 ++--
>  arch/x86/mm/pgtable.c           |    3 ++-
>  arch/x86/xen/mmu.c              |    2 +-
>  4 files changed, 7 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index 81937a5..2d81af3 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -151,11 +151,11 @@ extern pte_t *pkmap_page_table;
> 
>  void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
>  void native_set_fixmap(enum fixed_addresses idx,
> -		       unsigned long phys, pgprot_t flags);
> +		       phys_addr_t phys, pgprot_t flags);
> 
>  #ifndef CONFIG_PARAVIRT
>  static inline void __set_fixmap(enum fixed_addresses idx,
> -				unsigned long phys, pgprot_t flags)
> +				phys_addr_t phys, pgprot_t flags)
>  {
>  	native_set_fixmap(idx, phys, flags);
>  }
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 7727aa8..378e369 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -347,7 +347,7 @@ struct pv_mmu_ops {
>  	/* Sometimes the physical address is a pfn, and sometimes its
>  	   an mfn.  We can tell which is which from the index. */
>  	void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx,
> -			   unsigned long phys, pgprot_t flags);
> +			   phys_addr_t phys, pgprot_t flags);
>  };
> 
>  struct raw_spinlock;
> @@ -1432,7 +1432,7 @@ static inline void arch_leave_lazy_mmu_mode(void)
>  void arch_flush_lazy_mmu_mode(void);
> 
>  static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
> -				unsigned long phys, pgprot_t flags)
> +				phys_addr_t phys, pgprot_t flags)
>  {
>  	pv_mmu_ops.set_fixmap(idx, phys, flags);
>  }
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 7a4d6ee..8e43bdd 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -347,7 +347,8 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
>  	fixmaps_set++;
>  }
> 
> -void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
> +void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> +		       pgprot_t flags)
>  {
>  	__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
>  }
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index db3802f..2a81838 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1750,7 +1750,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
>  }
>  #endif	/* CONFIG_X86_64 */
> 
> -static void xen_set_fixmap(unsigned idx, unsigned long phys, pgprot_t prot)
> +static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
>  {
>  	pte_t pte;
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t
  2009-04-09 18:46                   ` Mathieu Desnoyers
@ 2009-04-09 21:52                     ` Masami Hiramatsu
  0 siblings, 0 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-09 21:52 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton,
	Ananth N Mavinakayanahalli, LKML, systemtap-ml, Gary Hade

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Use phys_addr_t for receiving a physical address argument
>> instead of unsigned long. This allows fixmap to handle
>> pages higher than 4GB on x86-32.
>>
> 
> You might also want to update arch/x86/mm/ioremap.c:early_set_fixmap()
> and __early_set_fixmap().

Hmm, it is easy to change unsigned long to phys_addr_t in
early_set_fixmap(), however, ioremap also uses unsigned long for
physical addresses. Do I need to update ioremap functions too?

I mean, is there any chance that over 4GB physical address will
be passed to ioremap? If not, I just change early_set_fixmap()
and __early_set_fixmap().

Thanks,

> 
> Otherwise it looks good.
> 
> Thanks,
> 
> Mathieu
> 
>> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
>> Cc: Ingo Molnar <mingo@elte.hu>
>> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
>> ---
>>
>>  arch/x86/include/asm/fixmap.h   |    4 ++--
>>  arch/x86/include/asm/paravirt.h |    4 ++--
>>  arch/x86/mm/pgtable.c           |    3 ++-
>>  arch/x86/xen/mmu.c              |    2 +-
>>  4 files changed, 7 insertions(+), 6 deletions(-)
>>
>>
>> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
>> index 81937a5..2d81af3 100644
>> --- a/arch/x86/include/asm/fixmap.h
>> +++ b/arch/x86/include/asm/fixmap.h
>> @@ -151,11 +151,11 @@ extern pte_t *pkmap_page_table;
>>
>>  void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
>>  void native_set_fixmap(enum fixed_addresses idx,
>> -		       unsigned long phys, pgprot_t flags);
>> +		       phys_addr_t phys, pgprot_t flags);
>>
>>  #ifndef CONFIG_PARAVIRT
>>  static inline void __set_fixmap(enum fixed_addresses idx,
>> -				unsigned long phys, pgprot_t flags)
>> +				phys_addr_t phys, pgprot_t flags)
>>  {
>>  	native_set_fixmap(idx, phys, flags);
>>  }
>> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
>> index 7727aa8..378e369 100644
>> --- a/arch/x86/include/asm/paravirt.h
>> +++ b/arch/x86/include/asm/paravirt.h
>> @@ -347,7 +347,7 @@ struct pv_mmu_ops {
>>  	/* Sometimes the physical address is a pfn, and sometimes its
>>  	   an mfn.  We can tell which is which from the index. */
>>  	void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx,
>> -			   unsigned long phys, pgprot_t flags);
>> +			   phys_addr_t phys, pgprot_t flags);
>>  };
>>
>>  struct raw_spinlock;
>> @@ -1432,7 +1432,7 @@ static inline void arch_leave_lazy_mmu_mode(void)
>>  void arch_flush_lazy_mmu_mode(void);
>>
>>  static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
>> -				unsigned long phys, pgprot_t flags)
>> +				phys_addr_t phys, pgprot_t flags)
>>  {
>>  	pv_mmu_ops.set_fixmap(idx, phys, flags);
>>  }
>> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
>> index 7a4d6ee..8e43bdd 100644
>> --- a/arch/x86/mm/pgtable.c
>> +++ b/arch/x86/mm/pgtable.c
>> @@ -347,7 +347,8 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
>>  	fixmaps_set++;
>>  }
>>
>> -void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
>> +void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
>> +		       pgprot_t flags)
>>  {
>>  	__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
>>  }
>> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
>> index db3802f..2a81838 100644
>> --- a/arch/x86/xen/mmu.c
>> +++ b/arch/x86/xen/mmu.c
>> @@ -1750,7 +1750,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
>>  }
>>  #endif	/* CONFIG_X86_64 */
>>
>> -static void xen_set_fixmap(unsigned idx, unsigned long phys, pgprot_t prot)
>> +static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
>>  {
>>  	pte_t pte;
>>
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip:x86/urgent] x86: fix set_fixmap to use phys_addr_t
  2009-04-09 17:55                 ` [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t Masami Hiramatsu
  2009-04-09 18:46                   ` Mathieu Desnoyers
@ 2009-04-10 14:06                   ` Masami Hiramatsu
  2009-04-10 15:20                     ` Masami Hiramatsu
  2009-04-10 18:30                   ` Masami Hiramatsu
  2 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-10 14:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mathieu.desnoyers, hpa, mingo, torvalds, ananth,
	garyhade, akpm, tglx, mhiramat, mingo, systemtap

Commit-ID:  189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
Gitweb:     http://git.kernel.org/tip/189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
Author:     Masami Hiramatsu <mhiramat@redhat.com>
AuthorDate: Thu, 9 Apr 2009 10:55:33 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 10 Apr 2009 15:52:59 +0200

x86: fix set_fixmap to use phys_addr_t

Impact: fix kprobes crash on 32-bit with RAM above 4G

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <49DE3695.6040800@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/include/asm/fixmap.h   |    4 ++--
 arch/x86/include/asm/paravirt.h |    4 ++--
 arch/x86/mm/pgtable.c           |    3 ++-
 arch/x86/xen/mmu.c              |    2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 81937a5..2d81af3 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -151,11 +151,11 @@ extern pte_t *pkmap_page_table;
 
 void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
 void native_set_fixmap(enum fixed_addresses idx,
-		       unsigned long phys, pgprot_t flags);
+		       phys_addr_t phys, pgprot_t flags);
 
 #ifndef CONFIG_PARAVIRT
 static inline void __set_fixmap(enum fixed_addresses idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	native_set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 7727aa8..378e369 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -347,7 +347,7 @@ struct pv_mmu_ops {
 	/* Sometimes the physical address is a pfn, and sometimes its
 	   an mfn.  We can tell which is which from the index. */
 	void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx,
-			   unsigned long phys, pgprot_t flags);
+			   phys_addr_t phys, pgprot_t flags);
 };
 
 struct raw_spinlock;
@@ -1432,7 +1432,7 @@ static inline void arch_leave_lazy_mmu_mode(void)
 void arch_flush_lazy_mmu_mode(void);
 
 static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	pv_mmu_ops.set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5b7c7c8..7aa03a5 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -345,7 +345,8 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
 	fixmaps_set++;
 }
 
-void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
+void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
+		       pgprot_t flags)
 {
 	__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
 }
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index db3802f..2a81838 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1750,7 +1750,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 }
 #endif	/* CONFIG_X86_64 */
 
-static void xen_set_fixmap(unsigned idx, unsigned long phys, pgprot_t prot)
+static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
 {
 	pte_t pte;
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [tip:x86/urgent] x86: fix set_fixmap to use phys_addr_t
  2009-04-10 14:06                   ` [tip:x86/urgent] " Masami Hiramatsu
@ 2009-04-10 15:20                     ` Masami Hiramatsu
  2009-04-10 16:05                       ` Mathieu Desnoyers
  0 siblings, 1 reply; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-10 15:20 UTC (permalink / raw)
  To: mingo, hpa, mathieu.desnoyers, linux-kernel, torvalds, ananth,
	garyhade, akpm, tglx, mhiramat, systemtap, mingo
  Cc: linux-tip-commits

Masami Hiramatsu wrote:
> Commit-ID:  189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
> Gitweb:     http://git.kernel.org/tip/189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
> Author:     Masami Hiramatsu <mhiramat@redhat.com>
> AuthorDate: Thu, 9 Apr 2009 10:55:33 -0700
> Committer:  Ingo Molnar <mingo@elte.hu>
> CommitDate: Fri, 10 Apr 2009 15:52:59 +0200
> 
> x86: fix set_fixmap to use phys_addr_t
> 
> Impact: fix kprobes crash on 32-bit with RAM above 4G
> 
> Use phys_addr_t for receiving a physical address argument
> instead of unsigned long. This allows fixmap to handle
> pages higher than 4GB on x86-32.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>

I thought I had to update ioremap.c too...
Anyway, here is the ioremap.c update. If you think it is useful,
please merge it.

Thank you,
----

x86: fix early_ioremap and early_set_fixmap to handle over 4G pages

From: Masami Hiramatsu <mhiramat@redhat.com>

Impact: Allow early_ioremap to handle over 4G pages

Use phys_addr_t and resource_size_t for receiving a physical
address argument instead of unsigned long in early_set_fixmap
and early_ioremap. This allows early_ioremap to handle
pages higher than 4GB on x86-32 with PAE.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Ingo Molnar <mingo@elte.hu>
---

 arch/x86/include/asm/io.h |    6 ++++--
 arch/x86/mm/ioremap.c     |   23 +++++++++++++----------
 2 files changed, 17 insertions(+), 12 deletions(-)


diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index e5383e3..7373932 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -193,8 +193,10 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
  */
 extern void early_ioremap_init(void);
 extern void early_ioremap_reset(void);
-extern void __iomem *early_ioremap(unsigned long offset, unsigned long size);
-extern void __iomem *early_memremap(unsigned long offset, unsigned long size);
+extern void __iomem *early_ioremap(resource_size_t phys_addr,
+				   unsigned long size);
+extern void __iomem *early_memremap(resource_size_t phys_addr,
+				    unsigned long size);
 extern void early_iounmap(void __iomem *addr, unsigned long size);

 #define IO_SPACE_LIMIT 0xffff
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 0dfa09d..09daebf 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -547,7 +547,7 @@ void __init early_ioremap_reset(void)
 }

 static void __init __early_set_fixmap(enum fixed_addresses idx,
-				   unsigned long phys, pgprot_t flags)
+				      phys_addr_t phys, pgprot_t flags)
 {
 	unsigned long addr = __fix_to_virt(idx);
 	pte_t *pte;
@@ -566,7 +566,7 @@ static void __init __early_set_fixmap(enum fixed_addresses idx,
 }

 static inline void __init early_set_fixmap(enum fixed_addresses idx,
-					   unsigned long phys, pgprot_t prot)
+					   phys_addr_t phys, pgprot_t prot)
 {
 	if (after_paging_init)
 		__set_fixmap(idx, phys, prot);
@@ -607,9 +607,10 @@ static int __init check_early_ioremap_leak(void)
 late_initcall(check_early_ioremap_leak);

 static void __init __iomem *
-__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
+__early_ioremap(resource_size_t phys_addr, unsigned long size, pgprot_t prot)
 {
-	unsigned long offset, last_addr;
+	unsigned long offset;
+	resource_size_t last_addr;
 	unsigned int nrpages;
 	enum fixed_addresses idx0, idx;
 	int i, slot;
@@ -625,15 +626,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
 	}

 	if (slot < 0) {
-		printk(KERN_INFO "early_iomap(%08lx, %08lx) not found slot\n",
-			 phys_addr, size);
+		printk(KERN_INFO "early_iomap(%08llx, %08lx) not found slot\n",
+			 (u64)phys_addr, size);
 		WARN_ON(1);
 		return NULL;
 	}

 	if (early_ioremap_debug) {
-		printk(KERN_INFO "early_ioremap(%08lx, %08lx) [%d] => ",
-		       phys_addr, size, slot);
+		printk(KERN_INFO "early_ioremap(%08llx, %08lx) [%d] => ",
+		       (u64)phys_addr, size, slot);
 		dump_stack();
 	}

@@ -680,13 +681,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
 }

 /* Remap an IO device */
-void __init __iomem *early_ioremap(unsigned long phys_addr, unsigned long size)
+void __init __iomem *
+early_ioremap(resource_size_t phys_addr, unsigned long size)
 {
 	return __early_ioremap(phys_addr, size, PAGE_KERNEL_IO);
 }

 /* Remap memory */
-void __init __iomem *early_memremap(unsigned long phys_addr, unsigned long size)
+void __init __iomem *
+early_memremap(resource_size_t phys_addr, unsigned long size)
 {
 	return __early_ioremap(phys_addr, size, PAGE_KERNEL);
 }
-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [tip:x86/urgent] x86: fix set_fixmap to use phys_addr_t
  2009-04-10 15:20                     ` Masami Hiramatsu
@ 2009-04-10 16:05                       ` Mathieu Desnoyers
  2009-04-10 17:48                         ` Masami Hiramatsu
  0 siblings, 1 reply; 22+ messages in thread
From: Mathieu Desnoyers @ 2009-04-10 16:05 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: mingo, hpa, linux-kernel, torvalds, ananth, garyhade, akpm, tglx,
	systemtap, mingo, linux-tip-commits

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Masami Hiramatsu wrote:
> > Commit-ID:  189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
> > Gitweb:     http://git.kernel.org/tip/189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
> > Author:     Masami Hiramatsu <mhiramat@redhat.com>
> > AuthorDate: Thu, 9 Apr 2009 10:55:33 -0700
> > Committer:  Ingo Molnar <mingo@elte.hu>
> > CommitDate: Fri, 10 Apr 2009 15:52:59 +0200
> > 
> > x86: fix set_fixmap to use phys_addr_t
> > 
> > Impact: fix kprobes crash on 32-bit with RAM above 4G
> > 
> > Use phys_addr_t for receiving a physical address argument
> > instead of unsigned long. This allows fixmap to handle
> > pages higher than 4GB on x86-32.
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> > Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> 
> I thought I had to update ioremap.c too...
> Anyway, here is the ioremap.c update. If you think it is useful,
> please merge it.
> 
> Thank you,
> ----
> 
> x86: fix early_ioremap and early_set_fixmap to handle over 4G pages
> 
> From: Masami Hiramatsu <mhiramat@redhat.com>
> 
> Impact: Allow early_ioremap to handle over 4G pages
> 
> Use phys_addr_t and resource_size_t for receiving a physical
> address argument instead of unsigned long in early_set_fixmap
> and early_ioremap. This allows early_ioremap to handle
> pages higher than 4GB on x86-32 with PAE.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
> 
>  arch/x86/include/asm/io.h |    6 ++++--
>  arch/x86/mm/ioremap.c     |   23 +++++++++++++----------
>  2 files changed, 17 insertions(+), 12 deletions(-)
> 
> 
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index e5383e3..7373932 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -193,8 +193,10 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
>   */
>  extern void early_ioremap_init(void);
>  extern void early_ioremap_reset(void);
> -extern void __iomem *early_ioremap(unsigned long offset, unsigned long size);
> -extern void __iomem *early_memremap(unsigned long offset, unsigned long size);
> +extern void __iomem *early_ioremap(resource_size_t phys_addr,
> +				   unsigned long size);
> +extern void __iomem *early_memremap(resource_size_t phys_addr,
> +				    unsigned long size);


Nitpick : those second lines are not aligned with each other. The rest
has my

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>

Thanks !

Mathieu

>  extern void early_iounmap(void __iomem *addr, unsigned long size);
> 
>  #define IO_SPACE_LIMIT 0xffff
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index 0dfa09d..09daebf 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -547,7 +547,7 @@ void __init early_ioremap_reset(void)
>  }
> 
>  static void __init __early_set_fixmap(enum fixed_addresses idx,
> -				   unsigned long phys, pgprot_t flags)
> +				      phys_addr_t phys, pgprot_t flags)
>  {
>  	unsigned long addr = __fix_to_virt(idx);
>  	pte_t *pte;
> @@ -566,7 +566,7 @@ static void __init __early_set_fixmap(enum fixed_addresses idx,
>  }
> 
>  static inline void __init early_set_fixmap(enum fixed_addresses idx,
> -					   unsigned long phys, pgprot_t prot)
> +					   phys_addr_t phys, pgprot_t prot)
>  {
>  	if (after_paging_init)
>  		__set_fixmap(idx, phys, prot);
> @@ -607,9 +607,10 @@ static int __init check_early_ioremap_leak(void)
>  late_initcall(check_early_ioremap_leak);
> 
>  static void __init __iomem *
> -__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
> +__early_ioremap(resource_size_t phys_addr, unsigned long size, pgprot_t prot)
>  {
> -	unsigned long offset, last_addr;
> +	unsigned long offset;
> +	resource_size_t last_addr;
>  	unsigned int nrpages;
>  	enum fixed_addresses idx0, idx;
>  	int i, slot;
> @@ -625,15 +626,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
>  	}
> 
>  	if (slot < 0) {
> -		printk(KERN_INFO "early_iomap(%08lx, %08lx) not found slot\n",
> -			 phys_addr, size);
> +		printk(KERN_INFO "early_iomap(%08llx, %08lx) not found slot\n",
> +			 (u64)phys_addr, size);
>  		WARN_ON(1);
>  		return NULL;
>  	}
> 
>  	if (early_ioremap_debug) {
> -		printk(KERN_INFO "early_ioremap(%08lx, %08lx) [%d] => ",
> -		       phys_addr, size, slot);
> +		printk(KERN_INFO "early_ioremap(%08llx, %08lx) [%d] => ",
> +		       (u64)phys_addr, size, slot);
>  		dump_stack();
>  	}
> 
> @@ -680,13 +681,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
>  }
> 
>  /* Remap an IO device */
> -void __init __iomem *early_ioremap(unsigned long phys_addr, unsigned long size)
> +void __init __iomem *
> +early_ioremap(resource_size_t phys_addr, unsigned long size)
>  {
>  	return __early_ioremap(phys_addr, size, PAGE_KERNEL_IO);
>  }
> 
>  /* Remap memory */
> -void __init __iomem *early_memremap(unsigned long phys_addr, unsigned long size)
> +void __init __iomem *
> +early_memremap(resource_size_t phys_addr, unsigned long size)
>  {
>  	return __early_ioremap(phys_addr, size, PAGE_KERNEL);
>  }
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [tip:x86/urgent] x86: fix set_fixmap to use phys_addr_t
  2009-04-10 16:05                       ` Mathieu Desnoyers
@ 2009-04-10 17:48                         ` Masami Hiramatsu
  0 siblings, 0 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-10 17:48 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: mingo, hpa, linux-kernel, torvalds, ananth, garyhade, akpm, tglx,
	systemtap, mingo, linux-tip-commits

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Masami Hiramatsu wrote:
>>> Commit-ID:  189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
>>> Gitweb:     http://git.kernel.org/tip/189cdc2b41fce3a780a6f3aa963cc4e8114aec6b
>>> Author:     Masami Hiramatsu <mhiramat@redhat.com>
>>> AuthorDate: Thu, 9 Apr 2009 10:55:33 -0700
>>> Committer:  Ingo Molnar <mingo@elte.hu>
>>> CommitDate: Fri, 10 Apr 2009 15:52:59 +0200
>>>
>>> x86: fix set_fixmap to use phys_addr_t
>>>
>>> Impact: fix kprobes crash on 32-bit with RAM above 4G
>>>
>>> Use phys_addr_t for receiving a physical address argument
>>> instead of unsigned long. This allows fixmap to handle
>>> pages higher than 4GB on x86-32.
>>>
>>> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
>>> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
>> I thought I had to update ioremap.c too...
>> Anyway, here is the ioremap.c update. If you think it is useful,
>> please merge it.
>>
>> Thank you,
>> ----
>>
>> x86: fix early_ioremap and early_set_fixmap to handle over 4G pages
>>
>> From: Masami Hiramatsu <mhiramat@redhat.com>
>>
>> Impact: Allow early_ioremap to handle over 4G pages
>>
>> Use phys_addr_t and resource_size_t for receiving a physical
>> address argument instead of unsigned long in early_set_fixmap
>> and early_ioremap. This allows early_ioremap to handle
>> pages higher than 4GB on x86-32 with PAE.
>>
>> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
>> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
>> Cc: Ingo Molnar <mingo@elte.hu>
>> ---
>>
>>  arch/x86/include/asm/io.h |    6 ++++--
>>  arch/x86/mm/ioremap.c     |   23 +++++++++++++----------
>>  2 files changed, 17 insertions(+), 12 deletions(-)
>>
>>
>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
>> index e5383e3..7373932 100644
>> --- a/arch/x86/include/asm/io.h
>> +++ b/arch/x86/include/asm/io.h
>> @@ -193,8 +193,10 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
>>   */
>>  extern void early_ioremap_init(void);
>>  extern void early_ioremap_reset(void);
>> -extern void __iomem *early_ioremap(unsigned long offset, unsigned long size);
>> -extern void __iomem *early_memremap(unsigned long offset, unsigned long size);
>> +extern void __iomem *early_ioremap(resource_size_t phys_addr,
>> +				   unsigned long size);
>> +extern void __iomem *early_memremap(resource_size_t phys_addr,
>> +				    unsigned long size);
> 
> 
> Nitpick : those second lines are not aligned with each other. The rest
> has my

hmm, after being patched, those are aligned to the first line...

---
extern void __iomem *early_ioremap(resource_size_t phys_addr,
				   unsigned long size);
extern void __iomem *early_memremap(resource_size_t phys_addr,
				    unsigned long size);
---

Thank you,

> 
> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> 
> Thanks !
> 
> Mathieu
> 
>>  extern void early_iounmap(void __iomem *addr, unsigned long size);
>>
>>  #define IO_SPACE_LIMIT 0xffff
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index 0dfa09d..09daebf 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -547,7 +547,7 @@ void __init early_ioremap_reset(void)
>>  }
>>
>>  static void __init __early_set_fixmap(enum fixed_addresses idx,
>> -				   unsigned long phys, pgprot_t flags)
>> +				      phys_addr_t phys, pgprot_t flags)
>>  {
>>  	unsigned long addr = __fix_to_virt(idx);
>>  	pte_t *pte;
>> @@ -566,7 +566,7 @@ static void __init __early_set_fixmap(enum fixed_addresses idx,
>>  }
>>
>>  static inline void __init early_set_fixmap(enum fixed_addresses idx,
>> -					   unsigned long phys, pgprot_t prot)
>> +					   phys_addr_t phys, pgprot_t prot)
>>  {
>>  	if (after_paging_init)
>>  		__set_fixmap(idx, phys, prot);
>> @@ -607,9 +607,10 @@ static int __init check_early_ioremap_leak(void)
>>  late_initcall(check_early_ioremap_leak);
>>
>>  static void __init __iomem *
>> -__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
>> +__early_ioremap(resource_size_t phys_addr, unsigned long size, pgprot_t prot)
>>  {
>> -	unsigned long offset, last_addr;
>> +	unsigned long offset;
>> +	resource_size_t last_addr;
>>  	unsigned int nrpages;
>>  	enum fixed_addresses idx0, idx;
>>  	int i, slot;
>> @@ -625,15 +626,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
>>  	}
>>
>>  	if (slot < 0) {
>> -		printk(KERN_INFO "early_iomap(%08lx, %08lx) not found slot\n",
>> -			 phys_addr, size);
>> +		printk(KERN_INFO "early_iomap(%08llx, %08lx) not found slot\n",
>> +			 (u64)phys_addr, size);
>>  		WARN_ON(1);
>>  		return NULL;
>>  	}
>>
>>  	if (early_ioremap_debug) {
>> -		printk(KERN_INFO "early_ioremap(%08lx, %08lx) [%d] => ",
>> -		       phys_addr, size, slot);
>> +		printk(KERN_INFO "early_ioremap(%08llx, %08lx) [%d] => ",
>> +		       (u64)phys_addr, size, slot);
>>  		dump_stack();
>>  	}
>>
>> @@ -680,13 +681,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
>>  }
>>
>>  /* Remap an IO device */
>> -void __init __iomem *early_ioremap(unsigned long phys_addr, unsigned long size)
>> +void __init __iomem *
>> +early_ioremap(resource_size_t phys_addr, unsigned long size)
>>  {
>>  	return __early_ioremap(phys_addr, size, PAGE_KERNEL_IO);
>>  }
>>
>>  /* Remap memory */
>> -void __init __iomem *early_memremap(unsigned long phys_addr, unsigned long size)
>> +void __init __iomem *
>> +early_memremap(resource_size_t phys_addr, unsigned long size)
>>  {
>>  	return __early_ioremap(phys_addr, size, PAGE_KERNEL);
>>  }
>> -- 
>> Masami Hiramatsu
>>
>> Software Engineer
>> Hitachi Computer Products (America) Inc.
>> Software Solutions Division
>>
>> e-mail: mhiramat@redhat.com
>>
> 

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip:x86/urgent] x86: fix set_fixmap to use phys_addr_t
  2009-04-09 17:55                 ` [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t Masami Hiramatsu
  2009-04-09 18:46                   ` Mathieu Desnoyers
  2009-04-10 14:06                   ` [tip:x86/urgent] " Masami Hiramatsu
@ 2009-04-10 18:30                   ` Masami Hiramatsu
  2 siblings, 0 replies; 22+ messages in thread
From: Masami Hiramatsu @ 2009-04-10 18:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mathieu.desnoyers, hpa, mingo, torvalds, ananth,
	garyhade, akpm, tglx, mhiramat, mingo, systemtap

Commit-ID:  9b987aeb4a7bc42a3eb8361030b820b0263c31f1
Gitweb:     http://git.kernel.org/tip/9b987aeb4a7bc42a3eb8361030b820b0263c31f1
Author:     Masami Hiramatsu <mhiramat@redhat.com>
AuthorDate: Thu, 9 Apr 2009 10:55:33 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 10 Apr 2009 20:27:13 +0200

x86: fix set_fixmap to use phys_addr_t

Impact: fix kprobes crash on 32-bit with RAM above 4G

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <49DE3695.6040800@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/include/asm/fixmap.h   |    4 ++--
 arch/x86/include/asm/io.h       |    6 ++++--
 arch/x86/include/asm/paravirt.h |    4 ++--
 arch/x86/mm/ioremap.c           |   23 +++++++++++++----------
 arch/x86/mm/pgtable.c           |    3 ++-
 arch/x86/xen/mmu.c              |    2 +-
 6 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 81937a5..2d81af3 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -151,11 +151,11 @@ extern pte_t *pkmap_page_table;
 
 void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
 void native_set_fixmap(enum fixed_addresses idx,
-		       unsigned long phys, pgprot_t flags);
+		       phys_addr_t phys, pgprot_t flags);
 
 #ifndef CONFIG_PARAVIRT
 static inline void __set_fixmap(enum fixed_addresses idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	native_set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index e5383e3..7373932 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -193,8 +193,10 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
  */
 extern void early_ioremap_init(void);
 extern void early_ioremap_reset(void);
-extern void __iomem *early_ioremap(unsigned long offset, unsigned long size);
-extern void __iomem *early_memremap(unsigned long offset, unsigned long size);
+extern void __iomem *early_ioremap(resource_size_t phys_addr,
+				   unsigned long size);
+extern void __iomem *early_memremap(resource_size_t phys_addr,
+				    unsigned long size);
 extern void early_iounmap(void __iomem *addr, unsigned long size);
 
 #define IO_SPACE_LIMIT 0xffff
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 7727aa8..378e369 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -347,7 +347,7 @@ struct pv_mmu_ops {
 	/* Sometimes the physical address is a pfn, and sometimes its
 	   an mfn.  We can tell which is which from the index. */
 	void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx,
-			   unsigned long phys, pgprot_t flags);
+			   phys_addr_t phys, pgprot_t flags);
 };
 
 struct raw_spinlock;
@@ -1432,7 +1432,7 @@ static inline void arch_leave_lazy_mmu_mode(void)
 void arch_flush_lazy_mmu_mode(void);
 
 static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
-				unsigned long phys, pgprot_t flags)
+				phys_addr_t phys, pgprot_t flags)
 {
 	pv_mmu_ops.set_fixmap(idx, phys, flags);
 }
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 0dfa09d..09daebf 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -547,7 +547,7 @@ void __init early_ioremap_reset(void)
 }
 
 static void __init __early_set_fixmap(enum fixed_addresses idx,
-				   unsigned long phys, pgprot_t flags)
+				      phys_addr_t phys, pgprot_t flags)
 {
 	unsigned long addr = __fix_to_virt(idx);
 	pte_t *pte;
@@ -566,7 +566,7 @@ static void __init __early_set_fixmap(enum fixed_addresses idx,
 }
 
 static inline void __init early_set_fixmap(enum fixed_addresses idx,
-					   unsigned long phys, pgprot_t prot)
+					   phys_addr_t phys, pgprot_t prot)
 {
 	if (after_paging_init)
 		__set_fixmap(idx, phys, prot);
@@ -607,9 +607,10 @@ static int __init check_early_ioremap_leak(void)
 late_initcall(check_early_ioremap_leak);
 
 static void __init __iomem *
-__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
+__early_ioremap(resource_size_t phys_addr, unsigned long size, pgprot_t prot)
 {
-	unsigned long offset, last_addr;
+	unsigned long offset;
+	resource_size_t last_addr;
 	unsigned int nrpages;
 	enum fixed_addresses idx0, idx;
 	int i, slot;
@@ -625,15 +626,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
 	}
 
 	if (slot < 0) {
-		printk(KERN_INFO "early_iomap(%08lx, %08lx) not found slot\n",
-			 phys_addr, size);
+		printk(KERN_INFO "early_iomap(%08llx, %08lx) not found slot\n",
+			 (u64)phys_addr, size);
 		WARN_ON(1);
 		return NULL;
 	}
 
 	if (early_ioremap_debug) {
-		printk(KERN_INFO "early_ioremap(%08lx, %08lx) [%d] => ",
-		       phys_addr, size, slot);
+		printk(KERN_INFO "early_ioremap(%08llx, %08lx) [%d] => ",
+		       (u64)phys_addr, size, slot);
 		dump_stack();
 	}
 
@@ -680,13 +681,15 @@ __early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
 }
 
 /* Remap an IO device */
-void __init __iomem *early_ioremap(unsigned long phys_addr, unsigned long size)
+void __init __iomem *
+early_ioremap(resource_size_t phys_addr, unsigned long size)
 {
 	return __early_ioremap(phys_addr, size, PAGE_KERNEL_IO);
 }
 
 /* Remap memory */
-void __init __iomem *early_memremap(unsigned long phys_addr, unsigned long size)
+void __init __iomem *
+early_memremap(resource_size_t phys_addr, unsigned long size)
 {
 	return __early_ioremap(phys_addr, size, PAGE_KERNEL);
 }
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5b7c7c8..7aa03a5 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -345,7 +345,8 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
 	fixmaps_set++;
 }
 
-void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags)
+void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
+		       pgprot_t flags)
 {
 	__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
 }
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index db3802f..2a81838 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1750,7 +1750,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 }
 #endif	/* CONFIG_X86_64 */
 
-static void xen_set_fixmap(unsigned idx, unsigned long phys, pgprot_t prot)
+static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
 {
 	pte_t pte;
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-04-10 18:33 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-04 14:34 [BUG][-tip] kprobes on module functions hits kernel BUG in text_poke on x86-32 Masami Hiramatsu
2009-04-04 15:42 ` Mathieu Desnoyers
2009-04-04 18:28   ` Masami Hiramatsu
2009-04-04 19:04     ` Masami Hiramatsu
2009-04-05  3:46       ` Masami Hiramatsu
2009-04-05  3:49         ` Masami Hiramatsu
2009-04-06 17:11         ` [BUGFIX][PATCH -tip] x86: fix text_poke to handle highmem pages Masami Hiramatsu
2009-04-06 17:32           ` Mathieu Desnoyers
2009-04-06 17:44             ` Masami Hiramatsu
2009-04-06 17:58               ` Mathieu Desnoyers
2009-04-06 20:23                 ` Masami Hiramatsu
2009-04-08 12:31           ` Ingo Molnar
2009-04-08 14:57             ` Masami Hiramatsu
2009-04-08 14:59               ` Ingo Molnar
2009-04-09 17:55                 ` [BUGFIX][PATCH] x86: fix set_fixmap to use phys_addr_t Masami Hiramatsu
2009-04-09 18:46                   ` Mathieu Desnoyers
2009-04-09 21:52                     ` Masami Hiramatsu
2009-04-10 14:06                   ` [tip:x86/urgent] " Masami Hiramatsu
2009-04-10 15:20                     ` Masami Hiramatsu
2009-04-10 16:05                       ` Mathieu Desnoyers
2009-04-10 17:48                         ` Masami Hiramatsu
2009-04-10 18:30                   ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).