All of lore.kernel.org
 help / color / mirror / Atom feed
From: Toshiaki Makita <toshiaki.makita1@gmail.com>
To: Jason Wang <jasowang@redhat.com>,
	Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
	"David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH net] tun: Fix NULL pointer dereference in XDP redirect
Date: Fri, 25 May 2018 22:43:56 +0900	[thread overview]
Message-ID: <863681d4-92cf-1575-a182-b07f22ec578e@gmail.com> (raw)
In-Reply-To: <491862dc-b6d2-854d-4d5f-8dfe34da882d@redhat.com>

On 18/05/25 (金) 18:59, Jason Wang wrote:
> On 2018年05月25日 12:32, Toshiaki Makita wrote:
>> Calling XDP redirection requires preempt/bh disabled. Especially softirq
>> can call another XDP function and redirection functions, then percpu
>> value ri->map can be overwritten to NULL.
>>
>> This is a generic XDP case called from tun.
>>
>> [ 3535.736058] BUG: unable to handle kernel NULL pointer dereference 
>> at 0000000000000018
>> [ 3535.743974] PGD 0 P4D 0
>> [ 3535.746530] Oops: 0000 [#1] SMP PTI
>> [ 3535.750049] Modules linked in: vhost_net vhost tap tun bridge stp 
>> llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
>> sunrpc vfat fat ext4 mbcache jbd2 intel_rapl skx_edac nfit libnvdimm 
>> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif 
>> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ses 
>> aesni_intel crypto_simd cryptd enclosure hpwdt hpilo glue_helper 
>> ipmi_si pcspkr wmi mei_me ioatdma mei ipmi_devintf shpchp dca 
>> ipmi_msghandler lpc_ich acpi_power_meter sch_fq_codel ip_tables xfs 
>> libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea 
>> sysfillrect sysimgblt fb_sys_fops ttm drm smartpqi i40e crc32c_intel 
>> scsi_transport_sas tg3 i2c_core ptp pps_core
>> [ 3535.813456] CPU: 5 PID: 1630 Comm: vhost-1614 Not tainted 
>> 4.17.0-rc4 #2
>> [ 3535.820127] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 
>> Gen10, BIOS U32 11/14/2017
>> [ 3535.828732] RIP: 0010:__xdp_map_lookup_elem+0x5/0x30
>> [ 3535.833740] RSP: 0018:ffffb4bc47bf7c58 EFLAGS: 00010246
>> [ 3535.839009] RAX: ffff9fdfcfea1c40 RBX: 0000000000000000 RCX: 
>> ffff9fdf27fe3100
>> [ 3535.846205] RDX: ffff9fdfca769200 RSI: 0000000000000000 RDI: 
>> 0000000000000000
>> [ 3535.853402] RBP: ffffb4bc491d9000 R08: 00000000000045ad R09: 
>> 0000000000000ec0
>> [ 3535.860597] R10: 0000000000000001 R11: ffff9fdf26c3ce4e R12: 
>> ffff9fdf9e72c000
>> [ 3535.867794] R13: 0000000000000000 R14: fffffffffffffff2 R15: 
>> ffff9fdfc82cdd00
>> [ 3535.874990] FS:  0000000000000000(0000) GS:ffff9fdfcfe80000(0000) 
>> knlGS:0000000000000000
>> [ 3535.883152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 3535.888948] CR2: 0000000000000018 CR3: 0000000bde724004 CR4: 
>> 00000000007626e0
>> [ 3535.896145] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [ 3535.903342] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
>> 0000000000000400
>> [ 3535.910538] PKRU: 55555554
>> [ 3535.913267] Call Trace:
>> [ 3535.915736]  xdp_do_generic_redirect+0x7a/0x310
>> [ 3535.920310]  do_xdp_generic.part.117+0x285/0x370
>> [ 3535.924970]  tun_get_user+0x5b9/0x1260 [tun]
>> [ 3535.929279]  tun_sendmsg+0x52/0x70 [tun]
>> [ 3535.933237]  handle_tx+0x2ad/0x5f0 [vhost_net]
>> [ 3535.937721]  vhost_worker+0xa5/0x100 [vhost]
>> [ 3535.942030]  kthread+0xf5/0x130
>> [ 3535.945198]  ? vhost_dev_ioctl+0x3b0/0x3b0 [vhost]
>> [ 3535.950031]  ? kthread_bind+0x10/0x10
>> [ 3535.953727]  ret_from_fork+0x35/0x40
>> [ 3535.957334] Code: 0e 74 15 83 f8 10 75 05 e9 49 aa b3 ff f3 c3 0f 
>> 1f 80 00 00 00 00 f3 c3 e9 29 9d b3 ff 66 0f 1f 84 00 00 00 00 00 0f 
>> 1f 44 00 00 <8b> 47 18 83 f8 0e 74 0d 83 f8 10 75 05 e9 49 a9 b3 ff 31 
>> c0 c3
>> [ 3535.976387] RIP: __xdp_map_lookup_elem+0x5/0x30 RSP: ffffb4bc47bf7c58
>> [ 3535.982883] CR2: 0000000000000018
>> [ 3535.987096] ---[ end trace 383b299dd1430240 ]---
>> [ 3536.131325] Kernel panic - not syncing: Fatal exception
>> [ 3536.137484] Kernel Offset: 0x26a00000 from 0xffffffff81000000 
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [ 3536.281406] ---[ end Kernel panic - not syncing: Fatal exception ]---
>>
>> And a kernel with generic case fixed still panics in tun driver XDP
>> redirect, because it did not disable bh.
>>
>> [ 2055.128746] BUG: unable to handle kernel NULL pointer dereference 
>> at 0000000000000018
>> [ 2055.136662] PGD 0 P4D 0
>> [ 2055.139219] Oops: 0000 [#1] SMP PTI
>> [ 2055.142736] Modules linked in: vhost_net vhost tap tun bridge stp 
>> llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
>> sunrpc vfat fat ext4 mbcache jbd2 intel_rapl skx_edac nfit libnvdimm 
>> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass 
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ses aesni_intel 
>> ipmi_ssif crypto_simd enclosure cryptd hpwdt glue_helper ioatdma hpilo 
>> wmi dca pcspkr ipmi_si acpi_power_meter ipmi_devintf shpchp mei_me 
>> ipmi_msghandler mei lpc_ich sch_fq_codel ip_tables xfs libcrc32c 
>> sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect 
>> sysimgblt fb_sys_fops ttm drm i40e smartpqi tg3 scsi_transport_sas 
>> crc32c_intel i2c_core ptp pps_core
>> [ 2055.206142] CPU: 6 PID: 1693 Comm: vhost-1683 Tainted: G        
>> W         4.17.0-rc5-fix-tun+ #1
>> [ 2055.215011] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 
>> Gen10, BIOS U32 11/14/2017
>> [ 2055.223617] RIP: 0010:__xdp_map_lookup_elem+0x5/0x30
>> [ 2055.228624] RSP: 0018:ffff998b07607cc0 EFLAGS: 00010246
>> [ 2055.233892] RAX: ffff8dbd8e235700 RBX: ffff8dbd8ff21c40 RCX: 
>> 0000000000000004
>> [ 2055.241089] RDX: ffff998b097a9000 RSI: 0000000000000000 RDI: 
>> 0000000000000000
>> [ 2055.248286] RBP: 0000000000000000 R08: 00000000000065a8 R09: 
>> 0000000000005d80
>> [ 2055.255483] R10: 0000000000000040 R11: ffff8dbcf0100000 R12: 
>> ffff998b097a9000
>> [ 2055.262681] R13: ffff8dbd8c98c000 R14: 0000000000000000 R15: 
>> ffff998b07607d78
>> [ 2055.269879] FS:  0000000000000000(0000) GS:ffff8dbd8ff00000(0000) 
>> knlGS:0000000000000000
>> [ 2055.278039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 2055.283834] CR2: 0000000000000018 CR3: 0000000c0c8cc005 CR4: 
>> 00000000007626e0
>> [ 2055.291030] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [ 2055.298227] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
>> 0000000000000400
>> [ 2055.305424] PKRU: 55555554
>> [ 2055.308153] Call Trace:
>> [ 2055.310624]  xdp_do_redirect+0x7b/0x380
>> [ 2055.314499]  tun_get_user+0x10fe/0x12a0 [tun]
>> [ 2055.318895]  tun_sendmsg+0x52/0x70 [tun]
>> [ 2055.322852]  handle_tx+0x2ad/0x5f0 [vhost_net]
>> [ 2055.327337]  vhost_worker+0xa5/0x100 [vhost]
>> [ 2055.331646]  kthread+0xf5/0x130
>> [ 2055.334813]  ? vhost_dev_ioctl+0x3b0/0x3b0 [vhost]
>> [ 2055.339646]  ? kthread_bind+0x10/0x10
>> [ 2055.343343]  ret_from_fork+0x35/0x40
>> [ 2055.346950] Code: 0e 74 15 83 f8 10 75 05 e9 e9 aa b3 ff f3 c3 0f 
>> 1f 80 00 00 00 00 f3 c3 e9 c9 9d b3 ff 66 0f 1f 84 00 00 00 00 00 0f 
>> 1f 44 00 00 <8b> 47 18 83 f8 0e 74 0d 83 f8 10 75 05 e9 e9 a9 b3 ff 31 
>> c0 c3
>> [ 2055.366004] RIP: __xdp_map_lookup_elem+0x5/0x30 RSP: ffff998b07607cc0
>> [ 2055.372500] CR2: 0000000000000018
>> [ 2055.375856] ---[ end trace 2a2dcc5e9e174268 ]---
>> [ 2055.523626] Kernel panic - not syncing: Fatal exception
>> [ 2055.529796] Kernel Offset: 0x2e000000 from 0xffffffff81000000 
>> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> [ 2055.677539] ---[ end Kernel panic - not syncing: Fatal exception ]---
>>
>> Fixes: 761876c857cb ("tap: XDP support")
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> ---
>>   drivers/net/tun.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 45d8077..4fc7dbf 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1650,6 +1650,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       else
>>           *skb_xdp = 0;
>> +    local_bh_disable();
>>       preempt_disable();
>>       rcu_read_lock();
>>       xdp_prog = rcu_dereference(tun->xdp_prog);
>> @@ -1676,6 +1677,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>                   goto err_redirect;
>>               rcu_read_unlock();
>>               preempt_enable();
>> +            local_bh_enable();
>>               return NULL;
>>           case XDP_TX:
>>               get_page(alloc_frag->page);
>> @@ -1685,6 +1687,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>               tun_xdp_flush(tun->dev);
>>               rcu_read_unlock();
>>               preempt_enable();
>> +            local_bh_enable();
>>               return NULL;
>>           case XDP_PASS:
>>               delta = orig_data - xdp.data;
>> @@ -1704,6 +1707,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       if (!skb) {
>>           rcu_read_unlock();
>>           preempt_enable();
>> +        local_bh_enable();
>>           return ERR_PTR(-ENOMEM);
>>       }
>> @@ -1714,6 +1718,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>       rcu_read_unlock();
>>       preempt_enable();
>> +    local_bh_enable();
>>       return skb;
>> @@ -1722,6 +1727,7 @@ static struct sk_buff *tun_build_skb(struct 
>> tun_struct *tun,
>>   err_xdp:
>>       rcu_read_unlock();
>>       preempt_enable();
>> +    local_bh_enable();
>>       this_cpu_inc(tun->pcpu_stats->rx_dropped);
>>       return NULL;
>>   }
>> @@ -1917,16 +1923,22 @@ static ssize_t tun_get_user(struct tun_struct 
>> *tun, struct tun_file *tfile,
>>           struct bpf_prog *xdp_prog;
>>           int ret;
>> +        local_bh_disable();
>> +        preempt_disable();
>>           rcu_read_lock();
>>           xdp_prog = rcu_dereference(tun->xdp_prog);
>>           if (xdp_prog) {
>>               ret = do_xdp_generic(xdp_prog, skb);
>>               if (ret != XDP_PASS) {
>>                   rcu_read_unlock();
>> +                preempt_enable();
>> +                local_bh_enable();
>>                   return total_len;
>>               }
>>           }
>>           rcu_read_unlock();
>> +        preempt_enable();
>> +        local_bh_enable();
>>       }
>>       rcu_read_lock();
> 
> Good catch, thanks.
> 
> But I think we can just replace preempt_disable()/enable() with 
> local_bh_disable()/local_bh_enable() ?

I actually thought the same, but noticed this patch.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ea4c380066fbe

It looks like they do not think local_bh_disable() implies 
preempt_disable(). But I'm not sure why..

Toshiaki Makita

  reply	other threads:[~2018-05-25 13:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-25  4:32 [PATCH net] tun: Fix NULL pointer dereference in XDP redirect Toshiaki Makita
2018-05-25  9:59 ` Jason Wang
2018-05-25 13:43   ` Toshiaki Makita [this message]
2018-05-28  2:24     ` Jason Wang
2018-05-28  4:28       ` Toshiaki Makita

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=863681d4-92cf-1575-a182-b07f22ec578e@gmail.com \
    --to=toshiaki.makita1@gmail.com \
    --cc=davem@davemloft.net \
    --cc=jasowang@redhat.com \
    --cc=makita.toshiaki@lab.ntt.co.jp \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.