linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Garry <john.garry@huawei.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	<gregkh@linuxfoundation.org>, <rafael@kernel.org>,
	<martin.petersen@oracle.com>, <jejb@linux.ibm.com>,
	<linuxarm@huawei.com>, <linux-scsi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 1/3] genirq/affinity: Add irq_update_affinity_desc()
Date: Mon, 23 Nov 2020 15:45:55 +0000	[thread overview]
Message-ID: <5a314713-c1ee-2d34-bee1-60beae274742@huawei.com> (raw)
In-Reply-To: <afd97dd4b1e102ac9ad49800821231a4@kernel.org>

Hi Marc,

>> So is there a reason for which irq dispose mapping is not a
>> requirement for drivers when finished with the irq? because of shared
>> interrupts?
> 
> For a bunch of reasons: IRQ number used to be created 1:1 with their
> physical counterpart, so there wasn't a need to "get rid" of the
> associated data structures. Also, people expected their drivers
> to be there for the lifetime of the system (believe it or not,
> hotplug devices are pretty new!).
> 
> Shared interrupts are just another part of the problem (although we
> should be able to work out whether there is any action attached to
> the descriptor before blowing it away.
> 

OK, understood.

>> But it looks like there is more to it than that, which I'm worried is
>> far from non-trivial. For example, just calling irq_dispose_mapping()
>> for removal and then plaform_get_irq()->acpi_get_irq() second time
>> fails as it looks like more tidy-up is needed for removal...
> 
> Most probably. I could imagine things failing if there is any trace
> of an existing translation in the ITS or in the platform-MSI layer,
> for example, or if the interrupt is still active...

So this looks to be a problem I have. So if I hack the code to skip the 
check in acpi_get_irq() for the irq already being init'ed, I run into a 
use-after-free in the gic-v3-its driver. I may be skipping something 
with this hack, but I'll ask anyway.

So initially in the msi_prepare method we setup the its dev - this is 
from the mbigen probe. Then when all the irqs are unmapped later for end 
device driver removal, we release this its device in 
its_irq_domain_free(). But I don't see anything to set it up again. Is 
it improper to have released the its device in this scenario? Commenting 
out the release makes things "good" again.

Thanks,
john

Complete splat:

[   35.751627] 
==================================================================
[   35.758892] BUG: KASAN: use-after-free in its_irq_domain_alloc+0x44/0x1a8
[   35.765714] Read of size 8 at addr ffff04101bc93210 by task swapper/0/1
[   35.772357]
[   35.773854] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
5.10.0-rc4-18055-g3247a1b07719 #955
[   35.782072] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon 
D05 IT21 Nemo 2.0 RC0 04/18/2018
[   35.791250] Call trace:
[   35.793706]  dump_backtrace+0x0/0x2d0
[   35.797384]  show_stack+0x18/0x68
[   35.800713]  dump_stack+0x100/0x16c
[   35.804216]  print_address_description.constprop.12+0x6c/0x4e8
[   35.810078]  kasan_report+0x130/0x200
[   35.813755]  __asan_load8+0x9c/0xd8
[   35.817257]  its_irq_domain_alloc+0x44/0x1a8
[   35.821548]  irq_domain_alloc_irqs_parent+0x68/0x88
[   35.826447]  msi_domain_alloc+0x98/0x180
[   35.830387]  irq_domain_alloc_irqs_hierarchy+0x58/0x78
[   35.835549]  msi_domain_populate_irqs+0x16c/0x1d8
[   35.840275]  platform_msi_domain_alloc+0x9c/0xd0
[   35.844914]  mbigen_irq_domain_alloc+0x100/0x180
[   35.849553]  __irq_domain_alloc_irqs+0x188/0x498
[   35.854191]  irq_create_fwspec_mapping+0x21c/0x4e0
[   35.859005]  acpi_irq_get+0x1f0/0x218
[   35.862683]  platform_get_irq_optional+0x1c4/0x4a0
[   35.867495]  platform_get_irq+0x20/0x68
[   35.871348]  hisi_sas_v2_probe+0x2c/0x88
[   35.875287]  platform_drv_probe+0x70/0xd0
[   35.879313]  really_probe+0x414/0x640
[   35.882990]  driver_probe_device+0x78/0xe8
[   35.887103]  device_driver_attach+0x9c/0xa8
[   35.891304]  __driver_attach+0x74/0x120
[   35.895156]  bus_for_each_dev+0xec/0x160
[   35.899094]  driver_attach+0x34/0x48
[   35.902684]  bus_add_driver+0x1b8/0x2c0
[   35.906535]  driver_register+0xc0/0x1e0
[   35.910386]  __platform_driver_register+0x80/0x90
[   35.915115]  hisi_sas_v2_driver_init+0x1c/0x28
[   35.919578]  do_one_initcall+0xd4/0x268
[   35.923431]  kernel_init_freeable+0x270/0x2d8
[   35.927807]  kernel_init+0x14/0x124
[   35.931309]  ret_from_fork+0x10/0x34
[   35.934895]
[   35.936386] Allocated by task 1:
[   35.939628]  stack_trace_save+0x94/0xc8
[   35.943480]  kasan_save_stack+0x28/0x58
[   35.947331]  __kasan_kmalloc.isra.6+0xcc/0xf0
[   35.951706]  kasan_kmalloc+0x10/0x20
[   35.955295]  its_create_device+0x1b8/0x448
[   35.959408]  its_msi_prepare+0x120/0x1d8
[   35.963347]  its_pmsi_prepare+0x20c/0x280
[   35.967373]  msi_domain_prepare_irqs+0x80/0x98
[   35.971836]  __platform_msi_create_device_domain+0xa8/0x130
[   35.977435]  mbigen_device_probe+0x298/0x340
[   35.981723]  platform_drv_probe+0x70/0xd0
[   35.985748]  really_probe+0x414/0x640
[   35.989424]  driver_probe_device+0x78/0xe8
[   35.993537]  device_driver_attach+0x9c/0xa8
[   35.997737]  __driver_attach+0x74/0x120
[   36.001589]  bus_for_each_dev+0xec/0x160
[   36.005527]  driver_attach+0x34/0x48
[   36.009118]  bus_add_driver+0x1b8/0x2c0
[   36.012968]  driver_register+0xc0/0x1e0
[   36.016819]  __platform_driver_register+0x80/0x90
[   36.021545]  mbigen_platform_driver_init+0x1c/0x28
[   36.026356]  do_one_initcall+0xd4/0x268
[   36.030207]  kernel_init_freeable+0x270/0x2d8
[   36.034583]  kernel_init+0x14/0x124
[   36.038084]  ret_from_fork+0x10/0x34
[   36.041671]
[   36.043162] Freed by task 1:
[   36.046053]  stack_trace_save+0x94/0xc8
[   36.049904]  kasan_save_stack+0x28/0x58
[   36.053754]  kasan_set_track+0x28/0x40
[   36.057518]  kasan_set_free_info+0x24/0x48
[   36.061631]  __kasan_slab_free+0x104/0x188
[   36.065744]  kasan_slab_free+0x14/0x20
[   36.069510]  slab_free_freelist_hook+0x8c/0x190
[   36.074060]  kfree+0x308/0x448
[   36.077125]  its_irq_domain_free+0x31c/0x338
[   36.081414]  irq_domain_free_irqs_common+0xd8/0x128
[   36.086314]  irq_domain_free_irqs_top+0x70/0x88
[   36.090864]  msi_domain_free+0xa8/0xc8
[   36.094629]  irq_domain_free_irqs_common+0xd8/0x128
[   36.099528]  platform_msi_domain_free+0xdc/0x1b0
[   36.104167]  mbigen_irq_domain_free+0x10/0x20
[   36.108543]  irq_domain_free_irqs+0x184/0x208
[   36.112918]  irq_dispose_mapping+0x54/0x98
[   36.117033]  devm_platform_get_irqs_affinity_release+0xbc/0xdc
[   36.122893]  release_nodes+0x350/0x3e8
[   36.126657]  devres_release_all+0x54/0x78
[   36.130683]  really_probe+0x234/0x640
[   36.134359]  driver_probe_device+0x78/0xe8
[   36.138473]  device_driver_attach+0x9c/0xa8
[   36.142673]  __driver_attach+0x74/0x120
[   36.146524]  bus_for_each_dev+0xec/0x160
[   36.150462]  driver_attach+0x34/0x48
[   36.154052]  bus_add_driver+0x1b8/0x2c0
[   36.157903]  driver_register+0xc0/0x1e0
[   36.161754]  __platform_driver_register+0x80/0x90
[   36.166479]  hisi_sas_v2_driver_init+0x1c/0x28
[   36.170942]  do_one_initcall+0xd4/0x268
[   36.174793]  kernel_init_freeable+0x270/0x2d8
[   36.179168]  kernel_init+0x14/0x124
[   36.182670]  ret_from_fork+0x10/0x34
[   36.186256]
[   36.187749] The buggy address belongs to the object at ffff04101bc93200
[   36.187749]  which belongs to the cache kmalloc-128 of size 128
[   36.200335] The buggy address is located 16 bytes inside of
[   36.200335]  128-byte region [ffff04101bc93200, ffff04101bc93280)
[   36.212046] The buggy address belongs to the page:
[   36.216864] page:(____ptrval____) refcount:1 mapcount:0 
mapping:0000000000000000 index:0x0 pfn:0x4101bc92
[   36.226479] head:(____ptrval____) order:1 compound_mapcount:0
[   36.232253] flags: 0x2bfffc0000010200(slab|head)
[   36.236895] raw: 2bfffc0000010200 dead000000000100 dead000000000122 
ffff00104000fc00
[   36.244679] raw: 0000000000000000 0000000000200020 00000001ffffffff 
0000000000000000
[   36.252459] page dumped because: kasan: bad access detected
[   36.258055]
[   36.259545] Memory state around the buggy address:
[   36.264358]  ffff04101bc93100: fa fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   36.271616]  ffff04101bc93180: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   36.278874] >ffff04101bc93200: fa fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   36.286130]                          ^
[   36.289893]  ffff04101bc93280: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   36.297151]  ffff04101bc93300: fa fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   36.304406] 
==================================================================
[   36.311663] Disabling lock debugging due to kernel taint

  reply	other threads:[~2020-11-23 15:46 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-28 12:33 [PATCH v2 0/3] Support managed interrupts for platform devices John Garry
2020-10-28 12:33 ` [PATCH v2 1/3] genirq/affinity: Add irq_update_affinity_desc() John Garry
2020-10-28 18:22   ` Thomas Gleixner
2020-11-02 17:32     ` John Garry
2020-11-02 20:35       ` Thomas Gleixner
2020-11-17 21:28         ` Thomas Gleixner
2020-11-18 11:34           ` John Garry
2020-11-18 20:38             ` Thomas Gleixner
2020-11-19  9:31               ` John Garry
2020-11-19 18:09                 ` Thomas Gleixner
2020-11-19 19:56                   ` John Garry
2020-11-19 21:03                     ` Thomas Gleixner
2020-11-20 11:52                       ` John Garry
2020-11-22 13:38                         ` Marc Zyngier
2020-11-23 12:54                           ` John Garry
2020-11-23 13:26                             ` Marc Zyngier
2020-11-23 15:45                               ` John Garry [this message]
2020-11-24 16:51                                 ` Marc Zyngier
2020-11-24 17:38                                   ` John Garry
2020-11-25 18:35                                     ` Marc Zyngier
2020-11-26 10:47                                       ` John Garry
2020-11-26 11:08                                         ` Marc Zyngier
2020-11-26 11:29                                           ` John Garry
2020-11-26 16:52                                             ` John Garry
2020-11-27  9:57                                               ` Marc Zyngier
2020-11-27 12:45                                                 ` John Garry
2020-11-27 12:49                                                   ` Marc Zyngier
2020-10-28 12:33 ` [PATCH v2 2/3] Driver core: platform: Add platform_get_irqs_affinity() John Garry
2020-10-28 12:33 ` [PATCH v2 3/3] scsi: hisi_sas: Expose HW queues for v2 hw John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a314713-c1ee-2d34-bee1-60beae274742@huawei.com \
    --to=john.garry@huawei.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jejb@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=martin.petersen@oracle.com \
    --cc=maz@kernel.org \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).