* [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF
@ 2022-03-11 13:22 Xiaoguang Wang
2022-03-11 13:22 ` [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock Xiaoguang Wang
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Xiaoguang Wang @ 2022-03-11 13:22 UTC (permalink / raw)
To: linux-scsi, target-devel; +Cc: bostroesser
tcmu_try_get_data_page() looks up pages under cmdr_lock, but it doesn't
take a reference on the page and just returns the page pointer.
By the time tcmu_try_get_data_page() returns, the page may already have
been freed by tcmu_blocks_release(). Call get_page() under cmdr_lock to
avoid racing with a concurrent tcmu_blocks_release().
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
drivers/target/target_core_user.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 7b2a89a67cdb..06a5c4086551 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1820,6 +1820,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
mutex_lock(&udev->cmdr_lock);
page = xa_load(&udev->data_pages, dpi);
if (likely(page)) {
+ get_page(page);
mutex_unlock(&udev->cmdr_lock);
return page;
}
@@ -1876,6 +1877,7 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
/* For the vmalloc()ed cmd area pages */
addr = (void *)(unsigned long)info->mem[mi].addr + offset;
page = vmalloc_to_page(addr);
+ get_page(page);
} else {
uint32_t dpi;
@@ -1886,7 +1888,6 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
return VM_FAULT_SIGBUS;
}
- get_page(page);
vmf->page = page;
return 0;
}
--
2.14.4.44.g2045bb6
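Editorial sketch, not part of the patch: the idea behind the fix (take the reference while the lookup lock is still held) can be modelled in userspace C. Everything named model_* below is hypothetical; the mutex stands in for cmdr_lock and the slot array for the data_pages xarray. It only illustrates the ordering, under those assumptions, and is not the driver's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_SLOTS 4

/* Hypothetical stand-in for struct page: just a refcount and a "freed" flag. */
struct model_page {
	atomic_int refcount;
	int freed;
};

/* Hypothetical stand-in for struct tcmu_dev. */
struct model_dev {
	pthread_mutex_t lock;                   /* plays the role of cmdr_lock */
	struct model_page *slots[MODEL_SLOTS];  /* plays the role of data_pages */
};

/* Drop one reference; the page counts as freed when the last one goes. */
static void model_put_page(struct model_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
	if (old == 1)
		p->freed = 1;
}

/* Look up a slot and take the reference before the lock is dropped. */
static struct model_page *model_try_get_page(struct model_dev *dev, size_t idx)
{
	struct model_page *p = NULL;

	pthread_mutex_lock(&dev->lock);
	if (idx < MODEL_SLOTS)
		p = dev->slots[idx];
	if (p)
		atomic_fetch_add(&p->refcount, 1);  /* like get_page() under the lock */
	pthread_mutex_unlock(&dev->lock);
	return p;
}

/* Remove a slot and drop the reference the slot table was holding. */
static void model_release_slot(struct model_dev *dev, size_t idx)
{
	struct model_page *p;

	pthread_mutex_lock(&dev->lock);
	p = dev->slots[idx];
	dev->slots[idx] = NULL;
	pthread_mutex_unlock(&dev->lock);
	if (p)
		model_put_page(p);
}
```

In this model, a caller that obtained a page from model_try_get_page() can survive a model_release_slot() on the same index: the page is only marked freed once the caller's own model_put_page() drops the last reference. If the increment were done after the unlock instead, the release could run in between and the caller would be left holding a pointer to a freed page.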
* [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock
2022-03-11 13:22 [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF Xiaoguang Wang
@ 2022-03-11 13:22 ` Xiaoguang Wang
2022-03-16 10:43 ` Xiaoguang Wang
2022-03-16 12:38 ` [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF Bodo Stroesser
2022-04-07 13:35 ` Martin K. Petersen
2 siblings, 1 reply; 8+ messages in thread
From: Xiaoguang Wang @ 2022-03-11 13:22 UTC (permalink / raw)
To: linux-scsi, target-devel; +Cc: bostroesser
Currently tcmu_vma_fault() uses udev->cmdr_lock to avoid racing with a
concurrent find_free_blocks(), which unmaps idle pages and truncates them.
This work closely resembles a filesystem's truncate operation, but
filesystems use address_space->invalidate_lock to guard against that race.
This patch replaces cmdr_lock with address_space->invalidate_lock in the
tcmu fault path, which also allows page faults to run concurrently.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
drivers/target/target_core_user.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 06a5c4086551..e0a62623ccd7 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1815,13 +1815,14 @@ static int tcmu_find_mem_index(struct vm_area_struct *vma)
static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
{
+ struct address_space *mapping = udev->inode->i_mapping;
struct page *page;
- mutex_lock(&udev->cmdr_lock);
+ filemap_invalidate_lock_shared(mapping);
page = xa_load(&udev->data_pages, dpi);
if (likely(page)) {
get_page(page);
- mutex_unlock(&udev->cmdr_lock);
+ filemap_invalidate_unlock_shared(mapping);
return page;
}
@@ -1831,7 +1832,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
*/
pr_err("Invalid addr to data page mapping (dpi %u) on device %s\n",
dpi, udev->name);
- mutex_unlock(&udev->cmdr_lock);
+ filemap_invalidate_unlock_shared(mapping);
return NULL;
}
@@ -3111,6 +3112,7 @@ static void find_free_blocks(void)
loff_t off;
u32 pages_freed, total_pages_freed = 0;
u32 start, end, block, total_blocks_freed = 0;
+ struct address_space *mapping;
if (atomic_read(&global_page_count) <= tcmu_global_max_pages)
return;
@@ -3134,6 +3136,7 @@ static void find_free_blocks(void)
continue;
}
+ mapping = udev->inode->i_mapping;
end = udev->dbi_max + 1;
block = find_last_bit(udev->data_bitmap, end);
if (block == udev->dbi_max) {
@@ -3152,12 +3155,14 @@ static void find_free_blocks(void)
udev->dbi_max = block;
}
+ filemap_invalidate_lock(mapping);
/* Here will truncate the data area from off */
off = udev->data_off + (loff_t)start * udev->data_blk_size;
- unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
+ unmap_mapping_range(mapping, off, 0, 1);
/* Release the block pages */
pages_freed = tcmu_blocks_release(udev, start, end - 1);
+ filemap_invalidate_unlock(mapping);
mutex_unlock(&udev->cmdr_lock);
total_pages_freed += pages_freed;
--
2.14.4.44.g2045bb6
* Re: [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock
2022-03-11 13:22 ` [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock Xiaoguang Wang
@ 2022-03-16 10:43 ` Xiaoguang Wang
2022-03-16 13:05 ` Bodo Stroesser
0 siblings, 1 reply; 8+ messages in thread
From: Xiaoguang Wang @ 2022-03-16 10:43 UTC (permalink / raw)
To: linux-scsi, target-devel; +Cc: bostroesser
hello,
Gentle ping.
Regards,
Xiaoguang Wang
* Re: [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF
2022-03-11 13:22 [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF Xiaoguang Wang
2022-03-11 13:22 ` [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock Xiaoguang Wang
@ 2022-03-16 12:38 ` Bodo Stroesser
2022-04-07 13:35 ` Martin K. Petersen
2 siblings, 0 replies; 8+ messages in thread
From: Bodo Stroesser @ 2022-03-16 12:38 UTC (permalink / raw)
To: Xiaoguang Wang, linux-scsi, target-devel
This one looks good. Thank you.
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
* Re: [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock
2022-03-16 10:43 ` Xiaoguang Wang
@ 2022-03-16 13:05 ` Bodo Stroesser
2022-03-17 4:59 ` Xiaoguang Wang
0 siblings, 1 reply; 8+ messages in thread
From: Bodo Stroesser @ 2022-03-16 13:05 UTC (permalink / raw)
To: Xiaoguang Wang, linux-scsi, target-devel
Sorry for the late response. Currently I'm quite busy.
In your earlier mail you described a possible deadlock.
With this patch applied, are you sure a similar deadlock cannot
happen?
Additionally, let's assume tcmu_vma_fault/tcmu_try_get_data_page
- after having found a valid page to map - is interrupted after
releasing the invalidate_lock. Are there any locks held to prevent
find_free_blocks from jumping in, possibly removing that page from
the xarray and trying to remove it from the mmapped area?
If not, we might end up mapping a page that is no longer valid.
Of course, this would be a long-standing problem not caused by your
change. But if there is a problem, we should try to fix it
when touching this code, I think.
Unfortunately I haven't yet managed to check which locks are involved
during page fault handling and unmap_mapping_range.
Bodo
* Re: [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock
2022-03-16 13:05 ` Bodo Stroesser
@ 2022-03-17 4:59 ` Xiaoguang Wang
2022-03-17 6:09 ` Xiaoguang Wang
0 siblings, 1 reply; 8+ messages in thread
From: Xiaoguang Wang @ 2022-03-17 4:59 UTC (permalink / raw)
To: Bodo Stroesser, linux-scsi, target-devel
hi,
> Sorry for the late response. Currently I'm quite busy.
Really never mind :)
>
> In your earlier mail you described a possible dead lock.
> With this patch applied, are you sure a similar deadlock cannot
> happen?
AFAIK, this patch will solve the deadlock.
>
> Additionally, let's assume tcmu_vma_fault/tcmu_try_get_data_page
> - after having found a valid page to map - is interrupted after
> releasing the invalidate_lock. Are there any locks held to prevent
> find_free_blocks from jumping in and possibly remove that page from
> xarray and try to remove it from the mmapped area?
> If not, we might end up mapping a no longer valid page.
Yeah, after tcmu_try_get_data_page() returns, find_free_blocks() definitely
may come in and do unmap_mapping_range() and tcmu_blocks_release(),
but I think it won't cause problems:
1) The page fault path and unmap_mapping_range() are designed to be able
to run concurrently; they synchronize at pte_offset_map_lock(). See
=> do_user_addr_fault
==> handle_mm_fault
===> __handle_mm_fault
====> do_fault
=====> do_shared_fault
======> finish_fault
=======> pte_offset_map_lock
=======> do_set_pte
=======> pte_unmap_unlock
and in find_free_blocks():
=> unmap_mapping_range
==> unmap_mapping_range_tree
===> zap_page_range_single
====> unmap_page_range
=====> zap_p4d_range
======> zap_pud_range
=======> zap_pmd_range
========> zap_pte_range
=========> pte_offset_map_lock
=========> pte_clear_not_present_full
=========> pte_unmap_unlock(start_pte, ptl);
What I mean is that, because the page fault path and unmap_mapping_range()
can run concurrently, userspace will either see a valid mapping or not.
If not, a later access will fault again and, because this page now lies
beyond dbi_max, will get SIGBUS, which is reasonable.
As for your question: when tcmu_try_get_data_page() finds a page, it
takes a reference on it. If unmap_mapping_range() and
tcmu_blocks_release() then come in just after tcmu_try_get_data_page()
returns and before tcmu_vma_fault() returns, tcmu_blocks_release() won't
actually free the page, because one reference is still held. So yes, we'll
map a page that is no longer valid, but the page also won't be reused until
the mapping is torn down later (the process exits or is killed); then
put_page() will be called and the page will finally be given back to the
mm subsystem.
>
> Of course, this would be a long standing problem not caused by your
> change. But if there would be a problem, we should try to fix it
> when touching this code, I think.
> Unfortunately I didn't manage yet to check which locks are involved
> during page fault handling and unmap_mapping_range.
As far as I know, a page fault holds mmap_read_lock() and the
pte lock, while unmap_mapping_range() holds mapping->i_mmap_rwsem
and the pte lock.
Regards,
Xiaoguang Wang
* Re: [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock
2022-03-17 4:59 ` Xiaoguang Wang
@ 2022-03-17 6:09 ` Xiaoguang Wang
0 siblings, 0 replies; 8+ messages in thread
From: Xiaoguang Wang @ 2022-03-17 6:09 UTC (permalink / raw)
To: Bodo Stroesser, linux-scsi, target-devel
hi,
> hi,
>
>> Sorry for the late response. Currently I'm quite busy.
> Really never mind :)
>
>>
>> In your earlier mail you described a possible dead lock.
>> With this patch applied, are you sure a similar deadlock cannot
>> happen?
> AFAIK, this patch will solve the deadlock.
>
>>
>> Additionally, let's assume tcmu_vma_fault/tcmu_try_get_data_page
>> - after having found a valid page to map - is interrupted after
>> releasing the invalidate_lock. Are there any locks held to prevent
>> find_free_blocks from jumping in and possibly remove that page from
>> xarray and try to remove it from the mmapped area?
>> If not, we might end up mapping a no longer valid page.
> Yeah, after tcmu_try_get_data_page() returns, find_free_blocks()
> definitely
> may come in and do unmap_mapping_range() and tcmu_blocks_release(),
> but I think it won't cause problems:
> 1) Since page fault procedure and unmap_mapping_range are designed to
> be able to run concurrently, they sync at pte_offset_map_lock(). See
> => do_user_addr_fault
> ==> handle_mm_fault
> ===> __handle_mm_fault
> ====> do_fault
> =====> do_shared_fault
> =======> finish_fault
> ========> pte_offset_map_lock
> ========> do_set_pte
> ========> pte_unmap_unlock
>
> and in find_free_blocks():
> => unmap_mapping_range
> == > unmap_mapping_range_tree
> ===> zap_page_range_single
> ====> unmap_page_range
> =====> zap_p4d_range
> ======> zap_pud_range
> ========> zap_pmd_range
> ==========> zap_pte_range
> ===========> pte_offset_map_lock
> ===========> pte_clear_not_present_full
> ===========> pte_unmap_unlock(start_pte, ptl);
>
> So what I want to express is that because of the concurrency of page
> fault
> procedure and unmap_mapping_range(), one will either see a valid map, or
> not. And if not, because this page exceeds dbi_max, a later page fault
> will
> happen, and will get sigbus, but it's reasonable.
>
> As for your question, tcmu_try_get_data_page() finds a page successfully,
> this page will get a refcount properly, if later unmap_mapping_range()
> and
> tcmu_blocks_release() come in, just after tcmu_try_get_data_page()
> returns and
> before tcmu_vma_fault() returns, then actually tcmu_blocks_release()
> won't
> free this page because there is one refcount. So yes, we'll map a no
> longer
> valid page, but this page also won't be re-used, unless the map is
> unmapped
> later(process exits or killed), then put_page() will be called and
> page will finally
> be given back to mm subsystem.
After thinking more about this problem: suppose we now have a valid mapping
that points to a truncated page, and that page's slot in data_bitmap has
been freed. If another command comes in later, it reuses the previously
freed slot in data_bitmap. We'll allocate a new page for this slot in the
data area, but it seems no page fault will happen again, because the
mapping is still valid, so the real request's data will be lost.
As you say, this would indeed be a long-standing problem; we'll need to
take a deeper look at the code.
Regards,
Xiaoguang Wang
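Editorial sketch, not part of the thread: the stale-mapping scenario described above can be replayed as a deterministic, single-threaded userspace model. All sim_* names are hypothetical; one pointer stands in for a data_pages slot and another for the userspace PTE. It assumes the fault is split into a lookup step and a later PTE-install step, with the reclaim running in between, as discussed in the mail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a data page. */
struct sim_page {
	int refcount;
	char data[16];
};

/* One data_pages slot plus the single userspace PTE that may map it. */
struct sim_dev {
	struct sim_page *slot;
	struct sim_page *pte;
};

static void sim_put(struct sim_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		free(p);
}

/* A command stores its payload, allocating a page if the slot is empty. */
static void sim_new_command(struct sim_dev *dev, const char *payload)
{
	if (!dev->slot) {
		dev->slot = calloc(1, sizeof(*dev->slot));
		assert(dev->slot);
		dev->slot->refcount = 1;
	}
	snprintf(dev->slot->data, sizeof(dev->slot->data), "%s", payload);
}

/* First half of the fault: look up the page and take a reference. */
static struct sim_page *sim_fault_lookup(struct sim_dev *dev)
{
	if (dev->slot)
		dev->slot->refcount++;
	return dev->slot;
}

/* Second half of the fault: install the PTE; the mapping owns the reference. */
static void sim_fault_finish(struct sim_dev *dev, struct sim_page *p)
{
	dev->pte = p;
}

/* Reclaim: zap any existing PTE, then drop the slot's page. */
static void sim_reclaim(struct sim_dev *dev)
{
	if (dev->pte) {
		sim_put(dev->pte);
		dev->pte = NULL;
	}
	if (dev->slot) {
		sim_put(dev->slot);
		dev->slot = NULL;
	}
}

/* What userspace reads through its mapping; NULL would mean "fault again". */
static const char *sim_user_read(const struct sim_dev *dev)
{
	return dev->pte ? dev->pte->data : NULL;
}
```

Replaying the interleaving in this model (a command writes "old", sim_fault_lookup(), sim_reclaim(), sim_fault_finish(), then a second command writes "new") leaves the slot holding a fresh page with "new" while sim_user_read() still returns "old" from the page that the extra reference kept alive. No use-after-free occurs, but the mapping and the slot no longer agree, which is the data-loss case raised in the mail.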
>
>>
>> Of course, this would be a long standing problem not caused by your
>> change. But if there would be a problem, we should try to fix it
>> when touching this code, I think.
>> Unfortunately I didn't manage yet to check which locks are involved
>> during page fault handling and unmap_mapping_range.
> At least for my knowledge, page fault will hold mmap_read_lock() and
> pte lock, unmap_mapping_range() will hold mapping->i_mmap_rwsem
> and pte lock.
>
> Regards,
> Xiaoguang Wang
* Re: [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF
2022-03-11 13:22 [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF Xiaoguang Wang
2022-03-11 13:22 ` [PATCH 2/2] scsi: target: tcmu: Use address_space->invalidate_lock Xiaoguang Wang
2022-03-16 12:38 ` [PATCH 1/2] scsi: target: tcmu: Fix possible page UAF Bodo Stroesser
@ 2022-04-07 13:35 ` Martin K. Petersen
2 siblings, 0 replies; 8+ messages in thread
From: Martin K. Petersen @ 2022-04-07 13:35 UTC (permalink / raw)
To: linux-scsi; +Cc: Martin K . Petersen
On Fri, 11 Mar 2022 21:22:05 +0800, Xiaoguang Wang wrote:
> tcmu_try_get_data_page() looks up pages under cmdr_lock, but it don't
> take refcount properly and just return page pointer.
>
> When tcmu_try_get_data_page() returns, the returned page may have been
> freed by tcmu_blocks_release(), need to get_page() under cmdr_lock to
> avoid concurrent tcmu_blocks_release().
>
> [...]
Applied to 5.18/scsi-fixes, thanks!
[1/2] scsi: target: tcmu: Fix possible page UAF
https://git.kernel.org/mkp/scsi/c/a6968f7a367f
--
Martin K. Petersen Oracle Linux Engineering