target-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM
@ 2020-06-18 13:16 Bodo Stroesser
  2020-06-18 13:16 ` [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page Bodo Stroesser
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Bodo Stroesser @ 2020-06-18 13:16 UTC (permalink / raw)
  To: Martin K. Petersen, Mike Christie, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt, Bodo Stroesser

This small series of patches consists of:
   [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page
   [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range

Together with commit
   8c4e0f212398 scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range
these patches fix crashes in tcmu on ARM.

The first patch of this series already was sent some weeks ago
as "PATCH RFC", since it was untested at that time.

Meanwhile I added patch 2 of the series to fix the crash reported in:
   https://github.com/open-iscsi/tcmu-runner/issues/627
   https://bugzilla.kernel.org/show_bug.cgi?id 8045

All three patches together were tested on ARM with kernel
4.19.118 and 5.7.2 (see github issue and bugzilla).

---
v2: sent with a valid subject line.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page
  2020-06-18 13:16 [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM Bodo Stroesser
@ 2020-06-18 13:16 ` Bodo Stroesser
  2020-06-18 13:16 ` [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM Bodo Stroesser
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Bodo Stroesser @ 2020-06-18 13:16 UTC (permalink / raw)
  To: Martin K. Petersen, Mike Christie, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt, Bodo Stroesser

(scatter|gather)_data_area() need to flush dcache after
writing data to or before reading data from a page in
uio data area.
The two routines are able to handle data transfer to/from
such a page in fragments and flush the cache after each
fragment was copied by calling the wrapper
tcmu_flush_dcache_range().

That means:
1) flush_dcache_page() can be called multiple times for
   the same page.
2) Calling flush_dcache_page() indirectly using the
   wrapper does not make sense, because each call of the
   wrapper is for one single page only and the calling
   routine already has the correct page pointer.

Therefore I changed (scatter|gather)_data_area() such,
that instead of calling tcmu_flush_dcache_range()
before/after each memcpy, it now calls flush_dcache_page()
before unmapping a page (when writing is complete for
that page) or after mapping a page (when starting to read
the page).

After this change only calls to tcmu_flush_dcache_range()
for addresses in vmalloc'ed command ring are left over.
This is a good preparation for the second patch of the series.

The patch was tested on ARM with kernel 4.19.118 and 5.7.2

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: JiangYu <lnsyyj@hotmail.com>
Tested-by: Daniel Meyerholt <dxm523@gmail.com>
---
 drivers/target/target_core_user.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 560bfec933bc..a65e9671ae7a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -676,8 +676,10 @@ static void scatter_data_area(struct tcmu_dev *udev,
 		from = kmap_atomic(sg_page(sg)) + sg->offset;
 		while (sg_remaining > 0) {
 			if (block_remaining = 0) {
-				if (to)
+				if (to) {
+					flush_dcache_page(page);
 					kunmap_atomic(to);
+				}
 
 				block_remaining = DATA_BLOCK_SIZE;
 				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
@@ -722,7 +724,6 @@ static void scatter_data_area(struct tcmu_dev *udev,
 				memcpy(to + offset,
 				       from + sg->length - sg_remaining,
 				       copy_bytes);
-				tcmu_flush_dcache_range(to, copy_bytes);
 			}
 
 			sg_remaining -= copy_bytes;
@@ -731,8 +732,10 @@ static void scatter_data_area(struct tcmu_dev *udev,
 		kunmap_atomic(from - sg->offset);
 	}
 
-	if (to)
+	if (to) {
+		flush_dcache_page(page);
 		kunmap_atomic(to);
+	}
 }
 
 static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
@@ -778,13 +781,13 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 				dbi = tcmu_cmd_get_dbi(cmd);
 				page = tcmu_get_block_page(udev, dbi);
 				from = kmap_atomic(page);
+				flush_dcache_page(page);
 			}
 			copy_bytes = min_t(size_t, sg_remaining,
 					block_remaining);
 			if (read_len < copy_bytes)
 				copy_bytes = read_len;
 			offset = DATA_BLOCK_SIZE - block_remaining;
-			tcmu_flush_dcache_range(from, copy_bytes);
 			memcpy(to + sg->length - sg_remaining, from + offset,
 					copy_bytes);
 
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
  2020-06-18 13:16 [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM Bodo Stroesser
  2020-06-18 13:16 ` [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page Bodo Stroesser
@ 2020-06-18 13:16 ` Bodo Stroesser
  2020-06-18 15:00   ` Mike Christie
  2020-06-18 16:04 ` [PATCH v2 0/2] scsi: target: tcmu: fix crashes " Mike Christie
  2020-06-20  3:26 ` Martin K. Petersen
  3 siblings, 1 reply; 9+ messages in thread
From: Bodo Stroesser @ 2020-06-18 13:16 UTC (permalink / raw)
  To: Martin K. Petersen, Mike Christie, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt, Bodo Stroesser

This patch fixes the following crash
(see https://bugzilla.kernel.org/show_bug.cgi?id 8045)

 Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
 CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic
        #202004230533
 Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
 pstate: 80400005 (Nzcv daif +PAN -UAO)
 pc : flush_dcache_page+0x18/0x40
 lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
 sp : ffff000015123a80
 x29: ffff000015123a80 x28: 0000000000000000
 x27: 0000000000001000 x26: ffff000023ea5000
 x25: ffffcfa25bbe08b8 x24: 0000000000000078
 x23: ffff7e0000000000 x22: ffff000023ea5001
 x21: ffffcfa24b79c000 x20: 0000000000000fff
 x19: ffff7e00008fa940 x18: 0000000000000000
 x17: 0000000000000000 x16: ffff2d047e709138
 x15: 0000000000000000 x14: 0000000000000000
 x13: 0000000000000000 x12: ffff2d047fbd0a40
 x11: 0000000000000000 x10: 0000000000000030
 x9 : 0000000000000000 x8 : ffffc9a254820a00
 x7 : 00000000000013b0 x6 : 000000000000003f
 x5 : 0000000000000040 x4 : ffffcfa25bbe08e8
 x3 : 0000000000001000 x2 : 0000000000000078
 x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18
 Call trace:
  flush_dcache_page+0x18/0x40
  is_ring_space_avail+0x68/0x2f8 [target_core_user]
  queue_cmd_ring+0x1f8/0x680 [target_core_user]
  tcmu_queue_cmd+0xe4/0x158 [target_core_user]
  __target_execute_cmd+0x30/0xf0 [target_core_mod]
  target_execute_cmd+0x294/0x390 [target_core_mod]
  transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
  transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
  iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
  iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
  iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
  iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
  iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
  kthread+0x130/0x138
  ret_from_fork+0x10/0x18
 Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)
 ---[ end trace 1e451c73f4266776 ]---

The solution is based on patch:

  "scsi: target: tcmu: Optimize use of flush_dcache_page"

which restricts the use of tcmu_flush_dcache_range() to
addresses from vmalloc'ed areas only.

This patch now replaces the virt_to_page() call in
tcmu_flush_dcache_range() - which is wrong for vmalloced addrs -
by vmalloc_to_page().

The patch was tested on ARM with kernel 4.19.118 and 5.7.2

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: JiangYu <lnsyyj@hotmail.com>
Tested-by: Daniel Meyerholt <dxm523@gmail.com>
---
 drivers/target/target_core_user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index a65e9671ae7a..835d3001cb0e 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -601,7 +601,7 @@ static inline void tcmu_flush_dcache_range(void *vaddr, size_t size)
 	size = round_up(size+offset, PAGE_SIZE);
 
 	while (size) {
-		flush_dcache_page(virt_to_page(start));
+		flush_dcache_page(vmalloc_to_page(start));
 		start += PAGE_SIZE;
 		size -= PAGE_SIZE;
 	}
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
  2020-06-18 13:16 ` [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM Bodo Stroesser
@ 2020-06-18 15:00   ` Mike Christie
  2020-06-18 15:41     ` Bodo Stroesser
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2020-06-18 15:00 UTC (permalink / raw)
  To: Bodo Stroesser, Martin K. Petersen, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt

On 6/18/20 8:16 AM, Bodo Stroesser wrote:
> This patch fixes the following crash
> (see https://urldefense.com/v3/__https://bugzilla.kernel.org/show_bug.cgi?id 8045__;!!GqivPVa7Brio!O7JgZIL3VPAzIqwJfZjL48y8M90K3HXv3pUVeoCzZ-vXovCpx5g7xMg-z5aAiVXVtkfE$ )
> 
>   Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
>   CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic
>          #202004230533
>   Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
>   pstate: 80400005 (Nzcv daif +PAN -UAO)
>   pc : flush_dcache_page+0x18/0x40
>   lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
>   sp : ffff000015123a80
>   x29: ffff000015123a80 x28: 0000000000000000
>   x27: 0000000000001000 x26: ffff000023ea5000
>   x25: ffffcfa25bbe08b8 x24: 0000000000000078
>   x23: ffff7e0000000000 x22: ffff000023ea5001
>   x21: ffffcfa24b79c000 x20: 0000000000000fff
>   x19: ffff7e00008fa940 x18: 0000000000000000
>   x17: 0000000000000000 x16: ffff2d047e709138
>   x15: 0000000000000000 x14: 0000000000000000
>   x13: 0000000000000000 x12: ffff2d047fbd0a40
>   x11: 0000000000000000 x10: 0000000000000030
>   x9 : 0000000000000000 x8 : ffffc9a254820a00
>   x7 : 00000000000013b0 x6 : 000000000000003f
>   x5 : 0000000000000040 x4 : ffffcfa25bbe08e8
>   x3 : 0000000000001000 x2 : 0000000000000078
>   x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18
>   Call trace:
>    flush_dcache_page+0x18/0x40
>    is_ring_space_avail+0x68/0x2f8 [target_core_user]
>    queue_cmd_ring+0x1f8/0x680 [target_core_user]
>    tcmu_queue_cmd+0xe4/0x158 [target_core_user]
>    __target_execute_cmd+0x30/0xf0 [target_core_mod]
>    target_execute_cmd+0x294/0x390 [target_core_mod]
>    transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
>    transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
>    iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
>    iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
>    iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
>    iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
>    iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
>    kthread+0x130/0x138
>    ret_from_fork+0x10/0x18
>   Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)
>   ---[ end trace 1e451c73f4266776 ]---
> 
> The solution is based on patch:
> 
>    "scsi: target: tcmu: Optimize use of flush_dcache_page"
> 
> which restricts the use of tcmu_flush_dcache_range() to
> addresses from vmalloc'ed areas only.
> 
> This patch now replaces the virt_to_page() call in
> tcmu_flush_dcache_range() - which is wrong for vmalloced addrs -
> by vmalloc_to_page().
> 
> The patch was tested on ARM with kernel 4.19.118 and 5.7.2
> 
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Tested-by: JiangYu <lnsyyj@hotmail.com>
> Tested-by: Daniel Meyerholt <dxm523@gmail.com>
> ---
>   drivers/target/target_core_user.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index a65e9671ae7a..835d3001cb0e 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -601,7 +601,7 @@ static inline void tcmu_flush_dcache_range(void *vaddr, size_t size)
>   	size = round_up(size+offset, PAGE_SIZE);
>   
>   	while (size) {
> -		flush_dcache_page(virt_to_page(start));
> +		flush_dcache_page(vmalloc_to_page(start));
>   		start += PAGE_SIZE;
>   		size -= PAGE_SIZE;
>   	}

For this bug we only need to change the flush in is_ring_space_avail 
right? It's what is accessing the mb which is vmalloc'd.

Is it reccommended to call vmalloc_to_page on a page we get with 
alloc_page and is the mm then always going to do the right thing for us 
(I do not know the mm code well and just quickly scanned the 
Documentation and comments, but could not find anything), or could we 
hit similar issues where we are using the wrong call on different types 
of allocated memory down the line?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
  2020-06-18 15:00   ` Mike Christie
@ 2020-06-18 15:41     ` Bodo Stroesser
  2020-06-18 16:06       ` Mike Christie
  0 siblings, 1 reply; 9+ messages in thread
From: Bodo Stroesser @ 2020-06-18 15:41 UTC (permalink / raw)
  To: Mike Christie, Martin K. Petersen, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt

On 2020-06-18 17:00, Mike Christie wrote:
> On 6/18/20 8:16 AM, Bodo Stroesser wrote:
>> This patch fixes the following crash
>> (see 
>> https://urldefense.com/v3/__https://bugzilla.kernel.org/show_bug.cgi?id=208045__;!!GqivPVa7Brio!O7JgZIL3VPAzIqwJfZjL48y8M90K3HXv3pUVeoCzZ-vXovCpx5g7xMg-z5aAiVXVtkfE$ 
>> )
>>
>>   Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
>>   CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic
>>          #202004230533
>>   Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
>>   pstate: 80400005 (Nzcv daif +PAN -UAO)
>>   pc : flush_dcache_page+0x18/0x40
>>   lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
>>   sp : ffff000015123a80
>>   x29: ffff000015123a80 x28: 0000000000000000
>>   x27: 0000000000001000 x26: ffff000023ea5000
>>   x25: ffffcfa25bbe08b8 x24: 0000000000000078
>>   x23: ffff7e0000000000 x22: ffff000023ea5001
>>   x21: ffffcfa24b79c000 x20: 0000000000000fff
>>   x19: ffff7e00008fa940 x18: 0000000000000000
>>   x17: 0000000000000000 x16: ffff2d047e709138
>>   x15: 0000000000000000 x14: 0000000000000000
>>   x13: 0000000000000000 x12: ffff2d047fbd0a40
>>   x11: 0000000000000000 x10: 0000000000000030
>>   x9 : 0000000000000000 x8 : ffffc9a254820a00
>>   x7 : 00000000000013b0 x6 : 000000000000003f
>>   x5 : 0000000000000040 x4 : ffffcfa25bbe08e8
>>   x3 : 0000000000001000 x2 : 0000000000000078
>>   x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18
>>   Call trace:
>>    flush_dcache_page+0x18/0x40
>>    is_ring_space_avail+0x68/0x2f8 [target_core_user]
>>    queue_cmd_ring+0x1f8/0x680 [target_core_user]
>>    tcmu_queue_cmd+0xe4/0x158 [target_core_user]
>>    __target_execute_cmd+0x30/0xf0 [target_core_mod]
>>    target_execute_cmd+0x294/0x390 [target_core_mod]
>>    transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
>>    transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
>>    iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
>>    iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
>>    iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
>>    iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
>>    iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
>>    kthread+0x130/0x138
>>    ret_from_fork+0x10/0x18
>>   Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)
>>   ---[ end trace 1e451c73f4266776 ]---
>>
>> The solution is based on patch:
>>
>>    "scsi: target: tcmu: Optimize use of flush_dcache_page"
>>
>> which restricts the use of tcmu_flush_dcache_range() to
>> addresses from vmalloc'ed areas only.
>>
>> This patch now replaces the virt_to_page() call in
>> tcmu_flush_dcache_range() - which is wrong for vmalloced addrs -
>> by vmalloc_to_page().
>>
>> The patch was tested on ARM with kernel 4.19.118 and 5.7.2
>>
>> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
>> Tested-by: JiangYu <lnsyyj@hotmail.com>
>> Tested-by: Daniel Meyerholt <dxm523@gmail.com>
>> ---
>>   drivers/target/target_core_user.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/target/target_core_user.c 
>> b/drivers/target/target_core_user.c
>> index a65e9671ae7a..835d3001cb0e 100644
>> --- a/drivers/target/target_core_user.c
>> +++ b/drivers/target/target_core_user.c
>> @@ -601,7 +601,7 @@ static inline void tcmu_flush_dcache_range(void 
>> *vaddr, size_t size)
>>       size = round_up(size+offset, PAGE_SIZE);
>>       while (size) {
>> -        flush_dcache_page(virt_to_page(start));
>> +        flush_dcache_page(vmalloc_to_page(start));
>>           start += PAGE_SIZE;
>>           size -= PAGE_SIZE;
>>       }
> 
> For this bug we only need to change the flush in is_ring_space_avail 
> right? It's what is accessing the mb which is vmalloc'd.
No, is_ring_space_avail was just the first caller of
tcmu_flush_dcache_range for vmalloc'ed pages, so it crashed there.

The entiere address range exposed to userspace via uio consists of
two parts:
1) mb + command ring are vmalloc'ed in a single vzalloc call
    during initialization of a tcmu device.
2) the data area which is allocated page by page calling
    alloc_page()

The second part is handled by (scatter|gather)_data_area. For the
calls from these routines I think usage of virt_to_page in
tcmu_flush_dcache_range was fine. But patch number 1 of this
series replaced these called with direct calls to flush_dcache_page.

So all remaining calls to tcmu_flush_dcache_range after patch 1
are calls for vmalloc'ed addresses. Therefore after patch 1 we can
safely replace virt_to_page with vmalloc_to_page.

So it turned out that patch 1, which in the beginning was thought
to be an optimization, now also simplifies the crash fix.

> 
> Is it reccommended to call vmalloc_to_page on a page we get with 
> alloc_page and is the mm then always going to do the right thing for us 
> (I do not know the mm code well and just quickly scanned the 
> Documentation and comments, but could not find anything), or could we 
> hit similar issues where we are using the wrong call on different types 
> of allocated memory down the line?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM
  2020-06-18 13:16 [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM Bodo Stroesser
  2020-06-18 13:16 ` [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page Bodo Stroesser
  2020-06-18 13:16 ` [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM Bodo Stroesser
@ 2020-06-18 16:04 ` Mike Christie
  2020-06-20  3:26 ` Martin K. Petersen
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Christie @ 2020-06-18 16:04 UTC (permalink / raw)
  To: Bodo Stroesser, Martin K. Petersen, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt

On 6/18/20 8:16 AM, Bodo Stroesser wrote:
> This small series of patches consists of:
>     [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page
>     [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range
> 
> Together with commit
>     8c4e0f212398 scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range
> these patches fix crashes in tcmu on ARM.
> 
> The first patch of this series already was sent some weeks ago
> as "PATCH RFC", since it was untested at that time.
> 
> Meanwhile I added patch 2 of the series to fix the crash reported in:
>     https://urldefense.com/v3/__https://github.com/open-iscsi/tcmu-runner/issues/627__;!!GqivPVa7Brio!Nzsi6AfJJQIssE3tLed42r5EKmR0PmlYozji6dVy16oPnVKLMTCsE87jNrIgx6hasnfy$
>     https://urldefense.com/v3/__https://bugzilla.kernel.org/show_bug.cgi?id 8045__;!!GqivPVa7Brio!Nzsi6AfJJQIssE3tLed42r5EKmR0PmlYozji6dVy16oPnVKLMTCsE87jNrIgx9cTxaSo$
> 
> All three patches together were tested on ARM with kernel
> 4.19.118 and 5.7.2 (see github issue and bugzilla).
> 
> ---
> v2: sent with a valid subject line.
> 

Acked-by: Mike Christie <michael.christie@oracle.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
  2020-06-18 15:41     ` Bodo Stroesser
@ 2020-06-18 16:06       ` Mike Christie
  0 siblings, 0 replies; 9+ messages in thread
From: Mike Christie @ 2020-06-18 16:06 UTC (permalink / raw)
  To: Bodo Stroesser, Martin K. Petersen, linux-scsi, target-devel
  Cc: JiangYu, Daniel Meyerholt

On 6/18/20 10:41 AM, Bodo Stroesser wrote:
> On 2020-06-18 17:00, Mike Christie wrote:
>> On 6/18/20 8:16 AM, Bodo Stroesser wrote:
>>> This patch fixes the following crash
>>> (see 
>>> https://urldefense.com/v3/__https://bugzilla.kernel.org/show_bug.cgi?id 8045__;!!GqivPVa7Brio!O7JgZIL3VPAzIqwJfZjL48y8M90K3HXv3pUVeoCzZ-vXovCpx5g7xMg-z5aAiVXVtkfE$ 
>>> )
>>>
>>>   Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
>>>   CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic
>>>          #202004230533
>>>   Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 
>>> 2019
>>>   pstate: 80400005 (Nzcv daif +PAN -UAO)
>>>   pc : flush_dcache_page+0x18/0x40
>>>   lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
>>>   sp : ffff000015123a80
>>>   x29: ffff000015123a80 x28: 0000000000000000
>>>   x27: 0000000000001000 x26: ffff000023ea5000
>>>   x25: ffffcfa25bbe08b8 x24: 0000000000000078
>>>   x23: ffff7e0000000000 x22: ffff000023ea5001
>>>   x21: ffffcfa24b79c000 x20: 0000000000000fff
>>>   x19: ffff7e00008fa940 x18: 0000000000000000
>>>   x17: 0000000000000000 x16: ffff2d047e709138
>>>   x15: 0000000000000000 x14: 0000000000000000
>>>   x13: 0000000000000000 x12: ffff2d047fbd0a40
>>>   x11: 0000000000000000 x10: 0000000000000030
>>>   x9 : 0000000000000000 x8 : ffffc9a254820a00
>>>   x7 : 00000000000013b0 x6 : 000000000000003f
>>>   x5 : 0000000000000040 x4 : ffffcfa25bbe08e8
>>>   x3 : 0000000000001000 x2 : 0000000000000078
>>>   x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18
>>>   Call trace:
>>>    flush_dcache_page+0x18/0x40
>>>    is_ring_space_avail+0x68/0x2f8 [target_core_user]
>>>    queue_cmd_ring+0x1f8/0x680 [target_core_user]
>>>    tcmu_queue_cmd+0xe4/0x158 [target_core_user]
>>>    __target_execute_cmd+0x30/0xf0 [target_core_mod]
>>>    target_execute_cmd+0x294/0x390 [target_core_mod]
>>>    transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
>>>    transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
>>>    iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
>>>    iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
>>>    iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
>>>    iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
>>>    iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
>>>    kthread+0x130/0x138
>>>    ret_from_fork+0x10/0x18
>>>   Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)
>>>   ---[ end trace 1e451c73f4266776 ]---
>>>
>>> The solution is based on patch:
>>>
>>>    "scsi: target: tcmu: Optimize use of flush_dcache_page"
>>>
>>> which restricts the use of tcmu_flush_dcache_range() to
>>> addresses from vmalloc'ed areas only.
>>>
>>> This patch now replaces the virt_to_page() call in
>>> tcmu_flush_dcache_range() - which is wrong for vmalloced addrs -
>>> by vmalloc_to_page().
>>>
>>> The patch was tested on ARM with kernel 4.19.118 and 5.7.2
>>>
>>> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
>>> Tested-by: JiangYu <lnsyyj@hotmail.com>
>>> Tested-by: Daniel Meyerholt <dxm523@gmail.com>
>>> ---
>>>   drivers/target/target_core_user.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/target/target_core_user.c 
>>> b/drivers/target/target_core_user.c
>>> index a65e9671ae7a..835d3001cb0e 100644
>>> --- a/drivers/target/target_core_user.c
>>> +++ b/drivers/target/target_core_user.c
>>> @@ -601,7 +601,7 @@ static inline void tcmu_flush_dcache_range(void 
>>> *vaddr, size_t size)
>>>       size = round_up(size+offset, PAGE_SIZE);
>>>       while (size) {
>>> -        flush_dcache_page(virt_to_page(start));
>>> +        flush_dcache_page(vmalloc_to_page(start));
>>>           start += PAGE_SIZE;
>>>           size -= PAGE_SIZE;
>>>       }
>>
>> For this bug we only need to change the flush in is_ring_space_avail 
>> right? It's what is accessing the mb which is vmalloc'd.
> No, is_ring_space_avail was just the first caller of
> tcmu_flush_dcache_range for vmalloc'ed pages, so it crashed there.
> 
> The entiere address range exposed to userspace via uio consists of
> two parts:
> 1) mb + command ring are vmalloc'ed in a single vzalloc call
>     during initialization of a tcmu device.
> 2) the data area which is allocated page by page calling
>     alloc_page()
> 
> The second part is handled by (scatter|gather)_data_area. For the
> calls from these routines I think usage of virt_to_page in
> tcmu_flush_dcache_range was fine. But patch number 1 of this
> series replaced these called with direct calls to flush_dcache_page.

That was my problem. Did not see the replacement. Looks good to me.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM
  2020-06-18 13:16 [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM Bodo Stroesser
                   ` (2 preceding siblings ...)
  2020-06-18 16:04 ` [PATCH v2 0/2] scsi: target: tcmu: fix crashes " Mike Christie
@ 2020-06-20  3:26 ` Martin K. Petersen
  2020-08-28  8:43   ` Bodo Stroesser
  3 siblings, 1 reply; 9+ messages in thread
From: Martin K. Petersen @ 2020-06-20  3:26 UTC (permalink / raw)
  To: linux-scsi, target-devel, Mike Christie, Bodo Stroesser
  Cc: Martin K . Petersen, JiangYu, Daniel Meyerholt

On Thu, 18 Jun 2020 15:16:30 +0200, Bodo Stroesser wrote:

> This small series of patches consists of:
>    [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page
>    [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range
> 
> Together with commit
>    8c4e0f212398 scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range
> these patches fix crashes in tcmu on ARM.
> 
> [...]

Applied to 5.9/scsi-queue, thanks!

[1/2] scsi: target: tcmu: Optimize use of flush_dcache_page
      https://git.kernel.org/mkp/scsi/c/3c58f737231e
[2/2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
      https://git.kernel.org/mkp/scsi/c/3145550a7f8b

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM
  2020-06-20  3:26 ` Martin K. Petersen
@ 2020-08-28  8:43   ` Bodo Stroesser
  0 siblings, 0 replies; 9+ messages in thread
From: Bodo Stroesser @ 2020-08-28  8:43 UTC (permalink / raw)
  To: Martin K. Petersen, linux-scsi, target-devel, Mike Christie, stable
  Cc: JiangYu, Daniel Meyerholt

Hi,

I'm adding stable@vger.kernel.org

On 2020-06-20 05:26, Martin K. Petersen wrote:
> On Thu, 18 Jun 2020 15:16:30 +0200, Bodo Stroesser wrote:
> 
>> This small series of patches consists of:
>>     [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page
>>     [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range
>>
>> Together with commit
>>     8c4e0f212398 scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range
>> these patches fix crashes in tcmu on ARM.
>>
>> [...]
> 
> Applied to 5.9/scsi-queue, thanks!
> 
> [1/2] scsi: target: tcmu: Optimize use of flush_dcache_page
>        https://git.kernel.org/mkp/scsi/c/3c58f737231e
> [2/2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
>        https://git.kernel.org/mkp/scsi/c/3145550a7f8b
> 

Patch 2/2 of this series already made it into 5.8, (5.7,) 5.4 and 4.19,
but patch 1/2 was not added yet. The crash will be fixed with with both
patches only. So please consider adding patch 1/2 also. The full commit
is 3c58f737231e2c8cbf543a09d84d8c8e80e05e43

Thank you
Bodo

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-08-28  8:43 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18 13:16 [PATCH v2 0/2] scsi: target: tcmu: fix crashes on ARM Bodo Stroesser
2020-06-18 13:16 ` [PATCH 1/2 v2] scsi: target: tcmu: Optimize use of flush_dcache_page Bodo Stroesser
2020-06-18 13:16 ` [PATCH 2/2 v2] scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM Bodo Stroesser
2020-06-18 15:00   ` Mike Christie
2020-06-18 15:41     ` Bodo Stroesser
2020-06-18 16:06       ` Mike Christie
2020-06-18 16:04 ` [PATCH v2 0/2] scsi: target: tcmu: fix crashes " Mike Christie
2020-06-20  3:26 ` Martin K. Petersen
2020-08-28  8:43   ` Bodo Stroesser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).