linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
@ 2020-07-30 22:07 Samuel Holland
  2020-07-31  7:29 ` Arnd Bergmann
  0 siblings, 1 reply; 7+ messages in thread
From: Samuel Holland @ 2020-07-30 22:07 UTC (permalink / raw)
  To: Adam Radford, James E.J. Bottomley, Martin K. Petersen
  Cc: Arnd Bergmann, linux-scsi, linux-kernel, Samuel Holland

The main issue observed was at the call to scsi_set_resid, where the
byteswapped parameter would eventually trigger the alignment check at
drivers/scsi/sd.c:2009. At that point, the kernel would continuously
complain about an "Unaligned partial completion", and no further I/O
could occur.

This gets the controller working on big endian powerpc64.

Signed-off-by: Samuel Holland <samuel@sholland.org>
---

Changes since v1:
 - Include changes to use __le?? types in command structures
 - Use an object literal for the intermediate "schedulertime" value
 - Use local "error" variable to avoid repeated byte swapping
 - Create a local "length" variable to avoid very long lines
 - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines

I verified this patch with `make C=1`, and there were no warnings from
these files.

---
 drivers/scsi/3w-9xxx.c |  56 +++++++++------------
 drivers/scsi/3w-9xxx.h | 112 ++++++++++++++++++++++-------------------
 2 files changed, 85 insertions(+), 83 deletions(-)

diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index 3337b1e80412..8f56beefa338 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -303,10 +303,10 @@ static int twa_aen_drain_queue(TW_Device_Extension *tw_dev, int no_check_reset)
 
 	/* Initialize sglist */
 	memset(&sglist, 0, sizeof(TW_SG_Entry));
-	sglist[0].length = TW_SECTOR_SIZE;
-	sglist[0].address = tw_dev->generic_buffer_phys[request_id];
+	sglist[0].length = cpu_to_le32(TW_SECTOR_SIZE);
+	sglist[0].address = TW_CPU_TO_SGL(tw_dev->generic_buffer_phys[request_id]);
 
-	if (sglist[0].address & TW_ALIGNMENT_9000_SGL) {
+	if (tw_dev->generic_buffer_phys[request_id] & TW_ALIGNMENT_9000_SGL) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1, "Found unaligned address during AEN drain");
 		goto out;
 	}
@@ -440,8 +440,8 @@ static int twa_aen_read_queue(TW_Device_Extension *tw_dev, int request_id)
 
 	/* Initialize sglist */
 	memset(&sglist, 0, sizeof(TW_SG_Entry));
-	sglist[0].length = TW_SECTOR_SIZE;
-	sglist[0].address = tw_dev->generic_buffer_phys[request_id];
+	sglist[0].length = cpu_to_le32(TW_SECTOR_SIZE);
+	sglist[0].address = TW_CPU_TO_SGL(tw_dev->generic_buffer_phys[request_id]);
 
 	/* Mark internal command */
 	tw_dev->srb[request_id] = NULL;
@@ -501,9 +501,8 @@ static void twa_aen_sync_time(TW_Device_Extension *tw_dev, int request_id)
            Sunday 12:00AM */
 	local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * 60));
 	div_u64_rem(local_time - (3 * 86400), 604800, &schedulertime);
-	schedulertime = cpu_to_le32(schedulertime % 604800);
 
-	memcpy(param->data, &schedulertime, sizeof(u32));
+	memcpy(param->data, &(__le32){cpu_to_le32(schedulertime)}, sizeof(__le32));
 
 	/* Mark internal command */
 	tw_dev->srb[request_id] = NULL;
@@ -1000,19 +999,13 @@ static int twa_fill_sense(TW_Device_Extension *tw_dev, int request_id, int copy_
 		if (print_host)
 			printk(KERN_WARNING "3w-9xxx: scsi%d: ERROR: (0x%02X:0x%04X): %s:%s.\n",
 			       tw_dev->host->host_no,
-			       TW_MESSAGE_SOURCE_CONTROLLER_ERROR,
-			       full_command_packet->header.status_block.error,
-			       error_str[0] == '\0' ?
-			       twa_string_lookup(twa_error_table,
-						 full_command_packet->header.status_block.error) : error_str,
+			       TW_MESSAGE_SOURCE_CONTROLLER_ERROR, error,
+			       error_str[0] ? error_str : twa_string_lookup(twa_error_table, error),
 			       full_command_packet->header.err_specific_desc);
 		else
 			printk(KERN_WARNING "3w-9xxx: ERROR: (0x%02X:0x%04X): %s:%s.\n",
-			       TW_MESSAGE_SOURCE_CONTROLLER_ERROR,
-			       full_command_packet->header.status_block.error,
-			       error_str[0] == '\0' ?
-			       twa_string_lookup(twa_error_table,
-						 full_command_packet->header.status_block.error) : error_str,
+			       TW_MESSAGE_SOURCE_CONTROLLER_ERROR, error,
+			       error_str[0] ? error_str : twa_string_lookup(twa_error_table, error),
 			       full_command_packet->header.err_specific_desc);
 	}
 
@@ -1129,12 +1122,11 @@ static int twa_initconnection(TW_Device_Extension *tw_dev, int message_credits,
 	tw_initconnect->opcode__reserved = TW_OPRES_IN(0, TW_OP_INIT_CONNECTION);
 	tw_initconnect->request_id = request_id;
 	tw_initconnect->message_credits = cpu_to_le16(message_credits);
-	tw_initconnect->features = set_features;
 
 	/* Turn on 64-bit sgl support if we need to */
-	tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;
+	set_features |= sizeof(dma_addr_t) > 4 ? 1 : 0;
 
-	tw_initconnect->features = cpu_to_le32(tw_initconnect->features);
+	tw_initconnect->features = cpu_to_le32(set_features);
 
 	if (set_features & TW_EXTENDED_INIT_CONNECT) {
 		tw_initconnect->size = TW_INIT_COMMAND_PACKET_SIZE_EXTENDED;
@@ -1347,8 +1339,10 @@ static irqreturn_t twa_interrupt(int irq, void *dev_instance)
 
 				/* Report residual bytes for single sgl */
 				if ((scsi_sg_count(cmd) <= 1) && (full_command_packet->command.newcommand.status == 0)) {
-					if (full_command_packet->command.newcommand.sg_list[0].length < scsi_bufflen(tw_dev->srb[request_id]))
-						scsi_set_resid(cmd, scsi_bufflen(cmd) - full_command_packet->command.newcommand.sg_list[0].length);
+					u32 length = le32_to_cpu(full_command_packet->command.newcommand.sg_list[0].length);
+
+					if (length < scsi_bufflen(cmd))
+						scsi_set_resid(cmd, scsi_bufflen(cmd) - length);
 				}
 
 				/* Now complete the io */
@@ -1390,13 +1384,13 @@ static void twa_load_sgl(TW_Device_Extension *tw_dev, TW_Command_Full *full_comm
 	if (TW_OP_OUT(full_command_packet->command.newcommand.opcode__reserved) == TW_OP_EXECUTE_SCSI) {
 		newcommand = &full_command_packet->command.newcommand;
 		newcommand->request_id__lunl =
-			cpu_to_le16(TW_REQ_LUN_IN(TW_LUN_OUT(newcommand->request_id__lunl), request_id));
+			TW_REQ_LUN_IN(TW_LUN_OUT(newcommand->request_id__lunl), request_id);
 		if (length) {
 			newcommand->sg_list[0].address = TW_CPU_TO_SGL(dma_handle + sizeof(TW_Ioctl_Buf_Apache) - 1);
 			newcommand->sg_list[0].length = cpu_to_le32(length);
 		}
 		newcommand->sgl_entries__lunh =
-			cpu_to_le16(TW_REQ_LUN_IN(TW_LUN_OUT(newcommand->sgl_entries__lunh), length ? 1 : 0));
+			TW_REQ_LUN_IN(TW_LUN_OUT(newcommand->sgl_entries__lunh), length ? 1 : 0);
 	} else {
 		oldcommand = &full_command_packet->command.oldcommand;
 		oldcommand->request_id = request_id;
@@ -1837,10 +1831,10 @@ static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
 	if (srb) {
 		command_packet->unit = srb->device->id;
 		command_packet->request_id__lunl =
-			cpu_to_le16(TW_REQ_LUN_IN(srb->device->lun, request_id));
+			TW_REQ_LUN_IN(srb->device->lun, request_id);
 	} else {
 		command_packet->request_id__lunl =
-			cpu_to_le16(TW_REQ_LUN_IN(0, request_id));
+			TW_REQ_LUN_IN(0, request_id);
 		command_packet->unit = 0;
 	}
 
@@ -1872,19 +1866,19 @@ static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
 					}
 				}
 			}
-			command_packet->sgl_entries__lunh = cpu_to_le16(TW_REQ_LUN_IN((srb->device->lun >> 4), scsi_sg_count(tw_dev->srb[request_id])));
+			command_packet->sgl_entries__lunh = TW_REQ_LUN_IN((srb->device->lun >> 4), scsi_sg_count(tw_dev->srb[request_id]));
 		}
 	} else {
 		/* Internal cdb post */
 		for (i = 0; i < use_sg; i++) {
-			command_packet->sg_list[i].address = TW_CPU_TO_SGL(sglistarg[i].address);
-			command_packet->sg_list[i].length = cpu_to_le32(sglistarg[i].length);
+			command_packet->sg_list[i].address = sglistarg[i].address;
+			command_packet->sg_list[i].length = sglistarg[i].length;
 			if (command_packet->sg_list[i].address & TW_CPU_TO_SGL(TW_ALIGNMENT_9000_SGL)) {
 				TW_PRINTK(tw_dev->host, TW_DRIVER, 0x2f, "Found unaligned sgl address during internal post");
 				goto out;
 			}
 		}
-		command_packet->sgl_entries__lunh = cpu_to_le16(TW_REQ_LUN_IN(0, use_sg));
+		command_packet->sgl_entries__lunh = TW_REQ_LUN_IN(0, use_sg);
 	}
 
 	if (srb) {
@@ -2109,7 +2103,7 @@ static int twa_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
 				     TW_PARAM_FWVER, TW_PARAM_FWVER_LENGTH),
 	       (char *)twa_get_param(tw_dev, 1, TW_VERSION_TABLE,
 				     TW_PARAM_BIOSVER, TW_PARAM_BIOSVER_LENGTH),
-	       le32_to_cpu(*(int *)twa_get_param(tw_dev, 2, TW_INFORMATION_TABLE,
+	       le32_to_cpu(*(__le32 *)twa_get_param(tw_dev, 2, TW_INFORMATION_TABLE,
 				     TW_PARAM_PORTCOUNT, TW_PARAM_PORTCOUNT_LENGTH)));
 
 	/* Try to enable MSI */
diff --git a/drivers/scsi/3w-9xxx.h b/drivers/scsi/3w-9xxx.h
index d88cd3499bd5..d258bc8fe9b0 100644
--- a/drivers/scsi/3w-9xxx.h
+++ b/drivers/scsi/3w-9xxx.h
@@ -434,8 +434,8 @@ static twa_message_type twa_error_table[] = {
 #define TW_RESID_OUT(x) ((x >> 4) & 0xff)
 
 /* request_id: 12, lun: 4 */
-#define TW_REQ_LUN_IN(lun, request_id) (((lun << 12) & 0xf000) | (request_id & 0xfff))
-#define TW_LUN_OUT(lun) ((lun >> 12) & 0xf)
+#define TW_REQ_LUN_IN(lun, request_id) cpu_to_le16(((lun << 12) & 0xf000) | (request_id & 0xfff))
+#define TW_LUN_OUT(lun) ((le16_to_cpu(lun) >> 12) & 0xf)
 
 /* Macros */
 #define TW_CONTROL_REG_ADDR(x) (x->base_addr)
@@ -469,70 +469,78 @@ printk(KERN_WARNING "3w-9xxx: ERROR: (0x%02X:0x%04X): %s.\n",a,b,c); \
 #define TW_APACHE_MAX_SGL_LENGTH (sizeof(dma_addr_t) > 4 ? 72 : 109)
 #define TW_ESCALADE_MAX_SGL_LENGTH (sizeof(dma_addr_t) > 4 ? 41 : 62)
 #define TW_PADDING_LENGTH (sizeof(dma_addr_t) > 4 ? 8 : 0)
-#define TW_CPU_TO_SGL(x) (sizeof(dma_addr_t) > 4 ? cpu_to_le64(x) : cpu_to_le32(x))
+#if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
+#define TW_CPU_TO_SGL(x) cpu_to_le64(x)
+#else
+#define TW_CPU_TO_SGL(x) cpu_to_le32(x)
+#endif
 
 #pragma pack(1)
 
 /* Scatter Gather List Entry */
 typedef struct TAG_TW_SG_Entry {
-	dma_addr_t address;
-	u32 length;
+#if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
+	__le64	address;
+#else
+	__le32	address;
+#endif
+	__le32	length;
 } TW_SG_Entry;
 
 /* Command Packet */
 typedef struct TW_Command {
-	unsigned char opcode__sgloffset;
-	unsigned char size;
-	unsigned char request_id;
-	unsigned char unit__hostid;
+	u8	opcode__sgloffset;
+	u8	size;
+	u8	request_id;
+	u8	unit__hostid;
 	/* Second DWORD */
-	unsigned char status;
-	unsigned char flags;
+	u8	status;
+	u8	flags;
 	union {
-		unsigned short block_count;
-		unsigned short parameter_count;
+		__le16	block_count;
+		__le16	parameter_count;
 	} byte6_offset;
 	union {
 		struct {
-			u32 lba;
-			TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
-			dma_addr_t padding;
+			__le32		lba;
+			TW_SG_Entry	sgl[TW_ESCALADE_MAX_SGL_LENGTH];
+			dma_addr_t	padding;
 		} io;
 		struct {
-			TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
-			u32 padding;
-			dma_addr_t padding2;
+			TW_SG_Entry	sgl[TW_ESCALADE_MAX_SGL_LENGTH];
+			__le32		padding;
+			dma_addr_t	padding2;
 		} param;
 	} byte8_offset;
 } TW_Command;
 
 /* Command Packet for 9000+ controllers */
 typedef struct TAG_TW_Command_Apache {
-	unsigned char opcode__reserved;
-	unsigned char unit;
-	unsigned short request_id__lunl;
-	unsigned char status;
-	unsigned char sgl_offset;
-	unsigned short sgl_entries__lunh;
-	unsigned char cdb[16];
+	u8	opcode__reserved;
+	u8	unit;
+	__le16	request_id__lunl;
+	u8	status;
+	u8	sgl_offset;
+	__le16	sgl_entries__lunh;
+	u8	cdb[16];
 	TW_SG_Entry sg_list[TW_APACHE_MAX_SGL_LENGTH];
-	unsigned char padding[TW_PADDING_LENGTH];
+	u8	padding[TW_PADDING_LENGTH];
 } TW_Command_Apache;
 
 /* New command packet header */
 typedef struct TAG_TW_Command_Apache_Header {
 	unsigned char sense_data[TW_SENSE_DATA_LENGTH];
 	struct {
-		char reserved[4];
-		unsigned short error;
-		unsigned char padding;
-		unsigned char severity__reserved;
+		u8	reserved[4];
+		__le16	error;
+		u8	padding;
+		u8	severity__reserved;
 	} status_block;
 	unsigned char err_specific_desc[98];
 	struct {
-		unsigned char size_header;
-		unsigned short reserved;
-		unsigned char size_sense;
+		u8	size_header;
+		__le16	reserved;
+		u8	size_sense;
 	} header_desc;
 } TW_Command_Apache_Header;
 
@@ -547,19 +555,19 @@ typedef struct TAG_TW_Command_Full {
 
 /* Initconnection structure */
 typedef struct TAG_TW_Initconnect {
-	unsigned char opcode__reserved;
-	unsigned char size;
-	unsigned char request_id;
-	unsigned char res2;
-	unsigned char status;
-	unsigned char flags;
-	unsigned short message_credits;
-	u32 features;
-	unsigned short fw_srl;
-	unsigned short fw_arch_id;
-	unsigned short fw_branch;
-	unsigned short fw_build;
-	u32 result;
+	u8	opcode__reserved;
+	u8	size;
+	u8	request_id;
+	u8	res2;
+	u8	status;
+	u8	flags;
+	__le16	message_credits;
+	__le32	features;
+	__le16	fw_srl;
+	__le16	fw_arch_id;
+	__le16	fw_branch;
+	__le16	fw_build;
+	__le32	result;
 } TW_Initconnect;
 
 /* Event info structure */
@@ -600,11 +608,11 @@ typedef struct TAG_TW_Lock {
 
 /* GetParam descriptor */
 typedef struct {
-	unsigned short	table_id;
-	unsigned short	parameter_id;
-	unsigned short	parameter_size_bytes;
-	unsigned short  actual_parameter_size_bytes;
-	unsigned char	data[1];
+	__le16	table_id;
+	__le16	parameter_id;
+	__le16	parameter_size_bytes;
+	__le16  actual_parameter_size_bytes;
+	u8	data[1];
 } TW_Param_Apache, *PTW_Param_Apache;
 
 /* Response queue */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-07-30 22:07 [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse Samuel Holland
@ 2020-07-31  7:29 ` Arnd Bergmann
  2020-08-03  3:43   ` Samuel Holland
  0 siblings, 1 reply; 7+ messages in thread
From: Arnd Bergmann @ 2020-07-31  7:29 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland <samuel@sholland.org> wrote:
>
> The main issue observed was at the call to scsi_set_resid, where the
> byteswapped parameter would eventually trigger the alignment check at
> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
> complain about an "Unaligned partial completion", and no further I/O
> could occur.
>
> This gets the controller working on big endian powerpc64.
>
> Signed-off-by: Samuel Holland <samuel@sholland.org>
> ---
>
> Changes since v1:
>  - Include changes to use __le?? types in command structures
>  - Use an object literal for the intermediate "schedulertime" value
>  - Use local "error" variable to avoid repeated byte swapping
>  - Create a local "length" variable to avoid very long lines
>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>

Looks much better, thanks for the update. I see one more issue here
>  /* Command Packet */
>  typedef struct TW_Command {
> -       unsigned char opcode__sgloffset;
> -       unsigned char size;
> -       unsigned char request_id;
> -       unsigned char unit__hostid;
> +       u8      opcode__sgloffset;
> +       u8      size;
> +       u8      request_id;
> +       u8      unit__hostid;
>         /* Second DWORD */
> -       unsigned char status;
> -       unsigned char flags;
> +       u8      status;
> +       u8      flags;
>         union {
> -               unsigned short block_count;
> -               unsigned short parameter_count;
> +               __le16  block_count;
> +               __le16  parameter_count;
>         } byte6_offset;
>         union {
>                 struct {
> -                       u32 lba;
> -                       TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> -                       dma_addr_t padding;
> +                       __le32          lba;
> +                       TW_SG_Entry     sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> +                       dma_addr_t      padding;


The use of dma_addr_t here seems odd, since this is neither endian-safe nor
fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
a variable-length fixed-endian word. I guess there is a chance that this is
correct, but it is really confusing. On top of that, it seems that there is
implied padding in the structure when built with a 64-bit dma_addr_t
on most architectures but not on x86-32 (which uses 32-bit alignment for
64-bit integers). I don't know what the hardware definition is for TW_Command,
but ideally this would be expressed using only fixed-endian fixed-length
members and explicit padding.

    Arnd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-07-31  7:29 ` Arnd Bergmann
@ 2020-08-03  3:43   ` Samuel Holland
  2020-08-03 14:02     ` Arnd Bergmann
  0 siblings, 1 reply; 7+ messages in thread
From: Samuel Holland @ 2020-08-03  3:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On 7/31/20 2:29 AM, Arnd Bergmann wrote:
> On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland <samuel@sholland.org> wrote:
>>
>> The main issue observed was at the call to scsi_set_resid, where the
>> byteswapped parameter would eventually trigger the alignment check at
>> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
>> complain about an "Unaligned partial completion", and no further I/O
>> could occur.
>>
>> This gets the controller working on big endian powerpc64.
>>
>> Signed-off-by: Samuel Holland <samuel@sholland.org>
>> ---
>>
>> Changes since v1:
>>  - Include changes to use __le?? types in command structures
>>  - Use an object literal for the intermediate "schedulertime" value
>>  - Use local "error" variable to avoid repeated byte swapping
>>  - Create a local "length" variable to avoid very long lines
>>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>>
> 
> Looks much better, thanks for the update. I see one more issue here
>>  /* Command Packet */
>>  typedef struct TW_Command {
>> -       unsigned char opcode__sgloffset;
>> -       unsigned char size;
>> -       unsigned char request_id;
>> -       unsigned char unit__hostid;
>> +       u8      opcode__sgloffset;
>> +       u8      size;
>> +       u8      request_id;
>> +       u8      unit__hostid;
>>         /* Second DWORD */
>> -       unsigned char status;
>> -       unsigned char flags;
>> +       u8      status;
>> +       u8      flags;
>>         union {
>> -               unsigned short block_count;
>> -               unsigned short parameter_count;
>> +               __le16  block_count;
>> +               __le16  parameter_count;
>>         } byte6_offset;
>>         union {
>>                 struct {
>> -                       u32 lba;
>> -                       TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> -                       dma_addr_t padding;
>> +                       __le32          lba;
>> +                       TW_SG_Entry     sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> +                       dma_addr_t      padding;
> 
> 
> The use of dma_addr_t here seems odd, since this is neither endian-safe nor
> fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
> a variable-length fixed-endian word. I guess there is a chance that this is
> correct, but it is really confusing. On top of that, it seems that there is
> implied padding in the structure when built with a 64-bit dma_addr_t
> on most architectures but not on x86-32 (which uses 32-bit alignment for
> 64-bit integers). I don't know what the hardware definition is for TW_Command,
> but ideally this would be expressed using only fixed-endian fixed-length
> members and explicit padding.

All of the command structures are packed, due to the "#pragma pack(1)" earlier
in the file. So alignment is not an issue. This dma_addr_t member _is_ the
explicit padding to make sizeof(TW_Command) -
sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
indeed the structure is expected to be a different size depending on
sizeof(dma_addr_t).

I left the padding member alone to avoid the #ifdef; since it's never accessed,
the endianness doesn't matter. In fact, since in both cases it's at the end of
the structure, it could probably be removed entirely. I don't see
sizeof(TW_Command) being used anywhere, but I'm not 100% certain. The downside
of removing it would be TW_COMMAND_SIZE becoming a slightly more magic number.

Regards,
Samuel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-08-03  3:43   ` Samuel Holland
@ 2020-08-03 14:02     ` Arnd Bergmann
  2020-08-05  1:45       ` Samuel Holland
  0 siblings, 1 reply; 7+ messages in thread
From: Arnd Bergmann @ 2020-08-03 14:02 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@sholland.org> wrote:
>
> On 7/31/20 2:29 AM, Arnd Bergmann wrote:
> > On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland <samuel@sholland.org> wrote:
> >>
> >> The main issue observed was at the call to scsi_set_resid, where the
> >> byteswapped parameter would eventually trigger the alignment check at
> >> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
> >> complain about an "Unaligned partial completion", and no further I/O
> >> could occur.
> >>
> >> This gets the controller working on big endian powerpc64.
> >>
> >> Signed-off-by: Samuel Holland <samuel@sholland.org>
> >> ---
> >>
> >> Changes since v1:
> >>  - Include changes to use __le?? types in command structures
> >>  - Use an object literal for the intermediate "schedulertime" value
> >>  - Use local "error" variable to avoid repeated byte swapping
> >>  - Create a local "length" variable to avoid very long lines
> >>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
> >>
> >
> > Looks much better, thanks for the update. I see one more issue here
> >>  /* Command Packet */
> >>  typedef struct TW_Command {
> >> -       unsigned char opcode__sgloffset;
> >> -       unsigned char size;
> >> -       unsigned char request_id;
> >> -       unsigned char unit__hostid;
> >> +       u8      opcode__sgloffset;
> >> +       u8      size;
> >> +       u8      request_id;
> >> +       u8      unit__hostid;
> >>         /* Second DWORD */
> >> -       unsigned char status;
> >> -       unsigned char flags;
> >> +       u8      status;
> >> +       u8      flags;
> >>         union {
> >> -               unsigned short block_count;
> >> -               unsigned short parameter_count;
> >> +               __le16  block_count;
> >> +               __le16  parameter_count;
> >>         } byte6_offset;
> >>         union {
> >>                 struct {
> >> -                       u32 lba;
> >> -                       TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> >> -                       dma_addr_t padding;
> >> +                       __le32          lba;
> >> +                       TW_SG_Entry     sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> >> +                       dma_addr_t      padding;
> >
> >
> > The use of dma_addr_t here seems odd, since this is neither endian-safe nor
> > fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
> > a variable-length fixed-endian word. I guess there is a chance that this is
> > correct, but it is really confusing. On top of that, it seems that there is
> > implied padding in the structure when built with a 64-bit dma_addr_t
> > on most architectures but not on x86-32 (which uses 32-bit alignment for
> > 64-bit integers). I don't know what the hardware definition is for TW_Command,
> > but ideally this would be expressed using only fixed-endian fixed-length
> > members and explicit padding.
>
> All of the command structures are packed, due to the "#pragma pack(1)" earlier
> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
> explicit padding to make sizeof(TW_Command) -
> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
> indeed the structure is expected to be a different size depending on
> sizeof(dma_addr_t).

Ah, so only the first few members are accessed by hardware and the
last union is only accessed by the OS then? In that case I suppose it is
all fine, but I would also suggest removing the "#pragma packed"
to get somewhat more efficient access on systems that have  problems
with misaligned accesses.

      Arnd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-08-03 14:02     ` Arnd Bergmann
@ 2020-08-05  1:45       ` Samuel Holland
  2020-08-05  7:17         ` Arnd Bergmann
  0 siblings, 1 reply; 7+ messages in thread
From: Samuel Holland @ 2020-08-05  1:45 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On 8/3/20 9:02 AM, Arnd Bergmann wrote:
> On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@sholland.org> wrote:
>> On 7/31/20 2:29 AM, Arnd Bergmann wrote:
>>> On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland <samuel@sholland.org> wrote:
>>>>
>>>> The main issue observed was at the call to scsi_set_resid, where the
>>>> byteswapped parameter would eventually trigger the alignment check at
>>>> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
>>>> complain about an "Unaligned partial completion", and no further I/O
>>>> could occur.
>>>>
>>>> This gets the controller working on big endian powerpc64.
>>>>
>>>> Signed-off-by: Samuel Holland <samuel@sholland.org>
>>>> ---
>>>>
>>>> Changes since v1:
>>>>  - Include changes to use __le?? types in command structures
>>>>  - Use an object literal for the intermediate "schedulertime" value
>>>>  - Use local "error" variable to avoid repeated byte swapping
>>>>  - Create a local "length" variable to avoid very long lines
>>>>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>>>>
>>>
>>> Looks much better, thanks for the update. I see one more issue here
>>>>  /* Command Packet */
>>>>  typedef struct TW_Command {
>>>> -       unsigned char opcode__sgloffset;
>>>> -       unsigned char size;
>>>> -       unsigned char request_id;
>>>> -       unsigned char unit__hostid;
>>>> +       u8      opcode__sgloffset;
>>>> +       u8      size;
>>>> +       u8      request_id;
>>>> +       u8      unit__hostid;
>>>>         /* Second DWORD */
>>>> -       unsigned char status;
>>>> -       unsigned char flags;
>>>> +       u8      status;
>>>> +       u8      flags;
>>>>         union {
>>>> -               unsigned short block_count;
>>>> -               unsigned short parameter_count;
>>>> +               __le16  block_count;
>>>> +               __le16  parameter_count;
>>>>         } byte6_offset;
>>>>         union {
>>>>                 struct {
>>>> -                       u32 lba;
>>>> -                       TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>>>> -                       dma_addr_t padding;
>>>> +                       __le32          lba;
>>>> +                       TW_SG_Entry     sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>>>> +                       dma_addr_t      padding;
>>>
>>>
>>> The use of dma_addr_t here seems odd, since this is neither endian-safe nor
>>> fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
>>> a variable-length fixed-endian word. I guess there is a chance that this is
>>> correct, but it is really confusing. On top of that, it seems that there is
>>> implied padding in the structure when built with a 64-bit dma_addr_t
>>> on most architectures but not on x86-32 (which uses 32-bit alignment for
>>> 64-bit integers). I don't know what the hardware definition is for TW_Command,
>>> but ideally this would be expressed using only fixed-endian fixed-length
>>> members and explicit padding.
>>
>> All of the command structures are packed, due to the "#pragma pack(1)" earlier
>> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
>> explicit padding to make sizeof(TW_Command) -
>> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
>> indeed the structure is expected to be a different size depending on
>> sizeof(dma_addr_t).
> 
> Ah, so only the first few members are accessed by hardware and the
> last union is only accessed by the OS then? In that case I suppose it is
> all fine, but I would also suggest removing the "#pragma packed"
> to get somewhat more efficient access on systems that have  problems
> with misaligned accesses.

I don't know what part the hardware accesses; everything I know about the
hardware comes from reading the driver.

The problem with removing the "#pragma pack(1)" is that the structure is
inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
with a __le64.

Regards,
Samuel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-08-05  1:45       ` Samuel Holland
@ 2020-08-05  7:17         ` Arnd Bergmann
  2020-08-09  0:16           ` Samuel Holland
  0 siblings, 1 reply; 7+ messages in thread
From: Arnd Bergmann @ 2020-08-05  7:17 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland <samuel@sholland.org> wrote:
> On 8/3/20 9:02 AM, Arnd Bergmann wrote:
> > On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@sholland.org> wrote:
> >> All of the command structures are packed, due to the "#pragma pack(1)" earlier
> >> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
> >> explicit padding to make sizeof(TW_Command) -
> >> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
> >> indeed the structure is expected to be a different size depending on
> >> sizeof(dma_addr_t).
> >
> > Ah, so only the first few members are accessed by hardware and the
> > last union is only accessed by the OS then? In that case I suppose it is
> > all fine, but I would also suggest removing the "#pragma packed"
> > to get somewhat more efficient access on systems that have  problems
> > with misaligned accesses.
>
> I don't know what part the hardware accesses; everything I know about the
> hardware comes from reading the driver.

I see now from your explanation below that this is a hardware-defined
structure. I was confused by how it can be either 32 or 64 bits wide but
found the

tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;

line now that tells the hardware about which format is used.

> The problem with removing the "#pragma pack(1)" is that the structure is
> inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
> with a __le64.

I think a fairly clean way to handle this would be to remove the pragma
and instead define a local type like

#if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
typedef  __le64 twa_address_t __packed;
#else
typedef __le32 twa_addr_t;
#endif

The problem with marking the entire structure as packed, rather than
just individual members is that you end up with very inefficient bytewise
access on some architectures (especially those without cache-coherent
DMA or hardware unaligned access in the CPU), so I would recommend
avoiding that in portable driver code.

      Arnd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse
  2020-08-05  7:17         ` Arnd Bergmann
@ 2020-08-09  0:16           ` Samuel Holland
  0 siblings, 0 replies; 7+ messages in thread
From: Samuel Holland @ 2020-08-09  0:16 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Adam Radford, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel

On 8/5/20 2:17 AM, Arnd Bergmann wrote:
> On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland <samuel@sholland.org> wrote:
>> On 8/3/20 9:02 AM, Arnd Bergmann wrote:
>>> On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@sholland.org> wrote:
>>>> All of the command structures are packed, due to the "#pragma pack(1)" earlier
>>>> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
>>>> explicit padding to make sizeof(TW_Command) -
>>>> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
>>>> indeed the structure is expected to be a different size depending on
>>>> sizeof(dma_addr_t).
>>>
>>> Ah, so only the first few members are accessed by hardware and the
>>> last union is only accessed by the OS then? In that case I suppose it is
>>> all fine, but I would also suggest removing the "#pragma packed"
>>> to get somewhat more efficient access on systems that have  problems
>>> with misaligned accesses.
>>
>> I don't know what part the hardware accesses; everything I know about the
>> hardware comes from reading the driver.
> 
> I see now from your explanation below that this is a hardware-defined
> structure. I was confused by how it can be either 32 or 64 bits wide but
> found the
> 
> tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;
> 
> line now that tells the hardware about which format is used.
> 
>> The problem with removing the "#pragma pack(1)" is that the structure is
>> inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
>> with a __le64.
> 
> I think a fairly clean way to handle this would be to remove the pragma
> and instead define a local type like
> 
> #if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
> typedef  __le64 twa_address_t __packed;
> #else
> typedef __le32 twa_addr_t;
> #endif

I would be happy to implement this... but __packed only works on enums, structs,
and unions[1]:

In file included from drivers/scsi/3w-9xxx.c:100:
drivers/scsi/3w-9xxx.h:474:1: warning: 'packed' attribute ignored [-Wattributes]
  474 | typedef __le64 twa_addr_t __packed;
      | ^~~~~~~

[1]:
https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-packed-type-attribute

> The problem with marking the entire structure as packed, rather than
> just individual members is that you end up with very inefficient bytewise
> access on some architectures (especially those without cache-coherent
> DMA or hardware unaligned access in the CPU), so I would recommend
> avoiding that in portable driver code.

I agree, but I think this is a separate issue from what this patch is fixing. I
would prefer to save this change for a separate patch.

Regards,
Samuel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-08-09  0:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-30 22:07 [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse Samuel Holland
2020-07-31  7:29 ` Arnd Bergmann
2020-08-03  3:43   ` Samuel Holland
2020-08-03 14:02     ` Arnd Bergmann
2020-08-05  1:45       ` Samuel Holland
2020-08-05  7:17         ` Arnd Bergmann
2020-08-09  0:16           ` Samuel Holland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).