On Tuesday, 21 December 2021 15:48:53 CET Kalle Valo wrote: > My understanding is that on little endian host it's (the number representing > the byte index): > > 1 2 3 4 5 6 7 8 > > And on big endian host it's (as the firmware automatically swapped the > values): > > 4 3 2 1 8 7 6 5 > > So for on big endian we need to use ath11k_ce_byte_swap() to get them > back to correct order. (Or to be exact we need to use > ath11k_ce_byte_swap() every time as it does nothing on a little endian > host.) > > Completely untested, of course. I don't have a big endian system. Ok, just to summarize: the value is 0x0807060504030201 -> which is is correctly stored in memory as 0102030405060708 for little endian systems. Fine with this part. So if there would only be little endian systems than following code would be fine: replay_ctr = cpu_to_be64(get_unaligned((u64 *)ev->replay_counter)); According to the info from here, the memory from the firmware on big endian systems is 0403020108070605. So after switching it with the ath11k swap function, it is back to 0102030405060708 -> which is little endian again (and not aligned in memory). We must take care of it by converting in from __le64 to a u64 before converting it to __be64. So we would end up with: __le64 replay_ctr_le; __be64 replay_ctr_be; u64 replay_ctr; /* TODO also swap bytes for (i)gt_key* back to little endian */ ath11k_ce_byte_swap(ev->replay_counter, sizeof(ev->replay_counter)); replay_ctr_le = get_unaligned((__le64 *)ev->replay_counter); replay_ctr = le64_to_cpu(replay_ctr_le); arvif->rekey_data.replay_ctr = replay_ctr; replay_ctr_be = cpu_to_be64(replay_ctr); Of course, completely untested. Another idea would be to change wmi_gtk_offload_status_event->replay_counter into two u32. In that case, it would be enough to do: __be64 replay_ctr_be; u64 replay_ctr; replay_ctr = ev->replay_counter[1]; replay_ctr <<= 32; replay_ctr |= ev->replay_counter[0]; arvif->rekey_data.replay_ctr = replay_ctr; replay_ctr_be = cpu_to_be64(replay_ctr); replay_counter[1] could also be called replay_counter_upper - and replay_counter[0] just replay_counter_lower. Which reminds me of that the memcpy from a u64 to wmi_gtk_rekey_offload_cmd->replay_ctrl in ath11k_wmi_gtk_rekey_offload. This is of course also wrong. It must first be converted into a __le64 and the bytes must be pre-swapped (see below). > So my understanding is that when CE_ATTR_BYTE_SWAP_DATA is enabled the > firmware automatically swaps the packets per every four bytes. That's > why all the fields in WMI commands and events are u32. [...] > The firmware interface should not have u16 or u8 fields. And for > anything larger ath11k_ce_byte_swap() should be used. Again, this is > just my recollection from discussions years back and I have not tested > this myself. Ok, so wmi_gtk_offload_status_event and wmi_gtk_rekey_offload_cmd are breaking this assertion because they are full of u8's. :( Especially wmi_gtk_offload_status_event is problematic here because there are now a lot of single u8's in it. The original code must therefore use ath11k_ce_byte_swap on everything after wmi_gtk_offload_status_event->refresh_cnt before accessing any of the members. And ath11k_wmi_gtk_rekey_offload must perform the ath11k_ce_byte_swap on * wmi_gtk_rekey_offload_cmd->kek * wmi_gtk_rekey_offload_cmd->kck * wmi_gtk_rekey_offload_cmd->replay_ctr See also above why the memcpy to wmi_gtk_rekey_offload_cmd->replay_ctr is wrong in this function. And it seems like wmi_obss_color_collision_event->obss_color_bitmap (see ath11k_wmi_obss_color_collision_event) could suffer from a similar problem. Maybe key_rsc_counter in ath11k_wmi_vdev_install_key too - but this doesn't have any producers yet. Kind regards, Sven