On 2024/3/14 06:47, Reinette Chatre wrote: > Hi Haifeng, > > On 3/7/2024 11:41 PM, Haifeng Xu wrote: >> In our production environment, after removing monitor groups, those unused >> RMIDs get stuck in the limbo list forever because their llc_occupancy are >> always larger than the threshold. But the unused RMIDs can be successfully >> freed by turning up the threshold. >> >> In order to know how much the threshold should be, perf can be used to >> acquire the llc_occupancy of RMIDs in each rdt domain. >> >> Instead of using perf tool to track llc_occupancy and filter the log >> manually, it is more convenient for users to use tracepoint to do this >> work. So add a new tracepoint that shows the llc_occupancy of busy RMIDs >> when scanning the limbo list. >> >> Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> >> Suggested-by: Reinette Chatre <reinette.chatre@intel.com> >> Suggested-by: James Morse <james.morse@arm.com> >> Reviewed-by: James Morse <james.morse@arm.com> >> --- >> Documentation/arch/x86/resctrl.rst | 8 ++++++++ >> arch/x86/kernel/cpu/resctrl/monitor.c | 9 +++++++++ >> arch/x86/kernel/cpu/resctrl/trace.h | 16 ++++++++++++++++ >> 3 files changed, 33 insertions(+) >> >> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst >> index a6279df64a9d..dd3507dc765c 100644 >> --- a/Documentation/arch/x86/resctrl.rst >> +++ b/Documentation/arch/x86/resctrl.rst >> @@ -478,6 +478,14 @@ if non-contiguous 1s value is supported. On a system with a 20-bit mask >> each bit represents 5% of the capacity of the cache. You could partition >> the cache into four equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000. >> >> +Tracepoint - mon_llc_occupancy_limbo >> +------------------------------------ > > I think that the below paragraph would fit nicely as a new paragraph in the > existing "max_threshold_occupancy - generic concepts" section. To support that > just one change to text below ... > >> +This tracepoint gives you the precise occupancy values for a subset of RMID > > The mon_llc_occupancy_limbo tracepoint gives the precise occupancy in bytes > for a subset of RMID ... > OK, I'll move the descriptions to "max_threshold_occupancy - generic concepts" section. >> +that are not immediately available for allocation. This can't be relied on >> +to produce output every second, it may be necessary to attempt to create an >> +empty monitor group to force an update. Output may only be produced if creation >> +of a control or monitor group fails. >> + >> Memory bandwidth Allocation and monitoring >> ========================================== >> >> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c >> index c34a35ec0f03..60b6a29a9e29 100644 >> --- a/arch/x86/kernel/cpu/resctrl/monitor.c >> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c >> @@ -24,6 +24,7 @@ >> #include <asm/resctrl.h> >> >> #include "internal.h" >> +#include "trace.h" >> >> /** >> * struct rmid_entry - dirty tracking for all RMID. >> @@ -354,6 +355,14 @@ void __check_limbo(struct rdt_domain *d, bool force_free) >> rmid_dirty = true; >> } else { >> rmid_dirty = (val >= resctrl_rmid_realloc_threshold); >> + >> + /* x86's CLOSID and RMID are independent numbers, so the entry's >> + * closid is a invalid CLOSID. But on arm64, the RMID value isn't >> + * a unique number for each CLOSID. It's necessary to track both >> + * CLOSID and RMID because there may be dependencies between each >> + * other on some architectures. >> + */ > > Please watch for proper formatting of multi-line comment and consistent capitalization. > I also think comment can be more accurate, for example: > > /* > * x86's CLOSID and RMID are independent numbers, so the entry's > * CLOSID is an empty CLOSID (X86_RESCTRL_EMPTY_CLOSID). On Arm the > * RMID (PMG) extends the CLOSID (PARTID) space with bits that aren't used > * to select the configuration. It is thus necessary to track both > * CLOSID and RMID because there may be dependencies between them > * on some architectures. > */ > Thanks for your suggestions! >> + trace_mon_llc_occupancy_limbo(entry->closid, entry->rmid, d->id, val); >> } >> >> if (force_free || !rmid_dirty) { >> diff --git a/arch/x86/kernel/cpu/resctrl/trace.h b/arch/x86/kernel/cpu/resctrl/trace.h >> index ed5c66b8ab0b..b310b4985b94 100644 >> --- a/arch/x86/kernel/cpu/resctrl/trace.h >> +++ b/arch/x86/kernel/cpu/resctrl/trace.h >> @@ -35,6 +35,22 @@ TRACE_EVENT(pseudo_lock_l3, >> TP_printk("hits=%llu miss=%llu", >> __entry->l3_hits, __entry->l3_miss)); >> >> +TRACE_EVENT(mon_llc_occupancy_limbo, >> + TP_PROTO(u32 ctrl_hw_id, u32 mon_hw_id, int domain_id, u64 llc_occupancy_bytes), >> + TP_ARGS(ctrl_hw_id, mon_hw_id, domain_id, llc_occupancy_bytes), >> + TP_STRUCT__entry(__field(u32, ctrl_hw_id) >> + __field(u32, mon_hw_id) >> + __field(int, domain_id) >> + __field(u64, llc_occupancy_bytes)), >> + TP_fast_assign(__entry->ctrl_hw_id = ctrl_hw_id; >> + __entry->mon_hw_id = mon_hw_id; >> + __entry->domain_id = domain_id; >> + __entry->llc_occupancy_bytes = llc_occupancy_bytes;), >> + TP_printk("ctrl_hw_id=%u mon_hw_id=%u domain_d=%d llc_occupancy_bytes=%llu", > > domain_d -> domain_id > >> + __entry->ctrl_hw_id, __entry->mon_hw_id, __entry->domain_id, >> + __entry->llc_occupancy_bytes) >> + ); >> + >> #endif /* _TRACE_RESCTRL_H */ >> >> #undef TRACE_INCLUDE_PATH > > > Reinette Thanks!
On 2024/3/14 06:46, Reinette Chatre wrote: > Hi Haifeng, > > On 3/7/2024 11:41 PM, Haifeng Xu wrote: >> Now only pseudo-locking part uses tracepoints to do event tracking, but >> other parts of resctrl may need new tracepoints. It is unnecessary to >> create separate header files and define CREATE_TRACE_POINTS in different >> c files which fragments the resctrl tracing. >> >> Therefore, give the resctrl tracepoint header file a generic name to >> support its use for tracepoints that are not specific to pseudo-locking. >> >> No functional change. >> >> Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> >> Suggested-by: Reinette Chatre <reinette.chatre@intel.com> >> --- >> arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- >> .../x86/kernel/cpu/resctrl/{pseudo_lock_event.h => trace.h} | 6 +++--- >> 2 files changed, 4 insertions(+), 4 deletions(-) >> rename arch/x86/kernel/cpu/resctrl/{pseudo_lock_event.h => trace.h} (88%) >> >> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c >> index 884b88e25141..492c8e28c4ce 100644 >> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c >> +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c >> @@ -31,7 +31,7 @@ >> #include "internal.h" >> >> #define CREATE_TRACE_POINTS >> -#include "pseudo_lock_event.h" >> +#include "trace.h" >> >> /* >> * The bits needed to disable hardware prefetching varies based on the >> diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock_event.h b/arch/x86/kernel/cpu/resctrl/trace.h >> similarity index 88% >> rename from arch/x86/kernel/cpu/resctrl/pseudo_lock_event.h >> rename to arch/x86/kernel/cpu/resctrl/trace.h >> index 428ebbd4270b..ed5c66b8ab0b 100644 >> --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock_event.h >> +++ b/arch/x86/kernel/cpu/resctrl/trace.h >> @@ -2,7 +2,7 @@ >> #undef TRACE_SYSTEM >> #define TRACE_SYSTEM resctrl >> >> -#if !defined(_TRACE_PSEUDO_LOCK_H) || defined(TRACE_HEADER_MULTI_READ) >> +#if !defined(_TRACE_RESCTRL_H) || defined(TRACE_HEADER_MULTI_READ) >> #define _TRACE_PSEUDO_LOCK_H > > The above #define should match the new name also. Sorry, I forgot to replace it with 'RESCTRL' in this version. Thanks for pointing that out! > >> >> #include <linux/tracepoint.h> >> @@ -35,9 +35,9 @@ TRACE_EVENT(pseudo_lock_l3, >> TP_printk("hits=%llu miss=%llu", >> __entry->l3_hits, __entry->l3_miss)); >> >> -#endif /* _TRACE_PSEUDO_LOCK_H */ >> +#endif /* _TRACE_RESCTRL_H */ >> >> #undef TRACE_INCLUDE_PATH >> #define TRACE_INCLUDE_PATH . >> -#define TRACE_INCLUDE_FILE pseudo_lock_event >> +#define TRACE_INCLUDE_FILE trace >> #include <trace/define_trace.h> > > The rest looks good. > > Thank you. > > Reinette
On Tue, Mar 05, 2024 at 08:29:26PM +0100, Marco Pagani wrote:
> The current implementation of the fpga manager assumes that the low-level
> module registers a driver for the parent device and uses its owner pointer
> to take the module's refcount. This approach is problematic since it can
> lead to a null pointer dereference while attempting to get the manager if
> the parent device does not have a driver.
>
> To address this problem, add a module owner pointer to the fpga_manager
> struct and use it to take the module's refcount. Modify the functions for
> registering the manager to take an additional owner module parameter and
> rename them to avoid conflicts. Use the old function names for helper
> macros that automatically set the module that registers the manager as the
> owner. This ensures compatibility with existing low-level control modules
> and reduces the chances of registering a manager without setting the owner.
>
> Also, update the documentation to keep it consistent with the new interface
> for registering an fpga manager.
>
> Other changes: opportunistically move put_device() from __fpga_mgr_get() to
> fpga_mgr_get() and of_fpga_mgr_get() to improve code clarity since the
> manager device is taken in these functions.
>
> Fixes: 654ba4cc0f3e ("fpga manager: ensure lifetime with of_fpga_mgr_get")
> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Suggested-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Marco Pagani <marpagan@redhat.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Will apply to v6.9-rc1
On Mon, Mar 18, 2024 at 02:27:44PM +0300, Dmitry Mastykin wrote:
> Hello all,
> It seems there is a problem in "Krefs and RCU" example, that may cause
> a crash.
> I marked the place between two problem lines with "problem is here" comment.
> If list_del_rcu() will be called between these lines, and list will become
> empty, then q.next will not point to a valid struct my_data.
> entry->refcount will also be invalid.
> Instead, q.next must be read first, and then compared with q to check
> list's emptiness (for example like in list_for_each_entry_rcu macro).
I agree you've identified a problem, but there's no way we should apply
this patch that adds "problem is here"! Instead we should show the
proper usage, which is a call to list_first_or_null_rcu().
> We can discuss that offline and come up with an approach that is > reviewable by maintainers and the community. Sure, looking forward to working together with you! Thanks, Albert Wang On Fri, Mar 15, 2024 at 4:57 AM Wesley Cheng <quic_wcheng@quicinc.com> wrote: > > Hi Albert > > On 3/14/2024 3:29 AM, Albert Wang wrote: > > On Thu, Mar 14, 2024 at 3:18 AM Wesley Cheng <quic_wcheng@quicinc.com> wrote: > >> > >> Hi Albert, > >> > >> On 3/13/2024 1:03 AM, Albert Wang wrote: > >>> Hi Wesley, > >>> > >>> The suspend function `qc_usb_audio_offload_suspend()` looks to stop > >>> the traffic on the bus, so that the bus can be suspended. That allows > >>> the application processor(AP) to enter suspend. There is a subtle > >>> difference with our feature, which is to allow AP suspend with the > >>> Host and USB controller active to continue the audio offloading. We > >>> call this feature `allow AP suspend in playback`. So, I have some > >>> points to clarify with you: > >> > >> Yes, I'm aware of that feature also. > >> > >>> 1. Will the suspend flow `usb_audio_suspend() --> > >>> platform_ops->suspend_cb() --> qc_usb_audio_offload_suspend()` be > >>> called when offloading is active? > >> > >> It can be. This is why in our case, we are going to issue the > >> disconnect event to the audio DSP to stop the session if it is currently > >> in one. > >> > >>> 2. As my understanding, the suspend function is to allow AP suspend > >>> when the offloading is IDLE, but it won't allow AP suspend when in > >>> playback or capture. Please correct me if anything is wrong. > >> > >> As mentioned above, it will let apps go into PM suspend after forcing > >> the audio stream to be idle. We won't block PM suspend entry. > >> > > Right. Your design is to force the audio stream idle, or say, inform > > the audio DSP > > to stop the current offloading session first, then AP can go into PM > > suspend as usual. > > Then I can say the current design did not support the `allow AP > > suspend in playback` > > feature, right? > > > > Correct, this series does not cover this mechanism. > > >> Yes, I saw that patch as well. I'll take a look once this series lands > >> upstream. > > > > That patch is rejected and archived now. So we need to find another > > approach to do > > that, even based on your framework. > > > > We can discuss that offline and come up with an approach that is > reviewable by maintainers and the community. > > Thanks > Wesley Cheng > > > Thanks, > > Albert > > > > > >>> 3. We would like to integrate the `allow AP suspend in playback` > >>> feature with your framework to become one upstream offload solution. > >>> Here is the patch: > >>> https://patchwork.kernel.org/project/linux-pm/patch/20240223143833.1509961-1-guanyulin@google.com/ > >>> . > >> > >> Yes, I saw that patch as well. I'll take a look once this series lands > >> upstream. > >> > >> Thanks > >> Wesley Cheng
[-- Attachment #1: Type: text/plain, Size: 282 bytes --] On Mon, Mar 18, 2024 at 10:01:25PM +0800, Weiji Wang wrote: > This formats the last two shell commands in > Documentation/admin-guide/mm/zswap.rst as code blocks. Why? Also missing Signed-off-by. Confused... -- An old man doll... just what I always wanted! - Clara [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --]
[-- Attachment #1: Type: text/plain, Size: 323 bytes --] On Mon, Mar 18, 2024 at 04:53:31PM -0700, Shannon Nelson wrote: > +XDP > +--- > + > +Support for XDP includes the basics, plus Jumbo frames, Redirect > +and ndo_xmit. There is no current for zero-copy sockets or HW offload. "... no current support ..." -- An old man doll... just what I always wanted! - Clara [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --]
On Sun, Mar 17, 2024 at 06:02:42PM PDT, Ban Feng wrote:
>HI Guenter and Zev,
>
>If there's no concern about supporting nct7362 in nct7363 driver,
>I'll add it to the of_device_id and i2c_device_id table in v5.
>
>Thanks,
>Ban
>
That sounds good to me, thanks.
Zev
Hi Radu, kernel test robot noticed the following build errors: [auto build test ERROR on groeck-staging/hwmon-next] [also build test ERROR on robh/for-next linus/master v6.8 next-20240318] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Radu-Sabau/dt-bindings-hwmon-pmbus-adp1050-add-bindings/20240318-202619 base: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next patch link: https://lore.kernel.org/r/20240318112140.385244-3-radu.sabau%40analog.com patch subject: [PATCH 2/2] hwmon: pmbus: adp1050 : Add driver support config: hexagon-allmodconfig (https://download.01.org/0day-ci/archive/20240319/202403190800.h8cSGROp-lkp@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 8f68022f8e6e54d1aeae4ed301f5a015963089b7) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240319/202403190800.h8cSGROp-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403190800.h8cSGROp-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from drivers/hwmon/pmbus/adp1050.c:9: In file included from include/linux/i2c.h:19: In file included from include/linux/regulator/consumer.h:35: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:26: In file included from include/linux/kernel_stat.h:9: In file included from include/linux/interrupt.h:11: In file included from include/linux/hardirq.h:11: In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1: In file included from include/asm-generic/hardirq.h:17: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/hexagon/include/asm/io.h:328: include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 547 | val = __raw_readb(PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 560 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); | ~~~~~~~~~~ ^ include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu' 37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x)) | ^ In file included from drivers/hwmon/pmbus/adp1050.c:9: In file included from include/linux/i2c.h:19: In file included from include/linux/regulator/consumer.h:35: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:26: In file included from include/linux/kernel_stat.h:9: In file included from include/linux/interrupt.h:11: In file included from include/linux/hardirq.h:11: In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1: In file included from include/asm-generic/hardirq.h:17: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/hexagon/include/asm/io.h:328: include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 573 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); | ~~~~~~~~~~ ^ include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu' 35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x)) | ^ In file included from drivers/hwmon/pmbus/adp1050.c:9: In file included from include/linux/i2c.h:19: In file included from include/linux/regulator/consumer.h:35: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:26: In file included from include/linux/kernel_stat.h:9: In file included from include/linux/interrupt.h:11: In file included from include/linux/hardirq.h:11: In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1: In file included from include/asm-generic/hardirq.h:17: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/hexagon/include/asm/io.h:328: include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 584 | __raw_writeb(value, PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 594 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); | ~~~~~~~~~~ ^ include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 604 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); | ~~~~~~~~~~ ^ In file included from drivers/hwmon/pmbus/adp1050.c:9: In file included from include/linux/i2c.h:19: In file included from include/linux/regulator/consumer.h:35: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:20: In file included from include/linux/mm.h:2188: include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ >> drivers/hwmon/pmbus/adp1050.c:47:60: error: too few arguments to function call, expected at least 3, have 2 47 | dev_err_probe(&client->dev, "Device can't be unlocked.\n"); | ~~~~~~~~~~~~~ ^ include/linux/dev_printk.h:277:20: note: 'dev_err_probe' declared here 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/hwmon/pmbus/adp1050.c:53:63: error: too few arguments to function call, expected at least 3, have 2 53 | dev_err_probe(&client->dev, "Device couldn't be unlocked.\n"); | ~~~~~~~~~~~~~ ^ include/linux/dev_printk.h:277:20: note: 'dev_err_probe' declared here 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7 warnings and 2 errors generated. vim +47 drivers/hwmon/pmbus/adp1050.c 32 33 static int adp1050_probe(struct i2c_client *client) 34 { 35 u32 vin_scale_monitor, iin_scale_monitor; 36 int ret; 37 38 if (!i2c_check_functionality(client->adapter, 39 I2C_FUNC_SMBUS_WRITE_WORD_DATA)) 40 return -ENODEV; 41 42 /* Unlock CHIP's password in order to be able to read/write to it's 43 * VIN_SCALE and IIN_SCALE registers. 44 */ 45 ret = i2c_smbus_write_word_data(client, ADP1050_CHIP_PASSWORD, 0xFFFF); 46 if (ret < 0) { > 47 dev_err_probe(&client->dev, "Device can't be unlocked.\n"); 48 return ret; 49 } 50 51 ret = i2c_smbus_write_word_data(client, ADP1050_CHIP_PASSWORD, 0xFFFF); 52 if (ret < 0) { 53 dev_err_probe(&client->dev, "Device couldn't be unlocked.\n"); 54 return ret; 55 } 56 57 /* If adi,vin-scale-monitor isn't set or is set to 0 means that the 58 * VIN monitor isn't used, therefore 0 is used as scale in order 59 * for the readings to return 0. 60 */ 61 if (device_property_read_u32(&client->dev, "adi,vin-scale-monitor", 62 &vin_scale_monitor)) 63 vin_scale_monitor = 0; 64 65 /* If adi,iin-scale-monitor isn't set or is set to 0 means that the 66 * IIN monitor isn't used, therefore 0 is used as scale in order 67 * for the readings to return 0. 68 */ 69 if (device_property_read_u32(&client->dev, "adi,iin-scale-monitor", 70 &iin_scale_monitor)) 71 iin_scale_monitor = 0; 72 73 ret = i2c_smbus_write_word_data(client, ADP1050_VIN_SCALE_MONITOR, 74 vin_scale_monitor); 75 if (ret < 0) 76 return ret; 77 78 ret = i2c_smbus_write_word_data(client, ADP1050_IIN_SCALE_MONITOR, 79 iin_scale_monitor); 80 if (ret < 0) 81 return ret; 82 83 return pmbus_do_probe(client, &adp1050_info); 84 } 85 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Add information to our documentation for the XDP features and related ethtool stats. While we're here, we also add the missing timestamp stats. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> --- .../ethernet/pensando/ionic.rst | 22 +++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/Documentation/networking/device_drivers/ethernet/pensando/ionic.rst b/Documentation/networking/device_drivers/ethernet/pensando/ionic.rst index 6ec7d686efab..a3f7e4d6b95e 100644 --- a/Documentation/networking/device_drivers/ethernet/pensando/ionic.rst +++ b/Documentation/networking/device_drivers/ethernet/pensando/ionic.rst @@ -99,6 +99,12 @@ Minimal SR-IOV support is currently offered and can be enabled by setting the sysfs 'sriov_numvfs' value, if supported by your particular firmware configuration. +XDP +--- + +Support for XDP includes the basics, plus Jumbo frames, Redirect +and ndo_xmit. There is no current for zero-copy sockets or HW offload. + Statistics ========== @@ -138,6 +144,12 @@ Driver port specific:: rx_csum_none: 0 rx_csum_complete: 3 rx_csum_error: 0 + xdp_drop: 0 + xdp_aborted: 0 + xdp_pass: 0 + xdp_tx: 0 + xdp_redirect: 0 + xdp_frames: 0 Driver queue specific:: @@ -149,9 +161,12 @@ Driver queue specific:: tx_0_frags: 0 tx_0_tso: 0 tx_0_tso_bytes: 0 + tx_0_hwstamp_valid: 0 + tx_0_hwstamp_invalid: 0 tx_0_csum_none: 3 tx_0_csum: 0 tx_0_vlan_inserted: 0 + tx_0_xdp_frames: 0 rx_0_pkts: 2 rx_0_bytes: 120 rx_0_dma_map_err: 0 @@ -159,8 +174,15 @@ Driver queue specific:: rx_0_csum_none: 0 rx_0_csum_complete: 0 rx_0_csum_error: 0 + rx_0_hwstamp_valid: 0 + rx_0_hwstamp_invalid: 0 rx_0_dropped: 0 rx_0_vlan_stripped: 0 + rx_0_xdp_drop: 0 + rx_0_xdp_aborted: 0 + rx_0_xdp_pass: 0 + rx_0_xdp_tx: 0 + rx_0_xdp_redirect: 0 Firmware port specific:: -- 2.17.1
On Sun, Mar 17, 2024 at 07:49:43PM -0700, David Wei wrote: > I'm working on a similar proposal for zero copy Rx but to host memory > and depend on this memory provider API. How do you need a different provider for that vs just udmabuf? > Jakub also designed this API for hugepages too IIRC. Basically there's > going to be at least three fancy ways of providing pages (one of which > isn't actually pages, hence the merged netmem_t series) to drivers. How do hugepages different from a normal page allocation? They should just a different ordered passed to the page allocator.
Hi Radu, kernel test robot noticed the following build errors: [auto build test ERROR on groeck-staging/hwmon-next] [also build test ERROR on robh/for-next linus/master v6.8 next-20240318] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Radu-Sabau/dt-bindings-hwmon-pmbus-adp1050-add-bindings/20240318-202619 base: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next patch link: https://lore.kernel.org/r/20240318112140.385244-3-radu.sabau%40analog.com patch subject: [PATCH 2/2] hwmon: pmbus: adp1050 : Add driver support config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20240319/202403190552.U4RHYvqc-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240319/202403190552.U4RHYvqc-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403190552.U4RHYvqc-lkp@intel.com/ All error/warnings (new ones prefixed by >>): drivers/hwmon/pmbus/adp1050.c: In function 'adp1050_probe': >> drivers/hwmon/pmbus/adp1050.c:47:45: warning: passing argument 2 of 'dev_err_probe' makes integer from pointer without a cast [-Wint-conversion] 47 | dev_err_probe(&client->dev, "Device can't be unlocked.\n"); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | char * In file included from include/linux/device.h:15, from include/linux/acpi.h:14, from include/linux/i2c.h:13, from drivers/hwmon/pmbus/adp1050.c:9: include/linux/dev_printk.h:277:64: note: expected 'int' but argument is of type 'char *' 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ~~~~^~~ >> drivers/hwmon/pmbus/adp1050.c:47:17: error: too few arguments to function 'dev_err_probe' 47 | dev_err_probe(&client->dev, "Device can't be unlocked.\n"); | ^~~~~~~~~~~~~ include/linux/dev_printk.h:277:20: note: declared here 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ^~~~~~~~~~~~~ drivers/hwmon/pmbus/adp1050.c:53:45: warning: passing argument 2 of 'dev_err_probe' makes integer from pointer without a cast [-Wint-conversion] 53 | dev_err_probe(&client->dev, "Device couldn't be unlocked.\n"); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | char * include/linux/dev_printk.h:277:64: note: expected 'int' but argument is of type 'char *' 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ~~~~^~~ drivers/hwmon/pmbus/adp1050.c:53:17: error: too few arguments to function 'dev_err_probe' 53 | dev_err_probe(&client->dev, "Device couldn't be unlocked.\n"); | ^~~~~~~~~~~~~ include/linux/dev_printk.h:277:20: note: declared here 277 | __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...); | ^~~~~~~~~~~~~ vim +/dev_err_probe +47 drivers/hwmon/pmbus/adp1050.c > 9 #include <linux/i2c.h> 10 #include <linux/init.h> 11 #include <linux/kernel.h> 12 #include <linux/module.h> 13 #include <linux/of.h> 14 #include "pmbus.h" 15 16 #define ADP1050_CHIP_PASSWORD 0xD7 17 18 #define ADP1050_VIN_SCALE_MONITOR 0xD8 19 #define ADP1050_IIN_SCALE_MONITOR 0xD9 20 21 static struct pmbus_driver_info adp1050_info = { 22 .pages = 1, 23 .format[PSC_VOLTAGE_IN] = linear, 24 .format[PSC_VOLTAGE_OUT] = linear, 25 .format[PSC_CURRENT_IN] = linear, 26 .format[PSC_TEMPERATURE] = linear, 27 .func[0] = PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT 28 | PMBUS_HAVE_VIN | PMBUS_HAVE_STATUS_INPUT 29 | PMBUS_HAVE_IIN | PMBUS_HAVE_TEMP 30 | PMBUS_HAVE_STATUS_TEMP, 31 }; 32 33 static int adp1050_probe(struct i2c_client *client) 34 { 35 u32 vin_scale_monitor, iin_scale_monitor; 36 int ret; 37 38 if (!i2c_check_functionality(client->adapter, 39 I2C_FUNC_SMBUS_WRITE_WORD_DATA)) 40 return -ENODEV; 41 42 /* Unlock CHIP's password in order to be able to read/write to it's 43 * VIN_SCALE and IIN_SCALE registers. 44 */ 45 ret = i2c_smbus_write_word_data(client, ADP1050_CHIP_PASSWORD, 0xFFFF); 46 if (ret < 0) { > 47 dev_err_probe(&client->dev, "Device can't be unlocked.\n"); 48 return ret; 49 } 50 51 ret = i2c_smbus_write_word_data(client, ADP1050_CHIP_PASSWORD, 0xFFFF); 52 if (ret < 0) { 53 dev_err_probe(&client->dev, "Device couldn't be unlocked.\n"); 54 return ret; 55 } 56 57 /* If adi,vin-scale-monitor isn't set or is set to 0 means that the 58 * VIN monitor isn't used, therefore 0 is used as scale in order 59 * for the readings to return 0. 60 */ 61 if (device_property_read_u32(&client->dev, "adi,vin-scale-monitor", 62 &vin_scale_monitor)) 63 vin_scale_monitor = 0; 64 65 /* If adi,iin-scale-monitor isn't set or is set to 0 means that the 66 * IIN monitor isn't used, therefore 0 is used as scale in order 67 * for the readings to return 0. 68 */ 69 if (device_property_read_u32(&client->dev, "adi,iin-scale-monitor", 70 &iin_scale_monitor)) 71 iin_scale_monitor = 0; 72 73 ret = i2c_smbus_write_word_data(client, ADP1050_VIN_SCALE_MONITOR, 74 vin_scale_monitor); 75 if (ret < 0) 76 return ret; 77 78 ret = i2c_smbus_write_word_data(client, ADP1050_IIN_SCALE_MONITOR, 79 iin_scale_monitor); 80 if (ret < 0) 81 return ret; 82 83 return pmbus_do_probe(client, &adp1050_info); 84 } 85 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
On 3/18/2024 1:08 AM, Roberto Sassu wrote:
> On Sun, 2024-03-17 at 22:17 -0700, Eric Biggers wrote:
>> On Fri, Mar 15, 2024 at 08:35:48PM -0700, Fan Wu wrote:
>>> +config IPE_PROP_FS_VERITY
>>> + bool "Enable property for fs-verity files"
>>> + depends on FS_VERITY && FS_VERITY_BUILTIN_SIGNATURES
>>> + help
>>> + This option enables the usage of properties "fsverity_signature"
>>> + and "fsverity_digest". These properties evaluate to TRUE when
>>> + a file is fsverity enabled and with a signed digest
>>
>> Again: why would anyone care if there is a signature, if that signature is not
>> checked.
>>
>> I think you meant to write something like: "when a file is fsverity enabled and
>> has a valid builtin signature whose signing cert is in the .fs-verity keyring".
>
> I was also thinking the same. I didn't follow the recent development
> closely, but unless IPE locks somehow the .fs-verity keyring, the
> property you suggested would not be immutable. Meaning that someone can
> add/remove a key in that keyring, making the property true or false.
>
> Roberto
Yes, the .fs-verity keyring's mutability could affect the property's
immutability. However, we are not planing to "lock" the keyrings, but we
would like to use policies languages to express what certificate can be
trusted.
For example, we can have a rule like this:
#Certificate declaration
CERTIFICATE=MyCertificate CertThumbprint=DummyThumbprint
op=EXECUTE fsverity_signature=MyCertificate action=ALLOW
This will be our immediate next work after the initial version is accepted.
-Fan
On 3/17/2024 10:17 PM, Eric Biggers wrote:
> On Fri, Mar 15, 2024 at 08:35:48PM -0700, Fan Wu wrote:
>> +config IPE_PROP_FS_VERITY
>> + bool "Enable property for fs-verity files"
>> + depends on FS_VERITY && FS_VERITY_BUILTIN_SIGNATURES
>> + help
>> + This option enables the usage of properties "fsverity_signature"
>> + and "fsverity_digest". These properties evaluate to TRUE when
>> + a file is fsverity enabled and with a signed digest
>
> Again: why would anyone care if there is a signature, if that signature is not
> checked.
>
> I think you meant to write something like: "when a file is fsverity enabled and
> has a valid builtin signature whose signing cert is in the .fs-verity keyring".
>
> - Eric
Thanks for the suggestion. I agree this is a more accurate description.
I'll update the description to include these details.
-Fan
On Mon, Mar 18, 2024 at 02:01:18PM -0500, Tom Zanussi wrote:
...
> >
> > This is what I came up with last night for the kernel parameters when thinking about it:
> >
> >
> > > mode \ default enable | intel_iommu + /sm + | intel_iommu + / sm - | intel_iommu - / sm + | intel_iommu - / sm - |
> > > -----------------------+---------------------+----------------------+-----------------------+----------------------|
> > > Scalable Mode | nothing | intel_iommu=sm_on | intel_iommu=on | intel_iommu=on,sm_on |
> > > Legacy Mode | intel_iommu=sm_off | nothing | intel_iommu=on,sm_off | intel_iommu=on |
> > > No IOMMU Mode | intel_iommu=off | intel_iommu=off | nothing | nothing |
> >
>
> Very nice. I think it would need a little explanation on how to read
> the table and mention of how the defaults correspond to actual config
> options e.g. sm+ = CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON=y,
> etc...
>
> Thanks,
>
> Tom
>
Yes, if something like this were to go into the doc it would need more
explanation. This was me just trying to map out the different
combinations last night in an org-mode table in emacs.
On Mon, 2024-03-18 at 10:58 -0700, Jerry Snitselaar wrote: > On Mon, Mar 18, 2024 at 11:26:31AM -0500, Tom Zanussi wrote: > > Hi Jerry, > > > > On Mon, 2024-03-18 at 00:49 -0700, Jerry Snitselaar wrote: > > > On Sun, Mar 17, 2024 at 11:44:21PM -0700, Jerry Snitselaar wrote: > > > > This cleans up the following issues I ran into when trying to use > > > > the > > > > scripts and commands in the iaa-crypto.rst document. > > > > > > > > - Fix incorrect arguments being passed to accel-config > > > > config-wq. > > > > - Replace --device_name with --driver-name. > > > > - Replace --driver_name with --driver-name. > > > > - Replace --size with --wq-size. > > > > - Add missing --priority argument. > > > > - Add missing accel-config config-engine command after the > > > > config-wq commands. > > > > - Fix wq name passed to accel-config config-wq. > > > > - Add rmmod/modprobe of iaa_crypto to script that disables, > > > > then enables all devices and workqueues to avoid enable-wq > > > > failing with -EEXIST when trying to register to compression > > > > algorithm. > > > > - Fix device name in cases where iaa was used instead of iax. > > > > > > > > Cc: Tom Zanussi <tom.zanussi@linux.intel.com> > > > > Cc: Jonathan Corbet <corbet@lwn.net> > > > > Cc: linux-crypto@vger.kernel.org > > > > Cc: linux-doc@vger.kernel.org > > > > Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> > > > > --- > > > > .../driver-api/crypto/iaa/iaa-crypto.rst | 22 ++++++++++++++- > > > > ---- > > > > 1 file changed, 16 insertions(+), 6 deletions(-) > > > > > > > > diff --git a/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > > b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > > index de587cf9cbed..330d35df5f16 100644 > > > > --- a/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > > +++ b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > > @@ -179,7 +179,9 @@ has the old 'iax' device naming in place) :: > > > > > > > > # configure wq1.0 > > > > > > > > - accel-config config-wq --group-id=0 --mode=dedicated -- > > > > type=kernel --name="iaa_crypto" --device_name="crypto" iax1/wq1.0 > > > > + accel-config config-wq --group-id=0 --mode=dedicated -- > > > > type=kernel --priority=10 --name="iaa_crypto" --driver- > > > > name="crypto" iax1/wq1.0 > > > > + > > > > + accel-config config-engine iax1/engine1.0 --group-id=0 > > > > > > > > # enable IAA device iax1 > > > > > > > > @@ -536,12 +538,20 @@ The below script automatically does that:: > > > > > > > > echo "End Disable IAA" > > > > > > > > + echo "Reload iaa_crypto module" > > > > + > > > > + rmmod iaa_crypto > > > > + modprobe iaa_crypto > > > > + > > > > + echo "End Reload iaa_crypto module" > > > > + > > > > # > > > > # configure iaa wqs and devices > > > > # > > > > echo "Configure IAA" > > > > for ((i = 1; i < ${num_iaa} * 2; i += 2)); do > > > > - accel-config config-wq --group-id=0 --mode=dedicated -- > > > > size=128 --priority=10 --type=kernel --name="iaa_crypto" -- > > > > driver_name="crypto" iax${i}/wq${i} > > > > + accel-config config-wq --group-id=0 --mode=dedicated --wq- > > > > size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver- > > > > name="crypto" iax${i}/wq${i}.0 > > > > + accel-config config-engine iax${i}/engine${i}.0 --group-id=0 > > > > done > > > > > > > > echo "End Configure IAA" > > > > @@ -552,10 +562,10 @@ The below script automatically does that:: > > > > echo "Enable IAA" > > > > > > > > for ((i = 1; i < ${num_iaa} * 2; i += 2)); do > > > > - echo enable iaa iaa${i} > > > > - accel-config enable-device iaa${i} > > > > - echo enable wq iaa${i}/wq${i}.0 > > > > - accel-config enable-wq iaa${i}/wq${i}.0 > > > > + echo enable iaa iax${i} > > > > + accel-config enable-device iax${i} > > > > + echo enable wq iax${i}/wq${i}.0 > > > > + accel-config enable-wq iax${i}/wq${i}.0 > > > > done > > > > > > > > echo "End Enable IAA" > > > > -- > > > > 2.41.0 > > > > > > > > > > In addition to the above, the sections related to the modes seem > > > to be off to me. > > > > > > Legacy mode in the Intel IOMMU context is when the IOMMU does not > > > have > > > scalable mode enabled. If you pass intel_iommu=off the Intel IOMMU > > > will not be initialized, and I think that would correspond to the No > > > IOMMU > > > mode instead of Legacy mode. The other suggestion for Legacy mode of > > > disabling VT-d in the BIOS would also be No IOMMU mode, but in > > > addition to the dma remapping units being disabled it would disable > > > interrupt remapping since the DMAR table would no longer be presented > > > to the OS by the BIOS. > > > > > > I think the modes should be: > > > > > > Scalable mode: intel_iommu=on,sm_on > > > Legacy mode: intel_iommu=on > > > No IOMMU mode: intel_iommu=off (or VT-d disabled in BIOS) > > > > > > > Yes, I think you're correct, those make more sense. > > > > > Since Intel IOMMU and scabale mode have config options that allow > > > them > > > to be enabled by default, there are different parameter variations > > > that would match the above cases. I don't know if they need to > > > be detailed here, or if it would just make it more confusing. > > > > > > > Personally, I think it would be useful to have them detailed and might > > lessen confusion for people setting things up and/or debugging a setup. > > > > Thanks, > > > > Tom > > Hi Tom, > > This is what I came up with last night for the kernel parameters when thinking about it: > > > > mode \ default enable | intel_iommu + /sm + | intel_iommu + / sm - | intel_iommu - / sm + | intel_iommu - / sm - | > > -----------------------+---------------------+----------------------+-----------------------+----------------------| > > Scalable Mode | nothing | intel_iommu=sm_on | intel_iommu=on | intel_iommu=on,sm_on | > > Legacy Mode | intel_iommu=sm_off | nothing | intel_iommu=on,sm_off | intel_iommu=on | > > No IOMMU Mode | intel_iommu=off | intel_iommu=off | nothing | nothing | > Very nice. I think it would need a little explanation on how to read the table and mention of how the defaults correspond to actual config options e.g. sm+ = CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON=y, etc... Thanks, Tom > > Regards, > Jerry > > > > > > > > Regards, > > > Jerry > > > > > >
[-- Attachment #1: Type: text/plain, Size: 4360 bytes --] On Mon, 2024-03-18 at 18:31 +0000, Marc Zyngier wrote: > On Mon, 18 Mar 2024 18:15:36 +0000, > David Woodhouse <dwmw2@infradead.org> wrote: > > > > [1 <text/plain; UTF-8 (quoted-printable)>] > > On Mon, 2024-03-18 at 17:41 +0000, Marc Zyngier wrote: > > > On Mon, 18 Mar 2024 17:26:07 +0000, > > > David Woodhouse <dwmw2@infradead.org> wrote: > > > > > > > > [1 <text/plain; UTF-8 (quoted-printable)>] > > > > On Mon, 2024-03-18 at 16:57 +0000, Marc Zyngier wrote: > > > > > > > > > > > > > > > > > There *is* a way for a VMM to opt *out* of newer PSCI versions... by > > > > > > setting a per-vCPU "special" register that actually ends up setting the > > > > > > PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I > > > > > > have no idea. > > > > > > > > > > Because the expectations are that the VMM can blindly save/restore the > > > > > guest's state, including the PSCI version, and restore that blindly. > > > > > KVM CAPs are just a really bad design pattern for this sort of things. > > > > > > > > Hm, am I missing something here? Does the *guest* get to set the PSCI > > > > version somehow, and opt into the latest version that it understands > > > > regardless of what the firmware/host can support? > > > > > > No. The *VMM* sets the PSCI version by writing to a pseudo register. > > > It means that when the guest migrates, the VMM saves and restores that > > > version, and the guest doesn't see any change. > > > > And when you boot a guest image which has been working for years under > > a new kernel+KVM, your guest suddenly experiences a new PSCI version. > > As I said that's not just new optional functions; it's potentially even > > returning new error codes to the functions that said guest was already > > using. > > If you want to stick to a given PSCI version, you write the version > you want. > > > > > And when you *hibernate* a guest and then launch it again under a newer > > kernel+KVM, it experiences the same incompatibility. > > > > Unless the VMM realises this problem and opts *out* of the newer KVM > > behaviour, of course. This is very much unlike how we *normally* expose > > new KVM capabilities. > > This was discussed at length 5 or 6 years ago (opt-in vs opt-out). > > The feedback from QEMU (which is the only public VMM that does > anything remotely useful with this) was that opt-out was a better > model, specially as PSCI is the conduit for advertising the Spectre > mitigations and users (such as certain cloud vendors) were pretty keen > on guests seeing the mitigations advertised *by default*. OK. > And if you can spot any form of "normality" in the KVM interface, I'll > buy you whatever beer you want. It is all inconsistent crap, so I > think we're in pretty good company here. I'll give you that one :) > > > > > > I don't think we ever aspired to be able to hand an arbitrary KVM fd to > > > > a userspace VMM and have the VMM be able to drive that VM without > > > > having any a priori context, did we? > > > > > > Arbitrary? No. This is actually very specific and pretty well > > > documented. > > > > > > Also, to answer your question about why we treat 0.1 differently from > > > 0.2+: 0.1 didn't specify the PSCI SMC/HCR encoding, meaning that KVM > > > implemented something that was never fully specified. The VMM has to > > > provide firmware tables that describe that. With 0.2+, there is a > > > standard encoding for all functions, and the VMM doesn't have to > > > provide the encoding to the guest. > > > > Gotcha. So for that case we were *forced* to do things correctly and > > allow userspace to opt-in to the capability. While for 0.2 onwards we > > got away with this awfulness of silently upgrading the version without > > VMM consent. > > > > I was hoping to just follow the existing model of SYSTEM_RESET2 and not > > have to touch this awfulness with a barge-pole, but sure, whatever you > > want. > > Unless I'm reading the whole thing wrong (which isn't impossible given > that I'm jet-lagged to my eyeballs), SYSTEM_RESET2 doesn't have any > form of configuration. If PSCI 1.1 is selected, SYSTEM_RESET2 is > available. So that'd be the model to follow. Sorry, that was supposed to be SYSTEM_SUSPEND not SYSTEM_RESET2. But OK. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --]
On Mon, 18 Mar 2024 18:15:36 +0000, David Woodhouse <dwmw2@infradead.org> wrote: > > [1 <text/plain; UTF-8 (quoted-printable)>] > On Mon, 2024-03-18 at 17:41 +0000, Marc Zyngier wrote: > > On Mon, 18 Mar 2024 17:26:07 +0000, > > David Woodhouse <dwmw2@infradead.org> wrote: > > > > > > [1 <text/plain; UTF-8 (quoted-printable)>] > > > On Mon, 2024-03-18 at 16:57 +0000, Marc Zyngier wrote: > > > > > > > > > > > > > > There *is* a way for a VMM to opt *out* of newer PSCI versions... by > > > > > setting a per-vCPU "special" register that actually ends up setting the > > > > > PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I > > > > > have no idea. > > > > > > > > Because the expectations are that the VMM can blindly save/restore the > > > > guest's state, including the PSCI version, and restore that blindly. > > > > KVM CAPs are just a really bad design pattern for this sort of things. > > > > > > Hm, am I missing something here? Does the *guest* get to set the PSCI > > > version somehow, and opt into the latest version that it understands > > > regardless of what the firmware/host can support? > > > > No. The *VMM* sets the PSCI version by writing to a pseudo register. > > It means that when the guest migrates, the VMM saves and restores that > > version, and the guest doesn't see any change. > > And when you boot a guest image which has been working for years under > a new kernel+KVM, your guest suddenly experiences a new PSCI version. > As I said that's not just new optional functions; it's potentially even > returning new error codes to the functions that said guest was already > using. If you want to stick to a given PSCI version, you write the version you want. > > And when you *hibernate* a guest and then launch it again under a newer > kernel+KVM, it experiences the same incompatibility. > > Unless the VMM realises this problem and opts *out* of the newer KVM > behaviour, of course. This is very much unlike how we *normally* expose > new KVM capabilities. This was discussed at length 5 or 6 years ago (opt-in vs opt-out). The feedback from QEMU (which is the only public VMM that does anything remotely useful with this) was that opt-out was a better model, specially as PSCI is the conduit for advertising the Spectre mitigations and users (such as certain cloud vendors) were pretty keen on guests seeing the mitigations advertised *by default*. And if you can spot any form of "normality" in the KVM interface, I'll buy you whatever beer you want. It is all inconsistent crap, so I think we're in pretty good company here. > > > > I don't think we ever aspired to be able to hand an arbitrary KVM fd to > > > a userspace VMM and have the VMM be able to drive that VM without > > > having any a priori context, did we? > > > > Arbitrary? No. This is actually very specific and pretty well > > documented. > > > > Also, to answer your question about why we treat 0.1 differently from > > 0.2+: 0.1 didn't specify the PSCI SMC/HCR encoding, meaning that KVM > > implemented something that was never fully specified. The VMM has to > > provide firmware tables that describe that. With 0.2+, there is a > > standard encoding for all functions, and the VMM doesn't have to > > provide the encoding to the guest. > > Gotcha. So for that case we were *forced* to do things correctly and > allow userspace to opt-in to the capability. While for 0.2 onwards we > got away with this awfulness of silently upgrading the version without > VMM consent. > > I was hoping to just follow the existing model of SYSTEM_RESET2 and not > have to touch this awfulness with a barge-pole, but sure, whatever you > want. Unless I'm reading the whole thing wrong (which isn't impossible given that I'm jet-lagged to my eyeballs), SYSTEM_RESET2 doesn't have any form of configuration. If PSCI 1.1 is selected, SYSTEM_RESET2 is available. So that'd be the model to follow. M. -- Without deviation from the norm, progress is not possible.
[-- Attachment #1: Type: text/plain, Size: 2702 bytes --] On Mon, 2024-03-18 at 18:07 +0000, Marc Zyngier wrote: > On Mon, 18 Mar 2024 17:54:06 +0000, > David Woodhouse <dwmw2@infradead.org> wrote: > > > > [1 <text/plain; UTF-8 (quoted-printable)>] > > On Mon, 2024-03-18 at 17:29 +0000, Marc Zyngier wrote: > > > > > > Again, I really oppose this way of doing things. We already have an > > > infrastructure for selecting PSCI levels. You may not like it, but it > > > exists, and I'm not going entertain supporting yet another bike-shed > > > model. Adding an orthogonal cap for a feature that is specific to a > > > new PSCI version is just awful. > > > > Huh? This isn't a "new bike-shed model". This is a straight copy of > > what we *already* have for SYSTEM_RESET2. > > There is no KVM capability for SYSTEM_RESET2. It is directly > advertised to the guest when PSCI 1.1 is supported. Apologies, I got that wrong. It's SYSTEM_SUSPEND and the corresponding KVM_CAP_ARM_SYSTEM_SUSPEND that I was thinking of. Not SYSTEM_RESET2.I mixed those up. > > If I were bike-shedding, I wouldn't do separate caps for them; I'd have > > done it as a *bitmask* of the optional PSCI calls that should be > > enabled. > > > > The *mandatory* ones should obviously come from the PSCI version alone, > > but I can't see how that makes sense for the optional ones... > > The guest is in a position to probe for what is supported or not with > the PSCI_FEATURES call. Why would you add anything else? Because we don't want to silently *change* what's advertised to the guest with the VMM explicitly opting in. > > > Please make PSCI 1.3 the only version of PSCI supporting suspend in a > > > non-optional way, and be done with it. > > > > SYSTEM_OFF2 is an *optional* feature in PSCI v1.3. As are > > CLEAR_INV_MEMREGION and CLEAR_INV_MEMREGION_ATTRIBUTES. > > > > Are you suggesting that enabling v1.3 should automatically enable *all* > > of the optional features that were defined in that version (and > > previous versions) of the spec? > > No. We have everything we need to incrementally *add* features. So you > can perfectly implement PSCI 1.3 with only SYSTEM_OFF2, and only later > on add the rest, if ever. OK. It's still awful, but I suppose can live with that since existing VMMs will just see the same KVM_SYSTEM_EVENT_SHUTDOWN as before, and hopefully just won't understand the flag (and won't notice) the extra flag which says it's a hibernate. A VMM might *perhaps* check for flags it doesn't understand and complain about them, which is why we shouldn't really do that. But where PSCI is concerned it seems we've left best practice behind a long time ago, so I'll let it go. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --]
[-- Attachment #1: Type: text/plain, Size: 2842 bytes --] On Mon, 2024-03-18 at 17:41 +0000, Marc Zyngier wrote: > On Mon, 18 Mar 2024 17:26:07 +0000, > David Woodhouse <dwmw2@infradead.org> wrote: > > > > [1 <text/plain; UTF-8 (quoted-printable)>] > > On Mon, 2024-03-18 at 16:57 +0000, Marc Zyngier wrote: > > > > > > > > > > > There *is* a way for a VMM to opt *out* of newer PSCI versions... by > > > > setting a per-vCPU "special" register that actually ends up setting the > > > > PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I > > > > have no idea. > > > > > > Because the expectations are that the VMM can blindly save/restore the > > > guest's state, including the PSCI version, and restore that blindly. > > > KVM CAPs are just a really bad design pattern for this sort of things. > > > > Hm, am I missing something here? Does the *guest* get to set the PSCI > > version somehow, and opt into the latest version that it understands > > regardless of what the firmware/host can support? > > No. The *VMM* sets the PSCI version by writing to a pseudo register. > It means that when the guest migrates, the VMM saves and restores that > version, and the guest doesn't see any change. And when you boot a guest image which has been working for years under a new kernel+KVM, your guest suddenly experiences a new PSCI version. As I said that's not just new optional functions; it's potentially even returning new error codes to the functions that said guest was already using. And when you *hibernate* a guest and then launch it again under a newer kernel+KVM, it experiences the same incompatibility. Unless the VMM realises this problem and opts *out* of the newer KVM behaviour, of course. This is very much unlike how we *normally* expose new KVM capabilities. > > I don't think we ever aspired to be able to hand an arbitrary KVM fd to > > a userspace VMM and have the VMM be able to drive that VM without > > having any a priori context, did we? > > Arbitrary? No. This is actually very specific and pretty well > documented. > > Also, to answer your question about why we treat 0.1 differently from > 0.2+: 0.1 didn't specify the PSCI SMC/HCR encoding, meaning that KVM > implemented something that was never fully specified. The VMM has to > provide firmware tables that describe that. With 0.2+, there is a > standard encoding for all functions, and the VMM doesn't have to > provide the encoding to the guest. Gotcha. So for that case we were *forced* to do things correctly and allow userspace to opt-in to the capability. While for 0.2 onwards we got away with this awfulness of silently upgrading the version without VMM consent. I was hoping to just follow the existing model of SYSTEM_RESET2 and not have to touch this awfulness with a barge-pole, but sure, whatever you want. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --]
On Mon, 18 Mar 2024 17:54:06 +0000, David Woodhouse <dwmw2@infradead.org> wrote: > > [1 <text/plain; UTF-8 (quoted-printable)>] > On Mon, 2024-03-18 at 17:29 +0000, Marc Zyngier wrote: > > > > Again, I really oppose this way of doing things. We already have an > > infrastructure for selecting PSCI levels. You may not like it, but it > > exists, and I'm not going entertain supporting yet another bike-shed > > model. Adding an orthogonal cap for a feature that is specific to a > > new PSCI version is just awful. > > Huh? This isn't a "new bike-shed model". This is a straight copy of > what we *already* have for SYSTEM_RESET2. There is no KVM capability for SYSTEM_RESET2. It is directly advertised to the guest when PSCI 1.1 is supported. > If I were bike-shedding, I wouldn't do separate caps for them; I'd have > done it as a *bitmask* of the optional PSCI calls that should be > enabled. > > The *mandatory* ones should obviously come from the PSCI version alone, > but I can't see how that makes sense for the optional ones... The guest is in a position to probe for what is supported or not with the PSCI_FEATURES call. Why would you add anything else? > > > Please make PSCI 1.3 the only version of PSCI supporting suspend in a > > non-optional way, and be done with it. > > SYSTEM_OFF2 is an *optional* feature in PSCI v1.3. As are > CLEAR_INV_MEMREGION and CLEAR_INV_MEMREGION_ATTRIBUTES. > > Are you suggesting that enabling v1.3 should automatically enable *all* > of the optional features that were defined in that version (and > previous versions) of the spec? No. We have everything we need to incrementally *add* features. So you can perfectly implement PSCI 1.3 with only SYSTEM_OFF2, and only later on add the rest, if ever. M. -- Without deviation from the norm, progress is not possible.
On 18/03/2024 17:48, Guenter Roeck wrote:
> On 3/18/24 09:12, Krzysztof Kozlowski wrote:
>> On 18/03/2024 12:21, Radu Sabau wrote:
>>> Add support for ADP1050 Digital Controller for Isolated Power Supplies
>>> with PMBus interface Voltage, Current and Temperature Monitor.
>>>
>>
>> ...
>>
>>> +static int adp1050_probe(struct i2c_client *client)
>>> +{
>>> + u32 vin_scale_monitor, iin_scale_monitor;
>>> + int ret;
>>> +
>>> + if (!i2c_check_functionality(client->adapter,
>>> + I2C_FUNC_SMBUS_WRITE_WORD_DATA))
>>> + return -ENODEV;
>>> +
>>> + /* Unlock CHIP's password in order to be able to read/write to it's
>>> + * VIN_SCALE and IIN_SCALE registers.
>>> + */
>>> + ret = i2c_smbus_write_word_data(client, ADP1050_CHIP_PASSWORD, 0xFFFF);
>>> + if (ret < 0) {
>>> + dev_err_probe(&client->dev, "Device can't be unlocked.\n");
>>
>> Syntax is: return dev_err_probe(). Same in other places.
>>
>
> dev_err_probe() expects the error number as second parameter, so I don't
> really understand how the above even compiles.
I did not explain the arguments, because they are obvious... but if you
need so:
return dev_err_probe(&client->dev, ret, "Device can't be unlocked.\n");
Best regards,
Krzysztof
On Mon, Mar 18, 2024 at 07:35:55PM +0200, Jani Nikula wrote: > This works, but I don't know if there are any guarantees that it keeps > working: > > https://www.kernel.org/doc/html/v6.8/core-api/genalloc.html Also works as https://docs.kernel.org/6.8/core-api/genalloc.html No guarantees beyond the usual "we did this on purpose and don't intend to break this." :) -K
On Mon, Mar 18, 2024 at 11:26:31AM -0500, Tom Zanussi wrote: > Hi Jerry, > > On Mon, 2024-03-18 at 00:49 -0700, Jerry Snitselaar wrote: > > On Sun, Mar 17, 2024 at 11:44:21PM -0700, Jerry Snitselaar wrote: > > > This cleans up the following issues I ran into when trying to use > > > the > > > scripts and commands in the iaa-crypto.rst document. > > > > > > - Fix incorrect arguments being passed to accel-config > > > config-wq. > > > - Replace --device_name with --driver-name. > > > - Replace --driver_name with --driver-name. > > > - Replace --size with --wq-size. > > > - Add missing --priority argument. > > > - Add missing accel-config config-engine command after the > > > config-wq commands. > > > - Fix wq name passed to accel-config config-wq. > > > - Add rmmod/modprobe of iaa_crypto to script that disables, > > > then enables all devices and workqueues to avoid enable-wq > > > failing with -EEXIST when trying to register to compression > > > algorithm. > > > - Fix device name in cases where iaa was used instead of iax. > > > > > > Cc: Tom Zanussi <tom.zanussi@linux.intel.com> > > > Cc: Jonathan Corbet <corbet@lwn.net> > > > Cc: linux-crypto@vger.kernel.org > > > Cc: linux-doc@vger.kernel.org > > > Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> > > > --- > > > .../driver-api/crypto/iaa/iaa-crypto.rst | 22 ++++++++++++++- > > > ---- > > > 1 file changed, 16 insertions(+), 6 deletions(-) > > > > > > diff --git a/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > index de587cf9cbed..330d35df5f16 100644 > > > --- a/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > +++ b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst > > > @@ -179,7 +179,9 @@ has the old 'iax' device naming in place) :: > > > > > > # configure wq1.0 > > > > > > - accel-config config-wq --group-id=0 --mode=dedicated -- > > > type=kernel --name="iaa_crypto" --device_name="crypto" iax1/wq1.0 > > > + accel-config config-wq --group-id=0 --mode=dedicated -- > > > type=kernel --priority=10 --name="iaa_crypto" --driver- > > > name="crypto" iax1/wq1.0 > > > + > > > + accel-config config-engine iax1/engine1.0 --group-id=0 > > > > > > # enable IAA device iax1 > > > > > > @@ -536,12 +538,20 @@ The below script automatically does that:: > > > > > > echo "End Disable IAA" > > > > > > + echo "Reload iaa_crypto module" > > > + > > > + rmmod iaa_crypto > > > + modprobe iaa_crypto > > > + > > > + echo "End Reload iaa_crypto module" > > > + > > > # > > > # configure iaa wqs and devices > > > # > > > echo "Configure IAA" > > > for ((i = 1; i < ${num_iaa} * 2; i += 2)); do > > > - accel-config config-wq --group-id=0 --mode=dedicated -- > > > size=128 --priority=10 --type=kernel --name="iaa_crypto" -- > > > driver_name="crypto" iax${i}/wq${i} > > > + accel-config config-wq --group-id=0 --mode=dedicated --wq- > > > size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver- > > > name="crypto" iax${i}/wq${i}.0 > > > + accel-config config-engine iax${i}/engine${i}.0 --group-id=0 > > > done > > > > > > echo "End Configure IAA" > > > @@ -552,10 +562,10 @@ The below script automatically does that:: > > > echo "Enable IAA" > > > > > > for ((i = 1; i < ${num_iaa} * 2; i += 2)); do > > > - echo enable iaa iaa${i} > > > - accel-config enable-device iaa${i} > > > - echo enable wq iaa${i}/wq${i}.0 > > > - accel-config enable-wq iaa${i}/wq${i}.0 > > > + echo enable iaa iax${i} > > > + accel-config enable-device iax${i} > > > + echo enable wq iax${i}/wq${i}.0 > > > + accel-config enable-wq iax${i}/wq${i}.0 > > > done > > > > > > echo "End Enable IAA" > > > -- > > > 2.41.0 > > > > > > > In addition to the above, the sections related to the modes seem > > to be off to me. > > > > Legacy mode in the Intel IOMMU context is when the IOMMU does not > > have > > scalable mode enabled. If you pass intel_iommu=off the Intel IOMMU > > will not be initialized, and I think that would correspond to the No > > IOMMU > > mode instead of Legacy mode. The other suggestion for Legacy mode of > > disabling VT-d in the BIOS would also be No IOMMU mode, but in > > addition to the dma remapping units being disabled it would disable > > interrupt remapping since the DMAR table would no longer be presented > > to the OS by the BIOS. > > > > I think the modes should be: > > > > Scalable mode: intel_iommu=on,sm_on > > Legacy mode: intel_iommu=on > > No IOMMU mode: intel_iommu=off (or VT-d disabled in BIOS) > > > > Yes, I think you're correct, those make more sense. > > > Since Intel IOMMU and scabale mode have config options that allow > > them > > to be enabled by default, there are different parameter variations > > that would match the above cases. I don't know if they need to > > be detailed here, or if it would just make it more confusing. > > > > Personally, I think it would be useful to have them detailed and might > lessen confusion for people setting things up and/or debugging a setup. > > Thanks, > > Tom Hi Tom, This is what I came up with last night for the kernel parameters when thinking about it: | mode \ default enable | intel_iommu + /sm + | intel_iommu + / sm - | intel_iommu - / sm + | intel_iommu - / sm - | |-----------------------+---------------------+----------------------+-----------------------+----------------------| | Scalable Mode | nothing | intel_iommu=sm_on | intel_iommu=on | intel_iommu=on,sm_on | | Legacy Mode | intel_iommu=sm_off | nothing | intel_iommu=on,sm_off | intel_iommu=on | | No IOMMU Mode | intel_iommu=off | intel_iommu=off | nothing | nothing | Regards, Jerry > > > Regards, > > Jerry > > >
[-- Attachment #1: Type: text/plain, Size: 1195 bytes --] On Mon, 2024-03-18 at 17:29 +0000, Marc Zyngier wrote: > > Again, I really oppose this way of doing things. We already have an > infrastructure for selecting PSCI levels. You may not like it, but it > exists, and I'm not going entertain supporting yet another bike-shed > model. Adding an orthogonal cap for a feature that is specific to a > new PSCI version is just awful. Huh? This isn't a "new bike-shed model". This is a straight copy of what we *already* have for SYSTEM_RESET2. If I were bike-shedding, I wouldn't do separate caps for them; I'd have done it as a *bitmask* of the optional PSCI calls that should be enabled. The *mandatory* ones should obviously come from the PSCI version alone, but I can't see how that makes sense for the optional ones... > Please make PSCI 1.3 the only version of PSCI supporting suspend in a > non-optional way, and be done with it. SYSTEM_OFF2 is an *optional* feature in PSCI v1.3. As are CLEAR_INV_MEMREGION and CLEAR_INV_MEMREGION_ATTRIBUTES. Are you suggesting that enabling v1.3 should automatically enable *all* of the optional features that were defined in that version (and previous versions) of the spec? [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --]