* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 8:30 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 8:30 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Thu, Jan 28, 2016 at 09:23:11AM -0500, Gabriel L. Somlo wrote:
> From: Gabriel Somlo <somlo@cmu.edu>
>
> Make fw_cfg entries of type "file" available via sysfs. Entries
> are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> named after each entry's selector key. Filename, selector value,
> and size read-only attributes are included for each entry. Also,
> a "raw" attribute allows retrieval of the full binary content of
> each entry.
>
> The fw_cfg device can be instantiated automatically from ACPI or
> the Device Tree, or manually by using a kernel module (or command
> line) parameter, with a syntax outlined in the documentation file.
>
> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> ---
> .../ABI/testing/sysfs-firmware-qemu_fw_cfg | 58 ++
> drivers/firmware/Kconfig | 19 +
> drivers/firmware/Makefile | 1 +
> drivers/firmware/qemu_fw_cfg.c | 648 +++++++++++++++++++++
> 4 files changed, 726 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> create mode 100644 drivers/firmware/qemu_fw_cfg.c
>
> diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> new file mode 100644
> index 0000000..e9e58d4
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> @@ -0,0 +1,58 @@
> +What: /sys/firmware/qemu_fw_cfg/
> +Date: August 2015
> +Contact: Gabriel Somlo <somlo@cmu.edu>
> +Description:
> + Several different architectures supported by QEMU (x86, arm,
> + sun4*, ppc/mac) are provisioned with a firmware configuration
> + (fw_cfg) device, originally intended as a way for the host to
> + provide configuration data to the guest firmware. Starting
> + with QEMU v2.4, arbitrary fw_cfg file entries may be specified
> + by the user on the command line, which makes fw_cfg additionally
> + useful as an out-of-band, asynchronous mechanism for providing
> + configuration data to the guest userspace.
> +
> + The authoritative guest-side hardware interface documentation
> + to the fw_cfg device can be found in "docs/specs/fw_cfg.txt"
> + in the QEMU source tree.
> +
> + === SysFS fw_cfg Interface ===
> +
> + The fw_cfg sysfs interface described in this document is only
> + intended to display discoverable blobs (i.e., those registered
> + with the file directory), as there is no way to determine the
> + presence or size of "legacy" blobs (with selector keys between
> + 0x0002 and 0x0018) programmatically.
> +
> + All fw_cfg information is shown under:
> +
> + /sys/firmware/qemu_fw_cfg/
> +
> + The only legacy blob displayed is the fw_cfg device revision:
> +
> + /sys/firmware/qemu_fw_cfg/rev
> +
> + --- Discoverable fw_cfg blobs by selector key ---
> +
> + All discoverable blobs listed in the fw_cfg file directory are
> + displayed as entries named after their unique selector key
> + value, e.g.:
> +
> + /sys/firmware/qemu_fw_cfg/by_key/32
> + /sys/firmware/qemu_fw_cfg/by_key/33
> + /sys/firmware/qemu_fw_cfg/by_key/34
> + ...
> +
> + Each such fw_cfg sysfs entry has the following values exported
> + as attributes:
> +
> + name : The 56-byte nul-terminated ASCII string used as the
> + blob's 'file name' in the fw_cfg directory.
> + size : The length of the blob, as given in the fw_cfg
> + directory.
> + key : The value of the blob's selector key as given in the
> + fw_cfg directory. This value is the same as used in
> + the parent directory name.
> + raw : The raw bytes of the blob, obtained by selecting the
> + entry via the control register, and reading a number
> + of bytes equal to the blob size from the data
> + register.
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 49a3a11..5130f74 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -161,6 +161,25 @@ config RASPBERRYPI_FIRMWARE
> This option enables support for communicating with the firmware on the
> Raspberry Pi.
>
> +config FW_CFG_SYSFS
> + tristate "QEMU fw_cfg device support in sysfs"
> + depends on SYSFS && (ARM || ARM64 || PPC_PMAC || SPARC || X86)
> + default n
> + help
> + Say Y or M here to enable the exporting of the QEMU firmware
> + configuration (fw_cfg) file entries via sysfs. Entries are
> + found under /sys/firmware/fw_cfg when this option is enabled
> + and loaded.
> +
> +config FW_CFG_SYSFS_CMDLINE
> + bool "QEMU fw_cfg device parameter parsing"
> + depends on FW_CFG_SYSFS
> + help
> + Allow the qemu_fw_cfg device to be initialized via the kernel
> + command line or using a module parameter.
> + WARNING: Using incorrect parameters (base address in particular)
> + may crash your system.
> +
> config QCOM_SCM
> bool
> depends on ARM || ARM64
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 48dd417..474bada 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_ISCSI_IBFT_FIND) += iscsi_ibft_find.o
> obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
> +obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
> obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> obj-$(CONFIG_QCOM_SCM_64) += qcom_scm-64.o
> obj-$(CONFIG_QCOM_SCM_32) += qcom_scm-32.o
> diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> new file mode 100644
> index 0000000..83e8a5c
> --- /dev/null
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -0,0 +1,648 @@
> +/*
> + * drivers/firmware/qemu_fw_cfg.c
> + *
> + * Copyright 2015 Carnegie Mellon University
> + *
> + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> + * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
> + *
> + * The fw_cfg device may be instantiated via either an ACPI node (on x86
> + * and select subsets of aarch64), a Device Tree node (on arm), or using
> + * a kernel module (or command line) parameter with the following syntax:
> + *
> + * [fw_cfg.]ioport=<size>@<base>[:<ctrl_off>:<data_off>]
> + * or
> + * [fw_cfg.]mmio=<size>@<base>[:<ctrl_off>:<data_off>]
> + *
> + * where:
> + * <size> := size of ioport or mmio range
> + * <base> := physical base address of ioport or mmio range
> + * <ctrl_off> := (optional) offset of control register
> + * <data_off> := (optional) offset of data register
> + *
> + * e.g.:
> + * fw_cfg.ioport=2@0x510:0:1 (the default on x86)
> + * or
> + * fw_cfg.mmio=0xA@0x9020000:8:0 (the default on arm)
> + */
> +
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/acpi.h>
> +#include <linux/slab.h>
> +#include <linux/io.h>
> +#include <linux/ioport.h>
> +
> +MODULE_AUTHOR("Gabriel L. Somlo <somlo@cmu.edu>");
> +MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> +MODULE_LICENSE("GPL");
> +
> +/* selector key values for "well-known" fw_cfg entries */
> +#define FW_CFG_SIGNATURE 0x00
> +#define FW_CFG_ID 0x01
> +#define FW_CFG_FILE_DIR 0x19
> +
> +/* size in bytes of fw_cfg signature */
> +#define FW_CFG_SIG_SIZE 4
> +
> +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> +#define FW_CFG_MAX_FILE_PATH 56
> +
> +/* fw_cfg file directory entry type */
> +struct fw_cfg_file {
> + u32 size;
> + u16 select;
> + u16 reserved;
> + char name[FW_CFG_MAX_FILE_PATH];
> +};
> +
> +/* fw_cfg device i/o register addresses */
> +static bool fw_cfg_is_mmio;
> +static phys_addr_t fw_cfg_p_base;
> +static resource_size_t fw_cfg_p_size;
> +static void __iomem *fw_cfg_dev_base;
> +static void __iomem *fw_cfg_reg_ctrl;
> +static void __iomem *fw_cfg_reg_data;
> +
> +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> +static DEFINE_MUTEX(fw_cfg_dev_lock);
> +
> +/* pick appropriate endianness for selector key */
> +static inline u16 fw_cfg_sel_endianness(u16 key)
> +{
> + return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
> +}
> +
> +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> +static inline void fw_cfg_read_blob(u16 key,
> + void *buf, loff_t pos, size_t count)
> +{
> + mutex_lock(&fw_cfg_dev_lock);
> + iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> + while (pos-- > 0)
> + ioread8(fw_cfg_reg_data);
> + ioread8_rep(fw_cfg_reg_data, buf, count);
> + mutex_unlock(&fw_cfg_dev_lock);
> +}
This locking is not enough I think: ACPI might be
accessing FW CFG meanwhile, and assuming it's
the only owner.
I think that on systems that support ACPI,
it would be better to use an ACPI interface
to read blobs, instead of poking on _CRS and banging on registers directly.
This will also make it easier to extend the interface,
and support DMA.
> +
> +/* clean up fw_cfg device i/o */
> +static void fw_cfg_io_cleanup(void)
> +{
> + if (fw_cfg_is_mmio) {
> + iounmap(fw_cfg_dev_base);
> + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> + } else {
> + ioport_unmap(fw_cfg_dev_base);
> + release_region(fw_cfg_p_base, fw_cfg_p_size);
> + }
> +}
> +
> +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
So for all arches which support ACPI, I think this driver
should just rely on ACPI.
> +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> +# define FW_CFG_CTRL_OFF 0x08
> +# define FW_CFG_DATA_OFF 0x00
> +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x02
> +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x01
> +# else
> +# warning "QEMU FW_CFG may not be available on this architecture!"
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x01
Better not try hacks like this, they are hard
to support down the road. Please only list what is tested and
actually exposed by QEMU.
> +# endif
> +#endif
> +
> +/* initialize fw_cfg device i/o from platform data */
> +static int fw_cfg_do_platform_probe(struct platform_device *pdev)
> +{
> + char sig[FW_CFG_SIG_SIZE];
> + struct resource *range, *ctrl, *data;
> +
> + /* acquire i/o range details */
> + fw_cfg_is_mmio = false;
> + range = platform_get_resource(pdev, IORESOURCE_IO, 0);
> + if (!range) {
> + fw_cfg_is_mmio = true;
> + range = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (!range)
> + return -EINVAL;
> + }
> + fw_cfg_p_base = range->start;
> + fw_cfg_p_size = resource_size(range);
> +
> + if (fw_cfg_is_mmio) {
> + if (!request_mem_region(fw_cfg_p_base,
> + fw_cfg_p_size, "fw_cfg_mem"))
> + return -EBUSY;
> + fw_cfg_dev_base = ioremap(fw_cfg_p_base, fw_cfg_p_size);
> + if (!fw_cfg_dev_base) {
> + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> + return -EFAULT;
> + }
> + } else {
> + if (!request_region(fw_cfg_p_base,
> + fw_cfg_p_size, "fw_cfg_io"))
> + return -EBUSY;
> + fw_cfg_dev_base = ioport_map(fw_cfg_p_base, fw_cfg_p_size);
> + if (!fw_cfg_dev_base) {
> + release_region(fw_cfg_p_base, fw_cfg_p_size);
> + return -EFAULT;
> + }
> + }
> +
> + /* were custom register offsets provided (e.g. on the command line)? */
> + ctrl = platform_get_resource_byname(pdev, IORESOURCE_REG, "ctrl");
> + data = platform_get_resource_byname(pdev, IORESOURCE_REG, "data");
> + if (ctrl && data) {
> + fw_cfg_reg_ctrl = fw_cfg_dev_base + ctrl->start;
> + fw_cfg_reg_data = fw_cfg_dev_base + data->start;
> + } else {
> + /* use architecture-specific offsets */
> + fw_cfg_reg_ctrl = fw_cfg_dev_base + FW_CFG_CTRL_OFF;
> + fw_cfg_reg_data = fw_cfg_dev_base + FW_CFG_DATA_OFF;
> + }
> +
> + /* verify fw_cfg device signature */
> + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) != 0) {
> + fw_cfg_io_cleanup();
> + return -ENODEV;
> + }
> +
> + return 0;
> +}
> +
> +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> +static u32 fw_cfg_rev;
> +
> +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> +{
> + return sprintf(buf, "%u\n", fw_cfg_rev);
> +}
> +
> +static const struct {
> + struct attribute attr;
> + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> +} fw_cfg_rev_attr = {
> + .attr = { .name = "rev", .mode = S_IRUSR },
> + .show = fw_cfg_showrev,
> +};
> +
> +/* fw_cfg_sysfs_entry type */
> +struct fw_cfg_sysfs_entry {
> + struct kobject kobj;
> + struct fw_cfg_file f;
> + struct list_head list;
> +};
> +
> +/* get fw_cfg_sysfs_entry from kobject member */
> +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> +{
> + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> +}
> +
> +/* fw_cfg_sysfs_attribute type */
> +struct fw_cfg_sysfs_attribute {
> + struct attribute attr;
> + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> +};
> +
> +/* get fw_cfg_sysfs_attribute from attribute member */
> +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> +{
> + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> +}
> +
> +/* global cache of fw_cfg_sysfs_entry objects */
> +static LIST_HEAD(fw_cfg_entry_cache);
> +
> +/* kobjects removed lazily by kernel, mutual exclusion needed */
> +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> +
> +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_del(&entry->list);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static void fw_cfg_sysfs_cache_cleanup(void)
> +{
> + struct fw_cfg_sysfs_entry *entry, *next;
> +
> + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> + /* will end up invoking fw_cfg_sysfs_cache_delist()
> + * via each object's release() method (i.e. destructor)
> + */
> + kobject_put(&entry->kobj);
> + }
> +}
> +
> +/* default_attrs: per-entry attributes and show methods */
> +
> +#define FW_CFG_SYSFS_ATTR(_attr) \
> +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> + .show = fw_cfg_sysfs_show_##_attr, \
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.size);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.select);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%s\n", e->f.name);
> +}
> +
> +static FW_CFG_SYSFS_ATTR(size);
> +static FW_CFG_SYSFS_ATTR(key);
> +static FW_CFG_SYSFS_ATTR(name);
> +
> +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> + &fw_cfg_sysfs_attr_size.attr,
> + &fw_cfg_sysfs_attr_key.attr,
> + &fw_cfg_sysfs_attr_name.attr,
> + NULL,
> +};
> +
> +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> + char *buf)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> +
> + return attr->show(entry, buf);
> +}
> +
> +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> + .show = fw_cfg_sysfs_attr_show,
> +};
> +
> +/* release: destructor, to be called via kobject_put() */
> +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + fw_cfg_sysfs_cache_delist(entry);
> + kfree(entry);
> +}
> +
> +/* kobj_type: ties together all properties required to register an entry */
> +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> + .default_attrs = fw_cfg_sysfs_entry_attrs,
> + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> + .release = fw_cfg_sysfs_release_entry,
> +};
> +
> +/* raw-read method and attribute */
> +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr,
> + char *buf, loff_t pos, size_t count)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + if (pos > entry->f.size)
> + return -EINVAL;
> +
> + if (count > entry->f.size - pos)
> + count = entry->f.size - pos;
> +
> + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> + return count;
> +}
> +
> +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> + .attr = { .name = "raw", .mode = S_IRUSR },
> + .read = fw_cfg_sysfs_read_raw,
> +};
> +
> +/* kobjects representing top-level and by_key folders */
> +static struct kobject *fw_cfg_top_ko;
> +static struct kobject *fw_cfg_sel_ko;
> +
> +/* register an individual fw_cfg file */
> +static int fw_cfg_register_file(const struct fw_cfg_file *f)
> +{
> + int err;
> + struct fw_cfg_sysfs_entry *entry;
> +
> + /* allocate new entry */
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry)
> + return -ENOMEM;
> +
> + /* set file entry information */
> + memcpy(&entry->f, f, sizeof(struct fw_cfg_file));
> +
> + /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
> + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> + fw_cfg_sel_ko, "%d", entry->f.select);
> + if (err)
> + goto err_register;
> +
> + /* add raw binary content access */
> + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> + if (err)
> + goto err_add_raw;
> +
> + /* success, add entry to global cache */
> + fw_cfg_sysfs_cache_enlist(entry);
> + return 0;
> +
> +err_add_raw:
> + kobject_del(&entry->kobj);
> +err_register:
> + kfree(entry);
> + return err;
> +}
> +
> +/* iterate over all fw_cfg directory entries, registering each one */
> +static int fw_cfg_register_dir_entries(void)
> +{
> + int ret = 0;
> + u32 count, i;
> + struct fw_cfg_file *dir;
> + size_t dir_size;
> +
> + fw_cfg_read_blob(FW_CFG_FILE_DIR, &count, 0, sizeof(count));
> + count = be32_to_cpu(count);
> + dir_size = count * sizeof(struct fw_cfg_file);
> +
> + dir = kmalloc(dir_size, GFP_KERNEL);
> + if (!dir)
> + return -ENOMEM;
> +
> + fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(count), dir_size);
> +
> + for (i = 0; i < count; i++) {
> + dir[i].size = be32_to_cpu(dir[i].size);
> + dir[i].select = be16_to_cpu(dir[i].select);
> + ret = fw_cfg_register_file(&dir[i]);
> + if (ret)
> + break;
> + }
> +
> + kfree(dir);
> + return ret;
> +}
> +
> +/* unregister top-level or by_key folder */
> +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> +{
> + kobject_del(kobj);
> + kobject_put(kobj);
> +}
> +
> +static int fw_cfg_sysfs_probe(struct platform_device *pdev)
> +{
> + int err;
> +
> + /* NOTE: If we supported multiple fw_cfg devices, we'd first create
> + * a subdirectory named after e.g. pdev->id, then hang per-device
> + * by_key subdirectories underneath it. However, only
> + * one fw_cfg device exist system-wide, so if one was already found
> + * earlier, we might as well stop here.
> + */
> + if (fw_cfg_sel_ko)
> + return -EBUSY;
> +
> + /* create by_key subdirectory of /sys/firmware/qemu_fw_cfg/ */
> + err = -ENOMEM;
> + fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
> + if (!fw_cfg_sel_ko)
> + goto err_sel;
> +
> + /* initialize fw_cfg device i/o from platform data */
> + err = fw_cfg_do_platform_probe(pdev);
> + if (err)
> + goto err_probe;
> +
> + /* get revision number, add matching top-level attribute */
> + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> + if (err)
> + goto err_rev;
> +
> + /* process fw_cfg file directory entry, registering each file */
> + err = fw_cfg_register_dir_entries();
> + if (err)
> + goto err_dir;
> +
> + /* success */
> + pr_debug("fw_cfg: loaded.\n");
> + return 0;
> +
> +err_dir:
> + fw_cfg_sysfs_cache_cleanup();
> + sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> +err_rev:
> + fw_cfg_io_cleanup();
> +err_probe:
> + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> +err_sel:
> + return err;
> +}
> +
> +static int fw_cfg_sysfs_remove(struct platform_device *pdev)
> +{
> + pr_debug("fw_cfg: unloading.\n");
> + fw_cfg_sysfs_cache_cleanup();
> + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> + fw_cfg_io_cleanup();
> + return 0;
> +}
> +
> +static const struct of_device_id fw_cfg_sysfs_mmio_match[] = {
> + { .compatible = "qemu,fw-cfg-mmio", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, fw_cfg_sysfs_mmio_match);
> +
> +#ifdef CONFIG_ACPI
> +static const struct acpi_device_id fw_cfg_sysfs_acpi_match[] = {
> + { "QEMU0002", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(acpi, fw_cfg_sysfs_acpi_match);
> +#endif
> +
> +static struct platform_driver fw_cfg_sysfs_driver = {
> + .probe = fw_cfg_sysfs_probe,
> + .remove = fw_cfg_sysfs_remove,
> + .driver = {
> + .name = "fw_cfg",
> + .of_match_table = fw_cfg_sysfs_mmio_match,
> + .acpi_match_table = ACPI_PTR(fw_cfg_sysfs_acpi_match),
> + },
> +};
> +
> +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> +
> +static struct platform_device *fw_cfg_cmdline_dev;
> +
> +/* this probably belongs in e.g. include/linux/types.h,
> + * but right now we are the only ones doing it...
> + */
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +#define __PHYS_ADDR_PREFIX "ll"
> +#else
> +#define __PHYS_ADDR_PREFIX ""
> +#endif
> +
> +/* use special scanf/printf modifier for phys_addr_t, resource_size_t */
> +#define PH_ADDR_SCAN_FMT "@%" __PHYS_ADDR_PREFIX "i%n" \
> + ":%" __PHYS_ADDR_PREFIX "i" \
> + ":%" __PHYS_ADDR_PREFIX "i%n"
> +
> +#define PH_ADDR_PR_1_FMT "0x%" __PHYS_ADDR_PREFIX "x@" \
> + "0x%" __PHYS_ADDR_PREFIX "x"
> +
> +#define PH_ADDR_PR_3_FMT PH_ADDR_PR_1_FMT \
> + ":%" __PHYS_ADDR_PREFIX "u" \
> + ":%" __PHYS_ADDR_PREFIX "u"
> +
> +static int fw_cfg_cmdline_set(const char *arg, const struct kernel_param *kp)
> +{
> + struct resource res[3] = {};
> + char *str;
> + phys_addr_t base;
> + resource_size_t size, ctrl_off, data_off;
> + int processed, consumed = 0;
> +
> + /* only one fw_cfg device can exist system-wide, so if one
> + * was processed on the command line already, we might as
> + * well stop here.
> + */
> + if (fw_cfg_cmdline_dev) {
> + /* avoid leaking previously registered device */
> + platform_device_unregister(fw_cfg_cmdline_dev);
> + return -EINVAL;
> + }
> +
> + /* consume "<size>" portion of command line argument */
> + size = memparse(arg, &str);
> +
> + /* get "@<base>[:<ctrl_off>:<data_off>]" chunks */
> + processed = sscanf(str, PH_ADDR_SCAN_FMT,
> + &base, &consumed,
> + &ctrl_off, &data_off, &consumed);
> +
> + /* sscanf() must process precisely 1 or 3 chunks:
> + * <base> is mandatory, optionally followed by <ctrl_off>
> + * and <data_off>;
> + * there must be no extra characters after the last chunk,
> + * so str[consumed] must be '\0'.
> + */
> + if (str[consumed] ||
> + (processed != 1 && processed != 3))
> + return -EINVAL;
> +
> + res[0].start = base;
> + res[0].end = base + size - 1;
> + res[0].flags = !strcmp(kp->name, "mmio") ? IORESOURCE_MEM :
> + IORESOURCE_IO;
> +
> + /* insert register offsets, if provided */
> + if (processed > 1) {
> + res[1].name = "ctrl";
> + res[1].start = ctrl_off;
> + res[1].flags = IORESOURCE_REG;
> + res[2].name = "data";
> + res[2].start = data_off;
> + res[2].flags = IORESOURCE_REG;
> + }
> +
> + /* "processed" happens to nicely match the number of resources
> + * we need to pass in to this platform device.
> + */
> + fw_cfg_cmdline_dev = platform_device_register_simple("fw_cfg",
> + PLATFORM_DEVID_NONE, res, processed);
> + if (IS_ERR(fw_cfg_cmdline_dev))
> + return PTR_ERR(fw_cfg_cmdline_dev);
> +
> + return 0;
> +}
> +
> +static int fw_cfg_cmdline_get(char *buf, const struct kernel_param *kp)
> +{
> + /* stay silent if device was not configured via the command
> + * line, or if the parameter name (ioport/mmio) doesn't match
> + * the device setting
> + */
> + if (!fw_cfg_cmdline_dev ||
> + (!strcmp(kp->name, "mmio") ^
> + (fw_cfg_cmdline_dev->resource[0].flags == IORESOURCE_MEM)))
> + return 0;
> +
> + switch (fw_cfg_cmdline_dev->num_resources) {
> + case 1:
> + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_1_FMT,
> + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> + fw_cfg_cmdline_dev->resource[0].start);
> + case 3:
> + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_3_FMT,
> + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> + fw_cfg_cmdline_dev->resource[0].start,
> + fw_cfg_cmdline_dev->resource[1].start,
> + fw_cfg_cmdline_dev->resource[2].start);
> + }
> +
> + /* Should never get here */
> + WARN(1, "Unexpected number of resources: %d\n",
> + fw_cfg_cmdline_dev->num_resources);
> + return 0;
> +}
> +
> +static const struct kernel_param_ops fw_cfg_cmdline_param_ops = {
> + .set = fw_cfg_cmdline_set,
> + .get = fw_cfg_cmdline_get,
> +};
> +
> +device_param_cb(ioport, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> +device_param_cb(mmio, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> +
> +#endif /* CONFIG_FW_CFG_SYSFS_CMDLINE */
> +
> +static int __init fw_cfg_sysfs_init(void)
> +{
> + /* create /sys/firmware/qemu_fw_cfg/ top level directory */
> + fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
> + if (!fw_cfg_top_ko)
> + return -ENOMEM;
> +
> + return platform_driver_register(&fw_cfg_sysfs_driver);
> +}
> +
> +static void __exit fw_cfg_sysfs_exit(void)
> +{
> + platform_driver_unregister(&fw_cfg_sysfs_driver);
> +
> +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> + platform_device_unregister(fw_cfg_cmdline_dev);
> +#endif
> +
> + /* clean up /sys/firmware/qemu_fw_cfg/ */
> + fw_cfg_kobj_cleanup(fw_cfg_top_ko);
> +}
> +
> +module_init(fw_cfg_sysfs_init);
> +module_exit(fw_cfg_sysfs_exit);
> --
> 2.4.3
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 8:30 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 8:30 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Thu, Jan 28, 2016 at 09:23:11AM -0500, Gabriel L. Somlo wrote:
> From: Gabriel Somlo <somlo@cmu.edu>
>
> Make fw_cfg entries of type "file" available via sysfs. Entries
> are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> named after each entry's selector key. Filename, selector value,
> and size read-only attributes are included for each entry. Also,
> a "raw" attribute allows retrieval of the full binary content of
> each entry.
>
> The fw_cfg device can be instantiated automatically from ACPI or
> the Device Tree, or manually by using a kernel module (or command
> line) parameter, with a syntax outlined in the documentation file.
>
> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> ---
> .../ABI/testing/sysfs-firmware-qemu_fw_cfg | 58 ++
> drivers/firmware/Kconfig | 19 +
> drivers/firmware/Makefile | 1 +
> drivers/firmware/qemu_fw_cfg.c | 648 +++++++++++++++++++++
> 4 files changed, 726 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> create mode 100644 drivers/firmware/qemu_fw_cfg.c
>
> diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> new file mode 100644
> index 0000000..e9e58d4
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> @@ -0,0 +1,58 @@
> +What: /sys/firmware/qemu_fw_cfg/
> +Date: August 2015
> +Contact: Gabriel Somlo <somlo@cmu.edu>
> +Description:
> + Several different architectures supported by QEMU (x86, arm,
> + sun4*, ppc/mac) are provisioned with a firmware configuration
> + (fw_cfg) device, originally intended as a way for the host to
> + provide configuration data to the guest firmware. Starting
> + with QEMU v2.4, arbitrary fw_cfg file entries may be specified
> + by the user on the command line, which makes fw_cfg additionally
> + useful as an out-of-band, asynchronous mechanism for providing
> + configuration data to the guest userspace.
> +
> + The authoritative guest-side hardware interface documentation
> + to the fw_cfg device can be found in "docs/specs/fw_cfg.txt"
> + in the QEMU source tree.
> +
> + === SysFS fw_cfg Interface ===
> +
> + The fw_cfg sysfs interface described in this document is only
> + intended to display discoverable blobs (i.e., those registered
> + with the file directory), as there is no way to determine the
> + presence or size of "legacy" blobs (with selector keys between
> + 0x0002 and 0x0018) programmatically.
> +
> + All fw_cfg information is shown under:
> +
> + /sys/firmware/qemu_fw_cfg/
> +
> + The only legacy blob displayed is the fw_cfg device revision:
> +
> + /sys/firmware/qemu_fw_cfg/rev
> +
> + --- Discoverable fw_cfg blobs by selector key ---
> +
> + All discoverable blobs listed in the fw_cfg file directory are
> + displayed as entries named after their unique selector key
> + value, e.g.:
> +
> + /sys/firmware/qemu_fw_cfg/by_key/32
> + /sys/firmware/qemu_fw_cfg/by_key/33
> + /sys/firmware/qemu_fw_cfg/by_key/34
> + ...
> +
> + Each such fw_cfg sysfs entry has the following values exported
> + as attributes:
> +
> + name : The 56-byte nul-terminated ASCII string used as the
> + blob's 'file name' in the fw_cfg directory.
> + size : The length of the blob, as given in the fw_cfg
> + directory.
> + key : The value of the blob's selector key as given in the
> + fw_cfg directory. This value is the same as used in
> + the parent directory name.
> + raw : The raw bytes of the blob, obtained by selecting the
> + entry via the control register, and reading a number
> + of bytes equal to the blob size from the data
> + register.
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 49a3a11..5130f74 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -161,6 +161,25 @@ config RASPBERRYPI_FIRMWARE
> This option enables support for communicating with the firmware on the
> Raspberry Pi.
>
> +config FW_CFG_SYSFS
> + tristate "QEMU fw_cfg device support in sysfs"
> + depends on SYSFS && (ARM || ARM64 || PPC_PMAC || SPARC || X86)
> + default n
> + help
> + Say Y or M here to enable the exporting of the QEMU firmware
> + configuration (fw_cfg) file entries via sysfs. Entries are
> + found under /sys/firmware/fw_cfg when this option is enabled
> + and loaded.
> +
> +config FW_CFG_SYSFS_CMDLINE
> + bool "QEMU fw_cfg device parameter parsing"
> + depends on FW_CFG_SYSFS
> + help
> + Allow the qemu_fw_cfg device to be initialized via the kernel
> + command line or using a module parameter.
> + WARNING: Using incorrect parameters (base address in particular)
> + may crash your system.
> +
> config QCOM_SCM
> bool
> depends on ARM || ARM64
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 48dd417..474bada 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -14,6 +14,7 @@ obj-$(CONFIG_ISCSI_IBFT_FIND) += iscsi_ibft_find.o
> obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
> +obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
> obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> obj-$(CONFIG_QCOM_SCM_64) += qcom_scm-64.o
> obj-$(CONFIG_QCOM_SCM_32) += qcom_scm-32.o
> diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> new file mode 100644
> index 0000000..83e8a5c
> --- /dev/null
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -0,0 +1,648 @@
> +/*
> + * drivers/firmware/qemu_fw_cfg.c
> + *
> + * Copyright 2015 Carnegie Mellon University
> + *
> + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> + * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
> + *
> + * The fw_cfg device may be instantiated via either an ACPI node (on x86
> + * and select subsets of aarch64), a Device Tree node (on arm), or using
> + * a kernel module (or command line) parameter with the following syntax:
> + *
> + * [fw_cfg.]ioport=<size>@<base>[:<ctrl_off>:<data_off>]
> + * or
> + * [fw_cfg.]mmio=<size>@<base>[:<ctrl_off>:<data_off>]
> + *
> + * where:
> + * <size> := size of ioport or mmio range
> + * <base> := physical base address of ioport or mmio range
> + * <ctrl_off> := (optional) offset of control register
> + * <data_off> := (optional) offset of data register
> + *
> + * e.g.:
> + * fw_cfg.ioport=2@0x510:0:1 (the default on x86)
> + * or
> + * fw_cfg.mmio=0xA@0x9020000:8:0 (the default on arm)
> + */
> +
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/acpi.h>
> +#include <linux/slab.h>
> +#include <linux/io.h>
> +#include <linux/ioport.h>
> +
> +MODULE_AUTHOR("Gabriel L. Somlo <somlo@cmu.edu>");
> +MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> +MODULE_LICENSE("GPL");
> +
> +/* selector key values for "well-known" fw_cfg entries */
> +#define FW_CFG_SIGNATURE 0x00
> +#define FW_CFG_ID 0x01
> +#define FW_CFG_FILE_DIR 0x19
> +
> +/* size in bytes of fw_cfg signature */
> +#define FW_CFG_SIG_SIZE 4
> +
> +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> +#define FW_CFG_MAX_FILE_PATH 56
> +
> +/* fw_cfg file directory entry type */
> +struct fw_cfg_file {
> + u32 size;
> + u16 select;
> + u16 reserved;
> + char name[FW_CFG_MAX_FILE_PATH];
> +};
> +
> +/* fw_cfg device i/o register addresses */
> +static bool fw_cfg_is_mmio;
> +static phys_addr_t fw_cfg_p_base;
> +static resource_size_t fw_cfg_p_size;
> +static void __iomem *fw_cfg_dev_base;
> +static void __iomem *fw_cfg_reg_ctrl;
> +static void __iomem *fw_cfg_reg_data;
> +
> +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> +static DEFINE_MUTEX(fw_cfg_dev_lock);
> +
> +/* pick appropriate endianness for selector key */
> +static inline u16 fw_cfg_sel_endianness(u16 key)
> +{
> + return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
> +}
> +
> +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> +static inline void fw_cfg_read_blob(u16 key,
> + void *buf, loff_t pos, size_t count)
> +{
> + mutex_lock(&fw_cfg_dev_lock);
> + iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> + while (pos-- > 0)
> + ioread8(fw_cfg_reg_data);
> + ioread8_rep(fw_cfg_reg_data, buf, count);
> + mutex_unlock(&fw_cfg_dev_lock);
> +}
This locking is not enough I think: ACPI might be
accessing FW CFG meanwhile, and assuming it's
the only owner.
I think that on systems that support ACPI,
it would be better to use an ACPI interface
to read blobs, instead of poking on _CRS and banging on registers directly.
This will also make it easier to extend the interface,
and support DMA.
> +
> +/* clean up fw_cfg device i/o */
> +static void fw_cfg_io_cleanup(void)
> +{
> + if (fw_cfg_is_mmio) {
> + iounmap(fw_cfg_dev_base);
> + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> + } else {
> + ioport_unmap(fw_cfg_dev_base);
> + release_region(fw_cfg_p_base, fw_cfg_p_size);
> + }
> +}
> +
> +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
So for all arches which support ACPI, I think this driver
should just rely on ACPI.
> +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> +# define FW_CFG_CTRL_OFF 0x08
> +# define FW_CFG_DATA_OFF 0x00
> +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x02
> +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x01
> +# else
> +# warning "QEMU FW_CFG may not be available on this architecture!"
> +# define FW_CFG_CTRL_OFF 0x00
> +# define FW_CFG_DATA_OFF 0x01
Better not try hacks like this, they are hard
to support down the road. Please only list what is tested and
actually exposed by QEMU.
> +# endif
> +#endif
> +
> +/* initialize fw_cfg device i/o from platform data */
> +static int fw_cfg_do_platform_probe(struct platform_device *pdev)
> +{
> + char sig[FW_CFG_SIG_SIZE];
> + struct resource *range, *ctrl, *data;
> +
> + /* acquire i/o range details */
> + fw_cfg_is_mmio = false;
> + range = platform_get_resource(pdev, IORESOURCE_IO, 0);
> + if (!range) {
> + fw_cfg_is_mmio = true;
> + range = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (!range)
> + return -EINVAL;
> + }
> + fw_cfg_p_base = range->start;
> + fw_cfg_p_size = resource_size(range);
> +
> + if (fw_cfg_is_mmio) {
> + if (!request_mem_region(fw_cfg_p_base,
> + fw_cfg_p_size, "fw_cfg_mem"))
> + return -EBUSY;
> + fw_cfg_dev_base = ioremap(fw_cfg_p_base, fw_cfg_p_size);
> + if (!fw_cfg_dev_base) {
> + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> + return -EFAULT;
> + }
> + } else {
> + if (!request_region(fw_cfg_p_base,
> + fw_cfg_p_size, "fw_cfg_io"))
> + return -EBUSY;
> + fw_cfg_dev_base = ioport_map(fw_cfg_p_base, fw_cfg_p_size);
> + if (!fw_cfg_dev_base) {
> + release_region(fw_cfg_p_base, fw_cfg_p_size);
> + return -EFAULT;
> + }
> + }
> +
> + /* were custom register offsets provided (e.g. on the command line)? */
> + ctrl = platform_get_resource_byname(pdev, IORESOURCE_REG, "ctrl");
> + data = platform_get_resource_byname(pdev, IORESOURCE_REG, "data");
> + if (ctrl && data) {
> + fw_cfg_reg_ctrl = fw_cfg_dev_base + ctrl->start;
> + fw_cfg_reg_data = fw_cfg_dev_base + data->start;
> + } else {
> + /* use architecture-specific offsets */
> + fw_cfg_reg_ctrl = fw_cfg_dev_base + FW_CFG_CTRL_OFF;
> + fw_cfg_reg_data = fw_cfg_dev_base + FW_CFG_DATA_OFF;
> + }
> +
> + /* verify fw_cfg device signature */
> + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) != 0) {
> + fw_cfg_io_cleanup();
> + return -ENODEV;
> + }
> +
> + return 0;
> +}
> +
> +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> +static u32 fw_cfg_rev;
> +
> +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> +{
> + return sprintf(buf, "%u\n", fw_cfg_rev);
> +}
> +
> +static const struct {
> + struct attribute attr;
> + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> +} fw_cfg_rev_attr = {
> + .attr = { .name = "rev", .mode = S_IRUSR },
> + .show = fw_cfg_showrev,
> +};
> +
> +/* fw_cfg_sysfs_entry type */
> +struct fw_cfg_sysfs_entry {
> + struct kobject kobj;
> + struct fw_cfg_file f;
> + struct list_head list;
> +};
> +
> +/* get fw_cfg_sysfs_entry from kobject member */
> +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> +{
> + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> +}
> +
> +/* fw_cfg_sysfs_attribute type */
> +struct fw_cfg_sysfs_attribute {
> + struct attribute attr;
> + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> +};
> +
> +/* get fw_cfg_sysfs_attribute from attribute member */
> +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> +{
> + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> +}
> +
> +/* global cache of fw_cfg_sysfs_entry objects */
> +static LIST_HEAD(fw_cfg_entry_cache);
> +
> +/* kobjects removed lazily by kernel, mutual exclusion needed */
> +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> +
> +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> +{
> + spin_lock(&fw_cfg_cache_lock);
> + list_del(&entry->list);
> + spin_unlock(&fw_cfg_cache_lock);
> +}
> +
> +static void fw_cfg_sysfs_cache_cleanup(void)
> +{
> + struct fw_cfg_sysfs_entry *entry, *next;
> +
> + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> + /* will end up invoking fw_cfg_sysfs_cache_delist()
> + * via each object's release() method (i.e. destructor)
> + */
> + kobject_put(&entry->kobj);
> + }
> +}
> +
> +/* default_attrs: per-entry attributes and show methods */
> +
> +#define FW_CFG_SYSFS_ATTR(_attr) \
> +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> + .show = fw_cfg_sysfs_show_##_attr, \
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.size);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%u\n", e->f.select);
> +}
> +
> +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> +{
> + return sprintf(buf, "%s\n", e->f.name);
> +}
> +
> +static FW_CFG_SYSFS_ATTR(size);
> +static FW_CFG_SYSFS_ATTR(key);
> +static FW_CFG_SYSFS_ATTR(name);
> +
> +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> + &fw_cfg_sysfs_attr_size.attr,
> + &fw_cfg_sysfs_attr_key.attr,
> + &fw_cfg_sysfs_attr_name.attr,
> + NULL,
> +};
> +
> +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> + char *buf)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> +
> + return attr->show(entry, buf);
> +}
> +
> +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> + .show = fw_cfg_sysfs_attr_show,
> +};
> +
> +/* release: destructor, to be called via kobject_put() */
> +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + fw_cfg_sysfs_cache_delist(entry);
> + kfree(entry);
> +}
> +
> +/* kobj_type: ties together all properties required to register an entry */
> +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> + .default_attrs = fw_cfg_sysfs_entry_attrs,
> + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> + .release = fw_cfg_sysfs_release_entry,
> +};
> +
> +/* raw-read method and attribute */
> +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr,
> + char *buf, loff_t pos, size_t count)
> +{
> + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> +
> + if (pos > entry->f.size)
> + return -EINVAL;
> +
> + if (count > entry->f.size - pos)
> + count = entry->f.size - pos;
> +
> + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> + return count;
> +}
> +
> +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> + .attr = { .name = "raw", .mode = S_IRUSR },
> + .read = fw_cfg_sysfs_read_raw,
> +};
> +
> +/* kobjects representing top-level and by_key folders */
> +static struct kobject *fw_cfg_top_ko;
> +static struct kobject *fw_cfg_sel_ko;
> +
> +/* register an individual fw_cfg file */
> +static int fw_cfg_register_file(const struct fw_cfg_file *f)
> +{
> + int err;
> + struct fw_cfg_sysfs_entry *entry;
> +
> + /* allocate new entry */
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry)
> + return -ENOMEM;
> +
> + /* set file entry information */
> + memcpy(&entry->f, f, sizeof(struct fw_cfg_file));
> +
> + /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
> + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> + fw_cfg_sel_ko, "%d", entry->f.select);
> + if (err)
> + goto err_register;
> +
> + /* add raw binary content access */
> + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> + if (err)
> + goto err_add_raw;
> +
> + /* success, add entry to global cache */
> + fw_cfg_sysfs_cache_enlist(entry);
> + return 0;
> +
> +err_add_raw:
> + kobject_del(&entry->kobj);
> +err_register:
> + kfree(entry);
> + return err;
> +}
> +
> +/* iterate over all fw_cfg directory entries, registering each one */
> +static int fw_cfg_register_dir_entries(void)
> +{
> + int ret = 0;
> + u32 count, i;
> + struct fw_cfg_file *dir;
> + size_t dir_size;
> +
> + fw_cfg_read_blob(FW_CFG_FILE_DIR, &count, 0, sizeof(count));
> + count = be32_to_cpu(count);
> + dir_size = count * sizeof(struct fw_cfg_file);
> +
> + dir = kmalloc(dir_size, GFP_KERNEL);
> + if (!dir)
> + return -ENOMEM;
> +
> + fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(count), dir_size);
> +
> + for (i = 0; i < count; i++) {
> + dir[i].size = be32_to_cpu(dir[i].size);
> + dir[i].select = be16_to_cpu(dir[i].select);
> + ret = fw_cfg_register_file(&dir[i]);
> + if (ret)
> + break;
> + }
> +
> + kfree(dir);
> + return ret;
> +}
> +
> +/* unregister top-level or by_key folder */
> +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> +{
> + kobject_del(kobj);
> + kobject_put(kobj);
> +}
> +
> +static int fw_cfg_sysfs_probe(struct platform_device *pdev)
> +{
> + int err;
> +
> + /* NOTE: If we supported multiple fw_cfg devices, we'd first create
> + * a subdirectory named after e.g. pdev->id, then hang per-device
> + * by_key subdirectories underneath it. However, only
> + * one fw_cfg device exist system-wide, so if one was already found
> + * earlier, we might as well stop here.
> + */
> + if (fw_cfg_sel_ko)
> + return -EBUSY;
> +
> + /* create by_key subdirectory of /sys/firmware/qemu_fw_cfg/ */
> + err = -ENOMEM;
> + fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
> + if (!fw_cfg_sel_ko)
> + goto err_sel;
> +
> + /* initialize fw_cfg device i/o from platform data */
> + err = fw_cfg_do_platform_probe(pdev);
> + if (err)
> + goto err_probe;
> +
> + /* get revision number, add matching top-level attribute */
> + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> + if (err)
> + goto err_rev;
> +
> + /* process fw_cfg file directory entry, registering each file */
> + err = fw_cfg_register_dir_entries();
> + if (err)
> + goto err_dir;
> +
> + /* success */
> + pr_debug("fw_cfg: loaded.\n");
> + return 0;
> +
> +err_dir:
> + fw_cfg_sysfs_cache_cleanup();
> + sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> +err_rev:
> + fw_cfg_io_cleanup();
> +err_probe:
> + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> +err_sel:
> + return err;
> +}
> +
> +static int fw_cfg_sysfs_remove(struct platform_device *pdev)
> +{
> + pr_debug("fw_cfg: unloading.\n");
> + fw_cfg_sysfs_cache_cleanup();
> + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> + fw_cfg_io_cleanup();
> + return 0;
> +}
> +
> +static const struct of_device_id fw_cfg_sysfs_mmio_match[] = {
> + { .compatible = "qemu,fw-cfg-mmio", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, fw_cfg_sysfs_mmio_match);
> +
> +#ifdef CONFIG_ACPI
> +static const struct acpi_device_id fw_cfg_sysfs_acpi_match[] = {
> + { "QEMU0002", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(acpi, fw_cfg_sysfs_acpi_match);
> +#endif
> +
> +static struct platform_driver fw_cfg_sysfs_driver = {
> + .probe = fw_cfg_sysfs_probe,
> + .remove = fw_cfg_sysfs_remove,
> + .driver = {
> + .name = "fw_cfg",
> + .of_match_table = fw_cfg_sysfs_mmio_match,
> + .acpi_match_table = ACPI_PTR(fw_cfg_sysfs_acpi_match),
> + },
> +};
> +
> +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> +
> +static struct platform_device *fw_cfg_cmdline_dev;
> +
> +/* this probably belongs in e.g. include/linux/types.h,
> + * but right now we are the only ones doing it...
> + */
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +#define __PHYS_ADDR_PREFIX "ll"
> +#else
> +#define __PHYS_ADDR_PREFIX ""
> +#endif
> +
> +/* use special scanf/printf modifier for phys_addr_t, resource_size_t */
> +#define PH_ADDR_SCAN_FMT "@%" __PHYS_ADDR_PREFIX "i%n" \
> + ":%" __PHYS_ADDR_PREFIX "i" \
> + ":%" __PHYS_ADDR_PREFIX "i%n"
> +
> +#define PH_ADDR_PR_1_FMT "0x%" __PHYS_ADDR_PREFIX "x@" \
> + "0x%" __PHYS_ADDR_PREFIX "x"
> +
> +#define PH_ADDR_PR_3_FMT PH_ADDR_PR_1_FMT \
> + ":%" __PHYS_ADDR_PREFIX "u" \
> + ":%" __PHYS_ADDR_PREFIX "u"
> +
> +static int fw_cfg_cmdline_set(const char *arg, const struct kernel_param *kp)
> +{
> + struct resource res[3] = {};
> + char *str;
> + phys_addr_t base;
> + resource_size_t size, ctrl_off, data_off;
> + int processed, consumed = 0;
> +
> + /* only one fw_cfg device can exist system-wide, so if one
> + * was processed on the command line already, we might as
> + * well stop here.
> + */
> + if (fw_cfg_cmdline_dev) {
> + /* avoid leaking previously registered device */
> + platform_device_unregister(fw_cfg_cmdline_dev);
> + return -EINVAL;
> + }
> +
> + /* consume "<size>" portion of command line argument */
> + size = memparse(arg, &str);
> +
> + /* get "@<base>[:<ctrl_off>:<data_off>]" chunks */
> + processed = sscanf(str, PH_ADDR_SCAN_FMT,
> + &base, &consumed,
> + &ctrl_off, &data_off, &consumed);
> +
> + /* sscanf() must process precisely 1 or 3 chunks:
> + * <base> is mandatory, optionally followed by <ctrl_off>
> + * and <data_off>;
> + * there must be no extra characters after the last chunk,
> + * so str[consumed] must be '\0'.
> + */
> + if (str[consumed] ||
> + (processed != 1 && processed != 3))
> + return -EINVAL;
> +
> + res[0].start = base;
> + res[0].end = base + size - 1;
> + res[0].flags = !strcmp(kp->name, "mmio") ? IORESOURCE_MEM :
> + IORESOURCE_IO;
> +
> + /* insert register offsets, if provided */
> + if (processed > 1) {
> + res[1].name = "ctrl";
> + res[1].start = ctrl_off;
> + res[1].flags = IORESOURCE_REG;
> + res[2].name = "data";
> + res[2].start = data_off;
> + res[2].flags = IORESOURCE_REG;
> + }
> +
> + /* "processed" happens to nicely match the number of resources
> + * we need to pass in to this platform device.
> + */
> + fw_cfg_cmdline_dev = platform_device_register_simple("fw_cfg",
> + PLATFORM_DEVID_NONE, res, processed);
> + if (IS_ERR(fw_cfg_cmdline_dev))
> + return PTR_ERR(fw_cfg_cmdline_dev);
> +
> + return 0;
> +}
> +
> +static int fw_cfg_cmdline_get(char *buf, const struct kernel_param *kp)
> +{
> + /* stay silent if device was not configured via the command
> + * line, or if the parameter name (ioport/mmio) doesn't match
> + * the device setting
> + */
> + if (!fw_cfg_cmdline_dev ||
> + (!strcmp(kp->name, "mmio") ^
> + (fw_cfg_cmdline_dev->resource[0].flags == IORESOURCE_MEM)))
> + return 0;
> +
> + switch (fw_cfg_cmdline_dev->num_resources) {
> + case 1:
> + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_1_FMT,
> + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> + fw_cfg_cmdline_dev->resource[0].start);
> + case 3:
> + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_3_FMT,
> + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> + fw_cfg_cmdline_dev->resource[0].start,
> + fw_cfg_cmdline_dev->resource[1].start,
> + fw_cfg_cmdline_dev->resource[2].start);
> + }
> +
> + /* Should never get here */
> + WARN(1, "Unexpected number of resources: %d\n",
> + fw_cfg_cmdline_dev->num_resources);
> + return 0;
> +}
> +
> +static const struct kernel_param_ops fw_cfg_cmdline_param_ops = {
> + .set = fw_cfg_cmdline_set,
> + .get = fw_cfg_cmdline_get,
> +};
> +
> +device_param_cb(ioport, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> +device_param_cb(mmio, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> +
> +#endif /* CONFIG_FW_CFG_SYSFS_CMDLINE */
> +
> +static int __init fw_cfg_sysfs_init(void)
> +{
> + /* create /sys/firmware/qemu_fw_cfg/ top level directory */
> + fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
> + if (!fw_cfg_top_ko)
> + return -ENOMEM;
> +
> + return platform_driver_register(&fw_cfg_sysfs_driver);
> +}
> +
> +static void __exit fw_cfg_sysfs_exit(void)
> +{
> + platform_driver_unregister(&fw_cfg_sysfs_driver);
> +
> +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> + platform_device_unregister(fw_cfg_cmdline_dev);
> +#endif
> +
> + /* clean up /sys/firmware/qemu_fw_cfg/ */
> + fw_cfg_kobj_cleanup(fw_cfg_top_ko);
> +}
> +
> +module_init(fw_cfg_sysfs_init);
> +module_exit(fw_cfg_sysfs_exit);
> --
> 2.4.3
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
2016-02-21 8:30 ` Michael S. Tsirkin
(?)
@ 2016-02-21 13:06 ` Gabriel L. Somlo
-1 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 13:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Sun, Feb 21, 2016 at 10:30:26AM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 28, 2016 at 09:23:11AM -0500, Gabriel L. Somlo wrote:
> > From: Gabriel Somlo <somlo@cmu.edu>
> >
> > Make fw_cfg entries of type "file" available via sysfs. Entries
> > are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> > named after each entry's selector key. Filename, selector value,
> > and size read-only attributes are included for each entry. Also,
> > a "raw" attribute allows retrieval of the full binary content of
> > each entry.
> >
> > The fw_cfg device can be instantiated automatically from ACPI or
> > the Device Tree, or manually by using a kernel module (or command
> > line) parameter, with a syntax outlined in the documentation file.
> >
> > Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> > ---
> > .../ABI/testing/sysfs-firmware-qemu_fw_cfg | 58 ++
> > drivers/firmware/Kconfig | 19 +
> > drivers/firmware/Makefile | 1 +
> > drivers/firmware/qemu_fw_cfg.c | 648 +++++++++++++++++++++
> > 4 files changed, 726 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > create mode 100644 drivers/firmware/qemu_fw_cfg.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > new file mode 100644
> > index 0000000..e9e58d4
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > @@ -0,0 +1,58 @@
> > +What: /sys/firmware/qemu_fw_cfg/
> > +Date: August 2015
> > +Contact: Gabriel Somlo <somlo@cmu.edu>
> > +Description:
> > + Several different architectures supported by QEMU (x86, arm,
> > + sun4*, ppc/mac) are provisioned with a firmware configuration
> > + (fw_cfg) device, originally intended as a way for the host to
> > + provide configuration data to the guest firmware. Starting
> > + with QEMU v2.4, arbitrary fw_cfg file entries may be specified
> > + by the user on the command line, which makes fw_cfg additionally
> > + useful as an out-of-band, asynchronous mechanism for providing
> > + configuration data to the guest userspace.
> > +
> > + The authoritative guest-side hardware interface documentation
> > + to the fw_cfg device can be found in "docs/specs/fw_cfg.txt"
> > + in the QEMU source tree.
> > +
> > + === SysFS fw_cfg Interface ===
> > +
> > + The fw_cfg sysfs interface described in this document is only
> > + intended to display discoverable blobs (i.e., those registered
> > + with the file directory), as there is no way to determine the
> > + presence or size of "legacy" blobs (with selector keys between
> > + 0x0002 and 0x0018) programmatically.
> > +
> > + All fw_cfg information is shown under:
> > +
> > + /sys/firmware/qemu_fw_cfg/
> > +
> > + The only legacy blob displayed is the fw_cfg device revision:
> > +
> > + /sys/firmware/qemu_fw_cfg/rev
> > +
> > + --- Discoverable fw_cfg blobs by selector key ---
> > +
> > + All discoverable blobs listed in the fw_cfg file directory are
> > + displayed as entries named after their unique selector key
> > + value, e.g.:
> > +
> > + /sys/firmware/qemu_fw_cfg/by_key/32
> > + /sys/firmware/qemu_fw_cfg/by_key/33
> > + /sys/firmware/qemu_fw_cfg/by_key/34
> > + ...
> > +
> > + Each such fw_cfg sysfs entry has the following values exported
> > + as attributes:
> > +
> > + name : The 56-byte nul-terminated ASCII string used as the
> > + blob's 'file name' in the fw_cfg directory.
> > + size : The length of the blob, as given in the fw_cfg
> > + directory.
> > + key : The value of the blob's selector key as given in the
> > + fw_cfg directory. This value is the same as used in
> > + the parent directory name.
> > + raw : The raw bytes of the blob, obtained by selecting the
> > + entry via the control register, and reading a number
> > + of bytes equal to the blob size from the data
> > + register.
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 49a3a11..5130f74 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -161,6 +161,25 @@ config RASPBERRYPI_FIRMWARE
> > This option enables support for communicating with the firmware on the
> > Raspberry Pi.
> >
> > +config FW_CFG_SYSFS
> > + tristate "QEMU fw_cfg device support in sysfs"
> > + depends on SYSFS && (ARM || ARM64 || PPC_PMAC || SPARC || X86)
> > + default n
> > + help
> > + Say Y or M here to enable the exporting of the QEMU firmware
> > + configuration (fw_cfg) file entries via sysfs. Entries are
> > + found under /sys/firmware/fw_cfg when this option is enabled
> > + and loaded.
> > +
> > +config FW_CFG_SYSFS_CMDLINE
> > + bool "QEMU fw_cfg device parameter parsing"
> > + depends on FW_CFG_SYSFS
> > + help
> > + Allow the qemu_fw_cfg device to be initialized via the kernel
> > + command line or using a module parameter.
> > + WARNING: Using incorrect parameters (base address in particular)
> > + may crash your system.
> > +
> > config QCOM_SCM
> > bool
> > depends on ARM || ARM64
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 48dd417..474bada 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -14,6 +14,7 @@ obj-$(CONFIG_ISCSI_IBFT_FIND) += iscsi_ibft_find.o
> > obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> > obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> > obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
> > +obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
> > obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> > obj-$(CONFIG_QCOM_SCM_64) += qcom_scm-64.o
> > obj-$(CONFIG_QCOM_SCM_32) += qcom_scm-32.o
> > diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> > new file mode 100644
> > index 0000000..83e8a5c
> > --- /dev/null
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -0,0 +1,648 @@
> > +/*
> > + * drivers/firmware/qemu_fw_cfg.c
> > + *
> > + * Copyright 2015 Carnegie Mellon University
> > + *
> > + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> > + * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
> > + *
> > + * The fw_cfg device may be instantiated via either an ACPI node (on x86
> > + * and select subsets of aarch64), a Device Tree node (on arm), or using
> > + * a kernel module (or command line) parameter with the following syntax:
> > + *
> > + * [fw_cfg.]ioport=<size>@<base>[:<ctrl_off>:<data_off>]
> > + * or
> > + * [fw_cfg.]mmio=<size>@<base>[:<ctrl_off>:<data_off>]
> > + *
> > + * where:
> > + * <size> := size of ioport or mmio range
> > + * <base> := physical base address of ioport or mmio range
> > + * <ctrl_off> := (optional) offset of control register
> > + * <data_off> := (optional) offset of data register
> > + *
> > + * e.g.:
> > + * fw_cfg.ioport=2@0x510:0:1 (the default on x86)
> > + * or
> > + * fw_cfg.mmio=0xA@0x9020000:8:0 (the default on arm)
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/acpi.h>
> > +#include <linux/slab.h>
> > +#include <linux/io.h>
> > +#include <linux/ioport.h>
> > +
> > +MODULE_AUTHOR("Gabriel L. Somlo <somlo@cmu.edu>");
> > +MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> > +MODULE_LICENSE("GPL");
> > +
> > +/* selector key values for "well-known" fw_cfg entries */
> > +#define FW_CFG_SIGNATURE 0x00
> > +#define FW_CFG_ID 0x01
> > +#define FW_CFG_FILE_DIR 0x19
> > +
> > +/* size in bytes of fw_cfg signature */
> > +#define FW_CFG_SIG_SIZE 4
> > +
> > +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> > +#define FW_CFG_MAX_FILE_PATH 56
> > +
> > +/* fw_cfg file directory entry type */
> > +struct fw_cfg_file {
> > + u32 size;
> > + u16 select;
> > + u16 reserved;
> > + char name[FW_CFG_MAX_FILE_PATH];
> > +};
> > +
> > +/* fw_cfg device i/o register addresses */
> > +static bool fw_cfg_is_mmio;
> > +static phys_addr_t fw_cfg_p_base;
> > +static resource_size_t fw_cfg_p_size;
> > +static void __iomem *fw_cfg_dev_base;
> > +static void __iomem *fw_cfg_reg_ctrl;
> > +static void __iomem *fw_cfg_reg_data;
> > +
> > +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> > +static DEFINE_MUTEX(fw_cfg_dev_lock);
> > +
> > +/* pick appropriate endianness for selector key */
> > +static inline u16 fw_cfg_sel_endianness(u16 key)
> > +{
> > + return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
> > +}
> > +
> > +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> > +static inline void fw_cfg_read_blob(u16 key,
> > + void *buf, loff_t pos, size_t count)
> > +{
> > + mutex_lock(&fw_cfg_dev_lock);
> > + iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > + while (pos-- > 0)
> > + ioread8(fw_cfg_reg_data);
> > + ioread8_rep(fw_cfg_reg_data, buf, count);
> > + mutex_unlock(&fw_cfg_dev_lock);
> > +}
>
> This locking is not enough I think: ACPI might be
> accessing FW CFG meanwhile, and assuming it's
> the only owner.
>
> I think that on systems that support ACPI,
> it would be better to use an ACPI interface
> to read blobs, instead of poking on _CRS and banging on registers directly.
>
> This will also make it easier to extend the interface,
> and support DMA.
>
>
>
> > +
> > +/* clean up fw_cfg device i/o */
> > +static void fw_cfg_io_cleanup(void)
> > +{
> > + if (fw_cfg_is_mmio) {
> > + iounmap(fw_cfg_dev_base);
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + } else {
> > + ioport_unmap(fw_cfg_dev_base);
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + }
> > +}
> > +
> > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
>
> So for all arches which support ACPI, I think this driver
> should just rely on ACPI.
There was a discussion about that a few versions ago, and IIRC the
conclusion was not to expect the firmware to contend for fw_cfg access
after the guest kernel boots:
https://lkml.org/lkml/2015/10/5/283
(I even had a prototype version doing what you suggested, but per the above
reference decided to drop it -- which IMHO is for the better, since otherwise
I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
see https://lkml.org/lkml/2015/11/4/534 )
>
> > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > +# define FW_CFG_CTRL_OFF 0x08
> > +# define FW_CFG_DATA_OFF 0x00
> > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x02
> > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
> > +# else
> > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
>
> Better not try hacks like this, they are hard
> to support down the road. Please only list what is tested and
> actually exposed by QEMU.
I was looking for a standard way to advertise register offsets within
the ioport or mmio region assigned to fw_cfg, but right now the answer
is "there isn't one", and "register offsets are an arch specific
detail". As such, the only reasonable way I saw was to copy the same
values used in the QEMU source.
See also:
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
Thanks much,
--Gabriel
>
> > +# endif
> > +#endif
> > +
> > +/* initialize fw_cfg device i/o from platform data */
> > +static int fw_cfg_do_platform_probe(struct platform_device *pdev)
> > +{
> > + char sig[FW_CFG_SIG_SIZE];
> > + struct resource *range, *ctrl, *data;
> > +
> > + /* acquire i/o range details */
> > + fw_cfg_is_mmio = false;
> > + range = platform_get_resource(pdev, IORESOURCE_IO, 0);
> > + if (!range) {
> > + fw_cfg_is_mmio = true;
> > + range = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (!range)
> > + return -EINVAL;
> > + }
> > + fw_cfg_p_base = range->start;
> > + fw_cfg_p_size = resource_size(range);
> > +
> > + if (fw_cfg_is_mmio) {
> > + if (!request_mem_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_mem"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioremap(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + } else {
> > + if (!request_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_io"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioport_map(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + }
> > +
> > + /* were custom register offsets provided (e.g. on the command line)? */
> > + ctrl = platform_get_resource_byname(pdev, IORESOURCE_REG, "ctrl");
> > + data = platform_get_resource_byname(pdev, IORESOURCE_REG, "data");
> > + if (ctrl && data) {
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + ctrl->start;
> > + fw_cfg_reg_data = fw_cfg_dev_base + data->start;
> > + } else {
> > + /* use architecture-specific offsets */
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + FW_CFG_CTRL_OFF;
> > + fw_cfg_reg_data = fw_cfg_dev_base + FW_CFG_DATA_OFF;
> > + }
> > +
> > + /* verify fw_cfg device signature */
> > + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> > + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) != 0) {
> > + fw_cfg_io_cleanup();
> > + return -ENODEV;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> > +static u32 fw_cfg_rev;
> > +
> > +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", fw_cfg_rev);
> > +}
> > +
> > +static const struct {
> > + struct attribute attr;
> > + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> > +} fw_cfg_rev_attr = {
> > + .attr = { .name = "rev", .mode = S_IRUSR },
> > + .show = fw_cfg_showrev,
> > +};
> > +
> > +/* fw_cfg_sysfs_entry type */
> > +struct fw_cfg_sysfs_entry {
> > + struct kobject kobj;
> > + struct fw_cfg_file f;
> > + struct list_head list;
> > +};
> > +
> > +/* get fw_cfg_sysfs_entry from kobject member */
> > +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> > +{
> > + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> > +}
> > +
> > +/* fw_cfg_sysfs_attribute type */
> > +struct fw_cfg_sysfs_attribute {
> > + struct attribute attr;
> > + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> > +};
> > +
> > +/* get fw_cfg_sysfs_attribute from attribute member */
> > +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> > +{
> > + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> > +}
> > +
> > +/* global cache of fw_cfg_sysfs_entry objects */
> > +static LIST_HEAD(fw_cfg_entry_cache);
> > +
> > +/* kobjects removed lazily by kernel, mutual exclusion needed */
> > +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> > +
> > +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_del(&entry->list);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static void fw_cfg_sysfs_cache_cleanup(void)
> > +{
> > + struct fw_cfg_sysfs_entry *entry, *next;
> > +
> > + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> > + /* will end up invoking fw_cfg_sysfs_cache_delist()
> > + * via each object's release() method (i.e. destructor)
> > + */
> > + kobject_put(&entry->kobj);
> > + }
> > +}
> > +
> > +/* default_attrs: per-entry attributes and show methods */
> > +
> > +#define FW_CFG_SYSFS_ATTR(_attr) \
> > +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> > + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> > + .show = fw_cfg_sysfs_show_##_attr, \
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.size);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.select);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%s\n", e->f.name);
> > +}
> > +
> > +static FW_CFG_SYSFS_ATTR(size);
> > +static FW_CFG_SYSFS_ATTR(key);
> > +static FW_CFG_SYSFS_ATTR(name);
> > +
> > +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> > + &fw_cfg_sysfs_attr_size.attr,
> > + &fw_cfg_sysfs_attr_key.attr,
> > + &fw_cfg_sysfs_attr_name.attr,
> > + NULL,
> > +};
> > +
> > +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> > +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> > + char *buf)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> > +
> > + return attr->show(entry, buf);
> > +}
> > +
> > +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> > + .show = fw_cfg_sysfs_attr_show,
> > +};
> > +
> > +/* release: destructor, to be called via kobject_put() */
> > +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + fw_cfg_sysfs_cache_delist(entry);
> > + kfree(entry);
> > +}
> > +
> > +/* kobj_type: ties together all properties required to register an entry */
> > +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> > + .default_attrs = fw_cfg_sysfs_entry_attrs,
> > + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> > + .release = fw_cfg_sysfs_release_entry,
> > +};
> > +
> > +/* raw-read method and attribute */
> > +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> > + struct bin_attribute *bin_attr,
> > + char *buf, loff_t pos, size_t count)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + if (pos > entry->f.size)
> > + return -EINVAL;
> > +
> > + if (count > entry->f.size - pos)
> > + count = entry->f.size - pos;
> > +
> > + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> > + return count;
> > +}
> > +
> > +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> > + .attr = { .name = "raw", .mode = S_IRUSR },
> > + .read = fw_cfg_sysfs_read_raw,
> > +};
> > +
> > +/* kobjects representing top-level and by_key folders */
> > +static struct kobject *fw_cfg_top_ko;
> > +static struct kobject *fw_cfg_sel_ko;
> > +
> > +/* register an individual fw_cfg file */
> > +static int fw_cfg_register_file(const struct fw_cfg_file *f)
> > +{
> > + int err;
> > + struct fw_cfg_sysfs_entry *entry;
> > +
> > + /* allocate new entry */
> > + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> > + if (!entry)
> > + return -ENOMEM;
> > +
> > + /* set file entry information */
> > + memcpy(&entry->f, f, sizeof(struct fw_cfg_file));
> > +
> > + /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
> > + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> > + fw_cfg_sel_ko, "%d", entry->f.select);
> > + if (err)
> > + goto err_register;
> > +
> > + /* add raw binary content access */
> > + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> > + if (err)
> > + goto err_add_raw;
> > +
> > + /* success, add entry to global cache */
> > + fw_cfg_sysfs_cache_enlist(entry);
> > + return 0;
> > +
> > +err_add_raw:
> > + kobject_del(&entry->kobj);
> > +err_register:
> > + kfree(entry);
> > + return err;
> > +}
> > +
> > +/* iterate over all fw_cfg directory entries, registering each one */
> > +static int fw_cfg_register_dir_entries(void)
> > +{
> > + int ret = 0;
> > + u32 count, i;
> > + struct fw_cfg_file *dir;
> > + size_t dir_size;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, &count, 0, sizeof(count));
> > + count = be32_to_cpu(count);
> > + dir_size = count * sizeof(struct fw_cfg_file);
> > +
> > + dir = kmalloc(dir_size, GFP_KERNEL);
> > + if (!dir)
> > + return -ENOMEM;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(count), dir_size);
> > +
> > + for (i = 0; i < count; i++) {
> > + dir[i].size = be32_to_cpu(dir[i].size);
> > + dir[i].select = be16_to_cpu(dir[i].select);
> > + ret = fw_cfg_register_file(&dir[i]);
> > + if (ret)
> > + break;
> > + }
> > +
> > + kfree(dir);
> > + return ret;
> > +}
> > +
> > +/* unregister top-level or by_key folder */
> > +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> > +{
> > + kobject_del(kobj);
> > + kobject_put(kobj);
> > +}
> > +
> > +static int fw_cfg_sysfs_probe(struct platform_device *pdev)
> > +{
> > + int err;
> > +
> > + /* NOTE: If we supported multiple fw_cfg devices, we'd first create
> > + * a subdirectory named after e.g. pdev->id, then hang per-device
> > + * by_key subdirectories underneath it. However, only
> > + * one fw_cfg device exist system-wide, so if one was already found
> > + * earlier, we might as well stop here.
> > + */
> > + if (fw_cfg_sel_ko)
> > + return -EBUSY;
> > +
> > + /* create by_key subdirectory of /sys/firmware/qemu_fw_cfg/ */
> > + err = -ENOMEM;
> > + fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
> > + if (!fw_cfg_sel_ko)
> > + goto err_sel;
> > +
> > + /* initialize fw_cfg device i/o from platform data */
> > + err = fw_cfg_do_platform_probe(pdev);
> > + if (err)
> > + goto err_probe;
> > +
> > + /* get revision number, add matching top-level attribute */
> > + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> > + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> > + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > + if (err)
> > + goto err_rev;
> > +
> > + /* process fw_cfg file directory entry, registering each file */
> > + err = fw_cfg_register_dir_entries();
> > + if (err)
> > + goto err_dir;
> > +
> > + /* success */
> > + pr_debug("fw_cfg: loaded.\n");
> > + return 0;
> > +
> > +err_dir:
> > + fw_cfg_sysfs_cache_cleanup();
> > + sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > +err_rev:
> > + fw_cfg_io_cleanup();
> > +err_probe:
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > +err_sel:
> > + return err;
> > +}
> > +
> > +static int fw_cfg_sysfs_remove(struct platform_device *pdev)
> > +{
> > + pr_debug("fw_cfg: unloading.\n");
> > + fw_cfg_sysfs_cache_cleanup();
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > + fw_cfg_io_cleanup();
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id fw_cfg_sysfs_mmio_match[] = {
> > + { .compatible = "qemu,fw-cfg-mmio", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(of, fw_cfg_sysfs_mmio_match);
> > +
> > +#ifdef CONFIG_ACPI
> > +static const struct acpi_device_id fw_cfg_sysfs_acpi_match[] = {
> > + { "QEMU0002", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(acpi, fw_cfg_sysfs_acpi_match);
> > +#endif
> > +
> > +static struct platform_driver fw_cfg_sysfs_driver = {
> > + .probe = fw_cfg_sysfs_probe,
> > + .remove = fw_cfg_sysfs_remove,
> > + .driver = {
> > + .name = "fw_cfg",
> > + .of_match_table = fw_cfg_sysfs_mmio_match,
> > + .acpi_match_table = ACPI_PTR(fw_cfg_sysfs_acpi_match),
> > + },
> > +};
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > +
> > +static struct platform_device *fw_cfg_cmdline_dev;
> > +
> > +/* this probably belongs in e.g. include/linux/types.h,
> > + * but right now we are the only ones doing it...
> > + */
> > +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> > +#define __PHYS_ADDR_PREFIX "ll"
> > +#else
> > +#define __PHYS_ADDR_PREFIX ""
> > +#endif
> > +
> > +/* use special scanf/printf modifier for phys_addr_t, resource_size_t */
> > +#define PH_ADDR_SCAN_FMT "@%" __PHYS_ADDR_PREFIX "i%n" \
> > + ":%" __PHYS_ADDR_PREFIX "i" \
> > + ":%" __PHYS_ADDR_PREFIX "i%n"
> > +
> > +#define PH_ADDR_PR_1_FMT "0x%" __PHYS_ADDR_PREFIX "x@" \
> > + "0x%" __PHYS_ADDR_PREFIX "x"
> > +
> > +#define PH_ADDR_PR_3_FMT PH_ADDR_PR_1_FMT \
> > + ":%" __PHYS_ADDR_PREFIX "u" \
> > + ":%" __PHYS_ADDR_PREFIX "u"
> > +
> > +static int fw_cfg_cmdline_set(const char *arg, const struct kernel_param *kp)
> > +{
> > + struct resource res[3] = {};
> > + char *str;
> > + phys_addr_t base;
> > + resource_size_t size, ctrl_off, data_off;
> > + int processed, consumed = 0;
> > +
> > + /* only one fw_cfg device can exist system-wide, so if one
> > + * was processed on the command line already, we might as
> > + * well stop here.
> > + */
> > + if (fw_cfg_cmdline_dev) {
> > + /* avoid leaking previously registered device */
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > + return -EINVAL;
> > + }
> > +
> > + /* consume "<size>" portion of command line argument */
> > + size = memparse(arg, &str);
> > +
> > + /* get "@<base>[:<ctrl_off>:<data_off>]" chunks */
> > + processed = sscanf(str, PH_ADDR_SCAN_FMT,
> > + &base, &consumed,
> > + &ctrl_off, &data_off, &consumed);
> > +
> > + /* sscanf() must process precisely 1 or 3 chunks:
> > + * <base> is mandatory, optionally followed by <ctrl_off>
> > + * and <data_off>;
> > + * there must be no extra characters after the last chunk,
> > + * so str[consumed] must be '\0'.
> > + */
> > + if (str[consumed] ||
> > + (processed != 1 && processed != 3))
> > + return -EINVAL;
> > +
> > + res[0].start = base;
> > + res[0].end = base + size - 1;
> > + res[0].flags = !strcmp(kp->name, "mmio") ? IORESOURCE_MEM :
> > + IORESOURCE_IO;
> > +
> > + /* insert register offsets, if provided */
> > + if (processed > 1) {
> > + res[1].name = "ctrl";
> > + res[1].start = ctrl_off;
> > + res[1].flags = IORESOURCE_REG;
> > + res[2].name = "data";
> > + res[2].start = data_off;
> > + res[2].flags = IORESOURCE_REG;
> > + }
> > +
> > + /* "processed" happens to nicely match the number of resources
> > + * we need to pass in to this platform device.
> > + */
> > + fw_cfg_cmdline_dev = platform_device_register_simple("fw_cfg",
> > + PLATFORM_DEVID_NONE, res, processed);
> > + if (IS_ERR(fw_cfg_cmdline_dev))
> > + return PTR_ERR(fw_cfg_cmdline_dev);
> > +
> > + return 0;
> > +}
> > +
> > +static int fw_cfg_cmdline_get(char *buf, const struct kernel_param *kp)
> > +{
> > + /* stay silent if device was not configured via the command
> > + * line, or if the parameter name (ioport/mmio) doesn't match
> > + * the device setting
> > + */
> > + if (!fw_cfg_cmdline_dev ||
> > + (!strcmp(kp->name, "mmio") ^
> > + (fw_cfg_cmdline_dev->resource[0].flags == IORESOURCE_MEM)))
> > + return 0;
> > +
> > + switch (fw_cfg_cmdline_dev->num_resources) {
> > + case 1:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_1_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start);
> > + case 3:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_3_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start,
> > + fw_cfg_cmdline_dev->resource[1].start,
> > + fw_cfg_cmdline_dev->resource[2].start);
> > + }
> > +
> > + /* Should never get here */
> > + WARN(1, "Unexpected number of resources: %d\n",
> > + fw_cfg_cmdline_dev->num_resources);
> > + return 0;
> > +}
> > +
> > +static const struct kernel_param_ops fw_cfg_cmdline_param_ops = {
> > + .set = fw_cfg_cmdline_set,
> > + .get = fw_cfg_cmdline_get,
> > +};
> > +
> > +device_param_cb(ioport, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +device_param_cb(mmio, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +
> > +#endif /* CONFIG_FW_CFG_SYSFS_CMDLINE */
> > +
> > +static int __init fw_cfg_sysfs_init(void)
> > +{
> > + /* create /sys/firmware/qemu_fw_cfg/ top level directory */
> > + fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
> > + if (!fw_cfg_top_ko)
> > + return -ENOMEM;
> > +
> > + return platform_driver_register(&fw_cfg_sysfs_driver);
> > +}
> > +
> > +static void __exit fw_cfg_sysfs_exit(void)
> > +{
> > + platform_driver_unregister(&fw_cfg_sysfs_driver);
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > +#endif
> > +
> > + /* clean up /sys/firmware/qemu_fw_cfg/ */
> > + fw_cfg_kobj_cleanup(fw_cfg_top_ko);
> > +}
> > +
> > +module_init(fw_cfg_sysfs_init);
> > +module_exit(fw_cfg_sysfs_exit);
> > --
> > 2.4.3
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:06 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 13:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 10:30:26AM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 28, 2016 at 09:23:11AM -0500, Gabriel L. Somlo wrote:
> > From: Gabriel Somlo <somlo@cmu.edu>
> >
> > Make fw_cfg entries of type "file" available via sysfs. Entries
> > are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> > named after each entry's selector key. Filename, selector value,
> > and size read-only attributes are included for each entry. Also,
> > a "raw" attribute allows retrieval of the full binary content of
> > each entry.
> >
> > The fw_cfg device can be instantiated automatically from ACPI or
> > the Device Tree, or manually by using a kernel module (or command
> > line) parameter, with a syntax outlined in the documentation file.
> >
> > Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> > ---
> > .../ABI/testing/sysfs-firmware-qemu_fw_cfg | 58 ++
> > drivers/firmware/Kconfig | 19 +
> > drivers/firmware/Makefile | 1 +
> > drivers/firmware/qemu_fw_cfg.c | 648 +++++++++++++++++++++
> > 4 files changed, 726 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > create mode 100644 drivers/firmware/qemu_fw_cfg.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > new file mode 100644
> > index 0000000..e9e58d4
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > @@ -0,0 +1,58 @@
> > +What: /sys/firmware/qemu_fw_cfg/
> > +Date: August 2015
> > +Contact: Gabriel Somlo <somlo@cmu.edu>
> > +Description:
> > + Several different architectures supported by QEMU (x86, arm,
> > + sun4*, ppc/mac) are provisioned with a firmware configuration
> > + (fw_cfg) device, originally intended as a way for the host to
> > + provide configuration data to the guest firmware. Starting
> > + with QEMU v2.4, arbitrary fw_cfg file entries may be specified
> > + by the user on the command line, which makes fw_cfg additionally
> > + useful as an out-of-band, asynchronous mechanism for providing
> > + configuration data to the guest userspace.
> > +
> > + The authoritative guest-side hardware interface documentation
> > + to the fw_cfg device can be found in "docs/specs/fw_cfg.txt"
> > + in the QEMU source tree.
> > +
> > + === SysFS fw_cfg Interface ===
> > +
> > + The fw_cfg sysfs interface described in this document is only
> > + intended to display discoverable blobs (i.e., those registered
> > + with the file directory), as there is no way to determine the
> > + presence or size of "legacy" blobs (with selector keys between
> > + 0x0002 and 0x0018) programmatically.
> > +
> > + All fw_cfg information is shown under:
> > +
> > + /sys/firmware/qemu_fw_cfg/
> > +
> > + The only legacy blob displayed is the fw_cfg device revision:
> > +
> > + /sys/firmware/qemu_fw_cfg/rev
> > +
> > + --- Discoverable fw_cfg blobs by selector key ---
> > +
> > + All discoverable blobs listed in the fw_cfg file directory are
> > + displayed as entries named after their unique selector key
> > + value, e.g.:
> > +
> > + /sys/firmware/qemu_fw_cfg/by_key/32
> > + /sys/firmware/qemu_fw_cfg/by_key/33
> > + /sys/firmware/qemu_fw_cfg/by_key/34
> > + ...
> > +
> > + Each such fw_cfg sysfs entry has the following values exported
> > + as attributes:
> > +
> > + name : The 56-byte nul-terminated ASCII string used as the
> > + blob's 'file name' in the fw_cfg directory.
> > + size : The length of the blob, as given in the fw_cfg
> > + directory.
> > + key : The value of the blob's selector key as given in the
> > + fw_cfg directory. This value is the same as used in
> > + the parent directory name.
> > + raw : The raw bytes of the blob, obtained by selecting the
> > + entry via the control register, and reading a number
> > + of bytes equal to the blob size from the data
> > + register.
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 49a3a11..5130f74 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -161,6 +161,25 @@ config RASPBERRYPI_FIRMWARE
> > This option enables support for communicating with the firmware on the
> > Raspberry Pi.
> >
> > +config FW_CFG_SYSFS
> > + tristate "QEMU fw_cfg device support in sysfs"
> > + depends on SYSFS && (ARM || ARM64 || PPC_PMAC || SPARC || X86)
> > + default n
> > + help
> > + Say Y or M here to enable the exporting of the QEMU firmware
> > + configuration (fw_cfg) file entries via sysfs. Entries are
> > + found under /sys/firmware/fw_cfg when this option is enabled
> > + and loaded.
> > +
> > +config FW_CFG_SYSFS_CMDLINE
> > + bool "QEMU fw_cfg device parameter parsing"
> > + depends on FW_CFG_SYSFS
> > + help
> > + Allow the qemu_fw_cfg device to be initialized via the kernel
> > + command line or using a module parameter.
> > + WARNING: Using incorrect parameters (base address in particular)
> > + may crash your system.
> > +
> > config QCOM_SCM
> > bool
> > depends on ARM || ARM64
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 48dd417..474bada 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -14,6 +14,7 @@ obj-$(CONFIG_ISCSI_IBFT_FIND) += iscsi_ibft_find.o
> > obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> > obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> > obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
> > +obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
> > obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> > obj-$(CONFIG_QCOM_SCM_64) += qcom_scm-64.o
> > obj-$(CONFIG_QCOM_SCM_32) += qcom_scm-32.o
> > diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> > new file mode 100644
> > index 0000000..83e8a5c
> > --- /dev/null
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -0,0 +1,648 @@
> > +/*
> > + * drivers/firmware/qemu_fw_cfg.c
> > + *
> > + * Copyright 2015 Carnegie Mellon University
> > + *
> > + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> > + * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
> > + *
> > + * The fw_cfg device may be instantiated via either an ACPI node (on x86
> > + * and select subsets of aarch64), a Device Tree node (on arm), or using
> > + * a kernel module (or command line) parameter with the following syntax:
> > + *
> > + * [fw_cfg.]ioport=<size>@<base>[:<ctrl_off>:<data_off>]
> > + * or
> > + * [fw_cfg.]mmio=<size>@<base>[:<ctrl_off>:<data_off>]
> > + *
> > + * where:
> > + * <size> := size of ioport or mmio range
> > + * <base> := physical base address of ioport or mmio range
> > + * <ctrl_off> := (optional) offset of control register
> > + * <data_off> := (optional) offset of data register
> > + *
> > + * e.g.:
> > + * fw_cfg.ioport=2@0x510:0:1 (the default on x86)
> > + * or
> > + * fw_cfg.mmio=0xA@0x9020000:8:0 (the default on arm)
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/acpi.h>
> > +#include <linux/slab.h>
> > +#include <linux/io.h>
> > +#include <linux/ioport.h>
> > +
> > +MODULE_AUTHOR("Gabriel L. Somlo <somlo@cmu.edu>");
> > +MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> > +MODULE_LICENSE("GPL");
> > +
> > +/* selector key values for "well-known" fw_cfg entries */
> > +#define FW_CFG_SIGNATURE 0x00
> > +#define FW_CFG_ID 0x01
> > +#define FW_CFG_FILE_DIR 0x19
> > +
> > +/* size in bytes of fw_cfg signature */
> > +#define FW_CFG_SIG_SIZE 4
> > +
> > +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> > +#define FW_CFG_MAX_FILE_PATH 56
> > +
> > +/* fw_cfg file directory entry type */
> > +struct fw_cfg_file {
> > + u32 size;
> > + u16 select;
> > + u16 reserved;
> > + char name[FW_CFG_MAX_FILE_PATH];
> > +};
> > +
> > +/* fw_cfg device i/o register addresses */
> > +static bool fw_cfg_is_mmio;
> > +static phys_addr_t fw_cfg_p_base;
> > +static resource_size_t fw_cfg_p_size;
> > +static void __iomem *fw_cfg_dev_base;
> > +static void __iomem *fw_cfg_reg_ctrl;
> > +static void __iomem *fw_cfg_reg_data;
> > +
> > +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> > +static DEFINE_MUTEX(fw_cfg_dev_lock);
> > +
> > +/* pick appropriate endianness for selector key */
> > +static inline u16 fw_cfg_sel_endianness(u16 key)
> > +{
> > + return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
> > +}
> > +
> > +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> > +static inline void fw_cfg_read_blob(u16 key,
> > + void *buf, loff_t pos, size_t count)
> > +{
> > + mutex_lock(&fw_cfg_dev_lock);
> > + iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > + while (pos-- > 0)
> > + ioread8(fw_cfg_reg_data);
> > + ioread8_rep(fw_cfg_reg_data, buf, count);
> > + mutex_unlock(&fw_cfg_dev_lock);
> > +}
>
> This locking is not enough I think: ACPI might be
> accessing FW CFG meanwhile, and assuming it's
> the only owner.
>
> I think that on systems that support ACPI,
> it would be better to use an ACPI interface
> to read blobs, instead of poking on _CRS and banging on registers directly.
>
> This will also make it easier to extend the interface,
> and support DMA.
>
>
>
> > +
> > +/* clean up fw_cfg device i/o */
> > +static void fw_cfg_io_cleanup(void)
> > +{
> > + if (fw_cfg_is_mmio) {
> > + iounmap(fw_cfg_dev_base);
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + } else {
> > + ioport_unmap(fw_cfg_dev_base);
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + }
> > +}
> > +
> > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
>
> So for all arches which support ACPI, I think this driver
> should just rely on ACPI.
There was a discussion about that a few versions ago, and IIRC the
conclusion was not to expect the firmware to contend for fw_cfg access
after the guest kernel boots:
https://lkml.org/lkml/2015/10/5/283
(I even had a prototype version doing what you suggested, but per the above
reference decided to drop it -- which IMHO is for the better, since otherwise
I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
see https://lkml.org/lkml/2015/11/4/534 )
>
> > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > +# define FW_CFG_CTRL_OFF 0x08
> > +# define FW_CFG_DATA_OFF 0x00
> > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x02
> > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
> > +# else
> > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
>
> Better not try hacks like this, they are hard
> to support down the road. Please only list what is tested and
> actually exposed by QEMU.
I was looking for a standard way to advertise register offsets within
the ioport or mmio region assigned to fw_cfg, but right now the answer
is "there isn't one", and "register offsets are an arch specific
detail". As such, the only reasonable way I saw was to copy the same
values used in the QEMU source.
See also:
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
Thanks much,
--Gabriel
>
> > +# endif
> > +#endif
> > +
> > +/* initialize fw_cfg device i/o from platform data */
> > +static int fw_cfg_do_platform_probe(struct platform_device *pdev)
> > +{
> > + char sig[FW_CFG_SIG_SIZE];
> > + struct resource *range, *ctrl, *data;
> > +
> > + /* acquire i/o range details */
> > + fw_cfg_is_mmio = false;
> > + range = platform_get_resource(pdev, IORESOURCE_IO, 0);
> > + if (!range) {
> > + fw_cfg_is_mmio = true;
> > + range = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (!range)
> > + return -EINVAL;
> > + }
> > + fw_cfg_p_base = range->start;
> > + fw_cfg_p_size = resource_size(range);
> > +
> > + if (fw_cfg_is_mmio) {
> > + if (!request_mem_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_mem"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioremap(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + } else {
> > + if (!request_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_io"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioport_map(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + }
> > +
> > + /* were custom register offsets provided (e.g. on the command line)? */
> > + ctrl = platform_get_resource_byname(pdev, IORESOURCE_REG, "ctrl");
> > + data = platform_get_resource_byname(pdev, IORESOURCE_REG, "data");
> > + if (ctrl && data) {
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + ctrl->start;
> > + fw_cfg_reg_data = fw_cfg_dev_base + data->start;
> > + } else {
> > + /* use architecture-specific offsets */
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + FW_CFG_CTRL_OFF;
> > + fw_cfg_reg_data = fw_cfg_dev_base + FW_CFG_DATA_OFF;
> > + }
> > +
> > + /* verify fw_cfg device signature */
> > + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> > + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) != 0) {
> > + fw_cfg_io_cleanup();
> > + return -ENODEV;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> > +static u32 fw_cfg_rev;
> > +
> > +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", fw_cfg_rev);
> > +}
> > +
> > +static const struct {
> > + struct attribute attr;
> > + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> > +} fw_cfg_rev_attr = {
> > + .attr = { .name = "rev", .mode = S_IRUSR },
> > + .show = fw_cfg_showrev,
> > +};
> > +
> > +/* fw_cfg_sysfs_entry type */
> > +struct fw_cfg_sysfs_entry {
> > + struct kobject kobj;
> > + struct fw_cfg_file f;
> > + struct list_head list;
> > +};
> > +
> > +/* get fw_cfg_sysfs_entry from kobject member */
> > +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> > +{
> > + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> > +}
> > +
> > +/* fw_cfg_sysfs_attribute type */
> > +struct fw_cfg_sysfs_attribute {
> > + struct attribute attr;
> > + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> > +};
> > +
> > +/* get fw_cfg_sysfs_attribute from attribute member */
> > +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> > +{
> > + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> > +}
> > +
> > +/* global cache of fw_cfg_sysfs_entry objects */
> > +static LIST_HEAD(fw_cfg_entry_cache);
> > +
> > +/* kobjects removed lazily by kernel, mutual exclusion needed */
> > +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> > +
> > +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_del(&entry->list);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static void fw_cfg_sysfs_cache_cleanup(void)
> > +{
> > + struct fw_cfg_sysfs_entry *entry, *next;
> > +
> > + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> > + /* will end up invoking fw_cfg_sysfs_cache_delist()
> > + * via each object's release() method (i.e. destructor)
> > + */
> > + kobject_put(&entry->kobj);
> > + }
> > +}
> > +
> > +/* default_attrs: per-entry attributes and show methods */
> > +
> > +#define FW_CFG_SYSFS_ATTR(_attr) \
> > +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> > + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> > + .show = fw_cfg_sysfs_show_##_attr, \
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.size);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.select);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%s\n", e->f.name);
> > +}
> > +
> > +static FW_CFG_SYSFS_ATTR(size);
> > +static FW_CFG_SYSFS_ATTR(key);
> > +static FW_CFG_SYSFS_ATTR(name);
> > +
> > +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> > + &fw_cfg_sysfs_attr_size.attr,
> > + &fw_cfg_sysfs_attr_key.attr,
> > + &fw_cfg_sysfs_attr_name.attr,
> > + NULL,
> > +};
> > +
> > +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> > +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> > + char *buf)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> > +
> > + return attr->show(entry, buf);
> > +}
> > +
> > +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> > + .show = fw_cfg_sysfs_attr_show,
> > +};
> > +
> > +/* release: destructor, to be called via kobject_put() */
> > +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + fw_cfg_sysfs_cache_delist(entry);
> > + kfree(entry);
> > +}
> > +
> > +/* kobj_type: ties together all properties required to register an entry */
> > +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> > + .default_attrs = fw_cfg_sysfs_entry_attrs,
> > + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> > + .release = fw_cfg_sysfs_release_entry,
> > +};
> > +
> > +/* raw-read method and attribute */
> > +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> > + struct bin_attribute *bin_attr,
> > + char *buf, loff_t pos, size_t count)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + if (pos > entry->f.size)
> > + return -EINVAL;
> > +
> > + if (count > entry->f.size - pos)
> > + count = entry->f.size - pos;
> > +
> > + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> > + return count;
> > +}
> > +
> > +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> > + .attr = { .name = "raw", .mode = S_IRUSR },
> > + .read = fw_cfg_sysfs_read_raw,
> > +};
> > +
> > +/* kobjects representing top-level and by_key folders */
> > +static struct kobject *fw_cfg_top_ko;
> > +static struct kobject *fw_cfg_sel_ko;
> > +
> > +/* register an individual fw_cfg file */
> > +static int fw_cfg_register_file(const struct fw_cfg_file *f)
> > +{
> > + int err;
> > + struct fw_cfg_sysfs_entry *entry;
> > +
> > + /* allocate new entry */
> > + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> > + if (!entry)
> > + return -ENOMEM;
> > +
> > + /* set file entry information */
> > + memcpy(&entry->f, f, sizeof(struct fw_cfg_file));
> > +
> > + /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
> > + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> > + fw_cfg_sel_ko, "%d", entry->f.select);
> > + if (err)
> > + goto err_register;
> > +
> > + /* add raw binary content access */
> > + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> > + if (err)
> > + goto err_add_raw;
> > +
> > + /* success, add entry to global cache */
> > + fw_cfg_sysfs_cache_enlist(entry);
> > + return 0;
> > +
> > +err_add_raw:
> > + kobject_del(&entry->kobj);
> > +err_register:
> > + kfree(entry);
> > + return err;
> > +}
> > +
> > +/* iterate over all fw_cfg directory entries, registering each one */
> > +static int fw_cfg_register_dir_entries(void)
> > +{
> > + int ret = 0;
> > + u32 count, i;
> > + struct fw_cfg_file *dir;
> > + size_t dir_size;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, &count, 0, sizeof(count));
> > + count = be32_to_cpu(count);
> > + dir_size = count * sizeof(struct fw_cfg_file);
> > +
> > + dir = kmalloc(dir_size, GFP_KERNEL);
> > + if (!dir)
> > + return -ENOMEM;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(count), dir_size);
> > +
> > + for (i = 0; i < count; i++) {
> > + dir[i].size = be32_to_cpu(dir[i].size);
> > + dir[i].select = be16_to_cpu(dir[i].select);
> > + ret = fw_cfg_register_file(&dir[i]);
> > + if (ret)
> > + break;
> > + }
> > +
> > + kfree(dir);
> > + return ret;
> > +}
> > +
> > +/* unregister top-level or by_key folder */
> > +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> > +{
> > + kobject_del(kobj);
> > + kobject_put(kobj);
> > +}
> > +
> > +static int fw_cfg_sysfs_probe(struct platform_device *pdev)
> > +{
> > + int err;
> > +
> > + /* NOTE: If we supported multiple fw_cfg devices, we'd first create
> > + * a subdirectory named after e.g. pdev->id, then hang per-device
> > + * by_key subdirectories underneath it. However, only
> > + * one fw_cfg device exist system-wide, so if one was already found
> > + * earlier, we might as well stop here.
> > + */
> > + if (fw_cfg_sel_ko)
> > + return -EBUSY;
> > +
> > + /* create by_key subdirectory of /sys/firmware/qemu_fw_cfg/ */
> > + err = -ENOMEM;
> > + fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
> > + if (!fw_cfg_sel_ko)
> > + goto err_sel;
> > +
> > + /* initialize fw_cfg device i/o from platform data */
> > + err = fw_cfg_do_platform_probe(pdev);
> > + if (err)
> > + goto err_probe;
> > +
> > + /* get revision number, add matching top-level attribute */
> > + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> > + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> > + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > + if (err)
> > + goto err_rev;
> > +
> > + /* process fw_cfg file directory entry, registering each file */
> > + err = fw_cfg_register_dir_entries();
> > + if (err)
> > + goto err_dir;
> > +
> > + /* success */
> > + pr_debug("fw_cfg: loaded.\n");
> > + return 0;
> > +
> > +err_dir:
> > + fw_cfg_sysfs_cache_cleanup();
> > + sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > +err_rev:
> > + fw_cfg_io_cleanup();
> > +err_probe:
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > +err_sel:
> > + return err;
> > +}
> > +
> > +static int fw_cfg_sysfs_remove(struct platform_device *pdev)
> > +{
> > + pr_debug("fw_cfg: unloading.\n");
> > + fw_cfg_sysfs_cache_cleanup();
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > + fw_cfg_io_cleanup();
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id fw_cfg_sysfs_mmio_match[] = {
> > + { .compatible = "qemu,fw-cfg-mmio", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(of, fw_cfg_sysfs_mmio_match);
> > +
> > +#ifdef CONFIG_ACPI
> > +static const struct acpi_device_id fw_cfg_sysfs_acpi_match[] = {
> > + { "QEMU0002", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(acpi, fw_cfg_sysfs_acpi_match);
> > +#endif
> > +
> > +static struct platform_driver fw_cfg_sysfs_driver = {
> > + .probe = fw_cfg_sysfs_probe,
> > + .remove = fw_cfg_sysfs_remove,
> > + .driver = {
> > + .name = "fw_cfg",
> > + .of_match_table = fw_cfg_sysfs_mmio_match,
> > + .acpi_match_table = ACPI_PTR(fw_cfg_sysfs_acpi_match),
> > + },
> > +};
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > +
> > +static struct platform_device *fw_cfg_cmdline_dev;
> > +
> > +/* this probably belongs in e.g. include/linux/types.h,
> > + * but right now we are the only ones doing it...
> > + */
> > +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> > +#define __PHYS_ADDR_PREFIX "ll"
> > +#else
> > +#define __PHYS_ADDR_PREFIX ""
> > +#endif
> > +
> > +/* use special scanf/printf modifier for phys_addr_t, resource_size_t */
> > +#define PH_ADDR_SCAN_FMT "@%" __PHYS_ADDR_PREFIX "i%n" \
> > + ":%" __PHYS_ADDR_PREFIX "i" \
> > + ":%" __PHYS_ADDR_PREFIX "i%n"
> > +
> > +#define PH_ADDR_PR_1_FMT "0x%" __PHYS_ADDR_PREFIX "x@" \
> > + "0x%" __PHYS_ADDR_PREFIX "x"
> > +
> > +#define PH_ADDR_PR_3_FMT PH_ADDR_PR_1_FMT \
> > + ":%" __PHYS_ADDR_PREFIX "u" \
> > + ":%" __PHYS_ADDR_PREFIX "u"
> > +
> > +static int fw_cfg_cmdline_set(const char *arg, const struct kernel_param *kp)
> > +{
> > + struct resource res[3] = {};
> > + char *str;
> > + phys_addr_t base;
> > + resource_size_t size, ctrl_off, data_off;
> > + int processed, consumed = 0;
> > +
> > + /* only one fw_cfg device can exist system-wide, so if one
> > + * was processed on the command line already, we might as
> > + * well stop here.
> > + */
> > + if (fw_cfg_cmdline_dev) {
> > + /* avoid leaking previously registered device */
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > + return -EINVAL;
> > + }
> > +
> > + /* consume "<size>" portion of command line argument */
> > + size = memparse(arg, &str);
> > +
> > + /* get "@<base>[:<ctrl_off>:<data_off>]" chunks */
> > + processed = sscanf(str, PH_ADDR_SCAN_FMT,
> > + &base, &consumed,
> > + &ctrl_off, &data_off, &consumed);
> > +
> > + /* sscanf() must process precisely 1 or 3 chunks:
> > + * <base> is mandatory, optionally followed by <ctrl_off>
> > + * and <data_off>;
> > + * there must be no extra characters after the last chunk,
> > + * so str[consumed] must be '\0'.
> > + */
> > + if (str[consumed] ||
> > + (processed != 1 && processed != 3))
> > + return -EINVAL;
> > +
> > + res[0].start = base;
> > + res[0].end = base + size - 1;
> > + res[0].flags = !strcmp(kp->name, "mmio") ? IORESOURCE_MEM :
> > + IORESOURCE_IO;
> > +
> > + /* insert register offsets, if provided */
> > + if (processed > 1) {
> > + res[1].name = "ctrl";
> > + res[1].start = ctrl_off;
> > + res[1].flags = IORESOURCE_REG;
> > + res[2].name = "data";
> > + res[2].start = data_off;
> > + res[2].flags = IORESOURCE_REG;
> > + }
> > +
> > + /* "processed" happens to nicely match the number of resources
> > + * we need to pass in to this platform device.
> > + */
> > + fw_cfg_cmdline_dev = platform_device_register_simple("fw_cfg",
> > + PLATFORM_DEVID_NONE, res, processed);
> > + if (IS_ERR(fw_cfg_cmdline_dev))
> > + return PTR_ERR(fw_cfg_cmdline_dev);
> > +
> > + return 0;
> > +}
> > +
> > +static int fw_cfg_cmdline_get(char *buf, const struct kernel_param *kp)
> > +{
> > + /* stay silent if device was not configured via the command
> > + * line, or if the parameter name (ioport/mmio) doesn't match
> > + * the device setting
> > + */
> > + if (!fw_cfg_cmdline_dev ||
> > + (!strcmp(kp->name, "mmio") ^
> > + (fw_cfg_cmdline_dev->resource[0].flags == IORESOURCE_MEM)))
> > + return 0;
> > +
> > + switch (fw_cfg_cmdline_dev->num_resources) {
> > + case 1:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_1_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start);
> > + case 3:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_3_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start,
> > + fw_cfg_cmdline_dev->resource[1].start,
> > + fw_cfg_cmdline_dev->resource[2].start);
> > + }
> > +
> > + /* Should never get here */
> > + WARN(1, "Unexpected number of resources: %d\n",
> > + fw_cfg_cmdline_dev->num_resources);
> > + return 0;
> > +}
> > +
> > +static const struct kernel_param_ops fw_cfg_cmdline_param_ops = {
> > + .set = fw_cfg_cmdline_set,
> > + .get = fw_cfg_cmdline_get,
> > +};
> > +
> > +device_param_cb(ioport, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +device_param_cb(mmio, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +
> > +#endif /* CONFIG_FW_CFG_SYSFS_CMDLINE */
> > +
> > +static int __init fw_cfg_sysfs_init(void)
> > +{
> > + /* create /sys/firmware/qemu_fw_cfg/ top level directory */
> > + fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
> > + if (!fw_cfg_top_ko)
> > + return -ENOMEM;
> > +
> > + return platform_driver_register(&fw_cfg_sysfs_driver);
> > +}
> > +
> > +static void __exit fw_cfg_sysfs_exit(void)
> > +{
> > + platform_driver_unregister(&fw_cfg_sysfs_driver);
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > +#endif
> > +
> > + /* clean up /sys/firmware/qemu_fw_cfg/ */
> > + fw_cfg_kobj_cleanup(fw_cfg_top_ko);
> > +}
> > +
> > +module_init(fw_cfg_sysfs_init);
> > +module_exit(fw_cfg_sysfs_exit);
> > --
> > 2.4.3
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:06 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 13:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 10:30:26AM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 28, 2016 at 09:23:11AM -0500, Gabriel L. Somlo wrote:
> > From: Gabriel Somlo <somlo@cmu.edu>
> >
> > Make fw_cfg entries of type "file" available via sysfs. Entries
> > are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> > named after each entry's selector key. Filename, selector value,
> > and size read-only attributes are included for each entry. Also,
> > a "raw" attribute allows retrieval of the full binary content of
> > each entry.
> >
> > The fw_cfg device can be instantiated automatically from ACPI or
> > the Device Tree, or manually by using a kernel module (or command
> > line) parameter, with a syntax outlined in the documentation file.
> >
> > Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> > ---
> > .../ABI/testing/sysfs-firmware-qemu_fw_cfg | 58 ++
> > drivers/firmware/Kconfig | 19 +
> > drivers/firmware/Makefile | 1 +
> > drivers/firmware/qemu_fw_cfg.c | 648 +++++++++++++++++++++
> > 4 files changed, 726 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > create mode 100644 drivers/firmware/qemu_fw_cfg.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > new file mode 100644
> > index 0000000..e9e58d4
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-firmware-qemu_fw_cfg
> > @@ -0,0 +1,58 @@
> > +What: /sys/firmware/qemu_fw_cfg/
> > +Date: August 2015
> > +Contact: Gabriel Somlo <somlo@cmu.edu>
> > +Description:
> > + Several different architectures supported by QEMU (x86, arm,
> > + sun4*, ppc/mac) are provisioned with a firmware configuration
> > + (fw_cfg) device, originally intended as a way for the host to
> > + provide configuration data to the guest firmware. Starting
> > + with QEMU v2.4, arbitrary fw_cfg file entries may be specified
> > + by the user on the command line, which makes fw_cfg additionally
> > + useful as an out-of-band, asynchronous mechanism for providing
> > + configuration data to the guest userspace.
> > +
> > + The authoritative guest-side hardware interface documentation
> > + to the fw_cfg device can be found in "docs/specs/fw_cfg.txt"
> > + in the QEMU source tree.
> > +
> > + === SysFS fw_cfg Interface ===
> > +
> > + The fw_cfg sysfs interface described in this document is only
> > + intended to display discoverable blobs (i.e., those registered
> > + with the file directory), as there is no way to determine the
> > + presence or size of "legacy" blobs (with selector keys between
> > + 0x0002 and 0x0018) programmatically.
> > +
> > + All fw_cfg information is shown under:
> > +
> > + /sys/firmware/qemu_fw_cfg/
> > +
> > + The only legacy blob displayed is the fw_cfg device revision:
> > +
> > + /sys/firmware/qemu_fw_cfg/rev
> > +
> > + --- Discoverable fw_cfg blobs by selector key ---
> > +
> > + All discoverable blobs listed in the fw_cfg file directory are
> > + displayed as entries named after their unique selector key
> > + value, e.g.:
> > +
> > + /sys/firmware/qemu_fw_cfg/by_key/32
> > + /sys/firmware/qemu_fw_cfg/by_key/33
> > + /sys/firmware/qemu_fw_cfg/by_key/34
> > + ...
> > +
> > + Each such fw_cfg sysfs entry has the following values exported
> > + as attributes:
> > +
> > + name : The 56-byte nul-terminated ASCII string used as the
> > + blob's 'file name' in the fw_cfg directory.
> > + size : The length of the blob, as given in the fw_cfg
> > + directory.
> > + key : The value of the blob's selector key as given in the
> > + fw_cfg directory. This value is the same as used in
> > + the parent directory name.
> > + raw : The raw bytes of the blob, obtained by selecting the
> > + entry via the control register, and reading a number
> > + of bytes equal to the blob size from the data
> > + register.
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 49a3a11..5130f74 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -161,6 +161,25 @@ config RASPBERRYPI_FIRMWARE
> > This option enables support for communicating with the firmware on the
> > Raspberry Pi.
> >
> > +config FW_CFG_SYSFS
> > + tristate "QEMU fw_cfg device support in sysfs"
> > + depends on SYSFS && (ARM || ARM64 || PPC_PMAC || SPARC || X86)
> > + default n
> > + help
> > + Say Y or M here to enable the exporting of the QEMU firmware
> > + configuration (fw_cfg) file entries via sysfs. Entries are
> > + found under /sys/firmware/fw_cfg when this option is enabled
> > + and loaded.
> > +
> > +config FW_CFG_SYSFS_CMDLINE
> > + bool "QEMU fw_cfg device parameter parsing"
> > + depends on FW_CFG_SYSFS
> > + help
> > + Allow the qemu_fw_cfg device to be initialized via the kernel
> > + command line or using a module parameter.
> > + WARNING: Using incorrect parameters (base address in particular)
> > + may crash your system.
> > +
> > config QCOM_SCM
> > bool
> > depends on ARM || ARM64
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 48dd417..474bada 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -14,6 +14,7 @@ obj-$(CONFIG_ISCSI_IBFT_FIND) += iscsi_ibft_find.o
> > obj-$(CONFIG_ISCSI_IBFT) += iscsi_ibft.o
> > obj-$(CONFIG_FIRMWARE_MEMMAP) += memmap.o
> > obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
> > +obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
> > obj-$(CONFIG_QCOM_SCM) += qcom_scm.o
> > obj-$(CONFIG_QCOM_SCM_64) += qcom_scm-64.o
> > obj-$(CONFIG_QCOM_SCM_32) += qcom_scm-32.o
> > diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> > new file mode 100644
> > index 0000000..83e8a5c
> > --- /dev/null
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -0,0 +1,648 @@
> > +/*
> > + * drivers/firmware/qemu_fw_cfg.c
> > + *
> > + * Copyright 2015 Carnegie Mellon University
> > + *
> > + * Expose entries from QEMU's firmware configuration (fw_cfg) device in
> > + * sysfs (read-only, under "/sys/firmware/qemu_fw_cfg/...").
> > + *
> > + * The fw_cfg device may be instantiated via either an ACPI node (on x86
> > + * and select subsets of aarch64), a Device Tree node (on arm), or using
> > + * a kernel module (or command line) parameter with the following syntax:
> > + *
> > + * [fw_cfg.]ioport=<size>@<base>[:<ctrl_off>:<data_off>]
> > + * or
> > + * [fw_cfg.]mmio=<size>@<base>[:<ctrl_off>:<data_off>]
> > + *
> > + * where:
> > + * <size> := size of ioport or mmio range
> > + * <base> := physical base address of ioport or mmio range
> > + * <ctrl_off> := (optional) offset of control register
> > + * <data_off> := (optional) offset of data register
> > + *
> > + * e.g.:
> > + * fw_cfg.ioport=2@0x510:0:1 (the default on x86)
> > + * or
> > + * fw_cfg.mmio=0xA@0x9020000:8:0 (the default on arm)
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/acpi.h>
> > +#include <linux/slab.h>
> > +#include <linux/io.h>
> > +#include <linux/ioport.h>
> > +
> > +MODULE_AUTHOR("Gabriel L. Somlo <somlo@cmu.edu>");
> > +MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> > +MODULE_LICENSE("GPL");
> > +
> > +/* selector key values for "well-known" fw_cfg entries */
> > +#define FW_CFG_SIGNATURE 0x00
> > +#define FW_CFG_ID 0x01
> > +#define FW_CFG_FILE_DIR 0x19
> > +
> > +/* size in bytes of fw_cfg signature */
> > +#define FW_CFG_SIG_SIZE 4
> > +
> > +/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
> > +#define FW_CFG_MAX_FILE_PATH 56
> > +
> > +/* fw_cfg file directory entry type */
> > +struct fw_cfg_file {
> > + u32 size;
> > + u16 select;
> > + u16 reserved;
> > + char name[FW_CFG_MAX_FILE_PATH];
> > +};
> > +
> > +/* fw_cfg device i/o register addresses */
> > +static bool fw_cfg_is_mmio;
> > +static phys_addr_t fw_cfg_p_base;
> > +static resource_size_t fw_cfg_p_size;
> > +static void __iomem *fw_cfg_dev_base;
> > +static void __iomem *fw_cfg_reg_ctrl;
> > +static void __iomem *fw_cfg_reg_data;
> > +
> > +/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
> > +static DEFINE_MUTEX(fw_cfg_dev_lock);
> > +
> > +/* pick appropriate endianness for selector key */
> > +static inline u16 fw_cfg_sel_endianness(u16 key)
> > +{
> > + return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
> > +}
> > +
> > +/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> > +static inline void fw_cfg_read_blob(u16 key,
> > + void *buf, loff_t pos, size_t count)
> > +{
> > + mutex_lock(&fw_cfg_dev_lock);
> > + iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > + while (pos-- > 0)
> > + ioread8(fw_cfg_reg_data);
> > + ioread8_rep(fw_cfg_reg_data, buf, count);
> > + mutex_unlock(&fw_cfg_dev_lock);
> > +}
>
> This locking is not enough I think: ACPI might be
> accessing FW CFG meanwhile, and assuming it's
> the only owner.
>
> I think that on systems that support ACPI,
> it would be better to use an ACPI interface
> to read blobs, instead of poking on _CRS and banging on registers directly.
>
> This will also make it easier to extend the interface,
> and support DMA.
>
>
>
> > +
> > +/* clean up fw_cfg device i/o */
> > +static void fw_cfg_io_cleanup(void)
> > +{
> > + if (fw_cfg_is_mmio) {
> > + iounmap(fw_cfg_dev_base);
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + } else {
> > + ioport_unmap(fw_cfg_dev_base);
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + }
> > +}
> > +
> > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
>
> So for all arches which support ACPI, I think this driver
> should just rely on ACPI.
There was a discussion about that a few versions ago, and IIRC the
conclusion was not to expect the firmware to contend for fw_cfg access
after the guest kernel boots:
https://lkml.org/lkml/2015/10/5/283
(I even had a prototype version doing what you suggested, but per the above
reference decided to drop it -- which IMHO is for the better, since otherwise
I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
see https://lkml.org/lkml/2015/11/4/534 )
>
> > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > +# define FW_CFG_CTRL_OFF 0x08
> > +# define FW_CFG_DATA_OFF 0x00
> > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x02
> > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
> > +# else
> > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > +# define FW_CFG_CTRL_OFF 0x00
> > +# define FW_CFG_DATA_OFF 0x01
>
> Better not try hacks like this, they are hard
> to support down the road. Please only list what is tested and
> actually exposed by QEMU.
I was looking for a standard way to advertise register offsets within
the ioport or mmio region assigned to fw_cfg, but right now the answer
is "there isn't one", and "register offsets are an arch specific
detail". As such, the only reasonable way I saw was to copy the same
values used in the QEMU source.
See also:
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
Thanks much,
--Gabriel
>
> > +# endif
> > +#endif
> > +
> > +/* initialize fw_cfg device i/o from platform data */
> > +static int fw_cfg_do_platform_probe(struct platform_device *pdev)
> > +{
> > + char sig[FW_CFG_SIG_SIZE];
> > + struct resource *range, *ctrl, *data;
> > +
> > + /* acquire i/o range details */
> > + fw_cfg_is_mmio = false;
> > + range = platform_get_resource(pdev, IORESOURCE_IO, 0);
> > + if (!range) {
> > + fw_cfg_is_mmio = true;
> > + range = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (!range)
> > + return -EINVAL;
> > + }
> > + fw_cfg_p_base = range->start;
> > + fw_cfg_p_size = resource_size(range);
> > +
> > + if (fw_cfg_is_mmio) {
> > + if (!request_mem_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_mem"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioremap(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + } else {
> > + if (!request_region(fw_cfg_p_base,
> > + fw_cfg_p_size, "fw_cfg_io"))
> > + return -EBUSY;
> > + fw_cfg_dev_base = ioport_map(fw_cfg_p_base, fw_cfg_p_size);
> > + if (!fw_cfg_dev_base) {
> > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > + return -EFAULT;
> > + }
> > + }
> > +
> > + /* were custom register offsets provided (e.g. on the command line)? */
> > + ctrl = platform_get_resource_byname(pdev, IORESOURCE_REG, "ctrl");
> > + data = platform_get_resource_byname(pdev, IORESOURCE_REG, "data");
> > + if (ctrl && data) {
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + ctrl->start;
> > + fw_cfg_reg_data = fw_cfg_dev_base + data->start;
> > + } else {
> > + /* use architecture-specific offsets */
> > + fw_cfg_reg_ctrl = fw_cfg_dev_base + FW_CFG_CTRL_OFF;
> > + fw_cfg_reg_data = fw_cfg_dev_base + FW_CFG_DATA_OFF;
> > + }
> > +
> > + /* verify fw_cfg device signature */
> > + fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
> > + if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) != 0) {
> > + fw_cfg_io_cleanup();
> > + return -ENODEV;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> > +static u32 fw_cfg_rev;
> > +
> > +static ssize_t fw_cfg_showrev(struct kobject *k, struct attribute *a, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", fw_cfg_rev);
> > +}
> > +
> > +static const struct {
> > + struct attribute attr;
> > + ssize_t (*show)(struct kobject *k, struct attribute *a, char *buf);
> > +} fw_cfg_rev_attr = {
> > + .attr = { .name = "rev", .mode = S_IRUSR },
> > + .show = fw_cfg_showrev,
> > +};
> > +
> > +/* fw_cfg_sysfs_entry type */
> > +struct fw_cfg_sysfs_entry {
> > + struct kobject kobj;
> > + struct fw_cfg_file f;
> > + struct list_head list;
> > +};
> > +
> > +/* get fw_cfg_sysfs_entry from kobject member */
> > +static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
> > +{
> > + return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
> > +}
> > +
> > +/* fw_cfg_sysfs_attribute type */
> > +struct fw_cfg_sysfs_attribute {
> > + struct attribute attr;
> > + ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
> > +};
> > +
> > +/* get fw_cfg_sysfs_attribute from attribute member */
> > +static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
> > +{
> > + return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
> > +}
> > +
> > +/* global cache of fw_cfg_sysfs_entry objects */
> > +static LIST_HEAD(fw_cfg_entry_cache);
> > +
> > +/* kobjects removed lazily by kernel, mutual exclusion needed */
> > +static DEFINE_SPINLOCK(fw_cfg_cache_lock);
> > +
> > +static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_add_tail(&entry->list, &fw_cfg_entry_cache);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
> > +{
> > + spin_lock(&fw_cfg_cache_lock);
> > + list_del(&entry->list);
> > + spin_unlock(&fw_cfg_cache_lock);
> > +}
> > +
> > +static void fw_cfg_sysfs_cache_cleanup(void)
> > +{
> > + struct fw_cfg_sysfs_entry *entry, *next;
> > +
> > + list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
> > + /* will end up invoking fw_cfg_sysfs_cache_delist()
> > + * via each object's release() method (i.e. destructor)
> > + */
> > + kobject_put(&entry->kobj);
> > + }
> > +}
> > +
> > +/* default_attrs: per-entry attributes and show methods */
> > +
> > +#define FW_CFG_SYSFS_ATTR(_attr) \
> > +struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
> > + .attr = { .name = __stringify(_attr), .mode = S_IRUSR }, \
> > + .show = fw_cfg_sysfs_show_##_attr, \
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.size);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_key(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%u\n", e->f.select);
> > +}
> > +
> > +static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
> > +{
> > + return sprintf(buf, "%s\n", e->f.name);
> > +}
> > +
> > +static FW_CFG_SYSFS_ATTR(size);
> > +static FW_CFG_SYSFS_ATTR(key);
> > +static FW_CFG_SYSFS_ATTR(name);
> > +
> > +static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
> > + &fw_cfg_sysfs_attr_size.attr,
> > + &fw_cfg_sysfs_attr_key.attr,
> > + &fw_cfg_sysfs_attr_name.attr,
> > + NULL,
> > +};
> > +
> > +/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
> > +static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
> > + char *buf)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > + struct fw_cfg_sysfs_attribute *attr = to_attr(a);
> > +
> > + return attr->show(entry, buf);
> > +}
> > +
> > +static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
> > + .show = fw_cfg_sysfs_attr_show,
> > +};
> > +
> > +/* release: destructor, to be called via kobject_put() */
> > +static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + fw_cfg_sysfs_cache_delist(entry);
> > + kfree(entry);
> > +}
> > +
> > +/* kobj_type: ties together all properties required to register an entry */
> > +static struct kobj_type fw_cfg_sysfs_entry_ktype = {
> > + .default_attrs = fw_cfg_sysfs_entry_attrs,
> > + .sysfs_ops = &fw_cfg_sysfs_attr_ops,
> > + .release = fw_cfg_sysfs_release_entry,
> > +};
> > +
> > +/* raw-read method and attribute */
> > +static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
> > + struct bin_attribute *bin_attr,
> > + char *buf, loff_t pos, size_t count)
> > +{
> > + struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
> > +
> > + if (pos > entry->f.size)
> > + return -EINVAL;
> > +
> > + if (count > entry->f.size - pos)
> > + count = entry->f.size - pos;
> > +
> > + fw_cfg_read_blob(entry->f.select, buf, pos, count);
> > + return count;
> > +}
> > +
> > +static struct bin_attribute fw_cfg_sysfs_attr_raw = {
> > + .attr = { .name = "raw", .mode = S_IRUSR },
> > + .read = fw_cfg_sysfs_read_raw,
> > +};
> > +
> > +/* kobjects representing top-level and by_key folders */
> > +static struct kobject *fw_cfg_top_ko;
> > +static struct kobject *fw_cfg_sel_ko;
> > +
> > +/* register an individual fw_cfg file */
> > +static int fw_cfg_register_file(const struct fw_cfg_file *f)
> > +{
> > + int err;
> > + struct fw_cfg_sysfs_entry *entry;
> > +
> > + /* allocate new entry */
> > + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> > + if (!entry)
> > + return -ENOMEM;
> > +
> > + /* set file entry information */
> > + memcpy(&entry->f, f, sizeof(struct fw_cfg_file));
> > +
> > + /* register entry under "/sys/firmware/qemu_fw_cfg/by_key/" */
> > + err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
> > + fw_cfg_sel_ko, "%d", entry->f.select);
> > + if (err)
> > + goto err_register;
> > +
> > + /* add raw binary content access */
> > + err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
> > + if (err)
> > + goto err_add_raw;
> > +
> > + /* success, add entry to global cache */
> > + fw_cfg_sysfs_cache_enlist(entry);
> > + return 0;
> > +
> > +err_add_raw:
> > + kobject_del(&entry->kobj);
> > +err_register:
> > + kfree(entry);
> > + return err;
> > +}
> > +
> > +/* iterate over all fw_cfg directory entries, registering each one */
> > +static int fw_cfg_register_dir_entries(void)
> > +{
> > + int ret = 0;
> > + u32 count, i;
> > + struct fw_cfg_file *dir;
> > + size_t dir_size;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, &count, 0, sizeof(count));
> > + count = be32_to_cpu(count);
> > + dir_size = count * sizeof(struct fw_cfg_file);
> > +
> > + dir = kmalloc(dir_size, GFP_KERNEL);
> > + if (!dir)
> > + return -ENOMEM;
> > +
> > + fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(count), dir_size);
> > +
> > + for (i = 0; i < count; i++) {
> > + dir[i].size = be32_to_cpu(dir[i].size);
> > + dir[i].select = be16_to_cpu(dir[i].select);
> > + ret = fw_cfg_register_file(&dir[i]);
> > + if (ret)
> > + break;
> > + }
> > +
> > + kfree(dir);
> > + return ret;
> > +}
> > +
> > +/* unregister top-level or by_key folder */
> > +static inline void fw_cfg_kobj_cleanup(struct kobject *kobj)
> > +{
> > + kobject_del(kobj);
> > + kobject_put(kobj);
> > +}
> > +
> > +static int fw_cfg_sysfs_probe(struct platform_device *pdev)
> > +{
> > + int err;
> > +
> > + /* NOTE: If we supported multiple fw_cfg devices, we'd first create
> > + * a subdirectory named after e.g. pdev->id, then hang per-device
> > + * by_key subdirectories underneath it. However, only
> > + * one fw_cfg device exist system-wide, so if one was already found
> > + * earlier, we might as well stop here.
> > + */
> > + if (fw_cfg_sel_ko)
> > + return -EBUSY;
> > +
> > + /* create by_key subdirectory of /sys/firmware/qemu_fw_cfg/ */
> > + err = -ENOMEM;
> > + fw_cfg_sel_ko = kobject_create_and_add("by_key", fw_cfg_top_ko);
> > + if (!fw_cfg_sel_ko)
> > + goto err_sel;
> > +
> > + /* initialize fw_cfg device i/o from platform data */
> > + err = fw_cfg_do_platform_probe(pdev);
> > + if (err)
> > + goto err_probe;
> > +
> > + /* get revision number, add matching top-level attribute */
> > + fw_cfg_read_blob(FW_CFG_ID, &fw_cfg_rev, 0, sizeof(fw_cfg_rev));
> > + fw_cfg_rev = le32_to_cpu(fw_cfg_rev);
> > + err = sysfs_create_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > + if (err)
> > + goto err_rev;
> > +
> > + /* process fw_cfg file directory entry, registering each file */
> > + err = fw_cfg_register_dir_entries();
> > + if (err)
> > + goto err_dir;
> > +
> > + /* success */
> > + pr_debug("fw_cfg: loaded.\n");
> > + return 0;
> > +
> > +err_dir:
> > + fw_cfg_sysfs_cache_cleanup();
> > + sysfs_remove_file(fw_cfg_top_ko, &fw_cfg_rev_attr.attr);
> > +err_rev:
> > + fw_cfg_io_cleanup();
> > +err_probe:
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > +err_sel:
> > + return err;
> > +}
> > +
> > +static int fw_cfg_sysfs_remove(struct platform_device *pdev)
> > +{
> > + pr_debug("fw_cfg: unloading.\n");
> > + fw_cfg_sysfs_cache_cleanup();
> > + fw_cfg_kobj_cleanup(fw_cfg_sel_ko);
> > + fw_cfg_io_cleanup();
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id fw_cfg_sysfs_mmio_match[] = {
> > + { .compatible = "qemu,fw-cfg-mmio", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(of, fw_cfg_sysfs_mmio_match);
> > +
> > +#ifdef CONFIG_ACPI
> > +static const struct acpi_device_id fw_cfg_sysfs_acpi_match[] = {
> > + { "QEMU0002", },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(acpi, fw_cfg_sysfs_acpi_match);
> > +#endif
> > +
> > +static struct platform_driver fw_cfg_sysfs_driver = {
> > + .probe = fw_cfg_sysfs_probe,
> > + .remove = fw_cfg_sysfs_remove,
> > + .driver = {
> > + .name = "fw_cfg",
> > + .of_match_table = fw_cfg_sysfs_mmio_match,
> > + .acpi_match_table = ACPI_PTR(fw_cfg_sysfs_acpi_match),
> > + },
> > +};
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > +
> > +static struct platform_device *fw_cfg_cmdline_dev;
> > +
> > +/* this probably belongs in e.g. include/linux/types.h,
> > + * but right now we are the only ones doing it...
> > + */
> > +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> > +#define __PHYS_ADDR_PREFIX "ll"
> > +#else
> > +#define __PHYS_ADDR_PREFIX ""
> > +#endif
> > +
> > +/* use special scanf/printf modifier for phys_addr_t, resource_size_t */
> > +#define PH_ADDR_SCAN_FMT "@%" __PHYS_ADDR_PREFIX "i%n" \
> > + ":%" __PHYS_ADDR_PREFIX "i" \
> > + ":%" __PHYS_ADDR_PREFIX "i%n"
> > +
> > +#define PH_ADDR_PR_1_FMT "0x%" __PHYS_ADDR_PREFIX "x@" \
> > + "0x%" __PHYS_ADDR_PREFIX "x"
> > +
> > +#define PH_ADDR_PR_3_FMT PH_ADDR_PR_1_FMT \
> > + ":%" __PHYS_ADDR_PREFIX "u" \
> > + ":%" __PHYS_ADDR_PREFIX "u"
> > +
> > +static int fw_cfg_cmdline_set(const char *arg, const struct kernel_param *kp)
> > +{
> > + struct resource res[3] = {};
> > + char *str;
> > + phys_addr_t base;
> > + resource_size_t size, ctrl_off, data_off;
> > + int processed, consumed = 0;
> > +
> > + /* only one fw_cfg device can exist system-wide, so if one
> > + * was processed on the command line already, we might as
> > + * well stop here.
> > + */
> > + if (fw_cfg_cmdline_dev) {
> > + /* avoid leaking previously registered device */
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > + return -EINVAL;
> > + }
> > +
> > + /* consume "<size>" portion of command line argument */
> > + size = memparse(arg, &str);
> > +
> > + /* get "@<base>[:<ctrl_off>:<data_off>]" chunks */
> > + processed = sscanf(str, PH_ADDR_SCAN_FMT,
> > + &base, &consumed,
> > + &ctrl_off, &data_off, &consumed);
> > +
> > + /* sscanf() must process precisely 1 or 3 chunks:
> > + * <base> is mandatory, optionally followed by <ctrl_off>
> > + * and <data_off>;
> > + * there must be no extra characters after the last chunk,
> > + * so str[consumed] must be '\0'.
> > + */
> > + if (str[consumed] ||
> > + (processed != 1 && processed != 3))
> > + return -EINVAL;
> > +
> > + res[0].start = base;
> > + res[0].end = base + size - 1;
> > + res[0].flags = !strcmp(kp->name, "mmio") ? IORESOURCE_MEM :
> > + IORESOURCE_IO;
> > +
> > + /* insert register offsets, if provided */
> > + if (processed > 1) {
> > + res[1].name = "ctrl";
> > + res[1].start = ctrl_off;
> > + res[1].flags = IORESOURCE_REG;
> > + res[2].name = "data";
> > + res[2].start = data_off;
> > + res[2].flags = IORESOURCE_REG;
> > + }
> > +
> > + /* "processed" happens to nicely match the number of resources
> > + * we need to pass in to this platform device.
> > + */
> > + fw_cfg_cmdline_dev = platform_device_register_simple("fw_cfg",
> > + PLATFORM_DEVID_NONE, res, processed);
> > + if (IS_ERR(fw_cfg_cmdline_dev))
> > + return PTR_ERR(fw_cfg_cmdline_dev);
> > +
> > + return 0;
> > +}
> > +
> > +static int fw_cfg_cmdline_get(char *buf, const struct kernel_param *kp)
> > +{
> > + /* stay silent if device was not configured via the command
> > + * line, or if the parameter name (ioport/mmio) doesn't match
> > + * the device setting
> > + */
> > + if (!fw_cfg_cmdline_dev ||
> > + (!strcmp(kp->name, "mmio") ^
> > + (fw_cfg_cmdline_dev->resource[0].flags == IORESOURCE_MEM)))
> > + return 0;
> > +
> > + switch (fw_cfg_cmdline_dev->num_resources) {
> > + case 1:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_1_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start);
> > + case 3:
> > + return snprintf(buf, PAGE_SIZE, PH_ADDR_PR_3_FMT,
> > + resource_size(&fw_cfg_cmdline_dev->resource[0]),
> > + fw_cfg_cmdline_dev->resource[0].start,
> > + fw_cfg_cmdline_dev->resource[1].start,
> > + fw_cfg_cmdline_dev->resource[2].start);
> > + }
> > +
> > + /* Should never get here */
> > + WARN(1, "Unexpected number of resources: %d\n",
> > + fw_cfg_cmdline_dev->num_resources);
> > + return 0;
> > +}
> > +
> > +static const struct kernel_param_ops fw_cfg_cmdline_param_ops = {
> > + .set = fw_cfg_cmdline_set,
> > + .get = fw_cfg_cmdline_get,
> > +};
> > +
> > +device_param_cb(ioport, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +device_param_cb(mmio, &fw_cfg_cmdline_param_ops, NULL, S_IRUSR);
> > +
> > +#endif /* CONFIG_FW_CFG_SYSFS_CMDLINE */
> > +
> > +static int __init fw_cfg_sysfs_init(void)
> > +{
> > + /* create /sys/firmware/qemu_fw_cfg/ top level directory */
> > + fw_cfg_top_ko = kobject_create_and_add("qemu_fw_cfg", firmware_kobj);
> > + if (!fw_cfg_top_ko)
> > + return -ENOMEM;
> > +
> > + return platform_driver_register(&fw_cfg_sysfs_driver);
> > +}
> > +
> > +static void __exit fw_cfg_sysfs_exit(void)
> > +{
> > + platform_driver_unregister(&fw_cfg_sysfs_driver);
> > +
> > +#ifdef CONFIG_FW_CFG_SYSFS_CMDLINE
> > + platform_device_unregister(fw_cfg_cmdline_dev);
> > +#endif
> > +
> > + /* clean up /sys/firmware/qemu_fw_cfg/ */
> > + fw_cfg_kobj_cleanup(fw_cfg_top_ko);
> > +}
> > +
> > +module_init(fw_cfg_sysfs_init);
> > +module_exit(fw_cfg_sysfs_exit);
> > --
> > 2.4.3
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
2016-02-21 13:06 ` Gabriel L. Somlo
(?)
@ 2016-02-21 13:10 ` Michael S. Tsirkin
-1 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:10 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
>
> >
> > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > +# define FW_CFG_CTRL_OFF 0x08
> > > +# define FW_CFG_DATA_OFF 0x00
> > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x02
> > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> > > +# else
> > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> >
> > Better not try hacks like this, they are hard
> > to support down the road. Please only list what is tested and
> > actually exposed by QEMU.
>
> I was looking for a standard way to advertise register offsets within
> the ioport or mmio region assigned to fw_cfg, but right now the answer
> is "there isn't one", and "register offsets are an arch specific
> detail". As such, the only reasonable way I saw was to copy the same
> values used in the QEMU source.
>
> See also:
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
>
> Thanks much,
> --Gabriel
My point is you don't know what will qemu do on these
other arches which do not at the moment have fw cfg.
So don't try to guess!
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:10 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:10 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
>
> >
> > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > +# define FW_CFG_CTRL_OFF 0x08
> > > +# define FW_CFG_DATA_OFF 0x00
> > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x02
> > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> > > +# else
> > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> >
> > Better not try hacks like this, they are hard
> > to support down the road. Please only list what is tested and
> > actually exposed by QEMU.
>
> I was looking for a standard way to advertise register offsets within
> the ioport or mmio region assigned to fw_cfg, but right now the answer
> is "there isn't one", and "register offsets are an arch specific
> detail". As such, the only reasonable way I saw was to copy the same
> values used in the QEMU source.
>
> See also:
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
>
> Thanks much,
> --Gabriel
My point is you don't know what will qemu do on these
other arches which do not at the moment have fw cfg.
So don't try to guess!
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:10 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:10 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
>
> >
> > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > +# define FW_CFG_CTRL_OFF 0x08
> > > +# define FW_CFG_DATA_OFF 0x00
> > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x02
> > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> > > +# else
> > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > +# define FW_CFG_CTRL_OFF 0x00
> > > +# define FW_CFG_DATA_OFF 0x01
> >
> > Better not try hacks like this, they are hard
> > to support down the road. Please only list what is tested and
> > actually exposed by QEMU.
>
> I was looking for a standard way to advertise register offsets within
> the ioport or mmio region assigned to fw_cfg, but right now the answer
> is "there isn't one", and "register offsets are an arch specific
> detail". As such, the only reasonable way I saw was to copy the same
> values used in the QEMU source.
>
> See also:
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
>
> Thanks much,
> --Gabriel
My point is you don't know what will qemu do on these
other arches which do not at the moment have fw cfg.
So don't try to guess!
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 17:20 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 17:20 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Sun, Feb 21, 2016 at 03:10:30PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> >
> > >
> > > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > > +# define FW_CFG_CTRL_OFF 0x08
> > > > +# define FW_CFG_DATA_OFF 0x00
> > > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x02
> > > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > > > +# else
> > > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > >
> > > Better not try hacks like this, they are hard
> > > to support down the road. Please only list what is tested and
> > > actually exposed by QEMU.
> >
> > I was looking for a standard way to advertise register offsets within
> > the ioport or mmio region assigned to fw_cfg, but right now the answer
> > is "there isn't one", and "register offsets are an arch specific
> > detail". As such, the only reasonable way I saw was to copy the same
> > values used in the QEMU source.
> >
> > See also:
> > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
> >
> > Thanks much,
> > --Gabriel
>
> My point is you don't know what will qemu do on these
> other arches which do not at the moment have fw cfg.
> So don't try to guess!
Oh, you mean for the "else". I originally wanted to be able to compile
this on any architecture and wanted some dummy defaults I could
override on the command line. But now we're already restricting this
to known architectures only, so I'll send a patch turning the warning
into an error, and removing the #defines for the "else" branch above.
Sorry I misunderstood you the first time around :)
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 17:20 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 17:20 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 03:10:30PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> >
> > >
> > > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > > +# define FW_CFG_CTRL_OFF 0x08
> > > > +# define FW_CFG_DATA_OFF 0x00
> > > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x02
> > > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > > > +# else
> > > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > >
> > > Better not try hacks like this, they are hard
> > > to support down the road. Please only list what is tested and
> > > actually exposed by QEMU.
> >
> > I was looking for a standard way to advertise register offsets within
> > the ioport or mmio region assigned to fw_cfg, but right now the answer
> > is "there isn't one", and "register offsets are an arch specific
> > detail". As such, the only reasonable way I saw was to copy the same
> > values used in the QEMU source.
> >
> > See also:
> > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
> >
> > Thanks much,
> > --Gabriel
>
> My point is you don't know what will qemu do on these
> other arches which do not at the moment have fw cfg.
> So don't try to guess!
Oh, you mean for the "else". I originally wanted to be able to compile
this on any architecture and wanted some dummy defaults I could
override on the command line. But now we're already restricting this
to known architectures only, so I'll send a patch turning the warning
into an error, and removing the #defines for the "else" branch above.
Sorry I misunderstood you the first time around :)
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 17:20 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-21 17:20 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A, pawel.moll-5wv7dgnIgG8,
mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
galak-sgV2jX0FEOL9JmXXK+q4OQ, arnd-r2nGTMty4D4,
lersek-H+wXaHxf7aLQT0dZR+AlfA, ralf-6z/3iImG2C8G8FEW9MqTrA,
rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, eric-WhKQ6XTQaPysTnJN9+BGXg,
hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, zajec5-Re5JQEeQqe8AvxtiuMwx3w,
sudeep.holla-5wv7dgnIgG8, agross-sgV2jX0FEOL9JmXXK+q4OQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
qemu-devel-qX2TKyscuCcdnm+yROfE0A,
imammedo-H+wXaHxf7aLQT0dZR+AlfA,
peter.maydell-QSEj5FYQhm4dnm+yROfE0A,
leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
pbonzini-H+wXaHxf7aLQT0dZR+AlfA, kraxel-H+wXaHxf7aLQT0dZR+AlfA,
ehabkost-H+wXaHxf7aLQT0dZR+AlfA, luto-kltTT9wpgjJwATOyAt5JVQ,
stefanha-Re5JQEeQqe8AvxtiuMwx3w, revol-GANU6spQydw,
matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
rth-hL46jP5Bxq7R7s880joybQ
On Sun, Feb 21, 2016 at 03:10:30PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> >
> > >
> > > > +#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CTRL_DATA_OFF))
> > > > +# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
> > > > +# define FW_CFG_CTRL_OFF 0x08
> > > > +# define FW_CFG_DATA_OFF 0x00
> > > > +# elif (defined(CONFIG_PPC_PMAC) || defined(CONFIG_SPARC32)) /* ppc/mac,sun4m */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x02
> > > > +# elif (defined(CONFIG_X86) || defined(CONFIG_SPARC64)) /* x86, sun4u */
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > > > +# else
> > > > +# warning "QEMU FW_CFG may not be available on this architecture!"
> > > > +# define FW_CFG_CTRL_OFF 0x00
> > > > +# define FW_CFG_DATA_OFF 0x01
> > >
> > > Better not try hacks like this, they are hard
> > > to support down the road. Please only list what is tested and
> > > actually exposed by QEMU.
> >
> > I was looking for a standard way to advertise register offsets within
> > the ioport or mmio region assigned to fw_cfg, but right now the answer
> > is "there isn't one", and "register offsets are an arch specific
> > detail". As such, the only reasonable way I saw was to copy the same
> > values used in the QEMU source.
> >
> > See also:
> > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05037.html
> >
> > Thanks much,
> > --Gabriel
>
> My point is you don't know what will qemu do on these
> other arches which do not at the moment have fw cfg.
> So don't try to guess!
Oh, you mean for the "else". I originally wanted to be able to compile
this on any architecture and wanted some dummy defaults I could
override on the command line. But now we're already restricting this
to known architectures only, so I'll send a patch turning the warning
into an error, and removing the #defines for the "else" branch above.
Sorry I misunderstood you the first time around :)
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
2016-02-21 13:06 ` Gabriel L. Somlo
(?)
@ 2016-02-21 13:14 ` Michael S. Tsirkin
-1 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
Interesting. Igor wanted to do this again recently ...
I'll think it over and get back to you/list on this.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure I get it - won't you only ifdef the accessor function?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
Interesting. Igor wanted to do this again recently ...
I'll think it over and get back to you/list on this.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure I get it - won't you only ifdef the accessor function?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-21 13:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-21 13:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
Interesting. Igor wanted to do this again recently ...
I'll think it over and get back to you/list on this.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure I get it - won't you only ifdef the accessor function?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-22 20:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > +static void fw_cfg_io_cleanup(void)
> > > +{
> > > + if (fw_cfg_is_mmio) {
> > > + iounmap(fw_cfg_dev_base);
> > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + } else {
> > > + ioport_unmap(fw_cfg_dev_base);
> > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + }
> > > +}
> > > +
> > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> >
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
>
So it looks like NVDIMM at least wants to pass label data to guest -
for which fw cfg might be a reasonable choice.
I suspect things changed - fw cfg used to be very slow but we now have
DMA interface which makes it useful for a range of applications.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure why you have these ifdefs - they are on the host, are they
not?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-22 20:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > +static void fw_cfg_io_cleanup(void)
> > > +{
> > > + if (fw_cfg_is_mmio) {
> > > + iounmap(fw_cfg_dev_base);
> > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + } else {
> > > + ioport_unmap(fw_cfg_dev_base);
> > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + }
> > > +}
> > > +
> > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> >
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
>
So it looks like NVDIMM at least wants to pass label data to guest -
for which fw cfg might be a reasonable choice.
I suspect things changed - fw cfg used to be very slow but we now have
DMA interface which makes it useful for a range of applications.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure why you have these ifdefs - they are on the host, are they
not?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-22 20:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A, pawel.moll-5wv7dgnIgG8,
mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
galak-sgV2jX0FEOL9JmXXK+q4OQ, arnd-r2nGTMty4D4,
lersek-H+wXaHxf7aLQT0dZR+AlfA, ralf-6z/3iImG2C8G8FEW9MqTrA,
rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, eric-WhKQ6XTQaPysTnJN9+BGXg,
hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, zajec5-Re5JQEeQqe8AvxtiuMwx3w,
sudeep.holla-5wv7dgnIgG8, agross-sgV2jX0FEOL9JmXXK+q4OQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
qemu-devel-qX2TKyscuCcdnm+yROfE0A,
imammedo-H+wXaHxf7aLQT0dZR+AlfA,
peter.maydell-QSEj5FYQhm4dnm+yROfE0A,
leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
pbonzini-H+wXaHxf7aLQT0dZR+AlfA, kraxel-H+wXaHxf7aLQT0dZR+AlfA,
ehabkost-H+wXaHxf7aLQT0dZR+AlfA, luto-kltTT9wpgjJwATOyAt5JVQ,
stefanha-Re5JQEeQqe8AvxtiuMwx3w, revol-GANU6spQydw,
matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
rth-hL46jP5Bxq7R7s880joybQ
On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > +static void fw_cfg_io_cleanup(void)
> > > +{
> > > + if (fw_cfg_is_mmio) {
> > > + iounmap(fw_cfg_dev_base);
> > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + } else {
> > > + ioport_unmap(fw_cfg_dev_base);
> > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > + }
> > > +}
> > > +
> > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> >
> > So for all arches which support ACPI, I think this driver
> > should just rely on ACPI.
>
> There was a discussion about that a few versions ago, and IIRC the
> conclusion was not to expect the firmware to contend for fw_cfg access
> after the guest kernel boots:
>
> https://lkml.org/lkml/2015/10/5/283
>
So it looks like NVDIMM at least wants to pass label data to guest -
for which fw cfg might be a reasonable choice.
I suspect things changed - fw cfg used to be very slow but we now have
DMA interface which makes it useful for a range of applications.
> (I even had a prototype version doing what you suggested, but per the above
> reference decided to drop it -- which IMHO is for the better, since otherwise
> I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> see https://lkml.org/lkml/2015/11/4/534 )
I'm not sure why you have these ifdefs - they are on the host, are they
not?
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:26 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-22 20:26 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > +static void fw_cfg_io_cleanup(void)
> > > > +{
> > > > + if (fw_cfg_is_mmio) {
> > > > + iounmap(fw_cfg_dev_base);
> > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + } else {
> > > > + ioport_unmap(fw_cfg_dev_base);
> > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + }
> > > > +}
> > > > +
> > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > >
> > > So for all arches which support ACPI, I think this driver
> > > should just rely on ACPI.
> >
> > There was a discussion about that a few versions ago, and IIRC the
> > conclusion was not to expect the firmware to contend for fw_cfg access
> > after the guest kernel boots:
> >
> > https://lkml.org/lkml/2015/10/5/283
> >
>
> So it looks like NVDIMM at least wants to pass label data to guest -
> for which fw cfg might be a reasonable choice.
>
> I suspect things changed - fw cfg used to be very slow but we now have
> DMA interface which makes it useful for a range of applications.
>
> > (I even had a prototype version doing what you suggested, but per the above
> > reference decided to drop it -- which IMHO is for the better, since otherwise
> > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > see https://lkml.org/lkml/2015/11/4/534 )
>
> I'm not sure why you have these ifdefs - they are on the host, are they
> not?
Think of those as "pseudocode" ifdefs, they're there to distinguish
between AML that would be generated on MMIO vs. IOPORT systems
(specifically, arm vs. x86, respectively)
Some of the AML is the same, but obviously the _CRS, and
OperationRegion + Field are different, and I wanted to point that out
somehow :)
Cheers,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:26 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-22 20:26 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > +static void fw_cfg_io_cleanup(void)
> > > > +{
> > > > + if (fw_cfg_is_mmio) {
> > > > + iounmap(fw_cfg_dev_base);
> > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + } else {
> > > > + ioport_unmap(fw_cfg_dev_base);
> > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + }
> > > > +}
> > > > +
> > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > >
> > > So for all arches which support ACPI, I think this driver
> > > should just rely on ACPI.
> >
> > There was a discussion about that a few versions ago, and IIRC the
> > conclusion was not to expect the firmware to contend for fw_cfg access
> > after the guest kernel boots:
> >
> > https://lkml.org/lkml/2015/10/5/283
> >
>
> So it looks like NVDIMM at least wants to pass label data to guest -
> for which fw cfg might be a reasonable choice.
>
> I suspect things changed - fw cfg used to be very slow but we now have
> DMA interface which makes it useful for a range of applications.
>
> > (I even had a prototype version doing what you suggested, but per the above
> > reference decided to drop it -- which IMHO is for the better, since otherwise
> > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > see https://lkml.org/lkml/2015/11/4/534 )
>
> I'm not sure why you have these ifdefs - they are on the host, are they
> not?
Think of those as "pseudocode" ifdefs, they're there to distinguish
between AML that would be generated on MMIO vs. IOPORT systems
(specifically, arm vs. x86, respectively)
Some of the AML is the same, but obviously the _CRS, and
OperationRegion + Field are different, and I wanted to point that out
somehow :)
Cheers,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-22 20:26 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-22 20:26 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A, pawel.moll-5wv7dgnIgG8,
mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
galak-sgV2jX0FEOL9JmXXK+q4OQ, arnd-r2nGTMty4D4,
lersek-H+wXaHxf7aLQT0dZR+AlfA, ralf-6z/3iImG2C8G8FEW9MqTrA,
rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, eric-WhKQ6XTQaPysTnJN9+BGXg,
hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, zajec5-Re5JQEeQqe8AvxtiuMwx3w,
sudeep.holla-5wv7dgnIgG8, agross-sgV2jX0FEOL9JmXXK+q4OQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
qemu-devel-qX2TKyscuCcdnm+yROfE0A,
imammedo-H+wXaHxf7aLQT0dZR+AlfA,
peter.maydell-QSEj5FYQhm4dnm+yROfE0A,
leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
pbonzini-H+wXaHxf7aLQT0dZR+AlfA, kraxel-H+wXaHxf7aLQT0dZR+AlfA,
ehabkost-H+wXaHxf7aLQT0dZR+AlfA, luto-kltTT9wpgjJwATOyAt5JVQ,
stefanha-Re5JQEeQqe8AvxtiuMwx3w, revol-GANU6spQydw,
matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
rth-hL46jP5Bxq7R7s880joybQ
On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > +static void fw_cfg_io_cleanup(void)
> > > > +{
> > > > + if (fw_cfg_is_mmio) {
> > > > + iounmap(fw_cfg_dev_base);
> > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + } else {
> > > > + ioport_unmap(fw_cfg_dev_base);
> > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > + }
> > > > +}
> > > > +
> > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > >
> > > So for all arches which support ACPI, I think this driver
> > > should just rely on ACPI.
> >
> > There was a discussion about that a few versions ago, and IIRC the
> > conclusion was not to expect the firmware to contend for fw_cfg access
> > after the guest kernel boots:
> >
> > https://lkml.org/lkml/2015/10/5/283
> >
>
> So it looks like NVDIMM at least wants to pass label data to guest -
> for which fw cfg might be a reasonable choice.
>
> I suspect things changed - fw cfg used to be very slow but we now have
> DMA interface which makes it useful for a range of applications.
>
> > (I even had a prototype version doing what you suggested, but per the above
> > reference decided to drop it -- which IMHO is for the better, since otherwise
> > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > see https://lkml.org/lkml/2015/11/4/534 )
>
> I'm not sure why you have these ifdefs - they are on the host, are they
> not?
Think of those as "pseudocode" ifdefs, they're there to distinguish
between AML that would be generated on MMIO vs. IOPORT systems
(specifically, arm vs. x86, respectively)
Some of the AML is the same, but obviously the _CRS, and
OperationRegion + Field are different, and I wanted to point that out
somehow :)
Cheers,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 5:07 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 5:07 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > +static void fw_cfg_io_cleanup(void)
> > > > > +{
> > > > > + if (fw_cfg_is_mmio) {
> > > > > + iounmap(fw_cfg_dev_base);
> > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + } else {
> > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + }
> > > > > +}
> > > > > +
> > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > >
> > > > So for all arches which support ACPI, I think this driver
> > > > should just rely on ACPI.
> > >
> > > There was a discussion about that a few versions ago, and IIRC the
> > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > after the guest kernel boots:
> > >
> > > https://lkml.org/lkml/2015/10/5/283
> > >
> >
> > So it looks like NVDIMM at least wants to pass label data to guest -
> > for which fw cfg might be a reasonable choice.
> >
> > I suspect things changed - fw cfg used to be very slow but we now have
> > DMA interface which makes it useful for a range of applications.
Comment on this? I'm really worried we'll release linux
without a way to access fw cfg from aml.
How about taking acpi lock around all accesses?
> > > (I even had a prototype version doing what you suggested, but per the above
> > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > see https://lkml.org/lkml/2015/11/4/534 )
> >
> > I'm not sure why you have these ifdefs - they are on the host, are they
> > not?
>
> Think of those as "pseudocode" ifdefs, they're there to distinguish
> between AML that would be generated on MMIO vs. IOPORT systems
> (specifically, arm vs. x86, respectively)
>
> Some of the AML is the same, but obviously the _CRS, and
> OperationRegion + Field are different, and I wanted to point that out
> somehow :)
>
> Cheers,
> --Gabriel
You can do ifs as well.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 5:07 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 5:07 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > +static void fw_cfg_io_cleanup(void)
> > > > > +{
> > > > > + if (fw_cfg_is_mmio) {
> > > > > + iounmap(fw_cfg_dev_base);
> > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + } else {
> > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + }
> > > > > +}
> > > > > +
> > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > >
> > > > So for all arches which support ACPI, I think this driver
> > > > should just rely on ACPI.
> > >
> > > There was a discussion about that a few versions ago, and IIRC the
> > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > after the guest kernel boots:
> > >
> > > https://lkml.org/lkml/2015/10/5/283
> > >
> >
> > So it looks like NVDIMM at least wants to pass label data to guest -
> > for which fw cfg might be a reasonable choice.
> >
> > I suspect things changed - fw cfg used to be very slow but we now have
> > DMA interface which makes it useful for a range of applications.
Comment on this? I'm really worried we'll release linux
without a way to access fw cfg from aml.
How about taking acpi lock around all accesses?
> > > (I even had a prototype version doing what you suggested, but per the above
> > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > see https://lkml.org/lkml/2015/11/4/534 )
> >
> > I'm not sure why you have these ifdefs - they are on the host, are they
> > not?
>
> Think of those as "pseudocode" ifdefs, they're there to distinguish
> between AML that would be generated on MMIO vs. IOPORT systems
> (specifically, arm vs. x86, respectively)
>
> Some of the AML is the same, but obviously the _CRS, and
> OperationRegion + Field are different, and I wanted to point that out
> somehow :)
>
> Cheers,
> --Gabriel
You can do ifs as well.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 5:07 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 5:07 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A, pawel.moll-5wv7dgnIgG8,
mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
galak-sgV2jX0FEOL9JmXXK+q4OQ, arnd-r2nGTMty4D4,
lersek-H+wXaHxf7aLQT0dZR+AlfA, ralf-6z/3iImG2C8G8FEW9MqTrA,
rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, eric-WhKQ6XTQaPysTnJN9+BGXg,
hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, zajec5-Re5JQEeQqe8AvxtiuMwx3w,
sudeep.holla-5wv7dgnIgG8, agross-sgV2jX0FEOL9JmXXK+q4OQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
qemu-devel-qX2TKyscuCcdnm+yROfE0A,
imammedo-H+wXaHxf7aLQT0dZR+AlfA,
peter.maydell-QSEj5FYQhm4dnm+yROfE0A,
leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
pbonzini-H+wXaHxf7aLQT0dZR+AlfA, kraxel-H+wXaHxf7aLQT0dZR+AlfA,
ehabkost-H+wXaHxf7aLQT0dZR+AlfA, luto-kltTT9wpgjJwATOyAt5JVQ,
stefanha-Re5JQEeQqe8AvxtiuMwx3w, revol-GANU6spQydw,
matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
rth-hL46jP5Bxq7R7s880joybQ
On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > +static void fw_cfg_io_cleanup(void)
> > > > > +{
> > > > > + if (fw_cfg_is_mmio) {
> > > > > + iounmap(fw_cfg_dev_base);
> > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + } else {
> > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > + }
> > > > > +}
> > > > > +
> > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > >
> > > > So for all arches which support ACPI, I think this driver
> > > > should just rely on ACPI.
> > >
> > > There was a discussion about that a few versions ago, and IIRC the
> > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > after the guest kernel boots:
> > >
> > > https://lkml.org/lkml/2015/10/5/283
> > >
> >
> > So it looks like NVDIMM at least wants to pass label data to guest -
> > for which fw cfg might be a reasonable choice.
> >
> > I suspect things changed - fw cfg used to be very slow but we now have
> > DMA interface which makes it useful for a range of applications.
Comment on this? I'm really worried we'll release linux
without a way to access fw cfg from aml.
How about taking acpi lock around all accesses?
> > > (I even had a prototype version doing what you suggested, but per the above
> > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > see https://lkml.org/lkml/2015/11/4/534 )
> >
> > I'm not sure why you have these ifdefs - they are on the host, are they
> > not?
>
> Think of those as "pseudocode" ifdefs, they're there to distinguish
> between AML that would be generated on MMIO vs. IOPORT systems
> (specifically, arm vs. x86, respectively)
>
> Some of the AML is the same, but obviously the _CRS, and
> OperationRegion + Field are different, and I wanted to point that out
> somehow :)
>
> Cheers,
> --Gabriel
You can do ifs as well.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
2016-02-23 5:07 ` Michael S. Tsirkin
(?)
@ 2016-02-23 13:47 ` Gabriel L. Somlo
-1 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-23 13:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > +{
> > > > > > + if (fw_cfg_is_mmio) {
> > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + } else {
> > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + }
> > > > > > +}
> > > > > > +
> > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > >
> > > > > So for all arches which support ACPI, I think this driver
> > > > > should just rely on ACPI.
> > > >
> > > > There was a discussion about that a few versions ago, and IIRC the
> > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > after the guest kernel boots:
> > > >
> > > > https://lkml.org/lkml/2015/10/5/283
> > > >
> > >
> > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > for which fw cfg might be a reasonable choice.
> > >
> > > I suspect things changed - fw cfg used to be very slow but we now have
> > > DMA interface which makes it useful for a range of applications.
>
> Comment on this? I'm really worried we'll release linux
> without a way to access fw cfg from aml.
> How about taking acpi lock around all accesses?
You mean something like this (haven't tried compiling it yet, so it
might be a bit more complicated, but just for the purpose of this
conversation):
diff --git a/drivers/firmware/qemu_fw_cfg.c
b/drivers/firmware/qemu_fw_cfg.c
index fedbff5..3462a2c 100644
--- a/drivers/firmware/qemu_fw_cfg.c
+++ b/drivers/firmware/qemu_fw_cfg.c
@@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
static inline void fw_cfg_read_blob(u16 key,
void *buf, loff_t pos, size_t
count)
{
+#ifdef CONFIG_ACPI
+ acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
+#endif
mutex_lock(&fw_cfg_dev_lock);
iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
while (pos-- > 0)
ioread8(fw_cfg_reg_data);
ioread8_rep(fw_cfg_reg_data, buf, count);
mutex_unlock(&fw_cfg_dev_lock);
+#ifdef CONFIG_ACPI
+ acpi_os_release_mutex(acpi_gbl_osi_mutex);
+#endif
}
/* clean up fw_cfg device i/o */
I wouldn't particularly *mind* doing that, but I'd still like to hear
from other QEMU devs on whether it's really necessary.
> > > > (I even had a prototype version doing what you suggested, but per the above
> > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > see https://lkml.org/lkml/2015/11/4/534 )
> > >
> > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > not?
> >
> > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > between AML that would be generated on MMIO vs. IOPORT systems
> > (specifically, arm vs. x86, respectively)
> >
> > Some of the AML is the same, but obviously the _CRS, and
> > OperationRegion + Field are different, and I wanted to point that out
> > somehow :)
> >
> > Cheers,
> > --Gabriel
>
> You can do ifs as well.
Yeah, but the AML is generated from arch-specific locations in QEMU,
so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
generic AML blob with 'if' statements and insert it the same way on
all architectures, or would I ? Not sure what the best practice would
be for that :)
Speaking of AML, if we were to implement a "RDBL" (read-blob) method
for fw_cfg in AML, and call it from the guest-side kernel module,
we'll never be able to make it use DMA on ACPI systems. The way
fw_cfg_read_blob is written now, we could patch that in at some later
point. So that's an argument in favor of *at most* wrapping
acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
than including an acpi-specific version implemented on top of an
AML call.
Thanks,
--Gabriel
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 13:47 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-23 13:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > +{
> > > > > > + if (fw_cfg_is_mmio) {
> > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + } else {
> > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + }
> > > > > > +}
> > > > > > +
> > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > >
> > > > > So for all arches which support ACPI, I think this driver
> > > > > should just rely on ACPI.
> > > >
> > > > There was a discussion about that a few versions ago, and IIRC the
> > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > after the guest kernel boots:
> > > >
> > > > https://lkml.org/lkml/2015/10/5/283
> > > >
> > >
> > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > for which fw cfg might be a reasonable choice.
> > >
> > > I suspect things changed - fw cfg used to be very slow but we now have
> > > DMA interface which makes it useful for a range of applications.
>
> Comment on this? I'm really worried we'll release linux
> without a way to access fw cfg from aml.
> How about taking acpi lock around all accesses?
You mean something like this (haven't tried compiling it yet, so it
might be a bit more complicated, but just for the purpose of this
conversation):
diff --git a/drivers/firmware/qemu_fw_cfg.c
b/drivers/firmware/qemu_fw_cfg.c
index fedbff5..3462a2c 100644
--- a/drivers/firmware/qemu_fw_cfg.c
+++ b/drivers/firmware/qemu_fw_cfg.c
@@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
static inline void fw_cfg_read_blob(u16 key,
void *buf, loff_t pos, size_t
count)
{
+#ifdef CONFIG_ACPI
+ acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
+#endif
mutex_lock(&fw_cfg_dev_lock);
iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
while (pos-- > 0)
ioread8(fw_cfg_reg_data);
ioread8_rep(fw_cfg_reg_data, buf, count);
mutex_unlock(&fw_cfg_dev_lock);
+#ifdef CONFIG_ACPI
+ acpi_os_release_mutex(acpi_gbl_osi_mutex);
+#endif
}
/* clean up fw_cfg device i/o */
I wouldn't particularly *mind* doing that, but I'd still like to hear
from other QEMU devs on whether it's really necessary.
> > > > (I even had a prototype version doing what you suggested, but per the above
> > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > see https://lkml.org/lkml/2015/11/4/534 )
> > >
> > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > not?
> >
> > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > between AML that would be generated on MMIO vs. IOPORT systems
> > (specifically, arm vs. x86, respectively)
> >
> > Some of the AML is the same, but obviously the _CRS, and
> > OperationRegion + Field are different, and I wanted to point that out
> > somehow :)
> >
> > Cheers,
> > --Gabriel
>
> You can do ifs as well.
Yeah, but the AML is generated from arch-specific locations in QEMU,
so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
generic AML blob with 'if' statements and insert it the same way on
all architectures, or would I ? Not sure what the best practice would
be for that :)
Speaking of AML, if we were to implement a "RDBL" (read-blob) method
for fw_cfg in AML, and call it from the guest-side kernel module,
we'll never be able to make it use DMA on ACPI systems. The way
fw_cfg_read_blob is written now, we could patch that in at some later
point. So that's an argument in favor of *at most* wrapping
acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
than including an acpi-specific version implemented on top of an
AML call.
Thanks,
--Gabriel
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 13:47 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-23 13:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > +{
> > > > > > + if (fw_cfg_is_mmio) {
> > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + } else {
> > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > + }
> > > > > > +}
> > > > > > +
> > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > >
> > > > > So for all arches which support ACPI, I think this driver
> > > > > should just rely on ACPI.
> > > >
> > > > There was a discussion about that a few versions ago, and IIRC the
> > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > after the guest kernel boots:
> > > >
> > > > https://lkml.org/lkml/2015/10/5/283
> > > >
> > >
> > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > for which fw cfg might be a reasonable choice.
> > >
> > > I suspect things changed - fw cfg used to be very slow but we now have
> > > DMA interface which makes it useful for a range of applications.
>
> Comment on this? I'm really worried we'll release linux
> without a way to access fw cfg from aml.
> How about taking acpi lock around all accesses?
You mean something like this (haven't tried compiling it yet, so it
might be a bit more complicated, but just for the purpose of this
conversation):
diff --git a/drivers/firmware/qemu_fw_cfg.c
b/drivers/firmware/qemu_fw_cfg.c
index fedbff5..3462a2c 100644
--- a/drivers/firmware/qemu_fw_cfg.c
+++ b/drivers/firmware/qemu_fw_cfg.c
@@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
static inline void fw_cfg_read_blob(u16 key,
void *buf, loff_t pos, size_t
count)
{
+#ifdef CONFIG_ACPI
+ acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
+#endif
mutex_lock(&fw_cfg_dev_lock);
iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
while (pos-- > 0)
ioread8(fw_cfg_reg_data);
ioread8_rep(fw_cfg_reg_data, buf, count);
mutex_unlock(&fw_cfg_dev_lock);
+#ifdef CONFIG_ACPI
+ acpi_os_release_mutex(acpi_gbl_osi_mutex);
+#endif
}
/* clean up fw_cfg device i/o */
I wouldn't particularly *mind* doing that, but I'd still like to hear
from other QEMU devs on whether it's really necessary.
> > > > (I even had a prototype version doing what you suggested, but per the above
> > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > see https://lkml.org/lkml/2015/11/4/534 )
> > >
> > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > not?
> >
> > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > between AML that would be generated on MMIO vs. IOPORT systems
> > (specifically, arm vs. x86, respectively)
> >
> > Some of the AML is the same, but obviously the _CRS, and
> > OperationRegion + Field are different, and I wanted to point that out
> > somehow :)
> >
> > Cheers,
> > --Gabriel
>
> You can do ifs as well.
Yeah, but the AML is generated from arch-specific locations in QEMU,
so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
generic AML blob with 'if' statements and insert it the same way on
all architectures, or would I ? Not sure what the best practice would
be for that :)
Speaking of AML, if we were to implement a "RDBL" (read-blob) method
for fw_cfg in AML, and call it from the guest-side kernel module,
we'll never be able to make it use DMA on ACPI systems. The way
fw_cfg_read_blob is written now, we could patch that in at some later
point. So that's an argument in favor of *at most* wrapping
acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
than including an acpi-specific version implemented on top of an
AML call.
Thanks,
--Gabriel
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
2016-02-23 13:47 ` Gabriel L. Somlo
(?)
@ 2016-02-23 14:14 ` Michael S. Tsirkin
-1 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 14:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > +{
> > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + } else {
> > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > >
> > > > > > So for all arches which support ACPI, I think this driver
> > > > > > should just rely on ACPI.
> > > > >
> > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > after the guest kernel boots:
> > > > >
> > > > > https://lkml.org/lkml/2015/10/5/283
> > > > >
> > > >
> > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > for which fw cfg might be a reasonable choice.
> > > >
> > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > DMA interface which makes it useful for a range of applications.
> >
> > Comment on this? I'm really worried we'll release linux
> > without a way to access fw cfg from aml.
> > How about taking acpi lock around all accesses?
>
> You mean something like this (haven't tried compiling it yet, so it
> might be a bit more complicated, but just for the purpose of this
> conversation):
>
> diff --git a/drivers/firmware/qemu_fw_cfg.c
> b/drivers/firmware/qemu_fw_cfg.c
> index fedbff5..3462a2c 100644
> --- a/drivers/firmware/qemu_fw_cfg.c
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> static inline void fw_cfg_read_blob(u16 key,
> void *buf, loff_t pos, size_t
> count)
> {
> +#ifdef CONFIG_ACPI
> + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> +#endif
> mutex_lock(&fw_cfg_dev_lock);
> iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> while (pos-- > 0)
> ioread8(fw_cfg_reg_data);
> ioread8_rep(fw_cfg_reg_data, buf, count);
> mutex_unlock(&fw_cfg_dev_lock);
> +#ifdef CONFIG_ACPI
> + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> +#endif
> }
>
> /* clean up fw_cfg device i/o */
Fundamentally yes.
> I wouldn't particularly *mind* doing that, but I'd still like to hear
> from other QEMU devs on whether it's really necessary.
It seems like a prudent thing to do IMHO, before this
goes out to users.
> > > > > (I even had a prototype version doing what you suggested, but per the above
> > > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > > see https://lkml.org/lkml/2015/11/4/534 )
> > > >
> > > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > > not?
> > >
> > > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > > between AML that would be generated on MMIO vs. IOPORT systems
> > > (specifically, arm vs. x86, respectively)
> > >
> > > Some of the AML is the same, but obviously the _CRS, and
> > > OperationRegion + Field are different, and I wanted to point that out
> > > somehow :)
> > >
> > > Cheers,
> > > --Gabriel
> >
> > You can do ifs as well.
>
> Yeah, but the AML is generated from arch-specific locations in QEMU,
> so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
> IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
> generic AML blob with 'if' statements and insert it the same way on
> all architectures, or would I ? Not sure what the best practice would
> be for that :)
Just regular C, put common code in a common function.
> Speaking of AML, if we were to implement a "RDBL" (read-blob) method
> for fw_cfg in AML, and call it from the guest-side kernel module,
> we'll never be able to make it use DMA on ACPI systems. The way
> fw_cfg_read_blob is written now, we could patch that in at some later
> point. So that's an argument in favor of *at most* wrapping
> acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
> than including an acpi-specific version implemented on top of an
> AML call.
>
> Thanks,
> --Gabriel
On balance, I think locking ACPI solves most problems so
if we just do that, I think what you did here is fine.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 14:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 14:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > +{
> > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + } else {
> > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > >
> > > > > > So for all arches which support ACPI, I think this driver
> > > > > > should just rely on ACPI.
> > > > >
> > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > after the guest kernel boots:
> > > > >
> > > > > https://lkml.org/lkml/2015/10/5/283
> > > > >
> > > >
> > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > for which fw cfg might be a reasonable choice.
> > > >
> > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > DMA interface which makes it useful for a range of applications.
> >
> > Comment on this? I'm really worried we'll release linux
> > without a way to access fw cfg from aml.
> > How about taking acpi lock around all accesses?
>
> You mean something like this (haven't tried compiling it yet, so it
> might be a bit more complicated, but just for the purpose of this
> conversation):
>
> diff --git a/drivers/firmware/qemu_fw_cfg.c
> b/drivers/firmware/qemu_fw_cfg.c
> index fedbff5..3462a2c 100644
> --- a/drivers/firmware/qemu_fw_cfg.c
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> static inline void fw_cfg_read_blob(u16 key,
> void *buf, loff_t pos, size_t
> count)
> {
> +#ifdef CONFIG_ACPI
> + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> +#endif
> mutex_lock(&fw_cfg_dev_lock);
> iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> while (pos-- > 0)
> ioread8(fw_cfg_reg_data);
> ioread8_rep(fw_cfg_reg_data, buf, count);
> mutex_unlock(&fw_cfg_dev_lock);
> +#ifdef CONFIG_ACPI
> + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> +#endif
> }
>
> /* clean up fw_cfg device i/o */
Fundamentally yes.
> I wouldn't particularly *mind* doing that, but I'd still like to hear
> from other QEMU devs on whether it's really necessary.
It seems like a prudent thing to do IMHO, before this
goes out to users.
> > > > > (I even had a prototype version doing what you suggested, but per the above
> > > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > > see https://lkml.org/lkml/2015/11/4/534 )
> > > >
> > > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > > not?
> > >
> > > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > > between AML that would be generated on MMIO vs. IOPORT systems
> > > (specifically, arm vs. x86, respectively)
> > >
> > > Some of the AML is the same, but obviously the _CRS, and
> > > OperationRegion + Field are different, and I wanted to point that out
> > > somehow :)
> > >
> > > Cheers,
> > > --Gabriel
> >
> > You can do ifs as well.
>
> Yeah, but the AML is generated from arch-specific locations in QEMU,
> so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
> IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
> generic AML blob with 'if' statements and insert it the same way on
> all architectures, or would I ? Not sure what the best practice would
> be for that :)
Just regular C, put common code in a common function.
> Speaking of AML, if we were to implement a "RDBL" (read-blob) method
> for fw_cfg in AML, and call it from the guest-side kernel module,
> we'll never be able to make it use DMA on ACPI systems. The way
> fw_cfg_read_blob is written now, we could patch that in at some later
> point. So that's an argument in favor of *at most* wrapping
> acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
> than including an acpi-specific version implemented on top of an
> AML call.
>
> Thanks,
> --Gabriel
On balance, I think locking ACPI solves most problems so
if we just do that, I think what you did here is fine.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-23 14:14 ` Michael S. Tsirkin
0 siblings, 0 replies; 57+ messages in thread
From: Michael S. Tsirkin @ 2016-02-23 14:14 UTC (permalink / raw)
To: Gabriel L. Somlo
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > +{
> > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + } else {
> > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > + }
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > >
> > > > > > So for all arches which support ACPI, I think this driver
> > > > > > should just rely on ACPI.
> > > > >
> > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > after the guest kernel boots:
> > > > >
> > > > > https://lkml.org/lkml/2015/10/5/283
> > > > >
> > > >
> > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > for which fw cfg might be a reasonable choice.
> > > >
> > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > DMA interface which makes it useful for a range of applications.
> >
> > Comment on this? I'm really worried we'll release linux
> > without a way to access fw cfg from aml.
> > How about taking acpi lock around all accesses?
>
> You mean something like this (haven't tried compiling it yet, so it
> might be a bit more complicated, but just for the purpose of this
> conversation):
>
> diff --git a/drivers/firmware/qemu_fw_cfg.c
> b/drivers/firmware/qemu_fw_cfg.c
> index fedbff5..3462a2c 100644
> --- a/drivers/firmware/qemu_fw_cfg.c
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> static inline void fw_cfg_read_blob(u16 key,
> void *buf, loff_t pos, size_t
> count)
> {
> +#ifdef CONFIG_ACPI
> + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> +#endif
> mutex_lock(&fw_cfg_dev_lock);
> iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> while (pos-- > 0)
> ioread8(fw_cfg_reg_data);
> ioread8_rep(fw_cfg_reg_data, buf, count);
> mutex_unlock(&fw_cfg_dev_lock);
> +#ifdef CONFIG_ACPI
> + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> +#endif
> }
>
> /* clean up fw_cfg device i/o */
Fundamentally yes.
> I wouldn't particularly *mind* doing that, but I'd still like to hear
> from other QEMU devs on whether it's really necessary.
It seems like a prudent thing to do IMHO, before this
goes out to users.
> > > > > (I even had a prototype version doing what you suggested, but per the above
> > > > > reference decided to drop it -- which IMHO is for the better, since otherwise
> > > > > I'd have had to ifdef between ACPI and non-ACPI versions of the driver --
> > > > > see https://lkml.org/lkml/2015/11/4/534 )
> > > >
> > > > I'm not sure why you have these ifdefs - they are on the host, are they
> > > > not?
> > >
> > > Think of those as "pseudocode" ifdefs, they're there to distinguish
> > > between AML that would be generated on MMIO vs. IOPORT systems
> > > (specifically, arm vs. x86, respectively)
> > >
> > > Some of the AML is the same, but obviously the _CRS, and
> > > OperationRegion + Field are different, and I wanted to point that out
> > > somehow :)
> > >
> > > Cheers,
> > > --Gabriel
> >
> > You can do ifs as well.
>
> Yeah, but the AML is generated from arch-specific locations in QEMU,
> so we'd be doing MMIO-only from e.g. hw/arm/virt-acpi-build.c, and
> IOPORT-only from hw/i386/acpi-build.c, etc. I wouldnt need to write a
> generic AML blob with 'if' statements and insert it the same way on
> all architectures, or would I ? Not sure what the best practice would
> be for that :)
Just regular C, put common code in a common function.
> Speaking of AML, if we were to implement a "RDBL" (read-blob) method
> for fw_cfg in AML, and call it from the guest-side kernel module,
> we'll never be able to make it use DMA on ACPI systems. The way
> fw_cfg_read_blob is written now, we could patch that in at some later
> point. So that's an argument in favor of *at most* wrapping
> acpi_os_acquire_mutex() around the current fw_cfg_read_blob, rather
> than including an acpi-specific version implemented on top of an
> AML call.
>
> Thanks,
> --Gabriel
On balance, I think locking ACPI solves most problems so
if we just do that, I think what you did here is fine.
--
MST
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-24 0:03 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-24 0:03 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
arnd, lersek, ralf, rmk+kernel, eric, hanjun.guo, zajec5,
sudeep.holla, agross, linux-api, linux-kernel, devicetree,
qemu-devel, imammedo, peter.maydell, leif.lindholm,
ard.biesheuvel, pbonzini, kraxel, ehabkost, luto, stefanha,
revol, matt, rth
On Tue, Feb 23, 2016 at 04:14:46PM +0200, Michael S. Tsirkin wrote:
> On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> > On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > > +{
> > > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + } else {
> > > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + }
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > > >
> > > > > > > So for all arches which support ACPI, I think this driver
> > > > > > > should just rely on ACPI.
> > > > > >
> > > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > > after the guest kernel boots:
> > > > > >
> > > > > > https://lkml.org/lkml/2015/10/5/283
> > > > > >
> > > > >
> > > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > > for which fw cfg might be a reasonable choice.
> > > > >
> > > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > > DMA interface which makes it useful for a range of applications.
> > >
> > > Comment on this? I'm really worried we'll release linux
> > > without a way to access fw cfg from aml.
> > > How about taking acpi lock around all accesses?
> >
> > You mean something like this (haven't tried compiling it yet, so it
> > might be a bit more complicated, but just for the purpose of this
> > conversation):
> >
> > diff --git a/drivers/firmware/qemu_fw_cfg.c
> > b/drivers/firmware/qemu_fw_cfg.c
> > index fedbff5..3462a2c 100644
> > --- a/drivers/firmware/qemu_fw_cfg.c
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> > static inline void fw_cfg_read_blob(u16 key,
> > void *buf, loff_t pos, size_t
> > count)
> > {
> > +#ifdef CONFIG_ACPI
> > + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> > +#endif
> > mutex_lock(&fw_cfg_dev_lock);
> > iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > while (pos-- > 0)
> > ioread8(fw_cfg_reg_data);
> > ioread8_rep(fw_cfg_reg_data, buf, count);
> > mutex_unlock(&fw_cfg_dev_lock);
> > +#ifdef CONFIG_ACPI
> > + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> > +#endif
> > }
> >
> > /* clean up fw_cfg device i/o */
>
> Fundamentally yes.
>
> > I wouldn't particularly *mind* doing that, but I'd still like to hear
> > from other QEMU devs on whether it's really necessary.
>
> It seems like a prudent thing to do IMHO, before this
> goes out to users.
>
> [...]
>
> On balance, I think locking ACPI solves most problems so
> if we just do that, I think what you did here is fine.
Only trouble is, acpi_gbl_osi_mutex seems to be "private" to the acpi
subsystem, and I'm not sure how well a patch to allow some random
module to lock/unlock ACPI at its convenience would be received...
So unless I'm missing something obvious (wouldn't be the first time),
I think we're back to where *if* we *have* to do this [*], providing an
AML blob-reader method in ACPI and punting to it from the guest-side
kernel module (via #ifdef CONFIG_ACPI) would be the less painful
alternative.
[*] that is, mutual exclusion between kernel and firmware regarding
fw_cfg is (back) on the table, for real this time...
It would be good to know that it's the new consensus among QEMU
folks, since I have a strong feeling I'd no longer be "Keeping It Simple"
by moving in this direction.
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [Qemu-devel] [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-24 0:03 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-24 0:03 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: mark.rutland, peter.maydell, matt, stefanha, qemu-devel, eric,
kraxel, linux-api, agross, pawel.moll, zajec5, rmk+kernel,
lersek, devicetree, ehabkost, arnd, ijc+devicetree, galak,
leif.lindholm, robh+dt, pbonzini, rth, ard.biesheuvel, gregkh,
linux-kernel, luto, hanjun.guo, sudeep.holla, imammedo, revol
On Tue, Feb 23, 2016 at 04:14:46PM +0200, Michael S. Tsirkin wrote:
> On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> > On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > > +{
> > > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + } else {
> > > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + }
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > > >
> > > > > > > So for all arches which support ACPI, I think this driver
> > > > > > > should just rely on ACPI.
> > > > > >
> > > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > > after the guest kernel boots:
> > > > > >
> > > > > > https://lkml.org/lkml/2015/10/5/283
> > > > > >
> > > > >
> > > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > > for which fw cfg might be a reasonable choice.
> > > > >
> > > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > > DMA interface which makes it useful for a range of applications.
> > >
> > > Comment on this? I'm really worried we'll release linux
> > > without a way to access fw cfg from aml.
> > > How about taking acpi lock around all accesses?
> >
> > You mean something like this (haven't tried compiling it yet, so it
> > might be a bit more complicated, but just for the purpose of this
> > conversation):
> >
> > diff --git a/drivers/firmware/qemu_fw_cfg.c
> > b/drivers/firmware/qemu_fw_cfg.c
> > index fedbff5..3462a2c 100644
> > --- a/drivers/firmware/qemu_fw_cfg.c
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> > static inline void fw_cfg_read_blob(u16 key,
> > void *buf, loff_t pos, size_t
> > count)
> > {
> > +#ifdef CONFIG_ACPI
> > + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> > +#endif
> > mutex_lock(&fw_cfg_dev_lock);
> > iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > while (pos-- > 0)
> > ioread8(fw_cfg_reg_data);
> > ioread8_rep(fw_cfg_reg_data, buf, count);
> > mutex_unlock(&fw_cfg_dev_lock);
> > +#ifdef CONFIG_ACPI
> > + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> > +#endif
> > }
> >
> > /* clean up fw_cfg device i/o */
>
> Fundamentally yes.
>
> > I wouldn't particularly *mind* doing that, but I'd still like to hear
> > from other QEMU devs on whether it's really necessary.
>
> It seems like a prudent thing to do IMHO, before this
> goes out to users.
>
> [...]
>
> On balance, I think locking ACPI solves most problems so
> if we just do that, I think what you did here is fine.
Only trouble is, acpi_gbl_osi_mutex seems to be "private" to the acpi
subsystem, and I'm not sure how well a patch to allow some random
module to lock/unlock ACPI at its convenience would be received...
So unless I'm missing something obvious (wouldn't be the first time),
I think we're back to where *if* we *have* to do this [*], providing an
AML blob-reader method in ACPI and punting to it from the guest-side
kernel module (via #ifdef CONFIG_ACPI) would be the less painful
alternative.
[*] that is, mutual exclusion between kernel and firmware regarding
fw_cfg is (back) on the table, for real this time...
It would be good to know that it's the new consensus among QEMU
folks, since I have a strong feeling I'd no longer be "Keeping It Simple"
by moving in this direction.
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v8 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device
@ 2016-02-24 0:03 ` Gabriel L. Somlo
0 siblings, 0 replies; 57+ messages in thread
From: Gabriel L. Somlo @ 2016-02-24 0:03 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A, pawel.moll-5wv7dgnIgG8,
mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
galak-sgV2jX0FEOL9JmXXK+q4OQ, arnd-r2nGTMty4D4,
lersek-H+wXaHxf7aLQT0dZR+AlfA, ralf-6z/3iImG2C8G8FEW9MqTrA,
rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ, eric-WhKQ6XTQaPysTnJN9+BGXg,
hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, zajec5-Re5JQEeQqe8AvxtiuMwx3w,
sudeep.holla-5wv7dgnIgG8, agross-sgV2jX0FEOL9JmXXK+q4OQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
qemu-devel-qX2TKyscuCcdnm+yROfE0A,
imammedo-H+wXaHxf7aLQT0dZR+AlfA,
peter.maydell-QSEj5FYQhm4dnm+yROfE0A,
leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
pbonzini-H+wXaHxf7aLQT0dZR+AlfA, kraxel-H+wXaHxf7aLQT0dZR+AlfA,
ehabkost-H+wXaHxf7aLQT0dZR+AlfA, luto-kltTT9wpgjJwATOyAt5JVQ,
stefanha-Re5JQEeQqe8AvxtiuMwx3w, revol-GANU6spQydw,
matt-mF/unelCI9GS6iBeEJttW/XRex20P6io,
rth-hL46jP5Bxq7R7s880joybQ
On Tue, Feb 23, 2016 at 04:14:46PM +0200, Michael S. Tsirkin wrote:
> On Tue, Feb 23, 2016 at 08:47:00AM -0500, Gabriel L. Somlo wrote:
> > On Tue, Feb 23, 2016 at 07:07:36AM +0200, Michael S. Tsirkin wrote:
> > > On Mon, Feb 22, 2016 at 03:26:23PM -0500, Gabriel L. Somlo wrote:
> > > > On Mon, Feb 22, 2016 at 10:14:50PM +0200, Michael S. Tsirkin wrote:
> > > > > On Sun, Feb 21, 2016 at 08:06:17AM -0500, Gabriel L. Somlo wrote:
> > > > > > > > +static void fw_cfg_io_cleanup(void)
> > > > > > > > +{
> > > > > > > > + if (fw_cfg_is_mmio) {
> > > > > > > > + iounmap(fw_cfg_dev_base);
> > > > > > > > + release_mem_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + } else {
> > > > > > > > + ioport_unmap(fw_cfg_dev_base);
> > > > > > > > + release_region(fw_cfg_p_base, fw_cfg_p_size);
> > > > > > > > + }
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
> > > > > > >
> > > > > > > So for all arches which support ACPI, I think this driver
> > > > > > > should just rely on ACPI.
> > > > > >
> > > > > > There was a discussion about that a few versions ago, and IIRC the
> > > > > > conclusion was not to expect the firmware to contend for fw_cfg access
> > > > > > after the guest kernel boots:
> > > > > >
> > > > > > https://lkml.org/lkml/2015/10/5/283
> > > > > >
> > > > >
> > > > > So it looks like NVDIMM at least wants to pass label data to guest -
> > > > > for which fw cfg might be a reasonable choice.
> > > > >
> > > > > I suspect things changed - fw cfg used to be very slow but we now have
> > > > > DMA interface which makes it useful for a range of applications.
> > >
> > > Comment on this? I'm really worried we'll release linux
> > > without a way to access fw cfg from aml.
> > > How about taking acpi lock around all accesses?
> >
> > You mean something like this (haven't tried compiling it yet, so it
> > might be a bit more complicated, but just for the purpose of this
> > conversation):
> >
> > diff --git a/drivers/firmware/qemu_fw_cfg.c
> > b/drivers/firmware/qemu_fw_cfg.c
> > index fedbff5..3462a2c 100644
> > --- a/drivers/firmware/qemu_fw_cfg.c
> > +++ b/drivers/firmware/qemu_fw_cfg.c
> > @@ -77,12 +77,18 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
> > static inline void fw_cfg_read_blob(u16 key,
> > void *buf, loff_t pos, size_t
> > count)
> > {
> > +#ifdef CONFIG_ACPI
> > + acpi_os_acquire_mutex(acpi_gbl_osi_mutex, ACPI_WAIT_FOREVER);
> > +#endif
> > mutex_lock(&fw_cfg_dev_lock);
> > iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
> > while (pos-- > 0)
> > ioread8(fw_cfg_reg_data);
> > ioread8_rep(fw_cfg_reg_data, buf, count);
> > mutex_unlock(&fw_cfg_dev_lock);
> > +#ifdef CONFIG_ACPI
> > + acpi_os_release_mutex(acpi_gbl_osi_mutex);
> > +#endif
> > }
> >
> > /* clean up fw_cfg device i/o */
>
> Fundamentally yes.
>
> > I wouldn't particularly *mind* doing that, but I'd still like to hear
> > from other QEMU devs on whether it's really necessary.
>
> It seems like a prudent thing to do IMHO, before this
> goes out to users.
>
> [...]
>
> On balance, I think locking ACPI solves most problems so
> if we just do that, I think what you did here is fine.
Only trouble is, acpi_gbl_osi_mutex seems to be "private" to the acpi
subsystem, and I'm not sure how well a patch to allow some random
module to lock/unlock ACPI at its convenience would be received...
So unless I'm missing something obvious (wouldn't be the first time),
I think we're back to where *if* we *have* to do this [*], providing an
AML blob-reader method in ACPI and punting to it from the guest-side
kernel module (via #ifdef CONFIG_ACPI) would be the less painful
alternative.
[*] that is, mutual exclusion between kernel and firmware regarding
fw_cfg is (back) on the table, for real this time...
It would be good to know that it's the new consensus among QEMU
folks, since I have a strong feeling I'd no longer be "Keeping It Simple"
by moving in this direction.
Thanks,
--Gabriel
^ permalink raw reply [flat|nested] 57+ messages in thread