linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] device.h: pack struct dev_links_info
@ 2019-02-26 14:41 Greg Kroah-Hartman
  2019-02-26 14:41 ` [PATCH 2/2] device.h: reorganize struct device Greg Kroah-Hartman
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-26 14:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki

The dev_links_info structure has 4 bytes of padding at the end of it
when embedded in struct device (which is the only place it lives).  To
help reduce the size of struct device, pack this structure so we can take
advantage of the hole with later structure reorganizations.

Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 6cb4640b6160..b63165276a09 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -884,7 +884,7 @@ struct dev_links_info {
 	struct list_head suppliers;
 	struct list_head consumers;
 	enum dl_dev_state status;
-};
+} __packed;
 
 /**
  * struct device - The basic device structure
-- 
2.21.0



* [PATCH 2/2] device.h: reorganize struct device
  2019-02-26 14:41 [PATCH 1/2] device.h: pack struct dev_links_info Greg Kroah-Hartman
@ 2019-02-26 14:41 ` Greg Kroah-Hartman
  2019-02-26 15:40 ` [PATCH 1/2] device.h: pack struct dev_links_info Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-26 14:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki

struct device is big, around 760 bytes on x86_64.  It's not a critical
structure, but it is embedded everywhere, so making it smaller is always
a good thing.

With a recent patch that moved a field from struct device to the private
structure, some benchmarks showed a very odd regression, despite this
structure having nothing to do with those benchmarks.  That caused me to
look into the layout of the structure.  'pahole' showed a number of
holes, and ways the structure could be reordered to align some
cachelines better and to reduce the overall size of the structure.

This patch removes 16 bytes from 'struct device' on a 64bit system, just
by moving things around.  Given we know there are systems with at least
30k devices in memory at once, every little byte counts, and this change
could be a savings of 480k of kernel memory for them.  On "normal"
systems the overall memory savings would be much less.
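
As a rough illustration of how reordering alone can shrink a structure,
here is a minimal sketch, assuming an LP64 target such as x86_64 built
with GCC or Clang.  The struct and function names are hypothetical
stand-ins, not the real struct device members:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical layout: each 4-byte member is followed by a pointer or by
 * the end of the struct, so 4 bytes of padding follow each of them.
 */
struct demo_before {
	void		*parent;	/* offset  0, 8 bytes */
	int		numa_node;	/* offset  8, 4 bytes + 4-byte hole */
	void		*driver;	/* offset 16, 8 bytes */
	unsigned int	id;		/* offset 24, 4 bytes + 4 bytes padding */
};

/* Same members, reordered so the two 4-byte fields share one 8-byte slot. */
struct demo_after {
	void		*parent;
	void		*driver;
	int		numa_node;
	unsigned int	id;
};

/* Bytes saved per object purely by moving members around. */
size_t demo_saved_per_object(void)
{
	assert(sizeof(struct demo_after) <= sizeof(struct demo_before));
	return sizeof(struct demo_before) - sizeof(struct demo_after);
}

/* Total savings for a given per-object saving and number of objects. */
size_t demo_total_saved(size_t per_object, size_t nr_objects)
{
	return per_object * nr_objects;
}
```

Under those assumptions demo_saved_per_object() is 8 for this toy
layout, and demo_total_saved(16, 30000) gives the 480000 bytes quoted
above for the real structure.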

Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/device.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index b63165276a09..4e6b9a2ab8d0 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -967,18 +967,14 @@ struct dev_links_info {
  * a higher-level representation of the device.
  */
 struct device {
+	struct kobject kobj;
 	struct device		*parent;
 
 	struct device_private	*p;
 
-	struct kobject kobj;
 	const char		*init_name; /* initial name of the device */
 	const struct device_type *type;
 
-	struct mutex		mutex;	/* mutex to synchronize calls to
-					 * its driver.
-					 */
-
 	struct bus_type	*bus;		/* type of bus device is on */
 	struct device_driver *driver;	/* which driver has allocated this
 					   device */
@@ -986,7 +982,14 @@ struct device {
 					   core doesn't touch it */
 	void		*driver_data;	/* Driver data, set and get with
 					   dev_set/get_drvdata */
+	struct mutex		mutex;	/* mutex to synchronize calls to
+					 * its driver.
+					 */
+
 	struct dev_links_info	links;
+	spinlock_t		devres_lock;
+	struct list_head	devres_head;
+
 	struct dev_pm_info	power;
 	struct dev_pm_domain	*pm_domain;
 
@@ -1000,9 +1003,6 @@ struct device {
 	struct list_head	msi_list;
 #endif
 
-#ifdef CONFIG_NUMA
-	int		numa_node;	/* NUMA node this device is close to */
-#endif
 	const struct dma_map_ops *dma_ops;
 	u64		*dma_mask;	/* dma mask (if dma'able device) */
 	u64		coherent_dma_mask;/* Like dma_mask, but for
@@ -1032,9 +1032,6 @@ struct device {
 	dev_t			devt;	/* dev_t, creates the sysfs "dev" */
 	u32			id;	/* device instance */
 
-	spinlock_t		devres_lock;
-	struct list_head	devres_head;
-
 	struct klist_node	knode_class;
 	struct class		*class;
 	const struct attribute_group **groups;	/* optional groups */
@@ -1043,6 +1040,9 @@ struct device {
 	struct iommu_group	*iommu_group;
 	struct iommu_fwspec	*iommu_fwspec;
 
+#ifdef CONFIG_NUMA
+	int		numa_node;	/* NUMA node this device is close to */
+#endif
 	bool			offline_disabled:1;
 	bool			offline:1;
 	bool			of_node_reused:1;
-- 
2.21.0



* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-26 14:41 [PATCH 1/2] device.h: pack struct dev_links_info Greg Kroah-Hartman
  2019-02-26 14:41 ` [PATCH 2/2] device.h: reorganize struct device Greg Kroah-Hartman
@ 2019-02-26 15:40 ` Rafael J. Wysocki
  2019-02-27  9:23 ` Johan Hovold
  2019-02-28 13:58 ` [PATCH v2] device.h: reorganize struct device Greg Kroah-Hartman
  3 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2019-02-26 15:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel

On 2/26/2019 3:41 PM, Greg Kroah-Hartman wrote:
> The dev_links_info structure has 4 bytes of padding at the end of it
> when embedded in struct device (which is the only place it lives).  To
> help reduce the size of struct device pack this structure so we can take
> advantage of the hole with later structure reorganizations.
>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
>   include/linux/device.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 6cb4640b6160..b63165276a09 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -884,7 +884,7 @@ struct dev_links_info {
>   	struct list_head suppliers;
>   	struct list_head consumers;
>   	enum dl_dev_state status;
> -};
> +} __packed;
>   
>   /**
>    * struct device - The basic device structure




* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-26 14:41 [PATCH 1/2] device.h: pack struct dev_links_info Greg Kroah-Hartman
  2019-02-26 14:41 ` [PATCH 2/2] device.h: reorganize struct device Greg Kroah-Hartman
  2019-02-26 15:40 ` [PATCH 1/2] device.h: pack struct dev_links_info Rafael J. Wysocki
@ 2019-02-27  9:23 ` Johan Hovold
  2019-02-27  9:31   ` Greg Kroah-Hartman
  2019-02-28 13:58 ` [PATCH v2] device.h: reorganize struct device Greg Kroah-Hartman
  3 siblings, 1 reply; 13+ messages in thread
From: Johan Hovold @ 2019-02-27  9:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Rafael J. Wysocki

On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> The dev_links_info structure has 4 bytes of padding at the end of it
> when embedded in struct device (which is the only place it lives).  To
> help reduce the size of struct device pack this structure so we can take
> advantage of the hole with later structure reorganizations.
> 
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  include/linux/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 6cb4640b6160..b63165276a09 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -884,7 +884,7 @@ struct dev_links_info {
>  	struct list_head suppliers;
>  	struct list_head consumers;
>  	enum dl_dev_state status;
> -};
> +} __packed;

This seems like a bad idea. You're changing the alignment of these
fields to one byte, something which may cause the compiler to generate
less efficient code to deal with unaligned accesses (even if they happen
to currently be naturally aligned in struct device).

I don't think we should mess with __packed unless for things that
actually require it (e.g. data going on to the wire) even if it means
wasting 4 bytes on 64-bit archs.

Johan


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27  9:23 ` Johan Hovold
@ 2019-02-27  9:31   ` Greg Kroah-Hartman
  2019-02-27  9:40     ` Johan Hovold
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-27  9:31 UTC (permalink / raw)
  To: Johan Hovold; +Cc: linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 10:23:18AM +0100, Johan Hovold wrote:
> On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> > The dev_links_info structure has 4 bytes of padding at the end of it
> > when embedded in struct device (which is the only place it lives).  To
> > help reduce the size of struct device pack this structure so we can take
> > advantage of the hole with later structure reorganizations.
> > 
> > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > ---
> >  include/linux/device.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 6cb4640b6160..b63165276a09 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -884,7 +884,7 @@ struct dev_links_info {
> >  	struct list_head suppliers;
> >  	struct list_head consumers;
> >  	enum dl_dev_state status;
> > -};
> > +} __packed;
> 
> This seems like a bad idea. You're changing the alignment of these
> fields to one byte, something which may cause the compiler to generate
> less efficient code to deal with unaligned accesses (even if they happen
> to currently be naturally aligned in struct device).

No, all this changes is the trailing "space" is gone.  The alignment of
the fields did not change at all as they are all naturally aligned
(list_head is just 2 pointers).

Here's the pahole output before and after this patch:

Before:
struct dev_links_info {
        struct list_head           suppliers;            /*     0    16 */
        struct list_head           consumers;            /*    16    16 */
        enum dl_dev_state          status;               /*    32     4 */

        /* size: 40, cachelines: 1, members: 3 */
        /* padding: 4 */
        /* last cacheline: 40 bytes */
};

After:
struct dev_links_info {
        struct list_head           suppliers;            /*     0    16 */
        struct list_head           consumers;            /*    16    16 */
        enum dl_dev_state          status;               /*    32     4 */

        /* size: 36, cachelines: 1, members: 3 */
        /* last cacheline: 36 bytes */
};


So this allows us to save 4 bytes in struct device by putting something in that
trailing "hole" that can be aligned with it better (i.e. an integer or
something else).
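
The same effect can be reproduced outside the kernel.  This is only a
sketch, assuming an LP64 target with GCC or Clang; the demo_* types are
hypothetical stand-ins for list_head, dl_dev_state and struct device:

```c
#include <assert.h>
#include <stddef.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;	/* two pointers, 16 bytes */
};

enum demo_state { DEMO_NO_DRIVER, DEMO_PROBING };

struct demo_links_plain {
	struct demo_list_head suppliers;
	struct demo_list_head consumers;
	enum demo_state status;		/* followed by 4 bytes of padding */
};

struct demo_links_packed {
	struct demo_list_head suppliers;
	struct demo_list_head consumers;
	enum demo_state status;		/* no trailing padding */
} __attribute__((__packed__));

/* An int placed right after the links member in an enclosing struct. */
struct demo_dev_plain {
	struct demo_links_plain links;
	int numa_node;
};

struct demo_dev_packed {
	struct demo_links_packed links;
	int numa_node;
};

/* How many bytes the links structure itself loses when packed. */
size_t demo_links_shrink(void)
{
	assert(sizeof(struct demo_links_packed) <=
	       sizeof(struct demo_links_plain));
	return sizeof(struct demo_links_plain) -
	       sizeof(struct demo_links_packed);
}

/* Offset of the member that follows the links structure. */
size_t demo_next_offset_plain(void)
{
	return offsetof(struct demo_dev_plain, numa_node);
}

size_t demo_next_offset_packed(void)
{
	return offsetof(struct demo_dev_packed, numa_node);
}
```

With those assumptions the following int moves from offset 40 to offset
36, i.e. into what used to be the trailing padding.  How much the
enclosing structure shrinks overall depends on what else sits next to
it.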

thanks,

greg k-h


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27  9:31   ` Greg Kroah-Hartman
@ 2019-02-27  9:40     ` Johan Hovold
  2019-02-27  9:54       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 13+ messages in thread
From: Johan Hovold @ 2019-02-27  9:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Johan Hovold, linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 10:31:04AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Feb 27, 2019 at 10:23:18AM +0100, Johan Hovold wrote:
> > On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> > > The dev_links_info structure has 4 bytes of padding at the end of it
> > > when embedded in struct device (which is the only place it lives).  To
> > > help reduce the size of struct device pack this structure so we can take
> > > advantage of the hole with later structure reorganizations.
> > > 
> > > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > ---
> > >  include/linux/device.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > index 6cb4640b6160..b63165276a09 100644
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -884,7 +884,7 @@ struct dev_links_info {
> > >  	struct list_head suppliers;
> > >  	struct list_head consumers;
> > >  	enum dl_dev_state status;
> > > -};
> > > +} __packed;
> > 
> > This seems like a bad idea. You're changing the alignment of these
> > fields to one byte, something which may cause the compiler to generate
> > less efficient code to deal with unaligned accesses (even if they happen
> > to currently be naturally aligned in struct device).
> 
> No, all this changes is the trailing "space" is gone.  The alignment of
> the fields did not change at all as they are all naturally aligned
> (list_head is just 2 pointers).

Yes, currently and in struct device, but given a pointer to a struct
dev_links_info the compiler must assume it is unaligned and act
accordingly for example.

> So this allows us to save 4 bytes in struct device by putting something in that
> trailing "hole" that can be aligned with it better (i.e. an integer or
> something else).

I understand that, but I don't think it is worth starting to use packed
like this for internal structures as it may have subtle and unintended
consequences.

Johan


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27  9:40     ` Johan Hovold
@ 2019-02-27  9:54       ` Greg Kroah-Hartman
  2019-02-27 10:59         ` Johan Hovold
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-27  9:54 UTC (permalink / raw)
  To: Johan Hovold; +Cc: linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 10:40:21AM +0100, Johan Hovold wrote:
> On Wed, Feb 27, 2019 at 10:31:04AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Feb 27, 2019 at 10:23:18AM +0100, Johan Hovold wrote:
> > > On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> > > > The dev_links_info structure has 4 bytes of padding at the end of it
> > > > when embedded in struct device (which is the only place it lives).  To
> > > > help reduce the size of struct device pack this structure so we can take
> > > > advantage of the hole with later structure reorganizations.
> > > > 
> > > > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > ---
> > > >  include/linux/device.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > > index 6cb4640b6160..b63165276a09 100644
> > > > --- a/include/linux/device.h
> > > > +++ b/include/linux/device.h
> > > > @@ -884,7 +884,7 @@ struct dev_links_info {
> > > >  	struct list_head suppliers;
> > > >  	struct list_head consumers;
> > > >  	enum dl_dev_state status;
> > > > -};
> > > > +} __packed;
> > > 
> > > This seems like a bad idea. You're changing the alignment of these
> > > fields to one byte, something which may cause the compiler to generate
> > > less efficient code to deal with unaligned accesses (even if they happen
> > > to currently be naturally aligned in struct device).
> > 
> > No, all this changes is the trailing "space" is gone.  The alignment of
> > the fields did not change at all as they are all naturally aligned
> > (list_head is just 2 pointers).
> 
> Yes, currently and in struct device, but given a pointer to a struct
> dev_links_info the compiler must assume it is unaligned and act
> accordingly for example.

Packing the structure doesn't mean that the addressing of it is not also
aligned, that should just depend on the location of the pointer in the
first place, right?

Surely compilers are not that foolish :)

And accessing this field should not be an issue of "slow", hopefully the
memory savings would offset any compiler mess.

> > So this allows us to save 4 bytes in struct device by putting something in that
> > trailing "hole" that can be aligned with it better (i.e. an integer or
> > something else).
> 
> I understand that, but I don't think it is worth starting to use packed
> like this for internal structures as it may have subtle and unintended
> consequences.

I'm not understanding what the consequences are here, sorry.  Does the
compiler output change given that the structure is still aligned
properly in the "parent" structure?  I can't see any output changed
here, but maybe I am not looking properly?

thanks,

greg k-h


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27  9:54       ` Greg Kroah-Hartman
@ 2019-02-27 10:59         ` Johan Hovold
  2019-02-27 12:06           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 13+ messages in thread
From: Johan Hovold @ 2019-02-27 10:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Johan Hovold, linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 10:54:24AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Feb 27, 2019 at 10:40:21AM +0100, Johan Hovold wrote:
> > On Wed, Feb 27, 2019 at 10:31:04AM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Feb 27, 2019 at 10:23:18AM +0100, Johan Hovold wrote:
> > > > On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> > > > > The dev_links_info structure has 4 bytes of padding at the end of it
> > > > > when embedded in struct device (which is the only place it lives).  To
> > > > > help reduce the size of struct device pack this structure so we can take
> > > > > advantage of the hole with later structure reorganizations.
> > > > > 
> > > > > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > > > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > ---
> > > > >  include/linux/device.h | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > > > index 6cb4640b6160..b63165276a09 100644
> > > > > --- a/include/linux/device.h
> > > > > +++ b/include/linux/device.h
> > > > > @@ -884,7 +884,7 @@ struct dev_links_info {
> > > > >  	struct list_head suppliers;
> > > > >  	struct list_head consumers;
> > > > >  	enum dl_dev_state status;
> > > > > -};
> > > > > +} __packed;
> > > > 
> > > > This seems like a bad idea. You're changing the alignment of these
> > > > fields to one byte, something which may cause the compiler to generate
> > > > less efficient code to deal with unaligned accesses (even if they happen
> > > > to currently be naturally aligned in struct device).
> > > 
> > > No, all this changes is the trailing "space" is gone.  The alignment of
> > > the fields did not change at all as they are all naturally aligned
> > > (list_head is just 2 pointers).
> > 
> > Yes, currently and in struct device, but given a pointer to a struct
> > dev_links_info the compiler must assume it is unaligned and act
> > accordingly for example.
> 
> Packing the structure doesn't mean that the addressing of it is not also
> aligned, that should just depend on the location of the pointer in the
> first place, right?

Packing a structure per definition means changing the alignment
requirement of each field of the struct to 1-byte alignment.

Another example of unintended consequences would obviously be that if
someone later adds a short field, say 1 byte, before the dev_links_info
struct, all its fields would be non-naturally aligned in struct device
as well.

Sure that can be avoided by inspection (and refusal to add new holes),
but again, not obvious when the link structure is defined elsewhere.
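
A minimal sketch of that scenario, assuming an LP64 target with GCC or
Clang.  All demo_* names, including the 1-byte "flag" member, are
hypothetical and not taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *next, *prev;
};

struct demo_links_unpadded {
	struct demo_node suppliers;
	struct demo_node consumers;
	int status;
} __attribute__((__packed__));	/* alignment requirement drops to 1 */

struct demo_links_padded {
	struct demo_node suppliers;
	struct demo_node consumers;
	int status;
};

/* A hypothetical 1-byte member added in front of the packed struct. */
struct demo_dev_shifted {
	char flag;
	struct demo_links_unpadded links;	/* placed at offset 1 */
};

/* The same addition in front of the ordinary struct. */
struct demo_dev_padded {
	char flag;
	struct demo_links_padded links;	/* compiler pads up to offset 8 */
};

/* Offset of links.status from the start of the enclosing struct. */
size_t demo_status_offset_packed(void)
{
	return offsetof(struct demo_dev_shifted, links) +
	       offsetof(struct demo_links_unpadded, status);
}

size_t demo_status_offset_plain(void)
{
	return offsetof(struct demo_dev_padded, links) +
	       offsetof(struct demo_links_padded, status);
}

/* True if an offset is a multiple of the given alignment. */
int demo_is_naturally_aligned(size_t offset, size_t align)
{
	assert(align != 0);
	return offset % align == 0;
}
```

Under those assumptions the status field ends up at offset 33 in the
packed case, which is not a multiple of the alignment of int, while the
unpacked variant keeps it at offset 40.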

> Surely compilers are not that foolish :)
> 
> And accessing this field should not be an issue of "slow", hopefully the
> memory savings would offset any compiler mess.

There are other subtleties like atomicity that may come into play.

And even if any penalties are deemed acceptable in this case, you're
also setting a precedent for others. Note that we do not seem to use
__packed this way currently.

> > > So this allows us to save 4 bytes in struct device by putting something in that
> > > trailing "hole" that can be aligned with it better (i.e. an integer or
> > > something else).
> > 
> > I understand that, but I don't think it is worth starting to use packed
> > like this for internal structures as it may have subtle and unintended
> > consequences.
> 
> I'm not understanding what the consequences are here, sorry.  Does the
> compiler output change given that the structure is still aligned
> properly in the "parent" structure?  I can't see any output changed
> here, but maybe I am not looking properly?

It's all arch dependent, and you won't see any difference on x86-64.

The following example produces additional instructions even on 32-bit
arm here:

struct a1 {
	void *p;
	void *q;
	int i;
} __attribute__((__packed__));

struct a2 {
	void *p;
	void *q;
	int i;
};

int f(struct a1 *a)
{
	return a->i;
}

int g(struct a2 *a)
{
	return a->i;
}
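
The reason the two functions can differ is visible in the alignment the
compiler records for each type.  A small companion sketch, assuming GCC
or Clang with C11 _Alignof; the structs and f()/g() are repeated from
above so that it compiles standalone, and the helper names are made up
for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct a1 {
	void *p;
	void *q;
	int i;
} __attribute__((__packed__));

struct a2 {
	void *p;
	void *q;
	int i;
};

/* Alignment the compiler may assume for a pointer to each type. */
size_t align_of_a1(void)
{
	return _Alignof(struct a1);	/* packed: byte alignment only */
}

size_t align_of_a2(void)
{
	return _Alignof(struct a2);	/* alignment of the widest member */
}

int f(struct a1 *a)
{
	return a->i;
}

int g(struct a2 *a)
{
	return a->i;
}

/* Both accessors return the same value; only the generated code differs. */
int same_result(int value)
{
	struct a1 x = { .i = value };
	struct a2 y = { .i = value };

	assert(f(&x) == g(&y));
	return f(&x);
}
```

Compiling this with -S for a target that cares about alignment and
comparing f() with g() should show the difference; on x86-64 the two
are expected to look the same.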

Johan


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27 10:59         ` Johan Hovold
@ 2019-02-27 12:06           ` Greg Kroah-Hartman
  2019-02-27 13:32             ` Johan Hovold
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-27 12:06 UTC (permalink / raw)
  To: Johan Hovold; +Cc: linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 11:59:51AM +0100, Johan Hovold wrote:
> On Wed, Feb 27, 2019 at 10:54:24AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Feb 27, 2019 at 10:40:21AM +0100, Johan Hovold wrote:
> > > On Wed, Feb 27, 2019 at 10:31:04AM +0100, Greg Kroah-Hartman wrote:
> > > > On Wed, Feb 27, 2019 at 10:23:18AM +0100, Johan Hovold wrote:
> > > > > On Tue, Feb 26, 2019 at 03:41:07PM +0100, Greg Kroah-Hartman wrote:
> > > > > > The dev_links_info structure has 4 bytes of padding at the end of it
> > > > > > when embedded in struct device (which is the only place it lives).  To
> > > > > > help reduce the size of struct device pack this structure so we can take
> > > > > > advantage of the hole with later structure reorganizations.
> > > > > > 
> > > > > > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > > > > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > ---
> > > > > >  include/linux/device.h | 2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/include/linux/device.h b/include/linux/device.h
> > > > > > index 6cb4640b6160..b63165276a09 100644
> > > > > > --- a/include/linux/device.h
> > > > > > +++ b/include/linux/device.h
> > > > > > @@ -884,7 +884,7 @@ struct dev_links_info {
> > > > > >  	struct list_head suppliers;
> > > > > >  	struct list_head consumers;
> > > > > >  	enum dl_dev_state status;
> > > > > > -};
> > > > > > +} __packed;
> > > > > 
> > > > > This seems like a bad idea. You're changing the alignment of these
> > > > > fields to one byte, something which may cause the compiler to generate
> > > > > less efficient code to deal with unaligned accesses (even if they happen
> > > > > to currently be naturally aligned in struct device).
> > > > 
> > > > No, all this changes is the trailing "space" is gone.  The alignment of
> > > > the fields did not change at all as they are all naturally aligned
> > > > (list_head is just 2 pointers).
> > > 
> > > Yes, currently and in struct device, but given a pointer to a struct
> > > dev_links_info the compiler must assume it is unaligned and act
> > > accordingly for example.
> > 
> > Packing the structure doesn't mean that the addressing of it is not also
> > aligned, that should just depend on the location of the pointer in the
> > first place, right?
> 
> Packing a structure per definition means changing the alignment
> requirement of each field of the struct to 1-byte alignment.
> 
> Another example of unintended consequences would obviously be that if
> someone later adds a short field, say 1 byte, before the dev_links_info
> struct, all its fields would be non-naturally aligned in struct device
> as well.
> 
> Sure that can be avoided by inspection (and refusal to add new holes),
> but again, not obvious when the link structure is defined elsewhere.
> 
> > Surely compilers are not that foolish :)
> > 
> > And accessing this field should not be an issue of "slow", hopefully the
> > memory savings would offset any compiler mess.
> 
> There are other subtleties like atomicity that may come into play.
> 
> And even if any penalties are deemed acceptable in this case, you're
> also setting a precedent for others. Note that we do not seem to use
> __packed this way currently.

Yeah, that is a good point; normally we use packed to keep padding in
the middle of a structure from happening.

I just don't like that 4 bytes sitting there doing nothing :)

> > > > So this allows us to save 4 bytes in struct device by putting something in that
> > > > trailing "hole" that can be aligned with it better (i.e. an integer or
> > > > something else).
> > > 
> > > I understand that, but I don't think it is worth starting to use packed
> > > like this for internal structures as it may have subtle and unintended
> > > consequences.
> > 
> > I'm not understanding what the consequences are here, sorry.  Does the
> > compiler output change given that the structure is still aligned
> > properly in the "parent" structure?  I can't see any output changed
> > here, but maybe I am not looking properly?
> 
> It's all arch dependent, and you won't see any difference on x86-64.
> 
> The following example produces additional instructions even on 32-bit
> arm here:
> 
> struct a1 {
> 	void *p;
> 	void *q;
> 	int i;
> } __attribute__((__packed__));
> 
> struct a2 {
> 	void *p;
> 	void *q;
> 	int i;
> };
> 
> int f(struct a1 *a)
> {
> 	return a->i;
> }
> 
> int g(struct a2 *a)
> {
> 	return a->i;
> }

Ok, fair enough, I'll leave this alone.

But, in thinking about this, there is no real reason that I can see that
this structure even is in struct device.  It should be able to be in the
private "internal" structure.

The patch below moves it out of struct device entirely.  Overall there
is no memory savings, but it could give us the chance to only create
this structure if we really need it later on, as very few things use
links at this point in time.

Rafael, there is one logic change below, the link structure is not
initialized until device_add() happens, instead of device_initialize().
Will that affect anything that you can think of?  Does anyone do
anything with links before device_add() is called?

I only test-built this patch, I didn't boot anything with it to see how
bad it explodes :)

thanks,

greg k-h


diff --git a/drivers/base/base.h b/drivers/base/base.h
index 7a419a7a6235..5444941dd42c 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -53,6 +53,18 @@ struct driver_private {
 };
 #define to_driver(obj) container_of(obj, struct driver_private, kobj)
 
+/**
+ * struct dev_links_info - Device data related to device links.
+ * @suppliers: List of links to supplier devices.
+ * @consumers: List of links to consumer devices.
+ * @status: Driver status information.
+ */
+struct dev_links_info {
+	struct list_head suppliers;
+	struct list_head consumers;
+	enum dl_dev_state status;
+};
+
 /**
  * struct device_private - structure to hold the private to the driver core portions of the device structure.
  *
@@ -76,6 +88,7 @@ struct device_private {
 	struct klist_node knode_bus;
 	struct list_head deferred_probe;
 	struct device *device;
+	struct dev_links_info links;
 };
 #define to_device_private_parent(obj)	\
 	container_of(obj, struct device_private, knode_parent)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 0073b09bb99f..5210428f621c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -113,7 +113,7 @@ static int device_is_dependent(struct device *dev, void *target)
 	if (ret)
 		return ret;
 
-	list_for_each_entry(link, &dev->links.consumers, s_node) {
+	list_for_each_entry(link, &dev->p->links.consumers, s_node) {
 		if (link->consumer == target)
 			return 1;
 
@@ -139,7 +139,7 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
 		device_pm_move_last(dev);
 
 	device_for_each_child(dev, NULL, device_reorder_to_tail);
-	list_for_each_entry(link, &dev->links.consumers, s_node)
+	list_for_each_entry(link, &dev->p->links.consumers, s_node)
 		device_reorder_to_tail(link->consumer, NULL);
 
 	return 0;
@@ -217,7 +217,7 @@ struct device_link *device_link_add(struct device *consumer,
 		goto out;
 	}
 
-	list_for_each_entry(link, &supplier->links.consumers, s_node)
+	list_for_each_entry(link, &supplier->p->links.consumers, s_node)
 		if (link->consumer == consumer) {
 			kref_get(&link->kref);
 			goto out;
@@ -243,7 +243,7 @@ struct device_link *device_link_add(struct device *consumer,
 		 * time, balance the decrementation of the supplier's runtime PM
 		 * usage counter after consumer probe in driver_probe_device().
 		 */
-		if (consumer->links.status == DL_DEV_PROBING)
+		if (consumer->p->links.status == DL_DEV_PROBING)
 			pm_runtime_get_noresume(supplier);
 	}
 	get_device(supplier);
@@ -259,9 +259,9 @@ struct device_link *device_link_add(struct device *consumer,
 	if (flags & DL_FLAG_STATELESS) {
 		link->status = DL_STATE_NONE;
 	} else {
-		switch (supplier->links.status) {
+		switch (supplier->p->links.status) {
 		case DL_DEV_DRIVER_BOUND:
-			switch (consumer->links.status) {
+			switch (consumer->p->links.status) {
 			case DL_DEV_PROBING:
 				/*
 				 * Some callers expect the link creation during
@@ -299,8 +299,8 @@ struct device_link *device_link_add(struct device *consumer,
 	 */
 	device_reorder_to_tail(consumer, NULL);
 
-	list_add_tail_rcu(&link->s_node, &supplier->links.consumers);
-	list_add_tail_rcu(&link->c_node, &consumer->links.suppliers);
+	list_add_tail_rcu(&link->s_node, &supplier->p->links.consumers);
+	list_add_tail_rcu(&link->c_node, &consumer->p->links.suppliers);
 
 	dev_info(consumer, "Linked as a consumer to %s\n", dev_name(supplier));
 
@@ -392,7 +392,7 @@ void device_link_remove(void *consumer, struct device *supplier)
 	device_links_write_lock();
 	device_pm_lock();
 
-	list_for_each_entry(link, &supplier->links.consumers, s_node) {
+	list_for_each_entry(link, &supplier->p->links.consumers, s_node) {
 		if (link->consumer == consumer) {
 			kref_put(&link->kref, __device_link_del);
 			break;
@@ -408,7 +408,7 @@ static void device_links_missing_supplier(struct device *dev)
 {
 	struct device_link *link;
 
-	list_for_each_entry(link, &dev->links.suppliers, c_node)
+	list_for_each_entry(link, &dev->p->links.suppliers, c_node)
 		if (link->status == DL_STATE_CONSUMER_PROBE)
 			WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 }
@@ -436,7 +436,7 @@ int device_links_check_suppliers(struct device *dev)
 
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry(link, &dev->p->links.suppliers, c_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -447,7 +447,7 @@ int device_links_check_suppliers(struct device *dev)
 		}
 		WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
 	}
-	dev->links.status = DL_DEV_PROBING;
+	dev->p->links.status = DL_DEV_PROBING;
 
 	device_links_write_unlock();
 	return ret;
@@ -470,7 +470,7 @@ void device_links_driver_bound(struct device *dev)
 
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.consumers, s_node) {
+	list_for_each_entry(link, &dev->p->links.consumers, s_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -478,7 +478,7 @@ void device_links_driver_bound(struct device *dev)
 		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 	}
 
-	list_for_each_entry(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry(link, &dev->p->links.suppliers, c_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -486,7 +486,7 @@ void device_links_driver_bound(struct device *dev)
 		WRITE_ONCE(link->status, DL_STATE_ACTIVE);
 	}
 
-	dev->links.status = DL_DEV_DRIVER_BOUND;
+	dev->p->links.status = DL_DEV_DRIVER_BOUND;
 
 	device_links_write_unlock();
 }
@@ -507,7 +507,7 @@ static void __device_links_no_driver(struct device *dev)
 {
 	struct device_link *link, *ln;
 
-	list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) {
+	list_for_each_entry_safe_reverse(link, ln, &dev->p->links.suppliers, c_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -517,7 +517,7 @@ static void __device_links_no_driver(struct device *dev)
 			WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 	}
 
-	dev->links.status = DL_DEV_NO_DRIVER;
+	dev->p->links.status = DL_DEV_NO_DRIVER;
 }
 
 void device_links_no_driver(struct device *dev)
@@ -543,7 +543,7 @@ void device_links_driver_cleanup(struct device *dev)
 
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.consumers, s_node) {
+	list_for_each_entry(link, &dev->p->links.consumers, s_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -588,7 +588,7 @@ bool device_links_busy(struct device *dev)
 
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.consumers, s_node) {
+	list_for_each_entry(link, &dev->p->links.consumers, s_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -600,7 +600,7 @@ bool device_links_busy(struct device *dev)
 		WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND);
 	}
 
-	dev->links.status = DL_DEV_UNBINDING;
+	dev->p->links.status = DL_DEV_UNBINDING;
 
 	device_links_write_unlock();
 	return ret;
@@ -628,7 +628,7 @@ void device_links_unbind_consumers(struct device *dev)
  start:
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.consumers, s_node) {
+	list_for_each_entry(link, &dev->p->links.consumers, s_node) {
 		enum device_link_state status;
 
 		if (link->flags & DL_FLAG_STATELESS)
@@ -673,12 +673,12 @@ static void device_links_purge(struct device *dev)
 	 */
 	device_links_write_lock();
 
-	list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) {
+	list_for_each_entry_safe_reverse(link, ln, &dev->p->links.suppliers, c_node) {
 		WARN_ON(link->status == DL_STATE_ACTIVE);
 		__device_link_del(&link->kref);
 	}
 
-	list_for_each_entry_safe_reverse(link, ln, &dev->links.consumers, s_node) {
+	list_for_each_entry_safe_reverse(link, ln, &dev->p->links.consumers, s_node) {
 		WARN_ON(link->status != DL_STATE_DORMANT &&
 			link->status != DL_STATE_NONE);
 		__device_link_del(&link->kref);
@@ -1526,9 +1526,6 @@ void device_initialize(struct device *dev)
 #ifdef CONFIG_GENERIC_MSI_IRQ
 	INIT_LIST_HEAD(&dev->msi_list);
 #endif
-	INIT_LIST_HEAD(&dev->links.consumers);
-	INIT_LIST_HEAD(&dev->links.suppliers);
-	dev->links.status = DL_DEV_NO_DRIVER;
 }
 EXPORT_SYMBOL_GPL(device_initialize);
 
@@ -1830,6 +1827,9 @@ static int device_private_init(struct device *dev)
 	klist_init(&dev->p->klist_children, klist_children_get,
 		   klist_children_put);
 	INIT_LIST_HEAD(&dev->p->deferred_probe);
+	INIT_LIST_HEAD(&dev->p->links.consumers);
+	INIT_LIST_HEAD(&dev->p->links.suppliers);
+	dev->p->links.status = DL_DEV_NO_DRIVER;
 	return 0;
 }
 
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 0992e67e862b..9739bb5764f9 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -259,7 +259,7 @@ static void dpm_wait_for_suppliers(struct device *dev, bool async)
 	 * callbacks freeing the link objects for the links in the list we're
 	 * walking.
 	 */
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node)
 		if (READ_ONCE(link->status) != DL_STATE_DORMANT)
 			dpm_wait(link->supplier, async);
 
@@ -288,7 +288,7 @@ static void dpm_wait_for_consumers(struct device *dev, bool async)
 	 * continue instead of trying to continue in parallel with its
 	 * unregistration).
 	 */
-	list_for_each_entry_rcu(link, &dev->links.consumers, s_node)
+	list_for_each_entry_rcu(link, &dev->p->links.consumers, s_node)
 		if (READ_ONCE(link->status) != DL_STATE_DORMANT)
 			dpm_wait(link->consumer, async);
 
@@ -1214,7 +1214,7 @@ static void dpm_superior_set_must_resume(struct device *dev)
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node)
 		link->supplier->power.must_resume = true;
 
 	device_links_read_unlock(idx);
@@ -1688,7 +1688,7 @@ static void dpm_clear_superiors_direct_complete(struct device *dev)
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node) {
 		spin_lock_irq(&link->supplier->power.lock);
 		link->supplier->power.direct_complete = false;
 		spin_unlock_irq(&link->supplier->power.lock);
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index ccd296dbb95c..54c30ed3f384 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -256,7 +256,7 @@ static int rpm_get_suppliers(struct device *dev)
 {
 	struct device_link *link;
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node) {
 		int retval;
 
 		if (!(link->flags & DL_FLAG_PM_RUNTIME))
@@ -281,7 +281,7 @@ static void rpm_put_suppliers(struct device *dev)
 {
 	struct device_link *link;
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node)
 		if (link->rpm_active &&
 		    READ_ONCE(link->status) != DL_STATE_SUPPLIER_UNBIND) {
 			pm_runtime_put(link->supplier);
@@ -1557,7 +1557,7 @@ void pm_runtime_clean_up_links(struct device *dev)
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
+	list_for_each_entry_rcu(link, &dev->p->links.consumers, s_node) {
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
@@ -1581,7 +1581,7 @@ void pm_runtime_get_suppliers(struct device *dev)
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node)
 		if (link->flags & DL_FLAG_PM_RUNTIME)
 			pm_runtime_get_sync(link->supplier);
 
@@ -1599,7 +1599,7 @@ void pm_runtime_put_suppliers(struct device *dev)
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+	list_for_each_entry_rcu(link, &dev->p->links.suppliers, c_node)
 		if (link->flags & DL_FLAG_PM_RUNTIME)
 			pm_runtime_put(link->supplier);
 
diff --git a/include/linux/device.h b/include/linux/device.h
index 6cb4640b6160..701be4385102 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -874,18 +874,6 @@ enum dl_dev_state {
 	DL_DEV_UNBINDING,
 };
 
-/**
- * struct dev_links_info - Device data related to device links.
- * @suppliers: List of links to supplier devices.
- * @consumers: List of links to consumer devices.
- * @status: Driver status information.
- */
-struct dev_links_info {
-	struct list_head suppliers;
-	struct list_head consumers;
-	enum dl_dev_state status;
-};
-
 /**
  * struct device - The basic device structure
  * @parent:	The device's "parent" device, the device to which it is attached.
@@ -986,7 +974,6 @@ struct device {
 					   core doesn't touch it */
 	void		*driver_data;	/* Driver data, set and get with
 					   dev_set/get_drvdata */
-	struct dev_links_info	links;
 	struct dev_pm_info	power;
 	struct dev_pm_domain	*pm_domain;
 


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27 12:06           ` Greg Kroah-Hartman
@ 2019-02-27 13:32             ` Johan Hovold
  2019-02-28  8:35               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 13+ messages in thread
From: Johan Hovold @ 2019-02-27 13:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Johan Hovold, linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 01:06:45PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Feb 27, 2019 at 11:59:51AM +0100, Johan Hovold wrote:

> Yeah, that is a good point, normally we use packed to keep padding from
> the middle of the structure from happening.
> 
> I just don't like that 4 bytes sitting there doing nothing :)

You could perhaps put them directly in struct device if the
dev_links_info struct is just used as a separator there and this really
bothers you. :)
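
To make the 4 bytes concrete, here is a small userspace sketch (not kernel
code). The *_model types are stand-ins invented for illustration; only the
member order mirrors struct dev_links_info. It assumes GCC or Clang for the
packed attribute and a C11 compiler for _Alignof.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (two pointers). */
struct list_head_model {
	struct list_head_model *next, *prev;
};

/* Hypothetical stand-in for enum dl_dev_state. */
enum dl_dev_state_model { MODEL_NO_DRIVER, MODEL_PROBING };

/* Same member order as struct dev_links_info, default layout. */
struct links_plain {
	struct list_head_model suppliers;
	struct list_head_model consumers;
	enum dl_dev_state_model status;
};

/* Same members with the GCC/Clang packed attribute applied. */
struct links_packed {
	struct list_head_model suppliers;
	struct list_head_model consumers;
	enum dl_dev_state_model status;
} __attribute__((packed));

/* Bytes after the last member that exist only to satisfy alignment. */
static size_t tail_padding_plain(void)
{
	return sizeof(struct links_plain) -
	       (offsetof(struct links_plain, status) +
		sizeof(enum dl_dev_state_model));
}

static size_t tail_padding_packed(void)
{
	return sizeof(struct links_packed) -
	       (offsetof(struct links_packed, status) +
		sizeof(enum dl_dev_state_model));
}

/* Packing also lowers the alignment requirement of the whole struct. */
static size_t alignment_packed(void)
{
	return _Alignof(struct links_packed);
}
```

On a typical LP64 build the plain variant ends with 4 bytes of tail padding
and the packed one with none. The packed struct's alignment drops to 1 as
well, so the compiler can no longer assume its members are naturally
aligned, which is a side effect worth weighing against the saved bytes.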

> But, in thinking about this, there is no real reason that I can see that
> this structure even is in struct device.  It should be able to be in the
> private "internal" structure.
> 
> The patch below moves it out of struct device entirely.  Overall there
> is no memory savings, but it could give us the chance to only create
> this structure if we really need it later on, as very few things use
> links at this point in time.
> 
> Rafael, there is one logic change below, the link structure is not
> initialized until device_add() happens, instead of device_initialize().
> Will that affect anything that you can think of?  Does anyone do
> anything with links before device_add() is called?

I think device_add() may be too late.

	The earliest point in time when device links can be added is
	after :c:func:`device_add()` has been called for the supplier
	and :c:func:`device_initialize()` has been called for the
	consumer.
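
A rough userspace model of that ordering concern, assuming (as in the patch
under discussion) that the private data holding the link lists is only set
up in the add step. All model_* names are hypothetical; this is not the
driver core's code, only a sketch of why a consumer that has merely been
initialized would have no link state to attach to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the private structure that would carry the link lists. */
struct model_private {
	int nr_suppliers;	/* stands in for the links.suppliers list */
};

struct model_device {
	struct model_private *p;	/* NULL until model_device_add() */
};

/* Models device_initialize(): with the proposed change, no link state
 * is set up at this point. */
static void model_device_initialize(struct model_device *dev)
{
	dev->p = NULL;
}

/* Models device_add(): the private data, and with it the link state,
 * only exists from here on. */
static int model_device_add(struct model_device *dev)
{
	dev->p = calloc(1, sizeof(*dev->p));
	return dev->p ? 0 : -1;
}

/* Models the consumer side of adding a link. Returns false in the case
 * where code that blindly followed dev->p would hit a NULL pointer. */
static bool model_link_add(struct model_device *consumer)
{
	if (!consumer->p)
		return false;
	consumer->p->nr_suppliers++;
	return true;
}

static void model_device_release(struct model_device *dev)
{
	free(dev->p);
	dev->p = NULL;
}
```

In this model, calling model_link_add() between model_device_initialize()
and model_device_add() fails, which is the window the documentation quoted
above explicitly allows for the consumer.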

Johan


* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-27 13:32             ` Johan Hovold
@ 2019-02-28  8:35               ` Greg Kroah-Hartman
  2019-02-28 23:43                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-28  8:35 UTC (permalink / raw)
  To: Johan Hovold; +Cc: linux-kernel, Rafael J. Wysocki

On Wed, Feb 27, 2019 at 02:32:26PM +0100, Johan Hovold wrote:
> On Wed, Feb 27, 2019 at 01:06:45PM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Feb 27, 2019 at 11:59:51AM +0100, Johan Hovold wrote:
> 
> > Yeah, that is a good point, normally we use packed to keep padding from
> > the middle of the structure from happening.
> > 
> > I just don't like that 4 bytes sitting there doing nothing :)
> 
> You could perhaps put them directly in struct device if the
> dev_links_info struct is just used as a separator there and this really
> bothers you. :)

True :)

> > But, in thinking about this, there is no real reason that I can see that
> > this structure even is in struct device.  It should be able to be in the
> > private "internal" structure.
> > 
> > The patch below moves it out of struct device entirely.  Overall there
> > is no memory savings, but it could give us the chance to only create
> > this structure if we really need it later on, as very few things use
> > links at this point in time.
> > 
> > Rafael, there is one logic change below, the link structure is not
> > initialized until device_add() happens, instead of device_initialize().
> > Will that affect anything that you can think of?  Does anyone do
> > anything with links before device_add() is called?
> 
> I think device_add() may be too late.
> 
> 	The earliest point in time when device links can be added is
> 	after :c:func:`device_add()` has been called for the supplier
> 	and :c:func:`device_initialize()` has been called for the
> 	consumer.

That is true today due to the way the code is set up, but it would be
good to figure out if anyone actually does call it this early.

thanks,

greg k-h


* [PATCH v2] device.h: reorganize struct device
  2019-02-26 14:41 [PATCH 1/2] device.h: pack struct dev_links_info Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2019-02-27  9:23 ` Johan Hovold
@ 2019-02-28 13:58 ` Greg Kroah-Hartman
  3 siblings, 0 replies; 13+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-28 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Rafael J. Wysocki, Johan Hovold

struct device is big, around 760 bytes on x86_64.  It's not a critical
structure, but it is embedded everywhere, so making it smaller is always
a good thing.

With a recent patch that moved a field from struct device to the private
structure, some benchmarks showed a very odd regression, despite this
structure having nothing to do with those benchmarks.  That caused me to
look into the layout of the structure.  Using 'pahole', it showed a
number of holes and ways that the structure could be reordered in order
to align some cachelines better, as well as reduce the size of the
overall structure.
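
The kind of hole pahole reports can be reproduced with a toy struct. The
sketch below is illustrative only: the member names are borrowed loosely
from struct device, but layout_before and layout_after are invented here
and are far smaller than the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 4-byte member placed between pointers leaves a hole on targets
 * where pointers are 8 bytes and 8-byte aligned. */
struct layout_before {
	void *dma_ops;
	int numa_node;		/* hole after this on LP64 */
	void *of_node;
	uint32_t devt;		/* and another hole after this */
	void *fwnode;
};

/* Grouping the two 4-byte members lets them share one 8-byte slot. */
struct layout_after {
	void *dma_ops;
	void *of_node;
	void *fwnode;
	int numa_node;
	uint32_t devt;
};

/* Bytes actually occupied by members, identical for both layouts. */
static size_t model_payload_bytes(void)
{
	return 3 * sizeof(void *) + sizeof(int) + sizeof(uint32_t);
}

/* Padding bytes (holes plus tail padding) in each layout. */
static size_t model_wasted_before(void)
{
	return sizeof(struct layout_before) - model_payload_bytes();
}

static size_t model_wasted_after(void)
{
	return sizeof(struct layout_after) - model_payload_bytes();
}
```

On a typical LP64 target the first layout carries 8 bytes of padding and
the second none, purely from member order. On targets with 4-byte pointers
both layouts come out the same size.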

Move the 'kobj' member (a struct kobject) to the start of the structure,
to keep that access in the first cacheline, and try to organize things a
bit more compactly where possible.
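
As a rough illustration of the cacheline argument, the sketch below works
out whether a member straddles a cacheline boundary from its offset. The
64-byte line size, the struct names, and the 64-byte kobj_model stand-in
are all assumptions made for the example, not the real struct device
layout.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_CACHELINE 64	/* assumed line size; varies by CPU */

/* Opaque stand-in for struct kobject; the size is an assumption. */
struct kobj_model {
	char bytes[64];
};

/* kobj placed after two pointers, as in the old member order. */
struct dev_old_model {
	void *parent;
	void *p;
	struct kobj_model kobj;
};

/* kobj placed first, as in the reorganized member order. */
struct dev_new_model {
	struct kobj_model kobj;
	void *parent;
	void *p;
};

/* Index of the cacheline holding the first byte of a member, assuming
 * the containing struct itself starts on a cacheline boundary. */
#define MODEL_LINE_OF(type, member) \
	(offsetof(type, member) / MODEL_CACHELINE)

/* Nonzero if the byte range [off, off + size) crosses a line boundary. */
static int model_spans_lines(size_t off, size_t size)
{
	return off / MODEL_CACHELINE != (off + size - 1) / MODEL_CACHELINE;
}
```

With these assumptions, kobj in the old order starts partway into line 0
and spills into line 1, while in the new order it sits entirely in line 0.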

By doing these few moves, the result removes at least 8 bytes from
'struct device' on a 64bit system.  Given we know there are systems with
at least 30k devices in memory at once, every little byte counts, and
this change could be a savings of 240k of kernel memory for them.  On
"normal" systems the overall memory savings would be much less.

Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
v2: drop the first patch, and make this one a bit simpler to try to take
    advantage where we can.  It's not as much savings, but it's better
    than nothing.

 include/linux/device.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 6cb4640b6160..4eaa09468ab9 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -967,18 +967,14 @@ struct dev_links_info {
  * a higher-level representation of the device.
  */
 struct device {
+	struct kobject kobj;
 	struct device		*parent;
 
 	struct device_private	*p;
 
-	struct kobject kobj;
 	const char		*init_name; /* initial name of the device */
 	const struct device_type *type;
 
-	struct mutex		mutex;	/* mutex to synchronize calls to
-					 * its driver.
-					 */
-
 	struct bus_type	*bus;		/* type of bus device is on */
 	struct device_driver *driver;	/* which driver has allocated this
 					   device */
@@ -986,6 +982,10 @@ struct device {
 					   core doesn't touch it */
 	void		*driver_data;	/* Driver data, set and get with
 					   dev_set/get_drvdata */
+	struct mutex		mutex;	/* mutex to synchronize calls to
+					 * its driver.
+					 */
+
 	struct dev_links_info	links;
 	struct dev_pm_info	power;
 	struct dev_pm_domain	*pm_domain;
@@ -1000,9 +1000,6 @@ struct device {
 	struct list_head	msi_list;
 #endif
 
-#ifdef CONFIG_NUMA
-	int		numa_node;	/* NUMA node this device is close to */
-#endif
 	const struct dma_map_ops *dma_ops;
 	u64		*dma_mask;	/* dma mask (if dma'able device) */
 	u64		coherent_dma_mask;/* Like dma_mask, but for
@@ -1029,6 +1026,9 @@ struct device {
 	struct device_node	*of_node; /* associated device tree node */
 	struct fwnode_handle	*fwnode; /* firmware device node */
 
+#ifdef CONFIG_NUMA
+	int		numa_node;	/* NUMA node this device is close to */
+#endif
 	dev_t			devt;	/* dev_t, creates the sysfs "dev" */
 	u32			id;	/* device instance */
 
-- 
2.21.0



* Re: [PATCH 1/2] device.h: pack struct dev_links_info
  2019-02-28  8:35               ` Greg Kroah-Hartman
@ 2019-02-28 23:43                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2019-02-28 23:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Johan Hovold, linux-kernel

On 2/28/2019 9:35 AM, Greg Kroah-Hartman wrote:
> On Wed, Feb 27, 2019 at 02:32:26PM +0100, Johan Hovold wrote:
>> On Wed, Feb 27, 2019 at 01:06:45PM +0100, Greg Kroah-Hartman wrote:
>>> On Wed, Feb 27, 2019 at 11:59:51AM +0100, Johan Hovold wrote:
>>> Yeah, that is a good point, normally we use packed to keep padding from
>>> the middle of the structure from happening.
>>>
>>> I just don't like that 4 bytes sitting there doing nothing :)
>> You could perhaps put them directly in struct device if the
>> dev_links_info struct is just used as a separator there and this really
>> bothers you. :)
> True :)
>
>>> But, in thinking about this, there is no real reason that I can see that
>>> this structure even is in struct device.  It should be able to be in the
>>> private "internal" structure.
>>>
>>> The patch below moves it out of struct device entirely.  Overall there
>>> is no memory savings, but it could give us the chance to only create
>>> this structure if we really need it later on, as very few things use
>>> links at this point in time.
>>>
>>> Rafael, there is one logic change below, the link structure is not
>>> initialized until device_add() happens, instead of device_initialize().
>>> Will that affect anything that you can think of?  Does anyone do
>>> anything with links before device_add() is called?
>> I think device_add() may be too late.
>>
>> 	The earliest point in time when device links can be added is
>> 	after :c:func:`device_add()` has been called for the supplier
>> 	and :c:func:`device_initialize()` has been called for the
>> 	consumer.
> That is true today due to the way the code is set up, but it would be
> good to figure out if anyone actually does call it this early.

ISTR a use case where it was needed in the IOMMU subsystem, but I'm not 
sure if it has been used that way eventually.

The only way to really find out would be to audit all of the 
device_link_add() callers I'm afraid.

Cheers,

Rafael




Thread overview: 13+ messages
2019-02-26 14:41 [PATCH 1/2] device.h: pack struct dev_links_info Greg Kroah-Hartman
2019-02-26 14:41 ` [PATCH 2/2] device.h: reorganize struct device Greg Kroah-Hartman
2019-02-26 15:40 ` [PATCH 1/2] device.h: pack struct dev_links_info Rafael J. Wysocki
2019-02-27  9:23 ` Johan Hovold
2019-02-27  9:31   ` Greg Kroah-Hartman
2019-02-27  9:40     ` Johan Hovold
2019-02-27  9:54       ` Greg Kroah-Hartman
2019-02-27 10:59         ` Johan Hovold
2019-02-27 12:06           ` Greg Kroah-Hartman
2019-02-27 13:32             ` Johan Hovold
2019-02-28  8:35               ` Greg Kroah-Hartman
2019-02-28 23:43                 ` Rafael J. Wysocki
2019-02-28 13:58 ` [PATCH v2] device.h: reorganize struct device Greg Kroah-Hartman
