linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
@ 2020-01-07 21:27 Arnd Bergmann
  2020-01-07 22:00 ` Laurent Pinchart
  0 siblings, 1 reply; 6+ messages in thread
From: Arnd Bergmann @ 2020-01-07 21:27 UTC (permalink / raw)
  To: Thierry Reding, David Airlie, Daniel Vetter
  Cc: Oleksandr Natalenko, Arnd Bergmann, Sam Ravnborg,
	Laurent Pinchart, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel

With gcc -O3, the compiler can inline very aggressively,
leading to rather large stack usage:

drivers/gpu/drm/panel/panel-tpo-td028ttec1.c: In function 'td028ttec1_prepare':
drivers/gpu/drm/panel/panel-tpo-td028ttec1.c:233:1: error: the frame size of 2768 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
 }

Marking jbt_reg_write_1() as noinline avoids the case where
multiple instances of this function get inlined into the same
stack frame and each one adds a copy of 'tx_buf'.

Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/gpu/drm/panel/panel-tpo-td028ttec1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c b/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
index cf29405a2dbe..17ee5e87141f 100644
--- a/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
+++ b/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
@@ -105,7 +105,7 @@ static int jbt_ret_write_0(struct td028ttec1_panel *lcd, u8 reg, int *err)
 	return ret;
 }
 
-static int jbt_reg_write_1(struct td028ttec1_panel *lcd,
+static int noinline_for_stack jbt_reg_write_1(struct td028ttec1_panel *lcd,
 			   u8 reg, u8 data, int *err)
 {
 	struct spi_device *spi = lcd->spi;
-- 
2.20.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
  2020-01-07 21:27 [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare Arnd Bergmann
@ 2020-01-07 22:00 ` Laurent Pinchart
  2020-01-07 22:09   ` Arnd Bergmann
  0 siblings, 1 reply; 6+ messages in thread
From: Laurent Pinchart @ 2020-01-07 22:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thierry Reding, David Airlie, Daniel Vetter, Oleksandr Natalenko,
	Sam Ravnborg, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel

Hi Arnd,

Thank you for the patch.

On Tue, Jan 07, 2020 at 10:27:33PM +0100, Arnd Bergmann wrote:
> With gcc -O3, the compiler can inline very aggressively,
> leading to rather large stack usage:
> 
> drivers/gpu/drm/panel/panel-tpo-td028ttec1.c: In function 'td028ttec1_prepare':
> drivers/gpu/drm/panel/panel-tpo-td028ttec1.c:233:1: error: the frame size of 2768 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>  }
> 
> Marking jbt_reg_write_1() as noinline avoids the case where
> multiple instances of this function get inlined into the same
> stack frame and each one adds a copy of 'tx_buf'.
> 
> Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Isn't this something that should be fixed at the compiler level ?

> ---
>  drivers/gpu/drm/panel/panel-tpo-td028ttec1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c b/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
> index cf29405a2dbe..17ee5e87141f 100644
> --- a/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
> +++ b/drivers/gpu/drm/panel/panel-tpo-td028ttec1.c
> @@ -105,7 +105,7 @@ static int jbt_ret_write_0(struct td028ttec1_panel *lcd, u8 reg, int *err)
>  	return ret;
>  }
>  
> -static int jbt_reg_write_1(struct td028ttec1_panel *lcd,
> +static int noinline_for_stack jbt_reg_write_1(struct td028ttec1_panel *lcd,
>  			   u8 reg, u8 data, int *err)
>  {
>  	struct spi_device *spi = lcd->spi;

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
  2020-01-07 22:00 ` Laurent Pinchart
@ 2020-01-07 22:09   ` Arnd Bergmann
  2020-01-07 22:12     ` Laurent Pinchart
  0 siblings, 1 reply; 6+ messages in thread
From: Arnd Bergmann @ 2020-01-07 22:09 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Thierry Reding, David Airlie, Daniel Vetter, Oleksandr Natalenko,
	Sam Ravnborg, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel

On Tue, Jan 7, 2020 at 11:00 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Arnd,
>
> Thank you for the patch.
>
> On Tue, Jan 07, 2020 at 10:27:33PM +0100, Arnd Bergmann wrote:
> > With gcc -O3, the compiler can inline very aggressively,
> > leading to rather large stack usage:
> >
> > drivers/gpu/drm/panel/panel-tpo-td028ttec1.c: In function 'td028ttec1_prepare':
> > drivers/gpu/drm/panel/panel-tpo-td028ttec1.c:233:1: error: the frame size of 2768 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
> >  }
> >
> > Marking jbt_reg_write_1() as noinline avoids the case where
> > multiple instances of this function get inlined into the same
> > stack frame and each one adds a copy of 'tx_buf'.
> >
> > Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> Isn't this something that should be fixed at the compiler level ?

I suspect but have not verified that structleak gcc plugin is partly at
fault here as well, it has caused similar problems elsewhere.

If you like I can try to dig deeper before that patch gets merged,
and explain more in the changelog or open a gcc bug if necessary.

      Arnd

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
  2020-01-07 22:09   ` Arnd Bergmann
@ 2020-01-07 22:12     ` Laurent Pinchart
  2020-01-07 22:26       ` Arnd Bergmann
  0 siblings, 1 reply; 6+ messages in thread
From: Laurent Pinchart @ 2020-01-07 22:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thierry Reding, David Airlie, Daniel Vetter, Oleksandr Natalenko,
	Sam Ravnborg, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel

Hi Arnd,

On Tue, Jan 07, 2020 at 11:09:13PM +0100, Arnd Bergmann wrote:
> On Tue, Jan 7, 2020 at 11:00 PM Laurent Pinchart wrote:
> > On Tue, Jan 07, 2020 at 10:27:33PM +0100, Arnd Bergmann wrote:
> > > With gcc -O3, the compiler can inline very aggressively,
> > > leading to rather large stack usage:
> > >
> > > drivers/gpu/drm/panel/panel-tpo-td028ttec1.c: In function 'td028ttec1_prepare':
> > > drivers/gpu/drm/panel/panel-tpo-td028ttec1.c:233:1: error: the frame size of 2768 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
> > >  }
> > >
> > > Marking jbt_reg_write_1() as noinline avoids the case where
> > > multiple instances of this function get inlined into the same
> > > stack frame and each one adds a copy of 'tx_buf'.
> > >
> > > Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> >
> > Isn't this something that should be fixed at the compiler level ?
> 
> I suspect but have not verified that structleak gcc plugin is partly at
> fault here as well, it has caused similar problems elsewhere.
> 
> If you like I can try to dig deeper before that patch gets merged,
> and explain more in the changelog or open a gcc bug if necessary.

I think we'll need to merge this in the meantime, but if gcc is able to
detect too large frame sizes, I think it should have the ability to take
a frame size limit into account when optimizing. I haven't checked if
this is already possible and just not honoured here (possibly due to a
bug) or if the feature is entirely missing. In any case we'll likely
have to live with this compiler issue for quite some time.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
  2020-01-07 22:12     ` Laurent Pinchart
@ 2020-01-07 22:26       ` Arnd Bergmann
  2020-01-08 10:31         ` Arnd Bergmann
  0 siblings, 1 reply; 6+ messages in thread
From: Arnd Bergmann @ 2020-01-07 22:26 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Thierry Reding, David Airlie, Daniel Vetter, Oleksandr Natalenko,
	Sam Ravnborg, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel

On Tue, Jan 7, 2020 at 11:12 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
> On Tue, Jan 07, 2020 at 11:09:13PM +0100, Arnd Bergmann wrote:
> > On Tue, Jan 7, 2020 at 11:00 PM Laurent Pinchart wrote:

> > > Isn't this something that should be fixed at the compiler level ?
> >
> > I suspect but have not verified that structleak gcc plugin is partly at
> > fault here as well, it has caused similar problems elsewhere.

I checked that now, and it's indeed the structleak plugin.

Interestingly the problem goes away without the -fconserve-stack
option, which is meant to reduce the stack usage bug has the
opposite effect here (!).

I'll do some more tests tomorrow.

> > If you like I can try to dig deeper before that patch gets merged,
> > and explain more in the changelog or open a gcc bug if necessary.
>
> I think we'll need to merge this in the meantime, but if gcc is able to
> detect too large frame sizes, I think it should have the ability to take
> a frame size limit into account when optimizing. I haven't checked if
> this is already possible and just not honoured here (possibly due to a
> bug) or if the feature is entirely missing. In any case we'll likely
> have to live with this compiler issue for quite some time.

When talking to gcc developers about other files that use excessive
amounts of stack space, it was pointed out to me that this is a
fundamentally hard problem to solve in general: what usually happens
is that one optimization step uses a heuristic for inlining, but the
register allocator much later runs out of registers and spills them to
the stack at a point when it's too late to undo the earlier optimizations.

        Arnd

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare
  2020-01-07 22:26       ` Arnd Bergmann
@ 2020-01-08 10:31         ` Arnd Bergmann
  0 siblings, 0 replies; 6+ messages in thread
From: Arnd Bergmann @ 2020-01-08 10:31 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Thierry Reding, David Airlie, Daniel Vetter, Oleksandr Natalenko,
	Sam Ravnborg, Linus Walleij, Boris Brezillon, Tomi Valkeinen,
	dri-devel, linux-kernel, Kees Cook, Ard Biesheuvel

On Tue, Jan 7, 2020 at 11:26 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Tue, Jan 7, 2020 at 11:12 PM Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> > On Tue, Jan 07, 2020 at 11:09:13PM +0100, Arnd Bergmann wrote:
> > > On Tue, Jan 7, 2020 at 11:00 PM Laurent Pinchart wrote:
>
> > > > Isn't this something that should be fixed at the compiler level ?
> > >
> > > I suspect but have not verified that structleak gcc plugin is partly at
> > > fault here as well, it has caused similar problems elsewhere.
>
> I checked that now, and it's indeed the structleak plugin.
>
> Interestingly the problem goes away without the -fconserve-stack
> option, which is meant to reduce the stack usage bug has the
> opposite effect here (!).
>
> I'll do some more tests tomorrow.

Here's a reduced test case:

struct list_head {
  struct list_head *next, *prev;
} typedef initcall_t;
struct sg_table {
  int sgl;
  int nents;
  int orig_nents;
};
struct spi_transfer {
  void *tx_buf;
  void *rx_buf;
  unsigned len;
  int tx_dma;
  int rx_dma;
  struct sg_table tx_sg;
  struct sg_table rx_sg;
  short delay_usecs;
  int delay;
  int cs_change_delay;
  int word_delay;
  int speed_hz;
  int effective_speed_hz;
  int ptp_sts_word_pre;
  int ptp_sts_word_post;
  int ptp_sts;
  _Bool timestamped_pre;
  struct list_head transfer_list;
};
void spi_sync_transfer(struct spi_transfer *, int);
void spi_write(void) {
  struct spi_transfer t;
  spi_sync_transfer(&t, 0);
}
int jbt_ret_write_0_err;
void jbt_ret_write_0(void) {
  if (jbt_ret_write_0_err)
    spi_write();
}
void jbt_reg_write_1(int *err) {
  if (*err) {
    struct spi_transfer t;
    spi_sync_transfer(&t, 1);
  }
}
void jbt_reg_write_2(int *err) {
  short tx_buf[3];
  if (err) {
    struct spi_transfer t = {tx_buf};
    spi_sync_transfer(&t, 0);
  }
}
int td028ttec1_prepare_i;
void td028ttec1_prepare() {
  int ret;
  for (; td028ttec1_prepare_i; ++td028ttec1_prepare_i) {
    jbt_ret_write_0();
  }
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_2(&ret);
  jbt_ret_write_0();
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
  jbt_reg_write_1(&ret);
}

$ arm-linux-gnueabi/bin/arm-linux-gnueabi-gcc panel-tpo-td028ttec1.c
-fplugin=scripts/gcc-plugins/structleak_plugin.so
-fplugin-arg-structleak_plugin-byref-all  -S -O3
-Wframe-larger-than=128
panel-tpo-td028ttec1.i: In function 'td028ttec1_prepare':
panel-tpo-td028ttec1.i:80:1: warning: the frame size of 192 bytes is
larger than 128 bytes [-Wframe-larger-than=]

$ arm-linux-gnueabi/bin/arm-linux-gnueabi-gcc panel-tpo-td028ttec1.c
-fplugin=scripts/gcc-plugins/structleak_plugin.so
-fplugin-arg-structleak_plugin-byref-all  -S -O3
-Wframe-larger-than=128 -fconserve-stack
panel-tpo-td028ttec1.i: In function 'td028ttec1_prepare':
panel-tpo-td028ttec1.i:80:1: warning: the frame size of 2032 bytes is
larger than 128 bytes [-Wframe-larger-than=]

I'm still not entirely sure what to make of this. The -fconserve-stack
is supposed
to prevent inlining when the frames get too large, but it appears that inlining
less has the opposite effect here, as it leaves larger structures on the stack
of the caller. structleak_plugin-byref-all causes each copy of the
'struct spi_transfer'
to be initialized (intentionally) and left on the stack (as a
side-effect of a somewhat
suboptimal implementation).

         Arnd

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-01-08 10:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-07 21:27 [PATCH] drm: panel: fix excessive stack usage in td028ttec1_prepare Arnd Bergmann
2020-01-07 22:00 ` Laurent Pinchart
2020-01-07 22:09   ` Arnd Bergmann
2020-01-07 22:12     ` Laurent Pinchart
2020-01-07 22:26       ` Arnd Bergmann
2020-01-08 10:31         ` Arnd Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).