* [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
@ 2019-07-11 15:31 Leon Romanovsky
  2019-07-11 15:43 ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 15:31 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Yamin Friedman

From: Leon Romanovsky <leonro@mellanox.com>

Multiplying cpms by 100 can overflow the 'int' type and produce
incorrect ratio statistics. Update the code to use the built-in division
macro to fix the following UBSAN warning.

 [ 1040.120129] ================================================================================
 [ 1040.127124] UBSAN: Undefined behaviour in lib/dim/dim.c:78:23
 [ 1040.130118] signed integer overflow:
 [ 1040.131643] 134718714 * 100 cannot be represented in type 'int'
 [ 1040.134374] CPU: 0 PID: 22846 Comm: iperf3 Not tainted 5.2.0-rc6-for-upstream-dbg-2019-06-29_03-18-13-29 #1
 [ 1040.139068] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 [ 1040.144469] Call Trace:
 [ 1040.145897]  <IRQ>
 [ 1040.147366]  dump_stack+0x9a/0xeb
 [ 1040.149061]  ubsan_epilogue+0x9/0x7c
 [ 1040.150462]  handle_overflow+0x16d/0x198
 [ 1040.151911]  ? __ubsan_handle_negate_overflow+0x15c/0x15c
 [ 1040.153679]  ? sk_free+0x15/0x30
 [ 1040.155011]  ? kvm_clock_read+0x14/0x30
 [ 1040.156433]  ? kvm_sched_clock_read+0x5/0x10
 [ 1040.157952]  ? sched_clock+0x5/0x10
 [ 1040.159318]  ? sched_clock_cpu+0x18/0x260
 [ 1040.160801]  dim_calc_stats+0x4a1/0x4c0
 [ 1040.162274]  net_dim+0x147/0x920
 [ 1040.163592]  ? net_dim_stats_compare+0x330/0x330
 [ 1040.165283]  mlx5e_napi_poll+0x410/0x1030 [mlx5_core]
 [ 1040.166876]  ? lock_stats+0xd41/0x1740
 [ 1040.168266]  ? mlx5e_trigger_irq+0x550/0x550 [mlx5_core]
 [ 1040.169918]  ? __module_text_address+0x13/0x140
 [ 1040.171409]  ? lock_stats+0xd41/0x1740
 [ 1040.172757]  ? net_rx_action+0x262/0xda0
 [ 1040.174156]  net_rx_action+0x421/0xda0
 [ 1040.175519]  ? napi_complete_done+0x370/0x370
 [ 1040.176979]  ? kvm_clock_read+0x14/0x30
 [ 1040.178316]  ? kvm_sched_clock_read+0x5/0x10
 [ 1040.179690]  ? sched_clock+0x5/0x10
 [ 1040.180920]  ? sched_clock_cpu+0x18/0x260
 [ 1040.182286]  __do_softirq+0x287/0xb4e
 [ 1040.183581]  ? irqtime_account_irq+0x1d5/0x3b0
 [ 1040.184998]  irq_exit+0x17d/0x1d0
 [ 1040.186212]  do_IRQ+0x129/0x220
 [ 1040.187412]  common_interrupt+0xf/0xf
 [ 1040.188673]  </IRQ>
 [ 1040.189685] RIP: 0033:0x7f092c41a07a
 [ 1040.190884] Code: 45 31 f6 e9 8a 00 00 00 0f 1f 84 00 00 00 00 00 48
89 df ff 93 88 01 00 00 85 c0 0f 88 c7 00 00 00 48 98 48 01 85 88 02 00
00 <48> 8b 85 c8 02 00 00 48 83 85 90 02 00 00 01 48 83 78 10 00 74 0b
 [ 1040.195584] RSP: 002b:00007fffbebe7870 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd7
 [ 1040.197933] RAX: 0000000000020000 RBX: 0000000000e239b0 RCX: 000000000006b280
 [ 1040.199740] RDX: 0000000000020000 RSI: 00007f092c805000 RDI: 0000000000000007
 [ 1040.201525] RBP: 0000000000e21260 R08: 0000000000000000 R09: 00007fffbebfb0a0
 [ 1040.203237] R10: 0000000000000380 R11: 0000000000000246 R12: 00007fffbebe7950
 [ 1040.204944] R13: 0000000000000007 R14: 0000000000000001 R15: 00007fffbebe7870
 [ 1040.206686] ================================================================================

Fixes: 398c2b05bbee ("linux/dim: Add completions count to dim_sample")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 lib/dim/dim.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/dim/dim.c b/lib/dim/dim.c
index 439d641ec796..38045d6d0538 100644
--- a/lib/dim/dim.c
+++ b/lib/dim/dim.c
@@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
 					delta_us);
 	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
 	if (curr_stats->epms != 0)
-		curr_stats->cpe_ratio =
-				(curr_stats->cpms * 100) / curr_stats->epms;
+		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
+			curr_stats->cpms * 100, curr_stats->epms);
 	else
 		curr_stats->cpe_ratio = 0;

--
2.20.1
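
The arithmetic behind the UBSAN report can be checked outside the kernel.
What follows is a minimal user-space sketch, not kernel code. The helper
names cpms_times_100_fits_int() and max_safe_cpms() are hypothetical, and
a 32-bit 'int' is assumed, as the "cannot be represented in type 'int'"
message suggests.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: report whether cpms * 100 is representable in int.
 * The product is formed in long long, so the check itself cannot overflow.
 */
static bool cpms_times_100_fits_int(int cpms)
{
	long long product = (long long)cpms * 100;

	return product >= INT_MIN && product <= INT_MAX;
}

/* Largest non-negative cpms whose product with 100 still fits in int. */
static int max_safe_cpms(void)
{
	int limit = INT_MAX / 100;

	assert(cpms_times_100_fits_int(limit));
	return limit;
}
```

With the value from the trace, cpms_times_100_fits_int(134718714) reports
that the product does not fit, because 134718714 is well above
INT_MAX / 100.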



* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 15:31 [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics Leon Romanovsky
@ 2019-07-11 15:43 ` Jason Gunthorpe
  2019-07-11 15:47   ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 15:43 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 06:31:18PM +0300, Leon Romanovsky wrote:
> diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> index 439d641ec796..38045d6d0538 100644
> +++ b/lib/dim/dim.c
> @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
>  					delta_us);
>  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
>  	if (curr_stats->epms != 0)
> -		curr_stats->cpe_ratio =
> -				(curr_stats->cpms * 100) / curr_stats->epms;
> +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> +			curr_stats->cpms * 100, curr_stats->epms);

This will still potentially overflow the 'int' for cpe_ratio if epms <
100?

Jason
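
The concern about the stored result can be illustrated with a small
user-space sketch. It is not the kernel code: cpe_ratio_fits_int() is a
hypothetical helper, cpms and epms are taken to be non-negative 'int'
values, and a 32-bit 'int' is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: compute cpms * 100 / epms in 64 bits and report
 * whether the quotient can be stored in an int without truncation.
 * cpms is assumed non-negative; epms must be positive.
 */
static bool cpe_ratio_fits_int(int cpms, int epms, unsigned long long *ratio)
{
	assert(cpms >= 0 && epms > 0);

	*ratio = (unsigned long long)cpms * 100 / (unsigned long long)epms;
	return *ratio <= (unsigned long long)INT_MAX;
}
```

For cpms = 134718714 and epms = 1 the 64-bit quotient is 13471871400,
which no longer fits in 'int', while epms = 100 brings it back to
134718714.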


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 15:43 ` Jason Gunthorpe
@ 2019-07-11 15:47   ` Leon Romanovsky
  2019-07-11 16:11     ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 03:43:28PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 06:31:18PM +0300, Leon Romanovsky wrote:
> > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > index 439d641ec796..38045d6d0538 100644
> > +++ b/lib/dim/dim.c
> > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> >  					delta_us);
> >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> >  	if (curr_stats->epms != 0)
> > -		curr_stats->cpe_ratio =
> > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > +			curr_stats->cpms * 100, curr_stats->epms);
>
> This will still potentially overflow the 'int' for cpe_ratio if epms <
> 100?

I assumed that assignment to "unsigned long long" will do the trick.
https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94

Thanks

>
> Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 15:47   ` Leon Romanovsky
@ 2019-07-11 16:11     ` Jason Gunthorpe
  2019-07-11 17:19       ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 16:11 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > index 439d641ec796..38045d6d0538 100644
> > > +++ b/lib/dim/dim.c
> > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > >  					delta_us);
> > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > >  	if (curr_stats->epms != 0)
> > > -		curr_stats->cpe_ratio =
> > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > +			curr_stats->cpms * 100, curr_stats->epms);
> >
> > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > 100?
> 
> I assumed that assignment to "unsigned long long" will do the trick.
> https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94

That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
cast to int.

Jason
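
One detail worth checking here is where the widening takes place. In C,
an argument such as "cpms * 100" with 'int' operands is evaluated in
'int' first and only then converted to the wider type, so the operand has
to be widened before the multiplication for the product to be computed in
64 bits. The sketch below is a user-space illustration under stated
assumptions: div_round_down_ull() is a hypothetical stand-in that takes
its dividend as 'unsigned long long' (mirroring the "assignment to
unsigned long long" mentioned above), not the kernel macro itself, and
cpe_ratio_wide() is likewise a made-up name.

```c
#include <assert.h>

/*
 * User-space stand-in for the division helper: the dividend parameter is
 * unsigned long long. This is not the kernel macro itself.
 */
static unsigned long long div_round_down_ull(unsigned long long dividend,
					     unsigned int divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/*
 * Widen cpms before multiplying so the product is computed in 64 bits.
 * Passing the plain int expression cpms * 100 would multiply in int first
 * and only convert the (possibly overflowed) result afterwards.
 */
static unsigned long long cpe_ratio_wide(int cpms, int epms)
{
	assert(cpms >= 0 && epms > 0);
	return div_round_down_ull((unsigned long long)cpms * 100,
				  (unsigned int)epms);
}
```

The return type is deliberately left as 'unsigned long long'; whether the
value may be narrowed to 'int' afterwards is the separate question raised
in this reply.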


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 16:11     ` Jason Gunthorpe
@ 2019-07-11 17:19       ` Leon Romanovsky
  2019-07-11 17:31         ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 17:19 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > index 439d641ec796..38045d6d0538 100644
> > > > +++ b/lib/dim/dim.c
> > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > >  					delta_us);
> > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > >  	if (curr_stats->epms != 0)
> > > > -		curr_stats->cpe_ratio =
> > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > >
> > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > 100?
> >
> > I assumed that assignment to "unsigned long long" will do the trick.
> > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
>
> That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> cast to int.

It is ok, the result is "int" and it will be small, 100 in multiply
represents percentage.

Thanks

>
> Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 17:19       ` Leon Romanovsky
@ 2019-07-11 17:31         ` Jason Gunthorpe
  2019-07-12  6:03           ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 17:31 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > index 439d641ec796..38045d6d0538 100644
> > > > > +++ b/lib/dim/dim.c
> > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > >  					delta_us);
> > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > >  	if (curr_stats->epms != 0)
> > > > > -		curr_stats->cpe_ratio =
> > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > >
> > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > 100?
> > >
> > > I assumed that assignment to "unsigned long long" will do the trick.
> > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> >
> > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > cast to int.
> 
> It is ok, the result is "int" and it will be small, 100 in multiply
> represents percentage.

Percentage would be divide by 100..

Like I said it will overflow if epms < 100 ...

Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-11 17:31         ` Jason Gunthorpe
@ 2019-07-12  6:03           ` Leon Romanovsky
  2019-07-12 15:23             ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-12  6:03 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > +++ b/lib/dim/dim.c
> > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > >  					delta_us);
> > > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > >  	if (curr_stats->epms != 0)
> > > > > > -		curr_stats->cpe_ratio =
> > > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > > >
> > > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > > 100?
> > > >
> > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > >
> > > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > > cast to int.
> >
> > It is ok, the result is "int" and it will be small, 100 in multiply
> > represents percentage.
>
> Percentage would be divide by 100..
>
> Like I said it will overflow if epms < 100 ...

It is unlikely to happen because cpe_ratio is between 0 and 100 and cpms
* 100 is not large at all.

The UBSAN error is a "theoretical" overflow.

Thanks

>
> Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-12  6:03           ` Leon Romanovsky
@ 2019-07-12 15:23             ` Jason Gunthorpe
  2019-07-14 10:54               ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-12 15:23 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > +++ b/lib/dim/dim.c
> > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > >  					delta_us);
> > > > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > >  	if (curr_stats->epms != 0)
> > > > > > > -		curr_stats->cpe_ratio =
> > > > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > > > >
> > > > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > > > 100?
> > > > >
> > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > >
> > > > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > > > cast to int.
> > >
> > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > represents percentage.
> >
> > Percentage would be divide by 100..
> >
> > Like I said it will overflow if epms < 100 ...
> 
> It is unlikely to happen because cpe_ratio is between 0 and 100 and cpms
> * 100 is not large at all.
>
> The UBSAN error is a "theoretical" overflow.

? UBSAN is not theoretical, it only triggers if something actually
happens. So in this case cpms*100 was very large and overflowed. 

Maybe it shouldn't be and that is the actual bug, but if we overflowed
with cpms*100, then epms must be > 100 or we still overflow the
divide.

Jason
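
The relationship between an overflowing product and the size of epms can
be worked out numerically. The sketch below is a user-space illustration
only: min_epms_for_int_ratio() is a hypothetical helper, cpms is assumed
to be a non-negative 'int', and a 32-bit 'int' is assumed.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: smallest epms for which cpms * 100 / epms
 * (rounded down) is representable in int. cpms is assumed non-negative.
 */
static long long min_epms_for_int_ratio(int cpms)
{
	long long product = (long long)cpms * 100;
	/* First quotient value that does not fit in int. */
	long long limit = (long long)INT_MAX + 1;

	assert(cpms >= 0);
	/* floor(product / epms) <= INT_MAX  <=>  product < limit * epms */
	return product / limit + 1;
}
```

Under these assumptions the worst case, cpms = INT_MAX, requires epms of
at least 100, which matches the "epms < 100" condition discussed in this
thread. For the value in the trace, cpms = 134718714, the quotient
already fits once epms reaches 7.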


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-12 15:23             ` Jason Gunthorpe
@ 2019-07-14 10:54               ` Leon Romanovsky
  2019-07-18 17:39                 ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-14 10:54 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > >  					delta_us);
> > > > > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > >  	if (curr_stats->epms != 0)
> > > > > > > > -		curr_stats->cpe_ratio =
> > > > > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > > > > >
> > > > > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > > > > 100?
> > > > > >
> > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > >
> > > > > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > > > > cast to int.
> > > >
> > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > represents percentage.
> > >
> > > Percentage would be divide by 100..
> > >
> > > Like I said it will overflow if epms < 100 ...
> >
> > It is unlikely to happen because cpe_ratio is between 0 and 100 and cpms
> > * 100 is not large at all.
> >
> > The UBSAN error is a "theoretical" overflow.
>
> ? UBSAN is not theoretical, it only triggers if something actually
> happens. So in this case cpms*100 was very large and overflowed.
>
> Maybe it shouldn't be and that is the actual bug, but if we overflowed
> with cpms*100, then epms must be > 100 or we still overflow the
> divide.

I think that the real bug is cpms became too big.

Thanks

>
> Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-14 10:54               ` Leon Romanovsky
@ 2019-07-18 17:39                 ` Jason Gunthorpe
  2019-07-19 12:38                   ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-18 17:39 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Sun, Jul 14, 2019 at 01:54:59PM +0300, Leon Romanovsky wrote:
> On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> > On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > > >  					delta_us);
> > > > > > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > > >  	if (curr_stats->epms != 0)
> > > > > > > > > -		curr_stats->cpe_ratio =
> > > > > > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > > > > > >
> > > > > > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > > > > > 100?
> > > > > > >
> > > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > > >
> > > > > > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > > > > > cast to int.
> > > > >
> > > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > > represents percentage.
> > > >
> > > > Percentage would be divide by 100..
> > > >
> > > > Like I said it will overflow if epms < 100 ...
> > >
> > > It is unlikely to happen because cpe_ratio is between 0 and 100 and cpms
> > > * 100 is not large at all.
> > >
> > > The UBSAN error is a "theoretical" overflow.
> >
> > ? UBSAN is not theoretical, it only triggers if something actually
> > happens. So in this case cpms*100 was very large and overflowed.
> >
> > Maybe it shouldn't be and that is the actual bug, but if we overflowed
> > with cpms*100, then epms must be > 100 or we still overflow the
> > divide.
> 
> I think that the real bug is cpms became too big.

So I'll drop the patch until someone figures out what is happening

Jason


* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
  2019-07-18 17:39                 ` Jason Gunthorpe
@ 2019-07-19 12:38                   ` Leon Romanovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-19 12:38 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman

On Thu, Jul 18, 2019 at 05:39:47PM +0000, Jason Gunthorpe wrote:
> On Sun, Jul 14, 2019 at 01:54:59PM +0300, Leon Romanovsky wrote:
> > On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> > > On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > > > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > > > >  					delta_us);
> > > > > > > > > >  	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > > > >  	if (curr_stats->epms != 0)
> > > > > > > > > > -		curr_stats->cpe_ratio =
> > > > > > > > > > -				(curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > > > +		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > > > +			curr_stats->cpms * 100, curr_stats->epms);
> > > > > > > > >
> > > > > > > > > This will still potentially overflow the 'int' for cpe_ratio if epms <
> > > > > > > > > 100?
> > > > > > > >
> > > > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > > > >
> > > > > > > That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
> > > > > > > cast to int.
> > > > > >
> > > > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > > > represents percentage.
> > > > >
> > > > > Percentage would be divide by 100..
> > > > >
> > > > > Like I said it will overflow if epms < 100 ...
> > > >
> > > > It is unlikely to happen because cpe_ratio is between 0 and 100 and cpms
> > > > * 100 is not large at all.
> > > >
> > > > The UBSAN error is a "theoretical" overflow.
> > >
> > > ? UBSAN is not theoretical, it only triggers if something actually
> > > happens. So in this case cpms*100 was very large and overflowed.
> > >
> > > Maybe it shouldn't be and that is the actual bug, but if we overflowed
> > > with cpms*100, then epms must be > 100 or we still overflow the
> > > divide.
> >
> > I think that the real bug is cpms became too big.
>
> So I'll drop the patch until someone figures out what is happening

Thanks, Yamin is working to fix it.

>
> Jason

