* [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
@ 2019-07-11 15:31 Leon Romanovsky
2019-07-11 15:43 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 15:31 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, RDMA mailing list, Yamin Friedman
From: Leon Romanovsky <leonro@mellanox.com>
Multiplying cpms by 100 can overflow the int value and produce
incorrect ratio statistics. Update the code to use the built-in division
macro to fix the following UBSAN warning.
[ 1040.120129] ================================================================================
[ 1040.127124] UBSAN: Undefined behaviour in lib/dim/dim.c:78:23
[ 1040.130118] signed integer overflow:
[ 1040.131643] 134718714 * 100 cannot be represented in type 'int'
[ 1040.134374] CPU: 0 PID: 22846 Comm: iperf3 Not tainted 5.2.0-rc6-for-upstream-dbg-2019-06-29_03-18-13-29 #1
[ 1040.139068] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 1040.144469] Call Trace:
[ 1040.145897] <IRQ>
[ 1040.147366] dump_stack+0x9a/0xeb
[ 1040.149061] ubsan_epilogue+0x9/0x7c
[ 1040.150462] handle_overflow+0x16d/0x198
[ 1040.151911] ? __ubsan_handle_negate_overflow+0x15c/0x15c
[ 1040.153679] ? sk_free+0x15/0x30
[ 1040.155011] ? kvm_clock_read+0x14/0x30
[ 1040.156433] ? kvm_sched_clock_read+0x5/0x10
[ 1040.157952] ? sched_clock+0x5/0x10
[ 1040.159318] ? sched_clock_cpu+0x18/0x260
[ 1040.160801] dim_calc_stats+0x4a1/0x4c0
[ 1040.162274] net_dim+0x147/0x920
[ 1040.163592] ? net_dim_stats_compare+0x330/0x330
[ 1040.165283] mlx5e_napi_poll+0x410/0x1030 [mlx5_core]
[ 1040.166876] ? lock_stats+0xd41/0x1740
[ 1040.168266] ? mlx5e_trigger_irq+0x550/0x550 [mlx5_core]
[ 1040.169918] ? __module_text_address+0x13/0x140
[ 1040.171409] ? lock_stats+0xd41/0x1740
[ 1040.172757] ? net_rx_action+0x262/0xda0
[ 1040.174156] net_rx_action+0x421/0xda0
[ 1040.175519] ? napi_complete_done+0x370/0x370
[ 1040.176979] ? kvm_clock_read+0x14/0x30
[ 1040.178316] ? kvm_sched_clock_read+0x5/0x10
[ 1040.179690] ? sched_clock+0x5/0x10
[ 1040.180920] ? sched_clock_cpu+0x18/0x260
[ 1040.182286] __do_softirq+0x287/0xb4e
[ 1040.183581] ? irqtime_account_irq+0x1d5/0x3b0
[ 1040.184998] irq_exit+0x17d/0x1d0
[ 1040.186212] do_IRQ+0x129/0x220
[ 1040.187412] common_interrupt+0xf/0xf
[ 1040.188673] </IRQ>
[ 1040.189685] RIP: 0033:0x7f092c41a07a
[ 1040.190884] Code: 45 31 f6 e9 8a 00 00 00 0f 1f 84 00 00 00 00 00 48
89 df ff 93 88 01 00 00 85 c0 0f 88 c7 00 00 00 48 98 48 01 85 88 02 00
00 <48> 8b 85 c8 02 00 00 48 83 85 90 02 00 00 01 48 83 78 10 00 74 0b
[ 1040.195584] RSP: 002b:00007fffbebe7870 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd7
[ 1040.197933] RAX: 0000000000020000 RBX: 0000000000e239b0 RCX: 000000000006b280
[ 1040.199740] RDX: 0000000000020000 RSI: 00007f092c805000 RDI: 0000000000000007
[ 1040.201525] RBP: 0000000000e21260 R08: 0000000000000000 R09: 00007fffbebfb0a0
[ 1040.203237] R10: 0000000000000380 R11: 0000000000000246 R12: 00007fffbebe7950
[ 1040.204944] R13: 0000000000000007 R14: 0000000000000001 R15: 00007fffbebe7870
[ 1040.206686] ================================================================================
Fixes: 398c2b05bbee ("linux/dim: Add completions count to dim_sample")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
lib/dim/dim.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/dim/dim.c b/lib/dim/dim.c
index 439d641ec796..38045d6d0538 100644
--- a/lib/dim/dim.c
+++ b/lib/dim/dim.c
@@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
 					delta_us);
 	curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
 	if (curr_stats->epms != 0)
-		curr_stats->cpe_ratio =
-				(curr_stats->cpms * 100) / curr_stats->epms;
+		curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
+			curr_stats->cpms * 100, curr_stats->epms);
 	else
 		curr_stats->cpe_ratio = 0;
--
2.20.1
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 15:31 [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics Leon Romanovsky
@ 2019-07-11 15:43 ` Jason Gunthorpe
2019-07-11 15:47 ` Leon Romanovsky
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 15:43 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 06:31:18PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
>
> Multiply by 100 can potentially overflow cpms value and will produce
> incorrect wrong ratio statistics. Update code to use built-in division
> macro, so it will fix the following UBSAN warning.
>
> [ 1040.120129] ================================================================================
> [ 1040.127124] UBSAN: Undefined behaviour in lib/dim/dim.c:78:23
> [ 1040.130118] signed integer overflow:
> [ 1040.131643] 134718714 * 100 cannot be represented in type 'int'
> [ 1040.134374] CPU: 0 PID: 22846 Comm: iperf3 Not tainted 5.2.0-rc6-for-upstream-dbg-2019-06-29_03-18-13-29 #1
> [ 1040.139068] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [ 1040.144469] Call Trace:
> [ 1040.145897] <IRQ>
> [ 1040.147366] dump_stack+0x9a/0xeb
> [ 1040.149061] ubsan_epilogue+0x9/0x7c
> [ 1040.150462] handle_overflow+0x16d/0x198
> [ 1040.151911] ? __ubsan_handle_negate_overflow+0x15c/0x15c
> [ 1040.153679] ? sk_free+0x15/0x30
> [ 1040.155011] ? kvm_clock_read+0x14/0x30
> [ 1040.156433] ? kvm_sched_clock_read+0x5/0x10
> [ 1040.157952] ? sched_clock+0x5/0x10
> [ 1040.159318] ? sched_clock_cpu+0x18/0x260
> [ 1040.160801] dim_calc_stats+0x4a1/0x4c0
> [ 1040.162274] net_dim+0x147/0x920
> [ 1040.163592] ? net_dim_stats_compare+0x330/0x330
> [ 1040.165283] mlx5e_napi_poll+0x410/0x1030 [mlx5_core]
> [ 1040.166876] ? lock_stats+0xd41/0x1740
> [ 1040.168266] ? mlx5e_trigger_irq+0x550/0x550 [mlx5_core]
> [ 1040.169918] ? __module_text_address+0x13/0x140
> [ 1040.171409] ? lock_stats+0xd41/0x1740
> [ 1040.172757] ? net_rx_action+0x262/0xda0
> [ 1040.174156] net_rx_action+0x421/0xda0
> [ 1040.175519] ? napi_complete_done+0x370/0x370
> [ 1040.176979] ? kvm_clock_read+0x14/0x30
> [ 1040.178316] ? kvm_sched_clock_read+0x5/0x10
> [ 1040.179690] ? sched_clock+0x5/0x10
> [ 1040.180920] ? sched_clock_cpu+0x18/0x260
> [ 1040.182286] __do_softirq+0x287/0xb4e
> [ 1040.183581] ? irqtime_account_irq+0x1d5/0x3b0
> [ 1040.184998] irq_exit+0x17d/0x1d0
> [ 1040.186212] do_IRQ+0x129/0x220
> [ 1040.187412] common_interrupt+0xf/0xf
> [ 1040.188673] </IRQ>
> [ 1040.189685] RIP: 0033:0x7f092c41a07a
> [ 1040.190884] Code: 45 31 f6 e9 8a 00 00 00 0f 1f 84 00 00 00 00 00 48
> 89 df ff 93 88 01 00 00 85 c0 0f 88 c7 00 00 00 48 98 48 01 85 88 02 00
> 00 <48> 8b 85 c8 02 00 00 48 83 85 90 02 00 00 01 48 83 78 10 00 74 0b
> [ 1040.195584] RSP: 002b:00007fffbebe7870 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd7
> [ 1040.197933] RAX: 0000000000020000 RBX: 0000000000e239b0 RCX: 000000000006b280
> [ 1040.199740] RDX: 0000000000020000 RSI: 00007f092c805000 RDI: 0000000000000007
> [ 1040.201525] RBP: 0000000000e21260 R08: 0000000000000000 R09: 00007fffbebfb0a0
> [ 1040.203237] R10: 0000000000000380 R11: 0000000000000246 R12: 00007fffbebe7950
> [ 1040.204944] R13: 0000000000000007 R14: 0000000000000001 R15: 00007fffbebe7870
> [ 1040.206686] ================================================================================
>
> Fixes: 398c2b05bbee ("linux/dim: Add completions count to dim_sample")
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> lib/dim/dim.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> index 439d641ec796..38045d6d0538 100644
> +++ b/lib/dim/dim.c
> @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> delta_us);
> curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> if (curr_stats->epms != 0)
> - curr_stats->cpe_ratio =
> - (curr_stats->cpms * 100) / curr_stats->epms;
> + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> + curr_stats->cpms * 100, curr_stats->epms);
This will still potentially overflow the 'int' for cpe_ratio if epms <
100?
Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 15:43 ` Jason Gunthorpe
@ 2019-07-11 15:47 ` Leon Romanovsky
2019-07-11 16:11 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 15:47 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 03:43:28PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 06:31:18PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Multiply by 100 can potentially overflow cpms value and will produce
> > incorrect wrong ratio statistics. Update code to use built-in division
> > macro, so it will fix the following UBSAN warning.
> >
> > [ 1040.120129] ================================================================================
> > [ 1040.127124] UBSAN: Undefined behaviour in lib/dim/dim.c:78:23
> > [ 1040.130118] signed integer overflow:
> > [ 1040.131643] 134718714 * 100 cannot be represented in type 'int'
> > [ 1040.134374] CPU: 0 PID: 22846 Comm: iperf3 Not tainted 5.2.0-rc6-for-upstream-dbg-2019-06-29_03-18-13-29 #1
> > [ 1040.139068] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [ 1040.144469] Call Trace:
> > [ 1040.145897] <IRQ>
> > [ 1040.147366] dump_stack+0x9a/0xeb
> > [ 1040.149061] ubsan_epilogue+0x9/0x7c
> > [ 1040.150462] handle_overflow+0x16d/0x198
> > [ 1040.151911] ? __ubsan_handle_negate_overflow+0x15c/0x15c
> > [ 1040.153679] ? sk_free+0x15/0x30
> > [ 1040.155011] ? kvm_clock_read+0x14/0x30
> > [ 1040.156433] ? kvm_sched_clock_read+0x5/0x10
> > [ 1040.157952] ? sched_clock+0x5/0x10
> > [ 1040.159318] ? sched_clock_cpu+0x18/0x260
> > [ 1040.160801] dim_calc_stats+0x4a1/0x4c0
> > [ 1040.162274] net_dim+0x147/0x920
> > [ 1040.163592] ? net_dim_stats_compare+0x330/0x330
> > [ 1040.165283] mlx5e_napi_poll+0x410/0x1030 [mlx5_core]
> > [ 1040.166876] ? lock_stats+0xd41/0x1740
> > [ 1040.168266] ? mlx5e_trigger_irq+0x550/0x550 [mlx5_core]
> > [ 1040.169918] ? __module_text_address+0x13/0x140
> > [ 1040.171409] ? lock_stats+0xd41/0x1740
> > [ 1040.172757] ? net_rx_action+0x262/0xda0
> > [ 1040.174156] net_rx_action+0x421/0xda0
> > [ 1040.175519] ? napi_complete_done+0x370/0x370
> > [ 1040.176979] ? kvm_clock_read+0x14/0x30
> > [ 1040.178316] ? kvm_sched_clock_read+0x5/0x10
> > [ 1040.179690] ? sched_clock+0x5/0x10
> > [ 1040.180920] ? sched_clock_cpu+0x18/0x260
> > [ 1040.182286] __do_softirq+0x287/0xb4e
> > [ 1040.183581] ? irqtime_account_irq+0x1d5/0x3b0
> > [ 1040.184998] irq_exit+0x17d/0x1d0
> > [ 1040.186212] do_IRQ+0x129/0x220
> > [ 1040.187412] common_interrupt+0xf/0xf
> > [ 1040.188673] </IRQ>
> > [ 1040.189685] RIP: 0033:0x7f092c41a07a
> > [ 1040.190884] Code: 45 31 f6 e9 8a 00 00 00 0f 1f 84 00 00 00 00 00 48
> > 89 df ff 93 88 01 00 00 85 c0 0f 88 c7 00 00 00 48 98 48 01 85 88 02 00
> > 00 <48> 8b 85 c8 02 00 00 48 83 85 90 02 00 00 01 48 83 78 10 00 74 0b
> > [ 1040.195584] RSP: 002b:00007fffbebe7870 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd7
> > [ 1040.197933] RAX: 0000000000020000 RBX: 0000000000e239b0 RCX: 000000000006b280
> > [ 1040.199740] RDX: 0000000000020000 RSI: 00007f092c805000 RDI: 0000000000000007
> > [ 1040.201525] RBP: 0000000000e21260 R08: 0000000000000000 R09: 00007fffbebfb0a0
> > [ 1040.203237] R10: 0000000000000380 R11: 0000000000000246 R12: 00007fffbebe7950
> > [ 1040.204944] R13: 0000000000000007 R14: 0000000000000001 R15: 00007fffbebe7870
> > [ 1040.206686] ================================================================================
> >
> > Fixes: 398c2b05bbee ("linux/dim: Add completions count to dim_sample")
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > lib/dim/dim.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > index 439d641ec796..38045d6d0538 100644
> > +++ b/lib/dim/dim.c
> > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > delta_us);
> > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > if (curr_stats->epms != 0)
> > - curr_stats->cpe_ratio =
> > - (curr_stats->cpms * 100) / curr_stats->epms;
> > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > + curr_stats->cpms * 100, curr_stats->epms);
>
> This will still potentially overfow the 'int' for cpe_ratio if epms <
> 100 ?
I assumed that assignment to "unsigned long long" will do the trick.
https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
Thanks
>
> Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 15:47 ` Leon Romanovsky
@ 2019-07-11 16:11 ` Jason Gunthorpe
2019-07-11 17:19 ` Leon Romanovsky
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 16:11 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > index 439d641ec796..38045d6d0538 100644
> > > +++ b/lib/dim/dim.c
> > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > delta_us);
> > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > if (curr_stats->epms != 0)
> > > - curr_stats->cpe_ratio =
> > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > + curr_stats->cpms * 100, curr_stats->epms);
> >
> > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > 100 ?
>
> I assumed that assignment to "unsigned long long" will do the trick.
> https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
That only protects the multiply; the result of DIV_ROUND_DOWN_ULL is
cast to int.
Jason
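
The point about widening can be illustrated with a small userspace sketch. This is not the kernel code: `cpe_ratio_wide` and `fits_in_int` are hypothetical helpers, and the sketch assumes that cpms and epms are plain ints (as the "type 'int'" in the UBSAN report suggests). Note that in C an `int * int` product is evaluated in int before it is converted to any wider destination, so the sketch casts cpms explicitly before multiplying. The 64-bit quotient can still be too large to store back into an int when epms is small.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical userspace helper, not kernel code.
 * The cast widens cpms *before* the multiply, so the product is
 * computed in 64 bits rather than in int.
 */
static unsigned long long cpe_ratio_wide(int cpms, int epms)
{
	assert(cpms >= 0 && epms > 0);	/* sketch only handles positive rates */
	return (unsigned long long)cpms * 100ULL / (unsigned long long)epms;
}

/* True when the 64-bit quotient can be stored in an int without loss. */
static bool fits_in_int(unsigned long long v)
{
	return v <= (unsigned long long)INT_MAX;
}
```

With the cpms value from the report (134718714), `cpe_ratio_wide(134718714, 100)` fits in an int, while `cpe_ratio_wide(134718714, 1)` does not, so narrowing that result to int would lose information.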
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 16:11 ` Jason Gunthorpe
@ 2019-07-11 17:19 ` Leon Romanovsky
2019-07-11 17:31 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-11 17:19 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > index 439d641ec796..38045d6d0538 100644
> > > > +++ b/lib/dim/dim.c
> > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > delta_us);
> > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > if (curr_stats->epms != 0)
> > > > - curr_stats->cpe_ratio =
> > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > + curr_stats->cpms * 100, curr_stats->epms);
> > >
> > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > 100 ?
> >
> > I assumed that assignment to "unsigned long long" will do the trick.
> > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
>
> That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> casted to int.
It is OK: the result is an "int" and it will be small; the 100 in the
multiplication represents a percentage.
Thanks
>
> Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 17:19 ` Leon Romanovsky
@ 2019-07-11 17:31 ` Jason Gunthorpe
2019-07-12 6:03 ` Leon Romanovsky
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-11 17:31 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > index 439d641ec796..38045d6d0538 100644
> > > > > +++ b/lib/dim/dim.c
> > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > delta_us);
> > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > if (curr_stats->epms != 0)
> > > > > - curr_stats->cpe_ratio =
> > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > >
> > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > 100 ?
> > >
> > > I assumed that assignment to "unsigned long long" will do the trick.
> > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> >
> > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > casted to int.
>
> It is ok, the result is "int" and it will be small, 100 in multiply
> represents percentage.
Percentage would be divide by 100..
Like I said it will overflow if epms < 100 ...
Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-11 17:31 ` Jason Gunthorpe
@ 2019-07-12 6:03 ` Leon Romanovsky
2019-07-12 15:23 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-12 6:03 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > +++ b/lib/dim/dim.c
> > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > delta_us);
> > > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > if (curr_stats->epms != 0)
> > > > > > - curr_stats->cpe_ratio =
> > > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > > >
> > > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > > 100 ?
> > > >
> > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > >
> > > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > > casted to int.
> >
> > It is ok, the result is "int" and it will be small, 100 in multiply
> > represents percentage.
>
> Percentage would be divide by 100..
>
> Like I said it will overflow if epms < 100 ...
It is unlikely to happen because cpe_ratio is between 0 and 100, and cpms
* 100 is not large at all.
The UBSAN error is a "theoretical" overflow.
Thanks
>
> Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-12 6:03 ` Leon Romanovsky
@ 2019-07-12 15:23 ` Jason Gunthorpe
2019-07-14 10:54 ` Leon Romanovsky
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-12 15:23 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > +++ b/lib/dim/dim.c
> > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > delta_us);
> > > > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > if (curr_stats->epms != 0)
> > > > > > > - curr_stats->cpe_ratio =
> > > > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > > > >
> > > > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > > > 100 ?
> > > > >
> > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > >
> > > > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > > > casted to int.
> > >
> > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > represents percentage.
> >
> > Percentage would be divide by 100..
> >
> > Like I said it will overflow if epms < 100 ...
>
> It is unlikely to happen because cpe_ratio is between 0 to 100 and cpms
> * 100 is not large at all.
>
> UBSAN error is "theoretical" overflow.
? UBSAN is not theoretical, it only triggers if something actually
happens. So in this case cpms*100 was very large and overflowed.
Maybe it shouldn't be and that is the actual bug, but if we overflowed
with cpms*100, then epms must be > 100 or we still overflow the
divide.
Jason
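
The bound on epms can be worked out numerically. The sketch below is a hypothetical userspace helper (`min_safe_epms` is not a kernel function) and assumes cpms is a non-negative int. It computes the smallest epms for which the widened quotient cpms * 100 / epms still fits in an int. For an arbitrary int cpms that threshold is at most 100, which matches the bound discussed above; for the specific value in the UBSAN report it comes out lower.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: smallest epms such that
 * floor(cpms * 100 / epms) <= INT_MAX, with the product taken in 64 bits.
 */
static int min_safe_epms(int cpms)
{
	assert(cpms >= 0);
	unsigned long long product = (unsigned long long)cpms * 100ULL;

	/* floor(product / e) <= INT_MAX  <=>  e > product / (INT_MAX + 1) */
	return (int)(product / ((unsigned long long)INT_MAX + 1ULL)) + 1;
}
```

For example, `min_safe_epms(134718714)` is 7 and `min_safe_epms(INT_MAX)` is 100, so epms >= 100 is sufficient for any int cpms under these assumptions.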
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-12 15:23 ` Jason Gunthorpe
@ 2019-07-14 10:54 ` Leon Romanovsky
2019-07-18 17:39 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-14 10:54 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > > delta_us);
> > > > > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > > if (curr_stats->epms != 0)
> > > > > > > > - curr_stats->cpe_ratio =
> > > > > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > > > > >
> > > > > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > > > > 100 ?
> > > > > >
> > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > >
> > > > > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > > > > casted to int.
> > > >
> > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > represents percentage.
> > >
> > > Percentage would be divide by 100..
> > >
> > > Like I said it will overflow if epms < 100 ...
> >
> > It is unlikely to happen because cpe_ratio is between 0 to 100 and cpms
> > * 100 is not large at all.
> >
> > UBSAN error is "theoretical" overflow.
>
> ? UBSAN is not theoretical, it only triggers if something actually
> happens. So in this case cpms*100 was very large and overflowed.
>
> Maybe it shouldn't be and that is the actual bug, but if we overflowed
> with cpms*100, then epms must be > 100 or we still overflow the
> divide.
I think the real bug is that cpms became too big.
Thanks
>
> Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-14 10:54 ` Leon Romanovsky
@ 2019-07-18 17:39 ` Jason Gunthorpe
2019-07-19 12:38 ` Leon Romanovsky
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2019-07-18 17:39 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Sun, Jul 14, 2019 at 01:54:59PM +0300, Leon Romanovsky wrote:
> On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> > On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > > > delta_us);
> > > > > > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > > > if (curr_stats->epms != 0)
> > > > > > > > > - curr_stats->cpe_ratio =
> > > > > > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > > > > > >
> > > > > > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > > > > > 100 ?
> > > > > > >
> > > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > > >
> > > > > > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > > > > > casted to int.
> > > > >
> > > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > > represents percentage.
> > > >
> > > > Percentage would be divide by 100..
> > > >
> > > > Like I said it will overflow if epms < 100 ...
> > >
> > > It is unlikely to happen because cpe_ratio is between 0 to 100 and cpms
> > > * 100 is not large at all.
> > >
> > > UBSAN error is "theoretical" overflow.
> >
> > ? UBSAN is not theoretical, it only triggers if something actually
> > happens. So in this case cpms*100 was very large and overflowed.
> >
> > Maybe it shouldn't be and that is the actual bug, but if we overflowed
> > with cpms*100, then epms must be > 100 or we still overflow the
> > divide.
>
> I think that the real bug is cpms became too big.
So I'll drop the patch until someone figures out what is happening.
Jason
* Re: [PATCH rdma-next] lib/dim: Prevent overflow in calculation of ratio statistics
2019-07-18 17:39 ` Jason Gunthorpe
@ 2019-07-19 12:38 ` Leon Romanovsky
0 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2019-07-19 12:38 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Yamin Friedman
On Thu, Jul 18, 2019 at 05:39:47PM +0000, Jason Gunthorpe wrote:
> On Sun, Jul 14, 2019 at 01:54:59PM +0300, Leon Romanovsky wrote:
> > On Fri, Jul 12, 2019 at 03:23:20PM +0000, Jason Gunthorpe wrote:
> > > On Fri, Jul 12, 2019 at 09:03:09AM +0300, Leon Romanovsky wrote:
> > > > On Thu, Jul 11, 2019 at 05:31:14PM +0000, Jason Gunthorpe wrote:
> > > > > On Thu, Jul 11, 2019 at 08:19:22PM +0300, Leon Romanovsky wrote:
> > > > > > On Thu, Jul 11, 2019 at 04:11:07PM +0000, Jason Gunthorpe wrote:
> > > > > > > On Thu, Jul 11, 2019 at 06:47:34PM +0300, Leon Romanovsky wrote:
> > > > > > > > > > diff --git a/lib/dim/dim.c b/lib/dim/dim.c
> > > > > > > > > > index 439d641ec796..38045d6d0538 100644
> > > > > > > > > > +++ b/lib/dim/dim.c
> > > > > > > > > > @@ -74,8 +74,8 @@ void dim_calc_stats(struct dim_sample *start, struct dim_sample *end,
> > > > > > > > > > delta_us);
> > > > > > > > > > curr_stats->cpms = DIV_ROUND_UP(ncomps * USEC_PER_MSEC, delta_us);
> > > > > > > > > > if (curr_stats->epms != 0)
> > > > > > > > > > - curr_stats->cpe_ratio =
> > > > > > > > > > - (curr_stats->cpms * 100) / curr_stats->epms;
> > > > > > > > > > + curr_stats->cpe_ratio = DIV_ROUND_DOWN_ULL(
> > > > > > > > > > + curr_stats->cpms * 100, curr_stats->epms);
> > > > > > > > >
> > > > > > > > > This will still potentially overfow the 'int' for cpe_ratio if epms <
> > > > > > > > > 100 ?
> > > > > > > >
> > > > > > > > I assumed that assignment to "unsigned long long" will do the trick.
> > > > > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/kernel.h#L94
> > > > > > >
> > > > > > > That only protects the multiply, the result of DIV_ROUND_DOWN_ULL is
> > > > > > > casted to int.
> > > > > >
> > > > > > It is ok, the result is "int" and it will be small, 100 in multiply
> > > > > > represents percentage.
> > > > >
> > > > > Percentage would be divide by 100..
> > > > >
> > > > > Like I said it will overflow if epms < 100 ...
> > > >
> > > > It is unlikely to happen because cpe_ratio is between 0 to 100 and cpms
> > > > * 100 is not large at all.
> > > >
> > > > UBSAN error is "theoretical" overflow.
> > >
> > > ? UBSAN is not theoretical, it only triggers if something actually
> > > happens. So in this case cpms*100 was very large and overflowed.
> > >
> > > Maybe it shouldn't be and that is the actual bug, but if we overflowed
> > > with cpms*100, then epms must be > 100 or we still overflow the
> > > divide.
> >
> > I think that the real bug is cpms became too big.
>
> So I'll drop the patch until someone figures out what is happening
Thanks, Yamin is working to fix it.
>
> Jason