From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds Date: Tue, 2 Jan 2018 19:17:30 +0200 Message-ID: <20180102191329-mutt-send-email-mst@kernel.org> References: <20171228035024.14699.69453.stgit@john-Precision-Tower-5810> <20180102.115219.1101472320429215260.davem@davemloft.net> <20180102190107-mutt-send-email-mst@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: john.fastabend@gmail.com, jakub.kicinski@netronome.com, xiyou.wangcong@gmail.com, jiri@resnulli.us, netdev@vger.kernel.org To: David Miller Return-path: Received: from mx1.redhat.com ([209.132.183.28]:51882 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751150AbeABRRb (ORCPT ); Tue, 2 Jan 2018 12:17:31 -0500 Content-Disposition: inline In-Reply-To: <20180102190107-mutt-send-email-mst@kernel.org> Sender: netdev-owner@vger.kernel.org List-ID: On Tue, Jan 02, 2018 at 07:01:33PM +0200, Michael S. Tsirkin wrote: > On Tue, Jan 02, 2018 at 11:52:19AM -0500, David Miller wrote: > > From: John Fastabend > > Date: Wed, 27 Dec 2017 19:50:25 -0800 > > > > > When running consumer and/or producer operations and empty checks in > > > parallel its possible to have the empty check run past the end of the > > > array. The scenario occurs when an empty check is run while > > > __ptr_ring_discard_one() is in progress. Specifically after the > > > consumer_head is incremented but before (consumer_head >= ring_size) > > > check is made and the consumer head is zeroe'd. > > > > > > To resolve this, without having to rework how consumer/producer ops > > > work on the array, simply add an extra dummy slot to the end of the > > > array. Even if we did a rework to avoid the extra slot it looks > > > like the normal case checks would suffer some so best to just > > > allocate an extra pointer. > > > > > > Reported-by: Jakub Kicinski > > > Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array") > > > Signed-off-by: John Fastabend > > > > Applied, thanks John. > > I think that patch is wrong. I'd rather it got reverted. And replaced with something like the following - stil untested, but apparently there's some urgency to fix it so posting for review ASAP. John, others, could you pls confirm it's not too bad performance-wise? I'll then split it up properly and re-post. --> net: don't misuse ptr_ring_peek ptr_ring_peek only claims to be safe if the result is never dereferenced, which isn't the case for its use in sch_generic. Add locked API variants and use the bh one here. Signed-off-by: Michael S. Tsirkin --- diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h index 6866df4..f3d96c7 100644 --- a/include/linux/ptr_ring.h +++ b/include/linux/ptr_ring.h @@ -236,6 +236,51 @@ static inline bool ptr_ring_empty_bh(struct ptr_ring *r) return ret; } +static inline void *ptr_ring_peek(struct ptr_ring *r) +{ + void *ret; + + spin_lock(&r->consumer_lock); + ret = __ptr_ring_peek(r); + spin_unlock(&r->consumer_lock); + + return ret; +} + +static inline void *ptr_ring_peek_irq(struct ptr_ring *r) +{ + void *ret; + + spin_lock_irq(&r->consumer_lock); + ret = __ptr_ring_peek(r); + spin_unlock_irq(&r->consumer_lock); + + return ret; +} + +static inline void *ptr_ring_peek_any(struct ptr_ring *r) +{ + unsigned long flags; + void *ret; + + spin_lock_irqsave(&r->consumer_lock, flags); + ret = __ptr_ring_peek(r); + spin_unlock_irqrestore(&r->consumer_lock, flags); + + return ret; +} + +static inline void *ptr_ring_peek_bh(struct ptr_ring *r) +{ + void *ret; + + spin_lock_bh(&r->consumer_lock); + ret = __ptr_ring_peek(r); + spin_unlock_bh(&r->consumer_lock); + + return ret; +} + /* Must only be called after __ptr_ring_peek returned !NULL */ static inline void __ptr_ring_discard_one(struct ptr_ring *r) { diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h index c7addf3..ee3625c 100644 --- a/include/linux/skb_array.h +++ b/include/linux/skb_array.h @@ -97,6 +97,26 @@ static inline bool skb_array_empty_any(struct skb_array *a) return ptr_ring_empty_any(&a->ring); } +static inline struct sk_buff *skb_array_peek(struct skb_array *a) +{ + return ptr_ring_peek(&a->ring); +} + +static inline struct sk_buff *skb_array_peek_bh(struct skb_array *a) +{ + return ptr_ring_peek_bh(&a->ring); +} + +static inline struct sk_buff *skb_array_peek_irq(struct skb_array *a) +{ + return ptr_ring_peek_irq(&a->ring); +} + +static inline struct sk_buff *skb_array_peek_any(struct skb_array *a) +{ + return ptr_ring_peek_any(&a->ring); +} + static inline struct sk_buff *skb_array_consume(struct skb_array *a) { return ptr_ring_consume(&a->ring); diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index cc069b2..9288e84 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -659,7 +659,7 @@ static struct sk_buff *pfifo_fast_peek(struct Qdisc *qdisc) for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) { struct skb_array *q = band2list(priv, band); - skb = __skb_array_peek(q); + skb = skb_array_peek_bh(q); } return skb;