On 13.04.21 11:03, Yunsheng Lin wrote: > On 2021/4/13 16:33, Hillf Danton wrote: >> On Tue, 13 Apr 2021 15:57:29 Yunsheng Lin wrote: >>> On 2021/4/13 15:12, Hillf Danton wrote: >>>> On Tue, 13 Apr 2021 11:34:27 Yunsheng Lin wrote: >>>>> On 2021/4/13 11:26, Hillf Danton wrote: >>>>>> On Tue, 13 Apr 2021 10:56:42 Yunsheng Lin wrote: >>>>>>> On 2021/4/13 10:21, Hillf Danton wrote: >>>>>>>> On Mon, 12 Apr 2021 20:00:43 Yunsheng Lin wrote: >>>>>>>>> >>>>>>>>> Yes, the below patch seems to fix the data race described in >>>>>>>>> the commit log. >>>>>>>>> Then what is the difference between my patch and your patch below:) >>>>>>>> >>>>>>>> Hehe, this is one of the tough questions over a bounch of weeks. >>>>>>>> >>>>>>>> If a seqcount can detect the race between skb enqueue and dequeue then we >>>>>>>> cant see any excuse for not rolling back to the point without NOLOCK. >>>>>>> >>>>>>> I am not sure I understood what you meant above. >>>>>>> >>>>>>> As my understanding, the below patch is essentially the same as >>>>>>> your previous patch, the only difference I see is it uses qdisc->pad >>>>>>> instead of __QDISC_STATE_NEED_RESCHEDULE. >>>>>>> >>>>>>> So instead of proposing another patch, it would be better if you >>>>>>> comment on my patch, and make improvement upon that. >>>>>>> >>>>>> Happy to do that after you show how it helps revert NOLOCK. >>>>> >>>>> Actually I am not going to revert NOLOCK, but add optimization >>>>> to it if the patch fixes the packet stuck problem. >>>>> >>>> Fix is not optimization, right? >>> >>> For this patch, it is a fix. >>> In case you missed it, I do have a couple of idea to optimize the >>> lockless qdisc: >>> >>> 1. RFC patch to add lockless qdisc bypass optimization: >>> >>> https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@huawei.com/ >>> >>> 2. implement lockless enqueuing for lockless qdisc using the idea >>> from Jason and Toke. And it has a noticable proformance increase with >>> 1-4 threads running using the below prototype based on ptr_ring. >>> >>> static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr) >>> { >>> >>> int producer, next_producer; >>> >>> >>> do { >>> producer = READ_ONCE(r->producer); >>> if (unlikely(!r->size) || r->queue[producer]) >>> return -ENOSPC; >>> next_producer = producer + 1; >>> if (unlikely(next_producer >= r->size)) >>> next_producer = 0; >>> } while(cmpxchg_relaxed(&r->producer, producer, next_producer) != producer); >>> >>> /* Make sure the pointer we are storing points to a valid data. */ >>> /* Pairs with the dependency ordering in __ptr_ring_consume. */ >>> smp_wmb(); >>> >>> WRITE_ONCE(r->queue[producer], ptr); >>> return 0; >>> } >>> >>> 3. Maybe it is possible to remove the netif_tx_lock for lockless qdisc >>> too, because dev_hard_start_xmit is also in the protection of >>> qdisc_run_begin()/qdisc_run_end()(if there is only one qdisc using >>> a netdev queue, which is true for pfifo_fast, I believe). >>> >>> 4. Remove the qdisc->running seqcount operation for lockless qdisc, which >>> is mainly used to do heuristic locking on q->busylock for locked qdisc. >>> >> >> Sounds good. They can stand two months, cant they? >>>> >>>>> Is there any reason why you want to revert it? >>>>> >>>> I think you know Jiri's plan and it would be nice to wait a couple of >>>> months for it to complete. >>> >>> I am not sure I am aware of Jiri's plan. >>> Is there any link referring to the plan? >>> >> https://lore.kernel.org/lkml/eaff25bc-9b64-037e-b9bc-c06fc4a5a9fb@huawei.com/ > > I think there is some misunderstanding here. > > As my understanding, Jiri and Juergen are from the same team(using > the suse.com mail server). Yes, we are. > what Jiri said about "I am still planning to have Yunsheng Lin's > (CCing) fix [1] tested in the coming days." is that Juergen has > done the test and provide a "Tested-by" tag. Correct. And I did this after Jiri asking me to do so. > So if this patch fixes the packet stuck problem, Jiri is ok with > NOLOCK qdisc too. I think so, yes. Otherwise I don't see why he asked me to test the patch. :-) > Or do I misunderstand it again? Perhaps Jiri and Juergen can help to > clarify this? I hope I did. :-) Juergen