Subject: Re: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable
To: Hayes Wang, "netdev@vger.kernel.org"
Cc: nic_swsd, "linux-kernel@vger.kernel.org", "linux-usb@vger.kernel.org"
From: Mark Lord
Date: Fri, 18 Nov 2016 07:03:28 -0500

On 16-11-18 02:57 AM, Hayes Wang wrote:
..
> Besides, the maximum data length which the RTL8152 would send to
> the host is 16KB. That is, if the agg_buf_sz is 16KB, the host
> wouldn't split it. However, you still see problems for it.

How does the RTL8152 know that the limit is 16KB, rather than some
other number? Is this a hardwired number in the hardware, or is it
a parameter that the software sends to the chip during initialization?

I have a USB analyzer, but it is difficult to figure out how to program
an appropriate trigger point for the capture, since the problem
(with 16KB URBs) takes minutes to hours or even days to trigger.
And the output from the analyzer is in some proprietary format.

The in-kernel software analyzer could be useful, but I have never
figured out how to use it. :)

Since my earlier email, I have figured out another piece of the puzzle
with this dongle.

The first issue is that a packet sometimes begins in one URB and
completes in the next URB, without an rx_desc at the start of the
second URB. This I have already reported earlier.

But the driver, as written, sometimes accesses bytes outside of the
16KB URB buffer, for two reasons: it trusts the non-existent rx_desc
in these cases, and it reads fields from the rx_desc without first
checking whether enough space remains in the URB to hold an rx_desc
at all.

Since the driver allocates all of its rx URB buffers at once, they are
highly likely to be physically (and therefore virtually) adjacent in
memory. So mistakenly accessing beyond the end of one buffer will often
result in a read from the memory of the next URB buffer, which causes a
portion of that buffer to be loaded into the D-cache.

When that URB is subsequently filled by DMA, there then exists a
data-consistency issue: the D-cache contains stale information from
before the latest DMA cycle. So this explains the strange memory
behaviour observed earlier on.

When I add a call to invalidate_dcache_range() to the driver just before
it begins examining a new rx URB, the problems go away. So this confirms
the observations. Using non-cacheable RAM also makes the problem go away.

But neither is a fix for the real buffer overrun accesses in the driver.

Fix the "packet spans URBs" bug, and fix the driver to ALWAYS test
lengths/ranges before accessing the actual buffer, and everything
should begin working reliably.

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com
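
To illustrate the "test lengths/ranges before accessing the buffer" advice
above, here is a minimal user-space sketch in C. It is not the r8152 driver
code. The descriptor layout (demo_rx_desc), the length mask, the 8-byte
alignment and the function name demo_count_packets() are all assumptions
made up for the example, and the length field is read in host byte order
for simplicity. The only point is the order of operations: confirm that a
whole descriptor fits in the remaining bytes before reading it, then
confirm that the length it claims fits before touching the payload.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen for illustration only. */
#define DEMO_RX_LEN_MASK 0x7fffu
#define DEMO_RX_ALIGN    8u

/* Hypothetical descriptor that precedes each packet in the buffer. */
struct demo_rx_desc {
	uint32_t opts1;        /* low bits hold the packet length */
	uint32_t reserved[5];
};

/*
 * Walk one received buffer of actual_length bytes.
 * Returns the number of complete packets found, or -1 as soon as a
 * descriptor or a packet would extend past the end of the buffer.
 * No byte at or beyond buf + actual_length is ever read.
 */
static int demo_count_packets(const uint8_t *buf, size_t actual_length)
{
	size_t off = 0;
	int count = 0;

	while (off < actual_length) {
		struct demo_rx_desc desc;
		size_t remaining = actual_length - off;
		size_t pkt_len;

		/* Check 1: is there room for a whole descriptor? */
		if (remaining < sizeof(desc))
			return -1;
		memcpy(&desc, buf + off, sizeof(desc));
		remaining -= sizeof(desc);

		/* Check 2: does the claimed packet length fit? */
		pkt_len = desc.opts1 & DEMO_RX_LEN_MASK;
		if (pkt_len == 0 || pkt_len > remaining)
			return -1;

		count++;

		/* Advance past descriptor and payload, then align. */
		off += sizeof(desc) + pkt_len;
		off = (off + DEMO_RX_ALIGN - 1) & ~(size_t)(DEMO_RX_ALIGN - 1);
	}

	return count;
}
```

A caller would pass the buffer and the number of bytes actually received,
and treat -1 as "stop parsing this buffer" instead of reading onward into
whatever memory happens to follow it.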