All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: mcp251xfd: rx frame truncation
       [not found] <FR0P281MB1966C774C1ED0752CCD6B80497C79@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
@ 2023-01-18 11:39 ` Marc Kleine-Budde
  2023-01-18 11:45 ` Marc Kleine-Budde
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-18 11:39 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 1822 bytes --]

On 18.01.2023 11:26:31, Stefan Althöfer wrote:
> While still testing the double-RX erratum patch I'm experiencing another problem.
> 
> I'm using my setup with two mcp2518fd and my own 'sctest' program. Both are
> sending  messages to each other. I start sending a number of transmit ahead 
> frames and then  send out one more messages when a frame arrives at the peer. 
> So  overrun should not occur.
> 
> In contrast to the tests I did over the weekend, I increased the transmit 
> ahead to 3, sending now up to 64 bytes and the frame size is now random.
> 
> I now frequently see these kind of messages appear:
> 
> Jan 18 05:34:00 raspberrypi kernel: [405471.116322] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00001e36, tef_tail=0x00001e3a, tef_head=0x00001e3b, tx_head=0x00001e3b).
> Jan 18 09:36:27 raspberrypi kernel: [420017.452369] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer full. (seq=0x00001062, tef_tail=0x00001066, tef_head=0x00001067, tx_head=0x00001067).
> 
> Why do they occur?

As I mentioned in the other mail, the erratum also affects the TX FIFO.

In tests with high TX load I have seen the message "buffer not empty" in
the past, but never the message "buffer full". With the "buffer not
empty" errors, I haven't seen any CAN frame drops or truncated messages,
but I've been testing mainly CAN-2.0.

I'll have to look at the TEF and TX handling with what I learned from
developing the RX FIFO workaround.

I'll look into your mail in detail later.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: mcp251xfd: rx frame truncation
       [not found] <FR0P281MB1966C774C1ED0752CCD6B80497C79@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
  2023-01-18 11:39 ` mcp251xfd: rx frame truncation Marc Kleine-Budde
@ 2023-01-18 11:45 ` Marc Kleine-Budde
  2023-01-18 12:25   ` AW: " Stefan Althöfer
  2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation (1/2) Stefan Althöfer
  2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation Stefan Althöfer
  3 siblings, 1 reply; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-18 11:45 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: linux-can, Thomas.Kopp

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

On 18.01.2023 11:26:31, Stefan Althöfer wrote:
> 
> But now the real problem. After some time one of the socketcans is 
> receiving only a truncated message:
> 
> 1: idx    2 exp:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25 7C D4 82 EF 85 35 9C 33 1B 7C 5B 55 91 4F D1 0D 41 24 CA 95 A9 7E E0 60 DD 8E B5 0A 16 C8 2F 92 9C B1 82 22 E6 1E 55 01 9A B0 56 2B
> 1:           rx:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25

Thomas, can you try to reproduce and capture the SPI traffic?

> Ask if you want more information!

Stefan, can you capture the SPI traffic with a logic analyzer, too?
Preferred with sigrok/PulseView.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: mcp251xfd: rx frame truncation
  2023-01-18 11:45 ` Marc Kleine-Budde
@ 2023-01-18 12:25   ` Stefan Althöfer
  2023-01-18 18:55     ` Marc Kleine-Budde
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-18 12:25 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can, Thomas.Kopp

> Stefan, can you capture the SPI traffic with a logic analyzer, too?

Should be possible, but I need to change my hardware setup to have both of my click board on
the on pi. So have some patience ;-) I can save as vcd, which you can import in sigrok.

--
Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* WG: mcp251xfd: rx frame truncation (1/2)
       [not found] <FR0P281MB1966C774C1ED0752CCD6B80497C79@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
  2023-01-18 11:39 ` mcp251xfd: rx frame truncation Marc Kleine-Budde
  2023-01-18 11:45 ` Marc Kleine-Budde
@ 2023-01-18 12:32 ` Stefan Althöfer
  2023-01-18 14:43   ` Thomas.Kopp
  2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation Stefan Althöfer
  3 siblings, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-18 12:32 UTC (permalink / raw)
  To: linux-can

[-- Attachment #1: Type: text/plain, Size: 2511 bytes --]

Send again as the original post was to long

-----Ursprüngliche Nachricht-----
Von: Stefan Althöfer 
Gesendet: Mittwoch, 18. Januar 2023 12:27
An: linux-can@vger.kernel.org
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Betreff: mcp251xfd: rx frame truncation

While still testing the double-RX erratum patch I'm experiencing another problem.

I'm using my setup with two mcp2518fd and my own 'sctest' program. Both are sending  messages to each other. I start sending a number of transmit ahead frames and then  send out one more messages when a frame arrives at the peer. 
So  overrun should not occur.

In contrast to the tests I did over the weekend, I increased the transmit ahead to 3, sending now up to 64 bytes and the frame size is now random.

I now frequently see these kind of messages appear:

Jan 18 05:34:00 raspberrypi kernel: [405471.116322] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00001e36, tef_tail=0x00001e3a, tef_head=0x00001e3b, tx_head=0x00001e3b).
Jan 18 09:36:27 raspberrypi kernel: [420017.452369] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer full. (seq=0x00001062, tef_tail=0x00001066, tef_head=0x00001067, tx_head=0x00001067).

Why do they occur?

But now the real problem. After some time one of the socketcans is receiving only a truncated message:

1: idx    2 exp:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25 7C D4 82 EF 85 35 9C 33 1B 7C 5B 55 91 4F D1 0D 41 24 CA 95 A9 7E E0 60 DD 8E B5 0A 16 C8 2F 92 9C B1 82 22 E6 1E 55 01 9A B0 56 2B
1:           rx:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25

Same reception in the candump1 log (candump -H -td -D canX,0:0,#FFFFFFFF). 
However the message was Transmitted correctly, as the external logger has received it.

47:36.412.753      +0.172   0677fd  64  00 02 f4 7c 7b f2 53 bd  53 84 b3 4b b7 5f ae 55  2d 88 13 25 7c d4 82 ef  85 35 9c 33 1b 7c 5b 55  91 4f d1 0d 41 24 ca 95  a9 7e e0 60 dd 8e b5 0a  16 c8 2f 92 9c b1 82 22  e6 1e 55 01 9a b0 56 2b

I attached the regdump at the time when the test loop finished. At least all data bytes seem to be in the RAM.

I attached the 'sctest' source. It should build in the same environment as 'sctestself'.

I've seen this truncation occurring several times.

I think the problem did not occur when transmit ahead was set to 2, but I re-run the test again right now to verify.

Ask if you want more information!

-----
Stefan Althöfer (stefan.althoefer@janztec.com)

[-- Attachment #2: sctest.c --]
[-- Type: text/plain, Size: 36098 bytes --]

/* -*-C-*-
 * sctest.c - Raw socketcan test program (2 node rx/tx test)
 *
 *-----------------------------------------------------------------------
 *
 *                   Copyright (c) 2023 Janz Tec AG
 *                        All Rights Reserved
 *
 * Permission is hereby granted to licensees of Janz Tec AG
 * products to use or abstract this computer program for the sole
 * purpose of implementing a product based on Janz Tec AG
 * products.   No other rights to reproduce, use, or disseminate
 * this computer program, whether in part or in whole, are granted.
 *
 * Janz Tec AG makes no representation or warranties
 * with respect to the performance of this computer program, and
 * specifically disclaims any responsibility for any damages, special
 * or consequential, connected with the use of this program.
 *
 * Janz Tec AG products are not authorized for use as
 * critical components in life support devices or systems without the
 * express written approval of Janz Tec AG.
 *
 *-----------------------------------------------------------------------
 *
 * $Id: sctest.c 4020 2023-01-17 22:09:24Z as $
 *
 *-----------------------------------------------------------------------
 */


/***** includes *************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <net/if.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/times.h>
#include <unistd.h>
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>
#include <signal.h>

#include <netinet/in.h>
#include <netdb.h>

#include <linux/can.h>
#include <linux/can/raw.h>
#include <linux/can/error.h>

#include "svnglbrev.h"
#include "jlib.h"
#include "libsocketcan.h"


/***** defines **************************************************************/

#define CT_RAW_TIMEOUT       1
#define CT_RAW_BAD_RECEIVE   2
#define CT_RAW_NO_RESPONSE   3
#define CT_RAW_BUS_ERRORS    4
#define CT_RAW_READ_ERROR    5
#define CT_RAW_BAD_ORDER     6
#define CT_RAW_DUPLICATES    7
#define CT_RAW_TERMINATED   99

#define MAXFPATH 512

#define MAXPKTNUM   1000
#define DFLTPKTNUM    80
#define MAXLOOPS   10000
#define DFLTLOOPS     25
#define DFLTTXAHEAD    4
#define DFLTTIMEOUT   20 
#define DFLTID        0x2a5 

#define FLAGS_NORANDOMDATA    0x001
#define FLAGS_RANDOMIZE       0x002
#define FLAGS_REFILLDATA      0x004
#define FLAGS_ALLOWOOOD       0x010    /* Out-Of-Order-Delivery */
#define FLAGS_LESSDOTS        0x100
#define FLAGS_LEASTDOTS       0x200
#define FLAGS_STOPONERROR     0x400

#define VERB_STATS      0x001
#define VERB_IFPROP     0x002
#define VERB_TIMERR     0x100
#define VERB_CMPERR     0x200
#define VERB_LOGMSG     0x400
#define VERB_LOSSERR    0x800

#define vprintf(level, format, ...) do {        \
    if(level & opt_verbose)                     \
        fprintf(stdout, format, ##__VA_ARGS__); \
} while(0)


/***** typedefs *************************************************************/

struct cantest_raw {
    int     status;      /* 0=ok, 1=fail */
    double  dT;          /* in seconds */
    double  speed;       /* in kbit/s */
    int     loss;        /* messages lost */
    int     berrcnt;     /* bus error count */
} cantest_raw;

typedef unsigned int   usTimer_t;


/***** imports **************************************************************/


/***** locals ***************************************************************/


/***** globals **************************************************************/

int     opt_verbose;
char    *opt_candev[2];
int     opt_rt;
int     opt_pkts;
int     opt_loops;
unsigned int opt_ldelay;
int     opt_timeout;
int     opt_allowerr;
int     opt_txahead;
int     opt_flags;
int     opt_term;
int     opt_id;
int     opt_dlen;
int     opt_eff;
int     opt_brs;
int     opt_txdevnum;
char    *opt_logger;
char    *opt_send;

int     sockfd[2];
int     can_mtu[2];  /* MTU of each channel */
int     cmp_mtu;     /* least MTU of both channel */

struct canfd_frame  txdata[2][MAXPKTNUM+100]; /* message out buf */
struct canfd_frame  rxdata[2][MAXPKTNUM+100]; /* message in buf */
int                 rxidxcnt[2][MAXPKTNUM+100]; /* count each rx index */

volatile int    hup;
volatile int    sigint;


#define DBG_LOG_MAX      4196
#define DBG_LOG_TX       0
#define DBG_LOG_RX       1

int                  dbg_log_pos;

struct dbg_log_buff {
    int     ch;
    int     dir;
    int     txcnt;
    int     rxcnt;
    struct canfd_frame frame;
} dbg_log_buff[DBG_LOG_MAX];


/***** forward declarations *************************************************/

static int cleanup(int retval);


/*
 * logger:           <host>:<portno>     (implemented)
 *          [<type>@]<host>:<portno>     (not yet implemented)
 */
int readLogger(
    char       *logger,
    int        msgcnt
)
{
    struct hostent        *him;
    struct sockaddr_in    sin;
    int	                  fd;
    char                  inbuff[1024];
    char                  cmdbuff[64];
    int                   n;
    struct timeval        timeout;
    char                  host[64];
    int                   port;

    host[0] = '\0';
    port = 0;
    if( sscanf(logger, "%63[^:]:%d", host, &port) != 2 ){
	fprintf(stderr, "readLogger(): wrong logger parameter: found %s:%d\n", host, port);
	return -1;
    }
    
    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
	fprintf(stderr, "readLogger(): socket() failed\n");
	return -1;
    }
    
    bzero((char*)&sin,sizeof(sin));
    
    if ((him = gethostbyname(host)) == NULL) {
	fprintf(stderr, "readLogger(): Unknown host %s\n", host);
	return -2;
    }
    bcopy(him->h_addr, (caddr_t)&sin.sin_addr, him->h_length);
    sin.sin_family = him->h_addrtype;      

    sin.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
	fprintf(stderr, "readLogger(): cannot connect()\n");
	close(fd);
	return -3;
    }

    timeout.tv_sec = 0;
    timeout.tv_usec = 100*1000;
    if( setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO,
		   &timeout,sizeof(timeout)) < 0 )
    {
	fprintf(stderr, "readLogger(): cannot set RCVTIMEOUT\n");
	return -2;
    }

    /* Request data from peer */
    sprintf(cmdbuff,"ics:%ds\r\n", msgcnt);
    if( write(fd, cmdbuff, strlen(cmdbuff)) < 0 ){
	return -4;
    }

    /* FIXME: the data on the socket is sent with CRNL line ending.
       We dont convert to NL-only ending, and just print the CRs
       along with the NLs. */
    while( 1 ){
	n = read(fd, inbuff, 1023);
	if( n <= 0 ){
	    printf("done (%d)\n", n);
	    break;
	}
	inbuff[n] = '\0';
	printf("%s", inbuff);
    }

    close(fd);
    
    return 0;
}


void dbg_log(
    int                 ch,
    int                 dir,
    int                 txcnt,
    int                 rxcnt,
    struct canfd_frame  *frame
)
{
    dbg_log_buff[dbg_log_pos].ch = ch;
    dbg_log_buff[dbg_log_pos].dir = dir;
    dbg_log_buff[dbg_log_pos].txcnt = txcnt;
    dbg_log_buff[dbg_log_pos].rxcnt = rxcnt;
    memcpy(&(dbg_log_buff[dbg_log_pos].frame), frame, sizeof(struct canfd_frame));
    dbg_log_pos++;
    if( dbg_log_pos >= DBG_LOG_MAX ){
	dbg_log_pos = 0;
    }
}

void dbg_log_print(void)
{
    int   i;
    int   logcnt;
    
    /* print in reverse order */
    dbg_log_pos--;
    if( dbg_log_pos < 0 ){
	dbg_log_pos = DBG_LOG_MAX-1;
    }

    logcnt = opt_pkts * 5;
    if( logcnt > 4000 ){
	logcnt = 4000;
    }
    
    for(i=0; i<logcnt; i++){
	printf("%d: %s (%.3d/%.3d)  ",
	       dbg_log_buff[dbg_log_pos].ch,
	       dbg_log_buff[dbg_log_pos].dir ? "RX": "TX" ,
	       dbg_log_buff[dbg_log_pos].txcnt,
	       dbg_log_buff[dbg_log_pos].rxcnt);
	fprint_canframe_mtu(stdout , &(dbg_log_buff[dbg_log_pos].frame), "\n", 1,
			    sizeof(struct canfd_frame));
	dbg_log_pos--;
	if( dbg_log_pos < 0 ){
	    dbg_log_pos = DBG_LOG_MAX-1;
	}
    }
}


usTimer_t usTimer(
    usTimer_t      reference
)
{
    struct timeval   tv;
    struct timezone  tz;
    int              x;
    
    gettimeofday(&tv, &tz);
    x = (tv.tv_sec%100)*1000000 + tv.tv_usec;
    /* printf("sec=%u, us=%u, x=%u   --", (tv.tv_sec%10), tv.tv_usec, x); */
    
    if( reference != 0 ){
	if( x >= reference ){
	    return (x-reference);
	} else {
	    return (100*1000000-reference+x);
	}
    }
    
    return x;
}

/*error - wrapper for perror */
void error(
    char *msg
)
{
  perror(msg);
  exit(126);
}

void set_hup(
    int signo
)
{
    hup = 1;
    if( signo == SIGINT ){
	sigint = 1;
    }
}

int testloop(
    int   sockfd[2],
    int   pktnum
)
{
    int       txnum;
    int       n;
    int       loopcnt = 0;
    int       txcnt[2];
    int       rxcnt[2];
    int       goodcnt[2];
    int       idx1 = 0;
    int       devidx;
    usTimer_t t0;
    struct timeval timeout;
    int       berrcnt = 0;
    int       oocnt = 0;
    int       dupcnt = 0;
    fd_set    rdfs;
    int       ret;
    int       first_cmp_err;
    int       thisrx;
    int       exprx;
    int       read_error = 0;

    /* Unlink UDP, CAN is reliable. Hence we do not need to send
       execss packets to compensate loss */
    txnum = pktnum;

    /* set our socket waiting timeout */
    timeout.tv_sec = 0;
    timeout.tv_usec = opt_timeout*1000;
    for( devidx=0; devidx<2; devidx++ ){
	if( setsockopt(sockfd[0], SOL_SOCKET, SO_RCVTIMEO,
		       &timeout,sizeof(timeout)) < 0 )
	{
	    error("\nError in setsockopt ");
	    exit(cleanup(1));
	}

	goodcnt[devidx] = 0;
	txcnt[devidx] = 0;
	rxcnt[devidx] = 0;
    }

    /* Pre-fill (0xff) both receive buffers */
    memset(rxdata, 0xff, sizeof(rxdata));

    memset(rxidxcnt, 0, sizeof(rxidxcnt));

    /* sample time */
    t0 = usTimer(0);

    /* send ahead packets (only on devidx=0 when opt_txdevnum=1) */
    for( devidx=0; devidx<opt_txdevnum; devidx++ ){
	for( txcnt[devidx]=0; (txcnt[devidx]<txnum) && (txcnt[devidx]<opt_txahead);
	     txcnt[devidx]++ )
	{
	    n = write(sockfd[devidx], (void *)&(txdata[devidx][txcnt[devidx]]),
		      can_mtu[devidx]);
	    if( n < can_mtu[devidx] ){
		fprintf(stderr, "\nERROR in write (%d, errno=%d) ", n, errno);
		exit(cleanup(1));
	    }
	    if( opt_verbose & VERB_LOGMSG ){
		dbg_log(devidx, DBG_LOG_TX, txcnt[devidx], rxcnt[devidx], 
			&(txdata[devidx][txcnt[devidx]]));
	    }
	}
    }

    while( 1 ){
	/* Receive messages: Only on devidx=1 when opt_txdevnum=1 */
	FD_ZERO(&rdfs);
	for(devidx=(opt_txdevnum==2)?0:1; devidx<2; devidx++){
	    FD_SET(sockfd[devidx], &rdfs);
	}

	timeout.tv_sec = 0;
	timeout.tv_usec = opt_timeout*1000;
	if( (ret = select(sockfd[1]+1, &rdfs, NULL, NULL, &timeout)) <= 0 ){
	    //fprintf(stderr, "\nERROR select timed out ");
	    goto done;
	}
	
	for(devidx=(opt_txdevnum==2)?0:1; devidx<2; devidx++){
	    if( ! FD_ISSET(sockfd[devidx], &rdfs)) {
		continue;
	    }

	    n = read(sockfd[devidx], (void *)&(rxdata[devidx][rxcnt[devidx]]), can_mtu[devidx]);
	    if( opt_verbose & VERB_LOGMSG ){
		dbg_log(devidx, DBG_LOG_RX, txcnt[devidx], rxcnt[devidx], 
			&(rxdata[devidx][rxcnt[devidx]]));
	    }

#if 0
	    /* Simulate msg error: create a duplicate */
	    if( (devidx == 1) && (rxcnt[devidx] == 3) ){
		memcpy(&(rxdata[devidx][3]), &(rxdata[devidx][3-2]), sizeof(struct canfd_frame));
	    }
#endif

	    thisrx  = (rxdata[devidx][rxcnt[devidx]].data[0]<<8)&0xff00;
	    thisrx |= (rxdata[devidx][rxcnt[devidx]].data[1]<<0)&0x00ff;
	    if( thisrx < txnum ){
		rxidxcnt[devidx][thisrx]++;
	    }
	    
#if 0
	    /* Simulate data error: toggle byte in sequence count */
	    if( (devidx == 1) && (rxcnt[devidx] == 3) ){
		printf("SIM RXERROR\n");
		rxdata[devidx][rxcnt[devidx]].data[0] ^= 0xff;
	    }
	    /* Simulate data error: toggle byte in payload */
	    if( (devidx == 1) && (rxcnt[devidx] == 19) ){
		printf("SIM RXERROR\n");
		rxdata[devidx][rxcnt[devidx]].data[2] ^= 0xff;
	    }
#endif

	    if( (n == can_mtu[devidx]) || (n == CAN_MTU) ){
		if( n == CAN_MTU ){
		    /* Note: We dont care for the flags field when we
		       compare incomming data, but we dont want random
		       data here */
		    rxdata[devidx][rxcnt[devidx]].flags = 0;
		}
		if( rxdata[devidx][rxcnt[devidx]].can_id & CAN_ERR_FLAG ){
		    if( rxdata[devidx][rxcnt[devidx]].can_id & CAN_ERR_BUSOFF ){
			printf("Eoff");
		    } else if ( rxdata[devidx][rxcnt[devidx]].can_id & CAN_ERR_ACK ){
			printf("Eack");
		    } else if ( rxdata[devidx][rxcnt[devidx]].can_id & CAN_ERR_BUSERROR ){
			printf("E");
			berrcnt++;
		    } else {
			printf("E???");
		    }
		    fflush(stdout);
		} else {
		    rxcnt[devidx]++;
		    if( (rxcnt[1] == txnum) &&
			/* When opt_txdevnum==1, then the second part is always true */
			((opt_txdevnum==1) || (rxcnt[0] == txnum)) )
		    {
			/* done: sample time and stop testing */
			cantest_raw.dT  = (double)usTimer(t0);
			cantest_raw.dT /= 1e6;
			goto done;
		    }
		}
	    } else {
		if( (errno == EAGAIN) || (errno == EWOULDBLOCK) ){
		    /* timed out while waiting. This can occur when:
		       - there is no message return at all (e.g. no server)
		       - more than txahead messages are lost in sequence.
		       Both conditions are considered hard errors. */
		    cantest_raw.dT  = (double)usTimer(t0);
		    cantest_raw.dT /= 1e6;
		    vprintf(VERB_TIMERR, "\nERROR: read EAGAIN after %fs "
			    "(ret=%d txcnt=%d, rxcnt=%d) ",
			    cantest_raw.dT, n, txcnt[devidx], rxcnt[devidx]);
		    goto done;
		} else {
		    fprintf(stderr, "\nERROR: read: ret=%d, errno=%d, rxcnt=%d ",
			    n, errno, rxcnt[devidx]);
		    read_error = 1;
		    /* Do not break here, so we can see if more message
		       will be received (likely aborted by timeout) */
		}
	    }

	    /* Maintain txahead. The tx-ahead of one channel depends on the rxcnt of the
	       other channel. */
	    while( ((txcnt[devidx^1] - rxcnt[devidx]) < opt_txahead) && (txcnt[devidx^1] < txnum) ){
		n = write(sockfd[devidx^1], (void *)&(txdata[devidx^1][txcnt[devidx^1]]), can_mtu[devidx^1]);
		if (n < 0){
		    fprintf(stderr, "\nERROR in write (%d, errno=%d) ", n, errno);
		}
		if( opt_verbose & VERB_LOGMSG ){
		    dbg_log(devidx^1, DBG_LOG_TX, txcnt[devidx^1], rxcnt[devidx^1], 
			    &(txdata[devidx^1][txcnt[devidx^1]]));
		}
		txcnt[devidx^1]++;

#if 0
		/* Simulate tx error: drop one TX messages */
		if( (devidx == 0) && (txcnt[devidx] == 3) ){
		    /* drop a message for testing */
		    printf("SIM TXERROR\n");
		    txcnt[devidx^1]++;
		}
#endif
	    }
	    loopcnt++;
	}

	if( hup ){
	    break;
	}
    }
done:

    if( hup ){
	cantest_raw.status = CT_RAW_TERMINATED;
	return cantest_raw.status;
    }

#if 0
    /* Simulate msg error: swap RX messages */
    struct canfd_frame etmp;
    int mm1, mm2;
    mm1=2; mm2=34;
    memcpy(&etmp,             &(rxdata[0][mm1]), sizeof(struct canfd_frame));
    memcpy(&(rxdata[0][mm1]), &(rxdata[0][mm2]), sizeof(struct canfd_frame));
    memcpy(&(rxdata[0][mm2]), &etmp,             sizeof(struct canfd_frame));
    mm1=11; mm2=17;
    memcpy(&etmp,             &(rxdata[0][mm1]), sizeof(struct canfd_frame));
    memcpy(&(rxdata[0][mm1]), &(rxdata[0][mm2]), sizeof(struct canfd_frame));
    memcpy(&(rxdata[0][mm2]), &etmp,             sizeof(struct canfd_frame));
#endif

    first_cmp_err = 1;
    for(devidx=(opt_txdevnum==2)?0:1; devidx<2; devidx++){
	/* compare RX of devidx with TX of devidx^1 (the other channel */
	for( idx1=0; idx1<rxcnt[devidx]; idx1++ ){
	    thisrx  = (rxdata[devidx][idx1].data[0]<<8)&0xff00;
	    thisrx |= (rxdata[devidx][idx1].data[1]<<0)&0x00ff;

	    if( thisrx < txnum ){
		if( thisrx != idx1 ){
		    if( (rxcnt[1] == pktnum) && ((opt_txdevnum == 1) || (rxcnt[0] == pktnum)) ){
			/* Only indicate out-of-order error when we have the correct
			   packet number. If not, it is not out-of-order, it was just
			   message loss */
			oocnt = 1;
			printf("O");
			if( opt_verbose & VERB_CMPERR ){
			    printf("[idx=%.4x,rx=%.4x]", idx1, thisrx);
			}
		    }
		}
		
		if( cmp_canframe(&txdata[devidx^1][thisrx], &rxdata[devidx][idx1], cmp_mtu) == 0 ){
		    goodcnt[devidx]++;
		} else {
		    if( opt_verbose & VERB_CMPERR ){
			if( first_cmp_err ){
			    fprintf(stdout, "\n");
			    first_cmp_err = 0;
			}
			fprintf(stdout, "%d: idx%5d exp:", devidx, idx1);
			fprint_canframe_mtu(stdout , &txdata[devidx^1][idx1], "\n", 1, can_mtu[devidx^1]);
			fprintf(stdout, "%d:           rx:", devidx);
			fprint_canframe_mtu(stdout , &rxdata[devidx][idx1], "\n", 1, can_mtu[devidx]);
		    }
		}
	    } else {
		/* A receive error in bytes 0 and 1, the sequence counter
		   Dont' inc goodcnt, so it will be an error. */
		if( opt_verbose & VERB_CMPERR ){
		    if( first_cmp_err ){
			fprintf(stdout, "\n");
			first_cmp_err = 0;
		    }
		    fprintf(stdout, "%d: idx%5d: out of range rx'ed sequence # 0x%x\n", devidx, idx1, thisrx);
		} else {
		    printf("I");
		}
	    }
	}

	for( idx1=0; idx1<rxcnt[devidx]; idx1++ ){
	    /* Check for duplicates */
	    if( rxidxcnt[devidx][idx1] > 1 ){
		dupcnt++;
		if( opt_verbose & VERB_CMPERR ){
		    fprintf(stdout, "%d: idx%5d: duplicate (x%d)\n", devidx, idx1,
			    rxidxcnt[devidx][idx1]);
		} else {
		    printf("D");
		}
	    }
	}
    }
	
    cantest_raw.speed = ((rxcnt[0]+rxcnt[1])*8)/(cantest_raw.dT*1e3);
    cantest_raw.loss = 0;
    cantest_raw.berrcnt = berrcnt;

    if( (rxcnt[1] < pktnum) || ((opt_txdevnum == 2) && (rxcnt[0] < pktnum)) ){
	/* Did not receive the requested number of packets */
	printf("L");
	if( opt_verbose & VERB_LOSSERR ){
	    printf("[%d", pktnum);
	    for(devidx=0; devidx<2; devidx++){
		printf("/%d:%d", rxcnt[devidx], goodcnt[devidx]);
		if( rxcnt[devidx] > 0 && rxcnt[devidx] < pktnum){
		    /* Print missing packets, Assume that they are in strict sequence */
		    /* FIXME: This code is ugly und not complely correct */
		    printf("(missing:");
		    exprx = 0;
		    for(idx1=0; (idx1<pktnum) && (exprx<pktnum); ){
			thisrx  = (rxdata[devidx][idx1].data[0]<<8)&0xff00;
			thisrx |= (rxdata[devidx][idx1].data[1]<<0)&0x00ff;
			if( thisrx != exprx ){
			    printf("%.4x,", exprx);
			    exprx = thisrx;
			    continue;
			} else {
			    idx1++;
			    exprx++;
			}
		    }
		    printf(")");
		}
	    }
	    printf("]");
	}
	if( opt_txdevnum == 2 ){
	    cantest_raw.loss += (pktnum-rxcnt[0]);
	}
	cantest_raw.loss += (pktnum-rxcnt[1]);
	if( (rxcnt[1] == 0) || ((opt_txdevnum == 2) && (rxcnt[1] == 0)) ){
	    cantest_raw.status = CT_RAW_NO_RESPONSE;
	} else {
	    cantest_raw.status = CT_RAW_TIMEOUT;
	}
    } else {
	/* requested number of packets arrived */

	if( (rxcnt[0]+rxcnt[1]) == (goodcnt[0]+goodcnt[1]) ){
	    if( dupcnt ){
		cantest_raw.status = CT_RAW_DUPLICATES;
	    } else if( berrcnt ){
		if( opt_allowerr  ){
		    cantest_raw.status = 0;
		} else {
		    cantest_raw.status = CT_RAW_BUS_ERRORS;
		}
	    } else {
		if( oocnt ){
		    cantest_raw.status = CT_RAW_BAD_ORDER;
		} else {
		    cantest_raw.status = 0;
		}
	    }
	} else {
	    cantest_raw.status = CT_RAW_BAD_RECEIVE;
	}
    }

    if( opt_verbose & VERB_STATS ){
	printf("status=%d ", cantest_raw.status);
	printf("dT=%.6f ", cantest_raw.dT);
	printf("speed=%.1f ", cantest_raw.speed);
	printf("loss=%d ", cantest_raw.loss);
	printf("\n");
    }

    if( read_error ){
	cantest_raw.status = CT_RAW_READ_ERROR;
    }

    return cantest_raw.status;
}


/* Free all resources we used and return with <retVal> */
static int cleanup(
    int     retval
)
{
    if( sockfd[0] ){
	close(sockfd[0]);
    }
    if( sockfd[1] ){
	close(sockfd[1]);
    }

    if( opt_term ){
	/* Turn off, dont care for erros */
	set_can_term(opt_candev[0], 0);
	set_can_term(opt_candev[1], 0);
    }

    if( sigint ){
	return 255;
    }
    return retval;
}

/* parse for opt_flags, return -1 if error */
int parse_flags(
    const char *flags
)
{
    const char   *p;
    int          opt_flags;

    opt_flags = 0;
    if( sscanf(flags, "%i", &opt_flags) != 1 ){
	p = flags;
	while( *p != '\0' ){
	    if( strstr(p, "norandomdata") == p ){
		opt_flags |= FLAGS_NORANDOMDATA;
	    } else if( strstr(p, "randomize") == p ){
		opt_flags |= FLAGS_RANDOMIZE;
	    } else if( strstr(p, "refilldata") == p ){
		opt_flags |= FLAGS_REFILLDATA;
	    } else if( strstr(p, "allowoood") == p ){
		opt_flags |= FLAGS_ALLOWOOOD;
	    } else if( strstr(p, "lessdots") == p ){
		opt_flags |= FLAGS_LESSDOTS;
	    } else if( strstr(p, "leastdots") == p ){
		opt_flags |= FLAGS_LEASTDOTS;
	    } else if( strstr(p, "stoponerror") == p ){
		opt_flags |= FLAGS_STOPONERROR;
	    } else {
		return -1;
	    }
	    /* skip after next comma, or break if the is none */
	    if( (p=strstr(p, ",")) ){
		p++;
	    } else {
		break;
	    }
	}
    }
    return opt_flags;
}

/* parse for opt_verbose, return -1 if error */
int parse_verbose(
    const char *verbose
)
{
    const char   *p;
    int          opt_verbose;

    opt_verbose = 0;
    if( sscanf(verbose, "%i", &opt_verbose) != 1 ){
	p = verbose;
	while( *p != '\0' ){
	    if( strstr(p, "stats") == p ){
		opt_verbose |= VERB_STATS;
	    } else if( strstr(p, "ifprop") == p ){
		opt_verbose |= VERB_IFPROP;
	    } else if( strstr(p, "timerr") == p ){
		opt_verbose |= VERB_TIMERR;
	    } else if( strstr(p, "cmperr") == p ){
		opt_verbose |= VERB_CMPERR;
	    } else if( strstr(p, "logmsg") == p ){
		opt_verbose |= VERB_LOGMSG;
	    } else if( strstr(p, "losserr") == p ){
		opt_verbose |= VERB_LOSSERR;
	    } else {
		return -1;
	    }
	    /* skip after next comma, or break if the is none */
	    if( (p=strstr(p, ",")) ){
		p++;
	    } else {
		break;
	    }
	}
    }
    return opt_verbose;
}

#define xstr(a) str(a)
#define str(a) #a

/*
 * Show help
 *
 * RETURNS: N/A
 */
void usage(void)
{
    fprintf(
	stdout,
	/*        1         2         3         4         5         6         7         8 */
	/* 345678901234567890123456789012345678901234567890123456789012345678901234567890 */
	"sctest - SocketCAN two node transmission test program\n"
	"USAGE: sctest [options] [canx-netif] [cany-netif]\n"
	"VERSION: $Id: sctest.c 4020 2023-01-17 22:09:24Z as $ "
	"$GlbRev: " __SVNGLBREV__ " $\n"
	"BUILT: " __DATE__ " " __TIME__ "\n"
	"\n"
	"The program sends out some tx-ahead packets on both can interfaces.\n"
	"It waits for messages to be recieved on the opposite channel. On every receipt,\n"
	"the next message is transmitted.\n"
	"When all messages are transmitted and received, data is checked for equalness.\n"
	"\n"
	"The following indicators are printed: Eoff when bus-off is detected, Eack\n"
	"when acknowledge error is detected, E if a bus-error (e.g. CRC) is detected.\n"
	"E??? is printed for error that is not dechipered.\n"
	"\n"
	"Sctest changes some socketcan options, hence the interfaces are put into\n"
	"down state duing program start.\n"
	"\n"
	"OPTIONS:\n"
	"  -l <loops> :  Number of test loops (default " xstr(DFLTLOOPS)
	                                      ", max " xstr(MAXLOOPS) ")\n"
	"  -n <msgs>  :  Messages sent in one loop (default " xstr(DFLTPKTNUM)
	                                      ", max " xstr(MAXPKTNUM) ")\n"
	"  -i <id>    :  CAN ID to use for canx-netif transmit. cany-netif will\n"
	"                use ID+1 (default " xstr(DFLTID) ")\n"
	"  -e         :  Transmit extended format frames\n"
	"  -d <dlen>  :  Data bytes for one message (default = MTU)\n"
	"  -b         :  Enable bitrate switch\n"
	"  -t <num>   :  number of tx-ahead messages (default " xstr(DFLTTXAHEAD) ")\n"
	"  -T <msecs> :  timeout for one message (default " xstr(DFLTTIMEOUT) " ms)\n"
	"  -u         :  Unidirectional operation. Messages are only send on canx-netif\n"
	"                and received by cany-netif\n"
	"  -e         :  Send transmit extended frames\n"
	"  -F <flags> :  Modify operation by comma separated list of modifiers\n"
	"                or by bitmask value:\n"
	"                norandomdata (" xstr(FLAGS_NORANDOMDATA) "): non-random data payload.\n"
	"                randomize (" xstr(FLAGS_RANDOMIZE) "): randomize parameters.\n"
	"                   dlen = 2 .. dlen.\n"
	"                refilldata (" xstr(FLAGS_REFILLDATA) "): data payload is randomized for\n"
	"                  every program loop. Increases runtime, but may help to\n"
	"                  identify a sent messages more easily\n"
	"                allowoood (" xstr(FLAGS_ALLOWOOOD) "): allow out of order\n"
	"                  receiption. If this happens, then O is printed to stdout.\n"
	"                lessdots (" xstr(FLAGS_LESSDOTS) "): print dot only every 10th loop dot.\n"
	"                leastdots (" xstr(FLAGS_LEASTDOTS) "): print dot only every 100th loop.\n"
	"                stoponerror (" xstr(FLAGS_STOPONERROR) "): Dont cleanup and exit\n"
	"  -E         :  ignore bus errors (e.g. in EMC test)\n"
	"  -z <msecs> :  delay after each loop (defaut 0ms)\n"
	"  -r <prio>  :  enable realtime task with prio <prio>\n"
	"  -R         :  enable termination resistor on canx-netif. Duplicate option to\n"
	"                enable termination on both netifs.\n"
	/* 345678901234567890123456789012345678901234567890123456789012345678901234567890 */
	"  -X <logdev>:  In case of error try to capture logged data from external\n"
	"                logger. Currently onyl available with ICANOS icanbuff tool.\n"
	"                Use parameter hostname:portno to select which device to use.\n"
	"  -Y <msg> :    Just send a CAN frame on canx-netif and cany-netif for detection\n"
	"                purposes. The argument is a hex digit which is used as data\n"
	"                payload.\n"
	"  -v <level> :  Select verbose output by comma separated list of options\n"
	"                or by bitmask value:\n"
	"                stats (" xstr(VERB_STATS) "): per loop statistics\n"
	"                ifprop (" xstr(VERB_IFPROP) "): interface properties\n"
	"                timerr (" xstr(VERB_TIMERR) "): show timeout error details\n"
	"                cmperr (" xstr(VERB_CMPERR) "): show compare error details\n"
	"                logmsg (" xstr(VERB_LOGMSG) "): log tx/rx-ed frames and\n"
	"                     print on error\n"
	"\n"
    );
}

int main(int argc, char** argv) {
    int        rc;
    time_t     t;
    int        devidx;
    int        opt;
    int        i,n;
    int        loop;
    int        txcnt;
    struct sched_param param;
    struct sockaddr_can addr;
    struct ifreq ifr;
    struct can_ctrlmode cm;
    can_err_mask_t err_mask;
    int        enable_canfd;


    hup = 0;
    sigint = 0;
    dbg_log_pos = 0;

    opt_verbose = 0;
    opt_rt = 0;
    opt_pkts = DFLTPKTNUM;
    opt_loops = DFLTLOOPS;
    opt_ldelay = 0;
    opt_timeout = DFLTTIMEOUT;
    opt_allowerr = 0;
    opt_txahead = DFLTTXAHEAD;
    opt_flags = 0;
    opt_term = 0;
    opt_dlen = -1;
    opt_eff = 0;
    opt_brs = 0;
    opt_txdevnum = 2;
    opt_logger = 0;
    opt_send = 0;
    opt_id = DFLTID;

    /*
     * Parse command line arguments
     */
    while ((opt = getopt(argc, argv, "bd:eEF:i:hl:n:r:Rt:T:uv:X:Y:z:")) != -1) {
	switch (opt) {
	case 'b':
	    opt_brs = 1;
	    break;
	case 'd':
	    opt_dlen = strtol(optarg, 0, 0);
	    break;
	case 'e':
	    opt_eff = 1;
	    break;
	case 'E':
	    opt_allowerr = 1;
	    break;
	case 'F':
	    opt_flags = parse_flags(optarg);
	    if( opt_flags == -1 ){
		fprintf(stderr, "unrecognised -F flag\n");
		exit(1);
	    }
	    break;
	case 'i':
	    opt_id = strtol(optarg, 0, 0);
	    break;
	case 'l':
	    opt_loops = strtol(optarg, 0, 0);
	    break;
	case 'n':
	    opt_pkts = strtol(optarg, 0, 0);
	    break;
	case 'r':
	    opt_rt = strtol(optarg, 0, 0);
	    break;
	case 'R':
	    opt_term++;
	    break;
	case 't':
	    opt_txahead = strtol(optarg, 0, 0);
	    break;
	case 'T':
	    opt_timeout = strtol(optarg, 0, 0);
	    break;
	case 'u':
	    opt_txdevnum = 1;
	    break;
	case 'v':
	    opt_verbose = parse_verbose(optarg);
	    if( opt_verbose == -1 ){
		fprintf(stderr, "unrecognised -v flag\n");
		exit(1);
	    }
	    break;
	case 'X':
	    opt_logger = optarg;
	    break;
	case 'Y':
	    opt_send = optarg;
	    break;
	case 'z':
	    opt_ldelay = strtoul(optarg, 0, 0);
	    break;
	case 'h':
	default:
	    usage();
	    exit(1);
	}
    }

    if (optind != (argc - 2)) {
	fprintf(stderr, "missing network interface option\n");
	usage();
	return 1;
    }
    opt_candev[0] = argv[optind];
    opt_candev[1] = argv[optind+1];

    signal(SIGINT, set_hup);
    signal(SIGHUP, set_hup);
    signal(SIGTERM, set_hup);

    if( opt_loops >= MAXLOOPS ){
	fprintf(stderr, "error: -l must be less than %d\n", MAXLOOPS);
	exit(cleanup(1));
    }
    if( opt_pkts >= MAXPKTNUM ){
	fprintf(stderr, "error: -n must be less than %d\n", MAXPKTNUM);
	exit(cleanup(1));
    }
    
    printf("CAN test %s %s ", opt_candev[0], opt_candev[1]);

    if( opt_term ){
	if( ! set_can_term(opt_candev[0], 1) ){
	    fprintf(stderr, "can not enable termination %s\n", opt_candev[0]);
	    exit(cleanup(80));
	}
	if( opt_term > 1 ){
	    if( ! set_can_term(opt_candev[1], 1) ){
		fprintf(stderr, "can not enable termination %s\n", opt_candev[1]);
		exit(cleanup(80));
	    }
	}
    }

    for(devidx=0; devidx<2; devidx++){
	/*
	 * Configure CAN opt_candev global options. Here only bus error
	 * reporting. We don't care if it is not supported.
	 */
	if (can_do_stop(opt_candev[devidx]) < 0) {
	    fprintf(stderr, "%s: failed to stop\n", opt_candev[devidx]);
	    exit(cleanup(1));
	}

	cm.mask  = CAN_CTRLMODE_BERR_REPORTING;
	cm.flags = cm.mask;

	if (can_set_ctrlmode(opt_candev[devidx], &cm) < 0) {
	    fprintf(stderr, "%s: failed to set controlmode\n", opt_candev[devidx]);
	    exit(cleanup(1));
	}

	if (can_do_start(opt_candev[devidx]) < 0) {
	    fprintf(stderr, "%s: failed to start\n", opt_candev[devidx]);
	    exit(cleanup(1));
	}
    
	/*
	 * Open CAN network devices
	 */
	sockfd[devidx] = socket(PF_CAN, SOCK_RAW, CAN_RAW);
	if (-1 == sockfd[devidx]) {
	    perror("socket");
	    exit(cleanup(1));
	}

	// Get the index of the network opt_candev
	strncpy(ifr.ifr_name, opt_candev[devidx], IFNAMSIZ);
	if (ioctl(sockfd[devidx], SIOCGIFINDEX, &ifr) == -1) {
	    perror("ioctl");
	    exit(cleanup(1));
	}

	// Bind the socket to the network opt_candev
	addr.can_family = AF_CAN;
	addr.can_ifindex = ifr.ifr_ifindex;
	rc = bind(sockfd[devidx], (struct sockaddr *)&addr, sizeof(addr));
	if (-1 == rc) {
	    perror("bind");
	    exit(cleanup(1));
	}

	err_mask  =  CAN_ERR_MASK; /* all */
	err_mask &= ~CAN_ERR_LOSTARB;
	if( setsockopt(sockfd[devidx], SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
		       &err_mask, sizeof(err_mask)) < 0 ) 
	{
	    error("\nError in setsockopt ");
	    exit(cleanup(1));
	}

	/* Check if we have FD capable interface */
	if( ioctl(sockfd[devidx], SIOCGIFMTU, &ifr) < 0 ) {
		error("\nError in SIOCGIFMTU ");
		exit(cleanup(1));
	}
	can_mtu[devidx] = ifr.ifr_mtu;

	/* Try to switch the socket into CAN FD mode.
	   This was needed for kcan with "fd off" to be able to
	   receive frames sent for a kcan with "fd on". */
	enable_canfd = 1;
	setsockopt(sockfd[devidx], SOL_CAN_RAW, CAN_RAW_FD_FRAMES,
		   &enable_canfd, sizeof(enable_canfd));
    }

    if( (can_mtu[0] == CANFD_MTU) && (can_mtu[1] == CANFD_MTU) ){
	/* Only if both support FD */
	cmp_mtu = CANFD_MTU;
	    
	if( opt_dlen == -1 ){
	    opt_dlen = 64;
	}
	if( opt_dlen > 64 ){
	    fprintf(stderr, "\nError: dlen does not fit into MTU\n");
	    exit(cleanup(1));
	}
    } else {
	cmp_mtu = CAN_MTU;

	if( opt_dlen == -1 ){
	    opt_dlen = 8;
	}
	if( opt_dlen > 8 ){
	    fprintf(stderr, "\nError: dlen does not fit into MTU (no FD support)\n");
	    exit(cleanup(1));
	}
	if( opt_brs ){
	    fprintf(stderr, "\nError: non-FD does not support bit rate switch\n");
	    exit(cleanup(1));
	}
    }
    
    vprintf(VERB_IFPROP, " MTU=%d/%d ", can_mtu[0], can_mtu[1]);


    if( opt_send ){
	struct canfd_frame  txframe;
	int                 txdata;
	int                 txfail=0;
	if( sscanf(opt_send, "%x", &txdata) == 1 ){
	    txframe.can_id = 0;
	    txframe.flags = 0;
	    txframe.len = 8;
	    for(i=0; i<8; i++){
		txframe.data[i] = txdata;
	    }
	    printf("tx0 ");
	    if( write(sockfd[0], (void *)&txframe, can_mtu[0]) < can_mtu[0] ){
		fprintf(stderr, "opt_send: send error canx-netif\n");
		txfail = 1;
	    }
	    txframe.can_id = 1;
	    printf("tx1 ");
	    if( write(sockfd[1], (void *)&txframe, can_mtu[1]) < can_mtu[1] ){
		fprintf(stderr, "opt_send: send error cany-netif\n");
		txfail = 1;
	    }
	    /* wait untils msgs are sended */
	    usleep(10*1000);
	} else {
	    fprintf(stderr, "opt_send: wrong parameter\n");
	    txfail = 1;
	}

	if( txfail ){
	    printf("fail\n");
	} else {
	    printf("ok\n");
	}
	exit(cleanup(txfail));
    }
    
    if( opt_rt ){
	/* Declare ourself as a real time task */
	param.sched_priority = opt_rt;
	if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
	    perror("\nERROR: sched_setscheduler failed");
	    exit(cleanup(2));
	}
	
	/* Lock memory */
	if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
	    perror("\nERROR: mlockall failed");
	    exit(cleanup(2));
	}
    }

    random_dlc_init(opt_dlen);

    /* Create our transmission packets.
       bytes 0..1 : sequence number
       rest       : random data */
    time(&t);
    srand((unsigned int)t);
    for(txcnt=0; txcnt<MAXPKTNUM; txcnt++){
	for(devidx=0; devidx<2; devidx++){
	    txdata[devidx][txcnt].can_id  = (opt_id+devidx);
	    if( opt_eff ){
		txdata[devidx][txcnt].can_id |= CAN_EFF_FLAG;
	    }
	    txdata[devidx][txcnt].flags = 0;
	    if( opt_brs ){
		txdata[devidx][txcnt].flags |= CANFD_BRS;
	    }
	    txdata[devidx][txcnt].len = opt_dlen;
	    if( opt_flags & FLAGS_RANDOMIZE ){
		txdata[devidx][txcnt].len = random_dlc();
	    }
	    txdata[devidx][txcnt].data[0] = (txcnt>> 8)&0xff;
	    txdata[devidx][txcnt].data[1] = (txcnt>> 0)&0xff;
	    for(i=2; i<opt_dlen; i++){
		if( opt_flags & FLAGS_NORANDOMDATA ){
		    txdata[devidx][txcnt].data[i] = txcnt & 0xff;
		} else {
		    txdata[devidx][txcnt].data[i] = rand() & 0xff;
		}
	    }
	}
    }
    
    fflush(stdout);

    /* Run requested loops of test */
    for(loop=0; loop<opt_loops; loop++){
	rc = testloop(sockfd, opt_pkts);
	if( rc < 0 ){
	    printf("fail\n");
	    exit(cleanup(1));
	}
	if( opt_flags & FLAGS_LEASTDOTS ){
	    if( (loop % 100) == 0 ){
		printf(".");
	    }
	} else if( opt_flags & FLAGS_LESSDOTS ){
	    if( (loop % 10) == 0 ){
		printf(".");
	    }
	} else {
	    printf(".");
	}
	fflush(stdout);

	if( cantest_raw.status == CT_RAW_TERMINATED ){
	    if( ! sigint ){
		printf(" (terminated) ok\n");
		fflush(stdout);
	    }
	    exit(cleanup(0));
	}
	
	if( cantest_raw.status != 0 ){
	    if( (cantest_raw.status == CT_RAW_BAD_ORDER)
		&& (opt_flags & FLAGS_ALLOWOOOD) )
	    {
		/* allowed */
	    } else { 
		if( (cantest_raw.status == CT_RAW_BAD_RECEIVE)
		    || (cantest_raw.status == CT_RAW_BAD_ORDER)
		    || (cantest_raw.status == CT_RAW_DUPLICATES) )
		{
		    printf("c (%d) fail\n", loop);
		} else if( cantest_raw.status == CT_RAW_TIMEOUT ){
		    printf("t (%d) fail\n", loop);
		} else if( cantest_raw.status == CT_RAW_NO_RESPONSE){
		    printf("T (%d) fail\n", loop);
		} else if( cantest_raw.status == CT_RAW_BUS_ERRORS){
		    printf("b (%d) fail\n", loop);
		} else if( cantest_raw.status == CT_RAW_READ_ERROR){
		    printf("r (%d) fail\n", loop);
		} else {
		    printf("?%d (%d) fail\n", cantest_raw.status, loop);
		}
		if( opt_verbose & VERB_LOGMSG ){
		    dbg_log_print();
		}
		if( opt_logger ){
		    readLogger(opt_logger, opt_pkts*2+20);
		}
		if( opt_flags & FLAGS_STOPONERROR ){
		    printf("Stopped. Ctrl-C to exit.\n");
		    fflush(stdout);
		    while( 1 ){
			sleep(1);
		    }
		}
		exit(cleanup(1));
	    }
	}
	
	if( opt_ldelay ){
	    usleep(opt_ldelay*1000);
	}

	if( opt_flags & FLAGS_REFILLDATA ){
	    for(devidx=0; devidx<2; devidx++){
		for(txcnt=0; txcnt<opt_pkts; txcnt++){
		    n = txdata[devidx][txcnt].len;
		    for(i=2; i<n; i++){
			txdata[devidx][txcnt].data[i] = rand() & 0xff;
		    }
		}
	    }
	}
    
    }
    
    printf(" ok\n");
    return cleanup(0);
}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* WG: mcp251xfd: rx frame truncation
       [not found] <FR0P281MB1966C774C1ED0752CCD6B80497C79@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
                   ` (2 preceding siblings ...)
  2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation (1/2) Stefan Althöfer
@ 2023-01-18 12:32 ` Stefan Althöfer
  3 siblings, 0 replies; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-18 12:32 UTC (permalink / raw)
  To: linux-can

[-- Attachment #1: Type: text/plain, Size: 16 bytes --]

2nd Attachment

[-- Attachment #2: 20230118T0952_can0_can1.tgz --]
[-- Type: application/x-compressed, Size: 43985 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation (1/2) Stefan Althöfer
@ 2023-01-18 14:43   ` Thomas.Kopp
  2023-01-18 16:03     ` AW: " Stefan Althöfer
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-18 14:43 UTC (permalink / raw)
  To: Stefan.Althoefer, linux-can

> Jan 18 05:34:00 raspberrypi kernel: [405471.116322] mcp251xfd spi0.0 can0:
> Transmit Event FIFO buffer not empty. (seq=0x00001e36,
> tef_tail=0x00001e3a, tef_head=0x00001e3b, tx_head=0x00001e3b).
> Jan 18 09:36:27 raspberrypi kernel: [420017.452369] mcp251xfd spi6.0 can1:
> Transmit Event FIFO buffer full. (seq=0x00001062, tef_tail=0x00001066,
> tef_head=0x00001067, tx_head=0x00001067).
> 
Interesting, never seen of of the full messages, but only running since ~2h now.

> But now the real problem. After some time one of the socketcans is receiving
> only a truncated message:
> 
> 1: idx    2 exp:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25
> 7C D4 82 EF 85 35 9C 33 1B 7C 5B 55 91 4F D1 0D 41 24 CA 95 A9 7E E0 60 DD 8E
> B5 0A 16 C8 2F 92 9C B1 82 22 E6 1E 55 01 9A B0 56 2B
> 1:           rx:2A5#00 02 F4 7C 7B F2 53 BD 53 84 B3 4B B7 5F AE 55 2D 88 13 25
> 
 
> I've seen this truncation occurring several times.

How often does this happen for you?

What's your SPI speed? Do you see the errors on SPI0 and SPI6 or only one of them?

Thanks,
Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: mcp251xfd: rx frame truncation (1/2)
  2023-01-18 14:43   ` Thomas.Kopp
@ 2023-01-18 16:03     ` Stefan Althöfer
  2023-01-19  8:11       ` Thomas.Kopp
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-18 16:03 UTC (permalink / raw)
  To: Thomas.Kopp, linux-can

> How often does this happen for you?
> What's your SPI speed? Do you see the errors on SPI0 and SPI6 or only one of them?

One failure was after 1h30, the other after 5h

Seems the transmit fifo messages appear on both:

Jan 18 04:09:32 raspberrypi kernel: [400402.605220] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x0000016c, tef_tail=0x00000170, tef_head=0x00000171, tx_head=0x00000171).
Jan 18 04:31:19 raspberrypi kernel: [401710.169610] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x00001d26, tef_tail=0x00001d2a, tef_head=0x00001d2b, tx_head=0x00001d2b).
Jan 18 04:54:48 raspberrypi kernel: [403118.964603] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00000900, tef_tail=0x00000904, tef_head=0x00000905, tx_head=0x00000905).
Jan 18 05:03:31 raspberrypi kernel: [403641.894620] mcp251xfd spi6.0 can1: CRC read error at address 0x0010 (length=4, data=04 4b 01 00, CRC=0xb14c) retrying.
Jan 18 05:08:32 raspberrypi kernel: [403942.287189] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x000008fe, tef_tail=0x00000902, tef_head=0x00000903, tx_head=0x00000903).
Jan 18 05:16:20 raspberrypi kernel: [404410.525137] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x000012aa, tef_tail=0x000012ae, tef_head=0x000012af, tx_head=0x000012b0).
Jan 18 05:27:41 raspberrypi kernel: [405091.805650] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x00000152, tef_tail=0x00000156, tef_head=0x00000157, tx_head=0x00000157).
Jan 18 05:34:00 raspberrypi kernel: [405471.116322] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00001e36, tef_tail=0x00001e3a, tef_head=0x00001e3b, tx_head=0x00001e3b).

spi6 (can1): 10 MHz
spi0 (can0): 16,67 MHz (admitted, this was not intended)

Funny: the truncated message was received on can1 - with lower speed. And I have never seen other CRC errors than for address 0x0000 and 0x0010.

--
Stefan




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: AW: mcp251xfd: rx frame truncation
  2023-01-18 12:25   ` AW: " Stefan Althöfer
@ 2023-01-18 18:55     ` Marc Kleine-Budde
  0 siblings, 0 replies; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-18 18:55 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: linux-can, Thomas.Kopp

[-- Attachment #1: Type: text/plain, Size: 608 bytes --]

On 18.01.2023 12:25:21, Stefan Althöfer wrote:
> > Stefan, can you capture the SPI traffic with a logic analyzer, too?
> 
> Should be possible, but I need to change my hardware setup to have both of my click board on
> the on pi. So have some patience ;-) I can save as vcd, which you can import in sigrok.

Sounds good!

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-18 16:03     ` AW: " Stefan Althöfer
@ 2023-01-19  8:11       ` Thomas.Kopp
       [not found]         ` <FR0P281MB19669083201C1B9C2DFB9E5497C49@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-19  8:11 UTC (permalink / raw)
  To: Stefan.Althoefer, linux-can

 
> spi6 (can1): 10 MHz
> spi0 (can0): 16,67 MHz (admitted, this was not intended)

The 16,67 are expected on the pi4. There's an erratum on the chip which makes it only safe to use SPI speeds of ( sysclock/2)*0.85, so 17 MHz with a 40 MHz clock and the 16.67 is as close as the pi can get. That shouldn't be a problem though. I'm more surprised by the 10 MHz though. Is that something deliberately limited in your DT or where does that come from?
 
> Funny: the truncated message was received on can1 - with lower speed. And
AFAIK spi6 is different compared to spi0 in more than just the speed. I had my setup(spi0 and spi1)  running for a total of 16hours but didn't get anything. Will switch to spi6 instead of spi1 as well now. 

Where do the 10 MHz in your case come from? I think the spi6 can also reach the 16.67 MHz and with the throughput you're generating with the testscript that should be better.
For the record, this is how you call the test as well, right? ./sctest -b -l 990 -v cmperr,logmsg -F refilldata,lessdots,stoponerror,randomize -d 64 -n 10 -t 3 can0 can1

> I have never seen other CRC errors than for address 0x0000 and 0x0010.
CRC errors are very rare in a non-noisy environment. The two above are exceptions due to an erratum (0x10) and timing(0x0).

Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: mcp251xfd: rx frame truncation (1/2)
       [not found]         ` <FR0P281MB19669083201C1B9C2DFB9E5497C49@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
@ 2023-01-19 12:48           ` Stefan Althöfer
  2023-01-19 13:05             ` Thomas.Kopp
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-19 12:48 UTC (permalink / raw)
  To: Thomas.Kopp, Marc Kleine-Budde, linux-can

[-- Attachment #1: Type: multipart/alternative, Size: 7300 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-19 12:48           ` AW: " Stefan Althöfer
@ 2023-01-19 13:05             ` Thomas.Kopp
  2023-01-19 13:32               ` Marc Kleine-Budde
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-19 13:05 UTC (permalink / raw)
  To: Stefan.Althoefer, mkl, linux-can



>Update:
>I moved my click boards to the standard RPI and attached the logic analyzer. The same spi0/spi6 16/10 MHz setup.
>However the truncation error has not appeared within 16h. Nevertheless I see lots of Transmit Event
>issues:
 Mhm ☹ That could explain why I can't catch anything. I'm using the Pi4 with the original Seeed Studio HAT(soldered the 18FD onto it instead of the 17FD) on SPI0 and SPI6 and matched your SPI frequencies.

Jan 19 04:17:52 raspberrypi kernel: [43155.280967] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x000018aa, tef_tail=0x000018ae, tef_head=0x000018af, tx_head=0x000018af). 
Jan 19 04:18:38 raspberrypi kernel: [43200.859040] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer full. (seq=0x00000d42, tef_tail=0x00000d46, tef_head=0x00000d47, tx_head=0x00000d47).
Jan 19 04:48:04 raspberrypi kernel: [44967.518028] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000000ae, tef_tail=0x000000b2, tef_head=0x000000b3, tx_head=0x000000b3).
Jan 19 04:49:03 raspberrypi kernel: [45026.114903] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000015b0, tef_tail=0x000015b4, tef_head=0x000015b5, tx_head=0x000015b5).
Jan 19 04:51:22 raspberrypi kernel: [45165.012206] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x00000e12, tef_tail=0x00000e16, tef_head=0x00000e17, tx_head=0x00000e18).
Jan 19 05:13:33 raspberrypi kernel: [46496.222321] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00001c3e, tef_tail=0x00001c42, tef_head=0x00001c43, tx_head=0x00001c43).
Jan 19 05:34:52 raspberrypi kernel: [47775.805160] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x0000244e, tef_tail=0x00002452, tef_head=0x00002453, tx_head=0x00002453).
Jan 19 06:04:57 raspberrypi kernel: [49580.850626] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x0000103a, tef_tail=0x0000103e, tef_head=0x0000103f, tx_head=0x0000103f).
Jan 19 06:28:33 raspberrypi kernel: [50996.403445] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer full. (seq=0x00001ed6, tef_tail=0x00001eda, tef_head=0x00001edb, tx_head=0x00001edb).
Jan 19 09:47:13 raspberrypi kernel: [62916.335751] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x0000143e, tef_tail=0x00001442, tef_head=0x00001443, tx_head=0x00001444).
Jan 19 09:56:14 raspberrypi kernel: [63457.721406] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x00001808, tef_tail=0x0000180c, tef_head=0x0000180d, tx_head=0x0000180d).
Jan 19 11:12:44 raspberrypi kernel: [68047.346038] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000022fe, tef_tail=0x00002302, tef_head=0x00002303, tx_head=0x00002303).
Jan 19 11:48:41 raspberrypi kernel: [70204.560171] mcp251xfd spi6.0 can1: Transmit Event FIFO buffer not empty. (seq=0x00001ea8, tef_tail=0x00001eac, tef_head=0x00001ead, tx_head=0x00001ead).
Jan 19 11:59:54 raspberrypi kernel: [70877.645965] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x00000932, tef_tail=0x00000936, tef_head=0x00000937, tx_head=0x00000938).
Jan 19 12:10:23 raspberrypi kernel: [71506.521062] mcp251xfd spi0.0 can0: Transmit Event FIFO buffer not empty. (seq=0x0000135e, tef_tail=0x00001362, tef_head=0x00001363, tx_head=0x00001363).

Are you interrested in analyzing what's going on at those times? If so, what to do in the driver to stop operation
after the message has been issued so I can stop the analyzer and sample the chip registers?

Absolutely interested, yes.  I see the not empty cases pretty often as well and they shouldn't be an issue. I still can't reproduce the buffer full though. I pretty much only look at the raw SPI communication when analyzing this. So whatever you can capture on the SPI traces for SPI6 would be interesting for me to look at. If possible it would be great to capture either CANL or CANH as well. Yes, a full dump of the chip registers and RAM help to correlate things but in the end it's the pure SPI traces that help find problematic timings.
Not sure how much you can log at a time but maybe it's possible to check the dmesg log and then kill your testscript and set a GPIO on the pi to trigger your logic analyzer to stop acquiring data? That would work if the analyzer can keep a couple seconds in a Ringbuffer or something.

Thanks,
Thomas



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-19 13:05             ` Thomas.Kopp
@ 2023-01-19 13:32               ` Marc Kleine-Budde
  2023-01-20 20:24                 ` AW: " Stefan Althöfer
  0 siblings, 1 reply; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-19 13:32 UTC (permalink / raw)
  To: Thomas.Kopp; +Cc: Stefan.Althoefer, linux-can

[-- Attachment #1: Type: text/plain, Size: 2622 bytes --]

On 19.01.2023 13:05:02, Thomas.Kopp@microchip.com wrote:
> Are you interrested in analyzing what's going on at those times?

Sure.

> If so, what to do in the driver to stop operation after the message
> has been issued so I can stop the analyzer and sample the chip
> registers?

If you return with an error value, the driver will dump the registers
with devcoredump and disable the IRQs and timestamping:

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
index 0c89e4f33976..8f821ecde69b 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c
@@ -79,6 +79,8 @@ mcp251xfd_handle_tefif_recover(const struct mcp251xfd_priv *priv, const u32 seq)
                     "not empty" : "empty",
                     seq, priv->tef->tail, priv->tef->head, tx_ring->head);
 
+        return -EINVAL;
+
         /* The Sequence Number in the TEF doesn't match our tef_tail. */
         return -EAGAIN;
 }

Copy
https://github.com/linux-can/can-utils/blob/master/mcp251xfd/99-devcoredump.rules
to /etc/udev/rules.d and
https://github.com/linux-can/can-utils/blob/master/mcp251xfd/devcoredump
to /usr/sbin/

You'll find a coredump in /var/log/

> Absolutely interested, yes. I see the not empty cases pretty often as
> well and they shouldn't be an issue. I still can't reproduce the
> buffer full though. I pretty much only look at the raw SPI
> communication when analyzing this. So whatever you can capture on the
> SPI traces for SPI6 would be interesting for me to look at. If
> possible it would be great to capture either CANL or CANH as well.
> Yes, a full dump of the chip registers and RAM help to correlate
> things but in the end it's the pure SPI traces that help find
> problematic timings.
>
> Not sure how much you can log at a time but maybe it's possible to
> check the dmesg log and then kill your testscript and set a GPIO on
> the pi to trigger your logic analyzer to stop acquiring data? That
> would work if the analyzer can keep a couple seconds in a Ringbuffer
> or something.

If you don't want the devcoredump (which reads all chip registers)
comment it out in
https://elixir.bootlin.com/linux/v6.1.7/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c#L1578

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-19 13:32               ` Marc Kleine-Budde
@ 2023-01-20 20:24                 ` Stefan Althöfer
  2023-01-20 21:55                   ` Marc Kleine-Budde
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-20 20:24 UTC (permalink / raw)
  To: Marc Kleine-Budde, Thomas.Kopp; +Cc: linux-can

Damn shit. The error triggered right soon after you last email. It took 
two more errors until I learned that is a good idea to save in native format
first, and then attempt to export to vcd (which tends to hang).

This time the frame in question was truncated to 4 bytes. The other cases it was 
always truncated to 20 bytes.

The regdumps were taken quite a while after the error.

--
Stefan


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-20 20:24                 ` AW: " Stefan Althöfer
@ 2023-01-20 21:55                   ` Marc Kleine-Budde
  2023-01-20 22:46                     ` Thomas.Kopp
  0 siblings, 1 reply; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-20 21:55 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: Thomas.Kopp, linux-can

[-- Attachment #1: Type: text/plain, Size: 1020 bytes --]

On 20.01.2023 20:24:13, Stefan Althöfer wrote:
> Damn shit. The error triggered right soon after you last email. It took 
> two more errors until I learned that is a good idea to save in native format
> first, and then attempt to export to vcd (which tends to hang).

doh!

> This time the frame in question was truncated to 4 bytes. The other cases it was 
> always truncated to 20 bytes.
> 
> The regdumps were taken quite a while after the error.

I managed to load the VCD into pulseview, decode the SPI and even my old
mcp251xfd decoder shows something.

Keep the mailing list on Cc. As you cannot send the big dumps via the
list, you can send them to Thomas and me directly, but send the text
only mail to the list.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-20 21:55                   ` Marc Kleine-Budde
@ 2023-01-20 22:46                     ` Thomas.Kopp
  2023-01-21 21:30                       ` AW: " Stefan Althöfer
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-20 22:46 UTC (permalink / raw)
  To: mkl, Stefan.Althoefer; +Cc: linux-can

> > Damn shit. The error triggered right soon after you last email. It took
> > two more errors until I learned that is a good idea to save in native format
> > first, and then attempt to export to vcd (which tends to hang).
> 
> doh!
> 
> > This time the frame in question was truncated to 4 bytes. The other cases it
> was
> > always truncated to 20 bytes.
> >
> > The regdumps were taken quite a while after the error.
> 
> I managed to load the VCD into pulseview, decode the SPI and even my old
> mcp251xfd decoder shows something.

Thanks for the captures!

The bad data is on the SPI bus (corrupted DLC, everything else seems to be ok. SPI CRC is also fine (matches the wrong data). The chip sending the faulty DLC is busy at the time with the CAN CRC. Will try to re-create the timing and find out what's going on.

Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-20 22:46                     ` Thomas.Kopp
@ 2023-01-21 21:30                       ` Stefan Althöfer
  2023-01-24  9:15                         ` Thomas.Kopp
  2023-01-25 16:23                         ` Stefan Althöfer
  0 siblings, 2 replies; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-21 21:30 UTC (permalink / raw)
  To: Thomas.Kopp, mkl; +Cc: linux-can

Some additional erors captures.

On more event of truncated CAN frame. This time 8 data bytes (instead of 64 are
received). So much to my 20 bytes observation ...

And then two cases of "Transmit Event FIFO buffer full" along with the requested
coredump.

Data sent to the gurus personally. If anybody else want to browse through 10 ms
of exiting SPI waveform, just ask!

--
Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-21 21:30                       ` AW: " Stefan Althöfer
@ 2023-01-24  9:15                         ` Thomas.Kopp
  2023-01-25  9:35                           ` Thomas.Kopp
  2023-01-25 16:23                         ` Stefan Althöfer
  1 sibling, 1 reply; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-24  9:15 UTC (permalink / raw)
  To: Stefan.Althoefer, mkl; +Cc: linux-can


> Some additional erors captures.

Thanks!

> On more event of truncated CAN frame. This time 8 data bytes (instead of 64
> are
> received). So much to my 20 bytes observation ...

Took some time but I think I understood what's going on. The issue here is that due to the bug that Marc wrote the workaround for, we read a false HEAD from FIFOSTA and read multiple RX message objects including one that's currently in the process of being updated with the CAN-Frame that is just in the process of finishing on the bus. At this point the Timestamp is already updated but the DLC is not, so you're re-reading the old DLC from the previous iteration of the Ring but the new timestamp so the workaround doesn't fire as it's a positive relative timestamp. At this time I don't know if this is constrained to the DLC or if data bytes could also be affected. I will have this checked in simulation.
Now, the only workaround I can currently think of, is completely disabling fetching more than one RX Message Object at a time and forcing re-reading the FIFOSTA after every iteration. This will be a performance hit so it's not a nice solution but that *should* get rid of this error in your case. I suppose it could be faster reading diag1 and using the error free messages to determine how much to read. This also means an additional SPI transfer/SPI message but I suppose reading C1INT and DIAG1 in 1 Message using two transfers would be faster (at least on a SPI driver comparable to the PI).

> And then two cases of "Transmit Event FIFO buffer full" along with the
> requested
> coredump.
Will work on this next but didn't have a chance yet.

Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-24  9:15                         ` Thomas.Kopp
@ 2023-01-25  9:35                           ` Thomas.Kopp
  2023-01-25 11:16                             ` Thomas.Kopp
  2023-01-25 11:25                             ` AW: " Stefan Althöfer
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-25  9:35 UTC (permalink / raw)
  To: Stefan.Althoefer, mkl; +Cc: linux-can

> > And then two cases of "Transmit Event FIFO buffer full" along with the
> > requested
> > coredump.
> Will work on this next but didn't have a chance yet.

Ok, looking at the traces this doesn't seem to be connected to the issue.
Taken from one of your traces is data below We have a tef fifo size of 4, so yeah in this scenario it's expected to have TEF full situations. A larger TEF/Faster SPI/better SPI utilization would all reduce the occurrence of the issue.

The A0 5D.... are UINC after transmitting a frame in the TX fifo, the A0 41 are uincs of the same TEF after transmitting a frame.

"1.030 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","207259"
"1.036 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","208526"
"1.157 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","232733"
"1.433 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","287848"
"1.999 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","401133"
"2.006 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","402378"
"2.106 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","422392"
"2.237 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.243 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.250 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.467 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.473 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.627 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","526634"
"2.908 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","582761"
"2.914 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","584014"
"3.027 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.033 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.094 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.101 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.267 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.424 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","686002"
"3.545 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","710278"
"3.666 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","734479"
"3.944 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","790121"
"4.581 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","917495"
"4.588 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","918753"
"4.594 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","919995"
"4.735 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.741 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.747 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.968 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.975 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.127 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1026654"
"5.402 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1081638"
"5.408 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1082893"
"5.518 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.524 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.592 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.598 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.774 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.926 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1186456"
"6.047 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1210668"
"6.168 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1234755"
"6.443 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1289833"
"7.131 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1427454"
"7.137 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1428711"
"7.144 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1429981"
"7.280 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.286 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.292 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.493 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.500 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.655 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1532199"
"7.930 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1587171"
"7.936 ms","SPI","MOSI transfers","MOSI transfer","A0 5D 01 01 82 B6","1588439"

Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-25  9:35                           ` Thomas.Kopp
@ 2023-01-25 11:16                             ` Thomas.Kopp
  2023-01-25 11:25                             ` AW: " Stefan Althöfer
  1 sibling, 0 replies; 23+ messages in thread
From: Thomas.Kopp @ 2023-01-25 11:16 UTC (permalink / raw)
  To: Stefan.Althoefer, mkl; +Cc: linux-can

> > > And then two cases of "Transmit Event FIFO buffer full" along with the
> > > requested
> > > coredump.
> > Will work on this next but didn't have a chance yet.
> 
> Ok, looking at the traces this doesn't seem to be connected to the issue.
Whoops, most of the email was wrong - I mixed up fifos and looked at RX UINC instead of TX FIFO Uincs.

Below is the real SPI sequence of UINCs for TX and TEF and that looks ok. Now, the only time FIFOsta is read (and hence a full TEF reported is in the mcp251xfd_handle_tefif_recover function - and at this point in time the TEF just happens to be full in your case, which is still not an issue but by design. An overflow would be an issue.

"1.706 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","342458"
"1.730 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","347322"
"1.798 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","360950"
"2.229 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","447008"
"2.237 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.243 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.250 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.322 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","465753"
"2.384 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","478142"
"2.467 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.473 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"2.839 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","569023"
"2.899 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","581155"
"2.947 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","590637"
"3.027 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.033 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.094 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.101 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"3.157 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","632642"
"3.267 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.219 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","844967"
"4.243 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","849849"
"4.312 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","863599"
"4.726 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","946550"
"4.735 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.741 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.747 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.820 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","965234"
"4.881 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","977548"
"4.968 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"4.975 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.336 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1068483"
"5.394 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1080029"
"5.464 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1094110"
"5.518 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.524 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.592 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.598 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"5.665 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1134219"
"5.774 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"6.736 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1348375"
"6.760 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1353238"
"6.849 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1370964"
"7.247 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1450675"
"7.280 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.286 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.292 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.362 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1473706"
"7.424 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1486008"
"7.493 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.500 ms","SPI","MOSI transfers","MOSI transfer","A0 41
"7.864 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1574021"
"7.921 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1585562"
"7.977 ms","SPI","MOSI transfers","MOSI transfer","A0 75 01 03 00 99","1596673"

Thomas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-25  9:35                           ` Thomas.Kopp
  2023-01-25 11:16                             ` Thomas.Kopp
@ 2023-01-25 11:25                             ` Stefan Althöfer
  2023-01-25 11:57                               ` Marc Kleine-Budde
  1 sibling, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-25 11:25 UTC (permalink / raw)
  To: Thomas.Kopp, mkl; +Cc: linux-can

> Taken from one of your traces is data below We have a tef fifo size of 4, so yeah in this scenario it's expected to have TEF full situations. A larger 
> TEF/Faster SPI/better SPI utilization would all reduce the occurrence of the issue.

I was pretty sure that my application never has more than three frames on the fly - having sent
three frames I wait for the first to be arrived before I sent the next.

However, it seems that the RX IRQ processing is faster than than the TX IRQ
processing and I am sending the next frame when the TX event fifo has not yet
been read.

As "buffer full" is no error (as opposed to buffer overflow), should this really be
a non debug kernel message? Some other means of getting such statistics appreciated
(see my posting " mcp251xfd diagnostic outputs").

--
Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: AW: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-25 11:25                             ` AW: " Stefan Althöfer
@ 2023-01-25 11:57                               ` Marc Kleine-Budde
  0 siblings, 0 replies; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-25 11:57 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: Thomas.Kopp, linux-can

[-- Attachment #1: Type: text/plain, Size: 1874 bytes --]

On 25.01.2023 11:25:48, Stefan Althöfer wrote:
> > Taken from one of your traces is data below We have a tef fifo size
> > of 4, so yeah in this scenario it's expected to have TEF full
> > situations. A larger TEF/Faster SPI/better SPI utilization would all
> > reduce the occurrence of the issue.
> 
> I was pretty sure that my application never has more than three frames
> on the fly - having sent three frames I wait for the first to be
> arrived before I sent the next.

The driver takes care of flow control, no matter how many CAN frames you
send. If you send too many you'll get a -ENOBUFS.

> However, it seems that the RX IRQ processing is faster than than the TX IRQ
> processing and I am sending the next frame when the TX event fifo has not yet
> been read.

ACK, the RX IRQ is processed before the TEF (TX complete IRQ), as we
cannot do flow control on the RX.

> As "buffer full" is no error (as opposed to buffer overflow), should this really be
> a non debug kernel message? Some other means of getting such statistics appreciated
> (see my posting " mcp251xfd diagnostic outputs").

In the mainline version of the driver you'll get these error messages,
if the TEF doesn't contain the expected sequence number, probably
because the TX FIFO STA register shows a too big TX FIFO head index.
The root cause of this error wasn't known, so an error message is
printed. Feel free to test this series:

| https://lore.kernel.org/all/20230124152729.814840-1-mkl@pengutronix.de

It uses a similar workaround as in the RX FIFO STA workaround.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* AW: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-21 21:30                       ` AW: " Stefan Althöfer
  2023-01-24  9:15                         ` Thomas.Kopp
@ 2023-01-25 16:23                         ` Stefan Althöfer
  2023-01-25 17:06                           ` Marc Kleine-Budde
  1 sibling, 1 reply; 23+ messages in thread
From: Stefan Althöfer @ 2023-01-25 16:23 UTC (permalink / raw)
  To: Thomas.Kopp, mkl; +Cc: linux-can

I dont know if this is still of interest:

As it seemed that all my truncation errors appeared for DLEN=64 packets, I
Limited to DLEN=32.

After 38 h I got an "increase error". Seems logical that this can happen if the DLEN is read wrong.

1: idx    8 exp:2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB
1:           rx:2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB C7 1D 92 24 D1 B4 72 29

Now the interesting part:

1: RX72 (010/009)  2A5#00 09 D2 91 35 52 03 A5 41 DA 2A 68 DF D3 07 27 D5 9B 57 C9 76 04 06 82
1: RX72 (010/008)  2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB C7 1D 92 24 D1 B4 72 29    ********
1: RX72 (010/007)  2A5#00 07 7F 72 70 EF D7 85
1: RX72 (010/006)  2A5#00 06 E7 F2 46 7D 49 32 A6 4E 72 3D F2 D4 F3 1D 98 4E F5 4A
1: RX72 (010/005)  2A5#00 05 6F 65 15 38 2E 23 78 43 59 F5
1: RX72 (009/004)  2A5#00 04 95 57 EC 29 05 40 97 A3 62 B5 2B C3 5B D8 B2 31 7C 26
1: RX72 (008/003)  2A5#00 03 1E 88 61 A6 AC EE D5 A4 74 3E E8 EF C8 C9 0E 40 15 36 7D A4 98 50
1: RX72 (005/002)  2A5#00 02 E9 40 87 24 8D 2F F8 85 69 44 7B 8A E0 1F 77 4C 27 3D 8E 62 D3 F1 BF 9B 94 AD 54 6D 9E 21
1: RX72 (004/001)  2A5#00 01 63 5C C8 A2 EE B9 D9 F1 CD B1
1: RX72 (004/000)  2A5#00 00 F8 39 4C F5 21 9C A4 ED 6C 08 C6 42 3B 95
1: RX72 (010/009)  2A5#00 09 B2 54 EC AB 1B 3A A0 D9 01 AD BB 3E 12 2F ED B7 50 41 C1 9F 31 76
1: RX72 (010/008)  2A5#00 08 0C E2 3D 65 74 AF A4 21 53 0A 4E F0 B5 8D AB 75 91 F5 73 AE A4 BC
1: RX72 (010/007)  2A5#00 07 F8 C7 4D F5 BE C7
1: RX72 (010/006)  2A5#00 06 AD A4 66 2D 9D AA 3E BA C0 04 4A FE 1D AF 48 04 AF 30
1: RX72 (010/005)  2A5#00 05 FD F7 79 17 24 76 59 92 72 3F
1: RX72 (010/004)  2A5#00 04 7F F9 44 11 1D 15 C6 90 3E 18 64 4A E7 00 E8 F4 17 1D
1: RX72 (009/003)  2A5#00 03 52 D4 0B CE 9C 9E 0D 17 35 08 DF 5C 1A 2D FC 42 6E FC E6 1B 31 27
1: RX72 (006/002)  2A5#00 02 62 E2 7C C7 92 3F 7B 97 FB C8 27 11 4D A0 28 41 00 A4 AD 35 41 64 C7 1D 92 24 D1 B4 72 29   *********
1: RX72 (005/001)  2A5#00 01 B6 92 DC 2E 5D B3 22 4E 58 9E
1: RX72 (004/000)  2A5#00 00 04 BC 4D 90 16 85 78 16 F3 60 7C 6C 35 9D
1: RX72 (010/009)  2A5#00 09 9E 6A E2 7E 4F 81 1F BE AD 84 15 E5 50 D1 80 03 3A 46 63 85 F2 CB
1: RX72 (010/008)  2A5#00 08 65 56 38 CC BC 9B B2 69 C5 60 4B AC 68 94 C7 97 5D EA 3A 92 BA EF
1: RX72 (010/007)  2A5#00 07 E0 E5 9E A0 6F 2C
1: RX72 (010/006)  2A5#00 06 AC 2A AD E2 E6 08 49 1B 2E C8 23 A3 35 D0 B5 0C B0 F3

The on-top data has been received before (see marked lines). However at a queue length of
20 (according to ethtool) I would not have expected that the distance to this frame
Would be 16, but 20. Doesn't the chip/driver use the queue elements one-by-one?

--
Stefan


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: AW: AW: RE: mcp251xfd: rx frame truncation (1/2)
  2023-01-25 16:23                         ` Stefan Althöfer
@ 2023-01-25 17:06                           ` Marc Kleine-Budde
  0 siblings, 0 replies; 23+ messages in thread
From: Marc Kleine-Budde @ 2023-01-25 17:06 UTC (permalink / raw)
  To: Stefan Althöfer; +Cc: Thomas.Kopp, linux-can

[-- Attachment #1: Type: text/plain, Size: 3529 bytes --]

On 25.01.2023 16:23:27, Stefan Althöfer wrote:
> I dont know if this is still of interest:
> 
> As it seemed that all my truncation errors appeared for DLEN=64 packets, I
> Limited to DLEN=32.

Changing the transmitted length influences the timing on the sending
code path, it's the same on the RX path.

> After 38 h I got an "increase error". Seems logical that this can happen if the DLEN is read wrong.
> 
> 1: idx    8 exp:2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB
> 1:           rx:2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB C7 1D 92 24 D1 B4 72 29

It looks like the new data (including the timestamp) has already been
written, but the DLC is not yet updated.

> Now the interesting part:
> 
> 1: RX72 (010/009)  2A5#00 09 D2 91 35 52 03 A5 41 DA 2A 68 DF D3 07 27 D5 9B 57 C9 76 04 06 82
> 1: RX72 (010/008)  2A5#00 08 27 05 A8 9F 48 02 94 30 F4 DA AD 3D 0C 53 8C 7F 91 7E 53 84 9B EB C7 1D 92 24 D1 B4 72 29    ********
> 1: RX72 (010/007)  2A5#00 07 7F 72 70 EF D7 85
> 1: RX72 (010/006)  2A5#00 06 E7 F2 46 7D 49 32 A6 4E 72 3D F2 D4 F3 1D 98 4E F5 4A
> 1: RX72 (010/005)  2A5#00 05 6F 65 15 38 2E 23 78 43 59 F5
> 1: RX72 (009/004)  2A5#00 04 95 57 EC 29 05 40 97 A3 62 B5 2B C3 5B D8 B2 31 7C 26
> 1: RX72 (008/003)  2A5#00 03 1E 88 61 A6 AC EE D5 A4 74 3E E8 EF C8 C9 0E 40 15 36 7D A4 98 50
> 1: RX72 (005/002)  2A5#00 02 E9 40 87 24 8D 2F F8 85 69 44 7B 8A E0 1F 77 4C 27 3D 8E 62 D3 F1 BF 9B 94 AD 54 6D 9E 21
> 1: RX72 (004/001)  2A5#00 01 63 5C C8 A2 EE B9 D9 F1 CD B1
> 1: RX72 (004/000)  2A5#00 00 F8 39 4C F5 21 9C A4 ED 6C 08 C6 42 3B 95
> 1: RX72 (010/009)  2A5#00 09 B2 54 EC AB 1B 3A A0 D9 01 AD BB 3E 12 2F ED B7 50 41 C1 9F 31 76
> 1: RX72 (010/008)  2A5#00 08 0C E2 3D 65 74 AF A4 21 53 0A 4E F0 B5 8D AB 75 91 F5 73 AE A4 BC
> 1: RX72 (010/007)  2A5#00 07 F8 C7 4D F5 BE C7
> 1: RX72 (010/006)  2A5#00 06 AD A4 66 2D 9D AA 3E BA C0 04 4A FE 1D AF 48 04 AF 30
> 1: RX72 (010/005)  2A5#00 05 FD F7 79 17 24 76 59 92 72 3F
> 1: RX72 (010/004)  2A5#00 04 7F F9 44 11 1D 15 C6 90 3E 18 64 4A E7 00 E8 F4 17 1D
> 1: RX72 (009/003)  2A5#00 03 52 D4 0B CE 9C 9E 0D 17 35 08 DF 5C 1A 2D FC 42 6E FC E6 1B 31 27
> 1: RX72 (006/002)  2A5#00 02 62 E2 7C C7 92 3F 7B 97 FB C8 27 11 4D A0 28 41 00 A4 AD 35 41 64 C7 1D 92 24 D1 B4 72 29   *********
> 1: RX72 (005/001)  2A5#00 01 B6 92 DC 2E 5D B3 22 4E 58 9E
> 1: RX72 (004/000)  2A5#00 00 04 BC 4D 90 16 85 78 16 F3 60 7C 6C 35 9D
> 1: RX72 (010/009)  2A5#00 09 9E 6A E2 7E 4F 81 1F BE AD 84 15 E5 50 D1 80 03 3A 46 63 85 F2 CB
> 1: RX72 (010/008)  2A5#00 08 65 56 38 CC BC 9B B2 69 C5 60 4B AC 68 94 C7 97 5D EA 3A 92 BA EF
> 1: RX72 (010/007)  2A5#00 07 E0 E5 9E A0 6F 2C
> 1: RX72 (010/006)  2A5#00 06 AC 2A AD E2 E6 08 49 1B 2E C8 23 A3 35 D0 B5 0C B0 F3
> 
> The on-top data has been received before (see marked lines). However
> at a queue length of 20 (according to ethtool) I would not have
> expected that the distance to this frame Would be 16, but 20. Doesn't
> the chip/driver use the queue elements one-by-one?

The driver uses internally power-of-two sized FIFOs: FIFO 1 with depth
of 16 and FIFO 2 with depth of 4. So this looks correct.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-01-25 17:07 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <FR0P281MB1966C774C1ED0752CCD6B80497C79@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
2023-01-18 11:39 ` mcp251xfd: rx frame truncation Marc Kleine-Budde
2023-01-18 11:45 ` Marc Kleine-Budde
2023-01-18 12:25   ` AW: " Stefan Althöfer
2023-01-18 18:55     ` Marc Kleine-Budde
2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation (1/2) Stefan Althöfer
2023-01-18 14:43   ` Thomas.Kopp
2023-01-18 16:03     ` AW: " Stefan Althöfer
2023-01-19  8:11       ` Thomas.Kopp
     [not found]         ` <FR0P281MB19669083201C1B9C2DFB9E5497C49@FR0P281MB1966.DEUP281.PROD.OUTLOOK.COM>
2023-01-19 12:48           ` AW: " Stefan Althöfer
2023-01-19 13:05             ` Thomas.Kopp
2023-01-19 13:32               ` Marc Kleine-Budde
2023-01-20 20:24                 ` AW: " Stefan Althöfer
2023-01-20 21:55                   ` Marc Kleine-Budde
2023-01-20 22:46                     ` Thomas.Kopp
2023-01-21 21:30                       ` AW: " Stefan Althöfer
2023-01-24  9:15                         ` Thomas.Kopp
2023-01-25  9:35                           ` Thomas.Kopp
2023-01-25 11:16                             ` Thomas.Kopp
2023-01-25 11:25                             ` AW: " Stefan Althöfer
2023-01-25 11:57                               ` Marc Kleine-Budde
2023-01-25 16:23                         ` Stefan Althöfer
2023-01-25 17:06                           ` Marc Kleine-Budde
2023-01-18 12:32 ` WG: mcp251xfd: rx frame truncation Stefan Althöfer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.