netdev.vger.kernel.org archive mirror
* Re: Question on ptr_ring linux header
@ 2019-01-17  3:51 fei phung
  2019-01-31  5:16 ` fei phung
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-01-17  3:51 UTC (permalink / raw)
  To: mst, feiphung, netdev

Hi,

I am seeing a data race reported by ThreadSanitizer.

Could you check whether there is still a data race in the following code,
which uses the ptr_ring API?



/*
 * Filename: circ_ring.c
 * Version: 1.0
 * Description: A circular buffer using API from
 * https://github.com/torvalds/linux/blob/master/include/linux/ptr_ring.h
 */

#include <linux/slab.h>
#include <linux/mm.h>
#include <linux/ptr_ring.h>
#include "circ_ring.h"
//#include <assert.h>

#define DEBUG 1

#ifdef DEBUG
#define DEBUG_MSG(...) printk(__VA_ARGS__)
#else
#define DEBUG_MSG(...)
#endif

struct ptr_ring * init_circ_queue(int len)
{
    struct ptr_ring * q;

    q = kzalloc(sizeof(struct ptr_ring), GFP_KERNEL);
    if (q == NULL) {
        DEBUG_MSG(KERN_ERR "Not enough memory to allocate ptr_ring");
        return NULL;
    }

    // creates an array of length 'len' where each array location can
    // store a struct item * pointer
    if(ptr_ring_init(q, len, GFP_KERNEL) != 0) {
        DEBUG_MSG(KERN_ERR "Not enough memory to allocate ptr_ring array");
        kfree(q); /* do not leak the ptr_ring allocation on failure */
        return NULL;
    }

    return q;
}

inline int push_circ_queue(struct ptr_ring * buffer, struct item * item_push)
{
    /* insert one item into the buffer */
    if(ptr_ring_produce_any(buffer, item_push) == 0) // operation is successful
    {
        DEBUG_MSG(KERN_INFO "Successfully pushed val1 = %u and val2 =
%u\n", item_push->val1, item_push->val2);

        return 0;
    }

    else {
        DEBUG_MSG(KERN_INFO "full, not enough buffer space\n");

        return 1;
    }
}

inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
{
    struct item * item_temp = NULL;

    /* extract one item struct containing two unsigned integers from the buffer */
    item_temp = (struct item *)ptr_ring_consume_any(buffer);

    if(item_temp) // (!= NULL)
    {
        item_pop->val1 = item_temp->val1;
        item_pop->val2 = item_temp->val2;

    // val1 will never be zero since the event number starts from 1 (so, val1 in
    // push_circ_queue() will not be zero, same case after pop_circ_queue()),
    // and 0 is only possible during initialization, not during pop_circ_queue()
        //assert(item_pop->val1 != 0);

        DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail = %u\n",
buffer->consumer_head, buffer->consumer_tail);

        DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n", item_pop->val1,
item_pop->val2);

        DEBUG_MSG(KERN_INFO "After pop, head = %u , tail = %u\n",
buffer->consumer_head, buffer->consumer_tail);

        return 0;
    }

    else {
        //DEBUG_MSG(KERN_INFO "empty, nothing to pop from the ring\n");

        return 1;
    }
}

void free_circ_queue(struct ptr_ring * q)
{
    ptr_ring_cleanup(q, NULL);
}



Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-01-17  3:51 Question on ptr_ring linux header fei phung
@ 2019-01-31  5:16 ` fei phung
  2019-01-31 14:39   ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-01-31  5:16 UTC (permalink / raw)
  To: mst, feiphung, netdev

Hi,

/*
 * Filename: circ_ring.c
 * Version: 1.0
 * Description: A circular buffer using API from
 * https://github.com/torvalds/linux/blob/master/include/linux/ptr_ring.h
 */

ptr_ring's void ** queue is giving me a data race problem: running
consume() concurrently with [assignment of pointers + produce()] will
definitely give rise to a data race.

A mutex or lock cannot help in this case. Please correct me if I am wrong.

Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-01-31  5:16 ` fei phung
@ 2019-01-31 14:39   ` Michael S. Tsirkin
  2019-02-01  8:12     ` fei phung
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2019-01-31 14:39 UTC (permalink / raw)
  To: fei phung; +Cc: feiphung, netdev

On Thu, Jan 31, 2019 at 01:16:31PM +0800, fei phung wrote:
> Hi,
> 
> /*
>  * Filename: circ_ring.c
>  * Version: 1.0
>  * Description: A circular buffer using API from
>  * https://github.com/torvalds/linux/blob/master/include/linux/ptr_ring.h
>  */
> 
> ptr_ring's void ** queue is giving me a data race problem: running
> consume() concurrently with [assignment of pointers + produce()] will
> definitely give rise to a data race.
> 
> A mutex or lock cannot help in this case. Please correct me if I am wrong.
> 
> Regards,
> Phung

I am not sure what assignment of pointers means in this context.
ptr_ring is designed for a single producer and a single consumer.  For
why it works see explanation about data dependencies in
Documentation/memory-barriers.txt.  You will have to be more specific
about the data race that you see if you expect more specific answers.
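
For illustration, the intended pattern is roughly the following (an
untested sketch; "ring" is a placeholder, and it assumes the producer
allocates each item and the consumer frees it):

	/* producer side - exactly one context produces */
	struct item *it = kmalloc(sizeof(*it), GFP_ATOMIC);
	if (it) {
		it->val1 = 1;
		it->val2 = 0;
		if (ptr_ring_produce_any(ring, it))
			kfree(it);	/* ring full, the entry was not stored */
	}

	/* consumer side - exactly one context consumes */
	struct item *got = ptr_ring_consume_any(ring);
	if (got) {
		/* use got->val1 / got->val2 here */
		kfree(got);
	}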

Thanks,

-- 
MST

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-01-31 14:39   ` Michael S. Tsirkin
@ 2019-02-01  8:12     ` fei phung
  2019-02-01 15:36       ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-02-01  8:12 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: feiphung, netdev

> I am not sure what assignment of pointers means in this context.
> ptr_ring is designed for a single producer and a single consumer.  For
> why it works see explanation about data dependencies in
> Documentation/memory-barriers.txt.  You will have to be more specific
> about the data race that you see if you expect more specific answers.

Hi,

ptr_ring_produce_any(sc->recv[chnl]->msgs, &item_recv_push[item_recv_push_index])
needs to have a non-NULL pointer assigned for item_recv_push[item_recv_push_index],
right?

Note:
ptr_ring_produce_any() occurs in the interrupt handler, while
ptr_ring_consume_any() occurs in a kernel thread.

https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6/cdc6599b8313e265bdfb073a65a124e1ba3303a2#file-riffa_driver_ptr_ring-c-L306-L320

        // TX (PC receive) scatter gather buffer is read.
        if (vect & (1<<((5*i)+1))) {
            recv = 1;

            item_recv_push[item_recv_push_index].val1 = EVENT_SG_BUF_READ;
            item_recv_push[item_recv_push_index].val2 = 0;

            // Keep track so the thread can handle this.
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push[item_recv_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf read msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf read\n",
                      sc->id, chnl);

            item_recv_push_index++;
        }


The kernel log points me to ptr_ring_consume_any(), so this is definitely
a data race issue with my own ptr_ring interfacing code.

Besides, the kernel dmesg log also shows a reference to a zero-length ring.
I am not sure how this is related to the data race, though.

The kernel log points to
https://github.com/torvalds/linux/blame/master/include/linux/ptr_ring.h#L175
(click open git blame on this line)

Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-02-01  8:12     ` fei phung
@ 2019-02-01 15:36       ` Michael S. Tsirkin
  2019-02-15  3:03         ` fei phung
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2019-02-01 15:36 UTC (permalink / raw)
  To: fei phung; +Cc: feiphung, netdev

On Fri, Feb 01, 2019 at 04:12:46PM +0800, fei phung wrote:
> > I am not sure what assignment of pointers means in this context.
> > ptr_ring is designed for a single producer and a single consumer.  For
> > why it works see explanation about data dependencies in
> > Documentation/memory-barriers.txt.  You will have to be more specific
> > about the data race that you see if you expect more specific answers.
> 
> Hi,
> 
> ptr_ring_produce_any(sc->recv[chnl]->msgs,
> &item_recv_push[item_recv_push_index])  needs to have
> non-NULL pointer assigned for item_recv_push[item_recv_push_index] , right ?

No that's irrelevant. If your item_recv_push_index isn't getting
out of bounds then &item_recv_push[item_recv_push_index]
won't be NULL and that's all ptr_ring cares about.


> Note:
> ptr_ring_produce_any() occurs in interrupt handler, while
> ptr_ring_consume_any() occurs in thread.
> 
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6/cdc6599b8313e265bdfb073a65a124e1ba3303a2#file-riffa_driver_ptr_ring-c-L306-L320
> 
>         // TX (PC receive) scatter gather buffer is read.
>         if (vect & (1<<((5*i)+1))) {
>             recv = 1;
> 
>             item_recv_push[item_recv_push_index].val1 = EVENT_SG_BUF_READ;
>             item_recv_push[item_recv_push_index].val2 = 0;
> 
>             // Keep track so the thread can handle this.
>             if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
> &item_recv_push[item_recv_push_index])) {
>                 printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf
> read msg queue full\n", sc->id, chnl);
>             }
>             DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf
> read\n", sc->id, chnl);
> 
>             item_recv_push_index++;
>         }
> 
> 
> The kernel log points me to ptr_ring_consume_any(). So, this is
> definitely data race issue
> with my own ptr_ring interfacing code.
> 
> besides, I am also getting a reference to zero-length ring for the
> kernel dmesg log.
> I am not sure how this is related to the data race though.
> 
> The kernel log points to
> https://github.com/torvalds/linux/blame/master/include/linux/ptr_ring.h#L175
> (click open git blame on this line)
> 
> Regards,
> Phung


Sorry I'm not sure what you are trying to say here. If I had to
guess I'd say the way you play with indices is probably racy
so you are producing an invalid index.


-- 
MST

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-02-01 15:36       ` Michael S. Tsirkin
@ 2019-02-15  3:03         ` fei phung
  2019-03-01  3:20           ` fei phung
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-02-15  3:03 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: feiphung, netdev

Hi Michael,

> If I had to guess I'd say the way you play with indices is probably racy
> so you are producing an invalid index.

You are probably right.

I suspect item_recv_push_index and item_send_push_index in
https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver_ptr_ring-c-L254-L404
are causing the data race (illegal index) seen by the consume() function.

Note that this interrupt handler could be interrupted again by another
hardware interrupt.

I do not think we can use a lock within the interrupt handler, right?


Code:
~~~

struct item item_recv_push[CIRC_BUFF_SIZE];
struct item item_send_push[CIRC_BUFF_SIZE];

unsigned int item_send_push_index;
unsigned int item_recv_push_index;

///////////////////////////////////////////////////////
// INTERRUPT HANDLER
///////////////////////////////////////////////////////

/**
 * Reads the interrupt vector and processes it. If processing VECT0, off will
 * be 0. If processing VECT1, off will be 6.
 */
static inline void process_intr_vector(struct fpga_state * sc, int off,
                unsigned int vect)
{
    // VECT_0/VECT_1 are organized from right to left (LSB to MSB) as:
    // [ 0] TX_TXN          for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 1] TX_SG_BUF_RECVD for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 2] TX_TXN_DONE     for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 3] RX_SG_BUF_RECVD for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 4] RX_TXN_DONE     for channel 0 in VECT_0, channel 6 in VECT_1
    // ...
    // [25] TX_TXN          for channel 5 in VECT_0, channel 11 in VECT_1
    // [26] TX_SG_BUF_RECVD for channel 5 in VECT_0, channel 11 in VECT_1
    // [27] TX_TXN_DONE     for channel 5 in VECT_0, channel 11 in VECT_1
    // [28] RX_SG_BUF_RECVD for channel 5 in VECT_0, channel 11 in VECT_1
    // [29] RX_TXN_DONE     for channel 5 in VECT_0, channel 11 in VECT_1
    // Positions 30 - 31 in both VECT_0 and VECT_1 are zero.

    unsigned int offlast;
    unsigned int len;
    int recv;
    int send;
    int chnl;
    int i;

    offlast = 0;
    len = 0;

    //printk(KERN_INFO "riffa: intrpt_handler received:%08x\n", vect);
    if (vect & 0xC0000000) {
        printk(KERN_ERR "riffa: fpga:%d, received bad interrupt vector:%08x\n",
               sc->id, vect);
        return;
    }

    for (i = 0; i < 6 && (i+off) < sc->num_chnls; ++i) {
        chnl = i + off;
        recv = 0;
        send = 0;

        // TX (PC receive) scatter gather buffer is read.
        if (vect & (1<<((5*i)+1))) {
            recv = 1;

            item_recv_push[item_recv_push_index].val1 = EVENT_SG_BUF_READ;
            item_recv_push[item_recv_push_index].val2 = 0;

            // Keep track so the thread can handle this.
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push[item_recv_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf read msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf read\n",
                      sc->id, chnl);

            item_recv_push_index++;
        }

        // TX (PC receive) transaction done.
        if (vect & (1<<((5*i)+2))) {
            recv = 1;

            item_recv_push[item_recv_push_index].val1 = EVENT_TXN_DONE;
            item_recv_push[item_recv_push_index].val2 = len;

            // Read the transferred amount.
            len = read_reg(sc, CHNL_REG(chnl, TX_TNFR_LEN_REG_OFF));
            // Notify the thread.
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push[item_recv_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn done msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn done\n",
                      sc->id, chnl);

            item_recv_push_index++;
        }

        // New TX (PC receive) transaction.
        if (vect & (1<<((5*i)+0))) {
            recv = 1;
            recv_sg_buf_populated = 0; // resets for new transaction

            // Read the offset/last and length
            offlast = read_reg(sc, CHNL_REG(chnl, TX_OFFLAST_REG_OFF));
            tx_len = read_reg(sc, CHNL_REG(chnl, TX_LEN_REG_OFF));

            item_recv_push[item_recv_push_index].val1 = EVENT_TXN_OFFLAST;
            item_recv_push[item_recv_push_index].val2 = offlast;

            // Keep track of this transaction
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push[item_recv_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn offlast msg queue full\n",
                       sc->id, chnl);
            }
            /*if (push_circ_queue(sc->recv[chnl]->msgs, EVENT_TXN_LEN, len)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn len msg queue full\n",
                       sc->id, chnl);
            }*/
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn (len:%d off:%d last:%d)\n",
                      sc->id, chnl, tx_len, (offlast>>1), (offlast & 0x1));

            item_recv_push_index++;
        }

        // RX (PC send) scatter gather buffer is read.
        if (vect & (1<<((5*i)+3))) {
            send = 1;

            item_send_push[item_send_push_index].val1 = EVENT_SG_BUF_READ;
            item_send_push[item_send_push_index].val2 = 0;

            // Keep track so the thread can handle this.
            if (ptr_ring_produce_any(sc->send[chnl]->msgs,
                                     &item_send_push[item_send_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, send sg buf read msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, send sg buf read\n",
                      sc->id, chnl);

            item_send_push_index++;
        }

        // RX (PC send) transaction done.
        if (vect & (1<<((5*i)+4))) {
            send = 1;

            item_send_push[item_send_push_index].val1 = EVENT_TXN_DONE;
            item_send_push[item_send_push_index].val2 = len;

            // Read the transferred amount.
            len = read_reg(sc, CHNL_REG(chnl, RX_TNFR_LEN_REG_OFF));
            // Notify the thread.
            if (ptr_ring_produce_any(sc->send[chnl]->msgs,
                                     &item_send_push[item_send_push_index])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, send txn done msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, send txn done\n",
                      sc->id, chnl);

            item_send_push_index++;
        }

        // Wake up the thread?
        if (recv)
            wake_up(&sc->recv[chnl]->waitq);
        if (send)
            wake_up(&sc->send[chnl]->waitq);
    }
}

~~~
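
Just thinking aloud: would something along these lines avoid the shared
index entirely? (an untested sketch only; it assumes each event is
allocated with kmalloc(GFP_ATOMIC) inside the handler and kfree()'d by
the consumer thread after ptr_ring_consume_any() returns it)

~~~

    // inside the interrupt handler, instead of item_recv_push[item_recv_push_index]:
    struct item * ev = kmalloc(sizeof(*ev), GFP_ATOMIC);
    if (ev) {
        ev->val1 = EVENT_SG_BUF_READ;
        ev->val2 = 0;
        if (ptr_ring_produce_any(sc->recv[chnl]->msgs, ev)) {
            printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf read msg queue full\n",
                   sc->id, chnl);
            kfree(ev); // the ring did not store the pointer, so free it here
        }
    }

    // and in the consumer thread, kfree() the pointer returned by
    // ptr_ring_consume_any() once its values have been copied out.

~~~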


Regards,
Cheng Fei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-02-15  3:03         ` fei phung
@ 2019-03-01  3:20           ` fei phung
  0 siblings, 0 replies; 13+ messages in thread
From: fei phung @ 2019-03-01  3:20 UTC (permalink / raw)
  To: Michael S. Tsirkin, netdev, feiphung

Hi Michael,

I have solved all the data race issues, it seems.
The data race culprit was the ring index, which I have removed
in the latest data-race-free code.

The diff can be found at https://www.diffchecker.com/w0Pxp2mF

I plan to study the internal implementation of the ptr_ring code
in the coming weeks and months.

See the updated ptr_ring code below:


struct item item_recv_push_EVENT_SG_BUF_READ = {EVENT_SG_BUF_READ, 0};
struct item item_recv_push_EVENT_TXN_DONE = {EVENT_TXN_DONE, 0};
struct item item_recv_push_EVENT_TXN_OFFLAST = {EVENT_TXN_OFFLAST, 0};
struct item item_send_push_EVENT_SG_BUF_READ = {EVENT_SG_BUF_READ, 0};
struct item item_send_push_EVENT_TXN_DONE = {EVENT_TXN_DONE, 0};

///////////////////////////////////////////////////////
// INTERRUPT HANDLER
///////////////////////////////////////////////////////

/**
 * Reads the interrupt vector and processes it. If processing VECT0, off will
 * be 0. If processing VECT1, off will be 6.
 */
static inline void process_intr_vector(struct fpga_state * sc, int off,
                unsigned int vect)
{
    // VECT_0/VECT_1 are organized from right to left (LSB to MSB) as:
    // [ 0] TX_TXN            for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 1] TX_SG_BUF_RECVD    for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 2] TX_TXN_DONE        for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 3] RX_SG_BUF_RECVD    for channel 0 in VECT_0, channel 6 in VECT_1
    // [ 4] RX_TXN_DONE        for channel 0 in VECT_0, channel 6 in VECT_1
    // ...
    // [25] TX_TXN            for channel 5 in VECT_0, channel 11 in VECT_1
    // [26] TX_SG_BUF_RECVD    for channel 5 in VECT_0, channel 11 in VECT_1
    // [27] TX_TXN_DONE        for channel 5 in VECT_0, channel 11 in VECT_1
    // [28] RX_SG_BUF_RECVD    for channel 5 in VECT_0, channel 11 in VECT_1
    // [29] RX_TXN_DONE        for channel 5 in VECT_0, channel 11 in VECT_1
    // Positions 30 - 31 in both VECT_0 and VECT_1 are zero.

    unsigned int offlast;
    unsigned int len;
    int recv;
    int send;
    int chnl;
    int i;

    offlast = 0;
    len = 0;

//printk(KERN_INFO "riffa: intrpt_handler received:%08x\n", vect);
    if (vect & 0xC0000000) {
        printk(KERN_ERR "riffa: fpga:%d, received bad interrupt
vector:%08x\n", sc->id, vect);
        return;
    }

    for (i = 0; i < 6 && (i+off) < sc->num_chnls; ++i) {
        chnl = i + off;
        recv = 0;
        send = 0;

        // TX (PC receive) scatter gather buffer is read.
        if (vect & (1<<((5*i)+1))) {
            recv = 1;
            // Keep track so the thread can handle this.
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push_EVENT_SG_BUF_READ)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf read msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf read\n",
                      sc->id, chnl);
        }

        // TX (PC receive) transaction done.
        if (vect & (1<<((5*i)+2))) {
            recv = 1;
            // Read the transferred amount.
            len = read_reg(sc, CHNL_REG(chnl, TX_TNFR_LEN_REG_OFF));

            item_recv_push_EVENT_TXN_DONE.val2 = len;

            // Notify the thread.
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push_EVENT_TXN_DONE)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn done msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn done\n",
                      sc->id, chnl);
        }

        // New TX (PC receive) transaction.
        if (vect & (1<<((5*i)+0))) {
            recv = 1;
            recv_sg_buf_populated = 0; // resets for new transaction

            // Read the offset/last and length
            offlast = read_reg(sc, CHNL_REG(chnl, TX_OFFLAST_REG_OFF));
            tx_len = read_reg(sc, CHNL_REG(chnl, TX_LEN_REG_OFF));

            item_recv_push_EVENT_TXN_OFFLAST.val2 = offlast;

            // Keep track of this transaction
            if (ptr_ring_produce_any(sc->recv[chnl]->msgs,
                                     &item_recv_push_EVENT_TXN_OFFLAST)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn offlast msg queue full\n",
                       sc->id, chnl);
            }
            /*if (push_circ_queue(sc->recv[chnl]->msgs, EVENT_TXN_LEN, len)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn len msg queue full\n",
                       sc->id, chnl);
            }*/
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn (len:%d off:%d last:%d)\n",
                      sc->id, chnl, tx_len, (offlast>>1), (offlast & 0x1));
        }

        // RX (PC send) scatter gather buffer is read.
        if (vect & (1<<((5*i)+3))) {
            send = 1;
            // Keep track so the thread can handle this.
            if (ptr_ring_produce_any(sc->send[chnl]->msgs,
                                     &item_send_push_EVENT_SG_BUF_READ)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, send sg buf read msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, send sg buf read\n",
                      sc->id, chnl);
        }

        // RX (PC send) transaction done.
        if (vect & (1<<((5*i)+4))) {
            send = 1;
            // Read the transferred amount.
            len = read_reg(sc, CHNL_REG(chnl, RX_TNFR_LEN_REG_OFF));

            item_send_push_EVENT_TXN_DONE.val2 = len;

            // Notify the thread.
            if (ptr_ring_produce_any(sc->send[chnl]->msgs,
                                     &item_send_push_EVENT_TXN_DONE)) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, send txn done msg queue full\n",
                       sc->id, chnl);
            }
            DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, send txn done\n",
                      sc->id, chnl);
        }

        // Wake up the thread?
        if (recv)
            wake_up(&sc->recv[chnl]->waitq);
        if (send)
            wake_up(&sc->send[chnl]->waitq);
    }
}


Regards,
Cheng Fei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
       [not found]   ` <SG2PR06MB3176FA4C12888C84990F109BC1820@SG2PR06MB3176.apcprd06.prod.outlook.com>
@ 2019-01-16 14:09     ` Michael S. Tsirkin
  0 siblings, 0 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2019-01-16 14:09 UTC (permalink / raw)
  To: Cheng Fei Phung; +Cc: fei phung, netdev

On Wed, Jan 16, 2019 at 06:48:41AM +0000, Cheng Fei Phung wrote:
> Hi,
> 
> > https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L62
> > racy if there are multiple consumers.
> > just call ptr_ring_consume_any.
> 
> If I modify pop_circ_queue() below to directly use ptr_ring_consume_any() without
> ptr_ring_empty_any(), I am afraid I run into a system crash due to NULL pointer
> dereferencing for *item_pop:
> 
>           *item_pop = *((struct item *)ptr_ring_consume_any(buffer));

Yes - you want to check the returned pointer before dereferencing.
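
Something like this (an untested sketch; it keeps your copy-out
semantics and assumes the produced pointers stay valid, e.g. they point
into a static array):

	struct item *p = ptr_ring_consume_any(buffer);

	if (!p)
		return 1;	/* empty, nothing to pop from the ring */

	*item_pop = *p;		/* only dereference after the NULL check */
	return 0;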


> Just one silly beginner question: why does ptr_ring_empty_any() lead to a data race problem?

because the ring can become empty between the check and the call to consume?

> 
> inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
> {
> 	DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail = %u\n", buffer->consumer_head, buffer->consumer_tail);
> 
> 	/* extract one item struct containing two unsigned integers from the buffer */
> 	*item_pop = *((struct item *)ptr_ring_consume_any(buffer));
> 
> 	if((item_pop != NULL) && (item_pop->val1 > 0)) 
> 	{ 
> 	// val1 will never be zero since the event number starts from 1 (so, val1 in push_circ_queue() will not be zero, same case after pop_circ_queue()), and 0 is only possible during initialization, not during pop_circ_queue()
> 
> 		DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n", item_pop->val1, item_pop->val2);	    
> 
> 		DEBUG_MSG(KERN_INFO "After pop, head = %u , tail = %u\n", buffer->consumer_head, buffer->consumer_tail);
> 
> 		return 0; 
> 	}
> 
> 	else {
> 		DEBUG_MSG(KERN_INFO "empty, nothing to pop from the ring\n");
> 
> 		return 1;
> 	}
> }
> 
> 
> 
> Regards,
> Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
@ 2019-01-16  7:00 fei phung
  0 siblings, 0 replies; 13+ messages in thread
From: fei phung @ 2019-01-16  7:00 UTC (permalink / raw)
  To: mst, netdev, feiphung

Hi,

> https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L62
> racy if there are multiple consumers.
> just call ptr_ring_consume_any.

If I modify pop_circ_queue() below to directly use ptr_ring_consume_any()
without ptr_ring_empty_any(), I am afraid I run into a system crash due to
NULL pointer dereferencing for *item_pop:

          *item_pop = *((struct item *)ptr_ring_consume_any(buffer));

Just one silly beginner question: why does ptr_ring_empty_any() lead to
a data race problem?


inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
{
        DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail = %u\n",
buffer->consumer_head, buffer->consumer_tail);

        /* extract one item struct containing two unsigned integers
from the buffer */
        *item_pop = *((struct item *)ptr_ring_consume_any(buffer));

        if((item_pop != NULL) && (item_pop->val1 > 0))
        {
        // val1 will never be zero since the event number starts from
1 (so, val1 in push_circ_queue() will not be zero, same case after
pop_circ_queue()), and 0 is only possible during initialization, not
during pop_circ_queue()

                DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n",
item_pop->val1, item_pop->val2);

                DEBUG_MSG(KERN_INFO "After pop, head = %u , tail =
%u\n", buffer->consumer_head, buffer->consumer_tail);

                return 0;
        }

        else {
                DEBUG_MSG(KERN_INFO "empty, nothing to pop from the ring\n");

                return 1;
        }
}



Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-01-15 17:10 fei phung
@ 2019-01-15 18:13 ` Michael S. Tsirkin
       [not found]   ` <SG2PR06MB3176FA4C12888C84990F109BC1820@SG2PR06MB3176.apcprd06.prod.outlook.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2019-01-15 18:13 UTC (permalink / raw)
  To: fei phung; +Cc: netdev, feiphung

On Wed, Jan 16, 2019 at 01:10:31AM +0800, fei phung wrote:
> Hi,
> 
> inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
> {
>         if (!ptr_ring_empty_any(buffer)) // if (not empty)
>         {
>                 DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail =
> %u\n", buffer->consumer_head, buffer->consumer_tail);
> 
>                 /* extract one item struct containing two unsigned
> integers from the buffer */
>                 *item_pop = *((struct item *)ptr_ring_consume_any(buffer));
> 
>                 DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n",
> item_pop->val1, item_pop->val2);
> 
>                 DEBUG_MSG(KERN_INFO "After pop, head = %u , tail =
> %u\n", buffer->consumer_head, buffer->consumer_tail);
> 
>                 return 0;
>         }
> 
>         else return 1; // empty, nothing to pop from the ring
> }
> 
> > https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L44
> > racy if there are multiple consumers.
> > just call ptr_ring_consume_any.
> 
> > And it seems to leak the memory behind the pointer you have
> > consumed - although it's possible it's freed elsewhere -
> 
> I will definitely just call ptr_ring_consume_any() without ptr_ring_empty_any()
> 
> Which exact line has memory leak ?
> Are you referring to   struct item * item_pop   ?

yes:

                  *item_pop = *((struct item *)ptr_ring_consume_any(buffer));

seems to discard the pointer returned.

> 
> 
> // TX (PC receive) scatter gather buffer is read.
> if (vect & (1<<((5*i)+1))) {
>         recv = 1;
> 
>         item_recv_push[sc->recv[chnl]->msgs->producer].val1 = EVENT_SG_BUF_READ;
>         item_recv_push[sc->recv[chnl]->msgs->producer].val2 = 0;
> 
>         // Keep track so the thread can handle this.
>         if (push_circ_queue(sc->recv[chnl]->msgs,
> &item_recv_push[sc->recv[chnl]->msgs->producer])) {
>                 printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf
> read msg queue full\n", sc->id, chnl);
>         }
>         DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf
> read\n", sc->id, chnl);
> }
> 
> // TX (PC receive) transaction done.
> if (vect & (1<<((5*i)+2))) {
>         recv = 1;
> 
>         item_recv_push[sc->recv[chnl]->msgs->producer].val1 = EVENT_TXN_DONE;
>         item_recv_push[sc->recv[chnl]->msgs->producer].val2 = len;
> 
>         // Read the transferred amount.
>         len = read_reg(sc, CHNL_REG(chnl, TX_TNFR_LEN_REG_OFF));
>         // Notify the thread.
>         if (push_circ_queue(sc->recv[chnl]->msgs,
> &item_recv_push[sc->recv[chnl]->msgs->producer])) {
>                 printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn done
> msg queue full\n", sc->id, chnl);
>         }
>         DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn done\n",
> sc->id, chnl);
> }
> 
> > https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L663
> > this one seems to poke at the internal producer index for its own
> > purposes. That's not something I expected when I built ptr_ring
> 
> If I do not use ->producer in the above multiple (due to multiple if()
> clauses) sequential push_circ_queue() operations, what else can I use
> other than ->producer to index the data for item_recv_push?
> Probably a simple integer index will do.


maybe

> 
> 
> 
> > https://i.imgur.com/xWJOH1G.png
> > I am having problem getting proper ptr_ring operation where one
> > ptr_ring entry (val1=2 for item_recv_pop) is missing from the void ** queue
> 
> This "ptr_ring packet" drop does not make any sense to me.
> 
> Regards,
> Phung


I suspect that you overwrite an entry in your data structure
due to a race wrt producer index access.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
@ 2019-01-15 17:10 fei phung
  2019-01-15 18:13 ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-01-15 17:10 UTC (permalink / raw)
  To: netdev, mst, feiphung

Hi,

inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
{
        if (!ptr_ring_empty_any(buffer)) // if (not empty)
        {
                DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail =
%u\n", buffer->consumer_head, buffer->consumer_tail);

                /* extract one item struct containing two unsigned
integers from the buffer */
                *item_pop = *((struct item *)ptr_ring_consume_any(buffer));

                DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n",
item_pop->val1, item_pop->val2);

                DEBUG_MSG(KERN_INFO "After pop, head = %u , tail =
%u\n", buffer->consumer_head, buffer->consumer_tail);

                return 0;
        }

        else return 1; // empty, nothing to pop from the ring
}

> https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L44
> racy if there are multiple consumers.
> just call ptr_ring_consume_any.

> And it seems to leak the memory behind the pointer you have
> consumed - although it's possible it's freed elsewhere -

I will definitely just call ptr_ring_consume_any() without ptr_ring_empty_any()

Which exact line has the memory leak?
Are you referring to   struct item * item_pop   ?




// TX (PC receive) scatter gather buffer is read.
if (vect & (1<<((5*i)+1))) {
        recv = 1;

        item_recv_push[sc->recv[chnl]->msgs->producer].val1 = EVENT_SG_BUF_READ;
        item_recv_push[sc->recv[chnl]->msgs->producer].val2 = 0;

        // Keep track so the thread can handle this.
        if (push_circ_queue(sc->recv[chnl]->msgs,
                            &item_recv_push[sc->recv[chnl]->msgs->producer])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv sg buf read msg queue full\n",
                       sc->id, chnl);
        }
        DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv sg buf read\n",
                  sc->id, chnl);
}

// TX (PC receive) transaction done.
if (vect & (1<<((5*i)+2))) {
        recv = 1;

        item_recv_push[sc->recv[chnl]->msgs->producer].val1 = EVENT_TXN_DONE;
        item_recv_push[sc->recv[chnl]->msgs->producer].val2 = len;

        // Read the transferred amount.
        len = read_reg(sc, CHNL_REG(chnl, TX_TNFR_LEN_REG_OFF));
        // Notify the thread.
        if (push_circ_queue(sc->recv[chnl]->msgs,
                            &item_recv_push[sc->recv[chnl]->msgs->producer])) {
                printk(KERN_ERR "riffa: fpga:%d chnl:%d, recv txn done msg queue full\n",
                       sc->id, chnl);
        }
        DEBUG_MSG(KERN_INFO "riffa: fpga:%d chnl:%d, recv txn done\n",
                  sc->id, chnl);
}

> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L663
> this one seems to poke at the internal producer index for its own
> purposes. That's not something I expected when I built ptr_ring

If I do not use ->producer in the above multiple (due to multiple if()
clauses) sequential push_circ_queue() operations, what else can I use
other than ->producer to index the data for item_recv_push?
Probably a simple integer index will do.




> https://i.imgur.com/xWJOH1G.png
> I am having problem getting proper ptr_ring operation where one
> ptr_ring entry (val1=2 for item_recv_pop) is missing from the void ** queue

This "ptr_ring packet" drop does not make any sense to me.

Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question on ptr_ring linux header
  2019-01-15  4:33 fei phung
@ 2019-01-15 12:48 ` Michael S. Tsirkin
  0 siblings, 0 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2019-01-15 12:48 UTC (permalink / raw)
  To: fei phung; +Cc: netdev, majordomo, feiphung

On Tue, Jan 15, 2019 at 12:33:28PM +0800, fei phung wrote:
> Hi netdev mailing list and Michael,
> 
> I am having problem getting proper ptr_ring operation where one
> ptr_ring entry (val1=2 for item_recv_pop) is missing from the void **
> queue
> 
> Did I initialize the ring pointers (ptr_ring_init()) correctly ?
> 
> See the following for more context:
> 
> https://i.imgur.com/xWJOH1G.png
> https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L663

Didn't read in depth but
this one seems to poke at the internal producer index for its own
purposes. That's not something I expected when I built ptr_ring so
I don't know how well that will work. You might want to
try and come up with a reasonable API for that instead.


> https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L44
> 


Hard to say for sure but things like that

inline int push_circ_queue(struct ptr_ring * buffer, struct item * item_push)
{ 
	if (!ptr_ring_full_any(buffer)) // if (not full)
	{
		DEBUG_MSG(KERN_INFO "Pushing %u and %u\n", item_push->val1, item_push->val2);

		/* insert one item into the buffer */
	    ptr_ring_produce_any(buffer, item_push);


are a waste, and they can be racy with multiple producers.


Just call ptr_ring_produce_any and it will tell you whether it
could produce.
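
For instance, something like this (an untested sketch of what I mean):

inline int push_circ_queue(struct ptr_ring * buffer, struct item * item_push)
{
	/* 0 on success, non-zero if the ring is full - no separate fullness check needed */
	return ptr_ring_produce_any(buffer, item_push);
}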


same here:

inline int pop_circ_queue(struct ptr_ring * buffer, struct item * item_pop)
{
	if (!ptr_ring_empty_any(buffer)) // if (not empty)
	{
		DEBUG_MSG(KERN_INFO "Before pop, head = %u , tail = %u\n", buffer->consumer_head, buffer->consumer_tail);

		/* extract one item struct containing two unsigned integers from the buffer */
		*item_pop = *((struct item *)ptr_ring_consume_any(buffer));

		DEBUG_MSG(KERN_INFO "val1 = %u , val2 = %u\n", item_pop->val1, item_pop->val2);	    

		DEBUG_MSG(KERN_INFO "After pop, head = %u , tail = %u\n", buffer->consumer_head, buffer->consumer_tail);

		return 0;
	}

	else return 1; // empty, nothing to pop from the ring
}

racy if there are multiple consumers.
just call ptr_ring_consume_any.

And it seems to leak the memory behind the pointer you have
consumed - although it's possible it's freed elsewhere -
I wouldn't know.
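
If the entries were kmalloc'ed by whoever produces them, the consumer
side would normally look something like this (a sketch only, since I
don't know how your items are allocated):

	struct item *p = ptr_ring_consume_any(buffer);

	if (p) {
		*item_pop = *p;	/* copy the payload out */
		kfree(p);	/* release the consumed entry */
		return 0;
	}
	return 1;		/* empty */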



> struct ptr_ring * init_circ_queue(int len)
> {
>         struct ptr_ring * q;
> 
>         q = kzalloc(sizeof(struct ptr_ring), GFP_KERNEL);
>         if (q == NULL) {
>                 DEBUG_MSG(KERN_ERR "Not enough memory to allocate ptr_ring");
>                 return NULL;
>         }
> 
>         // creates an array of length 'len' where each array location
> can store a struct * item
>         if(ptr_ring_init(q, len, GFP_KERNEL) != 0) {
>                 DEBUG_MSG(KERN_ERR "Not enough memory to allocate
> ptr_ring array");
>                 return NULL;
>         }
> 
>         return q;
> }
> 


This part looks good.


> 
>                 while ((nomsg = pop_circ_queue(sc->recv[chnl]->msgs,
> &item_recv_pop))) {
>                         prepare_to_wait(&sc->recv[chnl]->waitq, &wait,
> TASK_INTERRUPTIBLE);
>                         // Another check before we schedule.
>                         if ((nomsg =
> pop_circ_queue(sc->recv[chnl]->msgs, &item_recv_pop)))
>                                 tymeout = schedule_timeout(tymeout);
>                         finish_wait(&sc->recv[chnl]->waitq, &wait);
>                         if (signal_pending(current)) {
>                                 free_sg_buf(sc, sc->recv[chnl]->sg_map_0);
>                                 free_sg_buf(sc, sc->recv[chnl]->sg_map_1);
>                                 return -ERESTARTSYS;
>                         }
>                         if (!nomsg)
>                                 break;
>                         if (tymeout == 0) {
>                                 printk(KERN_ERR "riffa: fpga:%d
> chnl:%d, recv timed out\n", sc->id, chnl);
>                                 /*free_sg_buf(sc, sc->recv[chnl]->sg_map_0);
>                                 free_sg_buf(sc, sc->recv[chnl]->sg_map_1);
>                                 return (unsigned int)(recvd>>2);*/
>                         }
>                 }
>                 tymeout = tymeouto;
>                 msg_type = item_recv_pop.val1;
>                 msg = item_recv_pop.val2;
>                 DEBUG_MSG(KERN_INFO "recv msg_type: %u\n", msg_type);
> 
> 
> Regards,
> Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Question on ptr_ring linux header
@ 2019-01-15  4:33 fei phung
  2019-01-15 12:48 ` Michael S. Tsirkin
  0 siblings, 1 reply; 13+ messages in thread
From: fei phung @ 2019-01-15  4:33 UTC (permalink / raw)
  To: netdev, majordomo, mst, feiphung

Hi netdev mailing list and Michael,

I am having a problem getting proper ptr_ring operation: one ptr_ring
entry (val1=2 for item_recv_pop) is missing from the void ** queue.

Did I initialize the ring pointers (ptr_ring_init()) correctly?

See the following for more context:

https://i.imgur.com/xWJOH1G.png
https://gist.github.com/promach/7716ee8addcaa33fda140d74d1ad94d6#file-riffa_driver-c-L663
https://gist.github.com/promach/65e9331d55a43a2815239430a28e29c6#file-circ_ring-c-L44


struct ptr_ring * init_circ_queue(int len)
{
        struct ptr_ring * q;

        q = kzalloc(sizeof(struct ptr_ring), GFP_KERNEL);
        if (q == NULL) {
                DEBUG_MSG(KERN_ERR "Not enough memory to allocate ptr_ring");
                return NULL;
        }

        // creates an array of length 'len' where each array location
        // can store a struct item * pointer
        if(ptr_ring_init(q, len, GFP_KERNEL) != 0) {
                DEBUG_MSG(KERN_ERR "Not enough memory to allocate ptr_ring array");
                return NULL;
        }

        return q;
}



                while ((nomsg = pop_circ_queue(sc->recv[chnl]->msgs,
                                               &item_recv_pop))) {
                        prepare_to_wait(&sc->recv[chnl]->waitq, &wait,
                                        TASK_INTERRUPTIBLE);
                        // Another check before we schedule.
                        if ((nomsg = pop_circ_queue(sc->recv[chnl]->msgs,
                                                    &item_recv_pop)))
                                tymeout = schedule_timeout(tymeout);
                        finish_wait(&sc->recv[chnl]->waitq, &wait);
                        if (signal_pending(current)) {
                                free_sg_buf(sc, sc->recv[chnl]->sg_map_0);
                                free_sg_buf(sc, sc->recv[chnl]->sg_map_1);
                                return -ERESTARTSYS;
                        }
                        if (!nomsg)
                                break;
                        if (tymeout == 0) {
                                printk(KERN_ERR "riffa: fpga:%d
chnl:%d, recv timed out\n", sc->id, chnl);
                                /*free_sg_buf(sc, sc->recv[chnl]->sg_map_0);
                                free_sg_buf(sc, sc->recv[chnl]->sg_map_1);
                                return (unsigned int)(recvd>>2);*/
                        }
                }
                tymeout = tymeouto;
                msg_type = item_recv_pop.val1;
                msg = item_recv_pop.val2;
                DEBUG_MSG(KERN_INFO "recv msg_type: %u\n", msg_type);


Regards,
Phung

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-03-01  3:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-17  3:51 Question on ptr_ring linux header fei phung
2019-01-31  5:16 ` fei phung
2019-01-31 14:39   ` Michael S. Tsirkin
2019-02-01  8:12     ` fei phung
2019-02-01 15:36       ` Michael S. Tsirkin
2019-02-15  3:03         ` fei phung
2019-03-01  3:20           ` fei phung
  -- strict thread matches above, loose matches on Subject: below --
2019-01-16  7:00 fei phung
2019-01-15 17:10 fei phung
2019-01-15 18:13 ` Michael S. Tsirkin
     [not found]   ` <SG2PR06MB3176FA4C12888C84990F109BC1820@SG2PR06MB3176.apcprd06.prod.outlook.com>
2019-01-16 14:09     ` Michael S. Tsirkin
2019-01-15  4:33 fei phung
2019-01-15 12:48 ` Michael S. Tsirkin
