All of lore.kernel.org
 help / color / mirror / Atom feed
From: Corey Minyard <minyard@acm.org>
To: "Wang, Haiyue" <haiyue.wang@linux.intel.com>,
	joel@jms.id.au, avifishman70@gmail.com, openbmc@lists.ozlabs.org,
	openipmi-developer@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Cc: andriy.shevchenko@intel.com
Subject: Re: [PATCH arm/aspeed/ast2500 v2] ipmi: add an Aspeed KCS IPMI BMC driver
Date: Tue, 30 Jan 2018 19:52:41 -0600	[thread overview]
Message-ID: <bc5db9c7-a290-b49e-0217-879817c8cc87@acm.org> (raw)
In-Reply-To: <ba4afbb0-5f52-5f06-9822-630dca7e0e45@linux.intel.com>

On 01/30/2018 07:37 PM, Wang, Haiyue wrote:
>
>
> On 2018-01-31 09:25, Corey Minyard wrote:
>> On 01/30/2018 07:02 PM, Wang, Haiyue wrote:
>>>
>>>
>>> On 2018-01-31 08:52, Corey Minyard wrote:
>>>> On 01/30/2018 06:02 PM, Wang, Haiyue wrote:
>>>>>
>>>>>
>>>>> On 2018-01-30 21:49, Corey Minyard wrote:
>>>>>> On 01/29/2018 07:57 AM, Wang, Haiyue wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2018-01-26 22:48, Corey Minyard wrote:
>>>>>>>> On 01/26/2018 12:08 AM, Wang, Haiyue wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2018-01-25 01:48, Corey Minyard wrote:
>>>>>>>>>> On 01/24/2018 10:06 AM, Haiyue Wang wrote:
>>>>>>>>>>> The KCS (Keyboard Controller Style) interface is used to 
>>>>>>>>>>> perform in-band
>>>>>>>>>>> IPMI communication between a server host and its BMC 
>>>>>>>>>>> (BaseBoard Management
>>>>>>>>>>> Controllers).
>>>>>>>>>>>
>>>>>>>>>>> This driver exposes the KCS interface on ASpeed SOCs 
>>>>>>>>>>> (AST2400 and AST2500)
>>>>>>>>>>> as a character device. Such SOCs are commonly used as BMCs 
>>>>>>>>>>> and this driver
>>>>>>>>>>> implements the BMC side of the KCS interface.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> v1->v2
>>>>>>>>>>>
>>>>>>>>>>> - Divide the driver into two parts, one handles the BMC KCS 
>>>>>>>>>>> IPMI 2.0 state;
>>>>>>>>>>>    the other handles the BMC KCS controller such as AST2500 
>>>>>>>>>>> IO accessing.
>>>>>>>>>>> - Use the spin lock APIs to handle the device file 
>>>>>>>>>>> operations and BMC chip
>>>>>>>>>>>    IRQ inferface for accessing the same KCS BMC data structure.
>>>>>>>>>>> - Enhanced the phases handling of the KCS BMC.
>>>>>>>>>>> - Unified the IOCTL definition for IPMI BMC, it will be used 
>>>>>>>>>>> by KCS and BT.
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> +
>>>>>>>>>>> +static void kcs_bmc_handle_data(struct kcs_bmc *kcs_bmc)
>>>>>>>>>>> +{
>>>>>>>>>>> +    u8 data;
>>>>>>>>>>> +
>>>>>>>>>>> +    switch (kcs_bmc->phase) {
>>>>>>>>>>> +    case KCS_PHASE_WRITE:
>>>>>>>>>>> +        set_state(kcs_bmc, WRITE_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        /* set OBF before reading data */
>>>>>>>>>>> +        write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +
>>>>>>>>>>> +        if (kcs_bmc->data_in_idx < KCS_MSG_BUFSIZ)
>>>>>>>>>>> + kcs_bmc->data_in[kcs_bmc->data_in_idx++] =
>>>>>>>>>>> +                        read_data(kcs_bmc);
>>>>>>>>
>>>>>>>> I missed this earlier, you need to issue a length error if the 
>>>>>>>> data is too large.
>>>>>>>>
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_PHASE_WRITE_END:
>>>>>>>>>>> +        set_state(kcs_bmc, READ_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        if (kcs_bmc->data_in_idx < KCS_MSG_BUFSIZ)
>>>>>>>>>>> + kcs_bmc->data_in[kcs_bmc->data_in_idx++] =
>>>>>>>>>>> +                        read_data(kcs_bmc);
>>>>>>>>>>> +
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_WAIT_READ;
>>>>>>>>>>> +        if (kcs_bmc->running) {
>>>>>>>>>>
>>>>>>>>>> Why do you only do this when running is set?  It won't hurt 
>>>>>>>>>> anything if it's not
>>>>>>>>>> set.  As it is, you have a race if something opens the device 
>>>>>>>>>> while this code
>>>>>>>>>> runs.
>>>>>>>>>>
>>>>>>>>>> Also, don't set the state to wait read until the "write" has 
>>>>>>>>>> finished (userland has
>>>>>>>>>> read the data out of the buffer.  More on that later.
>>>>>>>>>>
>>>>>>>>> Understood.
>>>>>>>>>>> + kcs_bmc->data_in_avail = true;
>>>>>>>>>>> + wake_up_interruptible(&kcs_bmc->queue);
>>>>>>>>>>> +        }
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_PHASE_READ:
>>>>>>>>>>> +        if (kcs_bmc->data_out_idx == kcs_bmc->data_out_len)
>>>>>>>>>>> +            set_state(kcs_bmc, IDLE_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        data = read_data(kcs_bmc);
>>>>>>>>>>> +        if (data != KCS_CMD_READ_BYTE) {
>>>>>>>>>>> +            set_state(kcs_bmc, ERROR_STATE);
>>>>>>>>>>> +            write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        }
>>>>>>>>>>> +
>>>>>>>>>>> +        if (kcs_bmc->data_out_idx == kcs_bmc->data_out_len) {
>>>>>>>>>>> +            write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +            kcs_bmc->phase = KCS_PHASE_IDLE;
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        }
>>>>>>>>>>> +
>>>>>>>>>>> +        write_data(kcs_bmc,
>>>>>>>>>>> + kcs_bmc->data_out[kcs_bmc->data_out_idx++]);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_PHASE_ABORT_ERROR1:
>>>>>>>>>>> +        set_state(kcs_bmc, READ_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        /* Read the Dummy byte */
>>>>>>>>>>> +        read_data(kcs_bmc);
>>>>>>>>>>> +
>>>>>>>>>>> +        write_data(kcs_bmc, kcs_bmc->error);
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_ABORT_ERROR2;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_PHASE_ABORT_ERROR2:
>>>>>>>>>>> +        set_state(kcs_bmc, IDLE_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        /* Read the Dummy byte */
>>>>>>>>>>> +        read_data(kcs_bmc);
>>>>>>>>>>> +
>>>>>>>>>>> +        write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_IDLE;
>>>>>>>>>>> +
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    default:
>>>>>>>>>>> +        set_state(kcs_bmc, ERROR_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +        /* Read the Dummy byte */
>>>>>>>>>>> +        read_data(kcs_bmc);
>>>>>>>>>>> +
>>>>>>>>>>> +        write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +    }
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void kcs_bmc_handle_command(struct kcs_bmc *kcs_bmc)
>>>>>>>>>>> +{
>>>>>>>>>>> +    u8 cmd;
>>>>>>>>>>> +
>>>>>>>>>>> +    set_state(kcs_bmc, WRITE_STATE);
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Dummy data to generate OBF */
>>>>>>>>>>> +    write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +
>>>>>>>>>>> +    cmd = read_data(kcs_bmc);
>>>>>>>>>>
>>>>>>>>>> Shouldn't you check the phase in all the cases below and do 
>>>>>>>>>> error
>>>>>>>>>> handling if the phase isn't correct?
>>>>>>>>>>
>>>>>>>>>> Similar thing if the device here isn't open. You need to handle
>>>>>>>>>> that gracefully.
>>>>>>>>>>
>>>>>>>>>> Also, you should remove data_in_avail and data_in_idx setting 
>>>>>>>>>> from
>>>>>>>>>> here, for reasons I will explain later.
>>>>>>>>>>
>>>>>>>>> If host software sends the data twice such as a retry before 
>>>>>>>>> the BMC's IPMI service starts,
>>>>>>>>> then the two IPMI requests will be merged into one, if not 
>>>>>>>>> clear data_in_idx after receving
>>>>>>>>> KCS_CMD_WRITE_START. Most of the states are driven by host 
>>>>>>>>> software (SMS). :(
>>>>>>>>
>>>>>>>> True, but what if the host issues WRITE_START or a WRITE_END 
>>>>>>>> while this driver is in read
>>>>>>>> state?  The spec is unclear on this, but it really only makes 
>>>>>>>> sense for the host to issue
>>>>>>>> WRITE_START in idle stat and WRITE_END in write state. IMHO it 
>>>>>>>> should go to error
>>>>>>>> state.  You might make the case that a WRITE_START anywhere 
>>>>>>>> restarts the transaction,
>>>>>>>> but the feel of the error state machine kind of goes against 
>>>>>>>> that. WRITE_END is definitely
>>>>>>>> wrong anywhere but write state.
>>>>>>>>
>>>>>>>> I just found the following in the spec (section 9.12):
>>>>>>>>
>>>>>>>>    Thus, since the interface will allow a command transfer to be
>>>>>>>>    started or restarted
>>>>>>>>    at any time when the input buffer is empty, software could 
>>>>>>>> elect to
>>>>>>>>    simply retry
>>>>>>>>    the command upon detecting an error condition, or issue a 
>>>>>>>> ‘known good’
>>>>>>>>    command in order to clear ERROR_STATE
>>>>>>>>
>>>>>>>> So a WRITE_START anywhere is ok.  A WRITE_END in the wrong 
>>>>>>>> state should probably
>>>>>>>> still go to error state.  This means the user needs to be able 
>>>>>>>> to handle a write error at
>>>>>>>> any time.  It also means it's very important to make sure the 
>>>>>>>> user does a read before
>>>>>>>> doing a write.  If the host re-issues a WRITE_START and writes 
>>>>>>>> a new command
>>>>>>>> between the time the use reads the data and writes the 
>>>>>>>> response, the response would
>>>>>>>> be for the wrong command.
>>>>>>>>
>>>>>>>>>>> +    switch (cmd) {
>>>>>>>>>>> +    case KCS_CMD_WRITE_START:
>>>>>>>>>>> +        kcs_bmc->data_in_avail = false;
>>>>>>>>>>> +        kcs_bmc->data_in_idx   = 0;
>>>>>>>>>>> +        kcs_bmc->phase         = KCS_PHASE_WRITE;
>>>>>>>>>>> +        kcs_bmc->error         = KCS_NO_ERROR;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_CMD_WRITE_END:
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_WRITE_END;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_CMD_ABORT:
>>>>>>>>>>> +        if (kcs_bmc->error == KCS_NO_ERROR)
>>>>>>>>>>> +            kcs_bmc->error = KCS_ABORTED_BY_COMMAND;
>>>>>>>>>>> +
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_ABORT_ERROR1;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    default:
>>>>>>>>>>> +        kcs_bmc->error = KCS_ILLEGAL_CONTROL_CODE;
>>>>>>>>>>> +        set_state(kcs_bmc, ERROR_STATE);
>>>>>>>>>>> +        write_data(kcs_bmc, kcs_bmc->error);
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_ERROR;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +    }
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +int kcs_bmc_handle_event(struct kcs_bmc *kcs_bmc)
>>>>>>>>>>> +{
>>>>>>>>>>> +    unsigned long flags;
>>>>>>>>>>> +    int ret = 0;
>>>>>>>>>>> +    u8 status;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irqsave(&kcs_bmc->lock, flags);
>>>>>>>>>>> +
>>>>>>>>>>> +    status = read_status(kcs_bmc) & (KCS_STATUS_IBF | 
>>>>>>>>>>> KCS_STATUS_CMD_DAT);
>>>>>>>>>>> +
>>>>>>>>>>> +    switch (status) {
>>>>>>>>>>> +    case KCS_STATUS_IBF | KCS_STATUS_CMD_DAT:
>>>>>>>>>>> +        kcs_bmc_handle_command(kcs_bmc);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case KCS_STATUS_IBF:
>>>>>>>>>>> +        kcs_bmc_handle_data(kcs_bmc);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    default:
>>>>>>>>>>> +        ret = -1;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> + spin_unlock_irqrestore(&kcs_bmc->lock, flags);
>>>>>>>>>>> +
>>>>>>>>>>> +    return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL(kcs_bmc_handle_event);
>>>>>>>>>>> +
>>>>>>>>>>> +static inline struct kcs_bmc *file_kcs_bmc(struct file *filp)
>>>>>>>>>>> +{
>>>>>>>>>>> +    return container_of(filp->private_data, struct kcs_bmc, 
>>>>>>>>>>> miscdev);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int kcs_bmc_open(struct inode *inode, struct file 
>>>>>>>>>>> *filp)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +    int ret = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    if (!kcs_bmc->running) {
>>>>>>>>>>> +        kcs_bmc->running       = 1;
>>>>>>>>>>> +        kcs_bmc->phase         = KCS_PHASE_IDLE;
>>>>>>>>>>> +        kcs_bmc->data_in_avail = false;
>>>>>>>>>>
>>>>>>>>>> If you do everything right, setting the phase and 
>>>>>>>>>> data_in_avail should not
>>>>>>>>>> be necessary here.
>>>>>>>>>>
>>>>>>>>>>> +    } else {
>>>>>>>>>>> +        ret = -EBUSY;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static unsigned int kcs_bmc_poll(struct file *filp, 
>>>>>>>>>>> poll_table *wait)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +    unsigned int mask = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +    poll_wait(filp, &kcs_bmc->queue, wait);
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    if (kcs_bmc->data_in_avail)
>>>>>>>>>>> +        mask |= POLLIN;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return mask;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static ssize_t kcs_bmc_read(struct file *filp, char *buf,
>>>>>>>>>>> +                size_t count, loff_t *offset)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +    ssize_t ret = -EAGAIN;
>>>>>>>>>>> +
>>>>>>>>>>
>>>>>>>>>> This function still has some issues.
>>>>>>>>>>
>>>>>>>>>> You can't call copy_to_user() with a spinlock held or 
>>>>>>>>>> interrupts disabled.
>>>>>>>>>> To handle readers, you probably need a separate mutex.
>>>>>>>>>>
>>>>>>>>>> Also, this function can return -EAGAIN even if O_NONBLOCK is 
>>>>>>>>>> not set if
>>>>>>>>>> kcs_bmc->data_in_avail changes between when you wait on the 
>>>>>>>>>> event
>>>>>>>>>> and when you check it under the lock.
>>>>>>>>>>
>>>>>>>>>> You also clear data_in_avail even if the copy_to_user() 
>>>>>>>>>> fails, which is
>>>>>>>>>> wrong.
>>>>>>>>>>
>>>>>>>>>> I believe the best way to handle this would be to have the 
>>>>>>>>>> spinlock
>>>>>>>>>> protect the inner workings of the state machine and a mutex 
>>>>>>>>>> handle
>>>>>>>>>> copying data out, setting/clearing the running flag (thus a 
>>>>>>>>>> mutex
>>>>>>>>>> instead of spinlock in open and release) and the ioctl 
>>>>>>>>>> settings (except
>>>>>>>>>> for abort where you will need to grab the spinlock).
>>>>>>>>>>
>>>>>>>>>> After the wait event below, grab the mutex.  If data is not 
>>>>>>>>>> available
>>>>>>>>>> and O_NONBLOCK is not set, drop the mutex and retry. Otherwise
>>>>>>>>>> this is the only place (besides release) that sets 
>>>>>>>>>> data_in_avail to false.
>>>>>>>>>> Do the copy_to_user(), grab the spinlock, clear data_in_avail 
>>>>>>>>>> and
>>>>>>>>>> data_in_idx, then release the lock and mutex. If you are really
>>>>>>>>>> adventurous you can do this without grabbing the lock using
>>>>>>>>>> barriers, but it's probably not necessary here.
>>>>>>>>>>
>>>>>>>>
>>>>>>>> With the state machine being able to be restarted at any time, 
>>>>>>>> you need
>>>>>>>> something a little different here.  You still need the mutex to 
>>>>>>>> handle
>>>>>>>> multiple readers and the copy.  I think the function should be 
>>>>>>>> something
>>>>>>>> like:
>>>>>>>>
>>>>>>> Since KCS is not a multi-reader protocol from BMC's view, you 
>>>>>>> makes things complex. :-)
>>>>>>
>>>>>> No, I don't think you understand.  The primary purpose of the 
>>>>>> complexity
>>>>>> here is to protect the driver from the host system (on the other 
>>>>>> side of
>>>>>> the KCS interface).  Without this protection, it is possible for 
>>>>>> the host
>>>>>> system to start a new write while the user on the BMC side is 
>>>>>> reading
>>>>>> data out, resulting in corrupt data being read.
>>>>>>
>>>>>> I haven't thought too much about this.  There may be a simpler way,
>>>>>> but the protection needs to be there.
>>>>>>
>>>>>> And you may not think you need to protect the driver against a
>>>>>> malicious BMC side user code, but you would be wrong. You can
>>>>>> only have one opener, but with threads or a fork you can have
>>>>>> multiple readers.  And you don't know if a malicious piece of
>>>>>> code has taken over userland.  You always need to protect the
>>>>>> kernel.
>>>>>>
>>>>> Sure, the read/write have protected the critical data area with 
>>>>> IRQ, and also, these
>>>>> functions should be thread local safe I believe.
>>>>>
>>>>> spin_lock_irq(&kcs_bmc->lock);
>>>>> ...
>>>>> spin_unlock_irq(&kcs_bmc->lock);
>>>>>
>>>>
>>>> But remember, you can't call copy_to_user() when IRQs are off or 
>>>> when you are holding
>>>> a spinlock.  That is an absolute no.  It can crash the kernel.
>>>>
>>>> So you need a design that takes this into account, but will not 
>>>> result in the possibility
>>>> of bad data being read.
>>>>
>>> Yes, sure, as I said before: access_ok(VERIFY_WRITE, to, n), then 
>>> memcpy in spin_lock.
>>
>> Where did you get the idea that this was ok?  It's not. access_ok() 
>> is not actually very
>> useful, since the permissions on memory can change at any time unless 
>> you are holding
>> the mm lock, which is also not an ok thing to do.  It is entirely 
>> possible for access_ok()
>> to pass and copy_to_user() to fail.
>>
> I thought memcpy will not fail. :(

Oh, memcpy won't fail as long as the source and destination is kernel 
memory.
I was a little confused by the access_ok() thing, it's common for people to
assume that if they do access_ok(), that copy_to_user() won't fail.

>> I'm not exactly sure what you are saying, though.  In any event, a 
>> well-designed read()/write()
>> operation should leave the system unchanged if it gets an error.
>>
> I saw BT use a local buffer, If I change the '#define 
> KCS_MSG_BUFSIZ    1024' to ".. 512", should it be OK
> as BT ?
>
> static ssize_t bt_bmc_read(struct file *file, char __user *buf,
>                size_t count, loff_t *ppos)
> {
>     struct bt_bmc *bt_bmc = file_bt_bmc(file);
>     u8 len;
>     int len_byte = 1;
>     u8 kbuffer[BT_BMC_BUFFER_SIZE];  --> #define BT_BMC_BUFFER_SIZE 256

It's good practice to keep larger things off the stack, which is why I 
dynamically
allocated it.  But if you have a mutex, you can put that buffer in 
struct bt_bmc
since it would only be accessed when holding the mutex.

>
>> -corey
>>
>>>>>>>>    static ssize_t kcs_bmc_read(struct file *filp, char *buf,
>>>>>>>>                     size_t count, loff_t *offset)
>>>>>>>>    {
>>>>>>>>         struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>         ssize_t ret;
>>>>>>>>         bool avail;
>>>>>>>>         size_t data_size;
>>>>>>>>         u8 *data;
>>>>>>>>
>>>>>>>>         data = kmalloc(KCS_MSG_BUFSIZ, GFP_KERNEL);
>>>>>>>>         if (!data)
>>>>>>>>             return -ENOMEM;
>>>>>>>>
>>>>>>>>    retry:
>>>>>>>>         ret = -EAGAIN;
>>>>>>>>         if (!(filp->f_flags & O_NONBLOCK))
>>>>>>>> wait_event_interruptible(kcs_bmc->queue,
>>>>>>>>                          kcs_bmc->data_in_avail);
>>>>>>>>
>>>>>>>>         mutex_lock(&kcs_bmc->read_mutex);
>>>>>>>>
>>>>>>>>         spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>         avail = kcs_bmc->data_in_avail;
>>>>>>>>         if (avail) {
>>>>>>>>             memcpy(data, kcs_bmc->data_in, kcs_bmc->data_in_idx);
>>>>>>>>             data_size = kcs_bmc->data_in_idx;
>>>>>>>>         }
>>>>>>>>         spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>
>>>>>>>>         if (!avail) {
>>>>>>>>             if (filp->f_flags & O_NONBLOCK)
>>>>>>>>                 goto out_mutex_unlock;
>>>>>>>> mutex_unlock(&kcs_bmc->read_mutex);
>>>>>>>>             goto retry;
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         if (count < data_size) {
>>>>>>>>             ret = -EOVERFLOW;
>>>>>>>>              ? I'm not sure about the error, but userspace 
>>>>>>>> needs to know.
>>>>>>>>             goto out_mutex_unlock;
>>>>>>
>>>>>> Maybe a length error to the host side here?
>>>>
>>>> You didn't comment on this or the other length error.  That needs 
>>>> to be
>>>> handled.
>>>>
>>> Yes, will send a length error by following KCS spec.
>>>>>>
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         if (!copy_to_user(buf, data, data_size)) {
>>>>>>>>             ret = -EFAULT;
>>>>>>>>             goto out_mutex_unlock;
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         ret = data_size;
>>>>>>>>
>>>>>>>>         spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>
>>>>>>>>         if (kcs_bmc->phase != KCS_PHASE_WRITE_END_DONE)
>>>>>>>>             /* Something aborted or restarted the state 
>>>>>>>> machine. */
>>>>>>>>             ? Maybe restart if O_NONBLOCK is not set and 
>>>>>>>> -EAGAIN if it is?
>>>>>>>>             ret = -EIO;
>>>>>>>>         } else {
>>>>>>>>             kcs_bmc->phase = KCS_PHASE_WAIT_READ;
>>>>>>>>             kcs_bmc->data_in_avail = false;
>>>>>>>>             kcs_bmc->data_in_idx = 0;
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>
>>>>>>>>    out_mutex_unlock:
>>>>>>>>         mutex_unlock(&kcs_bmc->read_mutex);
>>>>>>>>
>>>>>>>>         kfree(data);
>>>>>>>>
>>>>>>>>         return ret;
>>>>>>>>    }
>>>>>>>> Note that I added a state, KCS_PHASE_WRITE_END_DONE, which 
>>>>>>>> would be
>>>>>>>> set after the final byte from the host is received. You want 
>>>>>>>> the read here
>>>>>>>> done before you can do the write below to avoid the race I 
>>>>>>>> talked about.
>>>>>>>>
>>>>>>>> There is a local copy made of the data.  What you *never* want 
>>>>>>>> to happen
>>>>>>>> here is for the state machine to start processing a new write 
>>>>>>>> command
>>>>>>>> while the data is being copied.  It could result in corrupt 
>>>>>>>> data being read
>>>>>>>> and some random operation being done by the BMC.
>>>>>>>>
>>>>>>>> If you want to avoid the local copy, it could be done, but it's 
>>>>>>>> more complex.
>>>>>>>>
>>>>>>>>>>> +    if (!(filp->f_flags & O_NONBLOCK))
>>>>>>>>>>> + wait_event_interruptible(kcs_bmc->queue,
>>>>>>>>>>> + kcs_bmc->data_in_avail);
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    if (kcs_bmc->data_in_avail) {
>>>>>>>>>>> +        kcs_bmc->data_in_avail = false;
>>>>>>>>>>> +
>>>>>>>>>>> +        if (count > kcs_bmc->data_in_idx)
>>>>>>>>>>> +            count = kcs_bmc->data_in_idx;
>>>>>>>>>>> +
>>>>>>>>>>> +        if (!copy_to_user(buf, kcs_bmc->data_in, count))
>>>>>>>>>>> +            ret = count;
>>>>>>>>>>> +        else
>>>>>>>>>>> +            ret = -EFAULT;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static ssize_t kcs_bmc_write(struct file *filp, const char 
>>>>>>>>>>> *buf,
>>>>>>>>>>> +                 size_t count, loff_t *offset)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +    ssize_t ret = count;
>>>>>>>>>>> +
>>>>>>>>>>> +    if (count < 1 || count > KCS_MSG_BUFSIZ)
>>>>>>>>>>> +        return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    if (kcs_bmc->phase == KCS_PHASE_WAIT_READ) {
>>>>>>>>>>> +        if (copy_from_user(kcs_bmc->data_out, buf, count)) {
>>>>>>>>>>> + spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +            return -EFAULT;
>>>>>>>>>>> +        }
>>>>>>>>>>> +
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_READ;
>>>>>>>>>>> +        kcs_bmc->data_out_idx = 1;
>>>>>>>>>>> +        kcs_bmc->data_out_len = count;
>>>>>>>>>>> +        write_data(kcs_bmc, kcs_bmc->data_out[0]);
>>>>>>>>>>> +    } else if (kcs_bmc->phase == KCS_PHASE_READ) {
>>>>>>>>>>> +        ret = -EBUSY;
>>>>>>>>>>> +    } else {
>>>>>>>>>>> +        ret = -EINVAL;
>>>>>>>>>>
>>>>>>>>>> Is there a reason you return -EINVAL here?  Why not just 
>>>>>>>>>> -EBUSY in all
>>>>>>>>>> cases?  Is there something that userland will need to do 
>>>>>>>>>> differently?
>>>>>>>>>>
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static long kcs_bmc_ioctl(struct file *filp, unsigned int cmd,
>>>>>>>>>>> +              unsigned long arg)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +    long ret = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    switch (cmd) {
>>>>>>>>>>> +    case IPMI_BMC_IOCTL_SET_SMS_ATN:
>>>>>>>>>>> +        update_status_bits(kcs_bmc, KCS_STATUS_SMS_ATN,
>>>>>>>>>>> +                        KCS_STATUS_SMS_ATN);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case IPMI_BMC_IOCTL_CLEAR_SMS_ATN:
>>>>>>>>>>> +        update_status_bits(kcs_bmc, KCS_STATUS_SMS_ATN,
>>>>>>>>>>> +                        0);
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    case IPMI_BMC_IOCTL_FORCE_ABORT:
>>>>>>>>>>> +        set_state(kcs_bmc, ERROR_STATE);
>>>>>>>>>>> +        read_data(kcs_bmc);
>>>>>>>>>>> +        write_data(kcs_bmc, KCS_ZERO_DATA);
>>>>>>>>>>> +
>>>>>>>>>>> +        kcs_bmc->phase = KCS_PHASE_ERROR;
>>>>>>>>>>> +        kcs_bmc->data_in_avail = false;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>> +    default:
>>>>>>>>>>> +        ret = -EINVAL;
>>>>>>>>>>> +        break;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int kcs_bmc_release(struct inode *inode, struct file 
>>>>>>>>>>> *filp)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
>>>>>>>>>>> +
>>>>>>>>>>
>>>>>>>>>> What happens if the device gets closed in the middle of a 
>>>>>>>>>> transaction?  That's
>>>>>>>>>> an important case to handle.  If something is in process, you 
>>>>>>>>>> need to abort it.
>>>>>>>>>>
>>>>>>>>> The device just provides the read & write data, the 
>>>>>>>>> transaction is handled in the KCS
>>>>>>>>> controller's IRQ handler.
>>>>>>>>
>>>>>>>> From the spec, section 9.14:
>>>>>>>>
>>>>>>>>    The BMC must change the status to ERROR_STATE on any 
>>>>>>>> condition where it
>>>>>>>>    aborts a command transfer in progress.
>>>>>>>>
>>>>>>>> So you need to do something here.
>>>>>>>>
>>>>>>> In practice, we do this as spec said in ipmid, NOT in driver, 
>>>>>>> driver can't handle anything, let's
>>>>>>> make it simple, thanks!
>>>>>>
>>>>>> If ipmid crashes or is killed, how does it accomplish this?
>>>>>>
>>>>> Every time ipmids (or kcsd) crashed or killed, it needs start to 
>>>>> call FORCE_ARBORT firstly, to sync with
>>>>> host side software.
>>>>>>>
>>>>>>> Whenever the BMC is reset (from power-on or a hard reset), the 
>>>>>>> State Bits are initialized to “11 - Error State”. Doing so
>>>>>>> allows SMS to detect that the BMC has been reset and that any 
>>>>>>> message in process has been terminated by the BMC.
>>>>>>
>>>>>> Right, that's fine, like it should be.  But we are not talking 
>>>>>> about a reset.
>>>>>>
>>>>> I think the final error handling solution is that kcsd (user land) 
>>>>> runs, otherwise, the host software side still got stuck. We meet
>>>>> this kind of issue, so in general, we just doesn't handle some 
>>>>> mirror errors in driver, then in kcsd, when it can provide the real
>>>>> IPMI service, it will reset the channel firstly to sync with host 
>>>>> side software.
>>>>
>>>> "Userland will do the right thing" is not very convincing to a 
>>>> kernel developer.
>>>>
>>>> Plus if the above is true, I would think that you would just want 
>>>> to hold the device
>>>> in an error state when it wasn't opened.
>>>>
>>> I understand your concern, of course, driver need handles things 
>>> well. But in fact, if a user app is truly a bad boy, it still can hang
>>> the host side: set SMS_ATN, but no message returned when software 
>>> host side requests, then host open-ipmi driver will hang, we
>>> meet this kind of error to hang the customer's host. :) In my 
>>> understanding, kcs-bmc should do the right thing about read and write,
>>> the real transaction should be handled correctly by the kcsd.
>>>
>>> And if no kcsd starts, then this kind of BMC can't be sold out. :)
>>
>> True.  I'm not as concerned about this sort of thing.  It's nicer to 
>> the host side if
>> it can detect problems quickly, but it will eventually time out.
>>
>> From what I can tell from the current design, if the BMC userland is 
>> not running,
>> the driver will step through the state machine until it hits read 
>> state, then it
>> will sit there until the host times out and aborts the operation.
>>
>> IMHO, it would be better for the host side if the driver just stayed 
>> in error state
>> if nothing had it open.  It would think the spec says that in the 
>> quote I referenced
>> above, but that quote, like many things in the IPMI spec, is fairly 
>> vague and could
>> be interpreted many ways.
>>
> Well, I will try to fix this errors as possible.
>> -corey
>>
>>
>>>> -corey
>>>>
>>>>>> -corey
>>>>>>
>>>>>>>>>>> + spin_lock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    kcs_bmc->running = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +    spin_unlock_irq(&kcs_bmc->lock);
>>>>>>>>>>> +
>>>>>>>>>>> +    return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

  reply	other threads:[~2018-01-31  1:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-24 16:06 [PATCH arm/aspeed/ast2500 v2] ipmi: add an Aspeed KCS IPMI BMC driver Haiyue Wang
2018-01-24 17:05 ` Andy Shevchenko
2018-01-24 17:05   ` Andy Shevchenko
2018-01-26  5:33   ` Wang, Haiyue
2018-01-24 17:48 ` Corey Minyard
2018-01-26  6:08   ` Wang, Haiyue
2018-01-26 14:48     ` Corey Minyard
2018-01-29 13:57       ` Wang, Haiyue
2018-01-30 13:49         ` Corey Minyard
2018-01-31  0:02           ` Wang, Haiyue
2018-01-31  0:52             ` Corey Minyard
2018-01-31  1:02               ` Wang, Haiyue
2018-01-31  1:25                 ` Corey Minyard
2018-01-31  1:37                   ` Wang, Haiyue
2018-01-31  1:52                     ` Corey Minyard [this message]
2018-01-31  2:01                       ` Wang, Haiyue
2018-01-26  6:26   ` Wang, Haiyue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bc5db9c7-a290-b49e-0217-879817c8cc87@acm.org \
    --to=minyard@acm.org \
    --cc=andriy.shevchenko@intel.com \
    --cc=avifishman70@gmail.com \
    --cc=haiyue.wang@linux.intel.com \
    --cc=joel@jms.id.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=openbmc@lists.ozlabs.org \
    --cc=openipmi-developer@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.