All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: amd_mce.c redundant if check?
@ 2014-08-21  4:08 Chip
  2014-08-22  4:55 ` Borislav Petkov
  0 siblings, 1 reply; 5+ messages in thread
From: Chip @ 2014-08-21  4:08 UTC (permalink / raw)
  To: linux-kernel

On Wed, Aug 20, 2014 at 11:18:21AM -0600, Adam Duskett wrote:

> I have recently come upon this section of code in
> arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
> unnecessary if check.
>
>
> From line 170 - 176:
>
> if (tr->set_lvt_off) {
> if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
> /* set new lvt offset */
> hi &= ~MASK_LVTOFF_HI;
> hi |= tr->lvt_off << 20;
> }
> }
>
>
> This seems like it's not actually doing anything because it's setting
> the same value that the bit-field already has to itself.

I brought this up to Adam the other day, so he posted the question to this 
list today to elicit a response from the original developer(s).  I realize 
the quickest response is to ask the original poster (Adam) to investigate 
further, such as with pen and paper, but that is not a proper response to a 
legitimate question.  Here is the #define that is referenced, and the two 
routines in question.  This is current in kernel version 3.16 in 
arch/x86/kernel/cpu/mcheck/mce_amd.c.

#define MASK_LVTOFF_HI    0x00F00000

static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 
hi)
{
        int msr = (hi & MASK_LVTOFF_HI) >> 20;

        if (apic < 0) {
                pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
                       "for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
                       b->bank, b->block, b->address, hi, lo);
                return 0;
        }

        if (apic != msr) {
                pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d 
"
                       "for bank %d, block %d (MSR%08X=0x%x%08x)\n",
                       b->cpu, apic, b->bank, b->block, b->address, hi, lo);
                return 0;
        }

        return 1;
};

/*
* Called via smp_call_function_single(), must be called with correct
* cpu affinity.
*/
static void threshold_restart_bank(void *_tr)
{
        struct thresh_restart *tr = _tr;
        u32 hi, lo;

        rdmsr(tr->b->address, lo, hi);

        if (tr->b->threshold_limit < (hi & THRESHOLD_MAX))
                tr->reset = 1;  /* limit cannot be lower than err count */

        if (tr->reset) {                /* reset err count and overflow bit 
*/
                hi =
                    (hi & ~(MASK_ERR_COUNT_HI | MASK_OVERFLOW_HI)) |
                    (THRESHOLD_MAX - tr->b->threshold_limit);
        } else if (tr->old_limit) {     /* change limit w/o reset */
                int new_count = (hi & THRESHOLD_MAX) +
                    (tr->old_limit - tr->b->threshold_limit);

                hi = (hi & ~MASK_ERR_COUNT_HI) |
                    (new_count & THRESHOLD_MAX);
        }

        /* clear IntType */
        hi &= ~MASK_INT_TYPE_HI;

        if (!tr->b->interrupt_capable)
                goto done;

        if (tr->set_lvt_off) {
                if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
                        /* set new lvt offset */
                        hi &= ~MASK_LVTOFF_HI;
                        hi |= tr->lvt_off << 20;
                }
        }

        if (tr->b->interrupt_enable)
                hi |= INT_TYPE_APIC;

done:

        hi |= MASK_COUNT_EN_HI;
        wrmsr(tr->b->address, lo, hi);
}


If one were to actually analyze the source file from which this snippet 
comes (lines 117 - 185), one would realize the call to lvt_off_valid() is 
given tr->lvt_off as the input "apic" value that is compared to the content 
in "hi" at bit positions 23:20 (MSR bits 55:52); this field is called LVT 
Offset (LVTOFF).  The value for tr->lvt_off is usually from 0 to 4, 
inclusive.  If this value is equal to the LVTOFF value in "hi", then 
lvt_off_valid() returns 1 for true.  If the value for tr->lvt_off differs 
from the LVTOFF value in "hi", then lvt_off_valid() returns 0 for false.

Now, if the return from lvt_off_valid() is false, then nothing is changed in 
"hi".  However, if the return is true, which means the value in tr->lvt_off 
is equal to the LVTOFF value in "hi", then the LVTOFF value in "hi" is 
replaced with the value in tr->lvt_off.  One has to wonder, then, why bother 
actually calling lvt_off_valid() in the first place when the end result is 
that "hi" does not change.  What is the rationale for having the code 
snippet at lines 170 - 176 when that condition check does nothing to change 
"hi"?

-- 
Chip 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: amd_mce.c redundant if check?
  2014-08-21  4:08 amd_mce.c redundant if check? Chip
@ 2014-08-22  4:55 ` Borislav Petkov
  2014-08-28 11:09   ` Robert Richter
  0 siblings, 1 reply; 5+ messages in thread
From: Borislav Petkov @ 2014-08-22  4:55 UTC (permalink / raw)
  To: Chip, Robert Richter; +Cc: linux-kernel

Hi,

first of all, please remember to hit Reply-to-all when replying to mails
on lkml otherwise your note might get lost in the flood.

On Wed, Aug 20, 2014 at 10:08:18PM -0600, Chip wrote:
> On Wed, Aug 20, 2014 at 11:18:21AM -0600, Adam Duskett wrote:
> 
> >I have recently come upon this section of code in
> >arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
> >unnecessary if check.
> >
> >
> >From line 170 - 176:
> >
> >if (tr->set_lvt_off) {
> >if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
> >/* set new lvt offset */
> >hi &= ~MASK_LVTOFF_HI;
> >hi |= tr->lvt_off << 20;
> >}
> >}
> >
> >
> >This seems like it's not actually doing anything because it's setting
> >the same value that the bit-field already has to itself.
> 
> I brought this up to Adam the other day, so he posted the question
> to this list today to elicit a response from the original
> developer(s).  I realize the quickest response is to ask the
> original poster (Adam) to investigate further, such as with pen and
> paper, but that is not a proper response to a legitimate question.
> Here is the #define that is referenced, and the two routines in
> question.  This is current in kernel version 3.16 in
> arch/x86/kernel/cpu/mcheck/mce_amd.c.
> 
> #define MASK_LVTOFF_HI    0x00F00000
> 
> static int lvt_off_valid(struct threshold_block *b, int apic, u32
> lo, u32 hi)
> {
>        int msr = (hi & MASK_LVTOFF_HI) >> 20;
> 
>        if (apic < 0) {
>                pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
>                       "for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
>                       b->bank, b->block, b->address, hi, lo);
>                return 0;
>        }
> 
>        if (apic != msr) {
>                pr_err(FW_BUG "cpu %d, invalid threshold interrupt
> offset %d "
>                       "for bank %d, block %d (MSR%08X=0x%x%08x)\n",
>                       b->cpu, apic, b->bank, b->block, b->address, hi, lo);
>                return 0;
>        }
> 
>        return 1;
> };
> 
> /*
> * Called via smp_call_function_single(), must be called with correct
> * cpu affinity.
> */
> static void threshold_restart_bank(void *_tr)
> {
>        struct thresh_restart *tr = _tr;
>        u32 hi, lo;
> 
>        rdmsr(tr->b->address, lo, hi);
> 
>        if (tr->b->threshold_limit < (hi & THRESHOLD_MAX))
>                tr->reset = 1;  /* limit cannot be lower than err count */
> 
>        if (tr->reset) {                /* reset err count and
> overflow bit */
>                hi =
>                    (hi & ~(MASK_ERR_COUNT_HI | MASK_OVERFLOW_HI)) |
>                    (THRESHOLD_MAX - tr->b->threshold_limit);
>        } else if (tr->old_limit) {     /* change limit w/o reset */
>                int new_count = (hi & THRESHOLD_MAX) +
>                    (tr->old_limit - tr->b->threshold_limit);
> 
>                hi = (hi & ~MASK_ERR_COUNT_HI) |
>                    (new_count & THRESHOLD_MAX);
>        }
> 
>        /* clear IntType */
>        hi &= ~MASK_INT_TYPE_HI;
> 
>        if (!tr->b->interrupt_capable)
>                goto done;
> 
>        if (tr->set_lvt_off) {
>                if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
>                        /* set new lvt offset */
>                        hi &= ~MASK_LVTOFF_HI;
>                        hi |= tr->lvt_off << 20;
>                }
>        }
> 
>        if (tr->b->interrupt_enable)
>                hi |= INT_TYPE_APIC;
> 
> done:
> 
>        hi |= MASK_COUNT_EN_HI;
>        wrmsr(tr->b->address, lo, hi);
> }
> 
> 
> If one were to actually analyze the source file from which this
> snippet comes (lines 117 - 185), one would realize the call to
> lvt_off_valid() is given tr->lvt_off as the input "apic" value that
> is compared to the content in "hi" at bit positions 23:20 (MSR bits
> 55:52); this field is called LVT Offset (LVTOFF).  The value for
> tr->lvt_off is usually from 0 to 4, inclusive.  If this value is
> equal to the LVTOFF value in "hi", then lvt_off_valid() returns 1
> for true.  If the value for tr->lvt_off differs from the LVTOFF
> value in "hi", then lvt_off_valid() returns 0 for false.
> 
> Now, if the return from lvt_off_valid() is false, then nothing is
> changed in "hi".  However, if the return is true, which means the
> value in tr->lvt_off is equal to the LVTOFF value in "hi", then the
> LVTOFF value in "hi" is replaced with the value in tr->lvt_off.  One
> has to wonder, then, why bother actually calling lvt_off_valid() in
> the first place when the end result is that "hi" does not change.
> What is the rationale for having the code snippet at lines 170 - 176
> when that condition check does nothing to change "hi"?

Right, I see what you mean now. This is

bbaff08dca3c ("mce, amd: Add helper functions to setup APIC")

Frankly, I'm not too worried about the overwriting the LVT offset with
the same value in the success case - that doesn't hurt anyone.

What is more interesting is what we do in the !lvt_off_valid case... I
don't think we should be writing the MSR at all at the end but just exit
early. But I've long forgotten the details of the whole APIC vectors
deal we thought up at the time.

Bob, can you please take a look?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: amd_mce.c redundant if check?
  2014-08-22  4:55 ` Borislav Petkov
@ 2014-08-28 11:09   ` Robert Richter
  0 siblings, 0 replies; 5+ messages in thread
From: Robert Richter @ 2014-08-28 11:09 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Chip, linux-kernel

On 22.08.14 06:55:16, Borislav Petkov wrote:
> On Wed, Aug 20, 2014 at 10:08:18PM -0600, Chip wrote:
> > Now, if the return from lvt_off_valid() is false, then nothing is
> > changed in "hi".  However, if the return is true, which means the
> > value in tr->lvt_off is equal to the LVTOFF value in "hi", then the
> > LVTOFF value in "hi" is replaced with the value in tr->lvt_off.  One
> > has to wonder, then, why bother actually calling lvt_off_valid() in
> > the first place when the end result is that "hi" does not change.
> > What is the rationale for having the code snippet at lines 170 - 176
> > when that condition check does nothing to change "hi"?
> 
> Right, I see what you mean now. This is
> 
> bbaff08dca3c ("mce, amd: Add helper functions to setup APIC")
> 
> Frankly, I'm not too worried about the overwriting the LVT offset with
> the same value in the success case - that doesn't hurt anyone.

The main purpose of lvt_off_valid() is to check the initial msr
settings provided by the bios and to throw a FW_BUG message if there
is anything wrong.

The MCE offset must be the same for all banks on all cores at every
time and it must be different to all other vectors, esp. IBS, on all
other cores. APIC settings also get lost after suspend/resume
(different for suspend to disk or ram), this needs to be handled too.

We have seen bioses there this was completely messed up having
different offsets for different banks or same offsets for mce and
ibs. Note that mce is used only on cpu 0 of each node while ibs is
used on every cpu of a node. This makes things even more complex as
you may not use an unused offset on a particular core since this
offset is already used on a different core of the same node.

For these reasons we track offsets in sw which is implemented in
setup_APIC_eilvt().

For mce there is no offset valid bit as for ibs and sw assumes lvt_off
to be always valid. This value is used in setup_APIC_mce() to setup
the apic. If there is already an offset for mce registered, that
current offset is returned instead of the new one, see
mce_amd_feature_init():

                        if (b.interrupt_capable) {
                                int new = (high & MASK_LVTOFF_HI) >> 20;
                                offset  = setup_APIC_mce(offset, new);
                        }

                        mce_threshold_block_init(&b, offset);

'offset' and 'new' may differ then because of a bios bug. In this case
a FW_BUG message is thrown.

Now, I agree with you that current implementation does not allow to
program different offsets than provided by the msr. So we allow to
overwrite the vector only if the value is the same which is needless
and this code may be removed. But for !lvt_off_valid which is the case
if we observe a different offset, we have a bios bug and do not expect
mces to work properly. Maybe we disable interrupts at all then.  But
so far this wasn't an issue.

The check is needed in any case to throw FW bug messages.

-Robert

> What is more interesting is what we do in the !lvt_off_valid case... I
> don't think we should be writing the MSR at all at the end but just exit
> early. But I've long forgotten the details of the whole APIC vectors
> deal we thought up at the time.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: amd_mce.c redundant if check?
  2014-08-20 17:18 Adam Duskett
@ 2014-08-20 18:57 ` Borislav Petkov
  0 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2014-08-20 18:57 UTC (permalink / raw)
  To: Adam Duskett; +Cc: linux-kernel

On Wed, Aug 20, 2014 at 11:18:21AM -0600, Adam Duskett wrote:
> I have recently come upon this section of code in
> arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
> unnecessary if check.
> 
> 
> From line 170 - 176:
> 
> if (tr->set_lvt_off) {
> if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
> /* set new lvt offset */
> hi &= ~MASK_LVTOFF_HI;
> hi |= tr->lvt_off << 20;
> }
> }
> 
> 
> This seems like it's not actually doing anything because it's setting
> the same value that the bit-field already has to itself.

Are you sure?

Just try to find out what MASK_LVTOFF_HI is and then do the operations
with a pen and paper to check whether something else actually happens.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 5+ messages in thread

* amd_mce.c redundant if check?
@ 2014-08-20 17:18 Adam Duskett
  2014-08-20 18:57 ` Borislav Petkov
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Duskett @ 2014-08-20 17:18 UTC (permalink / raw)
  To: linux-kernel

I have recently come upon this section of code in
arch/x86/kernel/cpu/mcheck/mce_amd.c that seems to be a redundant
unnecessary if check.


>From line 170 - 176:

if (tr->set_lvt_off) {
if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
/* set new lvt offset */
hi &= ~MASK_LVTOFF_HI;
hi |= tr->lvt_off << 20;
}
}


This seems like it's not actually doing anything because it's setting
the same value that the bit-field already has to itself.

Any reason for this?


Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-08-28 11:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-21  4:08 amd_mce.c redundant if check? Chip
2014-08-22  4:55 ` Borislav Petkov
2014-08-28 11:09   ` Robert Richter
  -- strict thread matches above, loose matches on Subject: below --
2014-08-20 17:18 Adam Duskett
2014-08-20 18:57 ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.