On 2021-12-02 7:43 a.m., Mike Lothian wrote:
On Thu, 25 Nov 2021 at 20:42, Felix Kuehling <felix.kuehling@amd.com> wrote:
OK. Dealing with processed timestamp rather than decoded timestamp fixes
the race condition where drain would have returned before the last fault
was processed. But we're still assuming that each interrupt has a unique
timestamp. We haven't seen evidence to the contrary, but I'd like to add
a check to be sure, in case that assumption becomes invalid on future
hardware. Something like this (before updating ih->processed_timestamp):

    WARN_ONCE(ih->processed_timestamp == entry.timestamp,
              "IH timestamps are not unique");

With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

Hi,

I'm seeing this on my machine
https://gitlab.freedesktop.org/drm/amd/-/issues/1818

Thanks for the report. The warning is triggered on RENOIR, right after device init completes. The assumption that each IH vector has a unique timestamp is invalid, so I will submit another patch to fix the drain fault interrupt logic and remove this WARN_ONCE.
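
For illustration only, here is a minimal userspace sketch of the uniqueness check discussed above. It is not the amdgpu code: struct ih_sketch, ih_sketch_process() and ih_sketch_count_duplicates() are hypothetical names, a counter stands in for WARN_ONCE, and the seen_any flag is an added assumption so that a first entry with timestamp 0 is not counted as a duplicate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IH ring state; not the amdgpu struct. */
struct ih_sketch {
    uint64_t processed_timestamp; /* timestamp of the last processed entry */
    bool     seen_any;            /* false until the first entry is processed */
    unsigned duplicate_count;     /* how often the uniqueness check fired */
};

/* Process one entry: run the uniqueness check, then record the timestamp. */
void ih_sketch_process(struct ih_sketch *ih, uint64_t entry_timestamp)
{
    /* This is the point where the proposed WARN_ONCE would fire. */
    if (ih->seen_any && ih->processed_timestamp == entry_timestamp)
        ih->duplicate_count++;

    ih->processed_timestamp = entry_timestamp;
    ih->seen_any = true;
}

/* Feed a batch of timestamps and return how many duplicates were detected. */
unsigned ih_sketch_count_duplicates(const uint64_t *ts, size_t n)
{
    struct ih_sketch ih = { 0 };

    for (size_t i = 0; i < n; i++)
        ih_sketch_process(&ih, ts[i]);

    return ih.duplicate_count;
}
```

With strictly increasing timestamps the counter stays at zero. As soon as two consecutive entries carry the same timestamp the counter becomes nonzero, which is the situation the warning flagged in the report above.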

Regards,

Philip

Cheers

Mike