On Wed, Aug 7, 2019 at 3:27 PM John Ogness wrote:
>
> 2. For the CONFIG_PPC_POWERNV powerpc platform, kernel log buffer
>    registration is no longer available because there is no longer
>    a single contiguous block of memory to represent all of the
>    ringbuffer.

So this is tangential, but I've actually been wishing for a special
"raw dump" format that has absolutely *no* structure to it at all, and
is as a result not necessarily strictly reliable, but is a lot more
robust.

The background for that is that we have a class of bugs that are
really hard to debug "in the wild", because people don't have access
to serial consoles or any kind of special hardware at all (ie forget
things like nvram etc), and when the machine locks up you're happy to
just have a reset button (but more likely you have to turn power off
and on).

End result: a DRAM buffer can work, but is not "reliable".
Particularly if you turn power on and off, data retention of DRAM is
iffy. But it's possible, at least in theory.

So I have a patch that implements a "stupid ring buffer" for this
case, with absolutely zero data structures (because in the presence of
DRAM corruption, all you can get is "hopefully only slightly garbled
ASCII").

It actually does work. It's a complete hack, but I have used this on
real hardware to see dumps that happened after the machine could no
longer send them to any device.

I actually suspect that this kind of "stupid non-structured secondary
log" can often be much more useful than the existing nvram special
cases - yes the output can be garbled for multi-cpu cases because it
not only is lockless, it's lockless without even any data structures -
but it also works somewhat reliably when the machine is _really_
borked. Which is exactly when you want a log that isn't just the
normal "working machine syslog".

NOTE! This is *not* a replacement for a lockless printk.
This is very much an _additional_ "low overhead buffer in RAM" for
post-mortem analysis when anything fancier doesn't work.

So I'm throwing this patch out there in case people have interest in
looking at that very special case.

Also note how right now the example code just steals a random physical
memory area at roughly physical location 12GB - this is a hack and
would need to be configurable obviously in real life, but it worked
for the machines I tested (which both happened to have 16GB of RAM).

Those parts are marked with "// HACK HACK HACK" and just a hardcoded
physical address (0x320000000).

              Linus
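
For readers without the patch at hand, the "stupid ring buffer" idea
described above can be illustrated with a small user-space sketch. This
is not the patch itself and makes no claim about how it is written:
every name here (rawlog_buf, rawlog_pos, rawlog_write, rawlog_dump,
RAWLOG_SIZE) is hypothetical, an ordinary static array stands in for
the reserved physical memory area, and the use of C11 atomics for the
write cursor is an assumption. The point it tries to show is that the
buffer holds nothing but raw bytes, so a reader needs no metadata and
simply filters the whole region for printable ASCII.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; a user-space stand-in, not the kernel patch. */
#define RAWLOG_SIZE 64u            /* power of two, so wrapping is a mask */

static char rawlog_buf[RAWLOG_SIZE]; /* stands in for the reserved DRAM area */
static atomic_size_t rawlog_pos;     /* cursor lives OUTSIDE the buffer */

/*
 * Append bytes with no locks, no headers and no record lengths:
 * claim a span by bumping the cursor, then copy byte by byte,
 * wrapping at the end of the buffer. Concurrent writers that lap
 * each other can interleave, which is accepted by design.
 */
static void rawlog_write(const char *s, size_t len)
{
    size_t start = atomic_fetch_add(&rawlog_pos, len);

    for (size_t i = 0; i < len; i++)
        rawlog_buf[(start + i) & (RAWLOG_SIZE - 1)] = s[i];
}

/*
 * Post-mortem read: the cursor is lost after a reset, so scan the
 * whole region and keep only printable ASCII and newlines. Anything
 * else (stale zeroes, decayed DRAM bits) becomes '.'.
 * 'out' must have room for size + 1 bytes.
 */
static void rawlog_dump(const char *raw, size_t size, char *out)
{
    assert(raw != NULL && out != NULL);

    for (size_t i = 0; i < size; i++) {
        unsigned char c = (unsigned char)raw[i];
        int printable = (c >= 0x20 && c < 0x7f) || c == '\n';

        out[i] = printable ? (char)c : '.';
    }
    out[size] = '\0';
}
```

Keeping the cursor outside the buffer is what lets the region stay
free of any structure: nothing in it has to survive intact for the
rest to be readable, at the cost of not knowing where the newest text
ends without reading the dump by eye.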