linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	virtio-dev@lists.oasis-open.org,
	"Tony Luck" <tony.luck@intel.com>,
	"Kees Cook" <keescook@chromium.org>, KVM <kvm@vger.kernel.org>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Anton Vorontsov" <anton@enomsg.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Minchan Kim" <minchan@kernel.org>,
	"Anthony Liguori" <aliguori@amazon.com>,
	"Colin Cross" <ccross@android.com>,
	virtualization@lists.linux-foundation.org,
	"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
Date: Fri, 18 Nov 2016 12:32:06 +0900	[thread overview]
Message-ID: <20161118033206.GA15698@danjae.aot.lge.com> (raw)
In-Reply-To: <1627682877.13077005.1479298236793.JavaMail.zimbra@redhat.com>

Hi,

Thanks for your detailed information,

On Wed, Nov 16, 2016 at 07:10:36AM -0500, Paolo Bonzini wrote:
> > Not sure how independent ERST is from ACPI and other specs.  It looks
> > like referencing UEFI spec at least.
> 
> It is just the format of error records that comes from the UEFI spec
> (include/linux/cper.h) but you can ignore it, I think.  It should be
> handled by tools on the host side.  For you, the error log address
> range contains a CPER header followed by a binary blob.  In practice,
> you only need the record length field (bytes 20-23 of the header),
> though it may be a good idea to validate the signature at the beginning
> of the header.
> 
> > Btw, is the ERST used for pstore only (in Linux)?
> 
> Yes.  It can store various records, including dmesg and MCE.
> 
> There are other examples in QEMU of interfaces with ACPI.  They all use the
> DSDT, but the logic is similar.  For example, docs/specs/acpi_mem_hotplug.txt
> documents the memory hotplug interface. In all cases, ACPI tables contain small
> programs that talk to specialized hardware registers, typically allocated to
> hard-coded I/O ports.
> 
> In your case, the registers could occupy 16 consecutive I/O ports, like the
> following:
> 
>      0x00       read/write   operation type (0=write,1=read,2=clear,3=dummy write)
> 
>      0x01       read-only    bit 7: if set, operation in progress
> 
>                              bit 0-6: operation status, see "Command Status Definition" in
>                              the ACPI spec
> 
>      0x02       read-only    when read:
> 
>                              - read a 64-bit record id from the store to memory,
>                                from the address that was last written to 0x08.
> 
>                              - if the id is valid and is not the last id in the store,
>                                write the next 64-bit record id to the same address
> 
>                              - otherwise, write the first record id to the same address,
>                                or 0xffffffffffffffff if the store is empty
> 
>      0x03                    unused, read as zero
> 
>      0x04-0x07  read/write   offset of the error record into the error log address range
> 
>      0x08-0x0b  read/write   when read, return number of stored records
> 
>                              when written, the written value is a 32-bit memory address,
>                              which points to a 64-bit location used to communicate record ids.
> 
>      0x0c-0x0f  read/write   when read, always return -1 (together with the "mask" field
>                              and READ_REGISTER, this lets ERST instructions return any value!)
> 
>                              when written, trigger the pstore operation:
> 
>                              - if the current operation is a dummy write, do nothing
> 
>                              - if the current operation is a write, write a new record, using
>                              the written value as the base of the error log address range.  The
>                              length must be parsed from the CPER header.
> 
>                              - if the current operation is a clear, read the record id
>                              from the memory location that was last written to 0x08 and do the
>                              operation.  the value written is ignored.
> 
>                              - if the current operation is a read, read the record id from the
>                              memory location that was last written to 0x08, using the written
>                              value as the base of the error log address range.
> 
> In addition, the firmware will need to reserve a few KB of RAM for the error log
> address range (I checked a real system and it reserves 8KB).  The first eight
> bytes are needed for the record identifier interface, because there's no such
> thing as 64-bit I/O ports, and the rest can be used for the actual buffer.

Is there a limit on the size?  It'd be great if it can use a few MB..

> 
> QEMU already has an interface to allocate RAM and patch the address into an
> ACPI table (bios_linker_loader_alloc).  Because this interface is actually meant
> to load data from QEMU into the firmware (using the "fw_cfg" interface), you
> would have to add a dummy 8KB file to fw_cfg using fw_cfg_add_file (for
> example "etc/erst-memory"), it can be just full of zeros.
> 
> QEMU supports two chipsets, PIIX and ICH9, and the free I/O port ranges are
> different.  You could use 0xa20 for ICH9 and 0xae20 for PIIX.
> 
> All in all, the contents of the ERST table would not be very different from a
> non-virtual system, except that on real hardware the firmware would use SMIs
> as the trap mechanism.  You almost have a one-to-one mapping between ERST
> actions and registers accesses:
> 
>    BEGIN_WRITE_OPERATION                  write value 0 to register at 0x00
>    BEGIN_READ_OPERATION                   write value 1 to register at 0x00
>    BEGIN_CLEAR_OPERATION                  write value 2 to register at 0x00
>    BEGIN_DUMMY_WRITE_OPERATION            write value 3 to register at 0x00
>    END_OPERATION                          no-op
>    CHECK_BUSY_STATUS                      read register at 0x01 with mask 0x80
>    GET_COMMAND_STATUS                     read register at 0x01 with mask 0x7f
>    SET_RECORD_OFFSET                      write register at 0x04
>    GET_RECORD_COUNT                       read register at 0x08
>    EXECUTE_OPERATION                      write ERST memory base + 8 to 0x0c
>    GET_ERROR_LOG_ADDRESS_RANGE            read register at 0x0c (with mask = ERST memory base + 8)
>    GET_ERROR_LOG_ADDRESS_RANGE_LENGTH     read register at 0x0c (with mask = 8192 - 8 = 8184)
>    GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES read register at 0x0c (with mask = 0)
> 
> Only the get/set record identifier instructions are a little harder:
> 
>    GET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
>                                           read register at 0x02
>                                           read eight bytes at ERST memory base
> 
>    SET_RECORD_IDENTIFIER                  write ERST memory base to register at 0x08
>                                           write eight bytes at ERST memory base
> 
> On top of this, you need to add the APEI UUID (see apei_osc_setup in Linux)
> to build_q35_osc_method, and use "-M q35" when you start QEMU.  If you need
> more help just ask.  I or others can help you with the ACPI glue, then you
> can write the file backend yourself, based on your existing virtio-pstore code.
> 
> > Also I need to control pstore driver like using bigger buffer,
> > enabling specific message types and so on if ERST supports.  Is it
> > possible for ERST to provide such information?
> 
> It's the normal pstore driver, same as on a real server.  What exactly do you
> need?

Well, I don't want to send additional pstore messages to the device if
it cannot handle them properly - for example, ftrace message should not
overwrite kmsg dump.  It'd be great if device somehow could expose
acceptable message types to the driver IMHO.

Btw I prefer using the kvmtool for my kernel work since it's much more
simpler..

Thanks,
Namhyung

  reply	other threads:[~2016-11-18  3:33 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-20  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Namhyung Kim
2016-08-20  8:07 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-09-13 15:19   ` Michael S. Tsirkin
2016-09-16  9:05     ` Namhyung Kim
2016-11-10 16:39   ` Michael S. Tsirkin
2016-11-15  4:50     ` Namhyung Kim
2016-11-15  5:06       ` Michael S. Tsirkin
2016-11-15  5:50         ` Namhyung Kim
2016-11-15 14:35           ` Michael S. Tsirkin
2016-11-15  9:57         ` Paolo Bonzini
2016-11-15 14:36           ` Namhyung Kim
2016-11-15 14:38             ` Paolo Bonzini
2016-11-16  7:04               ` Namhyung Kim
2016-11-16 12:10                 ` Paolo Bonzini
2016-11-18  3:32                   ` Namhyung Kim [this message]
2016-11-18  4:07                     ` Michael S. Tsirkin
2016-11-18  9:46                       ` [virtio-dev] " Paolo Bonzini
2016-11-18  9:45                     ` Paolo Bonzini
2016-08-20  8:07 ` [PATCH 2/3] qemu: Implement virtio-pstore device Namhyung Kim
2016-08-24 22:00   ` Daniel P. Berrange
2016-08-26  4:48     ` Namhyung Kim
2016-08-26 12:27       ` Daniel P. Berrange
2016-09-13 15:57   ` Michael S. Tsirkin
2016-09-16 10:05     ` Namhyung Kim
2016-11-10 22:50       ` Michael S. Tsirkin
2016-11-15  6:23         ` Namhyung Kim
2016-11-15 14:38           ` Michael S. Tsirkin
2016-08-20  8:07 ` [PATCH 3/3] kvmtool: " Namhyung Kim
2016-08-23 10:25 ` [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v3) Joel Fernandes
2016-08-23 15:20   ` Namhyung Kim
2016-08-24  7:10     ` Joel
  -- strict thread matches above, loose matches on Subject: below --
2016-09-04 14:38 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v5) Namhyung Kim
2016-09-04 14:38 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-09-08 20:49   ` Kees Cook
2016-09-22 11:57   ` Stefan Hajnoczi
2016-09-23  5:48     ` Namhyung Kim
2016-08-31  8:07 [RFC/PATCHSET 0/3] virtio: Implement virtio pstore device (v4) Namhyung Kim
2016-08-31  8:08 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-08-31 14:54   ` Michael S. Tsirkin
2016-09-01  0:03     ` Namhyung Kim
2016-07-18  4:37 [RFC/PATCHSET 0/3] virtio-pstore: Implement virtio pstore device Namhyung Kim
2016-07-18  4:37 ` [PATCH 1/3] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-07-18  5:12   ` Kees Cook
2016-07-18  5:50     ` Namhyung Kim
2016-07-18 17:50       ` Kees Cook
2016-07-19 13:43         ` Namhyung Kim
2016-07-19 15:32           ` Namhyung Kim
2016-07-20 12:56           ` Namhyung Kim
2016-07-18  7:54   ` Cornelia Huck
2016-07-18  8:29     ` Namhyung Kim
2016-07-18  9:02       ` Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161118033206.GA15698@danjae.aot.lge.com \
    --to=namhyung@kernel.org \
    --cc=aliguori@amazon.com \
    --cc=anton@enomsg.org \
    --cc=ccross@android.com \
    --cc=keescook@chromium.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rkrcmar@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tony.luck@intel.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).