Re: [qemu-s390x] [PATCH 15/15] s390-bios: Support booting from real dasd
From: Cornelia Huck
Subject: Re: [qemu-s390x] [PATCH 15/15] s390-bios: Support booting from real dasd device
Date: Mon, 4 Feb 2019 13:02:38 +0100
On Tue, 29 Jan 2019 08:29:22 -0500
"Jason J. Herne" <address@hidden> wrote:
> Allows guest to boot from a vfio configured real dasd device.
>
> Signed-off-by: Jason J. Herne <address@hidden>
> ---
> docs/devel/s390-dasd-ipl.txt | 132 +++++++++++++++++++++++
> pc-bios/s390-ccw/Makefile | 2 +-
> pc-bios/s390-ccw/dasd-ipl.c | 249 +++++++++++++++++++++++++++++++++++++++++++
> pc-bios/s390-ccw/dasd-ipl.h | 16 +++
> pc-bios/s390-ccw/main.c | 4 +
> pc-bios/s390-ccw/s390-arch.h | 13 +++
> 6 files changed, 415 insertions(+), 1 deletion(-)
> create mode 100644 docs/devel/s390-dasd-ipl.txt
> create mode 100644 pc-bios/s390-ccw/dasd-ipl.c
> create mode 100644 pc-bios/s390-ccw/dasd-ipl.h
>
> diff --git a/docs/devel/s390-dasd-ipl.txt b/docs/devel/s390-dasd-ipl.txt
> new file mode 100644
> index 0000000..84ec7b8
> --- /dev/null
> +++ b/docs/devel/s390-dasd-ipl.txt
> @@ -0,0 +1,132 @@
> +*****************************
> +***** s390 hardware IPL *****
> +*****************************
> +
> +The s390 hardware IPL process consists of the following steps.
> +
> +1. A READ IPL ccw is constructed in memory location 0x0.
> + This ccw, by definition, reads the IPL1 record which is located on the disk
> + at cylinder 0 track 0 record 1. Note that the chain flag is on in this ccw
> + so when it is complete another ccw will be fetched and executed from memory
> + location 0x08.
> +
> +2. Execute the Read IPL ccw at 0x00, thereby reading IPL1 data into 0x00.
> + IPL1 data is 24 bytes in length and consists of the following pieces of
> + information: [psw][read ccw][tic ccw]. When the machine executes the Read
> + IPL ccw it reads the 24 bytes of IPL1 into memory starting at
> + location 0x0. Then the ccw program at 0x08 which consists of a read
> + ccw and a tic ccw is automatically executed because of the chain flag from
> + the original READ IPL ccw. The read ccw will read the IPL2 data into memory
> + and the TIC (Transfer In Channel) will transfer control to the channel
> + program contained in the IPL2 data. The TIC channel command is the
> + equivalent of a branch/jump/goto instruction for channel programs.
> + NOTE: The ccws in IPL1 are defined by the architecture to be format 0.
> +
> +3. Execute IPL2.
> + The TIC ccw instruction at the end of the IPL1 channel program will begin
> + the execution of the IPL2 channel program. IPL2 is stage-2 of the boot
> + process and will contain a larger channel program than IPL1. The point of
> + IPL2 is to find and load either the operating system or a small program that
> + loads the operating system from disk. At the end of this step all or some of
> + the real operating system is loaded into memory and we are ready to hand
> + control over to the guest operating system. At this point the guest
> + operating system is entirely responsible for loading any more data it might
> + need to function. NOTE: The IPL2 channel program might read data into memory
> + location 0 thereby overwriting the IPL1 psw and channel program. This is ok
> + as long as the data placed in location 0 contains a psw whose instruction
> + address points to the guest operating system code to execute at the end of
> + the IPL/boot process.
> + NOTE: The ccws in IPL2 are defined by the architecture to be format 0.
> +
> +4. Start executing the guest operating system.
> + The psw that was loaded into memory location 0 as part of the ipl process
> + should contain the needed flags for the operating system we have loaded. The
> + psw's instruction address will point to the location in memory where we want
> + to start executing the operating system. This psw is loaded (via LPSW
> + instruction) causing control to be passed to the operating system code.
> +
> +In a non-virtualized environment this process, handled entirely by the hardware,
> +is kicked off by the user initiating a "Load" procedure from the hardware
> +management console. This "Load" procedure crafts a special "Read IPL" ccw in
> +memory location 0x0 that reads IPL1. It then executes this ccw thereby kicking
> +off the reading of IPL1 data. Since the channel program from IPL1 will be
> +written immediately after the special "Read IPL" ccw, the IPL1 channel program
> +will be executed immediately (the special read ccw has the chaining bit turned
> +on). The TIC at the end of the IPL1 channel program will cause the IPL2 channel
> +program to be executed automatically. After this sequence completes the "Load"
> +procedure then loads the psw from 0x0.
Nice summary!
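
For readers following along, the 24-byte IPL1 layout from step 2 ([psw][read ccw][tic ccw], format-0 ccws) can be sketched in C roughly as below. This is an illustration only and is not taken from the patch: the names ccw0_raw, ipl1_record, ccw0_addr and ipl2_entry are hypothetical, and the ccws are kept as raw bytes so the sketch does not depend on host endianness or bitfield layout.

```c
#include <assert.h>
#include <stdint.h>

#define CCW_FLAG_CC 0x40 /* command-chaining flag in the ccw flags byte */
#define CCW_CMD_TIC 0x08 /* Transfer In Channel command code */

/* Format-0 ccw as raw big-endian bytes (hypothetical helper type). */
struct ccw0_raw {
    uint8_t cmd_code;  /* command code */
    uint8_t addr[3];   /* 24-bit data address, big-endian */
    uint8_t flags;     /* chaining and other flags */
    uint8_t reserved;
    uint8_t count[2];  /* 16-bit byte count, big-endian */
};

/* IPL1 record as described in the text: [psw][read ccw][tic ccw]. */
struct ipl1_record {
    uint8_t psw[8];
    struct ccw0_raw read_ccw; /* lands at 0x08, reads IPL2 into memory */
    struct ccw0_raw tic_ccw;  /* lands at 0x10, branches into IPL2 */
};

static_assert(sizeof(struct ccw0_raw) == 8, "a ccw is 8 bytes");
static_assert(sizeof(struct ipl1_record) == 24, "IPL1 is 24 bytes");

/* Extract the 24-bit data address of a format-0 ccw. */
static uint32_t ccw0_addr(const struct ccw0_raw *ccw)
{
    return ((uint32_t)ccw->addr[0] << 16) |
           ((uint32_t)ccw->addr[1] << 8) |
           (uint32_t)ccw->addr[2];
}

/* The TIC's data address is where the IPL2 channel program starts. */
static uint32_t ipl2_entry(const struct ipl1_record *ipl1)
{
    return ccw0_addr(&ipl1->tic_ccw);
}
```

With a record read to address 0x0, ipl2_entry() would give the address that the TIC at the end of IPL1 branches to.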
> +
> +*****************************************
> +***** How this all pertains to Qemu *****
s/Qemu/QEMU/
(also below)
> +*****************************************
> +
> +In theory we should merely have to do the following to IPL/boot a guest
> +operating system from a DASD device:
> +
> +1. Place a "Read IPL" ccw into memory location 0x0 with chaining bit on.
> +2. Execute channel program at 0x0.
> +3. LPSW 0x0.
> +
> +However, our emulation of the machine's channel program logic is missing one key
> +feature that is required for this process to work: non-prefetch of ccw data.
> +
> +When we start a channel program we pass the channel subsystem parameters via an
> +ORB (Operation Request Block). One of those parameters is a prefetch bit. If the
> +bit is on then Qemu is allowed to read the entire channel program from guest
> +memory before it starts executing it. This means that any channel commands that
> +read additional channel commands will not work as expected because the newly
> +read commands will only exist in guest memory and NOT within Qemu's channel
> +subsystem memory. Qemu's channel subsystem's implementation currently requires
But isn't that the vfio-ccw backend, rather than the channel subsystem
implementation?
> +this bit to be on for all channel programs. This is a problem because the IPL
> +process consists of transferring control from the "Read IPL" ccw immediately to
> +the IPL1 channel program that was read by "Read IPL".
> +
> +Not being able to turn off prefetch will also prevent the TIC at the end of the
> +IPL1 channel program from transferring control to the IPL2 channel program.
> +
> +Lastly, in some cases (the zipl bootloader for example) the IPL2 program also
> +transfers control to another channel program segment immediately after reading it
> +from the disk. So we need to be able to handle this case.
> +
> +**************************
> +***** What Qemu does *****
> +**************************
> +
> +Since we are forced to live with prefetch we cannot use the very simple IPL
> +procedure we defined in the preceding section. So we compensate by doing the
> +following.
> +
> +1. Place "Read IPL" ccw into memory location 0x0, but turn off chaining bit.
> +2. Execute "Read IPL" at 0x0.
> +
> + So now IPL1's psw is at 0x0 and IPL1's channel program is at 0x08.
> +
> +4. Write a custom channel program that will seek to the IPL2 record and then
> + execute the READ and TIC ccws from IPL1. Normally the seek is not required
> + because after reading the IPL1 record the disk is automatically positioned
> + to read the very next record which will be IPL2. But since we are not reading
> + both IPL1 and IPL2 as part of the same channel program we must manually set
> + the position.
> +
> +5. Grab the target address of the TIC instruction from the IPL1 channel program.
> + This address is where the IPL2 channel program starts.
> +
> + Now IPL2 is loaded into memory somewhere, and we know the address.
> +
> +6. Execute the IPL2 channel program at the address obtained in step #5.
> +
> + Because this channel program can be dynamic, we must use a special algorithm
> + that detects a READ immediately followed by a TIC and breaks the ccw chain
> + by turning off the chain bit in the READ ccw. When control is returned from
> + the kernel/hardware to the Qemu bios code we immediately issue another start
> + subchannel to execute the remaining TIC instruction. This causes the entire
> + channel program (starting from the TIC) and all needed data to be refetched
> + thereby stepping around the limitation that would otherwise prevent this
> + channel program from executing properly.
> +
> + Now the operating system code is loaded somewhere in guest memory and the psw
> + in memory location 0x0 will point to entry code for the guest operating
> + system.
> +
> +7. LPSW 0x0.
> + LPSW transfers control to the guest operating system and we're done.
Also a good explanation of the procedure here!
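
To make the READ-followed-by-TIC detection in step 6 concrete, here is a minimal sketch in C. It is not the code from the patch. It works on a plain byte buffer standing in for guest memory, the function name split_read_tic is hypothetical, 0x06 is assumed to be the DASD read command code, MAX_CCWS is an arbitrary bound, and resuming at the TIC's target address (rather than at the TIC ccw itself) is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CCW_SIZE     8
#define CCW_FLAG_CC  0x40  /* command-chaining flag (byte 4 of a ccw) */
#define CCW_CMD_TIC  0x08  /* Transfer In Channel */
#define CCW_CMD_READ 0x06  /* assumed DASD read command code */
#define MAX_CCWS     1024  /* arbitrary bound so the walk always ends */

/* 24-bit big-endian data address of a format-0 ccw. */
static uint32_t ccw_addr24(const uint8_t *ccw)
{
    return ((uint32_t)ccw[1] << 16) | ((uint32_t)ccw[2] << 8) | ccw[3];
}

/*
 * Walk the format-0 channel program at cpa inside mem. If a chained READ
 * is directly followed by a TIC, clear the READ's chain flag, store the
 * TIC target in *next_cpa and return true. Return false if the program
 * ends (or leaves the buffer) without such a pair.
 */
static bool split_read_tic(uint8_t *mem, size_t mem_size,
                           uint32_t cpa, uint32_t *next_cpa)
{
    assert(mem != NULL && next_cpa != NULL);

    for (int i = 0; i < MAX_CCWS; i++) {
        if ((size_t)cpa + CCW_SIZE > mem_size) {
            return false;
        }
        uint8_t *cur = mem + cpa;

        /* Follow an inline TIC; it may not have the chain flag set. */
        if (cur[0] == CCW_CMD_TIC) {
            cpa = ccw_addr24(cur);
            continue;
        }
        /* No command chaining: this is the last ccw of the program. */
        if (!(cur[4] & CCW_FLAG_CC)) {
            return false;
        }
        if ((size_t)cpa + 2 * CCW_SIZE > mem_size) {
            return false;
        }
        uint8_t *next = cur + CCW_SIZE;
        if (cur[0] == CCW_CMD_READ && next[0] == CCW_CMD_TIC) {
            cur[4] &= (uint8_t)~CCW_FLAG_CC; /* break the chain here */
            *next_cpa = ccw_addr24(next);
            return true;
        }
        cpa += CCW_SIZE;
    }
    return false;
}
```

A caller would run the program at cpa, and while the function returned true, start the subchannel again at next_cpa and repeat the fixup there.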
(...)
> +static int run_dynamic_ccw_program(SubChannelId schid, uint32_t cpa)
> +{
> +    bool has_next;
> +    uint32_t next_cpa = 0;
> +    int rc;
> +
> +    do {
> +        has_next = dynamic_cp_fixup(cpa, &next_cpa);
> +
> +        print_int("executing ccw chain at ", cpa);
Do you want to keep the unconditional print here? Or make it a
debug_print_int, and maybe an unconditional print on error?
> +        enable_prefixing();
> +        rc = do_cio(schid, cpa, CCW_FMT0);
> +        disable_prefixing();
> +
> +        if (rc) {
> +            break;
> +        }
> +        cpa = next_cpa;
> +    } while (has_next);
> +
> +    return rc;
> +}
Code looks fine after a quick browse.
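
To illustrate the print suggestion above, something along these lines is what I have in mind. It is only a standalone sketch, not a proposed replacement: fixup_fn and exec_fn are hypothetical callbacks standing in for dynamic_cp_fixup() and do_cio(), debug_enabled is a made-up switch, fprintf() replaces the bios print helpers, the prefixing calls are left out, and the two demo_* stubs exist only so the sketch can be exercised.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for dynamic_cp_fixup() and do_cio(). */
typedef bool (*fixup_fn)(uint32_t cpa, uint32_t *next_cpa);
typedef int (*exec_fn)(uint32_t cpa);

static bool debug_enabled; /* hypothetical debug switch, off by default */

static int run_chain_quiet(fixup_fn fixup, exec_fn exec, uint32_t cpa)
{
    bool has_next;
    uint32_t next_cpa = 0;
    int rc;

    do {
        has_next = fixup(cpa, &next_cpa);

        /* Progress message only when debugging. */
        if (debug_enabled) {
            fprintf(stderr, "executing ccw chain at 0x%x\n", (unsigned)cpa);
        }
        rc = exec(cpa);
        if (rc) {
            /* Errors are always reported. */
            fprintf(stderr, "ccw chain at 0x%x failed, rc=%d\n",
                    (unsigned)cpa, rc);
            break;
        }
        cpa = next_cpa;
    } while (has_next);

    return rc;
}

/* Demo stub: the program never needs to be split. */
static bool demo_fixup_none(uint32_t cpa, uint32_t *next_cpa)
{
    (void)cpa;
    (void)next_cpa;
    return false;
}

/* Demo stub: execution fails only for the address 0xbad. */
static int demo_exec(uint32_t cpa)
{
    return cpa == 0xbad ? -1 : 0;
}
```

In the normal case nothing is printed; a failing chain still leaves a message pointing at the address that was being executed.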