[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] hw/ide/ahci: fix legacy software reset
From: |
Kevin Wolf |
Subject: |
Re: [PATCH v2] hw/ide/ahci: fix legacy software reset |
Date: |
Wed, 8 Nov 2023 14:10:59 +0100 |
Am 08.11.2023 um 00:57 hat Niklas Cassel geschrieben:
> On Tue, Nov 07, 2023 at 07:14:07PM +0100, Kevin Wolf wrote:
> > Am 05.10.2023 um 12:04 hat Niklas Cassel geschrieben:
> > > From: Niklas Cassel <niklas.cassel@wdc.com>
> > >
> > > Legacy software contains a standard mechanism for generating a reset to a
> > > Serial ATA device - setting the SRST (software reset) bit in the Device
> > > Control register.
> > >
> > > Serial ATA has a more robust mechanism called COMRESET, also referred to
> > > as port reset. A port reset is the preferred mechanism for error
> > > recovery and should be used in place of software reset.
> > >
> > > Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> > > improved the handling of PxCI, such that PxCI gets cleared after handling
> > > a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> > > receiving an arbitrary FIS).
> > >
> > > However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> > > enough, we also need to clear PxCI when receiving a SRST in the Device
> > > Control register.
> > >
> > > This fixes an issue for FreeBSD where the device would fail to reset.
> > > The problem was not noticed in Linux, because Linux uses a COMRESET
> > > instead of a legacy software reset by default.
> > >
> > > Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> > > Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> > > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > > ---
> > > Changes since v1: write the D2H FIS before clearing PxCI.
> > >
> > > hw/ide/ahci.c | 16 ++++++++++++++++
> > > 1 file changed, 16 insertions(+)
> > >
> > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> > > index babdd7b458..7269dabbdb 100644
> > > --- a/hw/ide/ahci.c
> > > +++ b/hw/ide/ahci.c
> > > @@ -1254,10 +1254,26 @@ static void handle_reg_h2d_fis(AHCIState *s, int
> > > port,
> > > case STATE_RUN:
> > > if (cmd_fis[15] & ATA_SRST) {
> > > s->dev[port].port_state = STATE_RESET;
> > > + /*
> > > + * When setting SRST in the first H2D FIS in the reset
> > > sequence,
> > > + * the device does not send a D2H FIS. Host software
> > > thus has to
> > > + * set the "Clear Busy upon R_OK" bit such that PxCI
> > > (and BUSY)
> > > + * gets cleared. See AHCI 1.3.1, section 10.4.1 Software
> > > Reset.
> > > + */
> > > + if (opts & AHCI_CMD_CLR_BUSY) {
> > > + ahci_clear_cmd_issue(ad, slot);
> > > + }
> >
> > I suspect that AHCI_CMD_CLR_BUSY really should be checked in a more
> > generic place, but this will do for fixing software reset.
>
> This weirdo option "Clear Busy upon R_OK" is for the HBA itself to
> clear the bit for the command in PxCI, when it gets the R_OK that the
> Host to Device (H2D) FIS was transmitted successfully to the device.
> The normal way is that the device sends back a Device to Host (D2H) FIS
> which then causes the bit in PxCI to be cleared in the HBA.
>
> Yes, theoretically you could probably build a FIS that has the "Clear
> Busy upon R_OK" even for e.g. a regular non-NCQ I/O command (where
> PxCI/BUSY is first supposed to be cleared once the command is
> done...), however this would surely cause problems as PxCI would then
> no longer shadow the state of the device.
>
> If you search for "Clear Busy upon R_OK", the only usage seems
> to be for legacy software reset.
> [...]
Yes, looks like ignoring it otherwise is harmless.
> > ahci_reset_port() already calls ahci_init_d2h() -> ahci_write_fis_d2h().
> > So I think this new ahci_write_fis_d2h() only sets some state that will
> > immediately be overwritten again. Which is good, because we didn't set
> > the signature as described in the SATA software reset protocol yet, that
> > is only done in ahci_reset_port().
> >
> > Am I missing something? Why do we need this ahci_write_fis_d2h() call
> > here?
> >
> > As for the ahci_clear_cmd_issue(), I'm surprised that ahci_reset_port()
> > doesn't already clear the register. Wouldn't it make more sense there
> > than just in this one caller?
>
> A start/stop of the engine (PxCMD.ST) does clear PxCI, as I implemented in:
> https://github.com/qemu/qemu/commit/d73b84d0b664e60fffb66f46e84d0db4a8e1c713
>
> But according to AHCI 1.3.1, section:
> 3.3.14 Offset 38h: PxCI – Port x Command Issue
>
> PxCI does also have a reset value of 0h.
>
> So I would assume that it is okay to clear PxCI to zero also in
> ahci_reset_port().
Yes, that was what I thought.
> By doing so, we could avoid both the ahci_clear_cmd_issue(),
> and the ahci_write_fis_d2h(ad, false).
>
> Perhaps you can drop the "bonus" part of my patch and simply
> add a pr->cmd_issue to ahci_reset_port()?
I agree that this is probably the best change to make.
If you also want to add that comment that there already is a D2H FIS
sent during reset, and maybe change the commit message to explain why
we're touching ahci_reset_port(), I think I would prefer if you sent a
v3. I'm usually happy to make small changes while applying, but it feels
like this would be a larger change than I would be comfortable making
without having it on the mailing list again.
Kevin