qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [QEMU-PPC] [PATCH v3] powerpc/spapr: Add host threads p


From: David Gibson
Subject: Re: [Qemu-devel] [QEMU-PPC] [PATCH v3] powerpc/spapr: Add host threads parameter to ibm, get_system_parameter
Date: Wed, 17 Jul 2019 15:46:05 +1000
User-agent: Mutt/1.12.0 (2019-05-25)

On Tue, Jul 09, 2019 at 02:00:04PM +1000, Suraj Jitindar Singh wrote:
> On Thu, 2019-07-04 at 14:59 +1000, David Gibson wrote:
> > On Thu, Jul 04, 2019 at 01:41:59PM +1000, Suraj Jitindar Singh wrote:
> > > On Wed, 2019-07-03 at 16:12 +1000, David Gibson wrote:
> > > > On Mon, Jul 01, 2019 at 04:19:46PM +1000, Suraj Jitindar Singh
> > > > wrote:
> > > > > The ibm,get_system_parameter rtas call is used by the guest to
> > > > > retrieve
> > > > > data relating to certain parameters of the system. The SPLPAR
> > > > > characteristics option (token 20) is used to determin
> > > > > characteristics of
> > > > > the environment in which the lpar will run.
> > > > > 
> > > > > It may be useful for a guest to know the number of physical
> > > > > host
> > > > > threads
> > > > > present on the underlying system where it is being run. Add the
> > > > > characteristic "HostThrs" to the SPLPAR Characteristics
> > > > > ibm,get_system_parameter rtas call to expose this information
> > > > > to a
> > > > > guest and provide an implementation which determines this
> > > > > information
> > > > > based on the number of interrupt servers present in the device
> > > > > tree.
> > > > > 
> > > > > Signed-off-by: Suraj Jitindar Singh <address@hidden>
> > > > 
> > > > Hrm, as I said on our call, I have some misgivings about this.
> > > > 
> > > > Starting with the most general: this again publishes host
> > > > information
> > > > to the guest without filtering, which has caused us problems
> > > > before
> > > > (e.g. security issues with publishing the host serial and model
> > > > information).  Now, I can't immediately see what harm a guest
> > > > could
> > > > do
> > > > with the host # threads (especially since it could in theory
> > > > deduce
> > > > it
> > > > from the PVR, I think) but it still makes me uneasy.
> > > 
> > > Correct, a guest could pretty reliably determine this information
> > > anyway based on the PVR. It can't account for a POWER8 operating in
> > > split core mode, but I don't know any harm that could be done by
> > > introducing this information.
> > > 
> > > Additionally it doesn't really tell you anything about how you're
> > > going
> > > to be scheduled (at least on POWER9) since vcpus are scheduled on a
> > > per
> > > thread, not per core basis.
> > 
> > Hmm.
> > 
> > > > Secondly, the "HostThrs" tag doesn't seem to be documented in
> > > > PAPR as
> > > > something that this system-parameter will include.  I don't much
> > > > like
> > > > the idea of adding ad-hoc bits of information here without some
> > > > thought going into designing and specifying it first.
> > > 
> > > This isn't documented in papr, it has been decided that this is how
> > > the
> > > information will be communicated to a guest. This is the most
> > > appropriate place to put this information and the HostThrs name is
> > > consistent with the naming of other information in this property.
> > 
> > Grr.  If someone can decide this, they can bloody well document it
> > somewhere.
> > 
> > > We have other non-papr information in qemu, for example hcall
> > > numbers,
> > > so this isn't exactly a precedent.
> > 
> > I suppose
> > 
> > > > > ---
> > > > > 
> > > > > V1 -> V2:
> > > > > - Take into account that the core may be operating in split
> > > > > core
> > > > > mode
> > > > >   meaning a single core may be split into multiple subcores.
> > > > > V2 -> V3:
> > > > > - Add curly braces for single line if statements
> > > > > ---
> > > > >  hw/ppc/spapr_rtas.c | 62
> > > > > +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 62 insertions(+)
> > > > > 
> > > > > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> > > > > index 5bc1a93271..1bab71c90c 100644
> > > > > --- a/hw/ppc/spapr_rtas.c
> > > > > +++ b/hw/ppc/spapr_rtas.c
> > > > > @@ -229,6 +229,58 @@ static inline int sysparm_st(target_ulong
> > > > > addr, target_ulong len,
> > > > >      return RTAS_OUT_SUCCESS;
> > > > >  }
> > > > >  
> > > > > +#define CPUS_PATH       "/proc/device-tree/cpus/"
> > > > > +#define
> > > > > SUBCORE_PATH    "/sys/devices/system/cpu/subcores_per_core"
> > > > > +
> > > > > +static int rtas_get_num_host_threads(void)
> > > > > +{
> > > > > +    int num_threads = -1;
> > > > > +    unsigned long len;
> > > > > +    const char *entry;
> > > > > +    char *buf;
> > > > > +    GDir *dir;
> > > > > +
> > > > > +    if (!kvm_enabled()) {
> > > > > +        return 1;
> > > > > +    }
> > > > > +
> > > > > +    /* Read interrupt servers to determine number of threads
> > > > > per
> > > > > core */
> > > > > +    dir = g_dir_open(CPUS_PATH, 0, NULL);
> > > > > +    if (!dir) {
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    while ((entry = g_dir_read_name(dir))) {
> > > > > +        if (!strncmp(entry, "PowerPC,POWER",
> > > > > strlen("PowerPC,POWER"))) {
> > > > > +            char *path;
> > > > > +
> > > > > +            path = g_strconcat(CPUS_PATH, entry, "/ibm,ppc-
> > > > > interrupt-server#s",
> > > > > +                               NULL);
> > > > > +            if (g_file_get_contents(path, &buf, &len, NULL)) {
> > > > > +                num_threads = len / sizeof(int);
> > > > > +                g_free(buf);
> > > > > +            }
> > > > > +
> > > > > +            g_free(path);
> > > > > +            break;
> > > > > +        }
> > > > > +    }
> > > > > +
> > > > > +    g_dir_close(dir);
> > > > > +
> > > > > +    /* Check if split core mode in use */
> > > > > +    if (g_file_get_contents(SUBCORE_PATH, &buf, &len, NULL)) {
> > > > > +        int subcores = g_ascii_strtoll(buf, NULL, 10);
> > > > > +
> > > > > +        if (subcores) {
> > > > > +            num_threads /= subcores;
> > > > > +        }
> > > > > +        g_free(buf);
> > > > > +    }
> > > > 
> > > > Finally, all the logic above is built on the assumption of a ppc
> > > > host
> > > > - and not just that but an IBM POWER host...
> > > 
> > > RTAS services are defined as being provided by a papr platform, and
> > > the
> > > existence of the ibm,ppc-interrupt-server#s device tree property is
> > > a
> > > requirement of a papr platform. So I don't see this being an issue.
> > 
> > The *guest* is a PAPR platform, there's no guarantee the host has to
> > be a PAPR platform (in fact it usually won't be, it's just that
> > powernv has a lot of the same device tree properties).
> 
> Well I think technically the host is a papr platform which supplies an
> environment to the papr guest partition. But the terminology isn't
> really important here.

My point is that host userspace does not necessarily live in a PAPR
environment.

> > > > > +    return num_threads;
> > > > > +}
> > > > > +
> > > > >  static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
> > > > >                                            SpaprMachineState
> > > > > *spapr,
> > > > >                                            uint32_t token,
> > > > > uint32_t
> > > > > nargs,
> > > > > @@ -250,6 +302,16 @@ static void
> > > > > rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
> > > > >                                            current_machine-
> > > > > > ram_size / MiB,
> > > > > 
> > > > >                                            smp_cpus,
> > > > >                                            max_cpus);
> > > > > +        int num_host_threads = rtas_get_num_host_threads();
> > > > > +
> > > > > +        if (num_host_threads > 0) {
> > > > 
> > > > ... this sort of implements a fallback in other cases (KVM PR
> > > > with a
> > > > non-IBM host, TCG, but the boundary conditions are not really
> > > > well
> > > > defined.
> > > 
> > > This is essentially catching the error case of
> > > rtas_get_num_host_threads() returning a negative number or not
> > > finding
> > > the required properties (which as mentioned above are required).
> > > The
> > > KVM-PR case will work the same as the KVM-HV case where the host
> > > device
> > > tree will be queried.
> > 
> > Not if you're using PR on, say, an embedded ppc or an old Apple
> > machine that doesn't have the PAPR-ish properties in the host device
> > tree.
> 
> In which case we won't find the device tree property and so we don't
> have a reliable way to determine the number of host threads, so we will
> omit the property.
> 
> > 
> > > For TCG we just default to 1 since this
> > > information shouldn't be relevant to a TCG guest.
> > 
> > Uh.. it doesn't though, it omits it entirely.
> 
> No,
> if (!kvm_enabled()) return 1;
> above
> 
> > 
> > Also I don't really understand how it's relevant to a KVM guest in
> > the
> > first place.
> 
> There are registers which were previously scaled based on the threads
> per core, for example the PURR, but which aren't on POWER9 when running
> in lpar per thread mode and instead count per thread. This will look
> different to a guest OS, and so for whatever reason the guest might
> want to scale these registers based on the host threading mode.

Since apparently POWER's version of "compatibility" is "close but not
quite".

Anyway, after our discussion on Monday, I'm more or less resigned to
the need for HostThrs here.  Still some changes I'd like to see here.

First, I'd like an explanatory comment in get_system_parameter,
explaining what the value is used for.  Mention that it's not in PAPR
and say what little you can about who decided on this representation.

Second, I still really dislike directly grovelling around in the host
device tree.  THe fallbacks you have will probably work in practice,
but I really think it assumes too much about what the host environment
will look like.  Plus it, again, makes a guest visible difference
based on host properties which we want to avoid.  We'd probably get
away with this one (the worst is you'd have messed up PURR scaling
after migration), but still.

I was thinking we could use kvmppc_smt_threads(), but that will be
wrong for POWER9, I think.  Plus it still means guest visible
differences between TCG and KVM (or even different KVM types).

I actually think the right way to do this is to have qemu base it on
the PVR (i.e. look up the information in its internal cpu table).
That way it should give sensible and identical results for KVM and TCG
- with the only KVM dependence being that KVM and -cpu host selects
the exact cpu model for you.

That may give non-1 values for TCG and KVM PR, which could result in
bogus scaling... but AFAICT PURR values will be kind of bogus in those
situations anyway, so I don't know that it matters.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]