qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 05/15] target/ppc: PMU: add instruction counting


From: David Gibson
Subject: Re: [PATCH v3 05/15] target/ppc: PMU: add instruction counting
Date: Mon, 27 Sep 2021 14:59:24 +1000

On Tue, Sep 21, 2021 at 06:11:24PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 9/6/21 22:57, David Gibson wrote:
> > On Fri, Sep 03, 2021 at 05:31:06PM -0300, Daniel Henrique Barboza wrote:
> > > The PMU is already counting cycles by calculating time elapsed in
> > > nanoseconds. Counting instructions is a different matter and requires
> > > another approach.
> > > 
> > > This patch adds the capability of counting completed instructions
> > > (Perf event PM_INST_CMPL) by counting the amount of instructions
> > > translated in each translation block right before exiting it.
> > > 
> > > A new pmu_count_insns() helper in translation.c was added to do that.
> > > After verifying that the PMU is running (MMCR0_FC bit not set), call
> > > helper_insns_inc(). This new helper from power8_pmu.c will add the
> > > instructions to the relevant counters. It'll also be responsible for
> > > triggering counter negative overflows later on.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> > > ---
> > >   target/ppc/cpu.h         |  1 +
> > >   target/ppc/helper.h      |  1 +
> > >   target/ppc/helper_regs.c |  3 ++
> > >   target/ppc/power8_pmu.c  | 70 ++++++++++++++++++++++++++++++++++++----
> > >   target/ppc/translate.c   | 46 ++++++++++++++++++++++++++
> > >   5 files changed, 114 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 74698a3600..4d4886ac74 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -628,6 +628,7 @@ enum {
> > >       HFLAGS_FP = 13,  /* MSR_FP */
> > >       HFLAGS_PR = 14,  /* MSR_PR */
> > >       HFLAGS_PMCCCLEAR = 15, /* PMU MMCR0 PMCC equal to 0b00 */
> > > +    HFLAGS_MMCR0FC = 16, /* MMCR0 FC bit */
> > >       HFLAGS_VSX = 23, /* MSR_VSX if cpu has VSX */
> > >       HFLAGS_VR = 25,  /* MSR_VR if cpu has VRE */
> > > diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> > > index 5122632784..47dbbe6da1 100644
> > > --- a/target/ppc/helper.h
> > > +++ b/target/ppc/helper.h
> > > @@ -21,6 +21,7 @@ DEF_HELPER_1(hrfid, void, env)
> > >   DEF_HELPER_2(store_lpcr, void, env, tl)
> > >   DEF_HELPER_2(store_pcr, void, env, tl)
> > >   DEF_HELPER_2(store_mmcr0, void, env, tl)
> > > +DEF_HELPER_2(insns_inc, void, env, i32)
> > >   #endif
> > >   DEF_HELPER_1(check_tlb_flush_local, void, env)
> > >   DEF_HELPER_1(check_tlb_flush_global, void, env)
> > > diff --git a/target/ppc/helper_regs.c b/target/ppc/helper_regs.c
> > > index 4c1d9575ac..27d139edd8 100644
> > > --- a/target/ppc/helper_regs.c
> > > +++ b/target/ppc/helper_regs.c
> > > @@ -109,6 +109,9 @@ static uint32_t hreg_compute_hflags_value(CPUPPCState 
> > > *env)
> > >       if (((env->spr[SPR_POWER_MMCR0] & MMCR0_PMCC) >> 18) == 0) {
> > >           hflags |= 1 << HFLAGS_PMCCCLEAR;
> > >       }
> > > +    if (env->spr[SPR_POWER_MMCR0] & MMCR0_FC) {
> > > +        hflags |= 1 << HFLAGS_MMCR0FC;
> > > +    }
> > >   #ifndef CONFIG_USER_ONLY
> > >       if (!env->has_hv_mode || (msr & (1ull << MSR_HV))) {
> > > diff --git a/target/ppc/power8_pmu.c b/target/ppc/power8_pmu.c
> > > index 3f7b305f4f..9769c0ff35 100644
> > > --- a/target/ppc/power8_pmu.c
> > > +++ b/target/ppc/power8_pmu.c
> > > @@ -31,10 +31,13 @@ static void update_PMC_PM_CYC(CPUPPCState *env, int 
> > > sprn,
> > >       env->spr[sprn] += time_delta;
> > >   }
> > > -static void update_programmable_PMC_reg(CPUPPCState *env, int sprn,
> > > -                                        uint64_t time_delta)
> > > +static uint8_t get_PMC_event(CPUPPCState *env, int sprn)
> > 
> > I like the idea of splitting out a helper to get the selected event
> > (might even make sense to move that to the earlier patch).  What would
> > be even nicer is if it also included handling of the fact that some
> > events are specific to particular PMCs (like 0xF0 for PMC1).  That
> > means that all the event selection logic will be here, rather than
> > having to check the PMC number again in the caller.  Obviously to do
> > that you'll need some special "bad event" return value, which might
> > mean changing the return type.
> 
> The initial idea of this function was to be a simple event extractor
> returning 0 if MMCR1 is blank. In the end of the series there are 3 callers
> of this function that will execute a specific action based on the event
> returned by it.
> 
> I suppose that we can use the return value 0 as a 'bad value' based on the
> PMC events we're going to support, but that will not prevent the callers
> from doing a 'switch()' like logic, rechecking the PMC, to see which action
> is supposed to be taken.
> 
> IIUC, your idea would require an additional layer of abstraction, e.g. a
> PMCEvent object, that would tie together PMC + event. Then get_PMC_event()
> would return a PMCEvent object and the caller wouldn't need to re-check the
> PMC again.

More-or-less, yes.  I don't think the "object" would need to be
anything more than an enum though: "cycles", "instructions" or "invalid".

> I'll see how hard it would be to introduce this new concept in this existing
> series. If it ends up being too much rework I'll suggest to do this in a
> follow-up.
> 
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> > 
> > >   {
> > > -    uint8_t event, evt_extr;
> > > +    uint8_t evt_extr = 0;
> > > +
> > > +    if (env->spr[SPR_POWER_MMCR1] == 0) {
> > > +        return 0;
> > > +    }
> > >       switch (sprn) {
> > >       case SPR_POWER_PMC1:
> > > @@ -50,10 +53,16 @@ static void update_programmable_PMC_reg(CPUPPCState 
> > > *env, int sprn,
> > >           evt_extr = MMCR1_PMC4EVT_EXTR;
> > >           break;
> > >       default:
> > > -        return;
> > > +        return 0;
> > >       }
> > > -    event = extract64(env->spr[SPR_POWER_MMCR1], evt_extr, 
> > > MMCR1_EVT_SIZE);
> > > +    return extract64(env->spr[SPR_POWER_MMCR1], evt_extr, 
> > > MMCR1_EVT_SIZE);
> > > +}
> > > +
> > > +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn,
> > > +                                        uint64_t time_delta)
> > > +{
> > > +    uint8_t event = get_PMC_event(env, sprn);
> > >       /*
> > >        * MMCR0_PMC1SEL = 0xF0 is the architected PowerISA v3.1 event
> > > @@ -99,8 +108,9 @@ void helper_store_mmcr0(CPUPPCState *env, target_ulong 
> > > value)
> > >       env->spr[SPR_POWER_MMCR0] = value;
> > > -    /* MMCR0 writes can change HFLAGS_PMCCCLEAR */
> > > -    if ((curr_value & MMCR0_PMCC) != (value & MMCR0_PMCC)) {
> > > +    /* MMCR0 writes can change HFLAGS_PMCCCLEAR and HFLAGS_MMCR0FC */
> > > +    if (((curr_value & MMCR0_PMCC) != (value & MMCR0_PMCC)) ||
> > > +        (curr_FC != new_FC)) {
> > >           hreg_compute_hflags(env);
> > >       }
> > > @@ -123,4 +133,50 @@ void helper_store_mmcr0(CPUPPCState *env, 
> > > target_ulong value)
> > >       }
> > >   }
> > > +static bool pmc_counting_insns(CPUPPCState *env, int sprn)
> > > +{
> > > +    bool ret = false;
> > > +    uint8_t event;
> > > +
> > > +    if (sprn == SPR_POWER_PMC5) {
> > > +        return true;
> > > +    }
> > > +
> > > +    event = get_PMC_event(env, sprn);
> > > +
> > > +    /*
> > > +     * Event 0x2 is an implementation-dependent event that IBM
> > > +     * POWER chips implement (at least since POWER8) that is
> > > +     * equivalent to PM_INST_CMPL. Let's support this event on
> > > +     * all programmable PMCs.
> > > +     *
> > > +     * Event 0xFE is the PowerISA v3.1 architected event to
> > > +     * sample PM_INST_CMPL using PMC1.
> > > +     */
> > > +    switch (sprn) {
> > > +    case SPR_POWER_PMC1:
> > > +        return event == 0x2 || event == 0xFE;
> > > +    case SPR_POWER_PMC2:
> > > +    case SPR_POWER_PMC3:
> > > +    case SPR_POWER_PMC4:
> > > +        return event == 0x2;
> > > +    default:
> > > +        break;
> > > +    }
> > > +
> > > +    return ret;
> > > +}
> > > +
> > > +/* This helper assumes that the PMC is running. */
> > > +void helper_insns_inc(CPUPPCState *env, uint32_t num_insns)
> > > +{
> > > +    int sprn;
> > > +
> > > +    for (sprn = SPR_POWER_PMC1; sprn <= SPR_POWER_PMC5; sprn++) {
> > > +        if (pmc_counting_insns(env, sprn)) {
> > > +            env->spr[sprn] += num_insns;
> > > +        }
> > > +    }
> > > +}
> > > +
> > >   #endif /* defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY) */
> > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> > > index c3e2e3d329..b7235a2be0 100644
> > > --- a/target/ppc/translate.c
> > > +++ b/target/ppc/translate.c
> > > @@ -176,6 +176,7 @@ struct DisasContext {
> > >       bool tm_enabled;
> > >       bool gtse;
> > >       bool pmcc_clear;
> > > +    bool pmu_frozen;
> > >       ppc_spr_t *spr_cb; /* Needed to check rights for mfspr/mtspr */
> > >       int singlestep_enabled;
> > >       uint32_t flags;
> > > @@ -411,6 +412,12 @@ void spr_write_MMCR0(DisasContext *ctx, int sprn, 
> > > int gprn)
> > >        */
> > >       gen_icount_io_start(ctx);
> > >       gen_helper_store_mmcr0(cpu_env, cpu_gpr[gprn]);
> > > +
> > > +    /*
> > > +     * End the translation block because MMCR0 writes can change
> > > +     * ctx->pmu_frozen.
> > > +     */
> > > +    ctx->base.is_jmp = DISAS_EXIT_UPDATE;
> > >   }
> > >   #else
> > >   void spr_write_MMCR0(DisasContext *ctx, int sprn, int gprn)
> > > @@ -4407,6 +4414,22 @@ static inline void gen_update_cfar(DisasContext 
> > > *ctx, target_ulong nip)
> > >   #endif
> > >   }
> > > +#if defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY)
> > > +static void pmu_count_insns(DisasContext *ctx)
> > > +{
> > > +    /* Do not bother calling the helper if the PMU is frozen */
> > > +    if (ctx->pmu_frozen) {
> > > +        return;
> > > +    }
> > > +
> > > +    gen_helper_insns_inc(cpu_env, tcg_constant_i32(ctx->base.num_insns));
> > > +}
> > > +#else
> > > +static void pmu_count_insns(DisasContext *ctx)
> > > +{
> > > +    return;
> > > +}
> > > +#endif
> > >   static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
> > >   {
> > >       return translator_use_goto_tb(&ctx->base, dest);
> > > @@ -4421,9 +4444,17 @@ static void gen_lookup_and_goto_ptr(DisasContext 
> > > *ctx)
> > >           } else if (sse & (CPU_SINGLE_STEP | CPU_BRANCH_STEP)) {
> > >               gen_helper_raise_exception(cpu_env, 
> > > tcg_constant_i32(gen_prep_dbgex(ctx)));
> > >           } else {
> > > +            pmu_count_insns(ctx);
> > >               tcg_gen_exit_tb(NULL, 0);
> > >           }
> > >       } else {
> > > +        /*
> > > +         * tcg_gen_lookup_and_goto_ptr will exit the TB if
> > > +         * CF_NO_GOTO_PTR is set. Count insns now.
> > > +         */
> > > +        if (ctx->base.tb->flags & CF_NO_GOTO_PTR) {
> > > +            pmu_count_insns(ctx);
> > > +        }
> > >           tcg_gen_lookup_and_goto_ptr();
> > >       }
> > >   }
> > > @@ -4435,6 +4466,8 @@ static void gen_goto_tb(DisasContext *ctx, int n, 
> > > target_ulong dest)
> > >           dest = (uint32_t) dest;
> > >       }
> > >       if (use_goto_tb(ctx, dest)) {
> > > +        pmu_count_insns(ctx);
> > > +
> > >           tcg_gen_goto_tb(n);
> > >           tcg_gen_movi_tl(cpu_nip, dest & ~3);
> > >           tcg_gen_exit_tb(ctx->base.tb, n);
> > > @@ -8648,6 +8681,7 @@ static void 
> > > ppc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
> > >       ctx->tm_enabled = (hflags >> HFLAGS_TM) & 1;
> > >       ctx->gtse = (hflags >> HFLAGS_GTSE) & 1;
> > >       ctx->pmcc_clear = (hflags >> HFLAGS_PMCCCLEAR) & 1;
> > > +    ctx->pmu_frozen = (hflags >> HFLAGS_MMCR0FC) & 1;
> > >       ctx->singlestep_enabled = 0;
> > >       if ((hflags >> HFLAGS_SE) & 1) {
> > > @@ -8767,6 +8801,8 @@ static void ppc_tr_tb_stop(DisasContextBase 
> > > *dcbase, CPUState *cs)
> > >       switch (is_jmp) {
> > >       case DISAS_TOO_MANY:
> > >           if (use_goto_tb(ctx, nip)) {
> > > +            pmu_count_insns(ctx);
> > > +
> > >               tcg_gen_goto_tb(0);
> > >               gen_update_nip(ctx, nip);
> > >               tcg_gen_exit_tb(ctx->base.tb, 0);
> > > @@ -8777,6 +8813,14 @@ static void ppc_tr_tb_stop(DisasContextBase 
> > > *dcbase, CPUState *cs)
> > >           gen_update_nip(ctx, nip);
> > >           /* fall through */
> > >       case DISAS_CHAIN:
> > > +        /*
> > > +         * tcg_gen_lookup_and_goto_ptr will exit the TB if
> > > +         * CF_NO_GOTO_PTR is set. Count insns now.
> > > +         */
> > > +        if (ctx->base.tb->flags & CF_NO_GOTO_PTR) {
> > > +            pmu_count_insns(ctx);
> > > +        }
> > > +
> > >           tcg_gen_lookup_and_goto_ptr();
> > >           break;
> > > @@ -8784,6 +8828,8 @@ static void ppc_tr_tb_stop(DisasContextBase 
> > > *dcbase, CPUState *cs)
> > >           gen_update_nip(ctx, nip);
> > >           /* fall through */
> > >       case DISAS_EXIT:
> > > +        pmu_count_insns(ctx);
> > > +
> > >           tcg_gen_exit_tb(NULL, 0);
> > >           break;
> > 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]