[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] accel/tcg: Consider cluster index in tb_lookup_
From: |
Howard Spoelstra |
Subject: |
Re: [Qemu-devel] [PATCH] accel/tcg: Consider cluster index in tb_lookup__cpu_state() |
Date: |
Tue, 5 Feb 2019 18:14:17 +0100 |
On Tue, Feb 5, 2019 at 6:09 PM Mark Cave-Ayland <
address@hidden> wrote:
> On 05/02/2019 15:18, Peter Maydell wrote:
>
> > In commit f7b78602fdc6c6e4be we added the CPU cluster number to the
> > cflags field of the TB hash; this included adding it to the value
> > kept in tb->cflags, since we pass that field directly into the hash
> > calculation in some places. Unfortunately we forgot to check whether
> > other parts of the code were doing comparisons against tb->cflags
> > that would need to be updated.
> >
> > It turns out that there is exactly one such place: the
> > tb_lookup__cpu_state() function checks whether the TB it has
> > found in the tb_jmp_cache has a tb->cflags matching the cf_mask
> > that is passed in. The tb->cflags has the cluster_index in it
> > but the cf_mask does not.
> >
> > Hoist the "add cluster index to the cf_mask" code up from
> > tb_htable_lookup() to tb_lookup__cpu_state() so it can be considered
> > in the "did this TB match in the jmp cache" condition, as well as
> > when we do the full hash lookup by physical PC, flags, etc.
> > (tb_htable_lookup() is only called from tb_lookup__cpu_state(),
> > so this change doesn't require any further knock-on changes.)
> >
> > Fixes: f7b78602fdc6c6e4be ("accel/tcg: Add cluster number to TCG TB
> hash")
> > Reported-by: Howard Spoelstra <address@hidden>
> > Reported-by: Cleber Rosa <address@hidden>
> > Signed-off-by: Peter Maydell <address@hidden>
> > ---
> > Does anybody know why tb_lookup__cpu_state() has that odd
> > double-underscore in the middle of its name?
> >
> > Since the jmp_cache is per-vcpu we know that we're always going
> > to match on the cluster_index, so the other option would be to
> > leave the cluster_index bits out of the comparison, and leave the
> > "fold in cluster index to cf_mask" code in tb_htable_lookup().
> > Or we could require the callers of tb_lookup__cpu_state() to all
> > provide the cluster index, but that's more places to change,
> > so I prefer this.
> > ---
> > include/exec/tb-lookup.h | 4 ++++
> > accel/tcg/cpu-exec.c | 3 ---
> > 2 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/exec/tb-lookup.h b/include/exec/tb-lookup.h
> > index 492cb682894..26921b6dafd 100644
> > --- a/include/exec/tb-lookup.h
> > +++ b/include/exec/tb-lookup.h
> > @@ -28,6 +28,10 @@ tb_lookup__cpu_state(CPUState *cpu, target_ulong *pc,
> target_ulong *cs_base,
> > cpu_get_tb_cpu_state(env, pc, cs_base, flags);
> > hash = tb_jmp_cache_hash_func(*pc);
> > tb = atomic_rcu_read(&cpu->tb_jmp_cache[hash]);
> > +
> > + cf_mask &= ~CF_CLUSTER_MASK;
> > + cf_mask |= cpu->cluster_index << CF_CLUSTER_SHIFT;
> > +
> > if (likely(tb &&
> > tb->pc == *pc &&
> > tb->cs_base == *cs_base &&
> > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > index 7cf1292546f..60d87d5a19b 100644
> > --- a/accel/tcg/cpu-exec.c
> > +++ b/accel/tcg/cpu-exec.c
> > @@ -325,9 +325,6 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu,
> target_ulong pc,
> > struct tb_desc desc;
> > uint32_t h;
> >
> > - cf_mask &= ~CF_CLUSTER_MASK;
> > - cf_mask |= cpu->cluster_index << CF_CLUSTER_SHIFT;
> > -
> > desc.env = (CPUArchState *)cpu->env_ptr;
> > desc.cs_base = cs_base;
> > desc.flags = flags;
> >
>
>
Confirmed, both Mac OS 9.2 and OS X 10.4 running with qemu-system-ppc are
back to their old performance levels.
Best, and thanks,
Howard