qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu:


From: Andrew Jones
Subject: Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties
Date: Fri, 28 Jun 2019 09:27:39 +0200
User-agent: NeoMutt/20180716

On Thu, Jun 27, 2019 at 06:49:02PM +0200, Richard Henderson wrote:
> On 6/21/19 6:34 PM, Andrew Jones wrote:
> > +    /*
> > +     * In sve_vq_map each set bit is a supported vector length of
> > +     * (bit-number + 1) * 16 bytes, i.e. each bit number + 1 is the vector
> > +     * length in quadwords. We need a map size twice the maximum
> > +     * quadword length though because we use two bits for each vector
> > +     * length in order to track three states: uninitialized, off, and on.
> > +     */
> > +    DECLARE_BITMAP(sve_vq_map, ARM_MAX_VQ * 2);
> 
> I don't see that having one continuous bitmap is more convenient than two.
> Indeed, there appear to be several places that would be clearer with two.
> 
> > +static arm_vq_state arm_cpu_vq_map_get(ARMCPU *cpu, int vq)
> > +{
> > +    assert(vq <= ARM_MAX_VQ);
> > +
> > +    return test_bit(vq - 1, cpu->sve_vq_map) |
> > +           test_bit(vq - 1 + ARM_MAX_VQ, cpu->sve_vq_map) << 1;
> > +}
> 
> Neither easier nor more complex w/ one or two bitmaps.
> 
> > +static void arm_cpu_vq_map_init(ARMCPU *cpu)
> > +{
> > +    bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ * 2);
> > +    bitmap_set(cpu->sve_vq_map, ARM_MAX_VQ, ARM_MAX_VQ);
> > +}
> 
> Clearer with two bitmaps.
> 
>       bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ);
>       bitmap_set(cpu->sve_vq_uninit_map, 0, ARM_MAX_VQ);
> 
> > +static bool arm_cpu_vq_map_is_finalized(ARMCPU *cpu)
> > +{
> > +    DECLARE_BITMAP(map, ARM_MAX_VQ * 2);
> > +
> > +    bitmap_zero(map, ARM_MAX_VQ * 2);
> > +    bitmap_set(map, ARM_MAX_VQ, ARM_MAX_VQ);
> > +    bitmap_and(map, map, cpu->sve_vq_map, ARM_MAX_VQ * 2);
> > +
> > +    return bitmap_empty(map, ARM_MAX_VQ * 2);
> > +}
> 
> Definitely clearer w/ 2 bitmaps,
> 
>       return bitmap_empty(cpu->sve_vq_uninit_map);

I guess I don't have a strong opinion on one or two bitmaps. I'm not a big
fan of adding temporary variables to data structures (even if the same
amount of storage is being allocated a different way), but I can change
this for v3.

> 
> 
> As for how sve-max-vq=Y and sveX={on,off} interoperate...  I wonder if we can
> just remove cpu->sve_max_vq.  That might simplify your code a bit.
> 
> What if sve-max-vq=Y "expands" to
> 
>       for (X = 1; X <= Y; X++) { sve(X*128)=on }
> 
> Then you've got a reasonable in-order definition of how those two sets of
> switches interoperate.
> 
> The uses of cpu->sve_max_vq within cpu.c and cpu64.c are all initialization
> stuff that you're replacing.

Hmm, I can look at removing cpu->sve_max_vq, by keeping the property and
expanding it, as you suggest. However I still need a few initialization
states to track what the default vq value should be, so I'm not sure
how I'd implement that without it or some other cpu state, which would
be another temporary cpu state member. Also, while it's true we can always
get the max vq with next-smaller(ARM_MAX_VQ + 1), having it cached in
cpu->sve_max_vq is convenient. That said, I think we'd rather keep it.

> 
> The use within sve_zcr_len_for_el can be replaced by AVR_MAX_VQ.  Your "select
> supported vector size not larger than" code goes at the end of that function,
> such that we select a supported maximum no larger than the raw .LEN values.
> 
> The use within aarch64_sve_narrow_vq should in fact assert that the given vq 
> is
> set within cpu->sve_vq_map.

Yeah, I'm glad Eric pointed out that I had a bug there in the emulation.
I'll get that fixed for v3.

Thanks,
drew



reply via email to

[Prev in Thread] Current Thread [Next in Thread]