On Mon, 26 Sept 2022 at 07:05, Cédric Le Goater <clg@kaod.org> wrote:
On 9/26/22 08:26, Cédric Le Goater wrote:
Currently, the CPU features exposed to the AST2600 QEMU machines are:
half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
vfpd32 lpae evtstrm
But the features of the Cortex A7 CPU on the Aspeed AST2600 A3 SoC
are:
half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls vfpv4 idiva idivt
lpae evtstrm
The vfpv3d16 feature bit is common to both vfpv3 and vfpv4, and for
this SoC, QEMU should advertise a VFPv4 unit with 16 double-precision
registers, and not 32 registers.
Drop NEON support and hack the default mvfr0 register value of the
Cortex A7 to advertise 16 registers.
How can that be done cleanly? Should we:
* introduce a new A7 CPU with its own _initfn routine?
* introduce a new CPU property to set the number of "Advanced SIMD
and floating-point" registers in arm_cpu_realizefn()?
There is a note in the Cortex A7 MPCore Technical Reference Manual saying:
"When FPU option is selected without NEON, the FPU is VFPv4-D16 and uses 16
double-precision registers. When the FPU is implemented with NEON, the FPU is
VFPv4-D32 and uses 32 double-precision registers. This register bank is shared
with NEON."
The datasheet only has this to say:
"1.2GHz dual-core ARM Cortex A7 (r0p5) 32-bit CPU with FPU"
With no details about the FPU. The hardware is a golden reference though:
fpsid: 41023075
mvfr0: 10110221
mvfr1: 11000011
$ bitfield mvfr0 0x10110221
decoding as Media and VFP Feature Register 0
0x10110221 [269550113]
A_SIMD registers: 0x1 [16 x 64-bit registers]
Single precision: 0x2 [Supported, VFPv4 or VFPv3]
Double precision: 0x2 [Supported, VFPv4 or VFPv3]
VFP exception trapping: 0x0 [Not supported]
Divide: 0x1 [Hardware divide is supported]
Square Root: 0x1 [Hardware square root supported]
Short vectors: 0x0 [Not supported]
VFP Rounding Modes: 0x1 [All modes supported]
$ bitfield mvfr1 0x11000011
decoding as Media and VFP Feature Register 1
0x11000011 [285212689]
FZ: 0x1
D_NaN mode: 0x1
A_SIMD load/store: 0x0
A_SIMD integer: 0x0
A_SIMD SPFP: 0x0
A_SIMD HPFP: 0x0
VFP HPFP: 0x2
A_SIMD FMAC: 0x1
As you say, no NEON and 16 64-bit registers.
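For reference, the register count in the decode above comes from the low
nibble of mvfr0. A minimal C sketch of that decode follows; the helper names
id_field() and mvfr0_num_dregs() are made up for illustration and are not
QEMU or Linux functions.

```c
#include <stdint.h>

/* Extract the 4-bit field that starts at bit `shift` of an ID register. */
static inline uint32_t id_field(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xfu;
}

/*
 * MVFR0[3:0] is the "A_SIMD registers" field: 1 means 16 x 64-bit
 * registers, 2 means 32 x 64-bit registers. Any other value is
 * treated as "no register bank" in this sketch.
 */
static inline unsigned mvfr0_num_dregs(uint32_t mvfr0)
{
    switch (id_field(mvfr0, 0)) {
    case 1:
        return 16;
    case 2:
        return 32;
    default:
        return 0;
    }
}
```

With the hardware value 0x10110221, mvfr0_num_dregs() gives 16, matching
the bitfield output.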
Could we deduce the number of registers from the availability of the NEON
feature, on A7 only?
We certainly should make the NEON property match the mvfr1 value.
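Deducing the register count from NEON availability could look roughly like
the C sketch below. It only encodes the rule from the TRM note (D16 without
NEON, D32 with NEON). mvfr0_for_neon() is a hypothetical helper, not an
existing QEMU function, and where such a fixup would live in QEMU is left
open.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return mvfr0 with the "A_SIMD registers" field
 * (bits [3:0]) rewritten to match NEON availability:
 *   NEON present -> 2 (32 x 64-bit registers)
 *   NEON absent  -> 1 (16 x 64-bit registers)
 * All other fields are left untouched.
 */
static inline uint32_t mvfr0_for_neon(uint32_t mvfr0, bool has_neon)
{
    uint32_t simdreg = has_neon ? 2u : 1u;

    return (mvfr0 & ~0xfu) | simdreg;
}
```

For example, taking 0x10110222 as an illustrative D32 input,
mvfr0_for_neon(0x10110222, false) yields 0x10110221, the value read from
the hardware.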
Linux tests for NEON with this:
(fmrx(MVFR1) & 0x000fff00) == 0x00011100
https://elixir.bootlin.com/linux/v5.19/source/arch/arm/vfp/vfpmodule.c#L812
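As a quick check, the same comparison can be applied to the mvfr1 value read
from the hardware. The mask 0x000fff00 covers bits [19:8], i.e. the three
fields shown above as A_SIMD load/store, A_SIMD integer and A_SIMD SPFP, and
the test requires each of them to be 1. A small standalone C sketch follows;
mvfr1_has_neon() is a name invented here, the kernel does the comparison
inline.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same comparison as the Linux test quoted above, applied to a raw
 * MVFR1 value instead of fmrx(MVFR1): the A_SIMD load/store, integer
 * and single-precision FP fields (bits [19:8]) must all read 1.
 */
static inline bool mvfr1_has_neon(uint32_t mvfr1)
{
    return (mvfr1 & 0x000fff00u) == 0x00011100u;
}
```

For the hardware value 0x11000011 the masked result is 0, so Linux would
not report NEON on the real SoC.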