From: BALATON Zoltan
Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
Date: Fri, 21 Feb 2020 19:04:07 +0100 (CET)
User-agent: Alpine 2.22 (BSF 395 2020-01-19)

On Fri, 21 Feb 2020, Peter Maydell wrote:
> On Fri, 21 Feb 2020 at 16:05, BALATON Zoltan <address@hidden> wrote:
>> On Thu, 20 Feb 2020, Richard Henderson wrote:
>>> On 2/18/20 9:10 AM, BALATON Zoltan wrote:
>>>> + DEFINE_PROP_BOOL("hardfloat", PowerPCCPU, hardfloat, true),
>>>
>>> I would also prefer a different name here -- perhaps x-no-fp-fi.
>>
>> What's wrong with hardfloat? That's how the code refers to this so if
>> anyone searches what it does would turn up some meaningful results.
>
> This prompted me to check what you're using the property for.
> The cover letter says:
>
>> This patch implements a simple way to keep the inexact flag set for
>> hardfloat while still allowing to revert to softfloat for workloads
>> that need more accurate albeit slower emulation. (Set hardfloat
>> property of CPU, i.e. -cpu name,hardfloat=false for that.)
>
> I think that is the wrong approach. Enabling use of the host
> FPU should not affect the accuracy of the emulation, which
> should remain bitwise-correct. We should only be using the
> host FPU to the extent that we can do that without discarding
> accuracy. As far as I'm aware that's how the hardfloat support
> for other guest CPUs that use it works.

I don't know of a better approach. Please see section 4.2.2 Floating-Point Status and Control Register on page 124 in this document:

https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

especially the definition of the FR and FI bits, and tell me how we can emulate these accurately and still use the host FPU. Not using the FPU even when these bits are not needed (which seems to be the case for all workloads we've tested so far) seriously limits the emulation speed. Spending time to emulate an obscure and unused part of the architecture when it is not actually needed, just to keep the emulation accurate but unusably slow, does not seem to be the right approach.

In an ideal world this should of course be both fast and accurate, but nobody seems to have achieved that in the past two years. So maybe we could give up some accuracy now to get usable speed, and worry about emulating obscure features when we come across a workload that actually needs them. For such a workload we would still have the option to revert to accurate but slow emulation until a better way can be devised that's both fast and accurate. Insisting on accuracy without offering any solution to the current state just hinders progress on this.

Other PowerPC emulators also seem not to bother, or have a similar optimisation. I've quickly checked three that I know about:

https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppcdrc.cpp#L1893
https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppcdrc.cpp#L3503

there's also something here, but no mention of the FI bit that I could notice:

https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppccom.cpp#L2023

https://github.com/xenia-project/xenia/blob/master/src/xenia/cpu/ppc/ppc_hir_builder.cc#L428
https://github.com/dolphin-emu/dolphin/blob/master/Source/Core/Core/PowerPC/Jit64/Jit_FloatingPoint.cpp

But I'm not sure I understand all of the above, so I hope it makes more sense to someone who can advise.

Regards,
BALATON Zoltan