qemu-devel
From: Ard Biesheuvel
Subject: Re: [PATCH 2/2] target/i386: Implement PCLMULQDQ using AArch64 PMULL instructions
Date: Thu, 1 Jun 2023 19:13:10 +0200

On Thu, 1 Jun 2023 at 14:33, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Use the AArch64 PMULL{2}.P64 instructions to implement PCLMULQDQ instead
> of emulating them in C code if the host supports this. This is used in
> the implementation of GCM, which is widely used in IPsec VPN and HTTPS.
>
> Somewhat surprising results: on my ThunderX2, enabling this on top of
> the AES acceleration I sent out earlier gives a substantial speedup
> (about 1.7x over AES acceleration alone in the test below).
>
> (1420 is a typical IPsec block size - in HTTPS, GCM operates on much
> larger block sizes but the kernel mode benchmarks are not the best place
> to measure its performance in this mode)
>
> tcrypt: testing speed of rfc4106(gcm(aes)) (rfc4106-gcm-aesni) encryption
>
> No acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 10046 operations in 1 seconds (14265320 bytes)
>
> AES acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 13970 operations in 1 seconds (19837400 bytes)
>
> AES + PMULL acceleration
> tcrypt: test 5 (160 bit key, 1420 byte blocks): 24372 operations in 1 seconds (34608240 bytes)
>
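
For readers unfamiliar with the instruction: PCLMULQDQ computes a carry-less (GF(2) polynomial) product of two 64-bit quadwords, each selected from a 128-bit operand by a bit of the immediate. The sketch below is a plain bit-by-bit model of that semantics, not the code from the patch; the function names `clmul64` and `pclmulqdq` are my own, and 128-bit registers are represented as Python integers for illustration only.

```python
MASK64 = (1 << 64) - 1

def clmul64(a: int, b: int) -> int:
    """Carry-less product of two 64-bit values (result is up to 127 bits)."""
    a &= MASK64
    b &= MASK64
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            # XOR instead of addition, so no carries propagate between bits.
            result ^= a << i
    return result

def pclmulqdq(x: int, y: int, imm8: int) -> int:
    """Model of PCLMULQDQ on 128-bit integers x (first source) and y (second).

    imm8 bit 0 selects the low or high quadword of x,
    imm8 bit 4 selects the low or high quadword of y.
    """
    xq = (x >> 64) if imm8 & 0x01 else x
    yq = (y >> 64) if imm8 & 0x10 else y
    return clmul64(xq, yq)

if __name__ == "__main__":
    # (x + 1) * (x + 1) = x^2 + 1 over GF(2), i.e. 0b11 * 0b11 = 0b101
    print(bin(clmul64(0b11, 0b11)))
```

On the AArch64 side, PMULL multiplies the low 64-bit halves of its inputs and PMULL2 the high halves, which correspond to immediates 0x00 and 0x11 in this model; the mixed selections (0x01, 0x10) presumably need one operand's halves rearranged first.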

User space benchmark (using OS's qemu-x86_64 vs one built with these
changes applied)

Speedup is about 5x at the largest block sizes (roughly 1.6x at 16 bytes)


ard@gambale:~/build/openssl$ apps/openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 1692138 AES-128-GCM's in 2.98s
Doing AES-128-GCM for 3s on 64 size blocks: 665012 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 203784 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 49397 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 6447 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 3058 AES-128-GCM's in 3.00s
version: 3.2.0-dev
built on: Thu Jun  1 17:06:09 2023 UTC
options: bn(64,64)
compiler: x86_64-linux-gnu-gcc -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
CPUINFO: OPENSSL_ia32cap=0xfed8320b0fcbfffd:0x8001020c01d843a9
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM       9085.30k    14186.92k    17389.57k    16860.84k    17604.61k    16700.76k



ard@gambale:~/build/openssl$ ../qemu/build/qemu-x86_64 apps/openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 2703271 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 1537884 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 653008 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 203579 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 29020 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 14716 AES-128-GCM's in 2.99s
version: 3.2.0-dev
built on: Thu Jun  1 17:06:09 2023 UTC
options: bn(64,64)
compiler: x86_64-linux-gnu-gcc -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
CPUINFO: OPENSSL_ia32cap=0xfed8320b0fcbfffd:0x8001020c01d843a9
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM      14465.66k    32808.19k    55723.35k    69488.30k    79243.95k    80637.77k
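
The per-block-size ratios behind the speedup figure can be recomputed from the two AES-128-GCM throughput rows. A small sketch, with the numbers copied from the output above (in 1000s of bytes per second) and variable names of my own choosing:

```python
sizes = [16, 64, 256, 1024, 8192, 16384]

# Throughput with the OS's qemu-x86_64 (first run above).
baseline = [9085.30, 14186.92, 17389.57, 16860.84, 17604.61, 16700.76]
# Throughput with the qemu-x86_64 built with these changes (second run above).
patched = [14465.66, 32808.19, 55723.35, 69488.30, 79243.95, 80637.77]

# Ratio of patched to baseline throughput for each block size.
speedups = {s: p / b for s, b, p in zip(sizes, baseline, patched)}

for size, ratio in speedups.items():
    print(f"{size:>6} bytes: {ratio:.2f}x")
```

The ratio grows with block size, as expected when the fixed per-call overhead is amortized over more data.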


