qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 2/2] target/s390x: support SHA-512 extensions


From: Jason A. Donenfeld
Subject: Re: [PATCH v6 2/2] target/s390x: support SHA-512 extensions
Date: Fri, 5 Aug 2022 15:01:40 +0200

Hi David,

On Fri, Aug 05, 2022 at 01:28:18PM +0200, David Hildenbrand wrote:
> On 03.08.22 19:15, Jason A. Donenfeld wrote:
> > In order to fully support MSA_EXT_5, we have to also support the SHA-512
> > special instructions. So implement those.
> > 
> > The implementation began as something TweetNacl-like, and then was
> > adjusted to be useful here. It's not very beautiful, but it is quite
> > short and compact, which is what we're going for.
> > 
> 
> NIT: we could think about reversing the order of patches. IIRC, patch #1
> itself would trigger a warning when starting QEMU. Having this patch
> first make sense logically.

Good idea. Will do.

> > +static int kimd_sha512(CPUS390XState *env, uintptr_t ra, uint64_t 
> > parameter_block,
> > +                       uint64_t *message_reg, uint64_t *len_reg, uint8_t 
> > *stack_buffer)
> > +{
> > +    enum { MAX_BLOCKS_PER_RUN = 64 }; /* This is arbitrary, just to keep 
> > interactivity. */
> 
> I'd just use a #define outside of the function for that.

Why? What does leaking this into file-level scope do?

> 
> > +    uint64_t z[8], b[8], a[8], w[16], t;
> > +    uint64_t message = message_reg ? *message_reg : 0, len = *len_reg, 
> > processed = 0;
> > +    int i, j, reg_len = 64, blocks = 0, cc = 0;
> > +
> > +    if (!(env->psw.mask & PSW_MASK_64)) {
> > +        len = (uint32_t)len;
> > +        reg_len = (env->psw.mask & PSW_MASK_32) ? 32 : 24;
> > +    }
> 
> 
> I'd call that message_reg_len. (same in other function)

Will do.

> 
> 
> > +
> > +    for (i = 0; i < 8; ++i) {
> > +        z[i] = a[i] = cpu_ldq_be_data_ra(env, wrap_address(env, 
> > parameter_block + 8 * i), ra);
> 
> I assume if we get any exception here, we simply didn't make any progress.
> 
> > +    }
> > +
> > +    while (len >= 128) {
> > +        if (++blocks > MAX_BLOCKS_PER_RUN) {
> > +            cc = 3;
> > +            break;
> > +        }
> > +
> > +        for (i = 0; i < 16; ++i) {
> > +            if (message) {
> > +                w[i] = cpu_ldq_be_data_ra(env, wrap_address(env, message + 
> > 8 * i), ra);
> 
> dito

Right, there's no progress, because it's only ever incremented at the
end. And, more importantly, we only ever update the parameter_block
after having done things successfully.

> > +    for (i = 0; i < 8; ++i) {
> > +        cpu_stq_be_data_ra(env, wrap_address(env, parameter_block + 8 * 
> > i), z[i], ra);
> 
> I wonder what happens if we get an exception somewhere in the middle
> here ... fortunately we can only involve 2 pages.

If this fails, then message_reg and len_reg won't be updated, so it will
have to start over. If it fails part way through, though, then things
are inconsistent. I don't think we want to hassle with trying to restore
the previous state or something insane though. That seems a bit much.

> > +    cc = kimd_sha512(env, ra, parameter_block, message_reg, len_reg, NULL);
> > +    if (cc) {
> > +        return cc;
> > +    }
> 
> Doesn't kimd_sha512() update the length register? And if we return with
> cc=3, we'd be in trouble, no?

cc=3 means partial completion. In that case, klmd also returns with a
partial completion. That's good and expected! It means that the next
time it's called, it'll keep going where it left off.

I've actually tried this with the Linux implementation, and it works as
expected.

> One idea could be to simply only process one block at a time. Read all
> inputs first for that block and handle it completely without any
> register modifications. Perform all memory writes in a single call.

That *is* what already happens. Actually, the memory writes only ever
happen at the very end of kimd_sha512.

> Further, I wonder if we should factor out the core of kimd_sha512() to
> only work on temp buffers without any loading/storing of memory, and let
> only kimd_sha512/klmd_sha512 perform all loading/storing. Then it's much
> cleaner who modifies what.

That's not necessary and will complicate things ultimately. See the
above; this is already working as expected.

Jason



reply via email to

[Prev in Thread] Current Thread [Next in Thread]