Re: [RFC PATCH 07/12] hw/arm/smmu-common: Support nested translation


From: Mostafa Saleh
Subject: Re: [RFC PATCH 07/12] hw/arm/smmu-common: Support nested translation
Date: Mon, 25 Mar 2024 20:47:18 +0000

Hi Julien,

On Mon, Mar 25, 2024 at 02:20:07PM +0000, Julien Grall wrote:
> Hi Mostafa,
> 
> On 25/03/2024 10:14, Mostafa Saleh wrote:
> > @@ -524,7 +551,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg,
> >           tlbe->entry.translated_addr = gpa;
> >           tlbe->entry.iova = ipa & ~mask;
> >           tlbe->entry.addr_mask = mask;
> > -        tlbe->entry.perm = s2ap;
> > +        tlbe->parent_perm = tlbe->entry.perm = s2ap;
> >           tlbe->level = level;
> >           tlbe->granule = granule_sz;
> >           return 0;
> > @@ -537,6 +564,35 @@ error:
> >       return -EINVAL;
> >   }
> > +/* Combine 2 TLB enteries and return in tlbe. */
> > +static void combine_tlb(SMMUTLBEntry *tlbe, SMMUTLBEntry *tlbe_s2,
> > +                        dma_addr_t iova, SMMUTransCfg *cfg)
> > +{
> > +        if (cfg->stage == SMMU_NESTED) {
> > +
> > +            /*
> > +             * tg and level are used from stage-1, while the addr mask can be
> With the current approach, I can't boot a guest if I create a dummy stage-1
> using 512GB mapping and a stage-2 using 2MB mapping. It looks like this is
> because the level will be used during the TLB lookup.

Agh, I guess that case isn't common with Linux.

I was able to reproduce it with a hacked Linux driver. The issue
happens in smmu_iotlb_lookup(), because it assumes the cached entry has
a mask matching its level and granule. That is not correct with
nesting, which I missed, and fixing the mask alone is not enough here.

Relying on the mask of the found entry is not good either: if there is
a disparity between the stage-1 and stage-2 levels, we always miss in
the TLB, even for the same address.
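
To make the mismatch concrete, here is a minimal standalone sketch (not
QEMU code). It assumes the lookup derives its tag from level and granule
only, as described above. The toy_* names are hypothetical; only the
level-shift arithmetic follows the usual VMSA layout. It models your case:
a 512GB stage-1 block (level 0, 4K granule) whose cached mask was
shrunk to the 2MB stage-2 size.

```c
#include <assert.h>
#include <stdint.h>

/* Address bits covered by one block at `level` for a 2^gsz byte granule. */
static unsigned toy_level_shift(unsigned level, unsigned gsz)
{
    assert(level <= 3 && gsz > 3);
    return gsz + (3 - level) * (gsz - 3);
}

/* Hypothetical cached entry: only the fields that matter for the lookup. */
struct toy_entry {
    uint64_t iova;      /* tag the entry was inserted under */
    uint64_t addr_mask; /* range the entry really covers */
    unsigned level;
    unsigned gsz;
};

/* The lookup tag is computed from level and granule, not from addr_mask. */
static uint64_t toy_lookup_tag(uint64_t iova, unsigned level, unsigned gsz)
{
    uint64_t block_mask = (1ULL << toy_level_shift(level, gsz)) - 1;
    return iova & ~block_mask;
}

/* Returns 1 if the tag matches the entry for this iova. */
static int toy_tag_hit(const struct toy_entry *e, uint64_t iova)
{
    return toy_lookup_tag(iova, e->level, e->gsz) == e->iova;
}

/* Returns 1 if the iova really lies inside the range given by addr_mask. */
static int toy_covered(const struct toy_entry *e, uint64_t iova)
{
    return (iova & ~e->addr_mask) == e->iova;
}

/* Entry as cached by the original patch: stage-1 level, stage-2 sized mask. */
static struct toy_entry toy_nested_entry(void)
{
    struct toy_entry e = {
        .iova = 0,
        .addr_mask = (1ULL << 21) - 1, /* 2MB, from stage-2 */
        .level = 0,                    /* 512GB block, from stage-1 */
        .gsz = 12,                     /* 4K granule */
    };
    return e;
}
```

For an IOVA of 1GB, toy_tag_hit() reports a match because the tag is
512GB-aligned, while toy_covered() reports that the entry does not
actually cover that address.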

> 
> I managed to solve the issue by using the max level of the two stages. I
> think we may need to do a minimum for the granule.
> 

Just fixing the granule and level will always miss in the TLB if they
differ between the stages, as the granule is used in the lookup. I guess
one way is to fall back to the stage-2 granule in the lookup if the
stage-1 lookup fails.
I will have another look and see if there is a better solution for v2.

But for now, as you mentioned (we also need to update the IOVA to match
the mask), this should at least work:

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index ef5edfe4dc..ac2dc3efeb 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -572,21 +572,13 @@ static void combine_tlb(SMMUTLBEntry *tlbe, SMMUTLBEntry *tlbe_s2,
                         dma_addr_t iova, SMMUTransCfg *cfg)
 {
         if (cfg->stage == SMMU_NESTED) {
-
-            /*
-             * tg and level are used from stage-1, while the addr mask can be
-             * smaller in case stage-2 size(based on granule and level) was
-             * smaller than stage-1.
-             * That should have no impact on:
-             * - lookup: as iova is properly aligned with the stage-1 level and
-             *   granule.
-             * - Invalidation: as it uses the entry mask.
-             */
             tlbe->entry.addr_mask = MIN(tlbe->entry.addr_mask,
                                         tlbe_s2->entry.addr_mask);
             tlbe->entry.translated_addr = CACHED_ENTRY_TO_ADDR(tlbe_s2,
                                           tlbe->entry.translated_addr);
-
+            tlbe->granule = MIN(tlbe->granule, tlbe_s2->granule);
+            tlbe->level = MAX(tlbe->level, tlbe_s2->level);
+            tlbe->entry.iova = iova & ~tlbe->entry.addr_mask;
             /* parent_perm has s2 perm while perm has s1 perm. */
             tlbe->parent_perm = tlbe_s2->entry.perm;
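
As a sanity check of the arithmetic in the hunk above, here is a small
standalone sketch (again not QEMU code; the toy_* names and the struct
are hypothetical, and translated_addr and permissions are left out). It
applies the same MIN/MAX rule to the 512GB stage-1 / 2MB stage-2 case
with a 4K granule in both stages. I have only worked through this
same-granule case; mixed granules are not covered here.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIN(a, b) ((a) < (b) ? (a) : (b))
#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical subset of a TLB entry: just what the combine rule touches. */
struct toy_tlbe {
    uint64_t iova;
    uint64_t addr_mask;
    unsigned level;
    unsigned granule; /* log2 of the granule size */
};

/* Mask of one block at `level` for the given granule. */
static uint64_t toy_block_mask(unsigned level, unsigned granule)
{
    assert(level <= 3 && granule > 3);
    return (1ULL << (granule + (3 - level) * (granule - 3))) - 1;
}

/* Same rule as the diff: smallest mask and granule, deepest level. */
static struct toy_tlbe toy_combine(struct toy_tlbe s1, struct toy_tlbe s2,
                                   uint64_t iova)
{
    struct toy_tlbe out = s1;

    out.addr_mask = TOY_MIN(s1.addr_mask, s2.addr_mask);
    out.granule = TOY_MIN(s1.granule, s2.granule);
    out.level = TOY_MAX(s1.level, s2.level);
    /* Re-align the tag to the (possibly smaller) combined mask. */
    out.iova = iova & ~out.addr_mask;
    return out;
}

/* 512GB stage-1 block combined with a 2MB stage-2 block, 4K granules. */
static struct toy_tlbe toy_example(uint64_t iova)
{
    struct toy_tlbe s1 = { 0, toy_block_mask(0, 12), 0, 12 };
    struct toy_tlbe s2 = { 0, toy_block_mask(2, 12), 2, 12 };

    return toy_combine(s1, s2, iova);
}
```

In this case the combined entry ends up at level 2 with a 2MB mask, so
the block size implied by level and granule agrees with addr_mask again.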

> 
> > +             * smaller in case stage-2 size(based on granule and level) was
> > +             * smaller than stage-1.
> > +             * That should have no impact on:
> > +             * - lookup: as iova is properly aligned with the stage-1 level and
> > +             *   granule.
> > +             * - Invalidation: as it uses the entry mask.
> > +             */
> > +            tlbe->entry.addr_mask = MIN(tlbe->entry.addr_mask,
> > +                                        tlbe_s2->entry.addr_mask);
> > +            tlbe->entry.translated_addr = CACHED_ENTRY_TO_ADDR(tlbe_s2,
> > +                                          tlbe->entry.translated_addr);
> > +
> > +            /* parent_perm has s2 perm while perm has s1 perm. */
> > +            tlbe->parent_perm = tlbe_s2->entry.perm;
> > +            return;
> > +        }
> > +
> > +        /* That was not nested, use the s2. */
> > +        memcpy(tlbe, tlbe_s2, sizeof(*tlbe));
> > +}
> 
> Cheers,
> 
> -- 
> Julien Grall

Thanks,
Mostafa



