[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] fs/btrfs: Make extent item iteration to handle gaps
From: |
Michael Chang |
Subject: |
Re: [PATCH] fs/btrfs: Make extent item iteration to handle gaps |
Date: |
Thu, 28 Oct 2021 17:27:18 +0800 |
User-agent: |
Mutt/1.10.1 (2018-07-13) |
On Thu, Oct 28, 2021 at 03:36:10PM +0800, The development of GNU GRUB wrote:
> Gentle ping?
>
> Without this patch, the new mkfs.btrfs NO_HOLES feature would break any
> kernel/initramfs with hole in it.
>
> And considering the modification is already small, I believe this patch is
> definitely worthy as a bug fix.
+1
This does worth more attention as booting would have been blocked.
>
> Thanks,
> Qu
>
> On 2021/10/16 09:40, Qu Wenruo wrote:
> > [BUG]
> > Grub btrfs implementation can't handle two very basic btrfs file
> > layouts:
> >
> > 1. Mixed inline/regualr extents
> > # mkfs.btrfs -f test.img
> > # mount test.img /mnt/btrfs
> > # xfs_io -f -c "pwrite 0 1k" -c "sync" -c "falloc 0 4k" \
> > -c "pwrite 4k 4k" /mnt/btrfs/file
> > # umount /mnt/btrfs
> > # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"
> >
> > Such mixed inline/regular extents case is not recommended layout,
> > but all existing tools and kernel can handle it without problem
> >
> > 2. NO_HOLES feature
> > # mkfs.btrfs -f test.img -O no_holes
> > # mount test.img /mnt/btrfs
> > # xfs_io -f -c "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/btrfs/file
> > # umount /mnt/btrfs
> > # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"
> >
> > NO_HOLES feature is going to be the default mkfs feature in the incoming
> > v5.15 release, and kernel has support for it since v4.0.
> >
> > [CAUSE]
> > The way GRUB btrfs code iterates through file extents relies on no gap
> > between extents.
> >
> > If any gap is hit, then grub btrfs will error out, without any proper
> > reason to help debug the bug.
> >
> > This is a bad assumption, since a long long time ago btrfs has a new
> > feature called NO_HOLES to allow btrfs to skip the padding hole extent
> > to reduce metadata usage.
> >
> > The NO_HOLES feature is already stable since kernel v4.0 and is going to
> > be the default mkfs feature in the incoming v5.15 btrfs-progs release.
> >
> > [FIX]
> > When there is a extent gap, instead of error out, just try next item.
> >
> > This is still not ideal, as kernel/progs/U-boot all do the iteration
> > item by item, not relying on the file offset continuity.
> >
> > But it will be way more time consuming to correct the whole behavior
> > than starting from scratch to build a proper designed btrfs module for GRUB.
> >
> > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > ---
> > grub-core/fs/btrfs.c | 33 ++++++++++++++++++++++++++++++---
> > 1 file changed, 30 insertions(+), 3 deletions(-)
> >
> > diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> > index 63203034dfc6..4fbcbec7524a 100644
> > --- a/grub-core/fs/btrfs.c
> > +++ b/grub-core/fs/btrfs.c
> > @@ -1443,6 +1443,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
> > grub_size_t csize;
> > grub_err_t err;
> > grub_off_t extoff;
> > + struct grub_btrfs_leaf_descriptor desc;
> > if (!data->extent || data->extstart > pos || data->extino != ino
> > || data->exttree != tree || data->extend <= pos)
> > {
> > @@ -1455,7 +1456,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
> > key_in.type = GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM;
> > key_in.offset = grub_cpu_to_le64 (pos);
> > err = lower_bound (data, &key_in, &key_out, tree,
> > - &elemaddr, &elemsize, NULL, 0);
> > + &elemaddr, &elemsize, &desc, 0);
> > if (err)
> > return -1;
> > if (key_out.object_id != ino
> > @@ -1494,10 +1495,36 @@ grub_btrfs_extent_read (struct grub_btrfs_data
> > *data,
> > PRIxGRUB_UINT64_T "\n",
> > grub_le_to_cpu64 (key_out.offset),
> > grub_le_to_cpu64 (data->extent->size));
> > + /*
> > + * The way of extent item iteration is pretty bad, it completely
> > + * requires all extents are contiguous, which is not ensured.
> > + *
> > + * Features like NO_HOLE and mixed inline/regular extents can cause
> > + * gaps between file extent items.
> > + *
> > + * The correct way is to follow kernel/U-boot to iterate item by
> > + * item, without any assumption on the file offset continuity.
> > + *
> > + * Here we just manually skip to next item and re-do the verification.
> > + *
> > + * TODO: Rework the whole extent item iteration code, if not the
> > + * whole btrfs implementation.
> > + */
> > if (data->extend <= pos)
> > {
> > - grub_error (GRUB_ERR_BAD_FS, "extent not found");
> > - return -1;
> > + err = next(data, &desc, &elemaddr, &elemsize, &key_out);
> > + if (err < 0)
> > + return -1;
> > + /* No next item for the inode, we hit the end */
> > + if (err == 0 || key_out.object_id != ino ||
> > + key_out.type != GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM)
> > + return pos - pos0;
> > +
> > + csize = grub_le_to_cpu64(key_out.offset) - pos;
> > + buf += csize;
> > + pos += csize;
> > + len -= csize;
Does it make sense to add some sort of range check here to safegard len
from being underflow ? My justfication is that csize value is read out
from disk so can be anything wild presumably.
Thanks,
Michael
> > + continue;
> > }
> > }
> > csize = data->extend - pos;
> >
>
>
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel