
[Gluster-devel] [RFC] storage/posix: healing parent gfid xattrs


From: Raghavendra Gowdappa
Subject: [Gluster-devel] [RFC] storage/posix: healing parent gfid xattrs
Date: Mon, 2 Dec 2013 00:29:26 -0500 (EST)

Hi all,

With nameless lookups [1], every file and directory in a gluster file system
can be accessed in a flat namespace addressed by gfid. This lets us bypass
hierarchical path-based resolution and operate normally with gfid-based
resolution alone. One of the primary users of this feature is the gluster-nfs
server. However, there are other scenarios too (like a brick rebooting while
clients are connected to it) where we end up doing only gfid-based resolution.
While this works for most of our use cases, certain features rely on the
path-based hierarchy. Quota and (earlier versions of) geo-replication are two
such features. Hence we brought in a feature which, given a gfid, can give out
all possible paths to that gfid from root [2]. This feature stores a list of
xattr keys on the file, one for each parent directory of the file; the name of
each key is a function of the gfid of that parent directory. The value of each
key is the number of hardlinks to the file in that parent directory.

address@hidden gfs]# mkdir dir1 dir2
address@hidden gfs]# touch dir1/file
address@hidden gfs]# link dir1/file dir1/link
address@hidden gfs]# link dir1/file dir2/link

#pgfid keys and their values.
address@hidden gfs]# getfattr -e hex -d -m "trusted.pgfid.*" dir1/file
# file: dir1/file
trusted.pgfid.117d8ae9-4111-44dc-9c67-29ad676876ec=0x00000001
trusted.pgfid.addcacb9-30cb-4555-a450-9aa45598f445=0x00000002

#gfid of directories dir1 and dir2 from backend directories.
address@hidden gfs]# getfattr -e hex -n "trusted.gfid" /home/export/ptop/dir1
trusted.gfid=0xaddcacb930cb4555a4509aa45598f445

address@hidden gfs]# getfattr -e hex -n "trusted.gfid" /home/export/ptop/dir2
trusted.gfid=0x117d8ae9411144dc9c6729ad676876ec
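
The relation between the two outputs above can be sketched as follows. This is
only an illustration inferred from the getfattr output shown here, not the
storage/posix code: the helper names pgfid_key and decode_link_count are
hypothetical, and the 4-byte big-endian value layout is an assumption read off
the hex dumps (0x00000001, 0x00000002).

```python
import struct
import uuid

PGFID_PREFIX = "trusted.pgfid."


def pgfid_key(parent_gfid_hex):
    """Build the pgfid xattr key from a parent gfid printed by getfattr -e hex."""
    if parent_gfid_hex.startswith("0x"):
        parent_gfid_hex = parent_gfid_hex[2:]
    # str(UUID) yields the canonical lowercase 8-4-4-4-12 form seen in the keys.
    return PGFID_PREFIX + str(uuid.UUID(hex=parent_gfid_hex))


def decode_link_count(value_hex):
    """Decode a pgfid value, assumed to be a 32-bit big-endian integer."""
    if value_hex.startswith("0x"):
        value_hex = value_hex[2:]
    return struct.unpack(">I", bytes.fromhex(value_hex))[0]


if __name__ == "__main__":
    # gfid of dir1 and the value stored under its key on dir1/file
    print(pgfid_key("0xaddcacb930cb4555a4509aa45598f445"))
    print(decode_link_count("0x00000002"))
```

Read this way, dir1/file has two links in dir1 (file and link) and one link in
dir2, which matches the commands run above.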

For more details on how these pgfid xattrs are used to construct paths, please
refer to [2]. The problem this RFC tries to solve is: if pgfid xattrs are
absent on a file, how can we self-heal them? A couple of use cases where we can
run into this problem are:
1. Quota is enabled on pre-existing data.
2. A brick crashes after completing an operation that creates/deletes a dentry,
but before that operation is reflected in the pgfid xattrs. One way of bringing
in crash consistency is to remove the pgfid xattrs and heal them afresh.

The fundamental problem here is calculating the correct number of links to a
file in a directory without serializing other dentry operations against the
self-heal operation. It seems inevitable that we have to read the entire parent
directory to figure out the number of links. The following solution, it seems
to me, can solve the issue with the least impact on other dentry operations:

1. In a path-based lookup (where we have the parent gfid), check for the pgfid
xattr. If it is present, no self-heal is needed; if it is absent, proceed with
the self-heal operation outlined below. Self-heal can also be triggered for all
the files in a directory during a lookup on the directory.

2. Before starting the heal, get the current time:
    self_heal_start = gettimeofday()

3. Read the entire contents of the directory and store each file (probably in
an in-memory data structure like a list or rb-tree) along with the correct
count of links to that file in the directory represented by pgfid.

4. For each entry in the above list, do
   a. acquire a lock exclusively used for synchronization between self-heal 
code and other operations modifying dentries pointing to this file (say 
inode->dentry_lock)
      
      lock (inode->dentry_lock)

   b. if (inode_stbuf.st_ctime < self_heal_start) {
          setxattr pgfid key with correct link count on the file
      }

   c. unlock (inode->dentry_lock)

5. All the other operations modifying dentries (like create, link, rename, 
unlink, mknod, symlink) have to acquire inode->dentry_lock before 
adding/deleting dentries.
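
Steps 2 to 4 above can be sketched roughly as below. This is a user-space
illustration under stated assumptions, not brick code: _dentry_locks is a
hypothetical stand-in for inode->dentry_lock, a plain dict named xattrs is used
in place of a real setxattr call, hardlinks are grouped by inode number, and
the value is packed as a 32-bit big-endian integer.

```python
import os
import struct
import threading
import time
from collections import defaultdict

# Hypothetical per-inode locks standing in for inode->dentry_lock.
_dentry_locks = defaultdict(threading.Lock)


def count_links(dir_path):
    """Step 3: map inode number -> names of the entries in dir_path."""
    names = defaultdict(list)
    with os.scandir(dir_path) as it:
        for entry in it:
            names[entry.inode()].append(entry.name)
    return names


def heal_directory(dir_path, pgfid_key, xattrs, self_heal_start=None):
    """Steps 2-4: record link counts for files unchanged since the heal began.

    xattrs is a dict {(inode, key): bytes} used here instead of setxattr.
    Returns the inodes skipped by the ctime check in step 4b.
    """
    if self_heal_start is None:
        self_heal_start = time.time()                      # step 2
    skipped = []
    for ino, entry_names in count_links(dir_path).items():  # step 3
        path = os.path.join(dir_path, entry_names[0])
        with _dentry_locks[ino]:                           # steps 4a and 4c
            if os.lstat(path).st_ctime < self_heal_start:  # step 4b
                xattrs[(ino, pgfid_key)] = struct.pack(">I", len(entry_names))
            else:
                skipped.append(ino)
    return skipped
```

The lock is held only around the check and the write for one inode, so the
directory scan itself does not block create, link, rename or unlink on other
files.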

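This solution has a caveat: because of the check in 4b, the pgfid xattr may
never be healed on a file whose ctime keeps changing, since every heal attempt
finds st_ctime >= self_heal_start and skips the setxattr. We can work around
this situation by storing the number of failed heal attempts in the same pgfid
xattr key, and when the failures cross a certain value, we can: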

# Remember: this code is executed while holding inode->dentry_lock.
4b.1 if ((inode_stbuf.st_ctime >= self_heal_start) &&
         (self_heal_failures >= permissible_failures)) {
            read the entire parent directory
            calculate the link count for this file
            set the pgfid key with correct link count
     }
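
A rough sketch of this fallback, again only as an illustration. Assumptions:
the failure counter lives in a separate dict named failures (the RFC proposes
keeping it in the pgfid xattr itself, but does not fix an encoding), xattrs is
a dict used in place of setxattr, _dentry_locks stands in for
inode->dentry_lock, and the default threshold of 3 is arbitrary.

```python
import os
import struct
import threading
from collections import defaultdict

# Hypothetical per-inode locks standing in for inode->dentry_lock.
_dentry_locks = defaultdict(threading.Lock)


def links_in_dir(dir_path, ino):
    """Count the entries of dir_path that point at inode number ino."""
    with os.scandir(dir_path) as it:
        return sum(1 for entry in it if entry.inode() == ino)


def heal_entry(dir_path, name, scanned_count, self_heal_start,
               pgfid_key, xattrs, failures, permissible_failures=3):
    """Heal one file; return True if the pgfid value was written."""
    path = os.path.join(dir_path, name)
    ino = os.lstat(path).st_ino
    with _dentry_locks[ino]:
        if os.lstat(path).st_ctime < self_heal_start:
            # Step 4b: file untouched since the scan, trust the scanned count.
            count = scanned_count
        elif failures.get(ino, 0) >= permissible_failures:
            # Step 4b.1: too many failed attempts, recount under the lock.
            count = links_in_dir(dir_path, ino)
        else:
            failures[ino] = failures.get(ino, 0) + 1
            return False
        xattrs[(ino, pgfid_key)] = struct.pack(">I", count)
        failures.pop(ino, None)
        return True
```

Because the recount happens while the lock is held, dentry operations on this
file are blocked for the duration of one directory read, which is the cost
paid only after repeated failures.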

Improvements to the above solution or different solutions solving the same 
problem are welcome.
           
[1] http://review.gluster.com/#/c/669/
[2] http://review.gluster.org/#/c/5951/

regards,
Raghavendra.


