[Unionfs] Locking issue with unionfs

Sun Oct 21 22:54:37 EDT 2007

In message <200710191147.12337.herton at mandriva.com.br>, Herton Ronaldo Krzesinski writes:
> Hi,
> 
> While using unionfs (2.1.3) we are having a problem. Basically, if you do
> something like this:
> 
> - mkdir -p /live/distrib
> - install a distro into /live/distrib so you can do a chroot/play with it
> - mkdir -p /live/memory
> - mount -t tmpfs none /live/memory
> - mkdir -p /live/union
> - mount -o dirs=/live/memory=rw:/live/distrib=ro -t unionfs unionfs /live/union
> - chroot /live/union
> - mkdir /foo
> - mount -t unionfs -o remount,add=/foo=rw none /live/union
> 
> Then the last command freezes instead of returning an error. This is just a
> reproducer, not a real use case. The real use case does something similar in
> order to do something like a pivot root inside a live CD.
> 
> Looking at the stack trace of mount command, you have something like this:
> mount         D dbdb8800     0 13892  13843
>        d1563b50 00200082 d739b000 dbdb8800 00000000 006013d4 c040f320 c040e9e0
>        c040e9e0 c040e9e0 d739b000 cd2d2580 cd2d26ac c13889e0 d739b000 dbdb8800
>        00000000 d1563baa d9b35000 d739b000 fffeffff d9337d68 cd2d2580 d1563b74
> Call Trace:
>  [<c01f9015>] rwsem_down_failed_common+0x75/0x180
>  [<c03157cd>] rwsem_down_read_failed+0x1d/0x28
>  [<c0315853>] call_rwsem_down_read_failed+0x7/0xc
>  [<dcf0f6df>] unionfs_lookup+0x1f/0x170 [unionfs]
>  [<c018ad90>] do_lookup+0x110/0x190
>  [<c018cfcc>] __link_path_walk+0x74c/0xdf0
>  [<c018d6b6>] link_path_walk+0x46/0xd0
>  [<c018d759>] path_walk+0x19/0x20
>  [<c018d917>] do_path_lookup+0x87/0x230
>  [<c018e627>] path_lookup+0x17/0x20
>  [<dcf1262c>] do_remount_add_option+0x10c/0x270 [unionfs]
>  [<dcf130dd>] unionfs_remount_fs+0x39d/0x8b0 [unionfs]
>  [<c01853c1>] do_remount_sb+0xc1/0x130
>  [<c019ab4d>] do_mount+0x23d/0x6d0
>  [<c019b55f>] sys_mount+0x6f/0xb0
>  [<c01041ba>] sysenter_past_esp+0x6b/0xa1
> 
> Looking at the trace above, we can see that mount is blocked in
> unionfs_lookup, probably at "unionfs_read_lock(sb);". Following the trace
> through the other unionfs functions, you find that unionfs_remount_fs takes
> "unionfs_write_lock(sb)" and still holds it when do_remount_add_option calls
> path_lookup, and because of this the deadlock happens. I made a patch that at
> least seems to work here on a quick test as a workaround. I don't know yet if
> it has side effects or if it could be the proper fix for this locking issue.
> Here is the patch:
> 
> --- linux-2.6.22.10-0.2mdv/fs/unionfs/super.c	2007-10-18 17:11:39.000000000 -0200
> +++ linux-2.6.22.10-0.2mdv.mod/fs/unionfs/super.c	2007-10-18 16:56:39.000000000 -0200
> @@ -587,10 +587,12 @@ static int unionfs_remount_fs(struct sup
>  		}
>  
>  		if (!strcmp("add", optname)) {
> +			unionfs_write_unlock(sb);
>  			err = do_remount_add_option(optarg, new_branches,
>  						    tmp_data,
>  						    tmp_lower_paths,
>  						    &new_high_branch_id);
> +			unionfs_write_lock(sb);
>  			if (err)
>  				goto out_release;
>  			new_branches++;
> 
> I just temporarily unlock sb, which is probably not the right fix, as I
> don't understand the locking scheme well yet.
> 
> --
> []'s
> Herton

Herton, your fix is probably OK in the short term.  I'm looking into a way
to solve this deadlock more cleanly.  It's not easy because a read-write
semaphore in Linux has no easy way to let the owner of the write lock also
take a read lock.

Cheers,
Erez.