[Unionfs] RHEL 2.6.9 hangs with unionfs-1.1.5

Josef Sipek jsipek at fsl.cs.sunysb.edu
Thu Nov 30 13:31:41 EST 2006


On Thu, Nov 30, 2006 at 06:07:54PM +0100, Benoit Guillon wrote:
> Thanks for the hint. With this command I've seen some interesting 
> things. With this simple script several sparse files are created in a 
> read-only layer:
> 
> #! /bin/sh
> for i in 100 1000 10000 100000 1000000; do
>  dd if=/dev/zero of=sparse$i.bin bs=1M seek=$i count=1
> done
 
Ok.
 
> It gives:
> [guest@node1 log]$ ll
> total 5260
> -rwxr-xr-x  1 root root           115 Nov 30 16:44 build_sparse
> -rw-r--r--  1 root root 1048577048577 Nov 30 16:57 sparse1000000.bin
> -rw-r--r--  1 root root  104858648576 Nov 30 16:44 sparse100000.bin
> -rw-r--r--  1 root root   10486808576 Nov 30 16:44 sparse10000.bin
> -rw-r--r--  1 root root    1049624576 Nov 30 16:44 sparse1000.bin
> -rw-r--r--  1 root root     105906176 Nov 30 16:44 sparse100.bin
> 
> [guest@node1 log]$ du -kh sparse100*
> 1.1M    sparse1000000.bin
> 1.1M    sparse100000.bin
> 1.1M    sparse10000.bin
> 1.1M    sparse1000.bin
> 1.1M    sparse100.bin
> 
> The stack is unionfs mounted, NFS exported, and the diskless node 
> (node2) boots on this file system.

Makes sense.

> [root@node1 diskless]# ssh node2
> root@node2's password:
> -bash-3.00#
> -bash-3.00# cd /var/log/
> -bash-3.00# ll
> ...
> -rw-r--r--  1 root root     105906178 Nov 30  2006 sparse100.bin
> -rw-r--r--  1 root root    1049624578 Nov 30  2006 sparse1000.bin
> -rw-r--r--  1 root root   10486808578 Nov 30  2006 sparse10000.bin
> -rw-r--r--  1 root root  104858648578 Nov 30  2006 sparse100000.bin
> -rw-r--r--  1 root root 1048577048578 Nov 30  2006 sparse1000000.bin
> 
> -bash-3.00# du -kh sparse100*
> 1.1M    sparse100.bin
> 1.1M    sparse1000.bin
> 1.1M    sparse10000.bin
> 1.1M    sparse100000.bin
> 1.1M    sparse1000000.bin

Good.

> Then I cat 2 characters to sparse1000:
> 
> -bash-3.00# echo 1 >> sparse1000.bin
> 
> It takes a while to finish and... the file is no longer a sparse file!
>
> -bash-3.00# du -kh sparse100*
> 1.1M    sparse100.bin
> 1002M   sparse1000.bin
> 1.1M    sparse10000.bin
> 1.1M    sparse100000.bin
> 1.1M    sparse1000000.bin
 
Yes. That is one problem that's out of our control. When you read the
contents of a hole, you get a series of \0 bytes. This causes a small problem,
since there is no way unionfs can know if the file was sparse to begin with.
A while back, there was some discussion on the linux-kernel mailing
list about exposing information about holes to callers, but I don't
think it went anywhere.
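
To illustrate the point about holes, here is a small user-space sketch (it is
not unionfs code; the temporary file and the variable names are made up for
the example). It creates a file that is one large hole plus a single byte and
reads it back. The reader sees nothing but \0 bytes and cannot tell them from
zeros that were actually written.

```shell
#!/bin/sh
# Create a file consisting of a 1 MiB hole followed by one written byte.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048576 2>/dev/null

# Apparent size in bytes (the $(( )) strips any padding wc may print).
apparent=$(( $(wc -c < "$f") ))

# Number of bytes left after deleting every NUL byte from the data read back.
nonzero=$(( $(tr -d '\000' < "$f" | wc -c) ))

echo "apparent=$apparent nonzero=$nonzero"
rm -f "$f"
```

Every byte read from the file, hole or not, is a NUL, so a plain read() loop
gives a copier no hint about where the holes were.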
 
> Incidentally, the gigabyte file is created in the COW directory.
> Now, doing this on sparse10000.bin then freezes the server, with always 
> the same kernel trace (attached).

Hrm. About a month ago I fixed copyup of files > 4GB (or was it 2?) -
an obvious overflow. I guess I forgot to patch 1.1.5. I want to release
1.1.6 anyway (in addition to 1.5 for 2.6.19).

So, just to make sure...

Create files with the following commands, then do echo 1 >> foo on each one.

dd if=/dev/zero of=under2g bs=1 count=1 seek=2147483645
dd if=/dev/zero of=2g bs=1 count=1 seek=2147483646
dd if=/dev/zero of=over2g bs=1 count=1 seek=2147483648
dd if=/dev/zero of=under4g bs=1 count=1 seek=4294967293
dd if=/dev/zero of=4g bs=1 count=1 seek=4294967294
dd if=/dev/zero of=over4g bs=1 count=1 seek=4294967296

If it is an overflow, it should fail on at least the last one.
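
For reference, the sizes those dd commands produce can be worked out with a
few lines of shell arithmetic. This is only a sketch: it assumes the shell's
arithmetic is at least 64 bits wide, and it says nothing about which type
actually overflows in the copyup code. It just shows which sizes still fit in
a signed or unsigned 32-bit value and what is left after truncation to 32
bits.

```shell
#!/bin/sh
# Assumption: $(( )) arithmetic is 64-bit (true for bash and dash on 64-bit Linux).
for seek in 2147483645 2147483646 2147483648 4294967293 4294967294 4294967296; do
    size=$(( seek + 1 ))               # dd writes one byte after seeking
    low32=$(( size % 4294967296 ))     # what survives truncation to 32 bits
    if [ "$size" -gt 2147483647 ]; then s32=no; else s32=yes; fi
    if [ "$size" -gt 4294967295 ]; then u32=no; else u32=yes; fi
    echo "size=$size fits_signed32=$s32 fits_unsigned32=$u32 low32=$low32"
done
```

Only the last file (over4g) exceeds an unsigned 32-bit value, and over2g and
everything after it exceed a signed one, so the point at which the appends
start failing should hint at which limit is being hit.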

> I guess one thing wrong is that the sparse nature is not respected by 
> unionfs.

As I mentioned above, there is no way that we can know if a file is sparse
or not. We could of course try to re-sparsify the file during copyup, but
checking every block for being full of \0 is rather expensive.
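
A rough user-space sketch of what such a re-sparsifying copy involves is shown
below. It is not the unionfs copyup code; sparse_copy, the 4096-byte block
size, and the variable names are all hypothetical. Every block has to be read
and scanned for non-zero bytes before it can be skipped, which is where the
cost comes from.

```shell
#!/bin/sh
# sparse_copy SRC DST: copy SRC to DST, leaving all-zero blocks unwritten
# so that the filesystem can represent them as holes.
sparse_copy() {
    src=$1
    dst=$2
    bs=4096
    size=$(( $(wc -c < "$src") ))
    blocks=$(( (size + bs - 1) / bs ))
    : > "$dst"                          # start from an empty destination
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Read block i and count its non-NUL bytes; 0 means it is all \0.
        nonzero=$(( $(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null |
                      tr -d '\000' | wc -c) ))
        if [ "$nonzero" -ne 0 ]; then
            # Block holds real data: write it at the same offset in DST.
            dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
               conv=notrunc 2>/dev/null
        fi
        i=$(( i + 1 ))
    done
    # Set DST to the full length in case SRC ended with a skipped block.
    dd if=/dev/null of="$dst" bs=1 seek="$size" 2>/dev/null
}
```

Usage would be along the lines of sparse_copy sparse1000.bin copy.bin. In user
space, GNU cp --sparse=always does the same kind of zero-block detection.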

Thanks,

Josef "Jeff" Sipek.

-- 
Evolution, n.:
  A hypothetical process whereby infinitely improbable events occur with
  alarming frequency, order arises from chaos, and no one is given credit.

