Justifying new compression algorithms in ZFS?

Denis Ahrens recently posted a compression comparison of LZO and LZJB to the OpenSolaris ZFS forum. This is interesting work and there are plenty of opportunities for research and development of new and better ways of compressing data. But when does it make sense to actually implement a new compression scheme in ZFS?

The first barrier is the religious argument surrounding licensing. I'd rather not begin to go down that rat hole. Suffice it to say, if someone really wants to integrate, they will integrate.

The second barrier is patents. Algorithms can be patented, and in the US patents have real value ($). This is another rat hole, so let's assume that monies are exchanged and the lawyers are held at bay.

The third barrier is integration into the OS. Changes to a file system, especially a file system used for boot, take time to integrate with all of the other parts of the OS: installation, backup, upgrades, boot loaders, etc. This isn't especially hard, but it does take time and involves interacting with many different people.

Now we can get down to the nitty-gritty engineering challenges.

Today, disks use a 512-byte sector size. This is the smallest size you can write to the disk, so compressing a record to below 512 bytes gains nothing. Similarly, if compressing a larger record does not reduce its overall size by at least one 512-byte sector, it isn't worth compressing. Compression algorithms can also increase the size of a record, depending on the data and the algorithm. ZFS implements the policy that if compression does not reduce the record's size by more than 12.5% (1/8), then the record is written uncompressed. This prevents inflation and provides a minimum threshold for evaluating compression effectiveness. The smallest record size of interest is therefore 8 sectors of 512 bytes, or 4 kBytes, the point at which the 1/8 threshold equals exactly one sector.
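To make the policy concrete, here is a minimal sketch in Python of the rule as described above; this is not the actual ZFS code, and the names are mine.

SECTOR = 512

def sectors(nbytes):
    # Round a byte count up to whole 512-byte sectors.
    return (nbytes + SECTOR - 1) // SECTOR

def on_disk_sectors(record_len, compressed_len):
    # Sectors actually written for one record under the 1/8 rule:
    # if compression saves 1/8 of the record or less, store it
    # uncompressed (this also guards against inflation).
    if compressed_len >= record_len - record_len // 8:
        return sectors(record_len)
    return sectors(compressed_len)

For a 128 kByte record, for example, the compressed result must come in under 114,688 bytes (112 kBytes) before it is worth writing compressed.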

ZFS compresses each record of a file rather than the whole file. If a file contains some compressible bits and some incompressible bits, then some of its records may be compressed while others are not, depending on how the compressible data is distributed through the file. This seems like an odd thing to say, but it matters because the maximum record size is 128 kBytes. When evaluating a compression algorithm for ZFS, the record sizes tested should range from 4 kBytes to 128 kBytes. In Denis' example, the test data is in the 200 MBytes to 801 MBytes range. Interesting, but it would be better to measure with the same policy that ZFS implements. Also, two of Denis' tests were on a tarball made up of many files. Again, this is interesting, but it will not be representative of the compression of the untarred files, especially files smaller than 4 kBytes.

Now we can build a test profile that compares the effectiveness of compression for ZFS. The records should range from 4 kBytes to 128 kBytes. To do this easily with existing files, they can be split into record-sized pieces, each piece compressed, and the results compared after applying ZFS' policies. The results should also be compared in sectors, not file lengths. To demonstrate, I'll use an example. I took the zfs(1m) man page and split it into 4 kByte files. Then I compressed each piece with compress(1) and gzip(1). For gzip, I used the -6 option, which is the default for ZFS when gzip compression is specified.
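As a sketch of how such a profile could be automated, here is a small Python harness. It assumes zlib level 6 as a stand-in for gzip -6 (the gzip file format adds a small header, so byte counts will differ slightly), and the function name and record sizes are just illustrative.

import zlib

SECTOR = 512

def sectors(nbytes):
    # Round a byte count up to whole 512-byte sectors.
    return (nbytes + SECTOR - 1) // SECTOR

def compressed_sectors(path, recordsize=4096, level=6):
    # Total sectors written for the file at the given record size,
    # applying the 1/8 rule to each record independently.
    total = 0
    with open(path, 'rb') as f:
        while True:
            record = f.read(recordsize)
            if not record:
                break
            clen = len(zlib.compress(record, level))
            if clen >= len(record) - len(record) // 8:
                total += sectors(len(record))   # stored uncompressed
            else:
                total += sectors(clen)          # stored compressed
    return total

# Sweep the record sizes of interest, 4 kBytes to 128 kBytes:
# for rs in (4096, 8192, 16384, 32768, 65536, 131072):
#     print(rs, compressed_sectors('zfs.1m', rs))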


file      original           compress           gzip -6
          length   sectors   length   sectors   length   sectors
xaa         4096         8     1688         4      945         2
xab         4096         8     2198         5     1664         4
xac         4096         8     2217         5     1615         4
xad         4096         8     2081         5     1438         3
xae         4096         8     2161         5     1468         3
xaf         4096         8     2170         5     1544         4
xag         4096         8     2174         5     1386         3
xah         4096         8     2163         5     1442         3
xai         4096         8     2154         5     1477         3
xaj         4096         8     2279         5     1747         4
xak         4096         8     2100         5     1299         3
xal         4096         8     2066         5     1335         3
xam         4096         8     2177         5     1481         3
xan         4096         8     2056         5     1326         3
xao         4096         8     2025         4     1206         3
xap         4096         8     2073         5     1388         3
xaq         4096         8     1960         4     1279         3
xar         4096         8     1897         4     1272         3
xas         4096         8     1766         4     1135         3
xat         4096         8     2221         5     1478         3
xau         3721         8     1820         4     1056         3
zfs.1m     85641       168    31464        62    20582        41


It is clear that gzip -6 does a better job of compressing these text files for space. I won't measure the performance costs for this example, but in general gzip -6 uses more CPU resources than compress. The real space savings are not obvious from the raw lengths, though. A bit of spreadsheet summing of the sector counts reveals:


on-disk space consumed, as a percentage of the original sectors:

            split files   single file   all files
compress        59%           37%          48%
gzip -6         39%           24%          32%
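To spell out the arithmetic behind the first column (my summing of the sector columns in the table above):

# Sanity check of the summing; sector totals are taken from the table
# (21 split files at 8 sectors each = 168 original sectors).
orig_sectors   = 168
compress_split = 99    # sum of the compress sector column, xaa..xau
gzip_split     = 66    # sum of the gzip -6 sector column, xaa..xau
print(round(100.0 * compress_split / orig_sectors))   # ~59
print(round(100.0 * gzip_split / orig_sectors))       # ~39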

This shows that the space savings from compression on a single, large file are much better than for many smaller files. It also reiterates the issue with compression in general -- you can't accurately predict how well it will work in advance.

In conclusion, the work required to add a compressor to ZFS is largely dominated by non-technical issues. But proper evaluation of the technical issues is also needed, to be sure that the engineering results justify the time and expense of tackling the non-technical ones. This can be done with experiments prior to coding to the ZFS interfaces. Denis and others are interested in improving ZFS, which is very cool. I think you should also help improve ZFS, or at least use it.
