Denis Ahrens recently posted a compression comparison of LZO and LZJB to the OpenSolaris ZFS forum. This is interesting work and there are plenty of opportunities for research and development of new and better ways of compressing data. But when does it make sense to actually implement a new compression scheme in ZFS?
The first barrier is the religious arguments surrounding licensing. I'd rather not go down that rat hole. Suffice it to say, if someone really wants to integrate, they will integrate.
The second barrier is patents. Algorithms can be patented, and in the US patents have real value ($). This is another rat hole, so let's assume that monies are exchanged and the lawyers are held at bay.
The third barrier is integration into the OS. Changes to a file system, especially a file system used for boot, take time to integrate with all of the other parts of the OS: installation, backup, upgrades, boot loaders, etc. This isn't especially hard, but it does take time and involves interacting with many different people.
Now we can get down to the nitty-gritty engineering challenges.
Today, disks use a 512-byte sector size. This is the smallest unit you can write to the disk, so compressing below 512 bytes gains nothing. Similarly, if compressing a larger record does not reduce its overall size by at least 512 bytes, it isn't worth compressing. Compression algorithms can also increase the size of a record, depending on the data and the algorithm. ZFS implements the policy that if compression does not reduce the record's size by more than 12.5% (1/8), the record is written uncompressed. This prevents inflation and provides a minimum bar for evaluating compression effectiveness. The smallest record size of interest is 8 sectors of 512 bytes, or 4 kBytes, since for a 4 kByte record that 12.5% threshold is exactly one sector.
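To make the policy concrete, here is a minimal sketch in Python, using zlib as a stand-in for a ZFS compressor. The function and helper names are mine, not ZFS code, but the 1/8 test mirrors the rule just described:

```python
import zlib

SECTOR = 512  # smallest unit the disk will write, in bytes

def sectors(nbytes):
    """Round a byte count up to whole 512-byte sectors."""
    return (nbytes + SECTOR - 1) // SECTOR

def stored_sectors(record, level=6):
    """Sectors a record would occupy under the 1/8 policy (illustrative only)."""
    compressed = zlib.compress(record, level)
    # Keep the compressed copy only if it saves more than 1/8 (12.5%)
    # of the record; otherwise store the record uncompressed.
    budget = len(record) - (len(record) >> 3)
    if len(compressed) <= budget:
        return sectors(len(compressed))
    return sectors(len(record))
```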
ZFS compresses each record of a file rather than the whole file. If a file contains some compressible bits and some incompressible bits, the file may still compress, depending on how the compressible bits are distributed through the file. This may seem like an odd thing to point out, but it matters because the maximum record size is 128 kBytes. When evaluating a compression algorithm for ZFS, the record sizes tested should range from 4 kBytes to 128 kBytes. In Denis' example, the test data is in the 200 MByte to 801 MByte range. Interesting, but it would be better to measure with the same policy that ZFS implements. Also, two of Denis' tests were on a tarball of many files. Again, this is interesting, but it will not be representative of the compression of the untarred files, especially files smaller than 4 kBytes.
Now we can build a test profile that compares the effectiveness of compression for ZFS. The records should range from 4 kBytes to 128 kBytes. To do this easily with existing files, split them, compress each piece, and compare the results after applying ZFS's policies. The results should also be compared in 512-byte sectors consumed, not file length. To demonstrate, I'll use an example. I took the zfs(1m) man page and split it into 4 kByte files, then compressed each piece with compress(1) and gzip(1). For gzip, I used the -6 option, which is the default for ZFS when gzip compression is specified.
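A short script can run the same comparison mechanically. This is a sketch, assuming Python's standard gzip module as a stand-in for gzip(1) -6 (compress(1)'s LZW has no standard-library equivalent, so it is left out); the RECORD constant can be raised to 131072 to cover the full range of ZFS record sizes:

```python
import gzip

SECTOR = 512
RECORD = 4096            # smallest record size of interest; try up to 131072

def sectors(nbytes):
    """Round a byte count up to whole 512-byte sectors."""
    return (nbytes + SECTOR - 1) // SECTOR

def compare(path):
    """Split a file into RECORD-sized pieces, gzip each at level 6, apply
    the 1/8 rule described earlier, and total the sectors each copy occupies."""
    raw = packed = 0
    with open(path, "rb") as f:
        while True:
            record = f.read(RECORD)
            if not record:
                break
            squeezed = gzip.compress(record, compresslevel=6)
            raw += sectors(len(record))
            if len(squeezed) <= len(record) - (len(record) >> 3):
                packed += sectors(len(squeezed))
            else:
                packed += sectors(len(record))
    return raw, packed

print(compare("zfs.1m"))   # prints (168, <total compressed sectors>)
```

Its sector totals should come out close to the sums of the gzip -6 column below, though the exact byte counts can differ slightly because gzip(1) also stores the original file name in its header.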
file | original length (bytes) | original sectors | compress length | compress sectors | gzip -6 length | gzip -6 sectors |
xaa | 4096 | 8 | 1688 | 4 | 945 | 2 |
xab | 4096 | 8 | 2198 | 5 | 1664 | 4 |
xac | 4096 | 8 | 2217 | 5 | 1615 | 4 |
xad | 4096 | 8 | 2081 | 5 | 1438 | 3 |
xae | 4096 | 8 | 2161 | 5 | 1468 | 3 |
xaf | 4096 | 8 | 2170 | 5 | 1544 | 4 |
xag | 4096 | 8 | 2174 | 5 | 1386 | 3 |
xah | 4096 | 8 | 2163 | 5 | 1442 | 3 |
xai | 4096 | 8 | 2154 | 5 | 1477 | 3 |
xaj | 4096 | 8 | 2279 | 5 | 1747 | 4 |
xak | 4096 | 8 | 2100 | 5 | 1299 | 3 |
xal | 4096 | 8 | 2066 | 5 | 1335 | 3 |
xam | 4096 | 8 | 2177 | 5 | 1481 | 3 |
xan | 4096 | 8 | 2056 | 5 | 1326 | 3 |
xao | 4096 | 8 | 2025 | 4 | 1206 | 3 |
xap | 4096 | 8 | 2073 | 5 | 1388 | 3 |
xaq | 4096 | 8 | 1960 | 4 | 1279 | 3 |
xar | 4096 | 8 | 1897 | 4 | 1272 | 3 |
xas | 4096 | 8 | 1766 | 4 | 1135 | 3 |
xat | 4096 | 8 | 2221 | 5 | 1478 | 3 |
xau | 3721 | 8 | 1820 | 4 | 1056 | 3 |
zfs.1m | 85641 | 168 | 31464 | 62 | 20582 | 41 |
It is clear that gzip -6 does a better job compressing for space on these text files. I won't measure the performance costs for this example, but in general gzip -6 uses more CPU resources than compress. The real space savings aren't obvious from the raw lengths, though. A bit of spreadsheet summing of the sector counts shows the compressed size as a percentage of the original:
compressor | split files | single file | all files |
compress | 59% | 37% | 48% |
gzip -6 | 39% | 24% | 32% |
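To spell out the gzip -6 row: the split files compress into 66 sectors versus 168 uncompressed (39%), the single file into 41 of 168 (24%), and both sets together into 107 of 336 (32%).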
This shows that the space savings from compression on a single, large file are much better than for smaller files. This also reiterates the issue with compression in general -- you can't accurately predict how well it will work in advance.
In conclusion, the work required to add a compressor to ZFS is largely dominated by non-technical issues. But proper evaluation of the technical issues is also needed, to be sure the engineering results justify the time and expense of tackling the non-technical ones. This can be done with experiments prior to coding to the ZFS interfaces. Denis and others are interested in improving ZFS, which is very cool. I think you should help improve ZFS too, or at least use it.