2011/11/01

Filesystem Benchmarks - Part I

This is the first part of our file system benchmark and deals primarily with FSI Server performance on different types of local file systems. The second part will deal with the I/O performance of virtual machines.
To make the results comparable, we tested all file systems currently usable on Linux with the I/O benchmark included in FSI Server.

The benchmark results cover the currently common file systems ext3, ext4, NTFS and XFS as well as older systems such as ext2 and ReiserFS 3, plus the very new btrfs.

The test system:

Hardware:
2x Xeon L5410 @ 2.33GHz
8 Cores/8 Threads, no Hyperthreading
Intel 5000P Mainboard
16 GB RAM
Adaptec AAC-Raid 5
4x Samsung 750GB S-ATA 300

Software:
Debian 6 (Squeeze), Kernel 2.6.32-5-amd64 (cfq)
Microsoft Windows Server 2008 R2
Oracle JDK 1.7.0-b147 (64bit)
FSI Server Benchmark 2.0.235


Test Script:
#!/bin/sh

java=/usr/bin/java
jars=./lib

# select the io benchmark (subtests hotrun and fixed) with a runtime factor of 3
timeFactor=3
tests=io
subtests=hotrun,fixed

javaOptions="-Djava.ext.dirs=$jars -Djava.awt.headless=true -server"
vmOptions="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseBiasedLocking -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -Xms1024m -Xmx1024m"

# run the benchmark successively with 1 to 8 fixed worker threads
t=1
while [ $t -le 8 ]
do
        # flush pending writes before each run so the runs do not influence each other
        /bin/sync
        $java $javaOptions $vmOptions -jar $jars/fsi.jar start runTimeFactor=$timeFactor tests=$tests subtests=$subtests fixedthreads=$t workdir=/mnt/ calibrate verbose > result_threads_$t.txt
        t=$(( $t + 1 ))
done


What is tested:

The I/O benchmark is one of the six benchmarks included in the FSI Server package. It creates an FSI Server Storage in version V1001, the version used by FSI Server 1 and 2.
Afterwards, typical file operations of the FSI Server Importer and FSI Server Core are performed. Primarily, this consists of creating, reading and deleting a large number of small files in various directories.
This test is performed successively for 1 to 8 threads, with 8 permanently busy threads being roughly equivalent to a heavily loaded server.
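For illustration only, a rough shell sketch of this access pattern (create, read back and delete many small files with several concurrent workers) could look like the script below. It is not the actual FSI Server benchmark; the work directory, file size and file counts are arbitrary assumptions.

#!/bin/sh
# Hypothetical approximation of the benchmarked access pattern, NOT the FSI benchmark itself.
workdir=/mnt/iotest
workers=${1:-4}          # number of concurrent worker processes
files_per_dir=1000       # small files handled by each worker

worker() {
        dir=$workdir/worker_$1
        mkdir -p "$dir"
        i=0
        while [ $i -lt $files_per_dir ]; do
                # write a ~4 KB file, read it back, then delete it
                dd if=/dev/zero of="$dir/f_$i" bs=4k count=1 2>/dev/null
                cat "$dir/f_$i" > /dev/null
                rm "$dir/f_$i"
                i=$(( $i + 1 ))
        done
}

w=1
while [ $w -le $workers ]; do
        worker $w &      # start each worker in the background
        w=$(( $w + 1 ))
done
wait                     # wait for all workers to finish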

We decided to use the brand-new Oracle Java 7 build 147 as the virtual machine, because preliminary tests with Java 7 delivered significantly more consistent results (thanks to NIO.2) than Java 6.
This is particularly apparent in the overload test included in the benchmark.
Apart from that, the results of Java 6 and Java 7 are quite comparable as long as the number of threads does not exceed the number of physical CPU cores, which never happens in this test.
All tests were performed with the mount option noatime, except for NTFS.
The noatime option prevents the access timestamp of a file from being updated on every read, which can noticeably increase I/O throughput when accessing many files.
We skipped a direct comparison of atime and noatime, since the access time is used by neither FSI Server nor FSI Cache.
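As an example, such a mount looks like this (the device name and mount point are placeholders), either on the command line or via /etc/fstab:

# mount an ext4 volume with noatime (hypothetical device and mount point)
mount -t ext4 -o noatime /dev/sdb1 /mnt

# or permanently via /etc/fstab:
# /dev/sdb1   /mnt   ext4   noatime   0   2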


Test filesystems and options:

NILFS 2, noatime
JFS, noatime
ReiserFS 3.6.21, noatime
Ext2, noatime
Ext3, noatime
Ext4, noatime, data=journal
Ext4, noatime, data=ordered
Ext4, noatime, data=writeback
Ext4, noatime, data=writeback, barrier=0
XFS, noatime
XFS, noatime, nobarrier
btrfs, noatime
btrfs, noatime, nobarrier
NTFS, allocation unit size 4096

Since ext4 is currently the default file system on nearly all Linux distributions, the typical performance options such as data=writeback and barrier=0, which are mainly intended for battery-backed RAID systems (see link), were tested in addition. The new btrfs, which must still be considered experimental, was tested in version 0.19.
NTFS under Windows Server 2008 R2 was tested on the same hardware as well.
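For reference, the mounts for the tested configurations follow this pattern (device and mount point are placeholders; the exact commands are not reproduced here):

# ext4 with full data journaling
mount -t ext4 -o noatime,data=journal /dev/sdb1 /mnt

# ext4 writeback without write barriers (only advisable with a battery-backed RAID controller)
mount -t ext4 -o noatime,data=writeback,barrier=0 /dev/sdb1 /mnt

# XFS and btrfs without write barriers
mount -t xfs   -o noatime,nobarrier /dev/sdb1 /mnt
mount -t btrfs -o noatime,nobarrier /dev/sdb1 /mnt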


The results in detail:

NILFS 2, noatime

Even though NILFS2 has only been available since kernel 2.6.30 and is objectively not yet relevant for production systems, we benchmarked this log-structured file system.
Not only are the results extraordinarily low and therefore poor, the performance also collapses massively in the multi-thread tests.
This behavior is apparently caused by the underlying recovery design of NILFS.

Threads     1          2          3          4          5          6          7          8
Passes/s    547.01     447.43     290.07     181.03     98.57      96.84      64.65      97.91



JFS, noatime

This file system, which stems mainly from the AIX environment, shows significant weaknesses on Linux. The single-thread performance is comparable to the other file systems tested, but with two threads the throughput already drops to roughly 1/35 of that value (around 77 passes/s).
This disqualifies the file system for use with multi-threaded I/O access on Linux.


Threads     1          2          3          4          5          6          7          8
Passes/s    2846.65    76.94      56.27      56.14      55.44      59.01      74.80      52.97



ReiserFS 3.6, noatime

ReiserFS 3.6, a rather dated file system, only delivers adequate speed for single-threaded access. It clearly offers no optimization for parallel access and is not able to scale.
Due to the lack of support and further development we decided not to test ReiserFS 4.


Threads     1          2          3          4          5          6          7          8
Passes/s    2431.09    2552.66    2779.17    2835.92    2795.78    2918.13    2867.14    2716.39



ext2, noatime

Ext2, which is still in widespread use, delivers by far the best results in our benchmark.
The file system scales almost perfectly with the tested kernel and even comes close to a tmpfs. The downside of this high performance is the lack of journaling and the extraordinarily lengthy consistency checks. Performance also declines rapidly in directories containing many files, caused by the lack of directory indexing (this is not taken into account by the benchmark).
Despite the good results, ext2 can therefore not be recommended at the moment.


Threads     1          2          3          4          5          6          7          8
Passes/s    4218.41    6940.80    10064.03   12348.73   14151.40   15579.11   16548.76   17072.70



ext3, noatime

Having been the standard Linux file system for many years, ext3 is still widely used today.
Its biggest advantages over ext2 are journaling and the use of HTrees for directory indexing. Ext3 is quite robust but old-fashioned, which becomes obvious in the poor multi-thread scaling results.
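As a side note, directory indexing can be checked and, if missing, enabled on an existing ext3 volume with tune2fs (the device name is a placeholder; e2fsck -D then rebuilds the directory indexes):

# check whether dir_index is among the enabled features (hypothetical device)
tune2fs -l /dev/sdb1 | grep features

# enable directory indexing and rebuild existing directories
tune2fs -O dir_index /dev/sdb1
e2fsck -fD /dev/sdb1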


Threads     1          2          3          4          5          6          7          8
Passes/s    2754.86    3931.04    5263.11    5622.86    5811.88    6861.59    5215.46    4897.93



ext4, noatime, data=journal

Ext4 in combination with the particularly safe journaling mode data=journal delivers scores similar to ext3. Compared to ext3 (in journaling mode data=ordered), ext4 offers remarkably reduced file system check times as well as considerably better data integrity in case of a system failure.

Threads     1          2          3          4          5          6          7          8
Passes/s    2464.36    3642.13    4253.60    4994.83    5544.67    5263.79    6007.44    6015.34



ext4, noatime, data=ordered

The ext4 journaling mode data=ordered, which is the default on current Linux distributions, offers a good compromise between data integrity in case of a server failure and write performance.
The results of this configuration therefore serve as the reference for the other benchmarks.
Compared to ext3 and ext4 (data=journal), the results scale almost linearly with the number of threads. This makes it possible to estimate performance improvements for production systems.


Threads     1          2          3          4          5          6          7          8
Passes/s    3502.24    5763.72    7678.35    9548.23    11064.84   12287.98   13066.29   13613.44



ext4, noatime, data=writeback

The "writeback" journaling mode, which is much more unsafe, should only be used with battery backed RAID systems.
Theoretically, the sync of the kernel while writing data should be an advantage regarding performance – this couldn’t be proven with our benchmark however.
The results are slightly better compared to the ordered mode.


Threads     1          2          3          4          5          6          7          8
Passes/s    3649.60    5786.49    7874.12    9493.71    10956.70   12104.13   13038.98   13482.32



ext4, noatime, data=writeback, barrier=0

The performance option barrier=0, which should likewise only be used with battery-backed RAID systems, showed no additional improvement in our benchmark compared to the default barrier=1.

Threads     1          2          3          4          5          6          7          8
Passes/s    3606.81    5914.69    7673.24    9569.98    11082.23   12173.29   12948.52   13452.69



XFS, noatime

XFS is reputed to be a very robust and high-performance file system, but it delivered quite surprising results.
Even though it can compete with ext4 in many respects (e.g. in dbench), it scales poorly in the multi-thread tests. Only the single-thread result is on par with ext4.


Threads     1          2          3          4          5          6          7          8
Passes/s    3119.73    3722.44    4252.56    5126.10    5580.16    5898.89    6487.84    6458.05



XFS, noatime, nobarrier

XFS in combination with the nobarrier option produced interesting results. It scales well up to 3 threads, but the limit is already reached with the fourth thread; several runs confirmed the peak result at 4 threads. In conclusion, XFS cannot compete with ext4 under high load.

Threads     1          2          3          4          5          6          7          8
Passes/s    3113.27    4744.15    6910.56    7098.22    5741.22    6279.31    6878.09    6448.76



btrfs, noatime

The so-called B-tree FS has been touted as the next "wonder" file system for a few years now, because it addresses many design problems of common file systems.
Even though it is not yet usable for production systems, we wanted to know how it performs.
Scaling is slightly better than with XFS, but performance visibly drops once (apparently) too many threads are used.
Altogether the results are quite good, but still far from ext4.


Threads     1          2          3          4          5          6          7          8
Passes/s    3145.84    4737.90    6028.56    6961.76    7450.13    7311.22    7014.16    6558.37



btrfs, noatime, nobarrier

Btrfs also supports a nobarrier option, which could theoretically be used to achieve better results on RAID systems.
In our benchmark this option did not deliver significantly different results compared to the run with barriers enabled, though.


Threads     1          2          3          4          5          6          7          8
Passes/s    3174.12    4878.16    5924.92    6963.60    7532.58    7387.41    7098.63    6724.24



NTFS, 4K block size

NTFS on Windows Server 2008 R2 makes no good impression in terms of overall performance and scaling compared to the file systems on Linux.
In the end there is currently no alternative available for Windows Server 2008 R2, so one has to be content with the performance of this file system. It would be interesting to compare further benchmarks against Linux, but this is beyond the scope of this article.
Altogether, a poor impression of NTFS regarding I/O performance remains. Optimization via disablelastaccess did not increase the performance.
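The disablelastaccess setting mentioned above roughly corresponds to noatime on Linux; on Windows it is set as follows (a sketch, verify against your Windows version):

rem disable last-access timestamp updates on NTFS
fsutil behavior set disablelastaccess 1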


Threads     1          2          3          4          5          6          7          8
Passes/s    1664.89    2115.03    2407.93    2774.35    3194.26    3135.82    3260.86    3277.30





Conclusion:

The I/O benchmark delivered some interesting results at the time of testing.
For one thing, currently only the ext file systems, ext4 in particular, seem to scale reasonably well on multi-core systems when it comes to operations that read or manipulate file metadata.
XFS disappoints with unexpectedly low scores.
Btrfs looks promising; it remains to be seen whether it will become usable in production environments.
The biggest disappointment, even though not a surprise, is NTFS. The benchmark runs at an overall low level, no matter how many threads are used.

Of course, this test is specific to FSI Server version 2 and is meant to give users an overview of the performance potential of their own hardware.
Nevertheless, the most striking result is how much the overall performance can be influenced by choosing the right file system.