For the Red Hat / Intel webinar on big data and JBoss Data Grid (JDG), I ran a performance test against JBoss Data Grid with RadarGun on dedicated Intel hardware. Before running the full performance test, I ran a scaled-down version of it with a number of garbage collector configurations, including the new G1 collector.
Note: Large page memory was configured. (link)
Garbage Collector Configurations
Parallel
-XX:+UseParallelOldGC
Parallel w/ NUMA (NUMA)
-XX:+UseParallelOldGC -XX:+UseNUMA
Concurrent (CMS)
-XX:+UseConcMarkSweepGC
Concurrent w/ Incremental Mode (iCMS)
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
G1
-XX:+UseG1GC
The following options were not necessary as they are enabled by default:
-XX:+UseParallelGC
-XX:+UseParNewGC (enabled with -XX:+UseConcMarkSweepGC)
-XX:+UseLargePages* (disable if large page memory has not been configured)
-XX:+UseTLAB
-XX:+UseCompressedOops
* With OpenJDK 1.6, UseLargePages is disabled by default.
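To confirm which collector a node actually started with, the JVM can be asked directly through the standard java.lang.management API. The following is a minimal sketch, not part of the original test setup; the class name ActiveCollectors is hypothetical. On HotSpot the reported names typically look like "PS Scavenge" / "PS MarkSweep" for the parallel collector, "ParNew" / "ConcurrentMarkSweep" for CMS, and "G1 Young Generation" / "G1 Old Generation" for G1, though the exact names can vary by JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the collectors and -XX flags of the running JVM.
public class ActiveCollectors {

    // Names of the garbage collectors the JVM registered at startup.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // The -XX options that were passed explicitly on the command line.
    // Options that are enabled by default do not appear here.
    static List<String> explicitFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Explicit -XX flags: " + explicitFlags());
    }
}
```

Running it with, for example, java -XX:+UseG1GC ActiveCollectors shows whether the flag took effect.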
Hardware / Software Configuration
Physical Servers: 3
Processors / Server: 4
Processor: Intel® Xeon® Processor E7-4860
RAM: 1TB
NIC: 10 GbE (Fiber)
Red Hat Enterprise Linux 6.2
OpenJDK 1.7
JBoss Data Grid 6.1
RadarGun Fork (link)
Performance Test Configuration
Nodes: 30 (10 / Physical Server)
Test Threads / Node: 36
Operations: 80% Reads / 20% Writes
Entry Size: 4,096 Bytes (4K)
Duration: 15 Minutes
Note: The data grid is populated during the warm up phase.
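The 80% read / 20% write mix with 4K entries can be illustrated with a small sketch. This is not RadarGun's implementation; the class OperationMix, its method names, and the use of a ConcurrentHashMap as a stand-in for the data grid are all assumptions made for illustration.

```java
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for one test thread; a ConcurrentMap replaces the data grid.
public class OperationMix {

    static final int ENTRY_SIZE_BYTES = 4096; // 4K entries, as in the test
    static final double READ_RATIO = 0.80;    // 80% reads, 20% writes

    // Performs the given number of operations and returns how many were reads.
    static int run(ConcurrentMap<Integer, byte[]> cache, int keys, int operations, long seed) {
        Random random = new Random(seed);
        int reads = 0;
        for (int i = 0; i < operations; i++) {
            int key = random.nextInt(keys);
            if (random.nextDouble() < READ_RATIO) {
                cache.get(key);                              // read
                reads++;
            } else {
                cache.put(key, new byte[ENTRY_SIZE_BYTES]);  // write a fresh 4K value
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, byte[]> cache = new ConcurrentHashMap<>();
        int operations = 100000;
        int reads = run(cache, 1000, operations, 42L);
        System.out.println("reads=" + reads + " writes=" + (operations - reads));
    }
}
```

Every write allocates a new 4K array and drops the old one, which is why a mix like this puts steady pressure on the garbage collector.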
JBoss Data Grid Configuration
Transactions: XA
Transaction Recovery: Enabled
Locking: Pessimistic
Communication: Synchronous
Results
I have configured the concurrent collector in incremental mode when executing performance tests in the past. However, with a total of 80 logical cores per physical server (4 processors with 10 cores each and 2 hardware threads per core), it was not necessary to do so for this performance test.
I was curious about NUMA. Would the performance of local reads and writes improve with a NUMA-aware allocator? No. I’m not sure why. I need to look into this further.
I executed the performance test with the G1 collector twice.
The first time I executed the performance test with the G1 collector, the concurrent collector performed better, and with G1 there were also a small number of failed transactions. The second time, the G1 collector performed better than the concurrent collector, but there were still a number of failed transactions. I did not analyze the log files, but I suspect that the transactions failed due to garbage collection.
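One way to start checking the suspicion that garbage collection caused the failures, short of analyzing the log files, is to sample the JVM's cumulative collection counts and times before and after a test run. The sketch below uses the standard GarbageCollectorMXBean API; the class name GcTimeProbe and the throwaway allocation loop are hypothetical. The reported time is whatever the JVM attributes to each collector, so for concurrent collectors it should be read as a rough indicator rather than an exact pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe: sums collection counts and times across all collectors.
public class GcTimeProbe {

    // Total number of collections so far (collectors reporting -1 are skipped).
    static long totalCollectionCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollectionCount();
        long timeBefore = totalCollectionTimeMillis();

        // Placeholder workload: allocate short-lived 4K arrays to create garbage.
        long checksum = 0;
        for (int i = 0; i < 1000000; i++) {
            byte[] entry = new byte[4096];
            checksum += entry.length;
        }

        System.out.println("allocated bytes: " + checksum);
        System.out.println("collections: " + (totalCollectionCount() - countBefore));
        System.out.println("collection time (ms): " + (totalCollectionTimeMillis() - timeBefore));
    }
}
```

If the time spent in collection during a run approaches the transaction or lock timeouts, that would support the garbage collection explanation; if not, the cause is more likely elsewhere.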
The performance test was executed with 4GB of data and an 8GB heap. For the webinar, the performance test would be executed with 34.5GB of data and a 69GB heap. I executed the performance test with incrementally larger data and heap sizes until I reached 34.5GB of data with a 69GB heap.
The concurrent collector performed well regardless of the heap size. While the performance decreased by about 6% from 4GB of data and an 8GB heap to 24GB of data and a 48GB heap, it remained more or less consistent from 24GB of data and a 48GB heap to 34.5GB of data and a 69GB heap.
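The data and heap sizes mentioned here (4GB / 8GB, 24GB / 48GB, 34.5GB / 69GB) all keep the heap at twice the data size. A minimal sketch of that arithmetic follows; the class name HeapSizing is hypothetical, and setting -Xms equal to -Xmx is an assumption on my part, not something stated above.

```java
// Hypothetical helper: derives heap settings from the data size at a 2:1 ratio.
public class HeapSizing {

    static final int HEAP_TO_DATA_RATIO = 2; // heap is twice the data size

    // Heap size in megabytes for a given data size in gigabytes.
    static long heapMegabytes(double dataGigabytes) {
        return Math.round(dataGigabytes * HEAP_TO_DATA_RATIO * 1024);
    }

    // Assumption: initial and maximum heap are set to the same value.
    static String heapFlags(double dataGigabytes) {
        long mb = heapMegabytes(dataGigabytes);
        return "-Xms" + mb + "m -Xmx" + mb + "m";
    }

    public static void main(String[] args) {
        for (double dataGb : new double[] {4, 24, 34.5}) {
            System.out.println(dataGb + "GB data -> " + heapFlags(dataGb));
        }
    }
}
```

Working in megabytes avoids a fractional value when the data size is 34.5GB.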
Reminder:
The JBoss Data Grid 6.1 webinar introducing new features and detailing cross-site replication and rolling upgrades is today at 10:00AM CST. (link)