In the previous post https://29min.wordpress.com/2013/07/31/measuring-solr-query-performance I looked at the effect of the default CMS vs. G1 garbage collector settings. Both defaults were suboptimal, so I removed them from further comparisons. Having received suggestions on the solr mailing list, particularly from Shawn Heisey, I’ve run the measurements again with additional sets of parameters.
cms-x1 = "-Xmx20480m \
  -XX:NewRatio=3 \
  -XX:SurvivorRatio=4 \
  -XX:TargetSurvivorRatio=90 \
  -XX:MaxTenuringThreshold=8 \
  -XX:+UseConcMarkSweepGC \
  -XX:+CMSScavengeBeforeRemark \
  -XX:PretenureSizeThreshold=64m \
  -XX:CMSFullGCsBeforeCompaction=1 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:CMSTriggerPermRatio=80 \
  -XX:CMSMaxAbortablePrecleanTime=6000 \
  -XX:+CMSParallelRemarkEnabled \
  -XX:+ParallelRefProcEnabled \
  -XX:+UseLargePages \
  -XX:+AggressiveOpts \
  -Dmontysolr.enable.warming=false -Dsolr.cache.size=0"

cms-x2 = "-Xmx20480m \
  -server \
  -XX:+PrintGCTimeStamps \
  -XX:+PrintGCDetails \
  -XX:MaxPermSize=64m \
  -XX:NewSize=1024m \
  -XX:SurvivorRatio=1 \
  -XX:TargetSurvivorRatio=90 \
  -XX:MaxTenuringThreshold=8 \
  -XX:+UseConcMarkSweepGC \
  -XX:+CMSScavengeBeforeRemark \
  -XX:PretenureSizeThreshold=512m \
  -XX:CMSFullGCsBeforeCompaction=1 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:CMSTriggerPermRatio=80 \
  -XX:CMSMaxAbortablePrecleanTime=6000 \
  -XX:+CMSConcurrentMTEnabled \
  -XX:+UseParNewGC \
  -XX:ConcGCThreads=7 \
  -XX:ParallelGCThreads=7 \
  -XX:+UseLargePages \
  -Dmontysolr.enable.warming=false -Dsolr.cache.size=0"

cms-x3 = "-Xmx20480m \
  -XX:+AggressiveOpts \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:+OptimizeStringConcat \
  -XX:+UseFastAccessorMethods \
  -XX:+UseG1GC \
  -XX:+UseStringCache \
  -XX:-UseSplitVerifier \
  -XX:MaxGCPauseMillis=50 \
  -Dmontysolr.enable.warming=false -Dsolr.cache.size=0"
This was the scenario (if you want to repeat): https://github.com/romanchyla/solrjmeter/blob/master/scenarios/comparing-garbage-collectors.sh
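For readers who want to try an option set by hand, here is a minimal sketch of how one could be passed to the JVM. It is not taken from the scenario script. The function name solr_cmd, the variable name cms_x3 (shell variable names cannot contain a hyphen) and the `-jar start.jar` launcher are assumptions for illustration, and the option list is abbreviated to three flags from cms-x3.

```shell
#!/bin/sh
# Abbreviated option set (three flags from cms-x3 in the post).
# The name uses an underscore because "cms-x3" is not a valid shell variable name.
cms_x3="-Xmx20480m -XX:+UseG1GC -XX:MaxGCPauseMillis=50"

# Hypothetical helper: prints the java command line for an option string.
# It only echoes the command, so it is safe to run without a Solr install.
# "-jar start.jar" is an assumed launcher; adjust it to your own setup.
solr_cmd() {
  # $1 is left unquoted on purpose so the shell splits it into separate flags.
  echo java $1 -jar start.jar
}

solr_cmd "$cms_x3"
```

Printing the command first makes it easy to check that every flag ended up as its own argument before starting a long benchmark run.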
The results are not at all easy to interpret, but I’ll put them out here in case somebody has an interesting insight. I have two comments. First, even though the tests execute hundreds of thousands of queries per run (for every garbage collector configuration), these queries are not good and are not worst-case scenarios, especially for the AND and NEAR5 clauses. In the majority of cases these searches find nothing or only a very small number of hits, which is not helpful. I am working on a better set, but that takes time.
The next observation is that the CMS-custom configuration seems to be pretty consistent and the fastest of all: it keeps showing the lowest averages (which may or may not be significant, but most likely it should count, given the sheer number of requests executed). So CMS-custom (dark blue) and the cms-x3 (green) configuration are my favorites right now. But I need better queries to test with, and the differences may NOT BE SIGNIFICANT.
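One rough way to check whether a difference in averages between two configurations is more than noise is Welch's t statistic, computed over the per-run averages. The sketch below is only an illustration: the function name welch_t is made up, it expects the per-run averages as two space-separated strings, and the numbers in the usage comment are placeholders, not measurements from these tests.

```shell
#!/bin/sh
# Welch t statistic for two samples with possibly unequal sizes and variances:
#   t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)
# Each sample needs at least two values, and the samples must not both have
# zero variance, otherwise awk stops with a division-by-zero error.
welch_t() {
  awk -v a="$1" -v b="$2" '
    # Number of values in a space-separated string.
    function cnt(s,    v) { return split(s, v, " ") }
    # Arithmetic mean.
    function mean(s,    v, n, i, sum) {
      n = split(s, v, " ")
      sum = 0
      for (i = 1; i <= n; i++) sum += v[i]
      return sum / n
    }
    # Unbiased sample variance (divides by n - 1).
    function svar(s,    v, n, i, m, ss) {
      n = split(s, v, " ")
      m = mean(s)
      ss = 0
      for (i = 1; i <= n; i++) ss += (v[i] - m) * (v[i] - m)
      return ss / (n - 1)
    }
    BEGIN {
      se = sqrt(svar(a) / cnt(a) + svar(b) / cnt(b))
      printf "%.3f\n", (mean(a) - mean(b)) / se
    }'
}

# Usage with placeholder numbers (NOT results from this post):
#   welch_t "1 2 3" "4 5 6"    # prints -3.674
```

A value of t close to zero means the two averages are indistinguishable given the run-to-run spread. With only a handful of runs per configuration the statistic is very coarse, so it should be read against a t distribution with the Welch-Satterthwaite degrees of freedom rather than taken at face value.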
Colour legend (sorry, not having time to play with it now):
- dark (shade of blue/grey) = cms-custom
- yellow = g1-custom
- green = cms-x1
- pink = cms-x2
- light blue = cms-x3
I have just tweaked the chart to separate the columns. Basically, cms-custom and g1-custom ran 6 times; the others ran 3 times.