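Major internet companies are paying increasing attention to code quality, and at JD.com unit testing is being rolled out across the board. This time, using a real example, we will talk about another very important kind of testing: micro-benchmark performance testing.
Most of us have converted a number to a string in Java. Of the four common ways to do it, which one performs best? This article looks at the following four: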
Integer.toString(a)
String.valueOf(a)
a + ""
"" + a
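We explore and analyze the performance of these four conversions so that you get a correct picture of performance testing and gradually become familiar with JMH micro-benchmarking.
When it comes to performance testing, an inexperienced developer will typically write code like the following, trying to amplify the per-call time difference with a loop: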
public class CommonTest {
public static void main(String[] args) {
int num = 1000000;
int a = 123456789;
long start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = a + "";
}
long end = System.currentTimeMillis();
System.out.println("a+\"\" = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = "" + a;
}
end = System.currentTimeMillis();
System.out.println("\"\"+a = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = String.valueOf(a);
}
end = System.currentTimeMillis();
System.out.println("String.valueOf(a) = " +(end-start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = Integer.toString(a);
}
end = System.currentTimeMillis();
System.out.println("Integer.toString(a) = " +(end-start));
}
}
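The test machine is a MacBook with a 2.6 GHz 6-core Intel Core i7 and 16 GB of 2667 MHz DDR4 memory.
Results for 1,000,000 iterations (elapsed time in milliseconds):
a+"" = 56
""+a = 30
String.valueOf(a) = 31
Integer.toString(a) = 29
In terms of elapsed time, a+"" > ""+a and String.valueOf(a) > Integer.toString(a).
Across repeated runs, Integer.toString(a) and String.valueOf(a) take similar time, with String.valueOf(a) slightly slower; a+"" is always slower than ""+a, and ""+a takes the least time.
Results for 500,000,000 iterations:
a+"" = 9527
""+a = 9358
String.valueOf(a) = 13620
Integer.toString(a) = 13501
In terms of elapsed time, String.valueOf(a) > Integer.toString(a) > a+"" > ""+a, but the gap between a+"" and ""+a has narrowed noticeably.
If you think these results are just random variation, the next change may overturn your expectations once more.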
public class CommonTest {
public static void main(String[] args) {
int num = 500000000;
int a = 123456789;
// The only change in this version is the test order: "" + a is now measured before a + ""
long start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = "" + a;
}
long end = System.currentTimeMillis();
System.out.println("\"\"+a = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = a + "";
}
end = System.currentTimeMillis();
System.out.println("a+\"\" = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = String.valueOf(a);
}
end = System.currentTimeMillis();
System.out.println("String.valueOf(a) = " +(end-start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = Integer.toString(a);
}
end = System.currentTimeMillis();
System.out.println("Integer.toString(a) = " +(end-start));
}
}
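You can run it yourself, but here is the outcome: the result for a+"" and ""+a is reversed. Previously a+"" was always slower than ""+a; now a+"" is always faster than ""+a.
Different iteration counts, and even a different test order, change the results. Which result is more trustworthy? How should we do performance testing correctly?
Before introducing the correct approach, let's first analyze why the results above deviate.
Opening the source of String.valueOf(int i) makes one thing clear:
String.valueOf(int i) simply calls Integer.toString(i), so it is normal for String.valueOf(int i) to take slightly longer than Integer.toString(i), and the two times should be very close.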
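As a small sanity check of that delegation, the sketch below compares the two calls on a few values. The class name ValueOfDelegationDemo and the helper sameResult are made up for illustration; the code only confirms that the results agree and says nothing about timing.

```java
// Illustrative check that String.valueOf(int) and Integer.toString(int) agree.
// ValueOfDelegationDemo and sameResult are hypothetical names, not part of the article's code.
class ValueOfDelegationDemo {

    // Returns true when both conversions produce the same text for x.
    static boolean sameResult(int x) {
        return String.valueOf(x).equals(Integer.toString(x));
    }

    public static void main(String[] args) {
        int[] samples = {0, -1, 123456789, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            System.out.println(x + " -> same result: " + sameResult(x));
        }
    }
}
```

Because one method is a thin wrapper around the other, any measured gap between them is more likely noise or JIT state than a real difference.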
To verify how string concatenation is compiled, the investigation is shown below.
Test code:
package com.bestqiang.commontest;
public class CommonTest {
public static void main(String[] args) {
int a = 1;
String str = "hello" + a;
String str2 = "hello2" + 1;
String str3 = a + "hello3";
String str4 = 1 + "hello4";
}
}
Start with str2: comment out the other statements and keep only String str2 = "hello2" + 1;
Running javap gives:
"C:\Program Files\Java\jdk1.8.0_221\bin\javap.exe" -v com.bestqiang.commontest.CommonTest
Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 481 bytes
MD5 checksum 2d6ee54fb564dc0a7fc51ab0a617cfcc
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #4.#20 // java/lang/Object."<init>":()V
#2 = String #21 // hello21
#3 = Class #22 // com/bestqiang/commontest/CommonTest
#4 = Class #23 // java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Utf8 LineNumberTable
#9 = Utf8 LocalVariableTable
#10 = Utf8 this
#11 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#12 = Utf8 main
#13 = Utf8 ([Ljava/lang/String;)V
#14 = Utf8 args
#15 = Utf8 [Ljava/lang/String;
#16 = Utf8 str2
#17 = Utf8 Ljava/lang/String;
#18 = Utf8 SourceFile
#19 = Utf8 CommonTest.java
#20 = NameAndType #5:#6 // "<init>":()V
#21 = Utf8 hello21
#22 = Utf8 com/bestqiang/commontest/CommonTest
#23 = Utf8 java/lang/Object
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=2, args_size=1
0: ldc #2 // String hello21
2: astore_1
3: return
LineNumberTable:
line 14: 0
line 19: 3
LocalVariableTable:
Start Length Slot Name Signature
0 4 0 args [Ljava/lang/String;
3 1 1 str2 Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0
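The output shows that the compiler folded the expression directly into hello21; no StringBuilder append takes place.
Next, comment out the other statements and keep only String str = "hello" + a;
Running javap gives: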
Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 705 bytes
MD5 checksum 69bc22cd50230bd882b407a17dcff463
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = String #29 // hello
#5 = Methodref #2.#30 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9 = Class #34 // java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = Utf8 hello
#30 = NameAndType #35:#36 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#31 = NameAndType #35:#37 // append:(I)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#37 = Utf8 (I)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: ldc #4 // String hello
11: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: iload_1
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 12: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0
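As the output above shows, "hello" + a is implemented internally by appending to a StringBuilder.
a + "hello3" only changes the order of the operands. Is the underlying implementation the same? Comment out the other statements and keep only String str3 = a + "hello3";
Running javap gives: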
Classfile /D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 707 bytes
MD5 checksum b767896bc82bc01ac0153354ac6e5886
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = Methodref #2.#29 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#5 = String #30 // hello3
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9 = Class #34 // java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str3
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = NameAndType #35:#36 // append:(I)Ljava/lang/StringBuilder;
#30 = Utf8 hello3
#31 = NameAndType #35:#37 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (I)Ljava/lang/StringBuilder;
#37 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: iload_1
10: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
13: ldc #5 // String hello3
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 16: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str3 Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Clearly, this version also appends via a StringBuilder.
The two orderings do not differ in their underlying implementation.
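To make the javap output concrete, here is a minimal hand-written sketch of what the Java 8 compiler effectively generates for both orderings. The class and method names are hypothetical. One assumption to keep in mind: this mirrors the major version 52 (Java 8) bytecode shown above. Newer javac versions (JDK 9 and later) compile string concatenation through invokedynamic instead, so the bytecode looks different there even though the resulting strings are the same.

```java
// Hand-written equivalent of the Java 8 bytecode for string concatenation.
// ConcatDesugarDemo and its method names are illustrative only.
class ConcatDesugarDemo {

    // Source form: constant on the left, variable on the right.
    static String viaPlus(int a) {
        return "hello" + a;
    }

    // Roughly what javac (Java 8) emits for "hello" + a:
    // new StringBuilder, append(String), append(int), toString().
    static String viaBuilder(int a) {
        return new StringBuilder().append("hello").append(a).toString();
    }

    // Source form: variable on the left, constant on the right.
    static String viaPlusReversed(int a) {
        return a + "hello3";
    }

    // Same calls as above, only the order of the two appends is swapped.
    static String viaBuilderReversed(int a) {
        return new StringBuilder().append(a).append("hello3").toString();
    }

    public static void main(String[] args) {
        int a = 1;
        System.out.println(viaPlus(a).equals(viaBuilder(a)));
        System.out.println(viaPlusReversed(a).equals(viaBuilderReversed(a)));
    }
}
```

Since both orderings perform the same two append calls plus one toString, there is no structural reason for one to be consistently faster than the other.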
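Two questions now arise. First, why do ""+a and a+"" overtake Integer.toString(a) once the loop count is increased? Second, why do ""+a and a+"", which share the same underlying implementation, give different results depending on the test order, with the difference shrinking at large loop counts?
These differences arise because the JVM was not warmed up and because the JIT compiler optimizes the hot code we write, using techniques such as loop unrolling, OSR (On-Stack Replacement), and method inlining. For reasons of space they are not covered in detail here; an overall mind map is attached at the end, and interested readers can dig deeper on their own. In other words, the code you write != the code that actually runs: the JVM silently optimizes your code, and that skews your test results.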
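One rough way to see the warm-up effect for yourself is to time the same workload for several consecutive rounds inside one JVM. The sketch below is only an illustration: WarmupDemo, timeRound and timeRounds are hypothetical names and the round sizes are arbitrary. Typically the first rounds come out slower than later ones, but the actual numbers depend on the machine and the JVM, and this is not a substitute for a proper benchmark harness.

```java
// Rough illustration of JVM warm-up: time the same loop for several rounds.
// All names here are hypothetical; this is a sketch, not a reliable benchmark.
class WarmupDemo {

    // Written after each round so the loop's work has an observable effect.
    static int sinkLength;

    // Runs one round of int-to-string conversions and returns elapsed nanoseconds.
    static long timeRound(int iterations, int a) {
        int total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Use a varying input and consume the result's length.
            total += ("" + (a + i)).length();
        }
        long elapsed = System.nanoTime() - start;
        sinkLength = total;
        return elapsed;
    }

    // Repeats the same round several times in the same JVM.
    static long[] timeRounds(int rounds, int iterations, int a) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            elapsed[r] = timeRound(iterations, a);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRounds(5, 1_000_000, 123456789);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("round " + (r + 1) + ": " + elapsed[r] / 1_000_000.0 + " ms");
        }
    }
}
```

If the per-round times drop and then level off, the earlier rounds were measuring interpretation and compilation rather than the steady-state cost of the conversion.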
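A quick check of JIT compilation:
A simple way to check is to turn OSR off by starting the JVM with the flag -XX:-UseOnStackReplacement, and then test again with different iteration counts.
Results for 1,000,000 iterations:
""+a = 116
a+"" = 74
String.valueOf(a) = 43
Integer.toString(a) = 68
As you can see, ""+a and a+"" now take longer than String.valueOf(a) and Integer.toString(a).
Results for 500,000,000 iterations:
""+a = 37581
a+"" = 36621
String.valueOf(a) = 20173
Integer.toString(a) = 19934
As you can see, ""+a and a+"" still take longer than String.valueOf(a) and Integer.toString(a).
A quick check of the effect of JVM warm-up:
This is easy to check: with OSR turned off, run a piece of test code for a while before the real measurement, for example: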
for (int i=0; i<100000000; i++){ result = "" + a; }
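With that code added, results for 1,000,000 iterations:
""+a = 71
a+"" = 70
String.valueOf(a) = 41
Integer.toString(a) = 40
The difference between ""+a and a+"" has disappeared, and both are consistently slower than String.valueOf(a) and Integer.toString(a).
With OSR turned back on, results for 500,000,000 iterations:
""+a = 11390
a+"" = 11220
String.valueOf(a) = 14500
Integer.toString(a) = 14423
As you can see, ""+a and a+"" once again overtake Integer.toString(a).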
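From this we can draw the following conclusions. The source code shows that String.valueOf(a) calls Integer.toString(a) internally. javap shows that both a+"" and ""+a are concatenated with a StringBuilder, and their measured times show no meaningful difference. When no variable is involved, as in 1+"hello", javap shows that the expression is folded into "1hello" and placed in the constant pool. The key question is the speed difference between Integer.toString(a) and ""+a at different loop counts. Because the results at low and high iteration counts are so different, we suspected that hot code triggering OSR compilation was the cause. The JVM uses tiered compilation, mainly interpretation plus JIT compilation, and the JIT detects hot code with counter-based hot-spot detection built on an invocation counter and a back-edge counter. In this test we turned OSR off with -XX:-UseOnStackReplacement, and Integer.toString(a) then took less time than ""+a regardless of the iteration count. Our preliminary conclusion is that once OSR has compiled the hot code to machine instructions, ""+a is faster than Integer.toString(a); otherwise Integer.toString(a) is faster than ""+a.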
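Just to measure the performance of converting a number to a string in Java, we had to watch all these variables and dig into all these mechanisms. Is there a tool that takes all of this into account and automatically produces reasonably accurate results that reflect a real runtime environment? JMH, the micro-benchmarking tool, meets exactly this need.
Official description: JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.
JMH stands for Java Microbenchmark Harness, a tool built specifically for benchmarking Java. The JVM applies many optimizations at compile time, load time, and run time, so the same piece of code may not execute as written once the whole program is running. That makes micro-benchmarking Java very hard and requires a deep understanding of how the JVM works. For that reason JMH is developed by the JVM team itself, and its developers provide 38 samples for us to learn from and to help us avoid pitfalls.
When we want to measure a code fragment, such as the int-to-string conversion above, the granularity is usually much finer than an API endpoint and needs high precision. General-purpose performance testing tools such as JMeter usually measure at the endpoint level, and the overhead of the framework itself reduces precision further, so they cannot meet the requirements of micro-benchmarking. JMH offers several benchmark modes, automatically handles warm-up, isolation in separate JVMs, and control over method inlining, and provides helper objects such as Blackhole and benchmark state objects to help us benchmark correctly and avoid pitfalls.
Talk is cheap, show me the code! Let's see how to use JMH for the int-to-string test above.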
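import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
@State(Scope.Thread)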
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JHMTest {
private int a = 123456;
@Benchmark
public String work1() {
return "" + a;
}
@Benchmark
public String work2() {
return a + "";
}
@Benchmark
public String work3() {
return Integer.toString(a);
}
@Benchmark
public String work4() {
return String.valueOf(a);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(1)
.build();
new Runner(opt).run();
}
}
The results are:
# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work1
# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration 1: 29.389 ns/op
# Warmup Iteration 2: 20.454 ns/op
# Warmup Iteration 3: 16.060 ns/op
# Warmup Iteration 4: 16.014 ns/op
# Warmup Iteration 5: 15.806 ns/op
Iteration 1: 15.982 ns/op
Iteration 2: 15.901 ns/op
Iteration 3: 16.003 ns/op
Iteration 4: 15.865 ns/op
Iteration 5: 15.929 ns/op
Result "bestqiang.test.jmh.JHMTest.work1":
15.936 ±(99.9%) 0.219 ns/op [Average]
(min, avg, max) = (15.865, 15.936, 16.003), stdev = 0.057
CI (99.9%): [15.717, 16.155] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work2
# Run progress: 25.00% complete, ETA 00:00:32
# Fork: 1 of 1
# Warmup Iteration 1: 24.678 ns/op
# Warmup Iteration 2: 19.042 ns/op
# Warmup Iteration 3: 15.775 ns/op
# Warmup Iteration 4: 15.769 ns/op
# Warmup Iteration 5: 15.847 ns/op
Iteration 1: 15.915 ns/op
Iteration 2: 15.923 ns/op
Iteration 3: 15.836 ns/op
Iteration 4: 15.898 ns/op
Iteration 5: 15.853 ns/op
Result "bestqiang.test.jmh.JHMTest.work2":
15.885 ±(99.9%) 0.149 ns/op [Average]
(min, avg, max) = (15.836, 15.885, 15.923), stdev = 0.039
CI (99.9%): [15.736, 16.034] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work3
# Run progress: 50.00% complete, ETA 00:00:21
# Fork: 1 of 1
# Warmup Iteration 1: 34.360 ns/op
# Warmup Iteration 2: 26.414 ns/op
# Warmup Iteration 3: 25.597 ns/op
# Warmup Iteration 4: 25.619 ns/op
# Warmup Iteration 5: 25.743 ns/op
Iteration 1: 25.723 ns/op
Iteration 2: 25.671 ns/op
Iteration 3: 25.862 ns/op
Iteration 4: 25.863 ns/op
Iteration 5: 25.774 ns/op
Result "bestqiang.test.jmh.JHMTest.work3":
25.779 ±(99.9%) 0.327 ns/op [Average]
(min, avg, max) = (25.671, 25.779, 25.863), stdev = 0.085
CI (99.9%): [25.452, 26.106] (assumes normal distribution)
# JMH version: 1.21
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work4
# Run progress: 75.00% complete, ETA 00:00:10
# Fork: 1 of 1
# Warmup Iteration 1: 34.378 ns/op
# Warmup Iteration 2: 26.560 ns/op
# Warmup Iteration 3: 25.823 ns/op
# Warmup Iteration 4: 25.633 ns/op
# Warmup Iteration 5: 26.084 ns/op
Iteration 1: 25.934 ns/op
Iteration 2: 26.429 ns/op
Iteration 3: 30.559 ns/op
Iteration 4: 28.399 ns/op
Iteration 5: 26.415 ns/op
Result "bestqiang.test.jmh.JHMTest.work4":
27.547 ±(99.9%) 7.439 ns/op [Average]
(min, avg, max) = (25.934, 27.547, 30.559), stdev = 1.932
CI (99.9%): [20.108, 34.987] (assumes normal distribution)
# Run complete. Total time: 00:00:42
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JHMTest.work1 avgt 5 15.936 ± 0.219 ns/op
JHMTest.work2 avgt 5 15.885 ± 0.149 ns/op
JHMTest.work3 avgt 5 25.779 ± 0.327 ns/op
JHMTest.work4 avgt 5 27.547 ± 7.439 ns/op
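@Benchmark marks a method to be benchmarked, and avgt stands for average time. @Fork sets the number of forked JVMs; a value of 1 means each benchmark method runs in its own separately started JVM, so the benchmark methods are isolated from one another. @Warmup sets the warm-up parameters and @Measurement sets the measurement parameters. As you can see, the benchmark results agree with the final results of our conventional test: string concatenation is faster than Integer.toString(a) and String.valueOf(a).
Writing a sound benchmark is far from this simple. The example above looks simple but hides some pitfalls. For example, if the variable a is declared final, the results turn out completely different. Here are the final results: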
Benchmark Mode Cnt Score Error Units
JHMTest.work1 avgt 5 3.166 ± 0.184 ns/op
JHMTest.work2 avgt 5 3.167 ± 0.103 ns/op
JHMTest.work3 avgt 5 26.091 ± 1.276 ns/op
JHMTest.work4 avgt 5 25.424 ± 1.306 ns/op
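Clearly, the average time of ""+a and a+"" drops to about 3 ns/op. This is constant folding at work: a process that simplifies constant expressions at compile time, and this optimization gives us a misleading result. We can also see that work3 and work4 are barely affected. Optimizations like this are hard to predict accurately, so we need to write benchmarks in the standard form; otherwise, as soon as the JVM optimizes the code, we get abnormal results. As another example, each method in the code above returns its result right after computing it. This avoids dead-code elimination: if the result of a piece of code is never used, the compiler may optimize that code away, and returning the result prevents this. Alternatively, we can use a Blackhole to consume the computed value and avoid dead-code elimination. Because Blackhole has some computation of its own, this example returns the result directly to avoid affecting the measurements.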
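The compile-time folding can be observed without JMH. In the sketch below (class, field and method names are hypothetical), a final int field initialized with a literal is a constant variable, so javac turns "" + constantA into the string constant "123456", while the non-final field is concatenated at run time. Comparing references with == is used here only as a probe for "is this the interned compile-time constant"; it is not how strings should be compared in real code.

```java
// Illustrates why declaring the field final changes what gets measured.
// ConstantFoldingDemo and its members are illustrative names only.
class ConstantFoldingDemo {

    // final + constant initializer: a compile-time constant variable.
    private final int constantA = 123456;

    // Non-final: the value is only known at run time.
    private int plainA = 123456;

    // javac folds this into the literal "123456"; no concatenation happens at run time.
    String foldedConcat() {
        return "" + constantA;
    }

    // This concatenation really executes and builds a new String on each call.
    String runtimeConcat() {
        return "" + plainA;
    }

    public static void main(String[] args) {
        ConstantFoldingDemo demo = new ConstantFoldingDemo();
        // Same interned constant as the literal, so the references are identical.
        System.out.println("folded is the literal:  " + (demo.foldedConcat() == "123456"));
        // A freshly built String: equal content, but a different object.
        System.out.println("runtime is the literal: " + (demo.runtimeConcat() == "123456"));
    }
}
```

With the final field, a benchmark of "" + a therefore measures the cost of returning a constant, not the cost of converting an int to a string.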
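Our micro-benchmark no longer uses a loop. What happens if we do use one? The official sample JMHSample_11_Loops specifically says: don't overuse loops, rely on JMH to get the measurement right. In other words, avoid loops and let JMH obtain the correct result. Loops come with too many pitfalls and too many optimization opportunities, and writing a safe loop is tedious, so we should avoid them in everyday use. If you really want a loop, the official sample JMHSample_34_SafeLooping shows how to write one correctly. Applied to our test case, the code can be written as follows, for reference: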
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JHMTest {
@Param({"1000000"})
int size;
static String work1(int x) {
return "" + x;
}
static String work2(int x) {
return x + "";
}
static String work3(int x) {
return Integer.toString(x);
}
static String work4(int x) {
return String.valueOf(x);
}
@Benchmark
public void test1() {
for (int i = 0; i < size; i++) {
// Pass the result to sink() so the call cannot be eliminated as dead code
sink(work1(i));
}
}
@Benchmark
public void test2() {
for (int i = 0; i < size; i++) {
sink(work2(i));
}
}
@Benchmark
public void test3() {
for (int i = 0; i < size; i++) {
sink(work3(i));
}
}
@Benchmark
public void test4() {
for (int i = 0; i < size; i++) {
sink(work4(i));
}
}
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public static void sink(String v) {
// IT IS VERY IMPORTANT TO MATCH THE SIGNATURE TO AVOID AUTOBOXING.
// The method intentionally does nothing.
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(3)
.build();
new Runner(opt).run();
}
}
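The @CompilerControl(CompilerControl.Mode.DONT_INLINE) annotation tells the compiler not to inline the annotated method, which keeps inlining optimizations from removing the call.
There are 38 JMH samples in total, and we will not go through them one by one here. You can browse the official code at https://github.com/openjdk/jmh; these samples will help you use JMH more effectively. I have summarized the 38 samples and the concepts they cover in a mind map to help with learning and understanding: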