In this last stage of my SPO600 project, Since I don't have results suitable for upstreaming, I am going to wrap up my project results and do some thorough technical analysis of my results.
First of all, I am going to summary what I did for my project. (If you want to go over the details, you can see my previous posts.)
I picked a software called SSDUP, it is a traffic-aware SSD burst buffer for HPC systems. I noticed that it uses 3 different Murmurhash3 hash functions, the first two hash functions are optimized for x86 platforms and the third hash function is optimized for x64 platforms. I also noticed that it uses 'gcc -std=gnu99' to compile. In order to easier to handler these 3 hash functions, I split them into 3 files and separately testing them on an AArch64 and x86_64 systems.
As the professor said my results in stage two is hard to read, I am going to show my results again in a table format.
First hash function (MurmurHash3_x86_32), the execution time for -O3 is about 802% faster than without compilation option:
Second hash function (MurmurHash3_x86_128), the execution time for -O3 is about 891% faster than without compilation option:
Third hash function (MurmurHash3_x64_128), the execution time for -O3 is about 523% faster than without compilation option, and 0.04% faster with code changed:
All of the tests are first completed on an AArch64 system. My first step to optimize the hash function is to compile my benchmark program with -O3 compilation option. The first two hash functions, which have been optimized for x86 platforms, which has a significant improvement in performance. The third hash function, which has been optimized for x64 platforms, after compiling with -O3 option, which is a very small improvement in performance. My second step in optimization is to change some code in the third function, there is 0.04% faster than without changing the code.
Afterward, I perform the benchmark program on an x86_64 system, the result turns out that it also has a significant improvement in performance if compiling with -O3 option. But the improvement of the third function on an AArch64 system is not as much as different than x86_64 platforms. As a result, compiling with -O3 option for both functions produces the best performance and is the most optimized case.
First of all, I am going to summary what I did for my project. (If you want to go over the details, you can see my previous posts.)
I picked a software called SSDUP, it is a traffic-aware SSD burst buffer for HPC systems. I noticed that it uses 3 different Murmurhash3 hash functions, the first two hash functions are optimized for x86 platforms and the third hash function is optimized for x64 platforms. I also noticed that it uses 'gcc -std=gnu99' to compile. In order to easier to handler these 3 hash functions, I split them into 3 files and separately testing them on an AArch64 and x86_64 systems.
As the professor said my results in stage two is hard to read, I am going to show my results again in a table format.
First hash function (MurmurHash3_x86_32), the execution time for -O3 is about 802% faster than without compilation option:
without -O3 option
|
with -O3 option
|
|
No code changes |
14.117
|
1.572
|
Code changes: i+i and len |
14.035
|
N/A
|
Second hash function (MurmurHash3_x86_128), the execution time for -O3 is about 891% faster than without compilation option:
without -O3 option
|
with -O3 option
|
|
No code changes |
13.332
|
1.338
|
Code changes: i+i and len |
13.543
|
N/a
|
Third hash function (MurmurHash3_x64_128), the execution time for -O3 is about 523% faster than without compilation option, and 0.04% faster with code changed:
without -O3 option
|
with -O3 option
|
|
No code changes |
8.179
|
1.315
|
Code changes: i+i and len |
8.137
|
N/A
|
All of the tests are first completed on an AArch64 system. My first step to optimize the hash function is to compile my benchmark program with -O3 compilation option. The first two hash functions, which have been optimized for x86 platforms, which has a significant improvement in performance. The third hash function, which has been optimized for x64 platforms, after compiling with -O3 option, which is a very small improvement in performance. My second step in optimization is to change some code in the third function, there is 0.04% faster than without changing the code.
Afterward, I perform the benchmark program on an x86_64 system, the result turns out that it also has a significant improvement in performance if compiling with -O3 option. But the improvement of the third function on an AArch64 system is not as much as different than x86_64 platforms. As a result, compiling with -O3 option for both functions produces the best performance and is the most optimized case.
Comments
Post a Comment