
Lab 3

In this lab, we use assembly language to complete three parts.
1. To get familiar with assembly language, we first write a loop that prints "Hello World!" 10 times. This part is fairly easy. Here is the source code for the x86_64 assembler:
------------------------------------------------------
.text
.globl    _start

start = 0                       /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10                        /* loop exits when the index hits this number (loop condition is i<max) */

_start:
    mov     $start,%r15         /* loop index */
    mov     %r15,%r10

loop:
        /* ... body of the loop ... do something useful here ... */

    movq    $len,%rdx           /* message length */
    movq    $msg,%rsi           /* message location */
    movq    $1,%rdi             /* file descriptor stdout */
    movq    $1,%rax             /* syscall sys_write */
    syscall

    inc     %r15                /* increment index */
    cmp     $max,%r15           /* see if we're done */
    jne     loop                /* loop if we're not */

    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall

.section .rodata
msg:    .ascii       "Hello World!\n"
        len = . - msg
------------------------------------------------------

The output shows below:
------------------------------------------------------
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
------------------------------------------------------

2. The second part is harder: we have to extend the loop so that it prints Loop: 0 through Loop: 9. Here is the source code I wrote for the x86_64 assembler:
------------------------------------------------------
.text
.globl    _start

stdout = 1
start = 0                        /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10                         /* loop exits when the index hits this number (loop condition is i<max) */
pos = 6                          /* position of character to be replaced in string */
pos2 = 7                         /* position of newline character */

_start:
    mov     $start,%r15          /* loop index */

loop:
    /* ... body of the loop ... */
    mov     $len,%rdx            /* message length */
    mov     $48,%r14             /* move immediate value 48 (ascii zero) to r14 */
    mov     $10,%r12             /* move immediate value 10 (newline) to r12 */
    add     %r15,%r14            /* add r15 (loop index) to r14 */
    mov     $pos,%r13            /* move value of pos to r13 */
    mov     $pos2,%r11           /* move value of pos2 to r11 */
    mov     $msg,%rsi            /* message location */
    add     %rsi,%r13            /* add rsi (message location) to r13 and store in r13 */
    add     %rsi,%r11            /* add rsi (message location) to r11 and store in r11 */
    mov     %r14b,(%r13)         /* store the low byte of r14 (the digit) at the address in r13; a 64-bit store would write past the end of msg */
    mov     %r12b,(%r11)         /* store the low byte of r12 (newline) at the address in r11 */
    mov     $stdout,%rdi         /* file descriptor stdout */
    mov     $1,%rax              /* syscall sys_write */
    syscall

    inc     %r15                /* increment index */
    cmp     $max,%r15           /* see if we're done */
    jne     loop                /* loop if we're not */

    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall
.data

msg:     .ascii  "Loop: x\n"
.set len, . - msg               /* current memory location minus value of label msg */
------------------------------------------------------

And I got the output:
------------------------------------------------------
Loop: 0
Loop: 1
Loop: 2
Loop: 3
Loop: 4
Loop: 5
Loop: 6
Loop: 7
Loop: 8
Loop: 9
------------------------------------------------------

3. The final part is more complicated. We were asked to write a loop that prints Loop: 0 through Loop: 29. Here is the source code I wrote in x86_64 assembly:
------------------------------------------------------
.text
.global  _start

start = 0 /* loop index starting value */
max = 30 /* loop exits when the index hits this number */

_start:
    mov     $start,%r15         /* set loop index */

loop:
    mov     $0,%rdx             /* clear register */
    mov     %r15,%rax           /* store dividend in register */
    mov     $10,%r10            /* store divisor in register */
    div     %r10                /* divide */
    mov     %rax,%r11           /* store quotient in register */
    mov     %rdx,%r12           /* store remainder in register */
    cmp     $0,%r11             /* check if quotient is 0 */
    je      next                /* jump if quotient is 0 */
    add     $0x30,%r11          /* convert to ascii */
    mov     %r11b,msg+6         /* place quotient in msg */

next:
    add     $0x30,%r12          /* convert to ascii */
    mov     %r12b,msg+7         /* place remainder in msg */
    mov     $len,%rdx           /* message length */
    mov     $msg,%rsi           /* message location */
    mov     $1,%rdi             /* file descriptor stdout */
    mov     $1,%rax             /* syscall sys_write */
    syscall                     /* syscall */

    inc     %r15                /* increment index */
    cmp     $max,%r15           /* check if index is at max */
    jne     loop                /* loop if index not at max */

    mov     $0,%rdi             /* exit status */
    mov     $60,%rax            /* syscall sys_exit */
    syscall                     /* syscall */

.data

msg:    .ascii      "Loop:   \n"
.set len , . - msg
------------------------------------------------------

I also wrote an AArch64 version:
------------------------------------------------------
.text
.global _start

start = 0                               /* loop index starting value */
max = 30                                /* loop exits when the index hits this number */

_start:
        mov     x3,start                /* set loop index */

loop:
        mov     x9,10                   /* store divisor in register */
        udiv    x10,x3,x9               /* divide and store quotient in register */
        msub    x11,x9,x10,x3           /* store remainder in register */
        adr     x1,msg                  /* message location */
        cmp     x10,0                   /* check if quotient is 0 */
        b.eq    next                    /* jump if quotient is 0 */
        add     x12,x10,0x30            /* convert to ascii and store in register */
        strb    w12,[x1,6]              /* place quotient in msg */

next:
        add     x13,x11,0x30            /* convert to ascii and store in register */
        strb    w13,[x1,7]              /* place remainder in msg */
        mov     x2,len                  /* message length */
        mov     x0,1                    /* file descriptor stdout */
        mov     x8,64                   /* syscall sys_write */
        svc     0                       /* syscall */

        add     x3,x3,1                 /* increment index */
        cmp     x3,max                  /* check if index is at max */
        b.ne    loop                    /* loop if index not at max */

        mov     x0,0                    /* exit status */
        mov     x8,93                   /* syscall sys_exit */
        svc     0                       /* syscall */

.data

msg: .ascii "Loop:   \n"
.set len, . - msg
------------------------------------------------------

Both versions produce the same output:
------------------------------------------------------
Loop:  0
Loop:  1
Loop:  2
Loop:  3
Loop:  4
Loop:  5
Loop:  6
Loop:  7
Loop:  8
Loop:  9
Loop: 10
Loop: 11
Loop: 12
Loop: 13
Loop: 14
Loop: 15
Loop: 16
Loop: 17
Loop: 18
Loop: 19
Loop: 20
Loop: 21
Loop: 22
Loop: 23
Loop: 24
Loop: 25
Loop: 26
Loop: 27
Loop: 28
Loop: 29
------------------------------------------------------

Overall, this lab was pretty difficult for me. Assembly language is much more involved than higher-level languages such as C, C++, or Java: to print a number, the program has to convert it to ASCII digits itself and patch them into the string before writing it out.
What is the difference between x86_64 assembly and AArch64 assembly?
In my opinion, the two are not very different, but I would say x86_64 is easier to write. In x86_64 assembly (in the AT&T syntax used here), most instructions take two operands and store the result in the second one. In AArch64, most instructions store the result in the first operand and take their inputs from the second and third.
