Exploring Assembler on the x86-64 Platform
Seung Woo (Paul) Ji
Posted on February 27, 2022
Introduction
In this post, we are going to develop the same assembly program that we wrote in the previous post, this time on an x86_64 system.
Original Code
.text
.globl _start
min = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10 /* loop exits when the index hits this number (loop condition is i<max) */
_start:
mov $min,%r15 /* loop index */
loop:
inc %r15 /* increment index */
cmp $max,%r15 /* see if we're done */
jne loop /* loop if we're not */
mov $0,%rdi /* exit status */
mov $60,%rax /* syscall sys_exit */
syscall
Like the original AArch64 program, the x86_64 code does nothing but loop 10 times (max = 10). However, there are a number of notable differences from the AArch64 version. First, we use a $ sign to indicate an immediate value and a % sign to indicate a register. Next, we have the inc instruction to increment the value of r15 directly instead of using an add instruction. We also use the jne instruction to jump to a label instead of a conditional branch instruction, and the syscall instruction to invoke a system call. Finally, system calls take their arguments in specific registers: rax holds the system call number and rdi holds the first argument.
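As a minimal sketch of these conventions, the program below does nothing but exit with a status computed in a register. The file name exit7.s and the value 6 are hypothetical, chosen only for illustration; the build commands in the first comment assume the GNU assembler and linker.

```asm
/* Hypothetical file exit7.s. Assumed build: as -o exit7.o exit7.s && ld -o exit7 exit7.o */
.text
.globl _start
_start:
        mov     $6,%r15         /* $ marks an immediate value, % marks a register */
        inc     %r15            /* r15 = 7; same result as add $1,%r15 */
        mov     %r15,%rdi       /* first syscall argument (the exit status) goes in rdi */
        mov     $60,%rax        /* the syscall number goes in rax (60 = sys_exit) */
        syscall                 /* hand control to the kernel */
```

Running the program and then checking the shell's exit status (for example with echo $?) should show 7.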
With that being said, let's continue developing the code so that it actually prints something to the console.
Improved Code - Print Message
.text
.globl _start
min = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10 /* loop exits when the index hits this number (loop condition is i<max) */
_start:
mov $min,%r15 /* loop index */
loop:
mov $len,%rdx /* message length */
mov $msg,%rsi /* message location */
mov $1,%rdi /* file descriptor stdout */
mov $1,%rax /* syscall sys_write */
syscall
inc %r15 /* increment index */
cmp $max,%r15 /* see if we're done */
jne loop /* loop if we're not */
mov $0,%rdi /* exit status */
mov $60,%rax /* syscall sys_exit */
syscall
.section .data
msg: .ascii "Loop\n"
len = . - msg
Result
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
The program does what we expected. However, the printed messages are not very meaningful yet. Let's continue developing the code so that each message includes the loop number.
Improved Code - Print Loop Number
.text
.globl _start
min = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10 /* loop exits when the index hits this number (loop condition is i<max) */
_start:
mov $min,%r15 /* loop index */
loop:
mov %r15,%r14 /* Copy the value of r15 to r14 */
add $'0',%r14 /* Add the ascii value of '0' to the r14 and save */
movb %r14b,msg+6 /* Copy one byte of r14 to the address location of msg + 6 */
mov $len,%rdx /* message length */
mov $msg,%rsi /* message location */
mov $1,%rdi /* file descriptor stdout */
mov $1,%rax /* syscall sys_write */
syscall
inc %r15 /* increment index */
cmp $max,%r15 /* see if we're done */
jne loop /* loop if we're not */
mov $0,%rdi /* exit status */
mov $60,%rax /* syscall sys_exit */
syscall
.section .data
msg: .ascii "Loop: #\n"
len = . - msg /* message length, computed by the assembler */
Result
Loop: 0
Loop: 1
Loop: 2
Loop: 3
Loop: 4
Loop: 5
Loop: 6
Loop: 7
Loop: 8
Loop: 9
Now the program prints more meaningful messages to the screen. Note that there are further differences from the AArch64 assembly. For example, we can reuse the mov instruction to store data from a register to a memory location (here, the address msg + 6). As you may remember, we had to use the str instruction for this job on AArch64. Moreover, we add the b suffix to both the mov instruction and the register name (movb and r14b) in order to limit the move to a single byte.
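The one-byte store can be tried in isolation. The following is a minimal sketch, assuming a hypothetical buffer named buf and a hard-coded digit; it patches a single character of the message and prints it once.

```asm
.text
.globl _start
_start:
        mov     $'7',%r14       /* r14 = 0x37, the ASCII code for '7' */
        movb    %r14b,buf+6     /* store only the low byte of r14 over the '#' at buf+6 */
                                /* a movq %r14,buf+6 would instead write 8 bytes starting at buf+6 */
        mov     $len,%rdx       /* message length */
        mov     $buf,%rsi       /* message location */
        mov     $1,%rdi         /* file descriptor stdout */
        mov     $1,%rax         /* syscall sys_write */
        syscall
        mov     $0,%rdi         /* exit status */
        mov     $60,%rax        /* syscall sys_exit */
        syscall
.section .data
buf:    .ascii "Digit #\n"      /* hypothetical message; '#' sits at offset 6 */
len = . - buf
```

This should print "Digit 7" followed by a newline. The other size suffixes (w, l, q) work the same way with 2, 4, and 8 bytes.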
However, this code still cannot handle two-digit loop numbers.
Improved Code - Print Two Digit Loop Number
.text
.globl _start
min = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 15 /* loop exits when the index hits this number (loop condition is i<max) */
_start:
mov $min,%r15 /* loop index */
mov $10,%r13 /* Divisor */
loop:
// Dividing by 10
mov %r15,%rax /* Setting rax with the value of dividend */
mov $0,%rdx /* rdx must be set to 0 before using div instruction */
div %r13 /* divide rax by the r13; place quotient into rax and remainder into rdx */
cmp $0,%rax
je oneDigit
// Inserting tens digit
add $'0',%rax /* Add the ascii value of '0' to the rax and save */
mov %rax,%r12
movb %r12b,msg+6 /* Copy the low byte of r12 (the tens digit character) to the address location of msg + 6 */
oneDigit:
// Inserting ones digit
add $'0',%rdx /* Add the ascii value of '0' to the rdx and save */
mov %rdx,%r12
movb %r12b,msg+7 /* Copy the low byte of r12 (the ones digit character) to the address location of msg + 7 */
// Print Message
mov $len,%rdx /* message length */
mov $msg,%rsi /* message location */
mov $1,%rdi /* file descriptor stdout */
mov $1,%rax /* syscall sys_write */
syscall
inc %r15 /* increment index */
cmp $max,%r15 /* see if we're done */
jne loop /* loop if we're not */
mov $0,%rdi /* exit status */
mov $60,%rax /* syscall sys_exit */
syscall
.section .data
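msg: .ascii "Loop:  #\n" /* two spaces after the colon: offset 6 holds the tens digit (left blank for 0-9), offset 7 the ones digit, offset 8 the newline */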
len = . - msg
In this code, we divide the loop index stored in r15 by 10. Unlike the udiv instruction on AArch64, the div instruction produces both a quotient (in rax) and a remainder (in rdx). We print the quotient as the tens digit and the remainder as the ones digit. To suppress a leading zero, we jump to the oneDigit label and skip writing the tens digit when the quotient is equal to 0.
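To see the division on its own, here is a minimal sketch that splits the fixed value 13 into its two digits and prints them. The value 13 and the buffer name digits are hypothetical, and the digits are stored straight from the byte registers al and dl rather than being copied through r12 as in the listing above.

```asm
.text
.globl _start
_start:
        mov     $13,%rax        /* low 64 bits of the dividend */
        mov     $0,%rdx         /* high 64 bits of the dividend; div reads rdx:rax */
        mov     $10,%r13        /* divisor */
        div     %r13            /* rax = 1 (quotient), rdx = 3 (remainder) */
        add     $'0',%rax       /* quotient becomes the character '1' */
        add     $'0',%rdx       /* remainder becomes the character '3' */
        movb    %al,digits      /* tens digit from the low byte of rax */
        movb    %dl,digits+1    /* ones digit from the low byte of rdx */
        mov     $len,%rdx       /* message length (rdx is free again after the store) */
        mov     $digits,%rsi    /* message location */
        mov     $1,%rdi         /* file descriptor stdout */
        mov     $1,%rax         /* syscall sys_write */
        syscall
        mov     $0,%rdi         /* exit status */
        mov     $60,%rax        /* syscall sys_exit */
        syscall
.section .data
digits: .ascii "##\n"           /* hypothetical two-character buffer plus newline */
len = . - digits
```

This should print "13" followed by a newline. Clearing rdx matters because div treats rdx:rax as one 128-bit dividend, so a stale value in rdx would change the result or raise a divide error.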
Conclusion
In this post, we explored how to write a program on x86_64 that has the same logic as the AArch64 program from the previous post. Writing code that produces the same result on two different systems brings interesting challenges: we have to understand each instruction set and how its instructions behave. Debugging is also difficult on both systems, as we have to rely on either inspecting error messages from the assembler or using objdump to disassemble the generated machine code.