Lab 5 - 64-Bit Assembly Language Lab
Introduction
In this lab, I explored how assembly language works on two different architectures: x86_64 and AArch64. The goal was to understand how simple programs like "Hello, World!" and loops are written, compiled, and executed in assembly for these architectures.
I started by running a C program, then analyzed how it translates into assembly. Finally, I modified and tested an AArch64 assembly program to print numbers in a loop.
Main Process
Step 1: Setting Up the Environment
First, I connected to the x86_64 server via SSH. Then, I extracted the provided lab files:
cd ~
tar xvf /public/spo600-assembler-lab-examples.tgz
This created a directory structure with example files for both architectures.
Step 2: Running the C Program
Before diving into assembly, I compiled and ran the C version of the program to understand its behaviour:
cd ~/spo600/examples/hello/c
make # Compiles the C files
./hello # Runs the compiled program
Step 3: Disassembling the Executable
To analyze how the C program translates into assembly, I used objdump:
objdump -d hello
This showed the assembly instructions inside the executable, especially for the main function.
Key Takeaways (From the disassembled output):
The printf("Hello, World!") call in C maps to an assembly call printf@plt instruction, which calls the printf function in the standard C library through the Procedure Linkage Table (PLT).
The mov instruction loads values into registers, while push and pop manage the function stack frame.
The compiled C binary includes a lot of additional instructions beyond just printing text. These include stack setup, function prologues/epilogues, and standard library function calls.
The C program has extra instructions for setup and cleanup, which are absent in handwritten assembly, making the compiled binary larger than a manually optimized assembly version.
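Step 4: Running the AArch64 Assembly Code
Next, I ran the handwritten AArch64 assembly version. The commands below extract the lab files again, then build, run, and disassemble the example: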
tar xvf /public/spo600-assembler-lab-examples.tgz
ls ~/spo600/examples/hello/assembler/aarch64
cd ~/spo600/examples/hello/assembler/aarch64
make
./hello
objdump -d hello
Step 5: Modifying AArch64 Assembly and Debugging Issues
Initial Code
The original AArch64 assembly program was a simple implementation of Hello, World!, structured as follows:
.text
.globl _start
_start:
mov x0, 1
adr x1, msg
mov x2, len
mov x8, 64
svc 0
mov x0, 0
mov x8, 93
svc 0
.data
msg: .ascii "Hello, world!\n"
len= . - msg
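For readers more familiar with high-level code, the two system calls above can be sketched in Python. This is only an analogy and not part of the lab files: os.write stands in for syscall 64 (write), with its arguments playing the roles of x0 (file descriptor), x1 (buffer), and x2 (length). The names MSG, LEN, and hello are my own.

```python
import os

MSG = b"Hello, world!\n"   # same bytes as the msg label
LEN = len(MSG)             # plays the role of: len = . - msg


def hello(fd=1):
    """Write MSG to fd, like: mov x0, fd / adr x1, msg / mov x2, len / svc 0."""
    # os.write returns the number of bytes written, much as the kernel
    # leaves the result of the write syscall in x0.
    return os.write(fd, MSG[:LEN])


if __name__ == "__main__":
    hello()  # write(1, msg, len); the assembly then calls exit(0) via syscall 93
```

The length is computed from the data rather than typed in by hand, which is the same idea as letting the assembler compute len from the current location minus the msg label.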
First Experiment: Single-Digit Loop Counter
The first attempt was a simple loop that printed the numbers 0 to 5, each as a single ASCII character:
.text
.globl _start
_start:
mov x19, #0 // Loop counter (starts at 0)
loop:
mov x0, 1 // stdout file descriptor
adr x1, loop_msg // Load "Loop: " message address
mov x2, loop_len // Load message length
mov x8, 64 // write syscall
svc 0 // Call kernel
// Convert loop index (x19) to ASCII and print
add x20, x19, #48 // Convert number to ASCII ('0' = 48)
adr x1, num_char // Address of character buffer
strb w20, [x1] // Store ASCII character in buffer
mov x0, 1 // stdout file descriptor
adr x1, num_char // Load buffer address
mov x2, #1 // Print one character
mov x8, 64 // write syscall
svc 0 // Call kernel
// Print newline
mov x0, 1
adr x1, newline
mov x2, #1
mov x8, 64
svc 0
add x19, x19, #1 // Increment counter
cmp x19, #6 // Stop at 6
b.ne loop // Repeat until x19 reaches 6
// Exit program
mov x0, 0
mov x8, 93 // exit syscall
svc 0
.data
loop_msg: .ascii "Loop: " // Prefix message
loop_len= . - loop_msg // Message length
num_char: .ascii "0" // Placeholder for loop index
newline: .ascii "\n" // Newline character
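The loop logic above can be modelled in a few lines of Python. This is a rough sketch for illustration, assuming the same prefix and bound as the listing; the function names digit_to_ascii and loop_lines are hypothetical and do not come from the lab.

```python
def digit_to_ascii(n):
    """Model of: add x20, x19, #48 (only meaningful for 0 <= n <= 9)."""
    if not 0 <= n <= 9:
        raise ValueError("a single ASCII digit can only represent 0-9")
    return n + 48  # ord('0') == 48


def loop_lines(limit=6):
    """Build the lines the loop writes; the counter runs from 0 to limit - 1."""
    lines = []
    counter = 0                                  # mov x19, #0
    while True:
        char = bytes([digit_to_ascii(counter)])  # value stored by strb w20, [x1]
        lines.append(b"Loop: " + char + b"\n")
        counter += 1                             # add x19, x19, #1
        if counter == limit:                     # cmp x19, #6 / b.ne loop
            break
    return lines


if __name__ == "__main__":
    for line in loop_lines():
        print(line.decode(), end="")
```

The range check in digit_to_ascii is not in the assembly. It is there to show the limit of this approach: adding 48 to 10 gives 58, which is the colon character rather than a digit, so larger counters need more than one character.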
Second Experiment: Two-Digit Loop Counter (0-32)
After successfully printing single digits, I modified the code to print numbers from 00 to 32 using two-digit formatting:
.text
.globl _start
_start:
mov x19, #0 // Loop counter (starts at 0)
mov x22, #10 // Store 10 in register for division
loop:
// Print "Loop: "
mov x0, 1
ldr x1, =loop_msg
mov x2, loop_len
mov x8, 64
svc 0
// Convert x19 (0-32) to two-digit ASCII
udiv x20, x19, x22 // x20 = x19 / 10 (quotient, tens digit)
msub x21, x20, x22, x19 // x21 = x19 - (x20 * 10) (remainder, ones digit)
add x20, x20, #48 // Convert quotient to ASCII ('0'-'9')
add x21, x21, #48 // Convert remainder to ASCII ('0'-'9')
// Store digits in buffer
ldr x1, =num_chars // Address of buffer
strb w20, [x1] // Store tens digit
strb w21, [x1, #1] // Store ones digit
// Print two-digit number
mov x0, 1
ldr x1, =num_chars
mov x2, #2
mov x8, 64
svc 0
// Print newline
mov x0, 1
ldr x1, =newline
mov x2, #1
mov x8, 64
svc 0
add x19, x19, #1 // Increment counter
cmp x19, #33 // Stop at 33 (prints 00 to 32)
b.ne loop // Repeat until x19 reaches 33
// Exit program
mov x0, 0
mov x8, 93
svc 0
.data
.align 4
loop_msg: .ascii "Loop: " // .ascii (not .asciz) so loop_len excludes a trailing NUL byte
loop_len= . - loop_msg
.align 4
num_chars: .asciz "00" // Placeholder for 2-digit number
.align 4
newline: .asciz "\n"
I encountered an issue where the ASCII conversion was incorrect, leading to unintended characters appearing. After debugging, I found that I had not properly handled the division and remainder when splitting the digits. By correctly using udiv and msub, I was able to get the expected output.
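To make the digit-splitting arithmetic concrete, here is a small Python model of the two instructions. It is a sketch of what udiv and msub compute, not code from the lab, and the helper names udiv, msub, and two_digits are chosen here just to mirror the mnemonics.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def udiv(a, b):
    """Unsigned divide; AArch64 udiv yields 0 when the divisor is 0."""
    return 0 if b == 0 else a // b


def msub(n, m, a):
    """msub xd, xn, xm, xa computes xa - (xn * xm), kept to 64 bits."""
    return (a - n * m) & MASK64


def two_digits(value):
    """Split a value in the range 0-99 into two ASCII digits."""
    tens = udiv(value, 10)        # udiv x20, x19, x22
    ones = msub(tens, 10, value)  # msub x21, x20, x22, x19
    return bytes([tens + 48, ones + 48])


if __name__ == "__main__":
    for i in (0, 9, 10, 32):
        print(i, two_digits(i))
```

AArch64 has no instruction that returns a remainder directly, so the remainder is rebuilt from the quotient: value minus quotient times 10. The model only makes sense for values from 0 to 99, which covers the 00 to 32 range used in the loop.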
Third Experiment: Hexadecimal Loop Counter
.text
.globl _start
_start:
mov x19, #0 // loop counter
loop:
// print "Loop: 0x"
mov x0, #1
ldr x1, =prefix
mov x2, #8
mov x8, #64
svc 0
// high nibble = x19 >> 4
mov x20, x19
lsr x21, x20, #4
bl to_hexchar
ldr x1, =hexbuf // Load hexbuf address
strb w0, [x1] // Store high nibble as ASCII
// low nibble = x19 & 0x0F
and x21, x20, #0x0F
bl to_hexchar
strb w0, [x1, #1] // Store low nibble as ASCII
// print hexbuf (2 chars)
mov x0, #1
ldr x1, =hexbuf
mov x2, #2
mov x8, #64
svc 0
// newline
mov x0, #1
ldr x1, =newline
mov x2, #1
mov x8, #64
svc 0
add x19, x19, #1
cmp x19, #33
b.ne loop
// exit
mov x0, #0
mov x8, #93
svc 0
// Subroutine to convert number (0-15) to hex ASCII character
to_hexchar:
cmp x21, #10
b.lt .digit
add x0, x21, #55 // 'A' = 65 = 10 + 55
ret
.digit:
add x0, x21, #48 // '0' = 48
ret
.data
prefix: .asciz "Loop: 0x"
hexbuf: .space 2
newline: .asciz "\n"
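The nibble handling in this listing can be checked against a short Python model. This is an illustrative sketch: to_hexchar mirrors the subroutine of the same name, while hex_byte is a hypothetical wrapper for the shift and mask steps in the loop body.

```python
def to_hexchar(nibble):
    """Model of the to_hexchar subroutine: maps 0-15 to an ASCII code."""
    if nibble < 10:          # cmp x21, #10, then branch to .digit
        return nibble + 48   # '0' = 48
    return nibble + 55       # 'A' = 65 = 10 + 55


def hex_byte(value):
    """Convert a value in the range 0-255 to two uppercase hex characters."""
    high = value >> 4        # lsr x21, x20, #4
    low = value & 0x0F       # and x21, x20, #0x0F
    return bytes([to_hexchar(high), to_hexchar(low)])


if __name__ == "__main__":
    for i in (0, 10, 15, 16, 32):
        print(i, hex_byte(i))
```

As in the assembly, this only works for values that fit in one byte. Above 255 the shifted value is no longer a single nibble, so a wider counter would need more shift and mask steps.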
Step 6: Repeating the Step 5 Experiments on x86_64
First Experiment: Single-Digit Loop Counter
.section .data
prefix: .ascii "Loop: "
newline: .ascii "\n"
digit: .byte 0
.section .text
.globl _start
_start:
mov $0, %r15
loop:
# print prefix
mov $1, %rax
mov $1, %rdi
lea prefix(%rip), %rsi
mov $6, %rdx
syscall
# convert number to ASCII
mov %r15b, %al
add $'0', %al
mov %al, digit(%rip)
# print digit
mov $1, %rax
mov $1, %rdi
lea digit(%rip), %rsi
mov $1, %rdx
syscall
# newline
mov $1, %rax
mov $1, %rdi
lea newline(%rip), %rsi
mov $1, %rdx
syscall
inc %r15
cmp $6, %r15
jne loop
mov $60, %rax
xor %rdi, %rdi
syscall
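Comparing this listing with the AArch64 version in Step 5, the structure is the same but the system call numbers and registers differ. The small Python table below only collects the values that appear in the listings in this post. The names SYSCALL_ABI and describe are my own, and the table is a summary aid rather than a complete description of either ABI.

```python
# Values as used in the listings in this post (Linux system calls).
SYSCALL_ABI = {
    "aarch64": {
        "number_reg": "x8",
        "arg_regs": ("x0", "x1", "x2"),
        "instruction": "svc 0",
        "write": 64,
        "exit": 93,
    },
    "x86_64": {
        "number_reg": "rax",
        "arg_regs": ("rdi", "rsi", "rdx"),
        "instruction": "syscall",
        "write": 1,
        "exit": 60,
    },
}


def describe(arch, name):
    """Return a one-line summary of how the syscall `name` is invoked on `arch`."""
    abi = SYSCALL_ABI[arch]
    args = ", ".join(abi["arg_regs"])
    return (f"{name}: {abi['number_reg']} = {abi[name]}, "
            f"args in {args}, then {abi['instruction']}")


if __name__ == "__main__":
    for arch in SYSCALL_ABI:
        print(arch, "|", describe(arch, "write"))
```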
Second Experiment: Two-Digit Loop Counter (0-32)
.section .data
prefix: .ascii "Loop: "
newline: .ascii "\n"
digits: .space 2
.section .text
.globl _start
_start:
mov $0, %r15 # loop counter
loop:
# print prefix
mov $1, %rax
mov $1, %rdi
lea prefix(%rip), %rsi
mov $6, %rdx
syscall
# move loop counter to %rax for division
mov %r15, %rax
xor %rdx, %rdx # clear remainder
mov $10, %rbx
div %rbx # quotient = %rax, remainder = %rdx
# convert to ASCII
add $'0', %al # tens digit ASCII
add $'0', %dl # ones digit ASCII
# store digits
mov %al, digits(%rip)
mov %dl, digits+1(%rip)
# print digits
mov $1, %rax
mov $1, %rdi
lea digits(%rip), %rsi
mov $2, %rdx
syscall
# print newline
mov $1, %rax
mov $1, %rdi
lea newline(%rip), %rsi
mov $1, %rdx
syscall
inc %r15
cmp $33, %r15
jne loop
# exit
mov $60, %rax
xor %rdi, %rdi
syscall
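The reason for clearing %rdx before div is that the 64-bit form of div divides the 128-bit value held in RDX:RAX, not just RAX. The following Python model is a sketch of that behaviour; the name div64 is hypothetical, and Python exceptions stand in for the divide error the CPU would raise.

```python
MASK64 = (1 << 64) - 1


def div64(rdx, rax, divisor):
    """Model of x86_64 div with a 64-bit operand.

    Divides the 128-bit value RDX:RAX by divisor and returns
    (quotient, remainder), which the CPU places in RAX and RDX.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        raise OverflowError("divide error: quotient does not fit in RAX")
    return quotient, remainder


if __name__ == "__main__":
    print(div64(0, 32, 10))  # RDX cleared first, as in the listing
    print(div64(6, 32, 10))  # stale RDX left over from the write call
```

In the listing, %rdx still holds 6 (the prefix length) from the preceding write call, so without the xor the instruction would divide 6 * 2^64 + counter by 10 and the digits would be wrong.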
Third Experiment: Hexadecimal Loop Counter
.section .data
prefix: .ascii "Loop: 0x"
newline: .ascii "\n"
hexbuf: .space 2 # 2 hex digits will go here
.section .text
.globl _start
_start:
mov $0, %r15 # loop counter = 0
loop:
# print "Loop: 0x"
mov $1, %rax
mov $1, %rdi
lea prefix(%rip), %rsi
mov $8, %rdx
syscall
# convert to hex digits
mov %r15b, %al # copy counter to AL
mov %al, %bl # backup for low nibble
shr $4, %al # high nibble
call hexchar
mov %al, hexbuf(%rip)
mov %bl, %al
and $0x0F, %al # low nibble
call hexchar
mov %al, hexbuf+1(%rip)
# print hex digits
mov $1, %rax
mov $1, %rdi
lea hexbuf(%rip), %rsi
mov $2, %rdx
syscall
# print newline
mov $1, %rax
mov $1, %rdi
lea newline(%rip), %rsi
mov $1, %rdx
syscall
inc %r15
cmp $0x21, %r15
jne loop
# exit
mov $60, %rax
xor %rdi, %rdi
syscall
# Subroutine: Convert AL to ASCII hex character
hexchar:
cmp $10, %al
jl .digit
add $55, %al # 'A' = 65 = 10 + 55
ret
.digit:
add $48, %al # '0' = 48
ret
6502 vs x86_64 vs AArch64 Assembly
6502 Assembly
Limited registers: 6502 only has three main registers (A, X, and Y), making it much more constrained than x86_64 and AArch64.
No direct multiplication or division: Unlike x86_64 and AArch64, 6502 lacks built-in multiplication or division instructions, so programmers must implement them manually.
Memory addressing is more restricted: The 6502 uses 16-bit addresses, so it can directly address only 64 KB of memory, making it much more difficult to work with large data sets.
More manual work around subroutines: The 6502 provides JSR/RTS and a 256-byte hardware stack, but there is no standard calling convention, so passing arguments and saving registers must be handled manually.
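As an illustration of what implementing multiplication manually involves, here is the classic shift-and-add method modelled in Python with 8-bit inputs and a 16-bit result. This is a generic sketch of the technique, not code from the lab or from any particular 6502 program, and the name mul8 is hypothetical.

```python
def mul8(a, b):
    """Multiply two 8-bit values using only shifts, adds, and bit tests."""
    a &= 0xFF
    b &= 0xFF
    product = 0                               # 16-bit result
    for _ in range(8):                        # one pass per multiplier bit
        if b & 1:                             # lowest multiplier bit set?
            product = (product + a) & 0xFFFF  # add the shifted multiplicand
        a <<= 1                               # multiplicand times two
        b >>= 1                               # move to the next multiplier bit
    return product


if __name__ == "__main__":
    print(mul8(12, 11))
```

On the 6502 each of these steps would be a short sequence of shift, rotate, add-with-carry, and branch instructions, with the 16-bit product spread across two bytes. On x86_64 and AArch64 the same work is a single multiply instruction.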
x86_64 Assembly
Complex instruction set (CISC): x86_64 has a large number of instructions and addressing modes, making it very powerful but also complicated.
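Function calls use both registers and the stack: The call instruction pushes the return address onto the stack. Under the System V calling convention used on Linux, the first six integer arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9) and any further arguments go on the stack.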
Support for floating-point operations and SIMD: Unlike 6502, x86_64 has built-in support for floating-point arithmetic and vectorized operations.
AArch64 Assembly
More regular than x86_64: AArch64 has 31 uniform general-purpose registers (x0-x30) and a fixed 32-bit instruction length, which makes instructions simpler to decode and code easier to optimize.
Register-based calling convention: AArch64 passes the first eight integer arguments in registers x0-x7, and bl saves the return address in the link register (x30) instead of pushing it onto the stack.
More suited for modern mobile and embedded devices: Many smartphones and IoT devices use AArch64 because of its power efficiency.
Reflection
This lab provided hands-on experience with assembly programming on x86_64 and AArch64, along with a comparison against the 6502. I learned how C code translates into machine instructions, how different architectures handle function calls, and how to manipulate registers and memory directly.
Debugging assembly was challenging, especially with incorrect ASCII conversions in AArch64, but fixing these issues helped reinforce my understanding of low-level programming.
Overall, it was a demanding but rewarding exercise.