Instructions
Objective
Write a MIPS assembly assignment program to perform Matrix operations on square matrices in MIPS assembly on MARS simulator Assembly language.
Requirements and Specifications
In this assignment you will write code to multiply two square n × n matrices of single precision floating point numbers, and then optimize the code to exploit a memory cache. All the functions you write in this assignment must respect register conventions and work for different sizes of square matrices. Your code must also include useful comments to make it readable.
You will need to use two MARS tools in this assignment:
- Data Cache Simulator: This tool allows you to set different cache sizes and types, and measures the number of memory accesses, and cache misses.
- Instruction Counter: This tool counts the number of true MIPS assembly instructions that execute during your program.
Each tool needs to be connected to MARS, and you will want to use a combination of breakpoints and the reset button on each tool to make careful measurements of your code performance.
You will also likely want to try the Memory Reference Visualization tool (much like the Bitmap Display), as it lets you watch the memory reference patterns generated by your program. Likewise, the bitmap display tool will also be useful for visualizing the results. Remember to set the base address to the heap (0x10040000), and choose the unit and display width to match the matrix size (N = display width pided by unit width). Running some tools, may noticeably slow down the execution of your program. If ever you notice MARS running much too slow, try restarting.
Screenshots of output
Source Code
# TODO: PUT YOUR NAME AND STUDENT NUMBER HERE!!!
# TODO: ADD OTHER COMMENTS YOU HAVE HERE AT THE TOP OF THIS FILE
# TODO: SEE LABELS FOR PROCEDURES YOU MUST IMPLEMENT AT THE BOTTOM OF THIS FILE
.data
TestNumber: .word 2 # TODO: Which test to run!
# 0 compare matrices stored in files Afname and Bfname
# 1 test Proc using files A through D named below
# 2 compare MADD1 and MADD2 with random matrices of size Size
Proc: MADD1 # Procedure used by test 2, set to MADD1 or MADD2
Size: .word 64 # matrix size (MUST match size of matrix loaded for test 0 and 1)
Afname: .asciiz "A64.bin"
Bfname: .asciiz "B64.bin"
Cfname: .asciiz "C64.bin"
Dfname: .asciiz "D64.bin"
#################################################################
# Main function for testing assignment objectives.
# Modify this function as needed to complete your assignment.
# Note that the TA will ultimately use a different testing program.
.text
main: la $t0 TestNumber
lw $t0 ($t0)
beq $t0 0 compareMatrix
beq $t0 1 testFromFile
beq $t0 2 compareMADD
li $v0 10 # exit if the test number is out of range
syscall
compareMatrix: la $s7 Size
lw $s7 ($s7) # Let $s7 be the matrix size n
move $a0 $s7
jal mallocMatrix # all3E-4cate heap memory and load matrix A
move $s0 $v0 # $s0 is a pointer to matrix A
la $a0 Afname
move $a1 $s7
move $a2 $s7
move $a3 $s0
jal loadMatrix
move $a0 $s7
jal mallocMatrix # allocate heap memory and load matrix B
move $s1 $v0 # $s1 is a pointer to matrix B
la $a0 Bfname
move $a1 $s7
move $a2 $s7
move $a3 $s1
jal loadMatrix
move $a0 $s0
move $a1 $s1
move $a2 $s7
jal check
li $v0 10 # load exit call code 10 into $v0
syscall # call operating system to exit
testFromFile: la $s7 Size
lw $s7 ($s7) # Let $s7 be the matrix size n
move $a0 $s7
jal mallocMatrix # allocate heap memory and load matrix A
move $s0 $v0 # $s0 is a pointer to matrix A
la $a0 Afname
move $a1 $s7
move $a2 $s7
move $a3 $s0
jal loadMatrix
move $a0 $s7
jal mallocMatrix # allocate heap memory and load matrix B
move $s1 $v0 # $s1 is a pointer to matrix B
la $a0 Bfname
move $a1 $s7
move $a2 $s7
move $a3 $s1
jal loadMatrix
move $a0 $s7
jal mallocMatrix # allocate heap memory and load matrix C
move $s2 $v0 # $s2 is a pointer to matrix C
la $a0 Cfname
move $a1 $s7
move $a2 $s7
move $a3 $s2
jal loadMatrix
move $a0 $s7
jal mallocMatrix # allocate heap memory and load matrix A
move $s3 $v0 # $s3 is a pointer to matrix D
la $a0 Dfname
move $a1 $s7
move $a2 $s7
move $a3 $s3
jal loadMatrix # D is the answer, i.e., D = AB+C
# TODO: add your testing code here
move $a0, $s0 # A
move $a1, $s1 # B
move $a2, $s2 # C
move $a3, $s7 # n
la $ra ReturnHere
la $t0 Proc # function pointer
lw $t0 ($t0)
jr $t0 # like a jal to MADD1 or MADD2 depending on Proc definition
ReturnHere: move $a0 $s2 # C
move $a1 $s3 # D
move $a2 $s7 # n
jal check # check the answer
li $v0, 10 # load exit call code 10 into $v0
syscall # call operating system to exit
compareMADD: la $s7 Size
lw $s7 ($s7) # n is loaded from Size
mul $s4 $s7 $s7 # n^2
sll $s5 $s4 2 # n^2 * 4
move $a0 $s5
li $v0 9 # malloc A
syscall
move $s0 $v0
move $a0 $s5 # malloc B
li $v0 9
syscall
move $s1 $v0
move $a0 $s5 # malloc C1
li $v0 9
syscall
move $s2 $v0
move $a0 $s5 # malloc C2
li $v0 9
syscall
move $s3 $v0
move $a0 $s0 # A
move $a1 $s4 # n^2
jal fillRandom # fill A with random floats
move $a0 $s1 # B
move $a1 $s4 # n^2
jal fillRandom # fill A with random floats
move $a0 $s2 # C1
move $a1 $s4 # n^2
jal fillZero # fill A with random floats
move $a0 $s3 # C2
move $a1 $s4 # n^2
jal fillZero # fill A with random floats
move $a0 $s0 # A
move $a1 $s1 # B
move $a2 $s2 # C1 # note that we assume C1 to contain zeros !
move $a3 $s7 # n
jal MADD1
move $a0 $s0 # A
move $a1 $s1 # B
move $a2 $s3 # C2 # note that we assume C2 to contain zeros !
move $a3 $s7 # n
jal MADD2
move $a0 $s2 # C1
move $a1 $s3 # C2
move $a2 $s7 # n
jal check # check that they match
li $v0 10 # load exit call code 10 into $v0
syscall # call operating system to exit
###############################################################
# mallocMatrix( int N )
# Allocates memory for an N by N matrix of floats
# The pointer to the memory is returned in $v0
mallocMatrix: mul $a0, $a0, $a0 # Let $s5 be n squared
sll $a0, $a0, 2 # Let $s4 be 4 n^2 bytes
li $v0, 9
syscall # malloc A
jr $ra
###############################################################
# loadMatrix( char* filename, int width, int height, float* buffer )
.data
errorMessage: .asciiz "FILE NOT FOUND"
.text
loadMatrix: mul $t0 $a1 $a2 # words to read (width x height) in a2
sll $t0 $t0 2 # multiply by 4 to get bytes to read
li $a1 0 # flags (0: read, 1: write)
li $a2 0 # mode (unused)
li $v0 13 # open file, $a0 is null-terminated string of file name
syscall
slti $t1 $v0 0
beq $t1 $0 fileFound
la $a0 errorMessage
li $v0 4
syscall # print error message
li $v0 10 # and then exit
syscall
fileFound: move $a0 $v0 # file descriptor (negative if error) as argument for read
move $a1 $a3 # address of buffer in which to write
move $a2 $t0 # number of bytes to read
li $v0 14 # system call for read from file
syscall # read from file
# $v0 contains number of characters read (0 if end-of-file, negative if error).
# We'll assume that we do not need to be checking for errors!
# Note, the bitmap display doesn't update properly on load,
# so let's go touch each memory address to refresh it!
move $t0 $a3 # start address
add $t1 $a3 $a2 # end address
loadloop: lw $t2 ($t0)
sw $t2 ($t0)
addi $t0 $t0 4
bne $t0 $t1 loadloop
li $v0 16 # close file ($a0 should still be the file descriptor)
syscall
jr $ra
##########################################################
# Fills the matrix $a0, which has $a1 entries, with random numbers
fillRandom: li $v0 43
syscall # random float, and assume $a0 unmodified!!
swc1 $f0 0($a0)
addi $a0 $a0 4
addi $a1 $a1 -1
bne $a1 $zero fillRandom
jr $ra
##########################################################
# Fills the matrix $a0 , which has $a1 entries, with zero
fillZero: sw $zero 0($a0) # $zero is zero single precision float
addi $a0 $a0 4
addi $a1 $a1 -1
bne $a1 $zero fillZero
jr $ra
######################################################
# TODO: void subtract( float* A, float* B, float* C, int N ) C = A - B
subtract:
mul $t0, $a3, $a3 # multiply N*N
subLoop:
l.s $f0, 0($a0) # load value from A
l.s $f1, 0($a1) # load value from B
sub.s $f0, $f0, $f1 # subtract A - B values
s.s $f0, 0($a2) # save result in C
addi $a0, $a0, 4 # advance A pointer
addi $a1, $a1, 4 # advance B pointer
addi $a2, $a2, 4 # advance C pointer
addi $t0, $t0, -1 # decrement number of entries to subtract
bnez $t0, subLoop # repeat while not zero
jr $ra
#################################################
# TODO: float frobeneousNorm( float* A, int N )
frobeneousNorm:
mul $t0, $a1, $a1 # multiply N*N
mtc1 $zero, $f0 # initialize sum to zero
frobLoop:
l.s $f1, 0($a0) # load value from A
mul.s $f1, $f1, $f1 # calculate square
add.s $f0, $f0, $f1 # add square to sum
addi $a0, $a0, 4 # advance A pointer
addi $t0, $t0, -1 # decrement number of entries to add
bnez $t0, frobLoop # repeat while not zero
sqrt.s $f0, $f0 # take square root and return it
jr $ra
#################################################
# TODO: void check ( float* C, float* D, int N )
# Print the forbeneous norm of the difference of C and D
check:
addi $sp, $sp, -12
sw $ra, 0($sp) # save ra in stack
sw $s0, 4($sp) # save s0 in stack
sw $s1, 8($sp) # save s1 in stack
move $s0, $a0 # save C pointer
move $a3, $a2 # pass N
move $a2, $a0 # save C-D result in C
jal subtract
move $a0, $s0 # pass C which holds the subtraction
move $a1, $a3 # pass N
jal frobeneousNorm # calculate frobeneous norm
mov.s $f12, $f0 # copy result to f12 for syscall
li $v0, 2 # use syscall 2 to print float
syscall # print float
lw $ra, 0($sp) # restore ra from stack
lw $s0, 4($sp) # restore s0 from stack
lw $s1, 8($sp) # restore s1 from stack
addi $sp, $sp, 12
jr $ra
##############################################################
# TODO: void MADD1( float*A, float* B, float* C, N )
MADD1:
sll $t5, $a3, 2 # calculate row size = N*4
move $t6, $a1 # save b pointer
move $t0, $a3 # copy N to t0 (i)
fori: # for( i = 0; i < n; i++ ) {
move $a1, $t6 # restore b pointer
move $t1, $a3 # copy N to t1, (j)
forj: # for( j = 0; j < n; j++ ) {
move $t2, $a3 # copy M to t3, (k)
move $t3, $a0 # copy current a[i] in t3
move $t4, $a1 # get &b[0][j]
mtc1 $zero, $f0 # sum = 0.0;
fork: # for( k = 0; k < n; k++ ) {
# c[i][j] += a[i][k] * b[k][j];
l.s $f1, 0($t3) # load A[i][k]
l.s $f2, 0($t4) # load b[k][j]
mul.s $f1, $f1, $f2 # multiply a[i][k] * b[k][j]
add.s $f0, $f0, $f1 # add result to sum
addi $t3, $t3, 4 # advance A[i][k]
add $t4, $t4, $t5 # advance B[k][j]
addi $t2, $t2, -1 # decrement k
bnez $t2, fork # repeat while not zero
l.s $f1, 0($a2) # load C[i][j]
add.s $f1, $f1, $f0 # add result to C[i][j]
s.s $f1, 0($a2) # save result in C[i][j]
addi $a2, $a2, 4 # advance to next position in C
addi $a1, $a1, 4 # avance to b[0][j]
addi $t1, $t1, -1 # decrement j
bnez $t1, forj # repeat while not zero
move $a0, $t3 # advance A[i] to A[i+1]
addi $t0, $t0, -1 # decrement i
bnez $t0, fori # repeat while not zero
jr $ra
#########################################################
# TODO: void MADD2( float*A, float* B, float* C, N )
MADD2:
addi $sp, $sp, -8 # save used registers
sw $s0, 0($sp)
sw $s1, 4($sp)
sll $t8, $a3, 2 # calculate row size = N*4
li $t0, 0 # jj = 0
forjj: # for( jj = 0; jj < n; jj += bsize ) {
li $t1, 0 # kk = 0
forkk: # for( kk = 0; kk < n; kk += bsize )
li $t2, 0 # i = 0
fori1: # for( i = 0; i < n; i++ ) {
mul $s0, $a3, $t2 # i*n
add $s0, $s0, $t1 # i*n + kk
sll $s0, $s0, 2 # 4*(i*n + kk)
add $s0, $s0, $a0 # &a[i][kk]
move $t3, $t0 # j = jj
li $t6, 4 # bsize
forj1: # for( j = jj; j < min( jj + bsize, n ); j++ ) {
mtc1 $zero, $f0 # sum = 0.0;
move $t5, $s0 # save a[i][k]
mul $s1, $a3, $t1 # kk*n
add $s1, $s1, $t3 # kk*n + j
sll $s1, $s1, 2 # 4*(kk*n + j)
add $s1, $s1, $a1 # &b[kk][j]
sub $t4, $a3, $t1 # k = n - kk
li $t7, 4 # bsize
fork1: # for( k=kk; k < min( kk + bsize, n ); k++ ) {
# sum += a[i][k] * b[k][j];
l.s $f1, 0($s0) # load A[i][k]
l.s $f2, 0($s1) # load b[k][j]
mul.s $f1, $f1, $f2 # multiply a[i][k] * b[k][j]
add.s $f0, $f0, $f1 # add result to sum
addi $s0, $s0, 4 # advance a[i][k]
add $s1, $s1, $t8 # advance b[k][j]
addi $t7, $t7, -1 # decrement bsize
beqz $t7, endfork # end if zero
addi $t4, $t4, -1 # decrement k
bnez $t4, fork1 # end if zero
endfork:
move $s0, $t5 # restore a[i][k]
mul $t5, $a3, $t2 # i*n
add $t5, $t5, $t3 # i*n + j
sll $t5, $t5, 2 # 4*(i*n + j)
add $t5, $t5, $a2 # &c[i][j]
l.s $f1, 0($t5) # load C[i][j]
add.s $f0, $f0, $f1 # add sum to C[i][j]
s.s $f0, 0($t5) # save in C[i][j]
addi $t6, $t6, -1 # decrement bsize
beqz $t6, endforj # end if zero
addi $t3, $t3, 1 # increment j
blt $t3, $a3, forj1 # repeat while < N
endforj:
addi $t2, $t2, 1 # increment i
blt $t2, $a3, fori1 # repeat while < N
addi $t1, $t1, 4 # increment kk by bsize
blt $t1, $a3, forkk # repeat while < N
addi $t0, $t0, 4 # increment jj by bsize
blt $t0, $a3, forjj # repeat while < N
lw $s0, 0($sp) # restore used registers
lw $s1, 4($sp)
addi $sp, $sp, 8
jr $ra
Contact Details
Related Samples
At ProgrammingHomeworkHelp.com, we provide comprehensive assignment support to students. Our website features a wide array of samples, including Assembly Language assignments. These expertly crafted samples serve as a valuable resource for students, offering clear insights into complex coding challenges and solutions. By exploring our Assembly Language examples, students can better understand low-level programming, improve their coding skills, and achieve academic success. Whether you're tackling an assignment or seeking to enhance your knowledge, our support ensures you have the tools needed to excel.
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language
Assembly Language