VMAX documentation
John Strawn
9/20/87
We are required to calculate C[n]=A[n] MAX B[n]. Here are the
necessary steps:
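move x:(r1),x0 ; x0 = input element from x memory
move y:(r5),a ; a = input element from y memory
cmp x0,a ; compare a against x0
tlt x0,a ; if a < x0, copy x0 into a
move a,x:(r6) ; store the larger value as the output element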
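As a cross-check on the semantics, here is a minimal reference
model in Python (not DSP56000 code). It assumes that cmp x0,a
followed by tlt x0,a leaves the larger of the two values in a,
which is what the steps above rely on. The name vmax_reference is
made up for this sketch.

```python
def vmax_reference(a_vec, b_vec):
    """Element-wise maximum, mirroring the cmp/tlt pair one element at a time."""
    if len(a_vec) != len(b_vec):
        raise ValueError("input vectors must have the same length")
    out = []
    for x0, a in zip(a_vec, b_vec):
        # cmp x0,a then tlt x0,a: replace a with x0 only when a < x0
        if a < x0:
            a = x0
        out.append(a)  # move a,x:(r6)
    return out


print(vmax_reference([1, -4, 7], [3, -9, 7]))  # prints [3, -4, 7]
```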
This sequence cannot be meaningfully shortened. At most, the first
move (into x0) might be folded into the tlt instruction as a
parallel move, like this:
move y:(r5),a
cmp x0,a
tlt x0,a x:(r1),x0 ; WRONG
move a,x:(r6)
Unfortunately, the tlt instruction does not allow parallel moves.
Even if it did, it would still take four instructions to calculate
an output element.
The answer is to double up on the accumulators, computing one
output element in a and the next in b. Here are the necessary
steps for two output elements:
move x:(r1),x0
move y:(r5),a
cmp x0,a
tlt x0,a
move a,x:(r6)
move x:(r1),x1
move y:(r5),b
cmp x1,b
tlt x1,b
move b,x:(r6)
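These steps consist of four arithmetic instructions (cmp and tlt)
plus six moves. Four of the moves can ride along with the cmp
instructions as parallel moves, so the steps can be collapsed as
follows, resulting in 3 instructions per output element: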
cmp x0,a x:(r1),x1 y:(r5),b
tlt x0,a
move a,x:(r6)
cmp x1,b x:(r1),x0 y:(r5),a
tlt x1,b
move b,x:(r6)
The collapsed sequence assumes that x0 and a are already loaded
when it starts. For pipelining reasons, it makes sense to regroup
these as follows:
; pipeline initialization
move x:(r1),x0 y:(r5),a
cmp x0,a x:(r1),x1 y:(r5),b
tlt x0,a
; check for cnt=1 here
; inner loop
move a,x:(r6)
cmp x1,b x:(r1),x0 y:(r5),a
tlt x1,b
move b,x:(r6)
cmp x0,a x:(r1),x1 y:(r5),b
tlt x0,a
; end of inner loop
; output odd element from a if needed
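To check the regrouping for both even and odd counts, the
schedule above can be modeled in Python. This is a sketch under
stated assumptions, not the macro itself: it assumes the input
pointers advance after every access (the listing does not show
the address updates), that the inner loop runs cnt/2 times
(rounded down), and that the prefetches past the end of the
inputs are harmless, which the model represents with zeros. The
names vmax_pipelined and fetch are made up for this sketch.

```python
def vmax_pipelined(x_mem, y_mem):
    """Python model of the regrouped two-accumulator schedule."""
    cnt = len(x_mem)
    if len(y_mem) != cnt:
        raise ValueError("input vectors must have the same length")
    if cnt == 0:
        return []

    def fetch(i):
        # Assumption: the schedule prefetches up to two elements past the
        # end of the inputs. The model returns zeros there; those values
        # are never written to the output.
        return (x_mem[i], y_mem[i]) if i < cnt else (0, 0)

    out = []
    # pipeline initialization
    x0, a = fetch(0)
    x1, b = fetch(1)            # parallel move on the first cmp
    if a < x0:                  # cmp x0,a / tlt x0,a
        a = x0
    nxt = 2
    # inner loop: two output elements per pass
    for _ in range(cnt // 2):
        out.append(a)           # move a,x:(r6)
        if b < x1:              # cmp x1,b / tlt x1,b
            b = x1
        x0, a = fetch(nxt)      # parallel move on cmp x1,b
        out.append(b)           # move b,x:(r6)
        if a < x0:              # cmp x0,a / tlt x0,a
            a = x0
        x1, b = fetch(nxt + 1)  # parallel move on cmp x0,a
        nxt += 2
    if cnt % 2:
        out.append(a)           # output odd element from a
    return out
```

With cnt=1 the loop body never runs and the single result comes
out of a after the loop, which matches the "check for cnt=1"
comment in the listing.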
For testing whether sinp_a==sinp_b, it will make sense to regroup
further as follows. The point is that when you do an XY move,
you have register a (or b) as the sole source and destination.
Therefore, you don't have to worry about whether x1 (or y1) is
allowed as the second destination. Note that r1 and r5 have been
switched relative to the previous listing.
; pipeline initialization
move x:(r5),x0 y:(r1),a
cmp x0,a x:(r5),x1 y:(r1),b
tlt x0,a
; check for cnt=1 here
; inner loop
move x:(r5),x0
cmp x1,b a,x:(r6) y:(r1),a
tlt x1,b
move x:(r5),x1
cmp x0,a b,x:(r6) y:(r1),b
tlt x0,a
; end of inner loop
; output odd element from a if needed
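The final listing depends on one XY move both storing the old
value of a (or b) and loading its next value. A minimal Python
model of that schedule follows. It assumes, as the listing does,
that the store sees the value the accumulator held before the
parallel load overwrites it. It also assumes the pointers advance
after each access, and it pads the inputs with two zeros to stand
in for the prefetch past the end. The name vmax_final is made up
for this sketch.

```python
def vmax_final(x_mem, y_mem):
    """Python model of the schedule with the store folded into the XY move."""
    cnt = len(x_mem)
    if len(y_mem) != cnt:
        raise ValueError("input vectors must have the same length")
    if cnt == 0:
        return []
    # Model-only padding: stands in for reads past the end of the inputs.
    xs = list(x_mem) + [0, 0]
    ys = list(y_mem) + [0, 0]

    out = []
    # pipeline initialization
    x0, a = xs[0], ys[0]
    x1, b = xs[1], ys[1]
    a = max(a, x0)                # cmp x0,a / tlt x0,a
    i = 2
    for _ in range(cnt // 2):
        x0 = xs[i]                # move x:(r5),x0
        stored, a = a, ys[i]      # a,x:(r6) y:(r1),a: old a out, new a in
        out.append(stored)
        b = max(b, x1)            # cmp x1,b / tlt x1,b
        x1 = xs[i + 1]            # move x:(r5),x1
        stored, b = b, ys[i + 1]  # b,x:(r6) y:(r1),b: old b out, new b in
        out.append(stored)
        a = max(a, x0)            # cmp x0,a / tlt x0,a
        i += 2
    if cnt % 2:
        out.append(a)             # output odd element from a
    return out
```

The tuple assignment models the simultaneous store and load: the
right-hand side is evaluated first, so the stored value is always
the old accumulator contents.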