VMAX documentation
John Strawn
9/20/87

We are required to calculate C[n] = A[n] MAX B[n]. Here are the necessary
steps:

        move    x:(r1),x0
        move    y:(r5),a
        cmp     x0,a
        tlt     x0,a
        move    a,x:(r6)

This cannot be meaningfully contracted. At most, the first move (into x0)
might be folded into the tlt instruction as follows:

        move    y:(r5),a
        cmp     x0,a
        tlt     x0,a      x:(r1),x0             ; WRONG
        move    a,x:(r6)

But unfortunately the tlt instruction does not allow parallel moves, and it
would still take four instructions to calculate an output element.

The answer is to double up on the accumulators. Here are the necessary
steps:

        move    x:(r1),x0
        move    y:(r5),a
        cmp     x0,a
        tlt     x0,a
        move    a,x:(r6)

        move    x:(r1),x1
        move    y:(r5),b
        cmp     x1,b
        tlt     x1,b
        move    b,x:(r6)

That is four arithmetic instructions (two cmp, two tlt) plus six moves.
These can be collapsed as follows, resulting in 3 instructions per output
element:

        cmp     x0,a      x:(r1),x1   y:(r5),b
        tlt     x0,a
        move    a,x:(r6)
        cmp     x1,b      x:(r1),x0   y:(r5),a
        tlt     x1,b
        move    b,x:(r6)

For pipelining reasons, it makes sense to regroup these as follows:

        ; pipeline initialization
        move              x:(r1),x0   y:(r5),a
        cmp     x0,a      x:(r1),x1   y:(r5),b
        tlt     x0,a
        ; check for cnt=1 here

        ; inner loop
        move    a,x:(r6)
        cmp     x1,b      x:(r1),x0   y:(r5),a
        tlt     x1,b
        move    b,x:(r6)
        cmp     x0,a      x:(r1),x1   y:(r5),b
        tlt     x0,a
        ; end of inner loop

        ; output odd element from a if needed

For testing whether sinp_a==sinp_b, it will make sense to regroup further
as follows. The point is that when you do an XY move, you have register a
(or b) as the sole source and destination. Therefore, you don't have to
worry about whether x1 (or y1) is allowed as the second destination. Note
that r1 and r5 have been switched.

        ; pipeline initialization
        move              x:(r5),x0   y:(r1),a
        cmp     x0,a      x:(r5),x1   y:(r1),b
        tlt     x0,a
        ; check for cnt=1 here

        ; inner loop
        move    x:(r5),x0
        cmp     x1,b      a,x:(r6)    y:(r1),a
        tlt     x1,b
        move    x:(r5),x1
        cmp     x0,a      b,x:(r6)    y:(r1),b
        tlt     x0,a
        ; end of inner loop

        ; output odd element from a if needed
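
As a cross-check on the first pipelined regrouping above (the one that reads x:(r1) and y:(r5)), here is a minimal sketch of a C model of that loop. It is not 56000 code and not part of the original routine. The names vmax_ref, vmax_pipelined and fetch are hypothetical, a plain int stands in for the 56-bit accumulators, and the address registers are modelled as array indices that advance after each access (the listings above omit the address updates). The model also assumes that the prefetches past the end of the input may be replaced by zeros; the note does not say how those reads are handled on the real machine.

```c
#include <assert.h>
#include <stddef.h>

/* Reference: c[n] = a[n] MAX b[n], one element at a time. */
void vmax_ref(const int *a, const int *b, int *c, size_t cnt)
{
    for (size_t n = 0; n < cnt; n++)
        c[n] = (a[n] < b[n]) ? b[n] : a[n];
}

/* Assumption: a prefetch past the end of the input returns 0. */
static int fetch(const int *p, size_t i, size_t cnt)
{
    return (i < cnt) ? p[i] : 0;
}

/* Model of the two-accumulator pipelined loop.
 * xin plays the role of x:(r1), yin of y:(r5), out of x:(r6). */
void vmax_pipelined(const int *xin, const int *yin, int *out, size_t cnt)
{
    int x0, x1, a, b, lt;
    size_t rd = 0;              /* shared read index for both inputs */
    size_t wr = 0;              /* write index for the output */

    if (cnt == 0)
        return;
    assert(xin != NULL && yin != NULL && out != NULL);

    /* pipeline initialization */
    x0 = xin[rd]; a = yin[rd]; rd++;        /* move x:(r1),x0 y:(r5),a  */
    lt = a < x0;                            /* cmp  x0,a ...            */
    x1 = fetch(xin, rd, cnt);               /* ... x:(r1),x1            */
    b  = fetch(yin, rd, cnt); rd++;         /* ... y:(r5),b             */
    if (lt) a = x0;                         /* tlt  x0,a                */

    /* inner loop: two output elements per pass */
    for (size_t pairs = cnt / 2; pairs > 0; pairs--) {
        out[wr++] = a;                      /* move a,x:(r6)            */
        lt = b < x1;                        /* cmp  x1,b ...            */
        x0 = fetch(xin, rd, cnt);           /* ... x:(r1),x0            */
        a  = fetch(yin, rd, cnt); rd++;     /* ... y:(r5),a             */
        if (lt) b = x1;                     /* tlt  x1,b                */
        out[wr++] = b;                      /* move b,x:(r6)            */
        lt = a < x0;                        /* cmp  x0,a ...            */
        x1 = fetch(xin, rd, cnt);           /* ... x:(r1),x1            */
        b  = fetch(yin, rd, cnt); rd++;     /* ... y:(r5),b             */
        if (lt) a = x0;                     /* tlt  x0,a                */
    }

    /* output odd element from a if needed */
    if (cnt & 1)
        out[wr] = a;
}
```

Running both functions over the same inputs for several even and odd counts, including cnt=1, is a cheap way to confirm that the regrouped loop still produces A[n] MAX B[n] for every element and that the odd trailing element comes out of a.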