VSBSBM documentation
John Strawn 9/18/87

We are required to calculate (v1-v2) * (v3-v4). A straightforward
implementation of this macro requires 4 instructions plus 7 moves:

        move    x:(r1),a        ; v1
        move    y:(r5),x0       ; v2
        sub     x0,a
        move    x:(r0),b        ; v3
        move    y:(r4),y0       ; v4
        sub     y0,b
        move    a,x1
        move    b,y1
        mpy     x1,y1,a
        rnd     a
        move    a,x:(r6)

; The seven moves imply at least four instruction times, even if
; the rnd were omitted.

; But if the calculation is recast as v3*(v1-v2) - v4*(v1-v2), then
; there will be:

        move    x:(r1),a        ; v1
        move    y:(r5),y0       ; v2
        sub     y0,a
        move    a,y1
        move    x:(r0),x0       ; v3
        mpy     y1,x0,b
        move    y:(r2),x1       ; v4
        macr    -y1,x1,b
        move    b,y:(r6)

; The damages here are *at least* three instruction times plus six moves.

; This can be contracted as follows. The pipeline is five layers
; deep (!).

        mpy     y1,x0,b   x:(r2),x1  b,y:(r6)   ; v4[n+1], output[n]
        macr    -y1,x1,b  x:(r1),a   a,y1       ; v1[n+3]
        sub     y0,a      x:(r0),x0  y:(r5),y0  ; v3[n+2], v2[n+4]

; Note that you can't write *into* y1 until it has been used for
; both the mpy and the macr.

; The pipeline initialization will look like this:

; calculate output[0]
        move    x:(r1)+,a       ; v1
        move    y:(r5)+,y0      ; v2
        sub     y0,a
        move    a,y1
        move    x:(r0)+,x0      ; v3
        mpy     y1,x0,b
        move    x:(r2)+,x1      ; v4
        macr    -y1,x1,b

; start to calculate output[1]
        move    x:(r1)+,a       ; v1
        move    y:(r5)+,y0      ; v2
        sub     y0,a
        move    a,y1
        move    x:(r0)+,x0      ; v3

; start to calculate output[2]
        move    x:(r1)+,a       ; v1
        move    y:(r5)+,y0      ; v2
        sub     y0,a

; start to calculate output[3]
        move    y:(r5)+,y0      ; v2

; inner loop
        mpy     y1,x0,b   x:(r2)+,x1  b,y:(r6)+   ; v4, output
        macr    -y1,x1,b  x:(r1)+,a   a,y1        ; v1
        sub     y0,a      x:(r0)+,x0  y:(r5)+,y0  ; v3, v2

Here is a simple command file that I used to test this. The goal is to
have input vectors A, B, C, D identified clearly. The x and y memories
are loaded with different numbers for the first six elements of each
vector. By stepping through the elements and watching them very
carefully, you can ensure that the pipeline is set up and followed
properly.

load test
log s test
radix h
display off all         ; turn off all standard registers
change bcr 0            ; see manual pp. 7-2 and 7-3
display on dsp cyc ictr
break pc>=$4000

display on x:$10..15
display on y:$20..25
display on x:$30..35
display on x:$40..45
display on y:$50..55

change pc 0

change x:$10 $a0
change x:$11 $a1
change x:$12 $a2
change x:$13 $a3
change x:$14 $a4
change x:$15 $a5

change y:$20 $B0
change y:$21 $B1
change y:$22 $B2
change y:$23 $B3
change y:$24 $B4
change y:$25 $B5

change x:$30 $C0
change x:$31 $C1
change x:$32 $C2
change x:$33 $C3
change x:$34 $C4
change x:$35 $C5

change x:$40 $D0
change x:$41 $D1
change x:$42 $D2
change x:$43 $D3
change x:$44 $D4
change x:$45 $D5

change y:$50..$60 $777

change r1 10
change r5 20
change r0 30
change r2 40
change r6 50

display
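
As a cross-check on the register schedule, the pipeline initialization and inner loop can be mimicked in a few lines of Python and compared against the direct formula. This is only a rough sketch under several assumptions: it uses exact integer arithmetic, so the DSP's fractional format and the rounding done by macr are ignored; the function names direct and pipelined are hypothetical; and the inputs are padded with four zeros because the loop, as modelled here, prefetches operands up to four elements ahead of the output it is storing. Within each modelled instruction, the parallel moves are given the register values from before the ALU result is written.

```python
def direct(v1, v2, v3, v4):
    """Reference result: (v1 - v2) * (v3 - v4), element by element."""
    return [(a - b) * (c - d) for a, b, c, d in zip(v1, v2, v3, v4)]


def pipelined(v1, v2, v3, v4):
    """Integer model of the pipeline initialization and inner loop."""
    n = len(v1)
    pad = [0] * 4                      # assumption: room for the prefetches
    m1, m5, m0, m2 = v1 + pad, v2 + pad, v3 + pad, v4 + pad
    r1 = r5 = r0 = r2 = 0              # read pointers, post-incremented
    out = []                           # stands in for y:(r6)+

    # calculate output[0]
    a = m1[r1]; r1 += 1                # v1
    y0 = m5[r5]; r5 += 1               # v2
    a -= y0
    y1 = a
    x0 = m0[r0]; r0 += 1               # v3
    b = y1 * x0
    x1 = m2[r2]; r2 += 1               # v4
    b -= y1 * x1

    # start to calculate output[1]
    a = m1[r1]; r1 += 1
    y0 = m5[r5]; r5 += 1
    a -= y0
    y1 = a
    x0 = m0[r0]; r0 += 1

    # start to calculate output[2]
    a = m1[r1]; r1 += 1
    y0 = m5[r5]; r5 += 1
    a -= y0

    # start to calculate output[3]
    y0 = m5[r5]; r5 += 1

    for _ in range(n):
        # mpy y1,x0,b   x:(r2)+,x1  b,y:(r6)+
        out.append(b)                  # the old b is stored
        x1 = m2[r2]; r2 += 1
        b = y1 * x0
        # macr -y1,x1,b  x:(r1)+,a   a,y1
        b -= y1 * x1                   # uses y1 before it is overwritten
        y1 = a                         # the old a moves to y1
        a = m1[r1]; r1 += 1
        # sub y0,a      x:(r0)+,x0  y:(r5)+,y0
        a -= y0                        # uses y0 before it is reloaded
        x0 = m0[r0]; r0 += 1
        y0 = m5[r5]; r5 += 1
    return out


# Test vectors A, B, C, D as loaded by the command file (hex values).
A = [0xA0 + i for i in range(6)]
B = [0xB0 + i for i in range(6)]
C = [0xC0 + i for i in range(6)]
D = [0xD0 + i for i in range(6)]
assert pipelined(A, B, C, D) == direct(A, B, C, D)
```

With the command file's values every element of the integer result is the same ($100), because each pair of vectors differs by a constant $10. To exercise the schedule more thoroughly, it may be worth feeding the model less regular inputs as well.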