vssq Documentation
John Strawn
23 July 1987

On the development of the inner loops.

Here is the basic algorithm:

        move    x:(r2),a                ; get i
        abs     a                       ; abs(i)
        move    x:(r2)+n2,x0            ; get another copy of i
        move    a,y0                    ; abs(i) to y0
        mpyr    x0,y0,a                 ; i * abs(i)
        move    a,x:(r6)+n6             ; store to output

Note that to get a second copy of i, you can also do

        move    a,x0

before the abs. Both options will be used here.

For sinp==sout==x, here are the operations that must be taken care of
for a "double" inner loop. To make the code easier to read, I associate
x0 and y0 with a, and x1 and y1 with b.

        move    x:(r2)+n2,a
        abs     a       a,x0
        move    a,y0
        mpyr    x0,y0,a
        move    a,x:(r6)+n6

        move    x:(r2)+n2,b
        abs     b       b,x1
        move    b,y1
        mpyr    x1,y1,b
        move    b,x:(r6)+n6

The best that I can do is an inner loop like this:

; pipeline startup:
        move    x:(r2)+n2,a
        abs     a       a,x0
        move    a,y0
        mpyr    x0,y0,a
        move    x:(r2)+n2,b
        move    b,x1
; begin loop
        abs     b       a,x:(r6)+n6
        move    x:(r2)+n2,a     b,y1
        mpyr    x1,y1,b a,x0
        abs     a       b,x:(r6)+n6
        move    x:(r2)+n2,b     a,y0
        mpyr    x0,y0,a b,x1
; end loop
; for odd:
        move    a,x:(r6)+n6

On the surface, it looks as though that could be shrunk. The problem is
the restrictions on the S and D registers allowed in X:R moves. I tried
to create an inner loop like the one below. If, in the instruction
marked as "bug", you could use x0 as a D2 register, then this would
work.

; begin loop
        move    x:(r2)+n2,a                     ;
        abs     a       b,x:(r6)+n6     a,x0    ; bug
        mpyr    x0,y1,b x:(r2),a        a,y0
        abs     a       b,x:(r6+n6)
        mpyr    x0,y0,b x:(r2)+n2,x0    a,y1
; end loop

For sinp!=sout, analysis of the basic algorithm above shows that there
are four moves and two ALU operations. Each ALU operation allows a
parallel move, and a parallel move can carry two transfers at once.
This means that if we're careful, we can pack everything into two
instructions. A double inner loop thus looks like this:

; pipeline startup:
        move    x:(r1)+n1,a
        abs     a       a,x0
        move    a,y0
        mpyr    x0,y0,a
        move    x:(r2),b
; begin loop
        abs     b       x:(r1),a        a,y:(r6)+n6
        abs     a       x:(r2)+n2,x1    b,y1
        mpyr    x1,y1,b x:(r1)+n1,x0    a,y0
        mpyr    x0,y0,a x:(r2),b        b,y:(r6)+n6
; end loop

r2 is initialized to point to the second element.
r1 and r2 then "leapfrog" each other, since n1=n2 has been changed to
twice the original n1.

Here are the symbol locations:

_SYMBOL X
axn_vec         I 0000200C
axp_vec         I 00002000
ixself_vec      I 00002021
out1_vec        I 00002027
out3_vec        I 00002035
out5_vec        I 00002043
out6_vec        I 0000204A
out9_vec        I 0000205F
sum1            I 00002079
sum2            I 0000207A
sum3            I 0000207B
sum4            I 0000207C
sum5            I 0000207D
sum6            I 0000207E
sum7            I 0000207F
sum8            I 00002080
sum9            I 00002081
sum10           I 00002082
sum11           I 00002083
sum12           I 00002084
ans1            I 00002085
ans2            I 00002086
ans3            I 00002087
ans4            I 00002088
ans5            I 00002089
ans6            I 0000208A
ans7            I 0000208B
ans8            I 0000208C
ans9            I 0000208D
ans10           I 0000208E
ans11           I 0000208F
ans12           I 00002090
ans999          I 00002091

_SYMBOL Y
ayn_vec         I 0000201A
ayp_vec         I 00002013
out2_vec        I 0000202E
out4_vec        I 0000203C
out7_vec        I 00002051
out8_vec        I 00002058
out10_vec       I 00002066
out11_vec       I 0000206D
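
As a cross-check on the inner loops discussed above, the following is a
minimal host-side sketch in Python of what they compute: i * abs(i) for
each input element, first one element per pass and then two elements per
pass with two read pointers that leapfrog each other. It is not part of
the original code. The function names (signed_square, vssq_simple,
vssq_leapfrog) are hypothetical, ordinary floats stand in for the DSP's
fractional words, and the rounding done by mpyr, saturation, and the
pipelining of the real loops are deliberately not modelled.

```python
def signed_square(x):
    """Basic algorithm: i * abs(i), i.e. the square with the sign of i kept."""
    return x * abs(x)


def vssq_simple(inp, count, stride=1):
    """One element per pass, mirroring the basic algorithm."""
    return [signed_square(inp[k * stride]) for k in range(count)]


def vssq_leapfrog(inp, count, stride=1):
    """Two elements per pass, using two read pointers (assumed layout)."""
    out = []
    r1 = 0              # first pointer starts at the first element
    r2 = stride         # second pointer starts at the second element
    step = 2 * stride   # both pointers advance by twice the original stride
    for _ in range(count // 2):
        out.append(signed_square(inp[r1]))
        out.append(signed_square(inp[r2]))
        r1 += step
        r2 += step
    if count % 2:       # odd count: one element is left, reached through r1
        out.append(signed_square(inp[r1]))
    return out


if __name__ == "__main__":
    data = [0.5, -0.5, 0.25, -0.75, 0.0]
    print(vssq_simple(data, len(data)))
    print(vssq_leapfrog(data, len(data)))
```

Comparing the output of the two functions for both even and odd counts
is a quick way to confirm that the pointer setup (r2 one element ahead,
both strides doubled) visits every element exactly once and in order.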