The XFOR Programming Environment


Here is a sequence of screenshots showing an optimization scenario using XFOR-Wizard, for the 2mm kernel of the PolyBench benchmark suite:

1. Loading the 2mm.c source file in the wizard's interface

2. Generating an equivalent XFOR code automatically

3. Entering the dual interface with the reference code on the left (2mm), and the XFOR code on the right (equivalent to the reference code at this step).
Data dependences are printed at the bottom.

4. Applying a loop interchange to the XFOR code (button "Interchange"). Note that the 3rd offset of the outermost loop has been set to 0, so that array D is initialized at start-up. The transformation has been automatically analyzed as legal (printed at the bottom).

5. Selecting iterators j1 and k1 to be interchanged, to improve spatial and temporal data locality of the i1-j1-k1 loop nest

6. The resulting code shows a 2nd offset of k1-j1 in the 2nd loop, which means that the iterator is equal to j1+(k1-j1), i.e. k1. In the same way, the iterator of the 3rd loop is equal to k1+(j1-k1), i.e. j1. Note that some dependences are not respected at this point: the second loop nest requires an offset of _PB_NI on its outermost loop.

7. A similar loop interchange is applied on the 4th loop nest.

8. The 4th offset of the outermost loop is set to _PB_NI+1 in order to respect the dependence from S1 to S3. All the transformations have now been performed, and the resulting XFOR program respects all the dependences.

9. All the changes are applied and the resulting XFOR program is created.

10. The XFOR program is compiled into an equivalent program with FOR loops, using IBB.

11. Usually, answer "NO" to this question.

12. The IBB-generated code must now be compiled.

13. And run...

14. XFOR execution time: 0.93 seconds

15. Original code execution time: 4.08 seconds
(XFOR speed-up=4.35)

16. Compiling the original code with Pluto

17. Selecting the Pluto compiler options

18. The generated Pluto code

19. Pluto execution time: 2.09 seconds (XFOR speed-up=2.24)
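
The interchange of j1 and k1 applied in steps 5 and 6 can be illustrated in plain C. This is only a minimal sketch, not the code produced by XFOR-Wizard or IBB. It assumes that the first product of 2mm has the usual form tmp[i][j] += alpha * A[i][k] * B[k][j], it uses small hypothetical sizes NI, NJ and NK in place of the _PB_* bounds, and all function names are invented for the illustration. The inputs are small integers so that every sum is exact in double precision and the two loop orders can be compared with ==.

```c
/* Hypothetical problem sizes, standing in for _PB_NI, _PB_NJ, _PB_NK. */
#define NI 8
#define NJ 9
#define NK 10

/* Fill A and B with small integers so all products and sums are exact. */
static void init_inputs(double A[NI][NK], double B[NK][NJ]) {
    for (int i = 0; i < NI; i++)
        for (int k = 0; k < NK; k++)
            A[i][k] = (double)((i + 2 * k) % 5);
    for (int k = 0; k < NK; k++)
        for (int j = 0; j < NJ; j++)
            B[k][j] = (double)((3 * k + j) % 7);
}

/* Reference order i-j-k: the innermost loop walks B[k][j] down a column,
 * so consecutive iterations touch addresses NJ elements apart. */
static void tmp_ijk(double alpha, double A[NI][NK], double B[NK][NJ],
                    double tmp[NI][NJ]) {
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++) {
            tmp[i][j] = 0.0;
            for (int k = 0; k < NK; k++)
                tmp[i][j] += alpha * A[i][k] * B[k][j];
        }
}

/* Interchanged order i-k-j: the initialization is split into its own nest,
 * and the innermost loop now walks tmp[i][j] and B[k][j] along rows,
 * while A[i][k] stays fixed for the whole inner loop. */
static void tmp_ikj(double alpha, double A[NI][NK], double B[NK][NJ],
                    double tmp[NI][NJ]) {
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            tmp[i][j] = 0.0;
    for (int i = 0; i < NI; i++)
        for (int k = 0; k < NK; k++)
            for (int j = 0; j < NJ; j++)
                tmp[i][j] += alpha * A[i][k] * B[k][j];
}

/* Returns 1 if both loop orders produce identical results, 0 otherwise. */
int interchange_preserves_result(void) {
    static double A[NI][NK], B[NK][NJ], ref[NI][NJ], opt[NI][NJ];
    init_inputs(A, B);
    tmp_ijk(2.0, A, B, ref);
    tmp_ikj(2.0, A, B, opt);
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            if (ref[i][j] != opt[i][j])
                return 0;
    return 1;
}
```

In both versions each tmp[i][j] receives its updates in increasing k order, so the interchange changes only the memory access pattern and not the computed values. The sketch does not model the XFOR offsets of steps 6 and 8, which the wizard uses to keep the dependences between the loop nests satisfied.
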
The XFOR codes: