Investigate enhancing and tuning GCC auto-vectorization capabilities for NEON
Enhance and tune GCC auto-vectorization capabilities for NEON to gain performance improvements on typical benchmarks.
Blueprint information
- Status: Complete
- Approver: Michael Hope
- Priority: High
- Drafter: Ira Rosen
- Direction: Needs approval
- Assignee: Ira Rosen
- Definition: Approved
- Series goal: Accepted for 11.05
- Implementation: Implemented
- Milestone target: 11.05-final
- Started by: Michael Hope
- Completed by: Michael Hope
Whiteboard
EEMBC Telecom Viterbi gives 4.75x performance improvement with vectorization.
Re: Investigate support of mixed vector sizes: there appears to be no immediate need for this feature, so it is postponed.
Work Items
Investigate auto-detection of vector size: DONE
Implement auto-detection: DONE
Test auto-detection: DONE
Investigate support of mixed vector sizes: POSTPONED
Identify other areas and add to this blueprint: DONE
Identify benchmarks with vectorization potential: DONE
Analyze EEMBC: DONE
Vectorize Viterbi - strided memory accesses support: DONE
Vectorize Viterbi - conditional store sink: DONE
Vectorize Viterbi - test conditional store sink: DONE
Vectorize Viterbi - if-conversion with loads: DONE
Change default vector size: DONE
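The Viterbi work items above name three loop shapes: strided memory accesses, conditional stores, and loads under a condition. The sketch below shows one plain C loop of each kind, as an illustration only. The function names and loop bodies are hypothetical; they are not taken from the EEMBC Viterbi source or from the GCC testsuite, and whether a given GCC version vectorizes them for NEON is not claimed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Strided access (hypothetical example): each iteration reads two
   interleaved elements, so the loads from `in` have stride 2. */
void strided_pair_sum(const int16_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[2 * i] + in[2 * i + 1]);
}

/* Conditional store (hypothetical example): both arms of the branch
   store to the same location. Such a pair of stores can in principle
   be replaced by one unconditional store of a selected value placed
   after the branch. */
void store_both_arms(const int16_t *x, const int16_t *y,
                     int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (x[i] < y[i])
            out[i] = x[i];
        else
            out[i] = y[i];
    }
}

/* Load under a condition (hypothetical example): bonus[i] is read only
   when flag[i] is nonzero, so turning the branch into straight-line
   code requires the load to be safe on every iteration. */
void conditional_load_add(const uint8_t *flag, const int16_t *bonus,
                          int16_t *acc, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (flag[i])
            acc[i] = (int16_t)(acc[i] + bonus[i]);
    }
}
```

On an ARM target, loops like these would typically be built at `-O3` with `-mfpu=neon` and the vectorizer's report inspected to see which ones were transformed. The exact flags depend on the GCC version and are not specified in this blueprint.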