Improve peeling
Improve the peeling heuristic in the vectorizer, and check whether peeling is effective at all.
One of the techniques the vectorizer uses to align misaligned memory accesses is loop peeling. The decision whether to peel or not, and which data reference(s) to align (in case not all of the accesses can be aligned simultaneously), is made using a heuristic based on the target's features. The heuristic needs to be tuned for NEON both with and without the vectorizer's cost model.
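As an illustration of what peeling for alignment means, here is a minimal hand-written sketch in C. It is not the vectorizer's implementation: the names `peel_count` and `add_one`, the 16-byte vector width, and the assumption that the array is at least 4-byte aligned are all chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed vector width: 16 bytes (one 128-bit register, e.g. NEON quad). */
#define VEC_BYTES 16u

/* Hypothetical helper: number of scalar iterations to peel so that
   &a[peel] is 16-byte aligned. Assumes a is a multiple of sizeof(float)
   away from an aligned address, and caps the result at n. */
size_t peel_count(const float *a, size_t n)
{
    size_t misalign = (size_t)((uintptr_t)a % VEC_BYTES);
    size_t peel = misalign ? (VEC_BYTES - misalign) / sizeof(float) : 0;
    return peel < n ? peel : n;
}

/* Adds 1.0f to every element, structured as prologue / main / epilogue. */
void add_one(float *a, size_t n)
{
    size_t i = 0;
    size_t peel = peel_count(a, n);

    /* Prologue: the peeled scalar iterations before the aligned address. */
    for (; i < peel; i++)
        a[i] += 1.0f;

    /* Main loop: each group of 4 floats now starts at an aligned address,
       so a compiler could use aligned vector loads and stores here. */
    for (; i + 4 <= n; i += 4)
        for (size_t j = 0; j < 4; j++)
            a[i + j] += 1.0f;

    /* Epilogue: leftover iterations that do not fill a whole vector. */
    for (; i < n; i++)
        a[i] += 1.0f;
}
```

When several arrays with different misalignments are accessed in the same loop, a single peel count can align only some of them, which is why the heuristic has to choose which data reference to align.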
Blueprint information
- Status: Complete
- Approver: Michael Hope
- Priority: Medium
- Drafter: Ira Rosen
- Direction: Approved
- Assignee: Ira Rosen
- Definition: Approved
- Series goal: Accepted for 4.6
- Implementation: Implemented
- Milestone target: None
- Started by: Ira Rosen
- Completed by: Mounir Bsaibes
Whiteboard
Meta:
Roadmap id: TCWG2011-GCC-O3
The following work item was moved to: https:/
Investigate if peeling is effective for NEON both with and without cost model: TODO
Disable peeling for low (known) loop bounds - lp#831094: if all the unaligned accesses are supported and the number of vector iterations is less than 3, peeling is not expected to be beneficial. The patch in https:/
This could be extended to unknown loop bounds by adding a run-time check.
It would also be nicer to expose the threshold as a param instead of hard-coding it to 3.
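The rule described above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the code from the patch: the function name `peeling_worth_considering` and its parameters are hypothetical, and the threshold is passed in as an argument to mirror the suggestion of using a param rather than a fixed 3.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch of the low-loop-bound rule.
   niters:                  known scalar iteration count of the loop
   vf:                      vectorization factor (elements per vector)
   all_unaligned_supported: target can perform every unaligned access
   min_vect_iters:          threshold (3 in the note above)
   Returns false when peeling should be skipped outright; true means the
   rest of the heuristic still has to decide. */
bool peeling_worth_considering(size_t niters, size_t vf,
                               bool all_unaligned_supported,
                               size_t min_vect_iters)
{
    if (vf == 0)
        return false; /* not a vectorized loop, nothing to peel for */

    size_t vect_iters = niters / vf;

    /* Too few vector iterations to amortize the peeled prologue. */
    if (all_unaligned_supported && vect_iters < min_vect_iters)
        return false;

    return true;
}
```

For unknown loop bounds, the same comparison would have to be evaluated at run time to select between a peeled and an unpeeled version of the loop.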
Work Items
Improve peeling heuristic in the vectorizer - without cost model: DONE
Implement, upstream, and backport: DONE