Track and investigate performance regression areas for GCC
We would like to track GCC performance regressions on the Cortex-A9 along the parameters listed below.
When run on a Cortex-A9, the following should be true:
* A9 vs A8: code tuned with -mtune=cortex-a9 should run faster than the same code tuned with -mtune=cortex-a8
* ARMv7 vs ARMv5: code built with -march=armv7-a should run faster than the same code built for earlier architectures, especially -march=armv5te
* Thumb-2 vs ARM: ARM code is typically faster than Thumb-2, but Thumb-2 code should run at no less than 90% of the speed of ARM code.
NEON vs non-NEON is covered in a vectoriser blueprint.
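The three criteria above can be expressed as a simple pass/fail check over benchmark timings. The following is a minimal sketch only: the function name, the configuration labels, and the timing data layout are hypothetical and not part of GCC or any existing benchmark harness. It assumes each configuration has one wall-clock time, measured on a Cortex-A9, where lower is faster.

```python
# Hypothetical helper: names and labels are illustrative assumptions.
def check_criteria(times, thumb2_min_ratio=0.90):
    """Check the blueprint's three expectations.

    `times` maps a configuration label to its run time in seconds
    on a Cortex-A9 (lower is faster).
    """
    # Speed is the inverse of run time, so ratios of speed match the
    # "90% of the speed of ARM code" wording directly.
    speed = {name: 1.0 / t for name, t in times.items()}
    return {
        # -mtune=cortex-a9 should beat -mtune=cortex-a8
        "a9_vs_a8": speed["mtune-a9"] > speed["mtune-a8"],
        # -march=armv7-a should beat -march=armv5te
        "v7_vs_v5": speed["armv7-a"] > speed["armv5te"],
        # Thumb-2 should reach at least 90% of ARM speed
        "thumb2_vs_arm": speed["thumb2"] >= thumb2_min_ratio * speed["arm"],
    }
```

Any entry that comes back False marks a comparison worth investigating as a possible regression.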
Blueprint information
- Status: Complete
- Approver: Michael Hope
- Priority: High
- Drafter: Ramana Radhakrishnan
- Direction: Needs approval
- Assignee: None
- Definition: Approved
- Series goal: Accepted for 4.6
- Implementation: Informational
- Milestone target: backlog
- Started by: Matthew Gretton-Dann
- Completed by: Matthew Gretton-Dann
Whiteboard
[2013-05-21 matthew-
Work items (reported):
Investigate and upstream enhancement requests for performance (1w): TODO
Prototype fix 1: TODO
Prototype fix 2: TODO
Implement and upstream if applicable: TODO
Work items (a8-vs-a9):
Compare -mtune=cortex-a8 vs -mtune=cortex-a9 (1w): TODO
Investigate fixes for Cortex-A8 vs Cortex-A9 regressions (1w): TODO
Fix round 1: TODO
Fix round 2: TODO
Fix round 3: TODO
Upstream: TODO
Work items (v5-vs-v7):
Compare performance of -march=armv5te and -march=armv7-a for A9 (1w): TODO
Fix round 1: TODO
Fix round 2: TODO
Fix round 3: TODO
Upstream: TODO
Note: please replace the generic 'Fix round n' entries with actual titles as issues are found. Delete any that aren't needed.
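The comparisons in the work items each amount to building the same source twice with different GCC options. As a rough sketch, the build matrix might be generated as follows. The cross-compiler name, the -O2 level, the output naming, and the helper itself are assumptions for illustration; only the -mtune, -march, -mthumb and -marm options come from GCC.

```python
# Each comparison pairs two sets of GCC options: (baseline, candidate).
COMPARISONS = {
    "a8-vs-a9": (["-mtune=cortex-a8"], ["-mtune=cortex-a9"]),
    "v5-vs-v7": (["-march=armv5te"], ["-march=armv7-a"]),
    # Thumb-2 needs an architecture that supports it, so both variants
    # use armv7-a (an assumption about how the comparison is set up).
    "thumb2-vs-arm": (["-march=armv7-a", "-mthumb"], ["-march=armv7-a", "-marm"]),
}


def build_commands(source, compiler="arm-linux-gnueabi-gcc", opt="-O2"):
    """Return {comparison: [baseline command, candidate command]}.

    The compiler name and optimisation level are assumed defaults.
    """
    commands = {}
    for name, variants in COMPARISONS.items():
        commands[name] = [
            [compiler, opt, *flags, source, "-o", f"{name}-{index}"]
            for index, flags in enumerate(variants)
        ]
    return commands
```

Each command is an argument list suitable for passing to `subprocess.run`; the resulting binaries would then be timed on a Cortex-A9.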