| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
Change-Id: Idffbdb02c29e2be03a75f5a0a664603f2299504a
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The polling is expensive for now as it is done through three
instructions: ld/ld/branch. As a result, a bunch of bonus stuff has
been worked on to mitigate the extra overhead:
- Cleaned up resource flags for memory disambiguation.
- Rewrote load/store elimination and scheduler routines to hide
the ld/ld latency for GC flag. Seperate the dependency checking into
memory disambiguation part and resource conflict part.
- Allowed code motion for Dalvik/constant/non-aliasing loads to be
hoisted above branches for null/range checks.
- Created extended basic blocks following goto instructions so that
longer instruction streams can be optimized as a whole.
Without the bonus stuff, the performance dropped about ~5-10% on some
benchmarks because of the lack of headroom to hide the polling latency
in tight loops. With the bonus stuff, the performance delta is between
+/-5% with polling code generated. With the bonus stuff but disabling
polling, the new bonus stuff provides consistent performance
improvements:
CaffeineMark 3.6%
Linpack 11.1%
Scimark 9.7%
Sieve 33.0%
Checkers 6.0%
As a result, GC polling is disabled by default but can be turned on
through the -Xjitsuspendpoll flag for experimental purposes.
Change-Id: Ia81fc85de3e2b70e6cc93bc37c2b845892003cdb
|
| |
|
|
|
|
|
|
|
| |
- Set up resource masks correctly for Thumb push/pop when LR/PC are involved.
- Preserve LR around simulated heap references under self-verification mode.
- Compact a few simple flags in ArmLIR into bit fields.
- Minor performance tuning in TEMPLATE_MEM_OP_DECODE
Change-Id: Id73edac837c5bb37dfd21f372d6fa21c238cf42a
|
| |
|
|
| |
Change-Id: I06292964a6882ea2d0c17c5c962db95e46b01543
|
| |
|
|
|
|
|
|
|
|
| |
Similarly "Opcode" not "OpCode".
This appears to be the general worldwide consensus on the matter. Other
residents of my office didn't seem to mind one way or the other how it's
spelled in our code, but for whatever reason, it really bugged me.
Change-Id: Ia0b73d19c54aefc0f543a9c9451dda22ee876a59
|
| |
|
|
|
|
| |
Also re-enabled the JIT for the ARMv5te target.
Change-Id: I89fd229205e30e6ee92a4933290a7d8dca001232
|
| |
|
|
| |
Change-Id: Icbe24eaf1ad499f28b68b6a5f05368271a0a7e86
|
| |
|
|
| |
Change-Id: I9b371af4150b6245c0ff59eea63d83406edbbcee
|
| |
|
|
|
| |
Bug: 2520500
Change-Id: I36dbd8b3a6d13c40f9735df4918ab02b5f053b07
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Re-enabled load/store motion that had inadvertently been turned off for
non-armv7 targets. Tagged memory references with the kind of memory
they touch (Dalvik frame, literal pool, heap) to enable more aggressive
load hoisting. Eliminated some largely duplicate code in the target
specific files. Reworked temp register allocation code to allocate next
temp round-robin (to improve scheduling opportunities).
Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and
measurable improvements for Passion.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original Codegen.c is broken into three components:
- CodegenCommon.c (arch-independend)
- CodegenFactory.c (Thumb1/2 dependent)
- CodegenDriver.c (Dalvik dependent)
For the Thumb/Thumb2 directories, each contain the followin three files:
- Factory.c (low-level routines for instruction selections)
- Gen.c (invoke the ISA-specific instruction selection routines)
- Ralloc.c (arch-dependent register pools)
The FP directory contains FP-specific codegen routines depending on
Thumb/Thumb2/VFP/PortableFP:
- Thumb2VFP.c
- ThumbVFP.c
- ThumbPortableFP.c
Then the hierarchy is formed by stacking these files in the following top-down
order:
1 CodegenCommon.c
2 Thumb[2]/Factory.c
3 CodegenFactory.c
4 Thumb[2]/Gen.c
5 FP stuff
6 Thumb[2]/Ralloc.c
7 CodegenDriver.c
|
| | |
|
| |
|
|
| |
Improved performance by 50% over existing JIT for some FP benchmarks.
|
| |
|
|
|
| |
This is an mid-point checkin to avoid future merge nightmare for the register
allocator work.
|
| |
|
|
|
|
|
| |
Added support for Thumb2 IT. Moved compare-long and floating point
comparisons inline. Temporarily disabled use of Thumb2 CBZ & CBNZ
because they were causing too many out-of-range assembly restarts.
Bug fix for LIR3 assert.
|
| |
|
|
| |
Change-id: I7428278f07f49e675d0271c58b3cbf1f6a4e9da1
|
| |
|
|
|
| |
Bug fix for local optimization
Enable partial floating point store sinking (with significant perf gain!)
|
| | |
|
| |
|