path: root/vm/compiler/codegen/arm/armv5te
Commit message [Author, Age, Files, Lines]
* JIT tuning; set cache size on command line [buzbee, 2013-05-23, 1 file, -1/+7]
| The tuning knobs for triggering trace compilation for the JIT had not been revisited for several years. In that time, the working set of some applications has significantly increased, leading to frequent cache overflows and flushes. This CL adds the ability to set the maximum size of the JIT's cache on the command line, and we expect to use different settings depending on device configuration (rule of thumb: 1K for each 1M of system RAM, with a 2M limit). Additionally, the trace compilation trigger has been tightened to limit the compilation of cold traces.
|
| Change-Id: Ice22c5d9d46a93e465c57dd83f50ca3912f1672e
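The rule of thumb quoted in this message (1K of cache per 1M of system RAM, capped at 2M) can be sketched as follows. This is only an illustration under assumptions: the function name `jitCacheSizeForRam` is hypothetical and does not come from the Dalvik source, and the commit does not say how the value is derived or passed on the command line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Dalvik: applies the rule of thumb
 * "1K of JIT cache for each 1M of system RAM, with a 2M limit". */
size_t jitCacheSizeForRam(size_t ramMiB)
{
    const size_t kMaxCacheBytes = 2u * 1024u * 1024u;  /* 2M cap */

    assert(ramMiB > 0);            /* caller must supply a real RAM size */

    /* 2048 MiB of RAM already reaches the cap; checking first also
     * avoids overflow in the multiplication below. */
    if (ramMiB >= 2048u) {
        return kMaxCacheBytes;
    }
    return ramMiB * 1024u;         /* 1K of cache per 1M of RAM */
}
```

Under this sketch a device with 512M of RAM would get a 512K cache, and anything with 2G or more would get the full 2M.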
* Fix -Xjitthreshold (for real this time). [Elliott Hughes, 2013-03-01, 1 file, -1/+1]
| My previous "fix" (c89d83e1c05979b68037ad15413fa4460a88e36f) had the conditions reversed, so you _had_ to use -Xjitthreshold to get a non-zero threshold, but when you did, you'd get the default instead of what you asked for! This was spotted by the jank tests.
|
| Bug: 8285558
| Bug: https://code.google.com/p/android/issues/detail?id=52017
| Change-Id: I28270f2573d46929eb10d30789fecf7d5a8cea75
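The intended behaviour described here (a user-supplied threshold wins, otherwise the architecture default applies) might look like the sketch below. The function and parameter names are hypothetical; the actual Dalvik code is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical sketch, not the Dalvik implementation.
 * setByUser    - true if -Xjitthreshold was given on the command line
 * userValue    - the value parsed from that option
 * archDefault  - the architecture-specific default threshold */
unsigned effectiveJitThreshold(bool setByUser, unsigned userValue,
                               unsigned archDefault)
{
    /* The reversed condition described in the message would amount to
     *   setByUser ? archDefault : userValue
     * which hands back the default exactly when the user asked for
     * something else. */
    return setByUser ? userValue : archDefault;
}
```

With an architecture default of 200, for instance, a user value of 10 would be honoured, and omitting the option would leave 200 in effect.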
* Fix -Xjitthreshold. [Elliott Hughes, 2013-02-25, 1 file, -1/+3]
| Previously, we'd always overwrite the user-supplied value because the architecture-specific default gets set so late.
|
| Bug: https://code.google.com/p/android/issues/detail?id=52017
| Change-Id: I469bf9ce599820f5ce3dea346aa8f680deffb0c5
* Rename (IF_)LOGE(_IF) to (IF_)ALOGE(_IF) DO NOT MERGE [Steve Block, 2012-01-08, 2 files, -3/+3]
| See https://android-git.corp.google.com/g/#/c/157220
|
| Also fix an occurrence of LOGW missed in an earlier change.
|
| Bug: 5449033
| Change-Id: I2e3b23839e6dcd09015d6402280e9300c75e3406
* Normalize the include guard style. [Carl Shapiro, 2011-06-14, 1 file, -3/+3]
| A leading underscore followed by a capital letter is a reserved name space in C and C++. This change also moves any #include directives inside the include guard in some of the compiler/codegen/arm header files.
|
| Change-Id: I9715e2c5301699d31886e61d0fe6e29483555a2a
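As an illustration of the guard style this message describes, here is a minimal sketch of a header. The guard name and the helper `exampleAlignUp` are hypothetical and chosen only for the example; they are not taken from the compiler/codegen/arm headers.

```c
/* Reserved form (leading underscore followed by a capital letter), avoided:
 *   #ifndef _DALVIK_VM_COMPILER_EXAMPLE_H
 */
#ifndef DALVIK_VM_COMPILER_EXAMPLE_H_   /* hypothetical guard name */
#define DALVIK_VM_COMPILER_EXAMPLE_H_

#include <stddef.h>   /* #include directives sit inside the guard */

/* Hypothetical helper: round n up to the next multiple of 4. */
static inline size_t exampleAlignUp(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

#endif  /* DALVIK_VM_COMPILER_EXAMPLE_H_ */
```

Keeping the includes inside the guard means a second inclusion of the header skips them along with everything else.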
* Establish a subclass relationship between ClassObject and Object. [Carl Shapiro, 2011-05-06, 1 file, -2/+2]
| Change-Id: I9fb5d33f23ec7aeb2b9a3908d4125b34be0599ae
* Merge remote branch 'goog/dalvik-dev' into dalvik-dev-to-master [Brian Carlstrom, 2011-05-05, 4 files, -21/+22]
|\
| | Change-Id: I99c4289bd34f63b0b970b6ed0fa992b44e805393
| * Establish a subclass relationship between ArrayObject and Object. [Carl Shapiro, 2011-05-03, 1 file, -3/+3]
| | Change-Id: I9f9fe52bd4ceebb6dde48251a89190ba6bb00ce4
| * Get rid of unneeded extern, enum, typedef and struct qualifiers. [Carl Shapiro, 2011-04-27, 1 file, -2/+2]
| | Change-Id: I236c5a1553a51f82c9bc3eaaab042046c854d3b4
| * Move the compiler into C++. [Carl Shapiro, 2011-04-19, 3 files, -0/+0]
|/
| Change-Id: Idffbdb02c29e2be03a75f5a0a664603f2299504a
* Fix interpreter debug attach [buzbee, 2011-03-30, 1 file, -1/+1]
| Fix a few miscellaneous bugs from the interpreter restructuring that were causing a segfault on debugger attach.
|
| Added a sanity checking routine for debugging.
|
| Fixed a problem in which the JIT's threshold and on/off switch wouldn't get initialized properly on thread creation.
|
| Renamed dvmCompilerStateRefresh() to dvmCompilerUpdateGlobalState() to better reflect its function.
|
| Change-Id: I5b8af1ce2175e3c6f53cda19dd8e052a5f355587
* Interpreter restructuring [buzbee, 2011-03-23, 1 file, -0/+3]
| This is a restructuring of the Dalvik ARM and x86 interpreters:
|
|   o Combine the old portstd and portdbg interpreters into a single portable interpreter.
|   o Add debug/profiling support to the fast (mterp) interpreters.
|   o Delete old mechanism of switching between interpreters. Now, once you choose an interpreter at startup, you stick with it.
|   o Allow JIT to co-exist with profiling & debugging (necessary for first-class support of debugging with the JIT active).
|   o Adds single-step capability to the fast assembly interpreters without slowing them down (and, in fact, measurably improves their performance).
|   o Remove old "polling for safe point" mechanism. Breakouts now achieved via modifying base of interpreter handler table.
|   o Simplify interpreter control mechanism.
|   o Allow thread-granularity control for profiling & debugging
|
| The primary motivation behind this change was to improve the responsiveness of debugging and profiling and to make it easier to add new debugging and profiling capabilities in the future. Instead of always bailing out to the slow debug portable interpreter, we can now stay in the fast interpreter.
|
| A nice side effect of the change is that the fast interpreters got a healthy speed boost because we were able to replace the polling safepoint check that involved a dozen or so instructions with a single table-base reload. When combined with the two earlier CLs related to this restructuring, we show a 5.6% performance improvement using libdvm_interp.so on the Checkers benchmark relative to Honeycomb.
|
| Change-Id: I8d37e866b3618def4e582fc73f1cf69ffe428f3c
* Interpreter restructuring: eliminate InterpState [buzbee, 2011-02-19, 1 file, -2/+2]
| The key data structure for the interpreter is InterpState. This change eliminates it, merging its data with the Thread structure.
|
| Here's why:
|
| In principio creavit Fadden Thread et InterpState. And it was good.
|
| Thread holds thread-private state, while InterpState captures data associated with a Dalvik interpreter activation. Because JNI calls can result in nested interpreter invocations, we can have more than one InterpState for each actual thread. InterpState was relatively small, and it all worked well. It was used enough that in the Arm version a register (rGLUE) was dedicated to it.
|
| Then, along came the JIT guys, who saw InterpState as a convenient place to dump all sorts of useful data that they wanted quick access to through that dedicated register. InterpState grew and grew. In terms of space, this wasn't a big problem - but it did mean that the initialization cost of each interpreter activation grew as well. For applications that do a lot of callbacks from native code into Dalvik, this is measurable. It's also mostly useless cost because much of the JIT-related InterpState initialization was setting up useful constants - things that don't need to be saved and restored all the time.
|
| The biggest problem, though, deals with thread control. When something interesting is happening that needs all threads to be stopped (such as GC and debugger attach), we have access to all of the Thread structures, but we don't have access to all of the InterpState structures (which may be buried/nested on the native stack). As a result, polling for thread suspension is done via a one-indirection pointer chase. InterpState itself can't hold the stop bits because we can't always find it, so instead it holds a pointer to the global or thread-specific stop control. Yuck.
|
| With this change, we eliminate InterpState and merge all needed data into Thread. Further, we replace the dedicated rGLUE register with a pointer to the Thread structure (rSELF). The small subset of state data that needs to be saved and restored across nested interpreter activations is collected into a record that is saved to the interpreter frame, and restored on exit. Further, these small records are linked together to allow tracebacks to show nested activations. Old InterpState variables that simply contain useful constants are initialized once at thread creation time.
|
| This CL is large enough by itself that the new ability to streamline suspend checks is not done here - that will happen in a future CL. Here we just focus on consolidation.
|
| Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
* Add runtime support for method based compilation. [Ben Cheng, 2011-01-26, 3 files, -0/+28]
| Enhanced code cache management to accommodate both trace and method compilations. Also implemented a hacky dispatch routine for virtual leaf methods.
|
| Microbenchmark showed 3x speedup in leaf method invocation.
|
| Change-Id: I79d95b7300ba993667b3aa221c1df9c7b0583521
* More Jit-to-Interp entry point cleanup. [Ben Cheng, 2011-01-05, 1 file, -3/+6]
| Only register entry points dispatched through [r6+#offset] in JitToInterpEntries.
|
| For ARM targets, check the size of JitToInterpEntries explicitly to make sure that its last entry is within 128 bytes of InterpState, due to the Thumb codegen constraint.
|
| Change-Id: I74184115cb3a3c89afc3a5fe53685671d9cb1027
* It's "opcode" not "opCode". [Dan Bornstein, 2010-12-01, 1 file, -1/+1]
| Similarly "Opcode" not "OpCode". This appears to be the general worldwide consensus on the matter. Other residents of my office didn't seem to mind one way or the other how it's spelled in our code, but for whatever reason, it really bugged me.
|
| Change-Id: Ia0b73d19c54aefc0f543a9c9451dda22ee876a59
* Rename OpCode.h -> DexOpcodes.h. [Dan Bornstein, 2010-12-01, 1 file, -2/+1]
| Also incorporate the former contents of OpCodeNames.h. This is a small attempt to increase naming consistency in libdex. There will be a bit more to come, in a follow-up.
|
| Change-Id: Ia7ab06042dde2e19eda02ef1fee72fb4260e899d
* JIT - support for return-void-barrier [Issue 2992352] [buzbee, 2010-11-01, 1 file, -1/+1]
| Slight reworking of the memory barrier instruction generation to generalize it, and then add "dmb st" for the new return-void-barrier instruction.
|
| Change-Id: Iad95aa5b0ba9b616a17dcbe4c6ca2e3906bb49dc
* Re-organize target-independent JIT code. [buzbee, 2010-09-26, 1 file, -4/+7]
| Most of CodegenFactory.c is at a high-enough abstraction level to reuse for other targets. This CL moves the target-dependent routines into a new source file (ArchFactory.c) and moves what's left up a level into the target-independent directory.
|
| Change-Id: I792d5dc6b2dc8aa6aaa384039da464db2c766123
* JIT: Source code reorganization to isolate target independent code [buzbee, 2010-09-24, 1 file, -1/+1]
| Much of the register utility code is target independent. Move it up a level so the x86 JIT can use it.
|
| Change-Id: Id9895a42281fd836cb1a2c942e106de94df62a9a
* JIT: Support for Dalvik volatiles (issue 2781881) [buzbee, 2010-07-21, 1 file, -0/+7]
| Also, on SMP systems generate memory barriers.
|
| Change-Id: If64f7c98a8de426930b8f36ac77913e53b7b2d7a
* Relocate OpCodeNames.[ch]. [Andy McFadden, 2010-06-22, 1 file, -1/+1]
| The JIT was pulling it out of the dexdump directory, which is Just Plain Wrong[tm]. Now it's part of libdex, for all to enjoy.
|
| Change-Id: Ic1e4c981eb2d70ccc3c841ceb5a54f4f77af2008
* Clean up warnings detected by gcc. [Ben Cheng, 2010-05-28, 1 file, -0/+2]
| Also re-enabled the JIT for the ARMv5te target.
|
| Change-Id: I89fd229205e30e6ee92a4933290a7d8dca001232
* Fix for the JIT blocking mode plus some code cleanup. [Ben Cheng, 2010-03-24, 1 file, -4/+5]
| Bug: 2517606
| Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
* Jit: Sapphire tuning - mostly scheduling. [Bill Buzbee, 2010-03-03, 1 file, -0/+14]
| Re-enabled load/store motion that had inadvertently been turned off for non-armv7 targets. Tagged memory references with the kind of memory they touch (Dalvik frame, literal pool, heap) to enable more aggressive load hoisting. Eliminated some largely duplicate code in the target specific files. Reworked temp register allocation code to allocate next temp round-robin (to improve scheduling opportunities).
|
| Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and measurable improvements for Passion.
* Enable JIT parameters to be initialized in an architecture dependent way. [Ben Cheng, 2010-02-05, 1 file, -0/+1]
| The search for optimal values is still ongoing. The current settings are:
|
|                      v5     v7
| JIT profile table    512    2048
| JIT code cache       512K   1M
| JIT threshold        200    40
* Restore threshold to 200 as a temporary workaround [Bill Buzbee, 2010-01-13, 1 file, -2/+1]
| Also, fix blocking mode initialization.
* Integrate call-graph information into JIT method blacklist. [Ben Cheng, 2010-01-12, 1 file, -1/+0]
| The new flag is -Xjitcheckcg which will crawl the stack frame of the translation requesting thread.
|
| Bug: 2368995
* Performance tweak for Jit lookup & adjust table sizes for better performance [Bill Buzbee, 2010-01-12, 1 file, -0/+12]
| Also, move setting of Jit table parameters to architecture-specific init function.
* Move VFP register save/restore routines from template to codegen. [Ben Cheng, 2009-12-16, 1 file, -0/+32]
| Code in the template directory will occupy space in the code cache and is invoked from JIT'ed code. Since these routines are only invoked from statically compiled functions we can move them to the codegen directory which also has arch-variant configurations.
* Restructure the codegen to make architectural dependency explicit. [Ben Cheng, 2009-11-22, 2 files, -94/+51]
| The original Codegen.c is broken into three components:
|
| - CodegenCommon.c (arch-independent)
| - CodegenFactory.c (Thumb1/2 dependent)
| - CodegenDriver.c (Dalvik dependent)
|
| The Thumb and Thumb2 directories each contain the following three files:
|
| - Factory.c (low-level routines for instruction selection)
| - Gen.c (invokes the ISA-specific instruction selection routines)
| - Ralloc.c (arch-dependent register pools)
|
| The FP directory contains FP-specific codegen routines depending on Thumb/Thumb2/VFP/PortableFP:
|
| - Thumb2VFP.c
| - ThumbVFP.c
| - ThumbPortableFP.c
|
| Then the hierarchy is formed by stacking these files in the following top-down order:
|
| 1 CodegenCommon.c
| 2 Thumb[2]/Factory.c
| 3 CodegenFactory.c
| 4 Thumb[2]/Gen.c
| 5 FP stuff
| 6 Thumb[2]/Ralloc.c
| 7 CodegenDriver.c
* Introduce "just interpret" chainable pseudo-translation. [Bill Buzbee, 2009-11-09, 1 file, -0/+6]
| This is the first step towards enabling translation & self-cosim stress modes. When trace selection begins, the trace head address is pinned and remains in a limbo state until the translation is complete. Previously, if the trace selection aborted for any reason, the trace head would remain forever in limbo. This was not a correctness problem, but caused some small performance anomalies and made life more difficult for self-cosimulation mode.
|
| This CL introduces a pseudo-translation that simply routes control to the interpreter. When we detect that a trace selection attempt has failed, the trace head is associated with this fully-chainable pseudo-translation. This also has the benefit for self-cosimulation that we are guaranteed forward progress.
* Major register allocation rework - stage 1. [Bill Buzbee, 2009-10-30, 1 file, -47/+28]
| Direct usage of registers abstracted out.
| Live values tracked locally. Redundant loads and stores suppressed.
| Address of registers and register pairs unified w/ single "location" mechanism
| Register types inferred using existing dataflow analysis pass.
| Interim (i.e. Hack) mechanism for storing register liveness info. Rewrite TBD.
| Stubbed-out code for linear scan allocation (for loop and long traces)
| Moved optimistic lock check for monitor-enter/exit inline for Thumb2
| Minor restructuring, renaming and general cleanup of codegen
| Renaming of enums to follow coding convention
| Formatting fixes introduced by the enum renaming
|
| Rewrite of RallocUtil.c and addition of linear scan to come in stage 2.
* Implemented a new scheduler and FP register allocator. [Ben Cheng, 2009-09-25, 1 file, -1/+1]
| Improved performance by 50% over existing JIT for some FP benchmarks.
* Improved codegen for inline, continuing codegen restructuring [Bill Buzbee, 2009-08-28, 1 file, -10/+0]
| Added support for Thumb2 IT. Moved compare-long and floating point comparisons inline. Temporarily disabled use of Thumb2 CBZ & CBNZ because they were causing too many out-of-range assembly restarts. Bug fix for LIR3 assert.
* Stage 2 of structural changes for support of THUMB2. No logic changes. [Bill Buzbee, 2009-07-28, 2 files, -0/+217]