path: root/vm/compiler/codegen/arm/Thumb
Commit message | Author | Age | Files | Lines
* vm: Enable fast multiply on perf builds too | Steve Kondik | 2014-03-24 | 1 | -1/+1
| Change-Id: I74d152ea9cfe5b15daa9a8353ca27d8afa7474d2
* Merge branch 'kk_2.7_rb1.9' of git://codeaurora.org/platform/dalvik into caf | Steve Kondik | 2013-11-11 | 1 | -1/+144
|\
| | Change-Id: I885fab2470352d0a625c9946d0d5c9111486b713
| * dalvik: dalvik device extension pack. | Xin Qi | 2013-10-31 | 1 | -1/+1
|/
| Add support for customer device extension
|
| Change-Id: I0402a630ba212d1c5e81cda110f61210f7b60384
| (cherry picked from commit 11499df326462bfe25890a35c6abbf019ff7784e)
| (cherry picked from commit e03b8f8da9cf4eef64cedf39ce9ca90d26ce5124)
| (cherry picked from commit fb360be406f35b9591f12c61936657f03cc5880f)
* JIT: Use rsb and shift in easy multiply. | Anders O Nilsson | 2013-06-14 | 1 | -0/+8
| For easy multiplication using reverse subtract (when lit is 2^n-1), use the
| barrel shifter for rsb. This improves arithmetic performance for code
| executing in Dalvik, e.g. String.hashCode.
|
| Change-Id: Ifb086dcec344b30fd3e392ac21d508b43e820cdc
| Signed-off-by: Patrik Ryd <patrik.ryd@stericsson.com>
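The identity behind this change can be sketched in C. This is only an illustration of the arithmetic, not the JIT's code: the function names are hypothetical, and the literal is assumed to be a 32-bit value. On ARM encodings that allow a shifted register operand, `(x << n) - x` corresponds to a single `rsb rd, rx, rx, lsl #n`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reports whether lit == 2^n - 1 for some n in 1..31
 * and, if so, stores n in *n_out. */
bool is_pow2_minus_one(uint32_t lit, unsigned *n_out)
{
    uint32_t next = lit + 1u;
    /* lit + 1 must be a non-zero power of two, i.e. share no bits with lit. */
    if (lit == 0 || next == 0 || (next & lit) != 0)
        return false;
    unsigned n = 0;
    while ((next >> n) != 1u)
        n++;
    *n_out = n;
    return true;
}

/* x * (2^n - 1) computed as (x << n) - x, for n in 1..31.
 * Unsigned arithmetic gives the same low 32 bits as a full multiply. */
uint32_t mul_pow2_minus_one(uint32_t x, unsigned n)
{
    return (x << n) - x;
}
```

String.hashCode multiplies by 31 = 2^5 - 1, so it falls into this case with n = 5.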
* Rename (IF_)LOGE(_IF) to (IF_)ALOGE(_IF) DO NOT MERGE | Steve Block | 2012-01-08 | 1 | -10/+10
| See https://android-git.corp.google.com/g/#/c/157220
|
| Also fix an occurrence of LOGW missed in an earlier change.
|
| Bug: 5449033
| Change-Id: I2e3b23839e6dcd09015d6402280e9300c75e3406
* Interpreter/Debugger fix #4479968 | buzbee | 2011-05-25 | 1 | -3/+3
| This one was tricky to track down. The underlying problem arose with the
| consolidation of InterpState with Thread. Rather than having a state
| structure for each instance of the interpreter, we moved to a model that
| had a single thread-local struct shared by all interpreter instances
| running on that thread. A portion of interpreter state can't be shared,
| and thus was saved and restored on nested invocations of the interpreter.
|
| The bug here was that the storage for method return values was not included
| in the state that needed save/restore. In normal operation, it doesn't need
| to be saved: that storage isn't live across an invoke that could trigger a
| nested interpreter activation. However, when debugging, the debugger itself
| may hijack threads and create new interpreter instances for its own
| purposes, and there is a small window in which a live retval can be trashed.
|
| The fix is simply to move retval into the InterpSave struct.
|
| Change-Id: Ib621824b799c5caa16fdfa8f5689a181159059df
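A minimal sketch of why moving retval into the saved state fixes the problem. The struct layouts and function names below are simplified assumptions for illustration and do not reproduce the real Dalvik definitions; only the names InterpSave, Thread and retval come from the message above.

```c
#include <stdint.h>

/* Simplified stand-in for a Dalvik value slot (assumption). */
typedef union { int32_t i; int64_t j; void *l; } JValue;

/* State that cannot be shared between nested interpreter activations. */
typedef struct InterpSave {
    const uint16_t *pc;   /* current instruction pointer */
    JValue retval;        /* after the fix: saved and restored with the rest */
} InterpSave;

typedef struct Thread {
    InterpSave interpSave;
} Thread;

/* Hypothetical nested activation that overwrites the return-value slot,
 * as a debugger-created interpreter instance might. */
void clobbering_activation(Thread *self)
{
    self->interpSave.retval.j = -1;
}

/* Run a nested activation, preserving the outer activation's InterpSave. */
void run_nested(Thread *self, void (*body)(Thread *))
{
    InterpSave saved = self->interpSave;  /* includes retval */
    body(self);
    self->interpSave = saved;             /* outer retval is live again */
}
```

With retval outside InterpSave, the struct copy in `run_nested` would not cover it, and the outer activation would observe the nested activation's value.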
* Fix a Thumb vs Thumb2 codegen bug. | Ben Cheng | 2011-05-11 | 1 | -1/+31
| A Thumb2 pc-relative load is slipped into the codegen stream even though
| the selected platform is armv5te (e.g. the emulator).
|
| Bug: 4399358
| Change-Id: I61dd6853cad6c82de43f384814c903dd9f3ae302
* Move the compiler into C++. | Carl Shapiro | 2011-04-19 | 3 | -0/+0
| Change-Id: Idffbdb02c29e2be03a75f5a0a664603f2299504a
* Handle relocatable class objects in JIT'ed code. | Ben Cheng | 2011-03-10 | 1 | -2/+2
| 1) Split the original literal pool into class object literals and constants.
|    Elements in the class object pool have to match their values exactly
|    (i.e. no +delta space optimizations), since they might be relocated.
|
| 2) Implement dvmJitScanAllClassPointers(void (*callback)(void *)), which is
|    the entry routine to report all memory locations in the code cache that
|    contain class objects (i.e. the class object pool and predicted chaining
|    cells for virtual calls).
|
| 3) Major codegen changes on how/when the class object pool is populated and
|    how predicted chains are patched. Before this change the compiler thread
|    is always in the VM_WAIT state, which won't prevent GC from running. Since
|    the class object pointers captured by a worker thread are no longer
|    guaranteed to be stable at JIT time, change various internal data
|    structures to capture the class descriptor/loader tuple instead. The
|    conversion from descriptor/loader tuple to actual class object pointers
|    is only performed when the thread state is RUNNING or at a GC safe point.
|
| 4) Separate the class object installation phase out of the main
|    dvmCompilerAssembleLIR routine so that the impact on blocking GC requests
|    is minimal. Add new stats to report the potential block time. For example:
|
|    Potential GC blocked by compiler: max 46 us / avg 25 us
|
| 5) Various cleanup in the trace structure walkup code. Modified the verbose
|    print routine to show the class descriptor in the class literal pool.
|    For example:
|
|    D/dalvikvm( 1450): -------- end of chaining cells (0x007c)
|    D/dalvikvm( 1450): 0x44020628 (00b4): .class (Lcom/android/unit_tests/PerformanceTests$EmptyClass;)
|    D/dalvikvm( 1450): 0x4402062c (00b8): .word (0xaca8d1a5)
|    D/dalvikvm( 1450): 0x44020630 (00bc): .word (0x401abc02)
|    D/dalvikvm( 1450): End
|
| Bug: 3482956
| Change-Id: I2e736b00d63adc255c33067544606b8b96b72ffc
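The callback-based scan in item 2 can be illustrated with a small sketch. Only the callback shape `void (*callback)(void *)` is taken from the message; the pool layout, the function names and the relocating callback are assumptions made for the example. The scanner hands the callback the address of each slot that holds a class pointer, so the callback can rewrite the slot in place if the object has moved.

```c
#include <stddef.h>

/* Hypothetical view of the class-pointer slots found in the code cache. */
typedef struct ClassPointerPool {
    void **slots;
    size_t count;
} ClassPointerPool;

/* Report the address of every slot that holds a class object pointer. */
void scan_class_pointers(ClassPointerPool *pool, void (*callback)(void *))
{
    for (size_t i = 0; i < pool->count; i++)
        callback(&pool->slots[i]);
}

/* Example callback (assumption): rewrite slots that still point at an
 * object's old address so they point at its new address. */
void *g_oldAddr;
void *g_newAddr;

void relocate_slot(void *slotAddr)
{
    void **slot = (void **)slotAddr;
    if (*slot == g_oldAddr)
        *slot = g_newAddr;
}
```

Because the slots may be rewritten, each one must hold the exact pointer value, which is why item 1 rules out +delta encodings for the class object pool.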
* Handle OP_THROW in the method JIT. | Ben Cheng | 2011-03-02 | 2 | -19/+19
| The current implementation is to reconstruct the leaf Dalvik frame and punt
| to the interpreter, since the amount of work involved to match each catch
| block and walk through the stack frames is just not worth JIT'ing.
|
| Additional changes:
| - Fixed a control-flow bug where a block that ends with a throw shouldn't
|   have a fall-through block.
| - Fixed a code cache lookup bug so that method-based compilation is
|   guaranteed a slot in the profiling table.
| - Created separate handler routines based on opcode format for the
|   method-based JIT.
| - Renamed a few core registers that also have special meanings to the VM or
|   ARM architecture.
|
| Change-Id: I429b3633f281a0e04d352ae17a1c4f4a41bab156
* Interpreter restructuring: eliminate InterpState | buzbee | 2011-02-19 | 1 | -7/+7
| The key data structure for the interpreter is InterpState. This change
| eliminates it, merging its data with the Thread structure.
|
| Here's why:
|
| In the beginning, Fadden created Thread and InterpState. And it was good.
|
| Thread holds thread-private state, while InterpState captures data
| associated with a Dalvik interpreter activation. Because JNI calls can
| result in nested interpreter invocations, we can have more than one
| InterpState for each actual thread. InterpState was relatively small, and
| it all worked well. It was used enough that in the Arm version a register
| (rGLUE) was dedicated to it.
|
| Then, along came the JIT guys, who saw InterpState as a convenient place to
| dump all sorts of useful data that they wanted quick access to through that
| dedicated register. InterpState grew and grew. In terms of space, this
| wasn't a big problem, but it did mean that the initialization cost of each
| interpreter activation grew as well. For applications that do a lot of
| callbacks from native code into Dalvik, this is measurable. It's also
| mostly useless cost because much of the JIT-related InterpState
| initialization was setting up useful constants, things that don't need to
| be saved and restored all the time.
|
| The biggest problem, though, deals with thread control. When something
| interesting is happening that needs all threads to be stopped (such as GC
| and debugger attach), we have access to all of the Thread structures, but
| we don't have access to all of the InterpState structures (which may be
| buried/nested on the native stack). As a result, polling for thread
| suspension is done via a one-indirection pointer chase. InterpState itself
| can't hold the stop bits because we can't always find it, so instead it
| holds a pointer to the global or thread-specific stop control. Yuck.
|
| With this change, we eliminate InterpState and merge all needed data into
| Thread. Further, we replace the dedicated rGLUE register with a pointer to
| the Thread structure (rSELF). The small subset of state data that needs to
| be saved and restored across nested interpreter activations is collected
| into a record that is saved to the interpreter frame, and restored on exit.
| Further, these small records are linked together to allow tracebacks to
| show nested activations. Old InterpState variables that simply contain
| useful constants are initialized once at thread creation time.
|
| This CL is large enough by itself that the new ability to streamline
| suspend checks is not done here; that will happen in a future CL. Here we
| just focus on consolidation.
|
| Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
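The linked save records described above can be sketched roughly as follows. All type, field and function names here are hypothetical simplifications rather than the actual Dalvik structures: on entry to a nested activation the outer state is copied into a slot owned by the new interpreter frame and linked in; on exit it is copied back.

```c
#include <stddef.h>

/* Per-activation state that must survive nested interpreter invocations. */
typedef struct SaveRecord {
    const unsigned short *pc;      /* interpreted program counter */
    unsigned int *fp;              /* interpreted frame pointer */
    struct SaveRecord *prev;       /* record of the enclosing activation */
} SaveRecord;

typedef struct Thread {
    SaveRecord save;   /* state of the innermost activation */
    int jitThreshold;  /* example of a constant set once at thread creation */
} Thread;

/* Entering a nested activation: stash the outer state in a slot that lives
 * in the new interpreter frame and link it for tracebacks. */
void enter_activation(Thread *self, SaveRecord *frameSlot)
{
    *frameSlot = self->save;
    self->save.pc = NULL;
    self->save.fp = NULL;
    self->save.prev = frameSlot;
}

/* Leaving the activation: restore the outer state from the linked record. */
void exit_activation(Thread *self)
{
    if (self->save.prev != NULL)
        self->save = *self->save.prev;
}

/* A traceback can walk the chain; here we just count saved activations. */
int saved_activations(const Thread *self)
{
    int depth = 0;
    for (const SaveRecord *r = self->save.prev; r != NULL; r = r->prev)
        depth++;
    return depth;
}
```

Since every record hangs off the Thread, code that stops all threads can reach the full activation chain without searching the native stack.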
* Misc goodies in the JIT in preparation for more aggressive code motion. | Ben Cheng | 2011-02-08 | 1 | -19/+57
| - Set up resource masks correctly for Thumb push/pop when LR/PC are
|   involved.
| - Preserve LR around simulated heap references under self-verification
|   mode.
| - Compact a few simple flags in ArmLIR into bit fields.
| - Minor performance tuning in TEMPLATE_MEM_OP_DECODE
|
| Change-Id: Id73edac837c5bb37dfd21f372d6fa21c238cf42a
* Light refactoring of handleExecuteInline. | Elliott Hughes | 2011-01-20 | 1 | -2/+2
| I wanted the code that JITs a call to a C function extracted so I can
| potentially use it elsewhere. The functions that sometimes JIT instructions
| directly and other times bail out to C can now call this, simplifying the
| body of the switch.
|
| I think there's a behavioral change here with the ThumbVFP genInlineSqrt,
| which previously had the wrong return value.
|
| Tested on passion to ensure that the performance characteristics of
| assembler intrinsics, C intrinsics, and library native methods haven't
| changed (using the Math and Float classes).
|
| Change-Id: Id79771a31abe3a516f403486454e9c0d9793622a
* [JIT] Trace profiling support | buzbee | 2010-12-17 | 1 | -0/+56
| In preparation for method compilation, this CL causes all traces to include
| two entry points: profiling and non-profiling. For now, the profiling entry
| will only be used if dalvik is run with -Xjitprofile, and largely works
| like it did before. The difference is that profiling support no longer
| requires the "assert" build; it's always there now.
|
| This will enable us to do a form of sampling profiling of traces in order
| to identify hot methods or hot trace groups, while keeping the overhead low
| by only switching profiling on periodically.
|
| To turn the periodic profiling on and off, we simply unchain all existing
| translations and set the appropriate global profile state. The underlying
| translation lookup and chaining utilities will examine the profile state to
| determine which entry point to use (i.e. profiling or non-profiling) while
| the traces naturally rechain during further execution.
|
| Change-Id: I9ee33e69e33869b9fab3a57e88f9bc524175172b
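The unchain-then-rechain mechanism can be modelled in a few lines. This is a toy model under stated assumptions: the structures and function names are invented for the example, and real chaining patches branch instructions in the code cache rather than storing pointers in a struct.

```c
#include <stdbool.h>
#include <stddef.h>

/* Each translation exposes two entry points (hypothetical layout). */
typedef struct Translation {
    const void *profilingEntry;  /* bumps an execution counter first */
    const void *normalEntry;     /* skips the profiling prologue */
} Translation;

/* A chaining cell caches the entry point of the next translation;
 * NULL means "unchained", so the next execution goes through lookup. */
typedef struct ChainCell {
    const void *target;
} ChainCell;

typedef struct JitState {
    bool profilingOn;  /* global profile state */
} JitState;

/* Lookup/chaining consults the profile state to pick the entry point. */
void rechain(ChainCell *cell, const Translation *t, const JitState *state)
{
    cell->target = state->profilingOn ? t->profilingEntry : t->normalEntry;
}

/* Switching profiling on or off: flip the state and unchain everything,
 * so cells pick up the other entry point as they rechain naturally. */
void set_profiling(JitState *state, bool on, ChainCell *cells, size_t n)
{
    state->profilingOn = on;
    for (size_t i = 0; i < n; i++)
        cells[i].target = NULL;
}
```

The design keeps the switch cheap: nothing is recompiled, and only traces that actually run again pay the cost of rechaining.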
* [JIT] Regalloc cleanup | buzbee | 2010-12-14 | 1 | -6/+0
| Remove vestiges of code intended for linear scan register allocation in the
| trace compiler. New plan is to stick with local allocation for traces and
| build a new linear scan allocator for the method compiler.
|
| Change-Id: Ic265ab5a7936b144cbe7fa4dc667fa7aba579045
* Add explicit casts from "void *" to destination types. | Ben Cheng | 2010-12-14 | 2 | -4/+4
| Change-Id: I8828bc628f110aaade578a197bf1f51b30bf1be7
* It's "opcode" not "opCode". | Dan Bornstein | 2010-12-01 | 1 | -115/+115
| Similarly "Opcode" not "OpCode". This appears to be the general worldwide
| consensus on the matter. Other residents of my office didn't seem to mind
| one way or the other how it's spelled in our code, but for whatever reason,
| it really bugged me.
|
| Change-Id: Ia0b73d19c54aefc0f543a9c9451dda22ee876a59
* JIT: Add new compare-immed-and-branch primitive & drop useless clrex | buzbee | 2010-08-31 | 1 | -10/+7
| This allows better use of cbz/cbnz on Thumb2 targets. Also, removed the
| clrex from the inline monitor enter code (not necessary).
|
| Change-Id: I3bfa90bcdf34f6ef3e2447c9c6f1b49a98a89e58
* Clean up warnings detected by gcc. | Ben Cheng | 2010-05-28 | 2 | -4/+0
| Also re-enabled the JIT for the ARMv5te target.
|
| Change-Id: I89fd229205e30e6ee92a4933290a7d8dca001232
* Clean up the codegen for invoking helper callout functions. | Ben Cheng | 2010-04-02 | 1 | -5/+9
| All invoked functions are documented in
| compiler/codegen/arm/CalloutHelper.h
|
| Bug: 2567981
| Change-Id: Ia7cd4107272df1b0b5588fbcc0aafcc6d0723d60
* Fix for 2542488 JIT codegen bug with overlapping wide operands | Bill Buzbee | 2010-03-27 | 1 | -1/+1
| Change-Id: I2f31492f68cb753f76dd664cd6b0a52d7d32de4c
* Jit: Fix for 2542488 JIT codegen bug with overlapping wide operands | Bill Buzbee | 2010-03-25 | 1 | -2/+12
| Change-Id: I7b922e223fe1f5242d1f3db1fa18f54aaed725af
* Jit: Make most Jit compile failures non-fatal; just abort offending translation | Bill Buzbee | 2010-03-07 | 1 | -10/+20
| Issue 2175597 Jit compile failures should abort translation, but not the VM
|
| Added new dvmCompileAbort() to replace uses of dvmAbort() when something
| goes wrong during the compilation of a trace. In that case, we'll abort the
| translation and set its head to the interpret-only "translation".
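One way such a non-fatal abort could be structured in C is with `setjmp`/`longjmp`. The sketch below is an assumption about a possible shape, not a description of how dvmCompileAbort() is implemented; every name in it is hypothetical.

```c
#include <setjmp.h>
#include <stdbool.h>

typedef struct CompilationUnit {
    jmp_buf bailOut;       /* where to resume if this compilation is abandoned */
    bool interpretOnly;    /* trace head marked "interpret-only" on failure */
} CompilationUnit;

/* Abandon only the current translation instead of aborting the whole VM. */
void compile_abort(CompilationUnit *cUnit)
{
    longjmp(cUnit->bailOut, 1);
}

/* Stand-in for code generation that may hit a case it cannot handle. */
static void generate_code(CompilationUnit *cUnit, bool hitUnsupportedCase)
{
    if (hitUnsupportedCase)
        compile_abort(cUnit);
}

/* Returns true if the trace was compiled; on failure the trace is marked
 * interpret-only and the caller carries on. */
bool compile_trace(CompilationUnit *cUnit, bool hitUnsupportedCase)
{
    cUnit->interpretOnly = false;
    if (setjmp(cUnit->bailOut) != 0) {
        cUnit->interpretOnly = true;
        return false;
    }
    generate_code(cUnit, hitUnsupportedCase);
    return true;
}
```

The benefit is that deeply nested codegen routines can give up without threading an error code through every caller.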
* Jit: Sapphire tuning - mostly scheduling. | Bill Buzbee | 2010-03-03 | 2 | -113/+46
| Re-enabled load/store motion that had inadvertently been turned off for
| non-armv7 targets. Tagged memory references with the kind of memory they
| touch (Dalvik frame, literal pool, heap) to enable more aggressive load
| hoisting. Eliminated some largely duplicate code in the target specific
| files. Reworked temp register allocation code to allocate next temp
| round-robin (to improve scheduling opportunities).
|
| Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and
| measurable improvements for Passion.
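Round-robin temp allocation can be sketched as below. The pool size, names and layout are assumptions for illustration only. The point is that a freed temp is not handed out again immediately, so consecutive short-lived values land in different registers and a scheduler has fewer false dependencies to respect.

```c
#include <stdbool.h>

#define NUM_TEMPS 4  /* assumed pool size for the example */

typedef struct TempPool {
    bool inUse[NUM_TEMPS];
    int nextTemp;  /* where the next search starts */
} TempPool;

/* Allocate the next free temp, starting after the last one handed out.
 * Returns the register index, or -1 if every temp is in use. */
int alloc_temp(TempPool *pool)
{
    for (int i = 0; i < NUM_TEMPS; i++) {
        int reg = (pool->nextTemp + i) % NUM_TEMPS;
        if (!pool->inUse[reg]) {
            pool->inUse[reg] = true;
            pool->nextTemp = (reg + 1) % NUM_TEMPS;
            return reg;
        }
    }
    return -1;
}

void free_temp(TempPool *pool, int reg)
{
    pool->inUse[reg] = false;
}
```

A first-fit allocator would return register 0 again right after it was freed; the round-robin version moves on to register 1.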
* Fix a couple of typos in JIT function names. | Elliott Hughes | 2010-02-25 | 1 | -1/+1
| (I saw these the other day, but preferred a separate patch.)
* Optimize more easy multiplications by constants. | Elliott Hughes | 2010-02-24 | 1 | -0/+9
| Rather than make these changes in the libraries (*10 being a common case),
| let's do them once and for all in the JIT.
|
| The 2^n-1 case could be better if we generated RSB instructions, but the
| current "fake" RSB is still better than a full multiply. Thumb doesn't
| support reg/reg/reg/shift instructions, so we can't optimize the
| "population count <= 2" cases (such as *10) there.
|
| Tested on sholes, passion, and passion-running-sapphire (and visually
| inspected to check we weren't trying to generate Thumb2 instructions
| there). Also tested with the self-verifier.
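The "population count <= 2" case rests on a simple identity: a constant with two set bits turns the multiply into two shifts and an add, e.g. x*10 = (x << 3) + (x << 1). The C sketch below only demonstrates that arithmetic; the function names are hypothetical and it says nothing about the instructions the JIT actually emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of set bits in v. */
static unsigned popcount32(uint32_t v)
{
    unsigned count = 0;
    while (v != 0) {
        v &= v - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Index of the lowest set bit; v must be non-zero. */
static unsigned lowest_bit(uint32_t v)
{
    unsigned index = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        index++;
    }
    return index;
}

/* If lit has one or two bits set, compute x * lit with shifts and an add
 * and return true; otherwise leave *out untouched and return false. */
bool easy_multiply(uint32_t x, uint32_t lit, uint32_t *out)
{
    unsigned bits = popcount32(lit);
    if (bits == 1) {
        *out = x << lowest_bit(lit);
        return true;
    }
    if (bits == 2) {
        unsigned lo = lowest_bit(lit);
        unsigned hi = lowest_bit(lit & (lit - 1));  /* the other set bit */
        *out = (x << hi) + (x << lo);
        return true;
    }
    return false;
}
```

On an instruction set with a shifted register operand, the two-bit case needs one shift plus one add-with-shift, which is why the message notes it is unavailable on Thumb.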
* Added LDMIA/STMIA support to Self Verification mode. | jeffhao | 2010-02-10 | 1 | -0/+8
|
* Jit: Phase 1 of register utility cleanup/rewrite - the great renaming | Bill Buzbee | 2010-02-09 | 3 | -78/+78
| Rename all of the register utilities that used to be local (because of our
| include mechanism) to the standard dvmCompiler prefix scheme.
* Made Self Verification mode's memory interface less intrusive. | jeffhao | 2010-02-04 | 1 | -1/+20
|
* Restructure the codegen to make architectural dependency explicit. | Ben Cheng | 2009-11-22 | 3 | -0/+1160
  The original Codegen.c is broken into three components:
  - CodegenCommon.c (arch-independent)
  - CodegenFactory.c (Thumb1/2 dependent)
  - CodegenDriver.c (Dalvik dependent)

  The Thumb and Thumb2 directories each contain the following three files:
  - Factory.c (low-level routines for instruction selection)
  - Gen.c (invokes the ISA-specific instruction selection routines)
  - Ralloc.c (arch-dependent register pools)

  The FP directory contains FP-specific codegen routines depending on
  Thumb/Thumb2/VFP/PortableFP:
  - Thumb2VFP.c
  - ThumbVFP.c
  - ThumbPortableFP.c

  The hierarchy is then formed by stacking these files in the following
  top-down order:
  1 CodegenCommon.c
  2 Thumb[2]/Factory.c
  3 CodegenFactory.c
  4 Thumb[2]/Gen.c
  5 FP stuff
  6 Thumb[2]/Ralloc.c
  7 CodegenDriver.c