| Commit message (Collapse) | Author | Age | Files | Lines |
| |
| |
During generation of code into the code cache,
the unprotected region of memory does not correspond to
the protected one. This patch fixes that.
Author: Katkov Serguei <serguei.i.katkov@intel.com>
(cherry picked from commit 74a62214ef262380371bc21be2a1c42295046fb2)
Change-Id: I362a10897564b987c8a3b2dfc9ded8f0a9efd56a
|
| |\
| |
| |
| | |
Change-Id: If7712cbddd6786c91648c4fc31f04e96937d4670
|
| | |
| |
| |
| | |
Change-Id: I66e226f8390bd499e956b00e4088bc0e1e150cb1
|
| | |
| |
| |
| | |
Change-Id: I679fd6f06e007921251d15d7003615d7b0d91c52
|
| |/
|
| |
The tuning knobs for triggering trace compilation for the JIT
had not been revisited for several years. In that time, the
working set of some applications has significantly increased,
leading to frequent cache overflows & flushes.
This CL adds the ability to set the maximum size of the JIT's
cache on the command line, and we expect to use different settings
depending on device configuration (rule of thumb: 1K for each 1M
of system RAM, with a 2M limit).
Additionally, the trace compilation trigger has been tightened to
limit the compilation of cold traces.
Change-Id: Ice22c5d9d46a93e465c57dd83f50ca3912f1672e
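The rule of thumb above can be written down as a small calculation. This is only a sketch: the helper name and constants are hypothetical and do not come from the Dalvik sources, and the CL itself only describes a command-line setting, not a computed default.

```cpp
#include <cstdint>

// Hypothetical constants, for illustration only.
constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;
constexpr std::uint64_t kMaxJitCacheBytes = 2 * kMiB;  // the "2M limit"

// Hypothetical helper: 1 KiB of JIT code cache per 1 MiB of system RAM,
// capped at 2 MiB.
std::uint64_t suggestedJitCacheBytes(std::uint64_t systemRamBytes) {
    std::uint64_t size = (systemRamBytes / kMiB) * kKiB;
    return size < kMaxJitCacheBytes ? size : kMaxJitCacheBytes;
}
```

Under these assumptions a 512M device would get a 512K cache, and anything with 2G of RAM or more would hit the 2M cap.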
|
| |\ |
|
| | |
| |
| |
| |
| |
| |
| |
| | |
This patch makes the necessary changes to pass on correct information to
dvmBumpNoChain, so that the WITH_JIT_TUNING flag can be enabled for x86 codegen.
Change-Id: Ia5e5c0406433bf645ef67143d0f1a11a28153a66
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
| |\ \ |
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| | |
The interpreter doesn't allow SGET/SPUT bytecodes in a trace till the field
is resolved. However, exhaustTrace can pick up bytecodes beyond the trace
sent by the interpreter. Terminate the loop formation if this is seen.
Change-Id: I0f38d6690b3501111bd16103623fa545d0ec1873
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
| |/
| |
The x86 codegen uses the FPU stack for double/float to long conversions. We
need to clear out the FPU stack afterwards, to prevent an eventual stack
overflow.
Change-Id: I2f306d7c228ad3da2b84faf9f08326769a9417af
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
| |
Disable Method JIT when compiling for x86 target.
Change-Id: Ide0dbd1f602ffd955b901cc152de1e05771fd529
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
| |
My previous "fix" (c89d83e1c05979b68037ad15413fa4460a88e36f) had the
conditions reversed, so you _had_ to use -Xjitthreshold to get a non-zero
threshold, but when you did, you'd get the default instead of what you
asked for!
This was spotted by the jank tests.
Bug: 8285558
Bug: https://code.google.com/p/android/issues/detail?id=52017
Change-Id: I28270f2573d46929eb10d30789fecf7d5a8cea75
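A minimal sketch of the intended precedence, assuming a flag that records whether the user supplied a threshold. The struct, field, and function names are hypothetical and are not the Dalvik option-handling code.

```cpp
// Hypothetical option record; names are illustrative only.
struct ToyJitOptions {
    bool thresholdSetByUser = false;
    unsigned threshold = 0;
};

// Called when a -Xjitthreshold value is parsed from the command line.
void setUserThreshold(ToyJitOptions& opts, unsigned value) {
    opts.threshold = value;
    opts.thresholdSetByUser = true;
}

// Called later, during architecture-specific setup. The default must be
// applied only when the user did NOT ask for a value; testing the flag
// the other way round gives the behaviour described above.
void applyArchDefault(ToyJitOptions& opts, unsigned archDefault) {
    if (!opts.thresholdSetByUser) {
        opts.threshold = archDefault;
    }
}
```

With this ordering, the late architecture default no longer overwrites a user-supplied value, and an unset option still ends up with a non-zero threshold.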
|
| |
Previously, we'd always overwrite the user-supplied value because
the architecture-specific default gets set so late.
Bug: https://code.google.com/p/android/issues/detail?id=52017
Change-Id: I469bf9ce599820f5ce3dea346aa8f680deffb0c5
|
| |
We can just use dvmFindInterfaceMethodInCache directly.
Change-Id: I2f3a660262ba7a39c05689df160ebdd2e7ec38a5
|
| |
Removes all the E and I logging in cases like these:
E( 2901) JIT couldn't compile Ljava/lang/Number;<init> dex_pc=0
I( 2901) codeGenBasicBlockJit returns negative number
E( 2901) JIT couldn't compile Ljava/lang/String;<init> dex_pc=0
I( 2901) codeGenBasicBlockJit returns negative number
E( 2901) JIT couldn't compile Ljava/util/Hashtable$HashtableEntry;<init> dex_pc=0
I( 2901) codeGenBasicBlockJit returns negative number
E( 2901) JIT couldn't compile Ljava/lang/AbstractStringBuilder;<init> dex_pc=0
I( 2901) codeGenBasicBlockJit returns negative number
E( 2901) JIT couldn't compile Ljava/util/HashMap$HashMapEntry;<init> dex_pc=0
I( 2901) codeGenBasicBlockJit returns negative number
Change-Id: I020c01c11a3840e700bbeb39237da1a6d508be8a
|
| |
I think there was confusion here between method inlining and the method
compiler. Just because the latter isn't yet functional doesn't mean we
don't want the former for those targets that support it.
Bug: 7179010
Change-Id: If0de856b93615f01dfc5e8977d5c97f550cec15f
|
| |
This patch provides a fully functional x86 trace JIT compiler for Dalvik
VM. It is built on top of the existing x86 fast interpreter
with bug fixes and the extensions needed to support the trace JIT interface. The
x86 trace JIT code generator was developed independently of the existing
template-based code generator and thus does not share exactly the same
infrastructure. Included in this patch are:
* Deprecated and removed the x86-atom fast interpreter that is no
longer functional since ICS.
* Augmented x86 fast interpreter to provide interfaces for x86 trace JIT
compiler.
* Added x86 trace JIT code generator with full JDWP debugging support.
* Method JIT and self-verification mode are not supported.
The x86 code generator uses the x86 instruction encoder/decoder library
from the Apache Harmony project. Additional wrapper extensions and bug
fixes were added to support the x86 trace JIT code generator. The x86
instruction encoder/decoder is embedded inside the x86 code generator
under the libenc subdirectory.
Change-Id: I241113681963a16c13a3562390813cbaaa6eedf0
Signed-off-by: Dong-Yuan Chen <dong-yuan.chen@intel.com>
Signed-off-by: Yixin Shou <yixin.shou@intel.com>
Signed-off-by: Johnnie Birch <johnnie.l.birch.jr@intel.com>
Signed-off-by: Udayan <udayan.banerji@intel.com>
Signed-off-by: Sushma Kyasaralli Thimmappa <sushma.kyasaralli.thimmappa@intel.com>
Signed-off-by: Bijoy Jose <bijoy.a.jose@intel.com>
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
Signed-off-by: Tim Hartley <timothy.d.hartley@intel.com>
|
| |\
| |
| |
| | |
Change-Id: I9c1f2e37602bea86e70333d2b274665e99fcbd92
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Change-Id: I9bb4f6875b7061d3ffaee73f204026cb8ba3ed39
Signed-off-by: Raghu Gandham <raghu@mips.com>
Signed-off-by: Chris Dearman <chris@mips.com>
Signed-off-by: Douglas Leung <douglas@mips.com>
Signed-off-by: Don Padgett <don@mips.com>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
See https://android-git.corp.google.com/g/#/c/157220
Also fix an occurrence of LOGW missed in an earlier change.
Bug: 5449033
Change-Id: I2e3b23839e6dcd09015d6402280e9300c75e3406
|
| |/
|
|
|
|
|
| |
See https://android-git.corp.google.com/g/156016
Bug: 5449033
Change-Id: Ic663376d1ad6a6cb14bf81405ad9afd247cf2f60
|
| |
A leading underscore followed by a capital letter is a reserved
name space in C and C++.
This change also moves any #include directives within the include
guard in some of the compiler/codegen/arm header files.
Change-Id: I9715e2c5301699d31886e61d0fe6e29483555a2a
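For illustration, a header laid out along these lines might look like the sketch below. The guard name and the declaration are hypothetical; they only show a non-reserved guard identifier with the #include directive placed inside the guard.

```cpp
// A guard such as _EXAMPLE_CODEGEN_H would use a reserved identifier
// (underscore followed by a capital letter). This one does not.
#ifndef EXAMPLE_CODEGEN_H_
#define EXAMPLE_CODEGEN_H_

#include <cstdint>  // includes sit inside the guard

// Hypothetical declaration, present only to give the header some content.
inline std::uint32_t exampleOpcodeMask() { return 0xffu; }

#endif  // EXAMPLE_CODEGEN_H_
```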
|
| |
|
|
| |
Change-Id: I9fb5d33f23ec7aeb2b9a3908d4125b34be0599ae
|
| |\
| |
| |
| | |
Change-Id: I99c4289bd34f63b0b970b6ed0fa992b44e805393
|
| | |
| |
| |
| | |
Change-Id: I9f9fe52bd4ceebb6dde48251a89190ba6bb00ce4
|
| | |
| |
| |
| | |
Change-Id: I236c5a1553a51f82c9bc3eaaab042046c854d3b4
|
| |/
|
|
| |
Change-Id: Idffbdb02c29e2be03a75f5a0a664603f2299504a
|
| |
Fix a few miscellaneous bugs from the interpreter restructuring that were
causing a segfault on debugger attach.
Added a sanity checking routine for debugging.
Fixed a problem in which the JIT's threshold and on/off switch
wouldn't get initialized properly on thread creation.
Renamed dvmCompilerStateRefresh() to dvmCompilerUpdateGlobalState() to
better reflect its function.
Change-Id: I5b8af1ce2175e3c6f53cda19dd8e052a5f355587
|
| |
This is a restructuring of the Dalvik ARM and x86 interpreters:
o Combine the old portstd and portdbg interpreters into a single
portable interpreter.
o Add debug/profiling support to the fast (mterp) interpreters.
o Delete old mechanism of switching between interpreters. Now, once
you choose an interpreter at startup, you stick with it.
o Allow JIT to co-exist with profiling & debugging (necessary for
first-class support of debugging with the JIT active).
o Adds single-step capability to the fast assembly interpreters without
slowing them down (and, in fact, measurably improves their performance).
o Remove old "polling for safe point" mechanism. Breakouts now achieved
via modifying base of interpreter handler table.
o Simplify interpreter control mechanism.
o Allow thread-granularity control for profiling & debugging.
The primary motivation behind this change was to improve the responsiveness
of debugging and profiling and to make it easier to add new debugging and
profiling capabilities in the future. Instead of always bailing out to the
slow debug portable interpreter, we can now stay in the fast interpreter.
A nice side effect of the change is that the fast interpreters
got a healthy speed boost because we were able to replace the
polling safepoint check that involved a dozen or so instructions
with a single table-base reload. When combined with the two earlier CLs
related to this restructuring, we show a 5.6% performance improvement
using libdvm_interp.so on the Checkers benchmark relative to Honeycomb.
Change-Id: I8d37e866b3618def4e582fc73f1cf69ffe428f3c
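The table-base idea can be sketched with a toy two-opcode interpreter. Everything here is hypothetical (type names, opcodes, counters); the real mterp interpreters are written in assembly, and in a real VM the table base would be changed by another thread. The sketch only shows that dispatch reloads the table base each time, so swapping the base redirects every opcode without a separate polling check.

```cpp
#include <cstddef>
#include <vector>

struct ToyInterp;
using Handler = void (*)(ToyInterp&);

struct ToyInterp {
    explicit ToyInterp(const Handler* base) : handlerBase(base) {}
    const Handler* handlerBase;  // reloaded on every dispatch
    std::size_t pc = 0;
    int executed = 0;            // instructions actually run
    int breakouts = 0;           // instructions routed via the alt table
    bool done = false;
};

// Toy opcodes: 0 = nop, 1 = halt.
void opNop(ToyInterp& s)  { ++s.executed; ++s.pc; }
void opHalt(ToyInterp& s) { ++s.executed; s.done = true; }

// Alternate handlers do debug/profiling bookkeeping, then the real work.
void altNop(ToyInterp& s)  { ++s.breakouts; opNop(s); }
void altHalt(ToyInterp& s) { ++s.breakouts; opHalt(s); }

const Handler kMainTable[2] = { opNop, opHalt };
const Handler kAltTable[2]  = { altNop, altHalt };

// Request a breakout by swapping the table base.
void requestBreakout(ToyInterp& s) { s.handlerBase = kAltTable; }

// The code must end with a halt opcode, otherwise pc runs off the end.
void run(ToyInterp& s, const std::vector<unsigned char>& code) {
    while (!s.done) {
        const Handler* base = s.handlerBase;  // single table-base reload
        base[code[s.pc]](s);
    }
}
```

Running the same code with and without requestBreakout() executes the same instructions; only the number of breakouts differs.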
|
| |
1) Split the original literal pool into class object literals and
constants. Elements in the class object pool have to match the specified
values perfectly (ie no +delta space optimizations) since they might be
relocated.
2) Implement dvmJitScanAllClassPointers(void (*callback)(void *))
which is the entry routine to report all memory locations in the code cache
that contain class objects (ie class object pool and predicted chaining
cells for virtual calls).
3) Major codegen changes on how/when the class object pool is populated
and how predicted chains are patched. Before this change the compiler
thread is always in the VM_WAIT state, which won't prevent GC from
running. Since the class object pointers captured by a worker thread
are no longer guaranteed to be stable at JIT time, change various
internal data structures to capture the class descriptor/loader
tuple instead. The conversion from descriptor/loader tuple to actual
class object pointers is only performed when the thread state is
RUNNING or at a GC safe point.
4) Separate the class object installation phase out of the main
dvmCompilerAssembleLIR routine so that the impact to blocking GC
requests is minimal. Add new stats to report the potential block time.
For example:
Potential GC blocked by compiler: max 46 us / avg 25 us
5) Various cleanup in the trace structure walkup code. Modified the
verbose print routine to show the class descriptor in the class literal
pool. For example:
D/dalvikvm( 1450): -------- end of chaining cells (0x007c)
D/dalvikvm( 1450): 0x44020628 (00b4): .class
(Lcom/android/unit_tests/PerformanceTests$EmptyClass;)
D/dalvikvm( 1450): 0x4402062c (00b8): .word (0xaca8d1a5)
D/dalvikvm( 1450): 0x44020630 (00bc): .word (0x401abc02)
D/dalvikvm( 1450): End
Bug: 3482956
Change-Id: I2e736b00d63adc255c33067544606b8b96b72ffc
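The callback-based scan in item 2 can be pictured with a toy pool. Only the callback shape, void (*)(void *), is taken from the message above; the pool layout, the toy names, and what the callback does with each location (here, simulating a relocation) are assumptions for illustration.

```cpp
#include <cstddef>

// Toy stand-in for a class object; not the Dalvik ClassObject.
struct ToyClass { const char* descriptor; };

// Toy stand-in for the class object literal pool in the code cache.
static ToyClass* gToyClassPool[4];
static std::size_t gToyClassPoolCount = 0;

// Report the address of every slot that holds a class pointer.
void toyScanAllClassPointers(void (*callback)(void*)) {
    for (std::size_t i = 0; i < gToyClassPoolCount; ++i) {
        callback(&gToyClassPool[i]);
    }
}

// Hypothetical callback: pretend gToyOld was moved to gToyNew and
// update any slot that still points at the old location.
static ToyClass gToyOld = { "LFoo;" };
static ToyClass gToyNew = { "LFoo;" };

void toyRelocate(void* slot) {
    ToyClass** p = static_cast<ToyClass**>(slot);
    if (*p == &gToyOld) {
        *p = &gToyNew;
    }
}
```

Because the callback receives the address of the slot rather than the pointer value, it can rewrite the slot in place, which is why the pool entries must hold exact pointer values rather than +delta encodings.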
|
| |
The key data structure for the interpreter is InterpState.
This change eliminates it, merging its data with the Thread structure.
Here's why:
In principio creavit Fadden Thread et InterpState. And it was good.
Thread holds thread-private state, while InterpState captures data
associated with a Dalvik interpreter activation. Because JNI calls
can result in nested interpreter invocations, we can have more than one
InterpState for each actual thread. InterpState was relatively small,
and it all worked well. It was used enough that in the Arm version
a register (rGLUE) was dedicated to it.
Then, along came the JIT guys, who saw InterpState as a convenient place
to dump all sorts of useful data that they wanted quick access to through
that dedicated register. InterpState grew and grew. In terms of
space, this wasn't a big problem - but it did mean that the initialization
cost of each interpreter activation grew as well. For applications
that do a lot of callbacks from native code into Dalvik, this is
measurable. It's also mostly useless cost because much of the JIT-related
InterpState initialization was setting up useful constants - things that
don't need to be saved and restored all the time.
The biggest problem, though, deals with thread control. When something
interesting is happening that needs all threads to be stopped (such as
GC and debugger attach), we have access to all of the Thread structures,
but we don't have access to all of the InterpState structures (which
may be buried/nested on the native stack). As a result, polling for
thread suspension is done via a one-indirection pointer chase. InterpState
itself can't hold the stop bits because we can't always find it, so
instead it holds a pointer to the global or thread-specific stop control.
Yuck.
With this change, we eliminate InterpState and merge all needed data
into Thread. Further, we replace the dedicated rGLUE register with a
pointer to the Thread structure (rSELF). The small subset of state
data that needs to be saved and restored across nested interpreter
activations is collected into a record that is saved to the interpreter
frame, and restored on exit. Further, these small records are linked
together to allow tracebacks to show nested activations. Old InterpState
variables that simply contain useful constants are initialized once at
thread creation time.
This CL is large enough by itself that the new ability to streamline
suspend checks is not done here - that will happen in a future CL. Here
we just focus on consolidation.
Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
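The save-and-restore scheme for nested activations can be sketched as a linked list of small records hanging off the thread. All names below are hypothetical, and the single saved field stands in for whatever subset of state the real change preserves.

```cpp
// Hypothetical record of state saved across a nested activation.
struct ToySaveRecord {
    ToySaveRecord* prev;  // link to the enclosing activation's record
    int savedPc;          // stand-in for the saved subset of state
};

// Hypothetical thread structure holding the live interpreter state.
struct ToyThread {
    int pc = 0;
    ToySaveRecord* activations = nullptr;  // innermost record first
};

// On entry: save the current state into a record owned by the caller's
// frame, link it in, then start the nested activation.
void enterInterp(ToyThread& self, ToySaveRecord& rec, int entryPc) {
    rec.savedPc = self.pc;
    rec.prev = self.activations;
    self.activations = &rec;
    self.pc = entryPc;
}

// On exit: restore the saved state and unlink the record.
void exitInterp(ToyThread& self) {
    ToySaveRecord* rec = self.activations;
    self.pc = rec->savedPc;
    self.activations = rec->prev;
}

// Walking the links is what lets a traceback show nested activations.
int nestingDepth(const ToyThread& self) {
    int depth = 0;
    for (const ToySaveRecord* r = self.activations; r != nullptr; r = r->prev) {
        ++depth;
    }
    return depth;
}
```

Since the list head lives in the thread structure, every activation is reachable from the thread, unlike state buried on the native stack.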
|
| |
Even though execute-inline is now a mandatory optimization, you can't be sure
the inline natives will be invoked that way. There's reflection and JNI, for
example, and there's the special case of String.equals that might be invoked
as Object.equals. This patch adds a regular native method corresponding to
each inline native, so that a corresponding libcore patch can drop its
implementations. (For example, despite the fact that we all believed last week
that the Java implementation of String.equals is never used, that turned out
not to be true: every HashMap lookup will have used it. This pair of patches
brings reality in line with our existing belief.)
Change-Id: I19e64c23bea83e91696206ca40ce4e3faf853040
|
| |
Closes a window in which the "interpret-only" template could get chained
to an existing trace while the intended translation was under construction.
Note that this CL also introduces some small, but fundamental changes in trace
formation:
1. Previously, when an exception or other trace terminating event
occurred during trace formation, the entire trace was abandoned. With this
change, we instead end the trace at the last successful instruction.
2. We previously allowed multiple attempts (perhaps by multiple threads)
to form a trace compilation request for a dalvik PC. This was done in an
attempt to allow recovery from compiler failures. Now we enforce a new rule:
only the thread that wins the race to allocate an entry in the JitTable will
form the trace request.
3. In a (probably misguided) attempt to avoid unnecessary contention, we
previously allowed work order enqueue requests to be dropped if a requester
did not acquire TableLock on the first attempt (assuming that if the trace were
hot, it would be requested again). Now we block on enqueue.
Change-Id: I40ea4f1b012250219ca37d5c40c5f22cae2092f1
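The "only the winner forms the trace request" rule in item 2 can be sketched with a single slot and a compare-and-swap. This is an assumption-laden illustration: the real JitTable is a hash table with its own entry layout and atomics, and the names below are hypothetical.

```cpp
#include <atomic>

// Hypothetical single table slot, keyed by a Dalvik PC.
struct ToyJitEntry {
    std::atomic<const void*> dPC{nullptr};
};

// Returns true only for the caller that claims the empty slot for pc.
// A caller that loses the race gets false and must not form a request.
bool claimEntry(ToyJitEntry& entry, const void* pc) {
    const void* expected = nullptr;
    return entry.dPC.compare_exchange_strong(expected, pc);
}
```

A thread that sees claimEntry() return false simply carries on interpreting, which removes the duplicate-request path described above.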
|
| |\
| |
| |
| |
| |
| |
| | |
entry point.
* commit 'af5aa1f4ce7eecc1b47a4c038cebb67d33f08f18':
Don't treat dvmJitToPatchPredictedChain as a Jit-to-Interp entry point.
|
| | |
| |
| |
| |
| |
| | |
It is just a native callout helper function.
Change-Id: I6398b6876f5ba579b76e732107157a4c99337796
|
| |\|
| |
| |
| |
| | |
* commit '0d1aac383a4bdce9feaad2f614df42234c2dcced':
Revert "Remove inline natives for an unused performance test."
|
| | |
| |
| |
| |
| |
| | |
This reverts commit 7ecd89dc02ce00c425788bd4989bdb6cde9a618a.
Change-Id: I427635b7e3f7be45cfde78b8046dab3b23b64562
|
| |\|
| |
| |
| |
| | |
* commit '7ecd89dc02ce00c425788bd4989bdb6cde9a618a':
Remove inline natives for an unused performance test.
|
| | |
| |
| |
| | |
Change-Id: I80cfb918bdf174aeb6de83909c840563f6b945dd
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Remove vestiges of code intended for linear scan register allocation
in the trace compiler. New plan is to stick with local allocation for
traces and build a new linear scan allocator for the method compiler.
Change-Id: Ic265ab5a7936b144cbe7fa4dc667fa7aba579045
|
| | |
| |
| |
| |
| |
| |
| | |
Nuked a void* cast warning and moved cacheflush into a target-specific
utility wrapper.
Change-Id: I36c841288b9ec7e03c0cb29b2e89db344f36fad1
|
| |/
| |
Experimental support for trace selection for x86 host mode operation.
Not enabled by default. Turned on by setting WITH_HOST_DALVIK true
and WITH_JIT true. When enabled, profiles during x86 fast interpreter
operation, selects hot traces and "compiles" traces consisting of jumps
back to the interpreter.
First in a series of experimental x86 support checkins.
Change-Id: I0e423ec58a7bf01f226cb486f55de2841fab1002
|
| |
kNumDalvikInstructions is now kNumPackedOpcodes, there is a new
kMaxOpcodeValue, and both are generated by opcode-gen.
Change-Id: Ic46f1f52d2d21382452c8e777024f4a985ad31d3
Bonus: Reworded the switch and array data comment for clarity.
|
| |
Similarly "Opcode" not "OpCode".
This appears to be the general worldwide consensus on the matter. Other
residents of my office didn't seem to mind one way or the other how it's
spelled in our code, but for whatever reason, it really bugged me.
Change-Id: Ia0b73d19c54aefc0f543a9c9451dda22ee876a59
|
| |
Also incorporate the former contents of OpCodeNames.h. This is a small
attempt to increase naming consistency in libdex. There will be a bit
more to come, in a follow-up.
Change-Id: Ia7ab06042dde2e19eda02ef1fee72fb4260e899d
|
| |
In particular, use it instead of just saying 256, and similarly for
255. The number of opcodes will be changing soon.
Change-Id: Icc77120c2673968dddd6b4003f717245d46e4159
|
| |
Slight reworking of the memory barrier instruction generation to
generalize it, and then add "dmb st" for the new return-void-barrier
instruction.
Change-Id: Iad95aa5b0ba9b616a17dcbe4c6ca2e3906bb49dc
|
|
|
Change-Id: Ic94a916e777e9bc5163cf205899daf9c18dcafe1
|