| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
Change-Id: I7b922e223fe1f5242d1f3db1fa18f54aaed725af
|
Merge commit '3d5d87364d062734753bfd26336e96a7e8d03360' into dalvik-dev
* commit '3d5d87364d062734753bfd26336e96a7e8d03360':
Fix for the JIT blocking mode plus some code cleanup.
|
Bug: 2517606
Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
|
Merge commit '900a3afd0e8e0d88426b21447d601ee67e17b642' into dalvik-dev
* commit '900a3afd0e8e0d88426b21447d601ee67e17b642':
Jit: Fix register usage bug - Issue 2518825 native crash running ARMv5te JIT
|
Change I8ca61804 added a call to dvmCanPutArrayElement for APUT_OBJECT,
but did so in a way that violated register usage restrictions. This change
tells the register allocation system what registers we expect to remain
live across the call to dvmCanPutArrayElement.
Change-Id: Icd83b888ba60768a196070d62d07d12c7a3c73c6
|
See [Issue 1633591] Volatile long/double accesses should be atomic.
Because we believe this to be a rare case, the Jit will just punt
to the interpreter for these.
Change-Id: Idd05b5acae9aa5ffa60941cba8533534a89c0ff8
|
Merge commit 'be6534f384529e51dfba5c3f1b7eb90c86b66e77' into dalvik-dev
* commit 'be6534f384529e51dfba5c3f1b7eb90c86b66e77':
Jit: Fix for [Issue 2487514] Dropped exception
|
The jit was failing to call dvmCanPutArrayElement for aput-object.
Change-Id: I8ca618048dc4d1be5b1f1ed85078759041883b09
|
Merge commit '4527387dd3b5c4dce7300c764805ffd0f3d22649' into dalvik-dev
* commit '4527387dd3b5c4dce7300c764805ffd0f3d22649':
Jit: Make debugging mode aware of inlineExecute/moveResult optimization
|
The Jit has a mode in which selected opcodes can be handled normally
or single-stepped in the interpreter. This was broken for cases in
which the Jit applied an optimization to fold inlineExecute/moveResult
instruction pairs into a single operation and the debug mode was set
to handle the two opcodes differently.
Change-Id: Ifa436d4ba66ba0c13ea366c0956e6cf92ce9cdfd
|
Merge commit 'fc519dc8f4444f6d93806ec15ce7445b322070fd' into dalvik-dev
* commit 'fc519dc8f4444f6d93806ec15ce7445b322070fd':
Jit: Make most Jit compile failures non-fatal; just abort offending translation
|
Issue 2175597 Jit compile failures should abort translation, but not the VM
Added new dvmCompileAbort() to replace uses of dvmAbort() when something goes
wrong during the compilation of a trace. In that case, we'll abort the translation
and set its head to the interpret-only "translation".
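
A minimal sketch of the "abort the translation, not the VM" idea, assuming a
setjmp/longjmp bail-out (the log does not say how dvmCompileAbort() is
implemented). Every name below (CompileContext, compile_abort, compile_trace,
and the two stand-in strings) is hypothetical:

```c
#include <setjmp.h>

/* Hypothetical per-compilation state; only holds the bail-out point. */
typedef struct {
    jmp_buf bail;
} CompileContext;

/* Stand-ins for a real translation and the interpret-only "translation". */
static const char NATIVE_CODE[] = "native-code";
static const char INTERPRET_ONLY[] = "interpret-only";

/* Abandon the current compilation instead of killing the whole process. */
static void compile_abort(CompileContext *ctx)
{
    longjmp(ctx->bail, 1);
}

/* Stand-in for code generation; bails out when the trace cannot be compiled. */
static const char *generate_code(CompileContext *ctx, int traceIsCompilable)
{
    if (!traceIsCompilable)
        compile_abort(ctx);
    return NATIVE_CODE;
}

/* Compile one trace; on failure, fall back to the interpret-only stand-in. */
static const char *compile_trace(int traceIsCompilable)
{
    CompileContext ctx;
    if (setjmp(ctx.bail) != 0)
        return INTERPRET_ONLY;   /* reached via compile_abort() */
    return generate_code(&ctx, traceIsCompilable);
}
```

With this shape, a failed compile costs only the one trace: the caller gets
the interpret-only result and the process keeps running.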
|
Merge commit 'f8069e844054d29f320a9ece29fc638a884bbf69' into dalvik-dev
* commit 'f8069e844054d29f320a9ece29fc638a884bbf69':
Collect more JIT stats in the assert build.
|
New stuff includes breakdown of callsite types (i.e. monomorphic vs polymorphic
vs monomorphic resolved to native), total time spent in JIT'ing, and average
JIT time per compilation.
Example output:
D/dalvikvm( 840): 4042 compilations using 1976 + 329108 bytes
D/dalvikvm( 840): Compiler arena uses 10 blocks (8100 bytes each)
D/dalvikvm( 840): Compiler work queue length is 0/36
D/dalvikvm( 840): size if 8192, entries used is 4137
D/dalvikvm( 840): JIT: 4137 traces, 8192 slots, 1099 chains, 40 thresh, Non-blocking
D/dalvikvm( 840): JIT: Lookups: 1128780 hits, 168564 misses; 179520 normal, 6 punt
D/dalvikvm( 840): JIT: noChainExit: 528464 IC miss, 194708 interp callsite, 0 switch overflow
D/dalvikvm( 840): JIT: Invoke: 507 mono, 988 poly, 72 native, 1038 return
D/dalvikvm( 840): JIT: Total compilation time: 2342 ms
D/dalvikvm( 840): JIT: Avg unit compilation time: 579 us
D/dalvikvm( 840): JIT: 3357 Translation chains, 97 interp stubs
D/dalvikvm( 840): dalvik.vm.jit.op = 0-2,4-5,7-8,a-c,e-16,19-1a,1c-23,26,28-29,2b-2f,31-3d,44-4b,4d-51,60,62-63,68-69,70-72,76-78,7b,81-82,84,87,89,8d-93,95-98,a1,a3,a6,a8-a9,b0-b3,b5-b6,bb-bf,c6-c8,d0,d2-d6,d8,da-e2,ee-f0,f2-fb,
D/dalvikvm( 840): Code size stats: 50666/105126 (compiled/total Dalvik), 329108 (native)
|
This adds four new instructions for accessing volatile wide fields (long
and double). The JLS requires that such accesses are atomic, but the
VM doesn't otherwise make guarantees about the atomicity of reads and
writes on 64-bit fields.
There are no behavioral changes. This just adds definitions for the new
instructions and a couple of tests. The current implementation is just
the non-volatile form of the instructions or a C stub, but since we're
not generating them it doesn't really matter yet.
Also:
- bumped Dalvik version to 1.3.0
- added a note to the x86-atom TODO list
For bug 1633591.
|
Re-enabled load/store motion that had inadvertently been turned off for
non-armv7 targets. Tagged memory references with the kind of memory
they touch (Dalvik frame, literal pool, heap) to enable more aggressive
load hoisting. Eliminated some largely duplicate code in the target
specific files. Reworked temp register allocation code to allocate next
temp round-robin (to improve scheduling opportunities).
Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and
measurable improvements for Passion.
|
Real changes:
1) Add a new entry point from JIT to the interpreter to request hot traces w/o
doing chaining.
2) Increase the granularity of the secondary profile filter to match 64-byte
chunks using 64 entries.
The remaining are just cosmetic changes.
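
One way to picture a 64-entry filter keyed on 64-byte chunks is sketched
below. This is an assumption-laden illustration, not the VM's code: the
linear lookup, the round-robin replacement policy, and all names
(ProfileFilter, filter_admit, demo_filter) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_ENTRIES 64                  /* 64 entries, per the log */
#define CHUNK_MASK (~(uintptr_t)63)        /* 64-byte chunks, per the log */

/* A zero-initialized filter is empty. */
typedef struct {
    uintptr_t key[FILTER_ENTRIES];  /* chunk address with bit 0 set as "valid" */
    size_t next;                    /* next slot to overwrite (round-robin) */
} ProfileFilter;

/*
 * Returns true if the chunk containing pc was already recorded, meaning the
 * trace request may go ahead. Otherwise records the chunk and returns false.
 */
static bool filter_admit(ProfileFilter *f, uintptr_t pc)
{
    uintptr_t key = (pc & CHUNK_MASK) | 1u;
    for (size_t i = 0; i < FILTER_ENTRIES; i++) {
        if (f->key[i] == key)
            return true;
    }
    f->key[f->next] = key;
    f->next = (f->next + 1) % FILTER_ENTRIES;
    return false;
}

/* Demo instance; static storage makes it start out empty. */
static ProfileFilter demo_filter;
```

Two addresses inside the same 64-byte chunk share one entry, so the second
request from a hot region passes while a first request from a new region is
only recorded.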
|
(I saw these the other day, but preferred a separate patch.)
|
Rather than make these changes in the libraries (*10 being a common case),
let's do them once and for all in the JIT.
The 2^n-1 case could be better if we generated RSB instructions, but the
current "fake" RSB is still better than a full multiply.
Thumb doesn't support reg/reg/reg/shift instructions, so we can't optimize
the "population count <= 2" cases (such as *10) there.
Tested on sholes, passion, and passion-running-sapphire (and visually
inspected to check we weren't trying to generate Thumb2 instructions there).
Also tested with the self-verifier.
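
The arithmetic behind the three cases can be sketched in C. This only shows
the shift/add/subtract identities; the JIT itself emits ARM instructions, and
all function names here are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when v has exactly one bit set. */
static bool is_power_of_two(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Index of the lowest set bit; v must be nonzero. */
static unsigned lowest_set_bit(uint32_t v)
{
    unsigned n = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* True when lit falls into one of the three shift-friendly cases. */
static bool can_reduce(uint32_t lit)
{
    return is_power_of_two(lit) ||
           is_power_of_two(lit + 1) ||
           is_power_of_two(lit & (lit - 1));
}

/* Multiply x by lit (mod 2^32), avoiding a real multiply where possible. */
static uint32_t mul_by_literal(uint32_t x, uint32_t lit)
{
    if (is_power_of_two(lit))                 /* 2^n: a single shift */
        return x << lowest_set_bit(lit);
    if (is_power_of_two(lit + 1))             /* 2^n - 1: shift, then subtract */
        return (x << lowest_set_bit(lit + 1)) - x;
    uint32_t rest = lit & (lit - 1);          /* lit with its lowest bit cleared */
    if (is_power_of_two(rest))                /* two bits set: shift plus shifted add */
        return (x << lowest_set_bit(lit)) + (x << lowest_set_bit(rest));
    return x * lit;                           /* no cheap form: full multiply */
}
```

For *10 (binary 1010) the third case applies: x*10 = (x << 1) + (x << 3),
which is the "population count <= 2" form mentioned above.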
|
Two problems with monitor-exit:
1. The Jit code wasn't checking for exception thrown following
unlocks of fat locks using dvmUnlockObject().
2. The mterp interpreter unlock code branched to handle exceptions
thrown during dvmUnlockObject() with the wrong dalvik PC (the
dPC of the unlock, rather than the instruction following the unlock).
Similar issue with the x86 interpreter fixed. Also, deleted armv7-a
MONITOR_ENTER template, which turned out to be identical to the armv5te
one.
|
Renaming of all of those register utilities which used to be local because
of our include mechanism to the standard dvmCompiler prefix scheme.
|
|
1) Patching requests for predicted chaining cells (used by virtual/interface
methods) are now batched in a queue and processed when the VM is paused for GC.
2) When the code cache is full the reset operation is also conducted at the
end of GC pauses, so this totally eliminates the need for the compiler thread
to issue suspend-all requests. This is a very rare event, and when it happens it
takes less than 5ms to finish.
3) Change the initial value of the branch in a predicted chaining cell from 0
(ie lsl r0, r0, #0) to 0xe7fe (ie branch to self) so that initializing a
predicted chaining cell doesn't need to suspend all threads. Together with 1)
seeing 20% speedup on some benchmarks.
4) Add TestCompability.c where defining "TEST_VM_IN_ECLAIR := true" in
buildspec.mk will activate dummy symbols needed to run libdvm.so in older
releases.
Bug: 2397689
Bug: 2396513
Bug: 2331313
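
Items 1) and 3) can be pictured with a small queue that is filled while
threads run and drained only during a GC pause. This is a simplified sketch
under stated assumptions: the capacity, the use of a 32-bit word for the
cell, and all names (PatchQueue, enqueue_patch, apply_patches_at_gc_pause,
demo_cell, demo_queue) are hypothetical, and real code would also need
locking and an instruction-cache flush.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CELL_BRANCH_INIT 0xe7feu   /* "branch to self", per the log */
#define PATCH_QUEUE_SIZE 8         /* hypothetical capacity */

typedef struct {
    uint32_t *cell;     /* branch word of the predicted chaining cell */
    uint32_t branch;    /* new branch encoding to install */
} PatchRequest;

typedef struct {
    PatchRequest req[PATCH_QUEUE_SIZE];
    size_t count;
} PatchQueue;

/* Record a patch request without touching the code; false if the queue is full. */
static bool enqueue_patch(PatchQueue *q, uint32_t *cell, uint32_t branch)
{
    if (q->count == PATCH_QUEUE_SIZE)
        return false;
    q->req[q->count].cell = cell;
    q->req[q->count].branch = branch;
    q->count++;
    return true;
}

/* Called while all threads are paused for GC: apply and clear every request. */
static size_t apply_patches_at_gc_pause(PatchQueue *q)
{
    size_t applied = q->count;
    for (size_t i = 0; i < q->count; i++)
        *q->req[i].cell = q->req[i].branch;
    q->count = 0;
    return applied;
}

/* Demo state: one freshly initialized cell and an empty queue. */
static uint32_t demo_cell = CELL_BRANCH_INIT;
static PatchQueue demo_queue;
```

Until the queue is drained the cell keeps its branch-to-self value, so no
thread can run through a half-patched cell and no extra suspend-all is
needed.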
|
The Jit must stop all threads in order to flush the translation cache (and
other tables). Threads which are blocked in a monitor wait cause some
headache here because they effectively hold a reference to the translation
cache (through the return address on the native stack). The new model
introduced in this CL is that for the fast path of monitor enter, control
is allowed to resume in the translation cache. However, if we need to do a
heavyweight lock (which may cause us to block) control does not return to the
translation cache but instead bails out to the interpreter. This allows us to
safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
Defer initialization of the jit to support an upcoming feature that waits
until the first screen is painted before starting, in order to avoid wasting
effort on jit'ing initialization code. A timed delay is in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the
value of gDvmJit.pJitTable and then dvmResumeAllThreads().
Each time a thread goes through the heavyweight check suspend path, returns
from a monitor lock/unlock or returns from a JNI call, it will refresh
its on/off state.
Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize
Make all work order enqueue actions non-blocking, which includes adding
a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2
instruction rather than the byte form.
Minor comment changes.
|
Add a new flag in the Thread struct to track the whereabouts of the top frame
in each Java thread. It is not safe to blow away the code cache if any thread
is in the JIT'ed land.
|
Bug: 2369821
There are 12 bytes of additional code after the 65th chaining cell. So if a
switch statement with more than that many cases is translated by the JIT, it
will run fine until the next unchaining event, which will patch the wrong code
and lead to all kinds of unexpected crashes.
|
Because the code cache may now be wiped out after safe points, the patching of
the inline cache for predicted chains is done through the compiler thread's work
queue.
|
an object. Invert the meaning of the shape bit to match the encoding
scheme described in Bacon's paper. Consequently, monitor pointers
must have the lower 3 bits stripped before they may be dereferenced.
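
A sketch of what stripping the low bits looks like, assuming the shape bit is
bit 0 with 1 meaning "fat" and the other two low bits carry unrelated state.
That layout and every name below are assumptions for illustration; the log
only says the lower 3 bits must be cleared before dereferencing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the real monitor structure. */
typedef struct Monitor {
    int lockCount;
} Monitor;

#define SHAPE_MASK    ((uintptr_t)0x1)  /* assumed: bit 0 holds the shape */
#define SHAPE_FAT     ((uintptr_t)0x1)  /* assumed: 1 = fat, 0 = thin */
#define LOW_BITS_MASK ((uintptr_t)0x7)  /* the lower 3 bits, per the log */

/* Pack a monitor pointer and two assumed state bits into a fat lock word. */
static uintptr_t make_fat_lock_word(Monitor *mon, unsigned state)
{
    assert(((uintptr_t)mon & LOW_BITS_MASK) == 0);  /* needs 8-byte alignment */
    assert(state < 4);
    return (uintptr_t)mon | ((uintptr_t)state << 1) | SHAPE_FAT;
}

/* True when the lock word refers to a monitor rather than a thin lock. */
static bool is_fat(uintptr_t lockWord)
{
    return (lockWord & SHAPE_MASK) == SHAPE_FAT;
}

/* Recover the monitor pointer by clearing the lower 3 bits. */
static Monitor *lock_word_monitor(uintptr_t lockWord)
{
    assert(is_fat(lockWord));
    return (Monitor *)(lockWord & ~LOW_BITS_MASK);
}
```

Dereferencing the raw lock word would read from an address that is off by up
to 7 bytes, which is why the mask has to be applied first.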
|
OP_MOVE_EXCEPTION handler was neglecting to reset.
Blocking mode was failing to signal empty queue in some cases
Self-cosim was including operations in traces that can't be done twice
Added OP_MOVE_EXCEPTION to self cosim's no-replay ops (it has side effects)
Restored threshold of 1 to self-cosim (now able to boot device with self-cosim)
When threshold < 6, disable 2nd-level translation filter
|
The original Codegen.c is broken into three components:
- CodegenCommon.c (arch-independent)
- CodegenFactory.c (Thumb1/2 dependent)
- CodegenDriver.c (Dalvik dependent)
The Thumb and Thumb2 directories each contain the following three files:
- Factory.c (low-level routines for instruction selections)
- Gen.c (invoke the ISA-specific instruction selection routines)
- Ralloc.c (arch-dependent register pools)
The FP directory contains FP-specific codegen routines depending on
Thumb/Thumb2/VFP/PortableFP:
- Thumb2VFP.c
- ThumbVFP.c
- ThumbPortableFP.c
Then the hierarchy is formed by stacking these files in the following top-down
order:
1 CodegenCommon.c
2 Thumb[2]/Factory.c
3 CodegenFactory.c
4 Thumb[2]/Gen.c
5 FP stuff
6 Thumb[2]/Ralloc.c
7 CodegenDriver.c
|