author Tobias Grosser <grosser@google.com> 2013-07-11 17:58:47 -0700
committer Tobias Grosser <grosser@google.com> 2013-07-15 14:07:20 -0700
commit da7ddd8477dc802c8736c7ab860fc09f33689ce9
tree aa2adccfb8659aeef55ae83ef2f1164699c4cb30 /cpp/Script.cpp
parent 574854bcb2eb25a85b9b52faf2fb3e743fa7aa14
Simplify code of convolve3x3
Instead of first doing all multiplications and then adding the results in a tree manner, we now repeatedly apply a load/multiply/add pattern. Both with and without tuning for A15, this yields a 5% performance increase on N10. This commit also exposes more instructions that can be transformed into fused multiply-adds.

Change-Id: I1215d75da236e6b2d6b6aa48b3ab35606cdba7b8
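
To illustrate the difference between the two evaluation orders described in the message, here is a minimal scalar sketch. It is not the code from this patch: the function names `convolve3x3_tree` and `convolve3x3_chain` are hypothetical, and it assumes a single float channel with the nine input pixels and nine coefficients already gathered into arrays. Whether a compiler actually emits fused multiply-adds for the chained form depends on the target and the floating-point contraction settings.

```cpp
#include <array>
#include <cstddef>

// Hypothetical illustration only; names and signatures are not from the patch.

// Tree form: compute all nine products first, then sum them pairwise.
float convolve3x3_tree(const std::array<float, 9>& px,
                       const std::array<float, 9>& coeff) {
    std::array<float, 9> p{};
    for (std::size_t i = 0; i < 9; ++i) {
        p[i] = px[i] * coeff[i];
    }
    float s01 = p[0] + p[1];
    float s23 = p[2] + p[3];
    float s45 = p[4] + p[5];
    float s67 = p[6] + p[7];
    return ((s01 + s23) + (s45 + s67)) + p[8];
}

// Chain form: one load/multiply/add step per tap. Each step has the
// shape acc = acc + a * b, which is the pattern a compiler may contract
// into a fused multiply-add when the target and flags allow it.
float convolve3x3_chain(const std::array<float, 9>& px,
                        const std::array<float, 9>& coeff) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < 9; ++i) {
        acc += px[i] * coeff[i];
    }
    return acc;
}
```

Both forms compute the same weighted sum. For general floating-point inputs the results can differ in the last bits because the additions are associated differently, and again if the chained form is contracted into fused operations.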