
2023-03-09

2023-03-09 14:03:51 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64] [arm32] [arm64]: Inline F_DUP and F_SWAP with arg1 != 0.

Inlines most cases of F_DUP and F_SWAP on amd64, arm32 and arm64.

NB: Not complete. F_SWAP with arg1 != 0 is NOT inlined on arm32.

2023-03-09 11:33:03 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Parameterized the opcodes F_DUP and F_SWAP.

They now take an offset from the old implicit argument.
An offset of 0 gives the old operation.

These are useful for e.g. duplicating the lvalue at the top of the stack:

F_DUP(1)
F_DUP(1)

Previously this required much more complex code:

F_DUP
F_REARRANGE(1, 2)
F_SWAP
F_DUP
F_REARRANGE(1, 3)
F_SWAP
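
As a rough illustration of the parameterized form, the following C sketch models F_DUP(n) on a toy stack of integers. It assumes that the offset counts slots below the old implicit operand (the top of the stack); `toy_stack`, `toy_push`, `toy_dup` and `toy_dup_lvalue` are hypothetical names and not part of the Pike source.

```c
#include <stddef.h>

/* Toy value stack; the real interpreter stack holds svalues, not ints. */
typedef struct {
    int slots[16];
    size_t depth;
} toy_stack;

void toy_push(toy_stack *s, int v) {
    s->slots[s->depth++] = v;
}

/* Assumed meaning of F_DUP(n): push a copy of the entry n slots below
 * the top. With n == 0 this is the old F_DUP (duplicate the top). */
void toy_dup(toy_stack *s, size_t n) {
    int v = s->slots[s->depth - 1 - n];
    toy_push(s, v);
}

/* Duplicate a two-slot lvalue: F_DUP(1) followed by F_DUP(1). */
void toy_dup_lvalue(toy_stack *s) {
    toy_dup(s, 1);
    toy_dup(s, 1);
}
```

Under that assumption, a stack holding `a b` becomes `a b a` after the first F_DUP(1) and `a b a b` after the second, which is the duplicated lvalue.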

2022-11-20

2022-11-20 12:43:50 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Work in progress: Sakura master

2022-09-29

2022-09-29 13:14:58 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Add optional validation that %rsp is properly aligned.

Add new generic configure option `--with-experimental` intended
to be used to enable various experimental and/or debug code.
Exactly what `--with-experimental` enables is NOT intended to
be stable, and may be changed at any time. It may also have no
effect at all.
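
The log does not show the emitted validation code, so the following is only a sketch of the condition such a check would test. It assumes the usual System V x86-64 rule that %rsp is a multiple of 16 at each call; `toy_rsp_is_aligned` and `toy_rsp_padding` are hypothetical helper names.

```c
#include <stdint.h>

/* The System V x86-64 ABI expects %rsp to be a multiple of 16 at a call. */
#define TOY_STACK_ALIGNMENT 16u

/* Returns 1 when the given stack pointer value is 16-byte aligned. */
int toy_rsp_is_aligned(uintptr_t rsp) {
    return (rsp & (TOY_STACK_ALIGNMENT - 1)) == 0;
}

/* Number of bytes to subtract from rsp to reach the next lower
 * aligned address (0 when it is already aligned). */
uintptr_t toy_rsp_padding(uintptr_t rsp) {
    return rsp & (TOY_STACK_ALIGNMENT - 1);
}
```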

2022-09-29 13:14:24 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Add optional validation that %rsp is properly aligned.

Add new generic configure option `--with-experimental` intended
to be used to enable various experimental and/or debug code.
Exactly what `--with-experimental` enables is NOT intended to
be stable, and may be changed at any time. It may also have no
effect at all.

2022-09-22

2022-09-22 12:40:17 by Tobias S. Josefowitz <tobij@tobij.de>

Compiler [amd64]: Keep stack alignment before calling C code

GCC 8 started to emit movaps instructions with (%RSP) as destination,
leading to GPF in case it was not properly aligned.

Backport of the remainder of the fix since it is now relevant.

2022-09-22 09:35:40 by Martin Karlgren <marty@roxen.com>

Add --with-mc-stack-frames configure option. (Currently X86-64 only.)

This will enable frame pointers in machine code, thereby allowing e.g.
Linux perf to unwind the stack and get proper stack traces including
Pike functions.

2022-09-22 09:35:39 by Martin Karlgren <marty@roxen.com>

Inline the F_CATCH opcode (on AMD64 so far).

This is a prerequisite for MACHINE_CODE_STACK_FRAMES, since
inter_return_opcode_F_CATCH will "inject" itself on the C stack when the first
F_CATCH opcode is encountered (and won't vanish until inter return, which may
occur in an outer Pike frame).

2022-03-09

2022-03-09 10:44:42 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge branch 'patches/amd64-broken-debug-F_XOR_INT' into 8.0

* patches/amd64-broken-debug-F_XOR_INT:
Compiler [amd64]: Fix indexing out of bounds for F_XOR_INT --with-debug.

2022-03-09 10:44:16 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge branch 'patches/amd64-broken-debug-F_XOR_INT'

* patches/amd64-broken-debug-F_XOR_INT:
Compiler [amd64]: Fix indexing out of bounds for F_XOR_INT --with-debug.

2022-03-09 10:42:19 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fix indexing out of bounds for F_XOR_INT --with-debug.

2020-02-09

2020-02-09 15:20:28 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Disassembler [amd64]: Added a few more opcodes.

2019-09-14

2019-09-14 09:43:33 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Inline F_SWAP_STACK_LOCAL.

2019-09-03

2019-09-03 09:23:22 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler and runtime: Added byte codes F_PUSH_CATCHES and F_CATCH_AT.

These are needed to be able to save and restore the recovery context
for generator functions.

Updates the code generators for quite a few machine code backends.

2019-09-03 09:16:04 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Fixed typo in PIKE_DEBUG code.

2019-09-02

2019-09-02 15:27:46 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Increase the maximum number of bytecode opcodes to 512.

Adds F_INSTR_PREFIX_256.

We were very close to the opcode limit...
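
One common way to grow a one-byte opcode space is a prefix byte that adds 256 to the opcode that follows. The sketch below assumes F_INSTR_PREFIX_256 works along those lines; the byte value `0xFF` and the `toy_` function names are made up for the example and are not taken from the Pike source.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PREFIX_256 0xFFu  /* hypothetical byte value for the prefix */

/* Encodes op into out. Returns the number of bytes written (1 or 2),
 * or 0 if op cannot be encoded in this toy scheme. */
size_t toy_encode_opcode(unsigned op, uint8_t *out) {
    if (op < TOY_PREFIX_256) {
        out[0] = (uint8_t)op;
        return 1;
    }
    if (op >= 256 && op < 512) {
        out[0] = TOY_PREFIX_256;
        out[1] = (uint8_t)(op - 256);
        return 2;
    }
    return 0;
}

/* Decodes one opcode and reports how many bytes it occupied. */
unsigned toy_decode_opcode(const uint8_t *in, size_t *consumed) {
    if (in[0] == TOY_PREFIX_256) {
        *consumed = 2;
        return 256u + in[1];
    }
    *consumed = 1;
    return in[0];
}
```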

2019-08-29

2019-08-29 09:41:48 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Support F_MARK_AT in generators.

2019-08-23

2019-08-23 13:52:01 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [WIP]: Experimental implementation of support for generators.

NB: The code is known to be broken, and the approach is likely to be
changed.

WORK IN PROGRESS!

DO NOT USE!

Do NOT merge into any branches that are likely to be merged into main line!

2019-07-22

2019-07-22 16:09:21 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Build [x86_64]: Fixed typo in previous commit.

2019-07-22 15:05:30 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Build [x86_64]: Fixed call of string_builder_append_disassembly().

Fall out from the recent API change.

2019-05-04

2019-05-04 09:12:19 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/master' into new_utf8

2019-03-19

2019-03-19 12:33:55 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge commit '722771973bd' into patches/lyslyskom22891031

* commit '722771973bd': (6177 commits)
Verify that callablep responses are aligned with reality.
...

2019-03-18

2019-03-18 22:27:31 by Tobias S. Josefowitz <tobij@tobij.de>

Compiler: Silence compiler warnings

GCC 8 got more picky about function pointer signatures, but there is
really no need to let those warnings bother us.

2019-03-14

2019-03-14 10:39:03 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge commit '2470270f500c728d10b8895314d8d8b07016e37b' into grubba/typechecker-automap

* commit '2470270f500c728d10b8895314d8d8b07016e37b': (18681 commits)
Removed the old typechecker.
...

2018-11-04

2018-11-04 16:11:11 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/master' into new_utf8

2018-11-03

2018-11-03 14:21:37 by Marcus Comstedt <marcus@mc.pp.se>

Merge remote-tracking branch 'origin/8.1' into gobject-introspection

2018-10-16

2018-10-16 20:25:40 by Tobias S. Josefowitz <tobij@tobij.de>

Compiler [amd64]: Keep stack alignment before calling C code

GCC 8 started to emit movaps instructions with (%RSP) as destination,
leading to GPF in case it was not properly aligned.

2018-10-16 20:23:20 by Tobias S. Josefowitz <tobij@tobij.de>

Compiler [amd64]: Keep stack alignment before calling C code

GCC 8 started to emit movaps instructions with (%RSP) as destination,
leading to GPF in case it was not properly aligned.

2018-04-02

2018-04-02 13:21:42 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Build [amd64]: Fixed warning.

2018-03-30

2018-03-30 16:06:39 by Arne Goedeke <el@laramies.com>

Interpreter: fixed handling of SAVE_LOCALS bitmask

Since the introduction of save_locals_bitmask, expendible_offset was
never set. Also since the handling of expendible_offset and
save_locals_bitmask were handled by the same case, the code was broken.

During pop entries handling of the save_locals bitmask could lead
to situations where locals above expendible_offset were 'copied' into
the trampoline frame. Those locals could have already been popped from
the stack by the RETURN_LOCAL opcode.

Also slightly refactored the code to not allocate more space for locals
than needed and removed some unnecessary casts.

This became visible and could lead to crashes when building for 32bit
on 64bit x86 machines.

2018-02-15

2018-02-15 15:54:26 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge commit '75c9d1806f1a69ca21c27a2c2fe1b4a6ea38e77e' into patches/pike63

* commit '75c9d1806f1a69ca21c27a2c2fe1b4a6ea38e77e': (19587 commits)
...

2018-02-12

2018-02-12 21:49:35 by Marcus Comstedt <marcus@mc.pp.se>

Fix spelling of FALLTHRU directive

The non-standard spelling "FALL_THROUGH" is not recognized by gcc 7.3.
Also, the comment must not contain any other text, or be placed inside
braces.
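
A minimal sketch of the placement described above, using a made-up function `toy_classify`: the marker comment contains only the directive and sits directly before the next `case` label, outside any braces.

```c
/* Returns 3 for c == 2, 1 for c == 1, and 0 otherwise. */
int toy_classify(int c) {
    int score = 0;
    switch (c) {
    case 2:
        score += 2;
        /* FALLTHRU */
    case 1:
        score += 1;
        break;
    default:
        break;
    }
    return score;
}
```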

2017-11-18

2017-11-18 10:13:13 by Arne Goedeke <el@laramies.com>

Interpreter: merge low_return variants

2017-11-05

2017-11-05 15:53:18 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Merge branch 'grubba/rename_lfun_destroy' into 8.1

* grubba/rename_lfun_destroy:
Modules: Fixed lots of warnings.
Testsuite: Updated for LFUN::_destruct().
Compiler: Don't complain about LFUN::destroy() in compat mode.
Fix multiple warnings.
Runtime: LFUN::destroy() has been renamed to _destruct().
Compiler: Rename LFUN::destroy() to LFUN::_destruct().

2017-11-05 14:35:39 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Rename LFUN::destroy() to LFUN::_destruct().

As decided at Pike Conference 2017.

2017-10-09

2017-10-09 18:08:07 by Martin Karlgren <marty@roxen.com>

X86-64: Check C stack margin before adding stub stack frames.

2017-10-07

2017-10-07 21:04:32 by Martin Karlgren <marty@roxen.com>

Merge branch 'marty/call_frames' into 8.1

This introduces the --with-mc-stack-frames configure option, which will
instruct the machine code generator to insert proper stack frames (currently
only supported on X86-64). This is useful for profiling, especially in
combination with Debug.generate_perf_map() on Linux.

2017-09-13

2017-09-13 04:58:53 by Arne Goedeke <el@laramies.com>

Compiler: do not modify instrs array

2017-07-17

2017-07-17 15:30:05 by Martin Nilsson <nilsson@fastmail.com>

unsigned INT64 -> UINT64

2017-03-19

2017-03-19 15:41:08 by Arne Goedeke <el@laramies.com>

Interpreter: merge low_return variants

2017-02-21

2017-02-21 20:49:34 by Martin Karlgren <marty@roxen.com>

Add --with-mc-stack-frames configure option. (Currently X86-64 only.)

This will enable frame pointers in machine code, thereby allowing e.g.
Linux perf to unwind the stack and get proper stack traces including
Pike functions.

2017-02-21 20:41:13 by Martin Karlgren <marty@roxen.com>

Inline the F_CATCH opcode (on AMD64 so far).

This is a prerequisite for MACHINE_CODE_STACK_FRAMES, since
inter_return_opcode_F_CATCH will "inject" itself on the C stack when the first
F_CATCH opcode is encountered (and won't vanish until inter return, which may
occur in an outer Pike frame).

2017-02-18

2017-02-18 10:44:15 by Arne Goedeke <el@laramies.com>

Interpreter: merge low_return variants

2017-02-09

2017-02-09 17:34:20 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Moved yyreport() et al to pike_compiler.cmod.

More code cleanup.

2017-01-29

2017-01-29 22:18:08 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Disassembler now supports narrow registers.

2017-01-24

2017-01-24 10:56:48 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Use gdb-style disassembly syntax.

Fixes multiple disassembly syntax and argument ordering issues.

2017-01-23

2017-01-23 15:03:26 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Use string_builder_append_disassembly().

Improved formatting of disassembler output by using the new function.

2017-01-21

2017-01-21 07:11:42 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed disassembler table typo.

2017-01-20

2017-01-20 13:48:15 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed some invalid constants in the disassembler.

2017-01-19

2017-01-19 10:20:00 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed some disassembler lookup table typos.

2017-01-18

2017-01-18 16:46:25 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Added some more sub-opcodes to the disassembler.

2017-01-18 16:34:46 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Support multi-byte opcodes in disassembler.

2017-01-17

2017-01-17 16:12:07 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Support jmp instructions in disassembler.

Also fixes argument order for some instructions.

2017-01-15

2017-01-15 18:06:09 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: More disassembler opcodes.

2017-01-14

2017-01-14 16:12:44 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Multiple fixes of the disassembler.

2017-01-13

2017-01-13 17:46:51 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: First go at implementing DISASSEMBLE_CODE().

Work in progress; disassembly of SIB not yet implemented,
argument order is not correct in some cases, but it does disassemble
some opcodes correctly.

2016-09-05

2016-09-05 19:51:21 by Arne Goedeke <el@laramies.com>

AMD64: use Pike_fp->current_storage

2016-07-29

2016-07-29 07:56:51 by Arne Goedeke <el@laramies.com>

AMD64: reimplement F_PROTECT_STACK

2016-07-27

2016-07-27 18:12:46 by Arne Goedeke <el@laramies.com>

Interpreter: simplify struct pike_frame

- expendibles and save_sp are usually only used during setup and frame
deallocation. It is enough to store them as offsets from 'locals'
- reordered the struct entries to avoid some padding

2016-06-22

2016-06-22 21:49:35 by Martin Nilsson <nilsson@fastmail.com>

Silence warning

2016-06-14

2016-06-14 09:20:16 by Arne Goedeke <el@laramies.com>

Compiler: reimplement F_PROTECT_STACK

2016-06-14 09:13:37 by Arne Goedeke <el@laramies.com>

Interpreter: store save_sp and expendibles as offsets

2016-06-13

2016-06-13 12:55:22 by Per Hedbor <ph@opera.com>

Unused version of mov_imm_mem16 for fast calls branch.

2016-06-12

2016-06-12 08:35:52 by Arne Goedeke <el@laramies.com>

Interpreter: store save_sp and expendibles as offsets

2016-04-09

2016-04-09 16:47:42 by Martin Nilsson <nilsson@fastmail.com>

unsigned INT64 -> UINT64

2016-01-12

2016-01-12 17:40:59 by Per Hedbor <ph@opera.com>

F_LOOP: Use <= 0, not == 0
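
The commit message gives no further detail. Assuming F_LOOP counts a loop counter down and exits once it is exhausted, the sketch below shows why the comparison matters: with an `== 0` exit test a negative initial count never terminates on its own. Both functions and the `cap` safety limit are hypothetical.

```c
/* Count-down loop that only exits on count == 0. The cap parameter
 * stops the demonstration from running away. */
long toy_loop_eq(long count, long cap) {
    long n = 0;
    while (n < cap && count != 0) {
        count--;
        n++;
    }
    return n;
}

/* Count-down loop that exits on count <= 0. */
long toy_loop_le(long count, long cap) {
    long n = 0;
    while (n < cap && count > 0) {
        count--;
        n++;
    }
    return n;
}
```

With a count of -1, `toy_loop_eq` runs until it hits the cap, while `toy_loop_le` performs no iterations.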

2016-01-12 17:39:35 by Per Hedbor <ph@opera.com>

F_LOOP: Use <= 0, not == 0

2015-10-14

2015-10-14 19:41:28 by Martin Nilsson <nilsson@fastmail.com>

Removed Intel IA64 compiler specific DO_NOT_WARN.

2015-09-07

2015-09-07 17:12:00 by Per Hedbor <ph@opera.com>

[amd64] Fixed compilation of F_CONSTANT in decoded programs.

Note that the code used for decode_value decoded programs is
significantly slower than the one that is generated for programs
compiled from pike code.

This is sort of unfortunate since most modules are dumped.

2015-09-06

2015-09-06 10:32:49 by Per Hedbor <ph@opera.com>

Bypass the constant table, push value directly

2015-09-03

2015-09-03 22:30:12 by Per Hedbor <ph@opera.com>

[amd64] Bypass the string table and push the string directly

2015-08-06

2015-08-06 21:23:39 by Per Hedbor <ph@opera.com>

Today's micro-optimization: avoid re-doing overflowing multiplications.

Use the full 128-bit result from imul. Not really all that much of an
improvement in most cases, presumably.

The same thing can however also be done for + and -, at a minimum.

In general more mpz operations could be inlined where it makes sense
(the operators, mainly).
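
The idea of using the full double-width product can be sketched in C with the `__int128` type, which is a GCC/Clang extension and an assumption of this example (the generated machine code uses imul directly). `toy_mul_overflows` is a hypothetical name.

```c
#include <stdint.h>

/* Multiplies a and b once at 128-bit width. Returns 1 when the product
 * does not fit in 64 bits; *low always receives the low 64 bits. */
int toy_mul_overflows(int64_t a, int64_t b, int64_t *low) {
    __int128 wide = (__int128)a * (__int128)b;
    /* Going through uint64_t keeps the truncation step explicit. */
    *low = (int64_t)(uint64_t)wide;
    /* Overflow happened if sign-extending the low half changes the value. */
    return wide != (__int128)*low;
}
```

A caller that sees the overflow flag could hand the already computed wide product to the bignum path instead of multiplying again.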

2015-07-07

2015-07-07 13:46:37 by Arne Goedeke <el@laramies.com>

Compiler [amd64]: reload sp_reg after call into c code

The stack pointer needs to be reloaded after calling F_LOOP. Otherwise,
since the F_LOOP opcode function changes the stack pointer, it might be
overwritten with the wrong value before calling a subsequent opcode
function.

2015-07-07 13:33:43 by Arne Goedeke <el@laramies.com>

Compiler [amd64]: reload sp_reg after call into c code

The stack pointer needs to be reloaded after calling F_LOOP. Otherwise,
since the F_LOOP opcode function changes the stack pointer, it might be
overwritten with the wrong value before calling a subsequent opcode
function.

2015-05-25

2015-05-25 15:33:02 by Martin Nilsson <nilsson@opera.com>

Removed trailing spaces.

2015-05-25 14:56:10 by Martin Nilsson <nilsson@opera.com>

Normalized file ends.

2015-04-25

2015-04-25 12:07:44 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Added a few fall-through markers.

Fixes [CID 1294640].

2015-02-19

2015-02-19 12:45:32 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fix bug in F_FOREACH.

The initial foreach counter may be set to non-zero when foreach goes
over a ranged array. If the initial foreach counter is larger than
the size of the array F_FOREACH started indexing outside the array.

Fixes [bug 7426 (#7426)].

FIXME: Is there a corresponding problem with negative ranges?

2015-02-19 12:45:13 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fix bug in F_FOREACH.

The initial foreach counter may be set to non-zero when foreach goes
over a ranged array. If the initial foreach counter is larger than
the size of the array F_FOREACH started indexing outside the array.

Fixes [bug 7426 (#7426)].

FIXME: Is there a corresponding problem with negative ranges?

2014-12-06

2014-12-06 18:36:04 by Per Hedbor <ph@opera.com>

Minimal optimization of mov_imm_reg

2014-12-06 18:07:39 by Per Hedbor <ph@opera.com>

Minimal optimization of mov_imm_reg

2014-12-06 17:25:55 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed F_*CALL_BUILTIN* --with-debug.

The use of ins_debug_instr_prologue() zapped ARG1_REG for at least
F_MARK_CALL_BUILTIN.

2014-12-06 17:24:44 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed F_*CALL_BUILTIN* --with-debug.

The use of ins_debug_instr_prologue() zapped ARG1_REG for at least
F_MARK_CALL_BUILTIN.

2014-12-05

2014-12-05 15:40:48 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed code generator for INC/DEC.

Fixes [bug 7384 (#7384)].

2014-12-05 15:40:14 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed code generator for INC/DEC.

Fixes [bug 7384 (#7384)].

2014-12-04

2014-12-04 19:27:19 by Per Hedbor <ph@opera.com>

Added F_UNDEFINEDP as an opcode

Also added a convenience SVAL(X) macro that returns the offset to
different parts of the svalue X values from the base register.

Also added some usage of the SVAL function in a few places.

Changed to 32-bit arithmetic for types in a few locations. It saves
on code size, if nothing else (no REX prefix unless R8.. is used)
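
The size difference comes from x86-64 instruction encoding: a 64-bit operand size needs a REX.W prefix byte, while a 32-bit operation on the low eight registers needs no prefix at all. The sketch below encodes `test reg, reg` both ways; `toy_emit_test_reg` is a hypothetical helper, not the emitter used in the code generator.

```c
#include <stddef.h>
#include <stdint.h>

/* Emits "test reg, reg". reg is 0-15; wide selects the 64-bit form.
 * Returns the number of bytes written to out. */
size_t toy_emit_test_reg(uint8_t *out, int reg, int wide) {
    size_t n = 0;
    uint8_t rex = 0;
    if (wide)
        rex |= 0x48;            /* REX.W: 64-bit operand size */
    if (reg >= 8)
        rex |= 0x45;            /* REX.R and REX.B: registers r8-r15 */
    if (rex)
        out[n++] = rex;
    out[n++] = 0x85;            /* TEST r/m32, r32 (or 64-bit with REX.W) */
    out[n++] = (uint8_t)(0xC0 | ((reg & 7) << 3) | (reg & 7));
    return n;
}
```

For register 0 this yields the two bytes `85 C0` in the 32-bit form and the three bytes `48 85 C0` in the 64-bit form.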

2014-12-04 19:27:19 by Per Hedbor <ph@opera.com>

Checking using tst_reg32 is not optimal in branch_if_non_zero.

2014-12-04 19:27:19 by Per Hedbor <ph@opera.com>

Fixed F_ADD, also disabled the int+int optimization in it.

Most cases are already caught by the ADD_INTS opcode.
At least if people type their code.

Switched a few cmp(reg,PIKE_T_INT) to test(reg).

2014-12-04 19:27:19 by Per Hedbor <ph@opera.com>

Fixed a few missing debug prologues.

2014-12-04 19:27:19 by Per Hedbor <ph@opera.com>

Automatically convert cmp_reg[32](reg,0) to test_reg(reg).
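
A sketch of the kind of rewrite this describes, limited to the low eight registers and 8-bit immediates: comparing against zero is emitted as the shorter `test reg, reg`, which sets the zero and sign flags the same way. `toy_emit_cmp_reg32_imm8` is a hypothetical function and does not mirror the signatures of the real emitters.

```c
#include <stddef.h>
#include <stdint.h>

/* Emits a 32-bit compare of register reg (0-7) against an 8-bit
 * immediate. Returns the number of bytes written to out. */
size_t toy_emit_cmp_reg32_imm8(uint8_t *out, int reg, int8_t imm) {
    if (imm == 0) {
        out[0] = 0x85;                               /* test r/m32, r32 */
        out[1] = (uint8_t)(0xC0 | (reg << 3) | reg);
        return 2;
    }
    out[0] = 0x83;                                   /* cmp r/m32, imm8 */
    out[1] = (uint8_t)(0xF8 | reg);
    out[2] = (uint8_t)imm;
    return 3;
}
```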

2014-12-04 19:27:18 by Per Hedbor <ph@opera.com>

Added INC,ADD, DEC and SUBTRACT

These are inlined when both arguments are integers.

It would be fairly easy to do for floats as well.

DIV/MOD/AND/MULT etc would be fairly easy to add as well, but
most require some more decoding of the intel instruction
reference manual.

2014-12-04 19:27:18 by Martin Nilsson <nilsson@opera.com>

Disable the most problematic opcodes.

2014-12-04 19:27:17 by Per Hedbor <ph@opera.com>

Added a few more global variable opcodes.

Gotta catch em all!

This time:

PRIVATE_IF_DIRECT_GLOBAL and ASSIGN_PRIVATE_IF_DIRECT_GLOBAL

These will fetch or assign a global variable if the currently
executing program is the program the object is cloned from.

These are only slightly slower than the F_PRIVATE_GLOBAL family of
opcodes, and the overhead if the global is not actually private is
minimal.

Missing: [ASSIGN_]PRIVATE_IF_DIRECT_TYPED_GLOBAL[_AND_POP] and
ASSIGN_PRIVATE_IF_DIRECT_GLOBAL_AND_POP.

2014-12-04 19:27:16 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed a few typos.

The code generator for F_ASSIGN_PRIVATE_TYPED_GLOBAL_AND_POP
was broken (generated code for F_ASSIGN_PRIVATE_TYPED_GLOBAL)
due to a cut-n-paste miss.

This caused some code in Roxen to fail.

Fixes [LysLysKOM 20929878].

2014-12-04 19:27:15 by Per Hedbor <ph@opera.com>

Removed CLEAR_2_LOCAL & CLEAR_4_LOCAL, added CLEAR_N_LOCAL

This simplifies things a bit, and reduces codesize at times.

The record I have seen while running the testsuite was a clear_n_local(23).

2014-12-04 19:27:14 by Per Hedbor <ph@opera.com>

Added lvalue version of lexical_local

2014-12-04 19:27:14 by Per Hedbor <ph@opera.com>

Added F_LEXICAL_LOCAL amd64 edition

2014-12-04 19:27:14 by Per Hedbor <ph@opera.com>

Added F_CLEAR_4_LOCALS opcode.

I see a need for a CLEAR_N_LOCALS one instead.

2014-12-04 19:27:12 by Per Hedbor <ph@opera.com>

Added F_ASSIGN_PRIVATE_TYPED_GLOBAL[_AND_POP].

This completes the suite of private global opcodes.

2014-12-04 19:27:12 by Per Hedbor <ph@opera.com>

Added machine code version of F_LOCAL_LOCAL_INDEX

Specifically, this optimizes array[int].
Also includes incomplete f_branch_if_type_is_not.

2014-12-04 19:27:11 by Per Hedbor <ph@opera.com>

Added F_PRIVATE_TYPED_GLOBAL.

Much like PRIVATE_GLOBAL, but handles typed svalues
(everything but int, function and object).

No assign yet.

2014-12-04 19:27:08 by Per Hedbor <ph@opera.com>

Comment and whitespace changes

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Some more static on functions

Also added convenience functions to check if an svalue is a reference type.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Read somewhat fewer bytes

Mainly, this saves four bytes of code size for each branch_when_{eq,ne}.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Revert "Keep pike_fp->current_storage up to date in pike functions."

This reverts commit 9129e401d0db1703a938794d2d61d73b4b214992.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Fixed "hilfe arrow up" crash.

The code crashed when assigning a private global variable
that was either an integer or float with bit 8 set.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Do not allow assignment of private variables in destructed objects.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Keep pike_fp->current_storage up to date in pike functions.

This speeds up global variable accesses quite a lot.

2014-12-04 19:27:07 by Per Hedbor <ph@opera.com>

Verify ENTRY_PROLOGUE_SIZE size.

2014-12-04 19:27:04 by Per Hedbor <ph@opera.com>

Revert "Changed fast_call_threads_etc handling with valgrind"

This reverts commit 1c4cf54199bd51903bc071a5aceff11e40c00222.

Needs more work, currently it is causing crashes.

2014-12-04 19:27:04 by Per Hedbor <ph@opera.com>

Generate more compact code for int+int.

2014-12-04 19:27:04 by Per Hedbor <ph@opera.com>

Removed some #if 0:ed code

Fixes a warning when compiling with debug.

2014-12-04 19:27:04 by Per Hedbor <ph@opera.com>

More compact type checks, no need to do a cmp, & is enough now.

2014-12-04 19:27:04 by Per Hedbor <ph@opera.com>

Optimized access to private/final global variables

Especially the machine code version is now significantly faster, it
will simply read the variable directly from the known byte offset
instead of calling a function that resolves it in the vtable.

Gives about a 20x speedup of trivial code along the lines of
globala = globala + globalb;

Also tried to disable some of the optimizations that cause lvalues to
be generated instead of the desired global/assign_global opcodes.

For now this is only done if the global variables are known to not be
arrays, multiset, strings, mapping or objects, since those
optimizations are needed to quickly append things to arrays (and
mappings/multiset, but that is less common. It is also needed for
destructive modifications of strings, something that is even less
common).

2014-12-04 19:27:03 by Per Hedbor <ph@opera.com>

Save a few bytes of code size for each free_svalue

8-bit constants generate smaller code.

2014-12-04 19:27:03 by Per Hedbor <ph@opera.com>

Changed fast_call_threads_etc handling with valgrind

Instead of disabling it entirely, clear it at function entry.
This gets rid of the uninitialized value, and slows things down less
than not doing the optimization.

2014-12-04 19:27:03 by Per Hedbor <ph@opera.com>

Hide the REG_<X> macros/enums.

It is just too easy to accidentally write REG_RBX instead of P_REG_RBX.

This causes rather hard to find bugs in the generated code.

2014-12-04 19:26:50 by Per Hedbor <ph@opera.com>

add_mem8_imm is used when not compiling with valgrind.

Re-introduced the function

2014-12-04 19:26:50 by Per Hedbor <ph@opera.com>

Added F_CALL_BUILTIN_N and F_APPLY_N.

This calls the constant in arg1 with arg2 arguments from the stack.

These opcodes are used if the number of arguments is known and bigger
than 1.

It is not really all that big an optimization, it only removes the
mark stack handling. And, in fact, due to the fact that it removes
some peep optimizations it might be somewhat slower when not using the
amd64 machine code (since, as an example, APPLY/ASSIGN_LOCAL/POP is no
longer an opcode that is used in this case).

However, when using the amd64 code the assign local + pop opcode is
highly optimized, so it's not an issue that it is not merged into the
apply opcode. It is in fact more of a feature.

For that reason the code in docode.c is currently conditional.
The only code generator using it is the amd64 one.

2014-12-04 19:26:42 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Unified struct svalue and struct fast_svalue.

Modern gcc (4.7.3) had aliasing problems with the two structs, which
caused changes performed with SET_SVAL() (which used struct fast_svalue)
to not be reflected in TYPEOF() (which used struct svalue). This in turn
caused eg casts of integers to floats to fail with "Cast failed, wanted
float, got int".

The above problem is now solved by having an actual union for the type
fields in struct svalue. This has the additional benefit of forcing
all code to use the svalue macros.
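
A simplified model of the approach: the separate type and subtype fields and the combined field live in one union inside the struct, so both access paths go through the same object. The layout, the `toy_` names and the constant values are assumptions made for this sketch and are not copied from the Pike headers.

```c
#define TOY_T_INT   0
#define TOY_T_FLOAT 7   /* arbitrary value chosen for the sketch */

struct toy_svalue {
    union {
        struct {
            unsigned short type;
            unsigned short subtype;
        } t;
        unsigned int type_subtype;  /* both fields, written in one store */
    } tu;
    union {
        long integer;
        double float_number;
    } u;
};

/* All readers go through a macro, so there is a single access path. */
#define TOY_TYPEOF(sv) ((sv).tu.t.type)

void toy_set_float(struct toy_svalue *sv, double d) {
    sv->tu.type_subtype = 0;        /* clears type and subtype together */
    sv->tu.t.type = TOY_T_FLOAT;
    sv->u.float_number = d;
}
```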

NB: This code change will cause problems with compilers that don't
support union initializers.

2014-12-04 19:25:34 by Martin Nilsson <nilsson@opera.com>

Hide unused opcodes.

2014-12-04 19:24:10 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Fixed one more case broken by the svalue renumbering.

Fixes [LysLysKOM 20484693]/[Pike mailinglist 13687].

Thanks to Chris Angelico <rosuav@gmail.com> for the report and test case.

2014-12-04 19:24:10 by Per Hedbor <ph@opera.com>

Now compiles on modern Ubuntu version.

Newer versions of Linux have defines and enums defining REG_X, where X
is any amd64 register, but they are not numbered in a logical
manner. Fixed by renaming REG_X to P_REG_X in our file.

2014-12-04 19:23:51 by 0

Reshuffle labels to avoid "Branch 130 too far" message seen after ba7d5e1fb6e8.

2014-12-04 19:23:47 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Reorder the arguments to cmp_reg_reg().

cmp_reg_reg() now compares its arguments in
the same order as the cmp_reg*_imm() variants.

Fixes F_POP_TO_MARK.

Probably fixes index overruns in F_INDEX and F_LOCAL_INDEX.

2014-12-04 19:23:47 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[runtime][amd64] Fixed some free_svalue-related bugs.

free_svalues() now survives freeing unfinished arrays.

amd64_free_svalue() now supports freeing PIKE_T_VOID svalues.

2014-12-04 19:23:47 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Some constant folding in F_POS_INT_INDEX.

2014-12-04 19:23:40 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Renumbered PIKE_T_*. Breaks ppc32 and ppc64.

Renumber the low PIKE_T_* values so that PIKE_T_INT becomes zero.

This has the feature that zeroed memory becomes filled with Pike
svalues containing integer zeroes (and not NULL pointer arrays).
This will let call_c_initializers() avoid traversing the entire
identifier table for the class.
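
The effect can be sketched with a toy value type whose integer tag is zero: clearing the memory is then enough to obtain valid integer zeroes. `struct toy_sv`, `TOY_T_INT` and `toy_clear_svalues` are hypothetical stand-ins for the real svalue machinery.

```c
#include <string.h>

#define TOY_T_INT 0   /* the integer type tag is zero, as after the renumbering */

struct toy_sv {
    unsigned short type;
    unsigned short subtype;
    long integer;
};

/* Zero-fills n slots; every slot then reads as the integer 0. */
void toy_clear_svalues(struct toy_sv *slots, size_t n) {
    memset(slots, 0, n * sizeof *slots);
}
```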

Note: The serialized representation of types (__parse_pike_type())
is unchanged. As is the {out,in}put for {en,de}code_value().

Updates the code generators for ia32 and amd64.

Breaks the code generators for ppc32 and ppc64.

2014-12-04 19:23:38 by Per Hedbor <ph@opera.com>

[amd64] Fully inline RETURN. Inline LOCAL_2_GLOBAL

2014-12-04 19:23:30 by Martin Nilsson <nilsson@opera.com>

Valgrind friendly machine code

2014-12-04 19:23:15 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed bug in F_POS_INT_INDEX.

The range check in F_POS_INT_INDEX used the wrong comparison opcode
which caused indexing of arrays with their size to be allowed.
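
In C terms the bug class looks like the following sketch, where the valid indices are 0 up to but not including the size. `toy_pos_int_index` is a hypothetical function; the real check is emitted as machine code.

```c
#include <stddef.h>

/* Stores items[index] in *out and returns 1 when index is in range,
 * returns 0 otherwise. */
int toy_pos_int_index(const int *items, size_t size, size_t index, int *out) {
    if (index >= size)      /* index == size is already one past the end */
        return 0;
    *out = items[index];
    return 1;
}
```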

Added some corresponding tests to the testsuite.

Thanks to Stewa for the report.

2014-12-04 19:23:09 by 0

Move label to keep jump distance below limit.

2014-12-04 19:23:08 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Add some missing type checking to F_FOREACH.

The first argument to F_FOREACH wasn't verified to be an array,
which would cause core dumps if it wasn't.
Fixes [Pike-mailing-list 13472]/[LysLysKOM 20109625].

2014-12-04 19:23:02 by 0

Wrap unused parameters in UNUSED(), and debug-only parameters in DEBUGUSED(), to cut
down on compiler warnings. The macro also renames parameters to catch accidental use.
(There are more places to clean up but I don't want to modify code that isn't compiling
on my machine.)

2014-10-21

2014-10-21 13:52:46 by Martin Nilsson <nilsson@opera.com>

Have division by zero constant be compilation error on all platforms and not just amd64.

2014-09-23

2014-09-23 15:54:27 by Per Hedbor <ph@opera.com>

Fixed branch

2014-09-23 14:14:26 by Per Hedbor <ph@opera.com>

Fixed error in F_SIZEOF_LOCAL_STRING when the argument is not actually a string

2014-09-23 14:12:23 by Per Hedbor <ph@opera.com>

Added F_SIZEOF_STRING and F_SIZEOF_LOCAL_STRING

We really should pass on the type to the code generator instead, I
think.

There should also be a "#pragma promise_correct_types" or something
that would guarantee that the types are correct, and crash and burn if
they are not.

The generated code would be significantly smaller and faster.

2014-09-23 09:58:49 by Per Hedbor <ph@opera.com>

Added F_NOT amd64 edition.

2014-09-14

2014-09-14 08:40:19 by Per Hedbor <ph@opera.com>

Added a few more amd64 opcodes, the comparisons.

This adds support for F_EQ, F_NE, F_LE, F_GE, F_LT and F_GT.

They could be better, it would be simple to add floats, as an example,
especially in F_EQ and F_NE (they /almost/ work for floats, nan
complicates things however).
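
One way to read "almost work" is that an equality test on the raw value bits nearly matches floating-point equality. Under that assumption, the sketch below shows where the two differ: a NaN has identical bits to itself but does not compare equal, and the two zeroes compare equal despite different bits. Both function names are hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Compares the raw 64-bit patterns of two doubles. */
int toy_bits_equal(double a, double b) {
    uint64_t ba, bb;
    memcpy(&ba, &a, sizeof ba);
    memcpy(&bb, &b, sizeof bb);
    return ba == bb;
}

/* Compares two doubles with IEEE 754 semantics. */
int toy_float_equal(double a, double b) {
    return a == b;
}
```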

2014-09-04

2014-09-04 15:57:43 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/8.0' into string_alloc

Conflicts:
src/stralloc.c

2014-09-02

2014-09-02 14:56:44 by Martin Nilsson <nilsson@opera.com>

Silence warnings.

2014-09-01

2014-09-01 11:48:11 by Per Hedbor <ph@opera.com>

For now -- Give up entirely doing / and % with negative integers

x86 is round-to-0, and pike expects round-to-negative-infinity.

This can be fixed in the opcodes, but it is somewhat harder than simply ignoring it for now. :)
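
C integer division truncates toward zero in the same way as the x86 instruction, so the difference can be sketched in C. `toy_floor_div` and `toy_floor_mod` are hypothetical helpers that adjust the truncated result to round toward negative infinity; division by zero and the `LONG_MIN / -1` case are deliberately not handled here.

```c
/* Quotient rounded toward negative infinity. */
long toy_floor_div(long a, long b) {
    long q = a / b;                     /* truncates toward zero */
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        q--;                            /* inexact and negative: step down */
    return q;
}

/* Remainder that takes the sign of the divisor. */
long toy_floor_mod(long a, long b) {
    long r = a % b;                     /* takes the sign of the dividend */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

For example, `-7 / 2` is `-3` with truncation, while the floored quotient is `-4` with a remainder of `1`.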

2014-09-01 10:23:12 by Per Hedbor <ph@opera.com>

Fixed some issues with +/- int and re-enabled F_MOD_INT.

2014-08-31

2014-08-31 19:10:52 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed code generator for F_RSH_INT.

The wrong label was used.

2014-08-31 17:33:14 by Per Hedbor <ph@opera.com>

Added a few more x86-64 opcodes

F_XOR / F_XOR_INT
F_DIVIDE_INT / F_MOD_INT
and a partial implementation of F_POP_N_ELEMS.

2014-08-31 15:55:36 by Per Hedbor <ph@opera.com>

Added F_DIVIDE

2014-08-31 15:55:36 by Per Hedbor <ph@opera.com>

Added x86_64 support for F_MOD

The divide instruction also does mod.
But for negative numbers pike is somewhat different.
So, be rather restrictive.

2014-08-31 14:14:10 by Per Hedbor <ph@opera.com>

Unified some code. Added F_MULTIPLY. Fixed an issue in F_LSH_INT.

2014-08-31 14:14:09 by Per Hedbor <ph@opera.com>

Added F_COMPL. Fixed F_AND_INT

2014-08-31 13:15:39 by Per Hedbor <ph@opera.com>

Fixed F_LSH_INT.

2014-08-31 12:53:05 by Per Hedbor <ph@opera.com>

Added x86-64 version of F_LSH_INT

2014-08-31 10:17:33 by Per Hedbor <ph@opera.com>

Fixed non-int f_negate case

2014-08-31 10:14:33 by Per Hedbor <ph@opera.com>

Added a few more opcodes

F_NEGATE F_LSH F_AND_INT F_OR_INT F_RSH_INT F_SUBTRACT_INT

2014-08-29

2014-08-29 05:24:08 by Per Hedbor <ph@opera.com>

Reset the result types, clearing zero_type

2014-08-28

2014-08-28 20:14:35 by Per Hedbor <ph@opera.com>

Slightly shorter F_RSH

It is uncommon enough to do >> with values >= 64 to use the C version.

2014-08-28 19:21:09 by Per Hedbor <ph@opera.com>

Added F_AND, F_OR and F_RSH opcodes

2014-08-22

2014-08-22 18:02:24 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/8.0' into string_alloc

2014-08-18

2014-08-18 15:03:53 by Per Hedbor <ph@opera.com>

Added F_UNDEFINEDP as an opcode

Also added a convenience SVAL(X) macro that returns the offset to
different parts of the svalue X values from the base register.

Also added some usage of the SVAL function in a few places.

Changed to 32-bit arithmetic for types in a few locations. It saves
on code size, if nothing else (no REX prefix unless R8.. is used)

2014-08-18 13:28:39 by Per Hedbor <ph@opera.com>

Checking using tst_reg32 is not optimal in branch_if_non_zero.

2014-08-18 13:14:48 by Per Hedbor <ph@opera.com>

Fixed F_ADD, also disabled the int+int optimization in it.

Most cases are already caught by the ADD_INTS opcode.
At least if people type their code.

Switched a few cmp(reg,PIKE_T_INT) to test(reg).

2014-08-18 13:14:48 by Per Hedbor <ph@opera.com>

Automatically convert cmp_reg[32](reg,0) to test_reg(reg).

2014-08-18 13:12:34 by Per Hedbor <ph@opera.com>

Fixed a few missing debug prologues.

2014-08-18 12:02:48 by Martin Nilsson <nilsson@opera.com>

Disable the most problematic opcodes.

2014-08-16

2014-08-16 22:33:34 by Per Hedbor <ph@opera.com>

Added INC,ADD, DEC and SUBTRACT

These are inlined when both arguments are integers.

It would be fairly easy to do for floats as well.

DIV/MOD/AND/MULT etc would be fairly easy to add as well, but
most require some more decoding of the intel instruction
reference manual.

2014-08-16 21:45:10 by Per Hedbor <ph@opera.com>

Added a few more global variable opcodes.

Gotta catch em all!

This time:

PRIVATE_IF_DIRECT_GLOBAL and ASSIGN_PRIVATE_IF_DIRECT_GLOBAL

These will fetch or assign a global variable if the currently
executing program is the program the object is cloned from.

These are only slightly slower than the F_PRIVATE_GLOBAL family of
opcodes, and the overhead if the global is not actually private is
minimal.

Missing: [ASSIGN_]PRIVATE_IF_DIRECT_TYPED_GLOBAL[_AND_POP] and
ASSIGN_PRIVATE_IF_DIRECT_GLOBAL_AND_POP.

2014-08-15

2014-08-15 22:10:00 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler [amd64]: Fixed a few typos.

The code generator for F_ASSIGN_PRIVATE_TYPED_GLOBAL_AND_POP
was broken (generated code for F_ASSIGN_PRIVATE_TYPED_GLOBAL)
due to a cut-n-paste miss.

This caused some code in Roxen to fail.

Fixes [LysLysKOM 20929878].

2014-08-15 14:05:39 by Per Hedbor <ph@opera.com>

Removed CLEAR_2_LOCAL & CLEAR_4_LOCAL, added CLEAR_N_LOCAL

This simplifies things a bit, and reduces codesize at times.

The record I have seen while running the testsuite was a clear_n_local(23).

2014-08-15 13:20:08 by Per Hedbor <ph@opera.com>

Added F_CLEAR_4_LOCALS opcode.

I see a need for a CLEAR_N_LOCALS one instead.

2014-08-15 12:33:38 by Per Hedbor <ph@opera.com>

Added lvalue version of lexical_local

2014-08-15 12:31:00 by Per Hedbor <ph@opera.com>

Added F_LEXICAL_LOCAL amd64 edition

2014-08-14

2014-08-14 13:36:53 by Per Hedbor <ph@opera.com>

Added F_ASSIGN_PRIVATE_TYPED_GLOBAL[_AND_POP].

This completes the suite of private global opcodes.

2014-08-14 13:36:53 by Per Hedbor <ph@opera.com>

Added machinecode version of F_LOCAL_LOCAL_INDEX

Specifically, this optimizes array[int].
Also includes incomplete f_branch_if_type_is_not.

2014-08-14 10:22:53 by Per Hedbor <ph@opera.com>

Added F_PRIVATE_TYPED_GLOBAL.

Much like PRIVATE_GLOBAL, but handles typed svalues
(everything but int, function and object).

No assign yet.

2014-08-11

2014-08-11 17:57:53 by Per Hedbor <ph@opera.com>

Comment and whitespace changes

2014-08-11 15:24:14 by Per Hedbor <ph@opera.com>

Read somewhat fewer bytes

Mainly, this saves four bytes of code size for each branch_when_{eq,ne}.

2014-08-11 14:36:53 by Per Hedbor <ph@opera.com>

Do not allow assignment of private variables in destructed objects.

2014-08-11 14:26:25 by Per Hedbor <ph@opera.com>

Some more static on functions

Also added convenience functions to check if an svalue is a reference type.

2014-08-11 14:26:25 by Per Hedbor <ph@opera.com>

Verify ENTRY_PROLOGUE_SIZE size.

2014-08-11 14:26:25 by Per Hedbor <ph@opera.com>

Fixed "hilfe arrow up" crash.

The code crashed when assigning a private global variable
that was either an integer or float with bit 8 set.

2014-08-11 14:26:25 by Per Hedbor <ph@opera.com>

Revert "Keep pike_fp->current_storage up to date in pike functions."

This reverts commit 9129e401d0db1703a938794d2d61d73b4b214992.

2014-08-11 14:26:25 by Per Hedbor <ph@opera.com>

Keep pike_fp->current_storage up to date in pike functions.

This speeds up global variable accesses quite a lot.

2014-08-08

2014-08-08 12:26:40 by Per Hedbor <ph@opera.com>

More compact type checks, no need to do a cmp, & is enough now.

2014-08-08 12:26:40 by Per Hedbor <ph@opera.com>

Revert "Changed fast_call_threads_etc handling with valgrind"

This reverts commit 1c4cf54199bd51903bc071a5aceff11e40c00222.

Needs more work, currently it is causing crashes.

2014-08-07

2014-08-07 17:10:11 by Per Hedbor <ph@opera.com>

Generate more compact code for int+int.

2014-08-07 16:24:32 by Per Hedbor <ph@opera.com>

Optimized access to private/final global variables

Especially the machine code version is now significantly faster, it
will simply read the variable directly from the known byte offset
instead of calling a function that resolves it in the vtable.

Gives about a 20x speedup of trivial code along the lines of
globala = globala + globalb;

Also tried to disable some of the optimizations that cause lvalues to
be generated instead of the desired global/assign_global opcodes.

For now this is only done if the global variables are known to not be
arrays, multiset, strings, mapping or objects, since those
optimizations are needed to quickly append things to arrays (and
mappings/multiset, but that is less common. It is also needed for
destructive modifications of strings, something that is even less
common).

2014-08-07 16:24:32 by Per Hedbor <ph@opera.com>

Removed some #if 0:ed code

Fixes a warning when compiling with debug.

2014-08-07 16:24:31 by Per Hedbor <ph@opera.com>

Save a few bytes of code size for each free_svalue

8-bit constants generate smaller code.

2014-08-07 16:24:31 by Per Hedbor <ph@opera.com>

Hide the REG_<X> macros/enums.

It is just too easy to accidentally write REG_RBX instead of P_REG_RBX.

This causes rather hard to find bugs in the generated code.

2014-08-07 16:24:31 by Per Hedbor <ph@opera.com>

Changed fast_call_threads_etc handling with valgrind

Instead of disabling it entirely, clear it at function entry.
This gets rid of the uninitialized value, and slows things down less
than not doing the optimization.

2014-07-15

2014-07-15 14:52:20 by Per Hedbor <ph@opera.com>

add_mem8_imm is used when not compiling with valgrind.

Re-introduced the function

2014-07-15 13:09:00 by Per Hedbor <ph@opera.com>

Added F_CALL_BUILTIN_N and F_APPLY_N.

This calls the constant in arg1 with arg2 arguments from the stack.

These opcodes are used if the number of arguments is known and bigger
than 1.

It is not really all that big an optimization, it only removes the
mark stack handling. And, in fact, due to the fact that it removes
some peep optimizations it might be somewhat slower when not using the
amd64 machine code (since, as an example, APPLY/ASSIGN_LOCAL/POP is no
longer an opcode that is used in this case).

However, when using the amd64 code the assign local + pop opcode is
highly optimized, so it's not an issue that it is not merged into the
apply opcode. It is in fact more of a feature.

For that reason the code in docode.c is currently conditional.
The only code generator using it is the amd64 one.

2014-06-24

2014-06-24 14:31:50 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Unified struct svalue and struct fast_svalue.

Modern gcc (4.7.3) had aliasing problems with the two structs, which
caused changes performed with SET_SVAL() (which used struct fast_svalue)
to not be reflected in TYPEOF() (which used struct svalue). This in turn
caused eg casts of integers to floats to fail with "Cast failed, wanted
float, got int".

The above problem is now solved by having an actual union for the type
fields in struct svalue. This has the additional benefit of forcing
all code to use the svalue macros.

NB: This code change will cause problems with compilers that don't
support union initializers.
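
A minimal sketch of the layout idea, using hypothetical names and field widths that are not copied from the Pike headers. Both views of the type fields are members of one union inside the svalue, so a combined store and a later read of the type alone go through the same object, and the compiler cannot treat them as unrelated structs. The packing order assumes a little-endian host such as amd64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical svalue layout: one union carries both views of the type. */
struct demo_svalue {
    union {
        struct {
            uint16_t type;
            uint16_t subtype;
        } t;
        uint32_t type_subtype;      /* both fields in a single word */
    } tu;
    union {
        int64_t integer;
        double float_number;
    } u;
};

/* Set type and subtype with one store (little-endian packing assumed). */
static void demo_set_type(struct demo_svalue *s, uint16_t type, uint16_t subtype)
{
    s->tu.type_subtype = (uint32_t)type | ((uint32_t)subtype << 16);
}

/* Read the fields back through the other union member. */
static uint16_t demo_typeof(const struct demo_svalue *s)
{
    return s->tu.t.type;
}

static uint16_t demo_subtypeof(const struct demo_svalue *s)
{
    return s->tu.t.subtype;
}
```

Reading a union member other than the one last written is permitted in C99 and later, which is what makes this form safe where two separate struct types were not.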

2014-03-15

2014-03-15 20:13:20 by Martin Nilsson <nilsson@opera.com>

Hide unused opcodes.

2014-01-05

2014-01-05 15:14:13 by Marcus Comstedt <marcus@mc.pp.se>

Merge branch '8.0' into gobject-introspection

2013-10-08

2013-10-08 12:34:26 by Per Hedbor <ph@opera.com>

Now compiles on modern Ubuntu version.

Newer versions of Linux have defines and enums defining REG_x, where X
is all amd64 registers, but they are not numbered in a logical
manner. Fixed by renaming REG_X to P_REG_X in our file.

2013-10-07

2013-10-07 16:48:40 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Fixed one more case broken by the svalue renumbering.

Fixes [LysLysKOM 20484693]/[Pike mailinglist 13687].

Thanks to Chris Angelico <rosuav@gmail.com> for the report and test case.

2013-07-06

2013-07-06 16:47:12 by 0

Reshuffle labels to avoid "Branch 130 too far" message seen after ba7d5e1fb6e8.

2013-06-21

2013-06-21 09:18:55 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/7.9' into pdf

2013-06-19

2013-06-19 19:10:35 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Reorder the arguments to cmp_reg_reg().

cmp_reg_reg() now compares its arguments in
the same order as the cmp_reg*_imm() variants.

Fixes F_POP_TO_MARK.

Probably fixes index overruns in F_INDEX and F_LOCAL_INDEX.

2013-06-19 19:08:20 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[amd64] Some constant folding in F_POS_INT_INDEX.

2013-06-19 18:50:28 by Henrik Grubbström (Grubba) <grubba@grubba.org>

[runtime][amd64] Fixed some free_svalue-related bugs.

free_svalues() now survives freeing unfinished arrays.

amd64_free_svalue() now supports freeing PIKE_T_VOID svalues.

2013-06-12

2013-06-12 18:29:23 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Runtime: Renumbered PIKE_T_*. Breaks ppc32 and ppc64.

Renumber the low PIKE_T_* values so that PIKE_T_INT becomes zero.

This has the feature that zeroed memory becomes filled with Pike
svalues containing integer zeroes (and not NULL pointer arrays).
This will let call_c_initializers() avoid traversing the entire
identifier table for the class.
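
The effect can be sketched as follows. The type, enum values and function names are hypothetical stand-ins, not the real Pike definitions. Once the integer type tag is zero, storage that is merely zero-filled already holds valid integer zeroes and needs no per-variable initialization pass.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type tags: the integer tag is zero, as after the renumbering. */
enum { DEMO_T_INT = 0, DEMO_T_ARRAY = 8 };

struct demo_svalue {
    uint16_t type;
    uint16_t subtype;
    union {
        int64_t integer;
        void *ptr;
    } u;
};

/* Allocate object storage; calloc() hands back zero-filled memory. */
static struct demo_svalue *demo_alloc_storage(size_t n)
{
    return calloc(n, sizeof(struct demo_svalue));
}

/* True if the slot is a plain integer zero. */
static int demo_is_int_zero(const struct demo_svalue *s)
{
    return s->type == DEMO_T_INT && s->u.integer == 0;
}
```

Had tag zero meant an array, the same zero-filled slot would instead have looked like an array svalue with a NULL pointer, which is the case the commit message mentions.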

Note: The serialized representation of types (__parse_pike_type())
is unchanged. As is the {out,in}put for {en,de}code_value().

Updates the code generators for ia32 and amd64.

Breaks the code generators for ppc32 and ppc64.

2013-06-12 18:21:52 by Arne Goedeke <el@laramies.com>

Merge remote-tracking branch 'origin/7.9' into ba

Conflicts:
src/interpret.c
src/interpret.h
src/pike_embed.c

2013-06-12 15:45:52 by Per Hedbor <ph@opera.com>

[amd64] Fully inline RETURN. Inline LOCAL_2_GLOBAL

2013-05-28

2013-05-28 17:41:57 by Martin Nilsson <nilsson@opera.com>

Valgrind friendly machine code

2013-03-06

2013-03-06 19:06:50 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed bug in F_POS_INT_INDEX.

The range check in F_POS_INT_INDEX used the wrong comparison opcode
which caused indexing of arrays with their size to be allowed.

Added some corresponding tests to the testsuite.
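
The commit does not say which comparison was emitted, so the following is only a guess at the shape of the bug: an inclusive upper bound where an exclusive one is required. Both functions are hypothetical.

```c
/* Off-by-one check: accepts index == size, one past the last element. */
static int index_ok_buggy(long index, long size)
{
    return index >= 0 && index <= size;
}

/* Intended check: valid indices are 0 .. size-1. */
static int index_ok_fixed(long index, long size)
{
    return index >= 0 && index < size;
}
```

For a three-element array the two differ only at index 3, which is exactly the "indexing with the size" case described above.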

Thanks to Stewa for the report.

2013-02-05

2013-02-05 20:21:48 by 0

Move label to keep jump distance below limit.

2013-02-05 10:42:35 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Add some missing type checking to F_FOREACH.

The first argument to F_FOREACH wasn't verified to be an array,
which would cause core dumps if it wasn't.
Fixes [Pike-mailing-list 13472]/[LysLysKOM 20109625].

2013-01-06

2013-01-06 20:04:14 by Arne Goedeke <el@laramies.com>

Casting to INT64 first is correct here.

This reverts commit 806cc2fd28f3315d8aedf8325f8b85139439023c.
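
The two orderings differ only for the most negative 32-bit value. The sketch below assumes the opcode argument is a 32-bit signed integer and the result is 64 bits wide; the function names are hypothetical. Negating in 32 bits is shown through unsigned arithmetic, because negating INT32_MIN as a signed 32-bit value is undefined in C.

```c
#include <stdint.h>

/* Widen first, then negate: every 32-bit argument has a 64-bit negation. */
static int64_t neg_widen_first(int32_t arg)
{
    return -(int64_t)arg;
}

/* Bit pattern left by a 32-bit negation (two's complement wraparound). */
static uint32_t neg_in_32_bits(int32_t arg)
{
    return 0u - (uint32_t)arg;
}
```

For 0x80000000 the 32-bit negation yields 0x80000000 again, which sign-extends to a negative number, while widening first gives +2147483648.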

2013-01-02

2013-01-02 02:31:03 by 0

F_NEG_NUMBER: Don't cast to INT64 before negation.

Solves a sign-extension issue when used with 0x80000000 as argument, though it's
debatable whether this value should ever occur in the first place.

2012-12-30

2012-12-30 15:37:27 by 0

Wrap unused parameters in UNUSED(), and debug-only parameters in DEBUGUSED(), to cut
down on compiler warnings. The macro also renames parameters to catch accidental use.
(There are more places to clean up but I don't want to modify code that isn't compiling
on my machine.)

2012-10-06

2012-10-06 11:38:03 by Marcus Comstedt <marcus@mc.pp.se>

Merge branch '7.9' into gobject-introspection

2012-07-19

2012-07-19 00:20:57 by 0

[compiler][amd64] Can't easily compare functions and floats in
F_BRANCH_WHEN_{EQ,NEQ}. Also corrected some comment typos.

2012-07-18

2012-07-18 14:35:57 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Changed calling convention for {jmp,call}_rel_imm*().

They now take the absolute address as argument,
since the relative offset depends on the size
of the generated opcode, which could vary.

This fixes tlib/modules/Calendar.pmod/testsuite:433,
where the second F_RETURN_IF_TRUE jumped five bytes
too short into the code generated by the first.
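
A sketch of why the caller cannot compute the offset itself. On x86 the displacement of a relative jump is measured from the end of the jump instruction, and `jmp` has a 2-byte form (8-bit displacement) and a 5-byte form (32-bit displacement). The emitter below is hypothetical, not Pike's `jmp_rel_imm()`: it takes the absolute target and derives the displacement only after choosing the encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of encoding one relative jump. */
struct demo_jmp {
    unsigned len;    /* instruction length in bytes: 2 or 5 */
    int32_t disp;    /* displacement stored in the instruction */
};

/* Encode a jump located at insn_addr that should land on target. */
static struct demo_jmp demo_encode_jmp(uint64_t insn_addr, uint64_t target)
{
    struct demo_jmp j;
    /* Try the short form first: displacement relative to insn_addr + 2. */
    int64_t d8 = (int64_t)target - (int64_t)(insn_addr + 2);
    if (d8 >= -128 && d8 <= 127) {
        j.len = 2;
        j.disp = (int32_t)d8;
    } else {
        /* Near form: displacement relative to insn_addr + 5. */
        int64_t d32 = (int64_t)target - (int64_t)(insn_addr + 5);
        assert(d32 >= INT32_MIN && d32 <= INT32_MAX);
        j.len = 5;
        j.disp = (int32_t)d32;
    }
    return j;
}
```

A caller that precomputed the displacement for one instruction length would be off by the size difference whenever the other encoding is chosen, which matches the "jumped five bytes too short" symptom in kind.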

2012-07-18 12:56:57 by Arne Goedeke <el@laramies.com>

Merge branch '7.9' into block_alloc

Conflicts:
src/modules/system/configure.in
src/post_modules/CritBit/tree_low.c
src/post_modules/CritBit/tree_low.h
src/post_modules/CritBit/tree_source.H

2012-07-16

2012-07-16 21:09:35 by Per Hedbor <ph@opera.com>

[compiler][amd64] Use new features from peep.c

Place the code for calling check_threads_etc before the function instead of
inside it, to have one branch less in tight loops.

This saves about 4% in the nested loops test, at the cost of 12 bytes extra
code-space for functions that do not actually contain loops (for functions
that contain loops 3 bytes is saved instead).

One alternative would be to place the code after the function, if it does
contain a loop, then update the relative jumps to point to the code.

That is left for later.

2012-07-14

2012-07-14 06:35:42 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Load fp_reg even more consistently.

It should be loaded even without PIKE_DEBUG...

2012-07-13

2012-07-13 15:43:29 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Load fp_reg more consistently.

Background:
Some of the opcode implementations use the C-implementation as
a fallback for the more complex cases. These typically use
amd64_call_c_opcode(), which calls maybe_update_pc(), which may
call UPDATE_PC(), which calls amd64_load_fp_reg(), which loads
fp_reg if it isn't thought to be loaded.

Problem:
This means that the opcodes in question sometimes will enter
with fp_reg not loaded, and exit with fp_reg thought to be
loaded even though it isn't loaded on all code-paths for the
opcode.

Solution:
This patch loads fp_reg in the instruction prologue under
the same circumstances where maybe_update_pc() would have
loaded it.

2012-07-01

2012-07-01 22:05:19 by Arne Goedeke <el@laramies.com>

Merge remote branch 'origin/7.9' into block_alloc

2012-06-28

2012-06-28 21:56:29 by Per Hedbor <ph@opera.com>

[compiler][amd64] Attempt at faster PC updates.

Always just assign PC to the current address instead of adding the
difference. This is somewhat faster.

We still do too many updates, though.

As an example, it is common to have 2-3 update_pc in a row.

2012-06-28 21:55:02 by Per Hedbor <ph@opera.com>

[compiler][amd64] Added instr_prologue in a lot of places.

Also, load sp register more consistently, and only when
actually needed.

2012-06-25

2012-06-25 17:54:37 by Per Hedbor <ph@opera.com>

[compiler][amd64] Inline version of FOREACH

This is the old foreach( array, loop_variable ). About 10 times faster if the
loop variable is a local variable in the function.

2012-06-25 17:13:22 by Per Hedbor <ph@opera.com>

[compiler][amd64] Yet more inlined opcodes

Added inline versions of INDEX, INT_INDEX, NEG_INT_INDEX, LOCAL_INDEX.

They are only inlined when the index is an integer and the item to be
indexed is an array.

Adding support for string[int] might be useful.

2012-06-25 14:53:53 by Per Hedbor <ph@opera.com>

[compiler][amd64] Inline some more opcodes

Added SIZEOF, RETURN_LOCAL and CLEAR_2_LOCAL.

2012-06-25 14:10:12 by Per Hedbor <ph@opera.com>

[compiler][amd64] More inline opcodes.

Added the various *CALL*BUILTIN* opcodes.

2012-06-25 00:02:25 by Per Hedbor <ph@opera.com>

[compiler][amd64] Inline a few more opcodes

Added inline versions of LTOSVAL2_AND_FREE, LTOSVAL, ASSIGN and
ASSIGN_AND_POP. Slightly optimized BRANCH_WHEN_*ZERO and
BRANCH_WHEN_*LOCAL.

2012-06-22

2012-06-22 15:58:21 by Per Hedbor <ph@opera.com>

[amd64] Fixed the comparison opcodes for real.

Also, use 32-bit comparisons when possible, this saves one byte in the generated
code per comparison since rex is not needed.

2012-06-22 10:10:46 by Per Hedbor <ph@opera.com>

[amd64] Fixed fallback version of BRANCH_WHEN_ZERO.

2012-06-22 09:12:41 by Per Hedbor <ph@opera.com>

[compiler][amd64] Real mov16 and mov8 added.

Using the movzx instruction, this is for unsigned numbers. Versions
using movsx are needed if signed numbers are to be used.
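
The difference between the two kinds of load can be shown in C, where widening an unsigned byte zero-extends and widening a signed byte sign-extends. The function names are hypothetical.

```c
#include <stdint.h>

/* Zero-extending 8-bit load: the upper bits of the result are cleared. */
static uint64_t load8_zero_extend(const void *p)
{
    return *(const uint8_t *)p;
}

/* Sign-extending 8-bit load: the upper bits copy the sign bit. */
static int64_t load8_sign_extend(const void *p)
{
    return *(const int8_t *)p;
}
```

For a byte holding 0xF0 the first returns 240 and the second returns -16, so a zero-extending mov8 or mov16 is only correct for fields that are never negative.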

Inlined a few more opcodes. Fixed branch when (non) zero and branch
when local to correctly treat 0.0 as non-zero.

Fixed clearing of zero type in ADD_LOCAL_INT[_and_pop] and
ADD_[NEG_]INT.

2012-06-21

2012-06-21 15:28:22 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Improved detection of use of stale registers.

2012-06-21 15:24:27 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed bug where F_FILL_STACK pushed one arg too many.

2012-06-20

2012-06-20 15:30:30 by Arne Goedeke <el@laramies.com>

Merge remote branch 'origin/7.9' into block_alloc

Conflicts:
lib/modules/Tools.pmod/Shoot.pmod/module.pmod

2012-06-20 04:43:06 by Per Hedbor <ph@opera.com>

[compiler][amd64] Cleaned up code somewhat and faster branches

Added FAST_BRANCH_WHEN{_,_NOT_}ZERO that knows that sp[-1] is an
integer. It can thus avoid doing any checking of types and the normal
pop_stack checks.

Also inlined the normal BRANCH_WHEN{_,_NON_}ZERO.

There is now a common function that is used to generate
modrm+sib+offset for the *mem* family of functions.

Also removed frame init/stack cleaning for Functions that just return
a constant.

2012-06-19

2012-06-19 21:31:16 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Inline some more opcodes.

2012-06-19 21:27:27 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Make sure the fp_reg is loaded before use.

2012-06-18

2012-06-18 00:45:27 by Per Hedbor <ph@opera.com>

[compiler] Significantly faster simple loops

New opcodes:
ASSIGN_LOCAL_NUMBER_AND_POP
ADD_LOCAL_NUMBER_AND_POP, ADD_LOCAL_LOCAL_AND_POP
and ASSIGN_GLOBAL_NUMBER_AND_POP

The rationale for the assign_local variants is that it is
significantly faster to do local=local and local+=[number||local] than
it is to do local&, number, f_add_to and similar.

The reason being that the locals act much like registers, they are
easy to assign values from the machinecode level.

Also added some perhaps dubious optimizations of the code that the
treeoptimizer produces for for-loops.

The result of the above is that the NestedLoops* tests are about eight
times faster. And runs entirely in native code, without any function
calls.

2012-06-17

2012-06-17 15:05:24 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed code generator for shifts.

2012-06-17 13:17:22 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed a few typos.

PIKE_INT_TYPE is signed...

2012-06-15

2012-06-15 16:10:59 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed a few warnings.

2012-06-15 16:09:49 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Added inlining of a few more opcodes.

2012-06-15 10:07:58 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed PC-calculation.

2012-06-15 09:54:06 by Arne Goedeke <el@laramies.com>

Merge remote branch 'origin/7.9' into rblock_alloc

Conflicts:
src/post_modules/CritBit/floattree.cmod
src/post_modules/CritBit/inttree.cmod
src/post_modules/CritBit/stringtree.cmod

2012-06-14

2012-06-14 22:11:43 by Per Hedbor <ph@opera.com>

[compiler][amd64] Some more optimizations and changes

Added branch_check_threads_etc calls that went missing.

Also changed how branch_check_threads_etc is called, the code now
maintains a counter on the C-stack, if adding 1 to it (as a signed
byte) causes it to overflow the C-function is called, after adding 128
to the in-memory counter. This saves rather a lot of calls.
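
A simulation of the counting scheme, under assumptions the commit does not spell out: the byte counter starts at zero, is left at its wrapped value after an overflow, and the slow path is reduced to a call count. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical state: a signed byte counter plus a slow-path tally. */
struct demo_state {
    int8_t fast_counter;        /* stands in for the byte on the C stack */
    unsigned long slow_counter; /* stands in for the in-memory counter */
    unsigned calls;             /* times the C function would be called */
};

/* Emulates "add 1 to the byte, take the slow path on signed overflow". */
static void demo_branch_check(struct demo_state *s)
{
    if (s->fast_counter == INT8_MAX) {
        s->fast_counter = INT8_MIN;  /* value the wrapped add leaves behind */
        s->slow_counter += 128;
        s->calls++;
    } else {
        s->fast_counter++;
    }
}
```

Under these assumptions the first slow call happens on the 128th branch and later ones every 256 branches, so the common path is a single byte increment and a not-taken jump.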

Inlined F_{DUMB_,}RETURN, F_BRANCH_WHEN_{EQ,NE} F_ADD_NEG_INT,
F_ADD_INT and F_ADD_INTS.

2012-06-13

2012-06-13 20:40:26 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Inline F_POP_TO_MARK and F_FILL_STACK.

2012-06-13 20:37:21 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): low_mov_mem_reg() now supports REG_R12...

2012-06-13 20:35:50 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Support using labels for backward jumps too.

2012-06-13 17:10:41 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Minor optimization of F_LOCAL_2_LOCAL.

2012-06-13 17:01:23 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed typo in add_reg_imm_reg()

This typo broke F_LOCAL_2_LOCAL (and probably others).

2012-06-13 12:13:13 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed a few ins_debug_instr_prologue() calls.

2012-06-13 08:58:07 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed typo in F_SWAP.

2012-06-13 00:58:29 by Per Hedbor <ph@opera.com>

Added yet more 'native' opcodes for AMD64/x86_64

Added inline versions of F_DUP, F_SWAP, F_LOOP and F_LOCAL_2_LOCAL.
This almost doubled the speed of the 'Loops Nested (local)' benchmark.

2012-06-12

2012-06-12 22:33:12 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Got rid of some C++-style comments.

2012-06-12 21:16:15 by Per Hedbor <ph@opera.com>

Rewrote x86_64/AMD64 native code generation.

De-macrofied the generator to make it at least somewhat easier to
read, also attempted to make the opcode implementations easier to
understand (at least if you have a copy of the Intel Software
Developer's Manual in front of you).

There are known issues with some opcodes, not all mov_* work with all
registers as arguments (due to the x86 instruction encoding for
register&7==[3,4]).

Added a few more instructions, and changed some occurrences of things
like 'mov $4, eax; mov eax,[ecx+off]' to 'mov $4,[ecx+off]'.

Added a simple branch/label system to make it somewhat easier to write
more complex opcodes.

Inlined or partially inlined some opcodes:

THIS_OBJECT, ASSIGN_LOCAL, ASSIGN_LOCAL_AND_POP, ASSIGN_GLOBAL,
ASSIGN_GLOBAL_AND_POP, POP_VALUE, SIZEOF_LOCAL, CONSTANT,
GLOBAL_LVALUE, LOCAL_LVALUE, BRANCH_IF[_NOT]_LOCAL and fixed INIT_FRAME.

Changed handling of 'check_threads_etc', it is now only called every
1024 branches (or, like normal, when functions are called).

Overall pure pike-code execution is about 20% faster.

2012-06-11

2012-06-11 19:23:19 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Fixed typo.

2012-06-11 06:44:24 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Disabled inlining of F_INIT_FRAME.

Needs either 16-bit or 32-bit store.

2012-06-10

2012-06-10 20:21:10 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Inline some of the new opcodes.

2011-07-10

2011-07-10 09:25:54 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Interpreter mega patch: The global Pike_interpreter struct replaced with Pike_interpreter_pointer.

2011-05-26

2011-05-26 16:59:29 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Inline a few more opcodes.

2011-05-24

2011-05-24 16:28:12 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Inline a few more opcodes.

2011-05-23

2011-05-23 15:50:57 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Improved robustness for PC-relative addressing.

Should now support MacOS X.

2011-05-23 15:22:14 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Compensate for larger CALL_ABSOLUTE() on MacOS X.

2011-05-20

2011-05-20 13:32:46 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Some fixes and potential support for MacOS X.

2011-05-16

2011-05-16 21:22:41 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Changed calling conventions for inter_return_opcode_F_CATCH().

2011-05-16 06:41:03 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Check stack alignment in debug mode.

2011-05-15

2011-05-15 20:09:04 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Several bugfixes in the code-generator.

2011-05-15 09:58:13 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Use LEA (%rip) to get the program pointer.

Also some fixes of AMD64_MOVE_REG_TO_RELADDR().

2011-05-12

2011-05-12 16:49:14 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler (amd64): Use MOV instead of LEA to save a byte.

2011-05-11

2011-05-11 20:38:06 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Inline some common opcodes in the amd64 generator.

2011-05-11 17:34:05 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Removed broken remnant of old code for amd64 machine-code.

2011-05-11 16:55:54 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Support for machine-code for amd64 (aka x86_64) now seems to work.

2011-05-09

2011-05-09 16:30:38 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Compiler: Implemented partial machine-code support for amd64 (aka x86_64).

2006-09-08

2006-09-08 17:20:46 by Henrik Grubbström (Grubba) <grubba@grubba.org>

Pike opcode arguments are signed (cf struct p_instr_s).

Rev: src/code/amd64.c:1.2
Rev: src/code/bytecode.c:1.8
Rev: src/code/computedgoto.c:1.5
Rev: src/code/ia32.c:1.46
Rev: src/code/ia32.h:1.30
Rev: src/code/ppc32.c:1.41
Rev: src/code/sparc.c:1.48
Rev: src/pikecode.h:1.14

2006-04-27

2006-04-27 09:37:34 by Tor Edvardsson <tor.edvardsson@gmail.com>

Source file for amd64 code generation.

Rev: src/code/amd64.c:1.1