Rewrote x86_64/AMD64 native code generation.
De-macrofied the generator to make it at least somewhat easier to
read, and attempted to make the opcode implementations easier to
understand (at least if you have a copy of the Intel Software
Developer's Manual in front of you).
There are known issues with some opcodes: not all mov_* work with all
registers as arguments (due to the x86 instruction encoding for
Added a few more instructions, and changed some occurrences of things
like 'mov $4, eax; mov eax,[ecx+off]' to 'mov $4,[ecx+off]'.
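To illustrate why the single-instruction form is possible, here is a
minimal sketch of how a store of an immediate to [base+off] can be
encoded on x86_64. The function name and signature are hypothetical
and are not the generator's actual API; it assumes a 64-bit store with
a 32-bit displacement and skips base registers that need a SIB byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the generator's real API.
 * Emits "mov qword [base + disp32], imm32" (imm32 is sign-extended to
 * 64 bits) and returns the number of bytes written (always 11). */
static size_t emit_mov_imm32_to_mem(uint8_t *out, int base_reg,
                                    int32_t disp, int32_t imm)
{
  size_t n = 0;
  int i;

  assert(base_reg >= 0 && base_reg < 16);
  /* rsp/r12 as base would need a SIB byte; left out of this sketch. */
  assert((base_reg & 7) != 4);

  out[n++] = (uint8_t)(0x48 | (base_reg >> 3)); /* REX.W, REX.B for r8-r15 */
  out[n++] = 0xC7;                              /* MOV r/m64, imm32 (/0) */
  out[n++] = (uint8_t)(0x80 | (base_reg & 7));  /* ModRM: mod=10, reg=0 */
  for (i = 0; i < 4; i++)                       /* disp32, little endian */
    out[n++] = (uint8_t)((uint32_t)disp >> (8 * i));
  for (i = 0; i < 4; i++)                       /* imm32, little endian */
    out[n++] = (uint8_t)((uint32_t)imm >> (8 * i));
  return n;
}
```

With rcx as base (register number 1), an offset of 0x10 and the value
4, this writes the bytes 48 C7 81 10 00 00 00 04 00 00 00.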
Added a simple branch/label system to make it somewhat easier to write
more complex opcodes.
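As a rough idea of what such a branch/label system involves, the
sketch below records forward jumps and patches their rel32 fields once
the label is bound. All names here (code_buf, label, emit_jmp,
bind_label) are hypothetical and chosen for illustration; the actual
implementation in the generator may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FIXUPS 8

struct code_buf { uint8_t bytes[256]; size_t len; };

struct label {
  long target;               /* buffer offset, or -1 while unbound */
  size_t fixups[MAX_FIXUPS]; /* offsets of rel32 fields to patch later */
  int n_fixups;
};

static void label_init(struct label *l) { l->target = -1; l->n_fixups = 0; }

/* Store target - (end of rel32 field) at offset `at`, little endian. */
static void patch_rel32(struct code_buf *c, size_t at, long target)
{
  uint32_t rel = (uint32_t)(int32_t)(target - (long)(at + 4));
  int i;
  for (i = 0; i < 4; i++) c->bytes[at + i] = (uint8_t)(rel >> (8 * i));
}

/* Emit "jmp rel32" (opcode E9) to a label that may not be bound yet. */
static void emit_jmp(struct code_buf *c, struct label *l)
{
  size_t field;
  assert(c->len + 5 <= sizeof c->bytes);
  c->bytes[c->len++] = 0xE9;
  field = c->len;
  c->len += 4;
  if (l->target >= 0) {
    patch_rel32(c, field, l->target);   /* backward jump: known now */
  } else {
    assert(l->n_fixups < MAX_FIXUPS);
    l->fixups[l->n_fixups++] = field;   /* forward jump: patch on bind */
  }
}

/* Bind the label to the current position and resolve pending jumps. */
static void bind_label(struct code_buf *c, struct label *l)
{
  int i;
  assert(l->target < 0);
  l->target = (long)c->len;
  for (i = 0; i < l->n_fixups; i++)
    patch_rel32(c, l->fixups[i], l->target);
  l->n_fixups = 0;
}
```

An opcode implementation would call emit_jmp() wherever it needs to
skip ahead and bind_label() at the point where execution should
resume, without computing any displacement by hand.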
Inlined or partially inlined some opcodes:
THIS_OBJECT, ASSIGN_LOCAL, ASSIGN_LOCAL_AND_POP, ASSIGN_GLOBAL,
ASSIGN_GLOBAL_AND_POP, POP_VALUE, SIZEOF_LOCAL, CONSTANT,
GLOBAL_LVALUE, LOCAL_LVALUE, BRANCH_IF[_NOT]_LOCAL.
Also fixed INIT_FRAME.
Changed handling of 'check_threads_etc': it is now only called every
1024 branches (or, as before, when functions are called).
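The idea can be sketched in C as a counter that triggers the slow
check only when its low 10 bits wrap to zero. This is an illustration
under assumptions: the names below are made up, the stub stands in for
the real check_threads_etc, and how the generated code actually keeps
its counter is not described here.

```c
#include <assert.h>

static unsigned branch_counter; /* hypothetical per-interpreter counter */
static unsigned slow_checks;    /* number of slow-path calls, for demo */

/* Stand-in for the real check_threads_etc. */
static void check_threads_etc_stub(void) { slow_checks++; }

/* Called on every branch; takes the slow path once per 1024 calls. */
static void on_branch(void)
{
  if ((++branch_counter & 1023u) == 0)
    check_threads_etc_stub();
}
```

Since 1024 is a power of two, the test is a single mask and compare,
which keeps the common path cheap.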
Overall pure pike-code execution is about 20% faster.