[compiler] Significantly faster simple loops

New opcodes: ASSIGN_LOCAL_NUMBER_AND_POP, ADD_LOCAL_NUMBER_AND_POP,
ADD_LOCAL_LOCAL_AND_POP and ASSIGN_GLOBAL_NUMBER_AND_POP.

The rationale for the assign_local variants is that it is
significantly faster to do local=local and local+=[number||local]
directly than it is to do local&, number, f_add_to and similar. The
locals act much like registers: they are easy to assign values to from
the machine-code level.

Also added some perhaps dubious optimizations of the code that the
treeoptimizer produces for for-loops.

As a result, the NestedLoops* tests are about eight times faster and
run entirely in native code, without any function calls.
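
To illustrate the rationale, here is a minimal sketch in C of a toy stack VM. It is not Pike's interpreter or machine-code generator. The OP_* names, the run() function and the stack-operation counter are all hypothetical and exist only for this example. It compares an unfused "push reference, push number, add-to, pop" sequence with a single fused opcode that updates the local in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack VM; not Pike's real encoding. */
enum op {
    OP_PUSH_LOCAL_REF,           /* arg: local index; pushes a reference */
    OP_PUSH_NUMBER,              /* arg: constant; pushes it */
    OP_ADD_TO,                   /* pops value and ref, adds, pushes result */
    OP_POP,                      /* discards the top of the stack */
    OP_ADD_LOCAL_NUMBER_AND_POP, /* args: local index, constant */
    OP_HALT
};

#define STACK_MAX 16

/* Runs code against locals and returns how many stack pushes and pops
 * were performed, or -1 on an unknown opcode. */
static int run(const int *code, long *locals)
{
    long stack[STACK_MAX];
    int sp = 0;
    int stack_ops = 0;

    for (size_t pc = 0;;) {
        switch (code[pc++]) {
        case OP_PUSH_LOCAL_REF:
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];   /* the "reference" is the index */
            stack_ops++;
            break;
        case OP_PUSH_NUMBER:
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];
            stack_ops++;
            break;
        case OP_ADD_TO: {
            assert(sp >= 2);
            long value = stack[--sp];
            long ref = stack[--sp];
            locals[ref] += value;
            stack[sp++] = locals[ref];  /* result stays on the stack */
            stack_ops += 3;             /* two pops and one push */
            break;
        }
        case OP_POP:
            assert(sp >= 1);
            sp--;
            stack_ops++;
            break;
        case OP_ADD_LOCAL_NUMBER_AND_POP: {
            /* Fused form: the local is updated in place, much like a
             * register, and the stack is never touched. */
            int idx = code[pc++];
            locals[idx] += code[pc++];
            break;
        }
        case OP_HALT:
            return stack_ops;
        default:
            return -1;
        }
    }
}

/* "local0 += 5;" as an expression statement, unfused and fused. */
static const int unfused[] = {
    OP_PUSH_LOCAL_REF, 0, OP_PUSH_NUMBER, 5, OP_ADD_TO, OP_POP, OP_HALT
};
static const int fused[] = {
    OP_ADD_LOCAL_NUMBER_AND_POP, 0, 5, OP_HALT
};
```

Both programs leave the same value in the local. In this toy model the unfused one needs six stack operations and four dispatches, while the fused one needs no stack operations and a single dispatch. That is the kind of saving the new opcodes aim for.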