[amd64] Fixed the comparison opcodes for real. Also use 32-bit comparisons when possible; this saves one byte of generated code per comparison because no REX prefix is needed.
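
To illustrate the size difference, here is a minimal sketch of a register-to-register `cmp` encoder. It is not the code from this change: `emit_cmp_rr` and its parameters are hypothetical names, and it assumes the `cmp r/m, r` form (opcode 0x39 /r) with hardware register numbers 0-15. Note that the byte is saved only when both operands are among the low eight registers; r8-r15 still require a REX prefix even for a 32-bit comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original change): encode
 * `cmp dst, src` for two registers using opcode 0x39 /r.
 * dst and src are hardware register numbers (0 = rax/eax, 1 = rcx/ecx, ...).
 * Returns the number of bytes written to buf (at most 3). */
static size_t emit_cmp_rr(uint8_t *buf, int dst, int src, int is64)
{
    assert(dst >= 0 && dst < 16 && src >= 0 && src < 16);

    size_t n = 0;
    uint8_t rex = 0x40;            /* REX base: 0100WRXB */

    if (is64)    rex |= 0x08;      /* REX.W: 64-bit operand size */
    if (src & 8) rex |= 0x04;      /* REX.R: extends the ModRM.reg field */
    if (dst & 8) rex |= 0x01;      /* REX.B: extends the ModRM.rm field */

    /* Emit the prefix only when one of its bits is actually set. */
    if (rex != 0x40)
        buf[n++] = rex;

    buf[n++] = 0x39;               /* CMP r/m, r */
    /* ModRM: mod = 11 (register direct), reg = src, rm = dst */
    buf[n++] = (uint8_t)(0xC0 | ((src & 7) << 3) | (dst & 7));
    return n;
}
```

Under these assumptions, `cmp eax, ecx` encodes as `39 C8` (2 bytes), while `cmp rax, rcx` encodes as `48 39 C8` (3 bytes).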