cybercyst/go

mirror of https://github.com/golang/go.git synced 2025-05-27 18:31:35 +00:00

Author	SHA1	Message	Date
Cherry Zhang	ee330385ca	cmd/internal/obj, runtime: preempt & restart some instruction sequences On some architectures, for async preemption the injected call needs to clobber a register (usually REGTMP) in order to return to the preempted function. As a consequence, the PC ranges where REGTMP is live are not preemptible. The uses of REGTMP are usually generated by the assembler, where it needs to load or materialize a large constant or offset that doesn't fit into the instruction. In those cases, REGTMP is not live at the start of the instruction sequence. Instead of giving up preemption in those cases, we could preempt it and restart the sequence when resuming the execution. Basically, this is like reissuing an interrupted instruction, except that here the "instruction" is a Prog that consists of multiple machine instructions. For this to work, we need to generate PC data to mark the start of the Prog. Currently this is only done for ARM64. TODO: the split-stack function prologue is currently not async preemptible. We could use this mechanism, preempt it and restart at the function entry. Change-Id: I37cb282f8e606e7ab6f67b3edfdc6063097b4bd1 Reviewed-on: https://go-review.googlesource.com/c/go/+/208126 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-05-06 15:41:12 +00:00
Ruixin(Peter) Bao	a45ea55da7	cmd/internal: allow ADDE to work with memory location on s390x Originally on s390x, ADDE did not work when adding numbers from a memory location. For example: ADDE (R3), R4 would result in a failure. Since ADDC, ADD and ADDW already support adding from a memory location, let's support that for ADDE as well. Change-Id: I7cbe112ea154733a621b948c6a21bbee63fb0c62 Reviewed-on: https://go-review.googlesource.com/c/go/+/229304 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-22 11:37:03 +00:00
Ruixin(Peter) Bao	553a8626ba	cmd/internal: add MVCIN instruction to s390x assembler On s390x, we already have MVCIN opcode in asmz.go, but we did not use it. This CL uses that opcode and adds MVCIN instruction. MVCIN instruction can be used to move data from one storage location to another while reversing the order of bytes within the field. This could be useful when transforming data from little-endian to big-endian. Change-Id: Ifa1a911c0d3442f4a62f91f74ed25b196d01636b Reviewed-on: https://go-review.googlesource.com/c/go/+/227478 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-07 15:03:09 +00:00
Russ Cox	fc8a6336d1	cmd/asm, cmd/compile, runtime: add -spectre=ret mode This commit extends the -spectre flag to cmd/asm and adds a new Spectre mitigation mode "ret", which enables the use of retpolines. Retpolines prevent speculation about the target of an indirect jump or call and are described in more detail here: https://support.google.com/faqs/answer/7625886 Change-Id: I4f2cb982fa94e44d91e49bd98974fd125619c93a Reviewed-on: https://go-review.googlesource.com/c/go/+/222661 Reviewed-by: Keith Randall <khr@golang.org>	2020-03-13 19:05:54 +00:00
Cherry Zhang	4751db93ef	cmd/internal/obj/s390x: mark unsafe points For async preemption, we will be using REGTMP as a temporary register in injected call on S390X, which will clobber it. So any code that uses REGTMP is not safe for async preemption. In the assembler backend, we expand a Prog to multiple machine instructions and use REGTMP as a temporary register if necessary. These need to be marked unsafe. Unlike ARM64 and MIPS, instructions on S390X are variable length so we don't use the length as a condition. Instead, we set a bit on the Prog whenever REGTMP is used. Change-Id: Ie5d14068a950f4c7cea51dff2c4a8bdc19ec9348 Reviewed-on: https://go-review.googlesource.com/c/go/+/204105 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2019-11-07 20:34:27 +00:00
Cherry Zhang	1da575a7bc	cmd/internal/obj/s390x: add support of SPM instruction For restoring condition code (we already support IPM instruction for saving condition code). Change-Id: I56d376df44a5f831134a130d052521cec6b5b781 Reviewed-on: https://go-review.googlesource.com/c/go/+/204104 Reviewed-by: Michael Munday <mike.munday@ibm.com>	2019-11-04 17:19:36 +00:00
Ruixin(Peter) Bao	a8fc82f77a	cmd/asm/internal/asm/testdata/s390x: add test cases for some assembly instructions From CL 199979, I noticed that there were some instructions not covered by the test cases. Added those in this CL. Additional tests for assembly instructions are also added based on suggestions made during the review of this CL. Previously, VSB and VSH were not included in asmz.go; they are also added in this patch. Change-Id: I6060a9813b483a161d61ad2240c30eec6de61536 Reviewed-on: https://go-review.googlesource.com/c/go/+/203721 Reviewed-by: Michael Munday <mike.munday@ibm.com>	2019-11-04 15:43:40 +00:00
Michael Munday	38c4a73706	cmd/asm: add s390x branch-on-count instructions The branch-on-count instructions on s390x decrement the input register and then compare its value to 0. If not equal the branch is taken. These instructions are useful for implementing loops with a set number of iterations (which might be in a register). For example, this for loop: for i := 0; i < n; i++ { ... // i is not used or modified in the loop } Could be implemented using this assembly: MOVD Rn, Ri loop: ... BRCTG Ri, loop Note that i will count down from n in the assembly whereas in the original for loop it counted up to n which is why we can't use i in the loop. These instructions will only be used in hand-written codegen and assembly for now since SSA blocks cannot currently modify values. We could look into this in the future though. Change-Id: Iaab93b8aa2699513b825439b8ea20d8fe2ea1ee6 Reviewed-on: https://go-review.googlesource.com/c/go/+/199977 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-09 15:04:59 +00:00
Michael Munday	cf03238020	cmd/compile: use numeric condition code masks on s390x Prior to this CL conditional branches on s390x always used an extended mnemonic such as BNE, BLT and so on to represent branch instructions with different condition code masks. This CL adds support for numeric condition code masks to the s390x SSA backend so that we can encode the condition under which a Block's successor is chosen as a field in that Block rather than in its type. This change will be useful as we come to add support for combined compare-and-branch instructions. Rather than trying to add extended mnemonics for every possible combination of mask and compare-and- branch instruction we can instead use a single mnemonic for each instruction. Change-Id: Idb7458f187b50906877d683695c291dff5279553 Reviewed-on: https://go-review.googlesource.com/c/go/+/197178 Reviewed-by: Keith Randall <khr@golang.org>	2019-09-26 14:47:12 +00:00
Michael Munday	8c99e45ef9	cmd/asm: add masked branch and conditional load instructions to s390x The branch-relative-on-condition (BRC) instruction allows us to use an immediate to specify under what conditions the branch is taken. For example, `BRC $7, L1` is equivalent to `BNE L1`. It is sometimes useful to specify branches in this way when either we don't have an extended mnemonic for a particular mask value or we want to generate the condition code mask programmatically. The new load-on-condition (LOCR and LOCGR) and compare-and-branch (CRJ, CGRJ, CLRJ, CLGRJ, CIJ, CGIJ, CLIJ and CLGIJ) instructions provide the same flexibility for conditional loads and combined compare and branch instructions. Change-Id: Ic6f5d399b0157e278b39bd3645f4ee0f4df8e5fc Reviewed-on: https://go-review.googlesource.com/c/go/+/196558 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-09-25 22:24:41 +00:00
Ruixin Bao	98aa97806b	cmd/compile: add math/bits.Mul64 intrinsic on s390x This change adds an intrinsic for Mul64 on s390x. To achieve that, a new assembly instruction, MLGR, is introduced in s390x/asmz.go. This assembly instruction directly uses an existing instruction on Z and supports multiplication of two 64-bit unsigned integers, storing the result in two separate registers. In this case, we require the multiplicand to be stored in register R3 and the output result (the high and low 64 bits of the product) to be stored in R2 and R3 respectively. A test case is also added. Benchmark: name old time/op new time/op delta Mul-18 11.1ns ± 0% 1.4ns ± 0% -87.39% (p=0.002 n=8+10) Mul32-18 2.07ns ± 0% 2.07ns ± 0% ~ (all equal) Mul64-18 11.1ns ± 1% 1.4ns ± 0% -87.42% (p=0.000 n=10+10) Change-Id: Ieca6ad1f61fff9a48a31d50bbd3f3c6d9e6675c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/194572 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-09-13 09:04:48 +00:00
Ruixin Bao	90e0b40ab2	cmd/internal/obj/s390x: use 12 bit load and store instruction when possible on s390x Originally, we default to use load and store instruction with 20 bit displacement. However, that is not necessary. Some instructions have a displacement smaller than 12 bit. This CL allows the usage of 12 bit load and store instruction when that happens. This change also reduces the size of .text section in go binary by 19 KB. Some tests are also added to verify the functionality of the change. Change-Id: I13edea06ca653d4b9ffeaefe8d010bc2f065c2ba Reviewed-on: https://go-review.googlesource.com/c/go/+/194857 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-09-12 15:49:54 +00:00
Ruixin(Peter) Bao	11c2411c50	cmd/compile/internal/s390x: replace 4-byte NOP with a 2-byte NOP on s390x Added a new instruction, NOPH, with the encoding [0x0700] (i.e. bcr 0, 0), and replaced the current 4-byte nop that was encoded using the WORD instruction. This reduces the size of the .text section in the go binary by around 17KB and makes generated code easier to read. Change-Id: I6a756df39e93c4415ea6d038ba4af001b8ccb286 Reviewed-on: https://go-review.googlesource.com/c/go/+/194344 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-09-11 12:19:26 +00:00
Shulhan	ed7f323c8f	all: simplify code using "gofmt -s -w" Most changes remove the redundant type declaration when directly instantiating a value of a map or slice, e.g. []T{T{}} becomes []T{{}}. A few changes remove the high bound of a subslice when its value is the length of the slice itself, e.g. T[:len(T)] becomes T[:]. The following file is excluded due to incompatibility with go1.4, - src/cmd/compile/internal/gc/ssa.go Change-Id: Id3abb09401795ce1e6da591a89749cba8502fb26 Reviewed-on: https://go-review.googlesource.com/c/go/+/166437 Run-TryBot: Dave Cheney <dave@cheney.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-05-06 22:19:22 +00:00
Michael Munday	2c1b5130aa	cmd/compile: add math/bits.{Add,Sub}64 intrinsics on s390x This CL adds intrinsics for the 64-bit addition and subtraction functions in math/bits. These intrinsics use the condition code to propagate the carry or borrow bit. To make the carry chains more efficient I've removed the 'clobberFlags' property from most of the load and store operations. Originally these ops did clobber flags when using offsets that didn't fit in a signed 20-bit integer, however that is no longer true. As with other platforms the intrinsics are faster when executed in a chain rather than a loop because currently we need to spill and restore the carry bit between each loop iteration. We may be able to reduce the need to do this on s390x (e.g. by using compare-and-branch instructions that do not clobber flags) in the future. name old time/op new time/op delta Add64 1.21ns ± 2% 2.03ns ± 2% +67.18% (p=0.000 n=7+10) Add64multiple 2.98ns ± 3% 1.03ns ± 0% -65.39% (p=0.000 n=10+9) Sub64 1.23ns ± 4% 2.03ns ± 1% +64.85% (p=0.000 n=10+10) Sub64multiple 3.73ns ± 4% 1.04ns ± 1% -72.28% (p=0.000 n=10+8) Change-Id: I913bbd5e19e6b95bef52f5bc4f14d6fe40119083 Reviewed-on: https://go-review.googlesource.com/c/go/+/174303 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-05-03 10:41:15 +00:00
Michael Munday	0f79510dc5	cmd/asm: add s390x 'rotate then ... selected bits' instructions This CL adds the following instructions, useful for shifting/rotating and masking operations: * RNSBG - rotate then and selected bits * ROSBG - rotate then or selected bits * RXSBG - rotate then exclusive or selected bits * RISBG - rotate then insert selected bits It also adds the 'T' (test), 'Z' (zero), 'H' (high), 'L' (low) and 'N' (no test) variants of these instructions as appropriate. Operands are ordered as: I₃, I₄, I₅, R₂, R₁. Key: I₃=start, I₄=end, I₅=amount, R₂=source, R₁=destination Change-Id: I200d12287e1df7447f37f4919da5e9a93d27c792 Reviewed-on: https://go-review.googlesource.com/c/go/+/159357 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-04-16 09:17:24 +00:00
Michael Munday	9c843f031d	cmd/internal/obj/s390x: handle RestArgs in s390x assembler Allow up to 3 RestArgs arguments to be specified. This is needed for us to add the 'rotate and ... bits' instructions, which require 5 arguments, cleanly. Change-Id: I76b89adfb5e3cd85a43023e412f0cc202d489e0b Reviewed-on: https://go-review.googlesource.com/c/go/+/171726 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-04-16 09:17:14 +00:00
Michael Munday	e61985427e	cmd/internal/obj/s390x: remove param field from optab The param field isn't useful, we can just use REGSP instead. Change-Id: I2ac68131c390209cc84e43aa7620ccbf5ae69120 Reviewed-on: https://go-review.googlesource.com/c/go/+/171725 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-04-16 09:17:06 +00:00
Michael Munday	6966b67510	cmd/asm: add 'insert program mask' instruction for s390x This CL adds the 'insert program mask' (IPM) instruction to s390x. IPM stores the current program mask (which contains the condition code) into a general purpose register. This instruction will be useful when implementing intrinsics for the arithmetic functions in the math/bits package. We can also potentially use it to convert some condition codes into bool values. The condition code can be saved and restored using an instruction sequence such as: IPM R4 // save condition code to R4 ... TMLH R4, $0x3000 // restore condition code from R4 We can also use IPM to save the carry bit to a register using an instruction sequence such as: IPM R4 // save condition code to R4 RISBLGZ $31, $31, $3, R4, R4 // isolate carry bit in R4 Change-Id: I169d450b6ea1a7ff8c0286115ddc42618da8a2f4 Reviewed-on: https://go-review.googlesource.com/c/go/+/165997 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-03-29 17:34:06 +00:00
Ian Lance Taylor	e546ef123e	cmd/internal/obj/s390x: don't crash on invalid instruction I didn't bother with a test as there doesn't seem to be an existing framework for testing assembler failures, and tests for invalid code aren't all that interesting. Fixes #26700 Change-Id: I719410d83527802a09b9d38625954fdb36a3c0f7 Reviewed-on: https://go-review.googlesource.com/c/153177 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-12-07 15:43:09 +00:00
bill_ofarrell	3f3142ad99	cmd/asm: add s390x VMSLG instruction variants VMSLG has three variants on z14 and later machines. These variants are used in "limbified" squaring: VMSLEG: Even Shift Indication -- the even-indexed intermediate result is doubled VMSLOG: Odd Shift Indication -- the odd-indexed intermediate result is doubled VMSLEOG: Even and Odd Shift Indication -- both intermediate results are doubled Limbified squaring is very useful for high performance cryptographic algorithms, such as elliptic curve. This change allows these instructions to be used in Go assembly. Change-Id: Iaad577b07320205539f99b3cb37a2a984882721b Reviewed-on: https://go-review.googlesource.com/c/145180 Reviewed-by: Michael Munday <mike.munday@ibm.com>	2018-10-29 09:54:51 +00:00
Michael Munday	6f9b94ab66	cmd/compile: implement OnesCount{8,16,32,64} intrinsics on s390x This CL implements the math/bits.OnesCount{8,16,32,64} functions as intrinsics on s390x using the 'population count' (popcnt) instruction. This instruction was released as the 'population-count' facility which uses the same facility bit (45) as the 'distinct-operands' facility which is a pre-requisite for Go on s390x. We can therefore use it without a feature check. The s390x popcnt instruction treats a 64 bit register as a vector of 8 bytes, summing the number of ones in each byte individually. It then writes the results to the corresponding bytes in the output register. Therefore to implement OnesCount{16,32,64} we need to sum the individual byte counts using some extra instructions. To do this efficiently I've added some additional pseudo operations to the s390x SSA backend. Unlike other architectures the new instruction sequence is faster for OnesCount8, so that is implemented using the intrinsic. name old time/op new time/op delta OnesCount 3.21ns ± 1% 1.35ns ± 0% -58.00% (p=0.000 n=20+20) OnesCount8 0.91ns ± 1% 0.81ns ± 0% -11.43% (p=0.000 n=20+20) OnesCount16 1.51ns ± 3% 1.21ns ± 0% -19.71% (p=0.000 n=20+17) OnesCount32 1.91ns ± 0% 1.12ns ± 1% -41.60% (p=0.000 n=19+20) OnesCount64 3.18ns ± 4% 1.35ns ± 0% -57.52% (p=0.000 n=20+20) Change-Id: Id54f0bd28b6db9a887ad12c0d72fcc168ef9c4e0 Reviewed-on: https://go-review.googlesource.com/114675 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-09-03 14:35:38 +00:00
Michael Munday	9fa988547a	cmd/internal/obj/s390x: increase maximum number of loop iterations The maximum number of 'spanz' iterations that the s390x assembler performs to reach a fixed point for relative offsets was 10. This turned out to be too aggressive for one example of auto-generated fuzzing code. Increase the number of iterations by 10x to reduce the likelihood that the limit will be hit again. This limit only exists to help find bugs in the assembler. master at tip does not fail with the example code in the issue, I have therefore not submitted it as a test (it is also quite large). I tested this change with the example code at the commit given and it fixes the issue. Fixes #25269. Change-Id: I0e44948957a7faff51c7d27c0b7746ed6e2d47bb Reviewed-on: https://go-review.googlesource.com/122235 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-05 07:21:50 +00:00
Daniel Martí	9ecf899b29	cmd: remove some unnecessary gotos Pick the low-hanging fruit, which are the gotos that don't go very far and labels that aren't used often. All of them have easy replacements with breaks and returns. One slightly tricky rewrite is defaultlitreuse. We cannot use a defer func to reset lineno, because one of its return paths does not reset lineno, and thus broke toolstash -cmp. Passes toolstash -cmp on std cmd. Change-Id: Id1c0967868d69bb073addc7c5c3017ca91ff966f Reviewed-on: https://go-review.googlesource.com/110063 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-05-01 10:46:08 +00:00
bill_ofarrell	3c65bb5b90	cmd/asm: add s390x VMSLG instruction This instruction was introduced on the z14 to accelerate "limbified" multiplications for certain cryptographic algorithms. This change allows it to be used in Go assembly. Change-Id: Ic93dae7fec1756f662874c08a5abc435bce9dd9e Reviewed-on: https://go-review.googlesource.com/109695 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-28 22:15:25 +00:00
Michael Munday	32e6461dc6	cmd/asm, math: add s390x floating point test instructions Floating point test instructions allow special cases (NaN, ±∞ and a few other useful properties) to be checked directly. This CL adds the following instructions to the assembler: * LTEBR - load and test (float32) * LTDBR - load and test (float64) * TCEB - test data class (float32) * TCDB - test data class (float64) Note that I have only added immediate versions of the 'test data class' instructions for now as that's the only case I think the compiler will use. Change-Id: I3398aab2b3a758bf909bd158042234030c8af582 Reviewed-on: https://go-review.googlesource.com/104457 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-03 16:08:04 +00:00
Michael Munday	c280126557	cmd/asm, cmd/internal/obj/s390x, math: add "test under mask" instructions Adds the following s390x test under mask (immediate) instructions: TMHH TMHL TMLH TMLL These are useful for testing bits and are already used in the math package. Change-Id: Idffb3f83b238dba76ac1e42ac6b0bf7f1d11bea2 Reviewed-on: https://go-review.googlesource.com/41092 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-10-30 23:55:14 +00:00
Michael Munday	96cdacb971	cmd/asm, cmd/compile: optimize math.Abs and math.Copysign on s390x This change adds three new instructions: - LPDFR: load positive (math.Abs(x)) - LNDFR: load negative (-math.Abs(x)) - CPSDR: copy sign (math.Copysign(x, y)) By making use of GPR <-> FPR moves we can now compile math.Abs and math.Copysign to these instructions using SSA rules. This CL also adds new rules to merge address generation into combined load operations. This makes GPR <-> FPR move matching more reliable. name old time/op new time/op delta Copysign 1.85ns ± 0% 1.40ns ± 1% -24.65% (p=0.000 n=8+10) Abs 1.58ns ± 1% 0.73ns ± 1% -53.64% (p=0.000 n=10+10) The geo mean improvement for all math package benchmarks was 4.6%. Change-Id: I0cec35c5c1b3fb45243bf666b56b57faca981bc9 Reviewed-on: https://go-review.googlesource.com/73950 Run-TryBot: Michael Munday <mike.munday@ibm.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-30 23:42:51 +00:00
isharipo	8c67f210a1	cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr) This change makes it easier to express instructions with arbitrary number of operands. Rationale: previous approach with operand "hiding" does not scale well, AVX and especially AVX512 have many instructions with 3+ operands. x86 asm backend is updated to handle up to 6 explicit operands. It also fixes issue with 4-th immediate operand type checks. All `ytab` tables are updated accordingly. Changes to non-x86 backends only include these patterns: `p.From3 = X` => `p.SetFrom3(X)` `p.From3.X = Y` => `p.GetFrom3().X = Y` Over time, other backends can adapt Prog.RestArgs and reduce the amount of workarounds. -- Performance -- x/benchmark/build: $ benchstat upstream.bench patched.bench name old time/op new time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old binary-size new binary-size delta Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal) name old build-time/op new build-time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old build-peak-RSS-bytes new build-peak-RSS-bytes delta Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10) name old build-user+sys-time/op new build-user+sys-time/op delta Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10) Microbenchmark shows a slight slowdown. name old time/op new time/op delta AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15) func BenchmarkAMD64asm(b *testing.B) { for i := 0; i < b.N; i++ { TestAMD64EndToEnd(nil) TestAMD64Encoder(nil) } } Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183 Reviewed-on: https://go-review.googlesource.com/63490 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-09-15 21:05:03 +00:00
Cherry Zhang	f20944de78	cmd/compile: set/unset base register for better assembly print For address of an auto or arg, on all non-x86 architectures the assembler backend encodes the actual SP offset in the instruction but leaves the offset in Prog unchanged. When the assembly is printed in compile -S, it shows an offset relative to pseudo FP/SP with an actual hardware SP base register (e.g. R13 on ARM). This is confusing. Unset the base register if it is indeed SP, so the assembly output is consistent. If the base register isn't SP, it should be an error and the error output contains the actual base register. For address loading instructions, the base register isn't set in the compiler on non-x86 architectures. Set it. Normally it is SP and will be unset in the change mentioned above for printing. If it is not, it will be an error and the error output contains the actual base register. No change in generated binary, only printed assembly. Passes "go build -a -toolexec 'toolstash -cmp' std cmd" on all architectures. Fixes #21064. Change-Id: Ifafe8d5f9b437efbe824b63b3cbc2f5f6cdc1fd5 Reviewed-on: https://go-review.googlesource.com/49432 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-08-02 12:24:02 +00:00
Michael Munday	35cf3843a4	cmd/{asm,compile}: avoid zeroAuto clobbering flags on s390x This CL modifies how MOV[DWHB] instructions that store a constant to memory are assembled to avoid them clobbering the condition code (flags). It also modifies zeroAuto to use MOVD instructions instead of CLEAR (which is assembled as XC). MOV[DWHB]storeconst ops also no longer clobber flags. Note: this CL modifies the assembler so that it can no longer handle immediates outside the range of an int16 or offsets from SB, which reflects what the machine instructions support. The compiler doesn't need this capability any more and I don't think this affects any existing assembly, but it is easy to work around if it does. Fixes #20187. Change-Id: Ie54947ff38367bd6a19962bf1a6d0296a4accffb Reviewed-on: https://go-review.googlesource.com/42179 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-05-02 17:43:31 +00:00
Michael Hudson-Doyle	d2a9545178	cmd/internal: remove SymKind values that are only checked for, never set Change-Id: Id152767c033c12966e9e12ae303b99f38776f919 Reviewed-on: https://go-review.googlesource.com/40987 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 20:01:54 +00:00
Michael Munday	db6f3bbc9a	cmd: fix the order that s390x operands are printed in The assembler reordered the operands of some instructions to put the first operand into From3. Unfortunately this meant that when the instructions were printed the operands were in a different order than the assembler would expect as input. For example, 'MVC $8, (R1), (R2)' would be printed as 'MVC (R1), $8, (R2)'. Originally this was done to ensure that From contained the source memory operand. The current compiler no longer requires this and so this CL simply makes all instructions use the standard order for operands: From, Reg, From3 and finally To. Fixes #18295 Change-Id: Ib2b5ec29c647ca7a995eb03dc78f82d99618b092 Reviewed-on: https://go-review.googlesource.com/40299 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-25 15:16:56 +00:00
Matthew Dempsky	1e3570ac86	cmd/internal/objabi: extract shared functionality from obj Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing the assembler backends no longer requires reinstalling cmd/link or cmd/addr2line. There's also now one canonical definition of the object file format in cmd/internal/objabi/doc.go, with a warning to update all three implementations. objabi is still something of a grab bag of unrelated code (e.g., flag and environment variable handling probably belong in a separate "tool" package), but this is still progress. Fixes #15165. Fixes #20026. Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c Reviewed-on: https://go-review.googlesource.com/40972 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-04-19 00:00:09 +00:00
Matthew Dempsky	1747078695	cmd/internal/obj: un-embed FuncInfo field in LSym Automated refactoring using github.com/mdempsky/unbed (to rewrite s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo field to just Func). Passes toolstash-check -all. Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a Reviewed-on: https://go-review.googlesource.com/40670 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-04-18 17:29:50 +00:00
Michael Munday	eed6938cbb	cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions The instructions allow moves between floating point and general purpose registers without any conversion taking place. Change-Id: I82c6f3ad9c841a83783b5be80dcf5cd538ff49e6 Reviewed-on: https://go-review.googlesource.com/38777 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-04-17 16:33:51 +00:00
Josh Bleecher Snyder	1e69245418	cmd/internal/obj/s390x: make assembler almost concurrency-safe CL 39922 made the arm assembler concurrency-safe. This CL does the same, but for s390x. The approach is similar: introduce ctxtz to hold function-local state and thread it through the assembler as necessary. One race remains after this CL, similar to CL 40252. That race is conceptually unrelated to this refactoring, and will be addressed in a separate CL. Passes toolstash-check -all. Updates #15756 Change-Id: Iabf17aa242b70c0b078c2e85dae3d93a5e512372 Reviewed-on: https://go-review.googlesource.com/40371 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2017-04-11 15:02:28 +00:00
Josh Bleecher Snyder	63c1aff60b	cmd/internal/obj: eagerly initialize assemblers CL 38662 changed the x86 assembler to be eagerly initialized, for a concurrent backend. This CL puts in place a proper mechanism for doing so, and switches all architectures to use it. Passes toolstash-check -all. Updates #15756 Change-Id: Id2aa527d3a8259c95797d63a2f0d1123e3ca2a1c Reviewed-on: https://go-review.googlesource.com/39917 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-07 16:57:03 +00:00
Josh Bleecher Snyder	5b59b32c97	cmd/compile: teach assemblers to accept a Prog allocator The existing bulk Prog allocator is not concurrency-safe. To allow for concurrency-safe bulk allocation of Progs, I want to move Prog allocation and caching upstream, to the clients of cmd/internal/obj. This is a preliminary enabling refactoring. After this CL, instead of calling Ctxt.NewProg throughout the assemblers, we thread through a newprog function that returns a new Prog. That function is set up to be Ctxt.NewProg, so there are no real changes in this CL; this CL only establishes the plumbing. Passes toolstash-check -all. Negligible compiler performance impact. Updates #15756 name old time/op new time/op delta Template 213ms ± 3% 214ms ± 4% ~ (p=0.574 n=49+47) Unicode 90.1ms ± 5% 89.9ms ± 4% ~ (p=0.417 n=50+49) GoTypes 585ms ± 4% 584ms ± 3% ~ (p=0.466 n=49+49) SSA 6.50s ± 3% 6.52s ± 2% ~ (p=0.251 n=49+49) Flate 128ms ± 4% 128ms ± 4% ~ (p=0.673 n=49+50) GoParser 152ms ± 3% 152ms ± 3% ~ (p=0.810 n=48+49) Reflect 372ms ± 4% 372ms ± 5% ~ (p=0.778 n=49+50) Tar 113ms ± 5% 111ms ± 4% -0.98% (p=0.016 n=50+49) XML 208ms ± 3% 208ms ± 2% ~ (p=0.483 n=47+49) [Geo mean] 285ms 285ms -0.17% name old user-ns/op new user-ns/op delta Template 253M ± 8% 254M ± 9% ~ (p=0.899 n=50+50) Unicode 106M ± 9% 106M ±11% ~ (p=0.642 n=50+50) GoTypes 736M ± 4% 740M ± 4% ~ (p=0.121 n=50+49) SSA 8.82G ± 3% 8.88G ± 2% +0.65% (p=0.006 n=49+48) Flate 147M ± 4% 147M ± 5% ~ (p=0.844 n=47+48) GoParser 179M ± 4% 178M ± 6% ~ (p=0.785 n=50+50) Reflect 443M ± 6% 441M ± 5% ~ (p=0.850 n=48+47) Tar 126M ± 5% 126M ± 5% ~ (p=0.734 n=50+50) XML 244M ± 5% 244M ± 5% ~ (p=0.594 n=49+50) [Geo mean] 341M 341M +0.11% Change-Id: Ice962f61eb3a524c2db00a166cb582c22caa7d68 Reviewed-on: https://go-review.googlesource.com/39633 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-04-06 02:07:21 +00:00
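The plumbing described in 5b59b32c97 can be illustrated with a minimal sketch. It is not the actual cmd/internal/obj code: `Prog`, `Ctxt`, `ProgAlloc`, and `appendNop` below are simplified, hypothetical stand-ins chosen for illustration. The point is only that a helper receives an allocator function instead of calling the context's method directly, and that the caller passes the method value so behavior is unchanged.

```go
package main

import "fmt"

// Prog and Ctxt are simplified, hypothetical stand-ins for the
// assembler's instruction node and link context.
type Prog struct {
	As   string
	Link *Prog
}

type Ctxt struct {
	allocated int // counts how many Progs this context handed out
}

// NewProg allocates a Prog and records the allocation.
func (c *Ctxt) NewProg() *Prog {
	c.allocated++
	return &Prog{}
}

// ProgAlloc is the allocator type threaded through the helpers
// (the name is an assumption made for this sketch).
type ProgAlloc func() *Prog

// appendNop inserts a NOP after p. It never touches a Ctxt: the
// only way it can obtain a Prog is through newprog.
func appendNop(p *Prog, newprog ProgAlloc) *Prog {
	q := newprog()
	q.As = "NOP"
	q.Link = p.Link
	p.Link = q
	return q
}

func main() {
	ctxt := &Ctxt{}
	head := ctxt.NewProg()
	head.As = "TEXT"

	// Passing the method value keeps the old behavior; a caller
	// could later swap in a different allocator without changing
	// appendNop.
	appendNop(head, ctxt.NewProg)

	fmt.Println(ctxt.allocated, head.Link.As)
}
```

Because the helper depends only on the function value, a later change can supply a per-goroutine or cached allocator from the caller's side, which is the motivation the commit message gives.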
Josh Bleecher Snyder	7e817859b3	cmd/internal/obj: eliminate Curp Remove the global obj.Link.Curp. In asmz.go, replace the only use by passing it as an argument. In asm0.go and asm9.go, it was written but never read. In asm5.go and asm7.go, thread it through as an argument. Passes toolstash-check -all. Updates #15756 Change-Id: I1a0faa89e768820f35d73a8b37ec8088d78d15f7 Reviewed-on: https://go-review.googlesource.com/38715 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-27 18:51:42 +00:00
Michael Munday	a524616860	cmd/{asm,internal/obj/s390x}, math: remove emulated float instructions The s390x port was based on the ppc64 port and, because of the way the port was done, inherited some instructions from it. ppc64 supports 3-operand (4-operand for FMADD etc.) floating point instructions but s390x doesn't (the destination register is always an input) and so these were emulated. There is a bug in the emulation of FMADD whereby if the destination register is also a source for the multiplication it will be clobbered. This doesn't break any assembly code in the std lib but could affect future work. To fix this I have gone through the floating point instructions and removed all unnecessary 3-/4-operand emulation. The compiler doesn't need it and assembly writers don't need it, it's just a source of bugs. I've also deleted the FNMADD family of emulated instructions. They aren't used anywhere. Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04 Reviewed-on: https://go-review.googlesource.com/33350 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-10 16:11:25 +00:00
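The FMADD clobbering bug described in a524616860 can be modeled with a small sketch. This is a hypothetical simulation, not the assembler's real expansion: the register file `regs`, the helper `fma`, and `emulatedFMADD` are invented for illustration, and the assumed emulation strategy (copy the addend into the destination, then issue the native form whose destination is also an input) is a guess at the general shape of such an emulation.

```go
package main

import "fmt"

// regs is a toy floating-point register file.
type regs [4]float64

// fma models a native form in which the destination is also the
// addend: r[dst] = r[dst] + r[x]*r[y].
func fma(r *regs, x, y, dst int) {
	r[dst] += r[x] * r[y]
}

// emulatedFMADD models a naive emulation of the 4-operand form
// r[dst] = r[x]*r[y] + r[add]: copy the addend into dst first,
// then use the native instruction.
func emulatedFMADD(r *regs, x, y, add, dst int) {
	r[dst] = r[add] // overwrites a multiplicand if dst == x or dst == y
	fma(r, x, y, dst)
}

func main() {
	// Distinct destination: 2*3 + 10.
	r := regs{2, 3, 10, 0}
	emulatedFMADD(&r, 0, 1, 2, 3)
	fmt.Println(r[3]) // 16

	// Destination aliases the first multiplicand: the copy replaces
	// 2 with 10 before the multiply, giving 10*3 + 10.
	r = regs{2, 3, 10, 0}
	emulatedFMADD(&r, 0, 1, 2, 0)
	fmt.Println(r[0]) // 40, not the intended 16
}
```

Removing the emulated multi-operand forms, as the commit does, sidesteps this aliasing hazard rather than patching each expansion.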
Michael Munday	fd118b69fa	cmd/asm, cmd/internal/obj/s390x: fix encoding of VREPI{H,F,G} Also adds tests for all missing VRI-a instructions (which may be affected by this change). Fixes #18749. Change-Id: I48249dda626f32555da9ab58659e2e140de6504a Reviewed-on: https://go-review.googlesource.com/35561 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-01 19:37:18 +00:00
Michael Munday	3ef07c412f	cmd, runtime: remove s390x 3 operand immediate logical ops These are emulated by the assembler and we don't need them. Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164 Reviewed-on: https://go-review.googlesource.com/31758 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-10-25 12:36:06 +00:00
Michael Munday	930ab0afd7	cmd/asm, cmd/internal/obj/s390x: fix VFMA and VFMS encoding The m5 and m6 fields were the wrong way round. Fixes #17444. Change-Id: I10297064f2cd09d037eac581c96a011358f70aae Reviewed-on: https://go-review.googlesource.com/31130 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-10-21 02:14:57 +00:00
Ian Lance Taylor	e32ac7978d	cmd/link, cmd/internal/obj: stop exporting various names Just happened to notice that these names (funcAlign and friends) are never referenced outside their package, so no need to export them. Change-Id: I4bbdaa4b0ef330c3c3ef50a2ca39593977a83545 Reviewed-on: https://go-review.googlesource.com/31496 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-10-19 21:16:58 +00:00
Michael Munday	1cfb5c3fd5	cmd/compile: merge loads into operations on s390x Adds the new canMergeLoad function which can be used by rules to decide whether a load can be merged into an operation. The function ensures that the merge will not reorder the load relative to memory operations (for example, stores) in such a way that the block can no longer be scheduled. This new function enables transformations such as: MOVD 0(R1), R2 ADD R2, R3 to: ADD 0(R1), R3 The two-operand form of the following instructions can now read a single memory operand: - ADD - ADDC - ADDW - MULLD - MULLW - SUB - SUBC - SUBE - SUBW - AND - ANDW - OR - ORW - XOR - XORW Improves SHA3 performance by 6-8%. Updates #15054. Change-Id: Ibcb9122126cd1a26f2c01c0dfdbb42fe5e7b5b94 Reviewed-on: https://go-review.googlesource.com/29272 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-10-17 19:45:20 +00:00
Michael Munday	f13372c9f7	cmd/internal/obj/s390x: remove support for stores of global addresses This CL removes support for MOVD instructions that store the address of a global variable. For example: MOVD $main·a(SB), (R1) MOVD $main·b(SB), main·c(SB) These instructions are emulated and the new backend doesn't need them (the stores now always go through an intermediate register). Change-Id: I3a1bcb3f19c5096ad0426afd76d35a4d7975733b Reviewed-on: https://go-review.googlesource.com/30720 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-10-09 20:19:31 +00:00
Michael Munday	45b26a93f3	cmd/{asm,compile}: replace TESTB op with CMPWconst on s390x TESTB was implemented as AND $0xff, Rx, REGTMP. Unfortunately there is no 3-operand AND-with-immediate instruction and so it was emulated by the assembler using two instructions. This CL uses CMPW instead of AND and also optimizes CMPW to use the chi instruction where possible. Overall this CL reduces the size of the .text section of the bin/go binary by ~2%. Change-Id: Ic335c29fc1129378fcbb1265bfb10f5b744a0f3f Reviewed-on: https://go-review.googlesource.com/30690 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-10-07 20:02:59 +00:00
Michael Munday	91706c04b9	cmd/asm, cmd/internal/obj/s390x: delete unused instructions Deletes the following s390x instructions: - ADDME - ADDZE - SUBME - SUBZE They appear to be emulated PPC instructions left over from the porting process and I don't think they will ever be useful. Change-Id: I9b1ba78019dbd1218d0c8f8ea2903878802d1990 Reviewed-on: https://go-review.googlesource.com/30538 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-10-06 11:45:48 +00:00
Michael Munday	dd1dcf9496	cmd/{asm,compile}: add ANDW, ORW and XORW instructions to s390x Adds the following instructions and uses them in the SSA backend: - ANDW - ORW - XORW The instruction encodings for 32-bit operations are typically shorter, particularly when an immediate is used. For example, XORW $-1, R1 only requires one instruction, whereas XOR requires two. Also removes some unused instructions (that were emulated): - ANDN - NAND - ORN - NOR Change-Id: Iff2a16f52004ba498720034e354be9771b10cac4 Reviewed-on: https://go-review.googlesource.com/30291 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-10-06 02:59:04 +00:00

62 Commits