diff --git a/doc/asm.html b/doc/asm.html
index 3a05d46aeb..c954079b66 100644
--- a/doc/asm.html
+++ b/doc/asm.html
@@ -738,6 +738,13 @@ The other codes are ->
(arithmetic right shift),
The ARM64 port is in an experimental state.
+
+R18
is the "platform register", reserved on the Apple platform.
+R27
and R28
are reserved by the compiler and linker.
+R29
is the frame pointer.
+R30
is the link register.
+
+
Instruction modifiers are appended to the instruction following a period.
The only modifiers are P
(postincrement) and W
@@ -752,11 +759,61 @@ Addressing modes:
-
-
(R5, R6)
: Register pair for LDP
/STP
.
+R0->16
+
+R0>>16
+
+R0<<16
+
+R0@>16
:
+These are the same as on the 32-bit ARM.
+
+
+-
+
$(8<<12)
:
+Left shift the immediate value 8
by 12
bits.
+
+
+-
+
8(R0)
:
+Add the value of R0
and 8
.
+
+
+-
+
(R2)(R0)
:
+The location at R0
plus R2
.
+
+
+-
+
R0.UXTB
+
+R0.UXTB<<imm
:
+UXTB
: extract an 8-bit value from the low-order bits of R0
and zero-extend it to the size of R0
.
+R0.UXTB<<imm
: left shift the result of R0.UXTB
by imm
bits.
+The imm
value can be 0, 1, 2, 3, or 4.
+The other extensions include UXTH
(16-bit), UXTW
(32-bit), and UXTX
(64-bit).
+
+
+-
+
R0.SXTB
+
+R0.SXTB<<imm
:
+SXTB
: extract an 8-bit value from the low-order bits of R0
and sign-extend it to the size of R0
.
+R0.SXTB<<imm
: left shift the result of R0.SXTB
by imm
bits.
+The imm
value can be 0, 1, 2, 3, or 4.
+The other extensions include SXTH
(16-bit), SXTW
(32-bit), and SXTX
(64-bit).
+
+
+-
+
(R5, R6)
: Register pair for LDAXP
/LDP
/LDXP
/STLXP
/STP
/STXP
.
+
+Reference: Go ARM64 Assembly Instructions Reference Manual
+
+
64-bit PowerPC, a.k.a. ppc64
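Note on the asm.html hunk above: the shifted-register and extended-register operands can be modelled in plain Go to check what a given operand evaluates to. The following is a minimal sketch, not the assembler's implementation. It assumes 64-bit registers, and every function name in it is hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// shiftedImm mirrors the operand $(8<<12): the immediate 8 shifted left by 12 bits.
const shiftedImm = 8 << 12

// Hypothetical helpers modelling the four shifted-register operand forms.
func arithShiftRight(r uint64, n uint) uint64   { return uint64(int64(r) >> n) }         // R0->n
func logicalShiftRight(r uint64, n uint) uint64 { return r >> n }                        // R0>>n
func shiftLeft(r uint64, n uint) uint64         { return r << n }                        // R0<<n
func rotateRight(r uint64, n uint) uint64       { return bits.RotateLeft64(r, -int(n)) } // R0@>n

// uxtb models R0.UXTB<<imm: take the low 8 bits, zero-extend, then shift left.
func uxtb(r uint64, imm uint) uint64 { return uint64(uint8(r)) << imm }

// sxtb models R0.SXTB<<imm: take the low 8 bits, sign-extend, then shift left.
func sxtb(r uint64, imm uint) uint64 { return uint64(int64(int8(r))) << imm }

func main() {
	r0 := uint64(0x80000000000012F0)
	fmt.Printf("$(8<<12)    = %#x\n", shiftedImm)
	fmt.Printf("R0->16      = %#x\n", arithShiftRight(r0, 16))
	fmt.Printf("R0>>16      = %#x\n", logicalShiftRight(r0, 16))
	fmt.Printf("R0<<16      = %#x\n", shiftLeft(r0, 16))
	fmt.Printf("R0@>16      = %#x\n", rotateRight(r0, 16))
	fmt.Printf("R0.UXTB<<2  = %#x\n", uxtb(r0, 2))
	fmt.Printf("R0.SXTB<<2  = %#x\n", sxtb(r0, 2))
}
```

The wider extensions (UXTH, UXTW, SXTH, SXTW) follow the same pattern with 16-bit and 32-bit truncations in place of the 8-bit one.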
diff --git a/src/cmd/internal/obj/arm64/doc.go b/src/cmd/internal/obj/arm64/doc.go
index d06025d21c..d98b1b6f9e 100644
--- a/src/cmd/internal/obj/arm64/doc.go
+++ b/src/cmd/internal/obj/arm64/doc.go
@@ -1,334 +1,201 @@
-// Copyright 2017 The Go Authors. All rights reserved.
+// Copyright 2018 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
-package arm64
-
/*
+Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
+syntax, but we can still follow the general rules to map between them.
-Go Assembly for ARM64 Reference Manual
-
-1. Alphabetical list of basic instructions
- // TODO
-
- LDARB: Load-Acquire Register Byte
- LDARB (<Rn>), <Rd>
- Loads a byte from memory, zero-extends it and writes it to Rd.
-
- LDARH: Load-Acquire Register Halfword
- LDARH (<Rn>), <Rd>
- Loads a halfword from memory, zero-extends it and writes it to Rd.
-
- LDAXP: Load-Acquire Exclusive Pair of Registers
- LDAXP (<Rn>), (<Rt1>, <Rt2>)
- Loads two 64-bit doublewords from memory, and writes them to Rt1 and Rt2.
-
- LDAXPW: Load-Acquire Exclusive Pair of Registers
- LDAXPW (<Rn>), (<Rt1>, <Rt2>)
- Loads two 32-bit words from memory, and writes them to Rt1 and Rt2.
-
- LDXP: 64-bit Load Exclusive Pair of Registers
- LDXP (<Rn>), (<Rt1>, <Rt2>)
- Loads two 64-bit doublewords from memory, and writes them to Rt1 and Rt2.
-
- LDXPW: 32-bit Load Exclusive Pair of Registers
- LDXPW (<Rn>), (<Rt1>, <Rt2>)
- Loads two 32-bit words from memory, and writes them to Rt1 and Rt2.
-
- MOVD|MOVW|MOVH|MOVHU|MOVB|MOVBU: Load Register (register offset)
- MOVD (Rn)(Rm.UXTW<<3), Rt
- MOVD (Rn)(Rm.SXTX), Rt
- MOVD (Rn)(Rm<<3), Rt
- MOVD (Rn)(Rm), Rt
- MOVB|MOVBU (Rn)(Rm.UXTW), Rt
-
- MOVD|MOVW|MOVH|MOVB: Store Register (register offset)
- MOVD Rt, (Rn)(Rm.UXTW<<3)
- MOVD Rt, (Rn)(Rm.SXTX)
- MOVD Rt, (Rn)(Rm)
-
- PRFM: Prefetch Memory (immediate)
- PRFM imm(Rn), <prfop>
- prfop is the prefetch operation and can have the following values:
- PLDL1KEEP, PLDL1STRM, PLDL2KEEP, PLDL2STRM, PLDL3KEEP, PLDL3STRM,
- PLIL1KEEP, PLIL1STRM, PLIL2KEEP, PLIL2STRM, PLIL3KEEP, PLIL3STRM,
- PSTL1KEEP, PSTL1STRM, PSTL2KEEP, PSTL2STRM, PSTL3KEEP, PSTL3STRM.
- PRFM imm(Rn), $imm
- $imm prefetch operation is encoded as an immediate.
-
- STLRB: Store-Release Register Byte
- STLRB <Rd>, (<Rn>)
- Stores a byte from Rd to a memory location from Rn.
-
- STLRH: Store-Release Register Halfword
- STLRH <Rd>, (<Rn>)
- Stores a halfword from Rd to a memory location from Rn.
-
- STLXP: 64-bit Store-Release Exclusive Pair of registers
- STLXP (<Rt1>, <Rt2>), (<Rn>), <Rs>
- Stores two 64-bit doublewords from Rt1 and Rt2 to a memory location from Rn,
- and returns in Rs a status value of 0 if the store was successful, or of 1 if
- no store was performed.
-
- STLXPW: 32-bit Store-Release Exclusive Pair of registers
- STLXPW (<Rt1>, <Rt2>), (<Rn>), <Rs>
- Stores two 32-bit words from Rt1 and Rt2 to a memory location from Rn, and
- returns in Rs a status value of 0 if the store was successful, or of 1 if no
- store was performed.
-
- STXP: 64-bit Store Exclusive Pair of registers
- STXP (<Rt1>, <Rt2>), (<Rn>), <Rs>
- Stores two 64-bit doublewords from Rt1 and Rt2 to a memory location from Rn,
- and returns in Rs a status value of 0 if the store was successful, or of 1 if
- no store was performed.
-
- STXPW: 32-bit Store Exclusive Pair of registers
- STXPW (<Rt1>, <Rt2>), (<Rn>), <Rs>
- Stores two 32-bit words from Rt1 and Rt2 to a memory location from Rn, and returns in
- Rs a status value of 0 if the store was successful, or of 1 if no store was performed.
-
-2. Alphabetical list of floating-point instructions
- // TODO
-
- FMADDD: 64-bit floating-point fused Multiply-Add
- FMADDD <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>,
- adds the product to <Fa>, and writes the result to <Fd>.
-
- FMADDS: 32-bit floating-point fused Multiply-Add
- FMADDS <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>,
- adds the product to <Fa>, and writes the result to <Fd>.
-
- FMSUBD: 64-bit floating-point fused Multiply-Subtract
- FMSUBD <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>, negates the product,
- adds the product to <Fa>, and writes the result to <Fd>.
-
- FMSUBS: 32-bit floating-point fused Multiply-Subtract
- FMSUBS <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>, negates the product,
- adds the product to <Fa>, and writes the result to <Fd>.
-
- FNMADDD: 64-bit floating-point negated fused Multiply-Add
- FNMADDD <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>, negates the product,
- subtracts the value of <Fa>, and writes the result to <Fd>.
-
- FNMADDS: 32-bit floating-point negated fused Multiply-Add
- FNMADDS <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>, negates the product,
- subtracts the value of <Fa>, and writes the result to <Fd>.
-
- FNMSUBD: 64-bit floating-point negated fused Multiply-Subtract
- FNMSUBD <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>,
- subtracts the value of <Fa>, and writes the result to <Fd>.
-
- FNMSUBS: 32-bit floating-point negated fused Multiply-Subtract
- FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>
- Multiplies the values of <Fm> and <Fn>,
- subtracts the value of <Fa>, and writes the result to <Fd>.
-
-3. Alphabetical list of SIMD instructions
- VADD: Add (scalar)
- VADD , ,
- Add corresponding low 64-bit elements in and ,
- place the result into low 64-bit element of .
-
- VADD: Add (vector).
- VADD .T, ., .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D2
-
- VADDP: Add Pairwise (vector)
- VADDP ., ., .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D2
-
- VADDV: Add across Vector.
- VADDV ., Vd
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S4
-
- VAND: Bitwise AND (vector)
- VAND ., ., .
- Is an arrangement specifier and can have the following values:
- B8, B16
-
- VCMEQ: Compare bitwise Equal (vector)
- VCMEQ ., ., .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D2
-
- VDUP: Duplicate vector element to vector or scalar.
- VDUP .[index], .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D2
- Is an element size specifier and can have the following values:
- B, H, S, D
-
- VEOR: Bitwise exclusive OR (vector, register)
- VEOR ., ., .
- Is an arrangement specifier and can have the following values:
- B8, B16
-
- VFMLA: Floating-point fused Multiply-Add to accumulator (vector)
- VFMLA ., ., .
- Is an arrangement specifier and can have the following values:
- S2, S4, D2
-
- VFMLS: Floating-point fused Multiply-Subtract from accumulator (vector)
- VFMLS ., ., .
- Is an arrangement specifier and can have the following values:
- S2, S4, D2
-
- VEXT: Extracts vector elements from src SIMD registers to dst SIMD register
- VEXT $index, ., ., .
- is an arrangement specifier and can be B8, B16
- $index is the lowest numbered byte element to be extracted.
-
- VLD1: Load multiple single-element structures
- VLD1 (Rn), [., . ...] // no offset
- VLD1.P imm(Rn), [., . ...] // immediate offset variant
- VLD1.P (Rn)(Rm), [., . ...] // register offset variant
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D1, D2
-
- VLD1: Load one single-element structure
- VLD1 (Rn), .[index] // no offset
- VLD1.P imm(Rn), .[index] // immediate offset variant
- VLD1.P (Rn)(Rm), .[index] // register offset variant
- is an arrangement specifier and can have the following values:
- B, H, S, D
-
- VMOV: move
- VMOV .[index], Rd // Move vector element to general-purpose register.
- Is a source width specifier and can have the following values:
- B, H, S (Wd)
- D (Xd)
-
- VMOV Rn, . // Duplicate general-purpose register to vector.
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4 (Wn)
- D2 (Xn)
-
- VMOV ., . // Move vector.
- Is an arrangement specifier and can have the following values:
- B8, B16
-
- VMOV Rn, .[index] // Move general-purpose register to a vector element.
- Is a source width specifier and can have the following values:
- B, H, S (Wd)
- D (Xd)
-
- VMOV .[index], Vn // Move vector element to scalar.
- Is an element size specifier and can have the following values:
- B, H, S, D
-
- VMOV .[index], .[index] // Move vector element to another vector element.
- Is an element size specifier and can have the following values:
- B, H, S, D
-
- VMOVI: Move Immediate (vector).
- VMOVI $imm8, .
- is an arrangement specifier and can have the following values:
- B8, B16
-
- VMOVS: Load SIMD&FP Register (immediate offset). ARMv8: LDR (immediate, SIMD&FP)
- Store SIMD&FP register (immediate offset). ARMv8: STR (immediate, SIMD&FP)
- VMOVS (Rn), Vn
- VMOVS.W imm(Rn), Vn
- VMOVS.P imm(Rn), Vn
- VMOVS Vn, (Rn)
- VMOVS.W Vn, imm(Rn)
- VMOVS.P Vn, imm(Rn)
-
- VORR: Bitwise inclusive OR (vector, register)
- VORR ., ., .
- Is an arrangement specifier and can have the following values:
- B8, B16
-
- VRBIT: Reverse bit order (vector)
- VRBIT ., .
- is an arrangement specifier and can be B8, B16
-
- VREV32: Reverse elements in 32-bit words (vector).
- VREV32 ., .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8
-
- VREV64: Reverse elements in 64-bit words (vector).
- VREV64 ., .
- Is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4
-
- VSHL: Shift Left(immediate)
- VSHL $shift, ., .
- is an arrangement specifier and can have the following values:
- B8, B16, H4, H8, S2, S4, D1, D2
- $shift Is the left shift amount
-
- VST1: Store multiple single-element structures
- VST1 [., . ...], (Rn) // no offset
- VST1.P [., . ...], imm(Rn) // immediate offset variant
- VST1.P [.