go/src/cmd/internal/obj/link.go
Heschi Kreinick 4c54a047c6 [dev.debug] cmd/compile: better DWARF with optimizations on
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.

After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.

There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:

  type myint int
  var a int
  b := myint(int)
and
  b := (*uint64)(unsafe.Pointer(a))

don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.

Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.

A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.

TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.

go build -toolexec 'toolstash -cmp' -a std succeeds.

With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after:  06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean  /tmp/220352263 /tmp/621364410
completed   15 of   15, estimated time remaining 0s (eta 3:52PM)
name        old time/op       new time/op       delta
Template          199ms ± 3%        198ms ± 2%     ~     (p=0.400 n=15+14)
Unicode          96.6ms ± 5%       96.4ms ± 5%     ~     (p=0.838 n=15+15)
GoTypes           653ms ± 2%        647ms ± 2%     ~     (p=0.102 n=15+14)
Flate             133ms ± 6%        129ms ± 3%   -2.62%  (p=0.041 n=15+15)
GoParser          164ms ± 5%        159ms ± 3%   -3.05%  (p=0.000 n=15+15)
Reflect           428ms ± 4%        422ms ± 3%     ~     (p=0.156 n=15+13)
Tar               123ms ±10%        124ms ± 8%     ~     (p=0.461 n=15+15)
XML               228ms ± 3%        224ms ± 3%   -1.57%  (p=0.045 n=15+15)
[Geo mean]        206ms             377ms       +82.86%

name        old user-time/op  new user-time/op  delta
Template          292ms ±10%        301ms ±12%     ~     (p=0.189 n=15+15)
Unicode           166ms ±37%        158ms ±14%     ~     (p=0.418 n=15+14)
GoTypes           962ms ± 6%        963ms ± 7%     ~     (p=0.976 n=15+15)
Flate             207ms ±19%        200ms ±14%     ~     (p=0.345 n=14+15)
GoParser          246ms ±22%        240ms ±15%     ~     (p=0.587 n=15+15)
Reflect           611ms ±13%        587ms ±14%     ~     (p=0.085 n=15+13)
Tar               211ms ±12%        217ms ±14%     ~     (p=0.355 n=14+15)
XML               335ms ±15%        320ms ±18%     ~     (p=0.169 n=15+15)
[Geo mean]        317ms             583ms       +83.72%

name        old alloc/op      new alloc/op      delta
Template         40.2MB ± 0%       40.2MB ± 0%   -0.15%  (p=0.000 n=14+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%     ~     (p=0.624 n=15+15)
GoTypes           114MB ± 0%        114MB ± 0%   -0.15%  (p=0.000 n=15+14)
Flate            25.7MB ± 0%       25.6MB ± 0%   -0.18%  (p=0.000 n=13+15)
GoParser         32.2MB ± 0%       32.2MB ± 0%   -0.14%  (p=0.003 n=15+15)
Reflect          77.8MB ± 0%       77.9MB ± 0%     ~     (p=0.061 n=15+15)
Tar              27.1MB ± 0%       27.0MB ± 0%   -0.11%  (p=0.029 n=15+15)
XML              42.7MB ± 0%       42.5MB ± 0%   -0.29%  (p=0.000 n=15+15)
[Geo mean]       42.1MB            75.0MB       +78.05%

name        old allocs/op     new allocs/op     delta
Template           402k ± 1%         398k ± 0%   -0.91%  (p=0.000 n=15+15)
Unicode            344k ± 1%         344k ± 0%     ~     (p=0.715 n=15+14)
GoTypes           1.18M ± 0%        1.17M ± 0%   -0.91%  (p=0.000 n=15+14)
Flate              243k ± 0%         240k ± 1%   -1.05%  (p=0.000 n=13+15)
GoParser           327k ± 1%         324k ± 1%   -0.96%  (p=0.000 n=15+15)
Reflect            984k ± 1%         982k ± 0%     ~     (p=0.050 n=15+15)
Tar                261k ± 1%         259k ± 1%   -0.77%  (p=0.000 n=15+15)
XML                411k ± 0%         404k ± 1%   -1.55%  (p=0.000 n=15+15)
[Geo mean]         439k              755k       +72.01%

name        old text-bytes    new text-bytes    delta
HelloSize         694kB ± 0%        694kB ± 0%   -0.00%  (p=0.000 n=15+15)

name        old data-bytes    new data-bytes    delta
HelloSize        5.55kB ± 0%       5.55kB ± 0%     ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         133kB ± 0%        133kB ± 0%     ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.04MB ± 0%       1.04MB ± 0%     ~     (all equal)

Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-27 20:19:44 +00:00

545 lines
16 KiB
Go

// Derived from Inferno utils/6l/l.h and related files.
// https://bitbucket.org/inferno-os/inferno-os/src/default/utils/6l/l.h
//
// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved.
// Portions Copyright © 1995-1997 C H Forsyth (forsyth@terzarima.net)
// Portions Copyright © 1997-1999 Vita Nuova Limited
// Portions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com)
// Portions Copyright © 2004,2006 Bruce Ellis
// Portions Copyright © 2005-2007 C H Forsyth (forsyth@terzarima.net)
// Revisions Copyright © 2000-2007 Lucent Technologies Inc. and others
// Portions Copyright © 2009 The Go Authors. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package obj
import (
"bufio"
"cmd/internal/dwarf"
"cmd/internal/objabi"
"cmd/internal/src"
"cmd/internal/sys"
"fmt"
"sync"
)
// An Addr is an argument to an instruction.
// The general forms and their encodings are:
//
// sym±offset(symkind)(reg)(index*scale)
// Memory reference at address &sym(symkind) + offset + reg + index*scale.
// Any of sym(symkind), ±offset, (reg), (index*scale), and *scale can be omitted.
// If (reg) and *scale are both omitted, the resulting expression (index) is parsed as (reg).
// To force a parsing as index*scale, write (index*1).
// Encoding:
// type = TYPE_MEM
// name = symkind (NAME_AUTO, ...) or 0 (NAME_NONE)
// sym = sym
// offset = ±offset
// reg = reg (REG_*)
// index = index (REG_*)
// scale = scale (1, 2, 4, 8)
//
// $<mem>
// Effective address of memory reference <mem>, defined above.
// Encoding: same as memory reference, but type = TYPE_ADDR.
//
// $<±integer value>
// This is a special case of $<mem>, in which only ±offset is present.
// It has a separate type for easy recognition.
// Encoding:
// type = TYPE_CONST
// offset = ±integer value
//
// *<mem>
// Indirect reference through memory reference <mem>, defined above.
// Only used on x86 for CALL/JMP *sym(SB), which calls/jumps to a function
// pointer stored in the data word sym(SB), not a function named sym(SB).
// Encoding: same as above, but type = TYPE_INDIR.
//
// $*$<mem>
// No longer used.
// On machines with actual SB registers, $*$<mem> forced the
// instruction encoding to use a full 32-bit constant, never a
// reference relative to SB.
//
// $<floating point literal>
// Floating point constant value.
// Encoding:
// type = TYPE_FCONST
// val = floating point value
//
// $<string literal, up to 8 chars>
// String literal value (raw bytes used for DATA instruction).
// Encoding:
// type = TYPE_SCONST
// val = string
//
// <register name>
// Any register: integer, floating point, control, segment, and so on.
// If looking for specific register kind, must check type and reg value range.
// Encoding:
// type = TYPE_REG
// reg = reg (REG_*)
//
// x(PC)
// Encoding:
// type = TYPE_BRANCH
// val = Prog* reference OR ELSE offset = target pc (branch takes priority)
//
// $±x-±y
// Final argument to TEXT, specifying local frame size x and argument size y.
// In this form, x and y are integer literals only, not arbitrary expressions.
// This avoids parsing ambiguities due to the use of - as a separator.
// The ± are optional.
// If the final argument to TEXT omits the -±y, the encoding should still
// use TYPE_TEXTSIZE (not TYPE_CONST), with u.argsize = ArgsSizeUnknown.
// Encoding:
// type = TYPE_TEXTSIZE
// offset = x
// val = int32(y)
//
// reg<<shift, reg>>shift, reg->shift, reg@>shift
// Shifted register value, for ARM and ARM64.
// In this form, reg must be a register and shift can be a register or an integer constant.
// Encoding:
// type = TYPE_SHIFT
// On ARM:
// offset = (reg&15) | shifttype<<5 | count
// shifttype = 0, 1, 2, 3 for <<, >>, ->, @>
// count = (reg&15)<<8 | 1<<4 for a register shift count, (n&31)<<7 for an integer constant.
// On ARM64:
// offset = (reg&31)<<16 | shifttype<<22 | (count&63)<<10
// shifttype = 0, 1, 2 for <<, >>, ->
//
// (reg, reg)
// A destination register pair. When used as the last argument of an instruction,
// this form makes clear that both registers are destinations.
// Encoding:
// type = TYPE_REGREG
// reg = first register
// offset = second register
//
// [reg, reg, reg-reg]
// Register list for ARM.
// Encoding:
// type = TYPE_REGLIST
// offset = bit mask of registers in list; R0 is low bit.
//
// reg, reg
// Register pair for ARM.
// TYPE_REGREG2
//
// (reg+reg)
// Register pair for PPC64.
// Encoding:
// type = TYPE_MEM
// reg = first register
// index = second register
// scale = 1
//
type Addr struct {
Reg int16
Index int16
Scale int16 // Sometimes holds a register.
Type AddrType
Name AddrName
Class int8
Offset int64
Sym *LSym
// argument value:
// for TYPE_SCONST, a string
// for TYPE_FCONST, a float64
// for TYPE_BRANCH, a *Prog (optional)
// for TYPE_TEXTSIZE, an int32 (optional)
Val interface{}
}
type AddrName int8
const (
NAME_NONE AddrName = iota
NAME_EXTERN
NAME_STATIC
NAME_AUTO
NAME_PARAM
// A reference to name@GOT(SB) is a reference to the entry in the global offset
// table for 'name'.
NAME_GOTREF
)
type AddrType uint8
const (
TYPE_NONE AddrType = iota
TYPE_BRANCH
TYPE_TEXTSIZE
TYPE_MEM
TYPE_CONST
TYPE_FCONST
TYPE_SCONST
TYPE_REG
TYPE_ADDR
TYPE_SHIFT
TYPE_REGREG
TYPE_REGREG2
TYPE_INDIR
TYPE_REGLIST
)
// Prog describes a single machine instruction.
//
// The general instruction form is:
//
// As.Scond From, Reg, From3, To, RegTo2
//
// where As is an opcode and the others are arguments:
// From, Reg, From3 are sources, and To, RegTo2 are destinations.
// Usually, not all arguments are present.
// For example, MOVL R1, R2 encodes using only As=MOVL, From=R1, To=R2.
// The Scond field holds additional condition bits for systems (like arm)
// that have generalized conditional execution.
//
// Jump instructions use the Pcond field to point to the target instruction,
// which must be in the same linked list as the jump instruction.
//
// The Progs for a given function are arranged in a list linked through the Link field.
//
// Each Prog is charged to a specific source line in the debug information,
// specified by Pos.Line().
// Every Prog has a Ctxt field that defines its context.
// For performance reasons, Progs usually are usually bulk allocated, cached, and reused;
// those bulk allocators should always be used, rather than new(Prog).
//
// The other fields not yet mentioned are for use by the back ends and should
// be left zeroed by creators of Prog lists.
type Prog struct {
Ctxt *Link // linker context
Link *Prog // next Prog in linked list
From Addr // first source operand
From3 *Addr // third source operand (second is Reg below)
To Addr // destination operand (second is RegTo2 below)
Pcond *Prog // target of conditional jump
Forwd *Prog // for x86 back end
Rel *Prog // for x86, arm back ends
Pc int64 // for back ends or assembler: virtual or actual program counter, depending on phase
Pos src.XPos // source position of this instruction
Spadj int32 // effect of instruction on stack pointer (increment or decrement amount)
As As // assembler opcode
Reg int16 // 2nd source operand
RegTo2 int16 // 2nd destination operand
Mark uint16 // bitmask of arch-specific items
Optab uint16 // arch-specific opcode index
Scond uint8 // condition bits for conditional instruction (e.g., on ARM)
Back uint8 // for x86 back end: backwards branch state
Ft uint8 // for x86 back end: type index of Prog.From
Tt uint8 // for x86 back end: type index of Prog.To
Isize uint8 // for x86 back end: size of the instruction in bytes
}
// From3Type returns From3.Type, or TYPE_NONE when From3 is nil.
func (p *Prog) From3Type() AddrType {
if p.From3 == nil {
return TYPE_NONE
}
return p.From3.Type
}
// An As denotes an assembler opcode.
// There are some portable opcodes, declared here in package obj,
// that are common to all architectures.
// However, the majority of opcodes are arch-specific
// and are declared in their respective architecture's subpackage.
type As int16
// These are the portable opcodes.
const (
AXXX As = iota
ACALL
ADUFFCOPY
ADUFFZERO
AEND
AFUNCDATA
AJMP
ANOP
APCDATA
ARET
ATEXT
AUNDEF
A_ARCHSPECIFIC
)
// Each architecture is allotted a distinct subspace of opcode values
// for declaring its arch-specific opcodes.
// Within this subspace, the first arch-specific opcode should be
// at offset A_ARCHSPECIFIC.
//
// Subspaces are aligned to a power of two so opcodes can be masked
// with AMask and used as compact array indices.
const (
ABase386 = (1 + iota) << 10
ABaseARM
ABaseAMD64
ABasePPC64
ABaseARM64
ABaseMIPS
ABaseS390X
AllowedOpCodes = 1 << 10 // The number of opcodes available for any given architecture.
AMask = AllowedOpCodes - 1 // AND with this to use the opcode as an array index.
)
// An LSym is the sort of symbol that is written to an object file.
type LSym struct {
Name string
Type objabi.SymKind
Attribute
RefIdx int // Index of this symbol in the symbol reference list.
Size int64
Gotype *LSym
P []byte
R []Reloc
Func *FuncInfo
}
// A FuncInfo contains extra fields for STEXT symbols.
type FuncInfo struct {
Args int32
Locals int32
Text *Prog
Autom []*Auto
Pcln Pcln
dwarfInfoSym *LSym
dwarfLocSym *LSym
dwarfRangesSym *LSym
GCArgs LSym
GCLocals LSym
}
// Attribute is a set of symbol attributes.
type Attribute int16
const (
AttrDuplicateOK Attribute = 1 << iota
AttrCFunc
AttrNoSplit
AttrLeaf
AttrWrapper
AttrNeedCtxt
AttrNoFrame
AttrSeenGlobl
AttrOnList
AttrStatic
// MakeTypelink means that the type should have an entry in the typelink table.
AttrMakeTypelink
// ReflectMethod means the function may call reflect.Type.Method or
// reflect.Type.MethodByName. Matching is imprecise (as reflect.Type
// can be used through a custom interface), so ReflectMethod may be
// set in some cases when the reflect package is not called.
//
// Used by the linker to determine what methods can be pruned.
AttrReflectMethod
// Local means make the symbol local even when compiling Go code to reference Go
// symbols in other shared libraries, as in this mode symbols are global by
// default. "local" here means in the sense of the dynamic linker, i.e. not
// visible outside of the module (shared library or executable) that contains its
// definition. (When not compiling to support Go shared libraries, all symbols are
// local in this sense unless there is a cgo_export_* directive).
AttrLocal
)
func (a Attribute) DuplicateOK() bool { return a&AttrDuplicateOK != 0 }
func (a Attribute) MakeTypelink() bool { return a&AttrMakeTypelink != 0 }
func (a Attribute) CFunc() bool { return a&AttrCFunc != 0 }
func (a Attribute) NoSplit() bool { return a&AttrNoSplit != 0 }
func (a Attribute) Leaf() bool { return a&AttrLeaf != 0 }
func (a Attribute) SeenGlobl() bool { return a&AttrSeenGlobl != 0 }
func (a Attribute) OnList() bool { return a&AttrOnList != 0 }
func (a Attribute) ReflectMethod() bool { return a&AttrReflectMethod != 0 }
func (a Attribute) Local() bool { return a&AttrLocal != 0 }
func (a Attribute) Wrapper() bool { return a&AttrWrapper != 0 }
func (a Attribute) NeedCtxt() bool { return a&AttrNeedCtxt != 0 }
func (a Attribute) NoFrame() bool { return a&AttrNoFrame != 0 }
func (a Attribute) Static() bool { return a&AttrStatic != 0 }
func (a *Attribute) Set(flag Attribute, value bool) {
if value {
*a |= flag
} else {
*a &^= flag
}
}
var textAttrStrings = [...]struct {
bit Attribute
s string
}{
{bit: AttrDuplicateOK, s: "DUPOK"},
{bit: AttrMakeTypelink, s: ""},
{bit: AttrCFunc, s: "CFUNC"},
{bit: AttrNoSplit, s: "NOSPLIT"},
{bit: AttrLeaf, s: "LEAF"},
{bit: AttrSeenGlobl, s: ""},
{bit: AttrOnList, s: ""},
{bit: AttrReflectMethod, s: "REFLECTMETHOD"},
{bit: AttrLocal, s: "LOCAL"},
{bit: AttrWrapper, s: "WRAPPER"},
{bit: AttrNeedCtxt, s: "NEEDCTXT"},
{bit: AttrNoFrame, s: "NOFRAME"},
{bit: AttrStatic, s: "STATIC"},
}
// TextAttrString formats a for printing in as part of a TEXT prog.
func (a Attribute) TextAttrString() string {
var s string
for _, x := range textAttrStrings {
if a&x.bit != 0 {
if x.s != "" {
s += x.s + "|"
}
a &^= x.bit
}
}
if a != 0 {
s += fmt.Sprintf("UnknownAttribute(%d)|", a)
}
// Chop off trailing |, if present.
if len(s) > 0 {
s = s[:len(s)-1]
}
return s
}
// The compiler needs LSym to satisfy fmt.Stringer, because it stores
// an LSym in ssa.ExternSymbol.
func (s *LSym) String() string {
return s.Name
}
type Pcln struct {
Pcsp Pcdata
Pcfile Pcdata
Pcline Pcdata
Pcinline Pcdata
Pcdata []Pcdata
Funcdata []*LSym
Funcdataoff []int64
File []string
Lastfile string
Lastindex int
InlTree InlTree // per-function inlining tree extracted from the global tree
}
type Reloc struct {
Off int32
Siz uint8
Type objabi.RelocType
Add int64
Sym *LSym
}
type Auto struct {
Asym *LSym
Aoffset int32
Name AddrName
Gotype *LSym
}
type Pcdata struct {
P []byte
}
// Link holds the context for writing object code from a compiler
// to be linker input or for reading that input into the linker.
type Link struct {
Headtype objabi.HeadType
Arch *LinkArch
Debugasm bool
Debugvlog bool
Debugpcln string
Flag_shared bool
Flag_dynlink bool
Flag_optimize bool
Flag_locationlists bool
Bso *bufio.Writer
Pathname string
hashmu sync.Mutex // protects hash
hash map[string]*LSym // name -> sym mapping
statichash map[string]*LSym // name -> sym mapping for static syms
PosTable src.PosTable
InlTree InlTree // global inlining tree used by gc/inl.go
Imports []string
DiagFunc func(string, ...interface{})
DebugInfo func(fn *LSym, curfn interface{}) []dwarf.Scope // if non-nil, curfn is a *gc.Node
Errors int
Framepointer_enabled bool
// state for writing objects
Text []*LSym
Data []*LSym
}
func (ctxt *Link) Diag(format string, args ...interface{}) {
ctxt.Errors++
ctxt.DiagFunc(format, args...)
}
func (ctxt *Link) Logf(format string, args ...interface{}) {
fmt.Fprintf(ctxt.Bso, format, args...)
ctxt.Bso.Flush()
}
// The smallest possible offset from the hardware stack pointer to a local
// variable on the stack. Architectures that use a link register save its value
// on the stack in the function prologue and so always have a pointer between
// the hardware stack pointer and the local variable area.
func (ctxt *Link) FixedFrameSize() int64 {
switch ctxt.Arch.Family {
case sys.AMD64, sys.I386:
return 0
case sys.PPC64:
// PIC code on ppc64le requires 32 bytes of stack, and it's easier to
// just use that much stack always on ppc64x.
return int64(4 * ctxt.Arch.PtrSize)
default:
return int64(ctxt.Arch.PtrSize)
}
}
// LinkArch is the definition of a single architecture.
type LinkArch struct {
*sys.Arch
Init func(*Link)
Preprocess func(*Link, *LSym, ProgAlloc)
Assemble func(*Link, *LSym, ProgAlloc)
Progedit func(*Link, *Prog, ProgAlloc)
UnaryDst map[As]bool // Instruction takes one operand, a destination.
DWARFRegisters map[int16]int16
}