Illustrated Google V8 #14: Bytecode (Part 2): How Does the Interpreter Execute Bytecode?

Notes

Study notes on "Illustrated Google V8" (图解 Google V8)

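Position in the compilation pipeline

Where the interpretation of bytecode sits in the compilation pipeline:

[Figure: position of bytecode interpretation in the V8 compilation pipeline]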

V8 source directory:

[Figure: V8 source directory layout]

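How is bytecode generated?

When V8 executes a piece of JavaScript code, it first parses the code (Parser) and produces an AST and scope information. The AST and scope information are then fed into an interpreter called Ignition, which converts them into bytecode and then interprets and executes that bytecode.

Example: add the following code to a file named kaimo.js: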

function add(x, y) {
  var z = x+y
  return z
}
console.log(add(1, 2))

Generating the AST

V8 first parses the function's source code into an AST. Run the following command to view the AST that V8 generates internally:

v8-debug --print-ast kaimo.js

The output is:

[generating bytecode for function: ]
--- AST ---
FUNC at 0
. KIND 0
. LITERAL ID 0
. SUSPEND COUNT 0
. NAME ""
. INFERRED NAME ""
. DECLS
. . FUNCTION "add" = function add
. EXPRESSION STATEMENT at 56
. . ASSIGN at -1
. . . VAR PROXY local[0] (0000025CD49A64A0) (mode = TEMPORARY, assigned = true) ".result"
. . . CALL
. . . . PROPERTY at 64
. . . . . VAR PROXY unallocated (0000025CD49A6560) (mode = DYNAMIC_GLOBAL, assigned = false) "console"
. . . . . NAME log
. . . . CALL
. . . . . VAR PROXY unallocated (0000025CD49A6330) (mode = VAR, assigned = true) "add"
. . . . . LITERAL 1
. . . . . LITERAL 2
. RETURN at -1
. . VAR PROXY local[0] (0000025CD49A64A0) (mode = TEMPORARY, assigned = true) ".result"

[generating bytecode for function: add]
--- AST ---
FUNC at 12
. KIND 0
. LITERAL ID 1
. SUSPEND COUNT 0
. NAME "add"
. PARAMS
. . VAR (0000025CD49A63B0) (mode = VAR, assigned = false) "x"
. . VAR (0000025CD49A6430) (mode = VAR, assigned = false) "y"
. DECLS
. . VARIABLE (0000025CD49A63B0) (mode = VAR, assigned = false) "x"
. . VARIABLE (0000025CD49A6430) (mode = VAR, assigned = false) "y"
. . VARIABLE (0000025CD49A64B0) (mode = VAR, assigned = false) "z"
. BLOCK NOCOMPLETIONS at -1
. . EXPRESSION STATEMENT at 34
. . . INIT at 34
. . . . VAR PROXY local[0] (0000025CD49A64B0) (mode = VAR, assigned = false) "z"
. . . . ADD at 35
. . . . . VAR PROXY parameter[0] (0000025CD49A63B0) (mode = VAR, assigned = false) "x"
. . . . . VAR PROXY parameter[1] (0000025CD49A6430) (mode = VAR, assigned = false) "y"
. RETURN at 43
. . VAR PROXY local[0] (0000025CD49A64B0) (mode = VAR, assigned = false) "z"

3

As a diagram:

[Figure: the AST of the add function drawn as a tree]

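The function literal is parsed into an AST, in which the function is split into four main parts:

Parameter declarations (PARAMS): this node contains all the parameters, here x and y. Inside the function body you can also access them through arguments.
Variable declarations (DECLS): parameters can be accessed through arguments, but they can also be used directly as variables, which is why x and y also appear under the DECLS node. Besides x and y, the variable z is under DECLS as well. In the AST output above, the x in the parameter declarations and the x in the variable declarations have the same address, 0000025CD49A63B0, and both y entries likewise share the address 0000025CD49A6430, which shows that they refer to the same data.
The expression node for x+y: under the ADD node there are VAR PROXY x and VAR PROXY y, which refer to the actual x and y.
The RETURN node: it refers to the value of z, which here is local[0].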
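To make the four parts concrete, here is a minimal sketch that models the AST of add as plain JavaScript objects and walks it with a tiny evaluator. The node shapes (kind, params, decls, body) and the functions evaluate and callFunction are hypothetical and invented for illustration. V8's real AST consists of C++ objects, and V8 compiles it to bytecode rather than evaluating the tree like this.

```javascript
// Hypothetical, simplified model of the AST printed above (not V8's real data structures).
const x = { kind: 'VAR', name: 'x' };
const y = { kind: 'VAR', name: 'y' };
const z = { kind: 'VAR', name: 'z' };

const addAst = {
  kind: 'FUNC',
  name: 'add',
  params: [x, y],    // PARAMS
  decls: [x, y, z],  // DECLS: reuses the very same x and y objects as PARAMS
  body: [
    // INIT: z = x + y
    {
      kind: 'INIT',
      target: { kind: 'VAR_PROXY', variable: z },
      value: {
        kind: 'ADD',
        left: { kind: 'VAR_PROXY', variable: x },
        right: { kind: 'VAR_PROXY', variable: y },
      },
    },
    // RETURN z
    { kind: 'RETURN', value: { kind: 'VAR_PROXY', variable: z } },
  ],
};

// Evaluate the two expression kinds used above.
function evaluate(node, env) {
  switch (node.kind) {
    case 'VAR_PROXY': return env.get(node.variable);
    case 'ADD': return evaluate(node.left, env) + evaluate(node.right, env);
    default: throw new Error(`unsupported node: ${node.kind}`);
  }
}

function callFunction(func, args) {
  // Every declared variable starts as undefined; parameters are then bound to the arguments.
  const env = new Map(func.decls.map((variable) => [variable, undefined]));
  func.params.forEach((param, i) => env.set(param, args[i]));
  for (const stmt of func.body) {
    if (stmt.kind === 'INIT') env.set(stmt.target.variable, evaluate(stmt.value, env));
    else if (stmt.kind === 'RETURN') return evaluate(stmt.value, env);
  }
  return undefined;
}

console.log(callFunction(addAst, [1, 2]));         // 3
console.log(addAst.params[0] === addAst.decls[0]); // true
```

The last line mirrors the observation about addresses: the parameter x and the declared variable x are one and the same object, and each VAR_PROXY merely points at it.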

Generating scopes

While generating the AST, V8 also generates the scope of the add function. Use the following command to view it:

v8-debug --print-scopes kaimo.js

The output is:

Inner function scope:
function add () { // (00000203BED51A30) (12, 54)
  // NormalFunction
  // 2 heap slots
  // local vars:
  VAR y;  // (00000203BED42C30) never assigned
  VAR x;  // (00000203BED42BE8) never assigned
  VAR z;  // (00000203BED42C78) never assigned
}
Global scope:
global { // (00000203BED51840) (0, 78)
  // will be compiled
  // NormalFunction
  // 1 stack slots
  // temporary vars:
  TEMPORARY .result;  // (00000203BED51D70) local[0]
  // local vars:
  VAR add;  // (00000203BED51C00)
  // dynamic vars:
  DYNAMIC_GLOBAL console;  // (00000203BED51E30) never assigned

  function add () { // (00000203BED51A30) (12, 54)
    // lazily parsed
    // NormalFunction
    // 2 heap slots
  }
}
Global scope:
function add (x, y) { // (00000203BED4B400) (12, 54)
  // will be compiled
  // NormalFunction
  // 1 stack slots
  // local vars:
  VAR y;  // (00000203BED4B6D0) parameter[1], never assigned
  VAR x;  // (00000203BED4B650) parameter[0], never assigned
  VAR z;  // (00000203BED4B750) local[0], never assigned
}
3

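At this stage the variables in the scope have not been given values yet, so they all default to undefined. During execution, the variables in the scope will point to the corresponding data on the heap and the stack.

Relationship between the scope and the actual data:

[Figure: how variables in the scope point to data on the heap and the stack]

During parsing, all variables declared in the function body and all function parameters are placed into the scope. An ordinary variable defaults to undefined, while a function declaration points to the actual function object.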
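This behavior can be observed from JavaScript itself. The sketch below (the names demo and helper are made up for illustration) reads a var variable and a function declaration before the lines that define them have executed.

```javascript
function demo() {
  // Both names are already in the scope before any statement of the body runs.
  const before = {
    z: z,                      // declared with var below, so it exists but is still undefined
    helperType: typeof helper, // the function declaration already points to a function object
    helperResult: helper(),
  };
  var z = 1 + 2;               // the assignment itself only happens here, during execution
  function helper() { return 'ready'; }
  return { before, after: z };
}

const result = demo();
console.log(result.before.z);            // undefined
console.log(result.before.helperType);   // function
console.log(result.before.helperResult); // ready
console.log(result.after);               // 3
```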

Generating bytecode

With the scope and the AST generated, V8 can generate bytecode from them. Run the following command to view it:

v8-debug --print-bytecode kaimo.js

[generated bytecode for function:  (0x02c2002535b5 <SharedFunctionInfo>)]
Bytecode length: 43
Parameter count 1
Register count 6
Frame size 48
Bytecode age: 0
         000002C20025367E @    0 : 13 00             LdaConstant [0]
         000002C200253680 @    2 : c3                Star1
         000002C200253681 @    3 : 19 fe f8          Mov <closure>, r2
         000002C200253684 @    6 : 65 59 01 f9 02    CallRuntime [DeclareGlobals], r1-r2
         000002C200253689 @   11 : 21 01 00          LdaGlobal [1], [0]
         000002C20025368C @   14 : c2                Star2
         000002C20025368D @   15 : 2d f8 02 02       GetNamedProperty r2, [2], [2]
         000002C200253691 @   19 : c3                Star1
         000002C200253692 @   20 : 21 03 04          LdaGlobal [3], [4]
         000002C200253695 @   23 : c1                Star3
         000002C200253696 @   24 : 0d 01             LdaSmi [1]
         000002C200253698 @   26 : c0                Star4
         000002C200253699 @   27 : 0d 02             LdaSmi [2]
         000002C20025369B @   29 : bf                Star5
         000002C20025369C @   30 : 63 f7 f6 f5 06    CallUndefinedReceiver2 r3, r4, r5, [6]
         000002C2002536A1 @   35 : c1                Star3
         000002C2002536A2 @   36 : 5e f9 f8 f7 08    CallProperty1 r1, r2, r3, [8]
         000002C2002536A7 @   41 : c4                Star0
         000002C2002536A8 @   42 : a9                Return
Constant pool (size = 4)
000002C200253645: [FixedArray] in OldSpace
 - map: 0x02c200002239 <Map(FIXED_ARRAY_TYPE)>
 - length: 4
           0: 0x02c2002535fd <FixedArray[2]>
           1: 0x02c20000454d <String[7]: #console>
           2: 0x02c2001c27b9 <String[3]: #log>
           3: 0x02c2000041a1 <String[3]: #add>
Handler Table (size = 0)
Source Position Table (size = 0)
[generated bytecode for function: add (0x02c20025360d <SharedFunctionInfo add>)]
Bytecode length: 7
Parameter count 3
Register count 1
Frame size 8
Bytecode age: 0
         000002C2002537B6 @    0 : 0b 04             Ldar a1
         000002C2002537B8 @    2 : 39 03 00          Add a0, [0]
         000002C2002537BB @    5 : c4                Star0
         000002C2002537BC @    6 : a9                Return
Constant pool (size = 0)
Handler Table (size = 0)
Source Position Table (size = 0)
3

For the add function, Parameter count is 3, which tells us there are three parameters: x and y, which are passed explicitly, plus this, which is passed implicitly.
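The implicit this can be seen at the language level as well. In the following sketch (the receiver object is hypothetical), add.length reports only the two declared parameters, yet the function still receives a third value, the receiver, on every call. This matches the two-plus-one count explained above.

```javascript
function add(x, y) {
  // `this` is not in the parameter list, but it is still passed in on every call.
  return { sum: x + y, receiver: this };
}

const receiver = { name: 'receiver' };
const viaCall = add.call(receiver, 1, 2);

console.log(add.length);                    // 2
console.log(viaCall.sum);                   // 3
console.log(viaCall.receiver === receiver); // true
```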

However, the bytecode shown by Li Bing, the author of the course, is as follows:

StackCheck
Ldar a1
Add a0, [0]
Star r0
LdaSmi [2]

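Part of the bytecode instruction set defined in V8

The JavaScript code above is ultimately compiled by V8 into these lines of bytecode. Each bytecode performs a specific function: some perform arithmetic, some perform jumps, some return from a function, and some read from memory.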
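As a rough illustration of "one bytecode, one specific function", here is a toy interpreter for the four bytecodes printed for add (Ldar a1, Add a0, Star0, Return). Everything in it is an assumption made for the sketch: the handlers table, the frame object, and the run function are invented names, the feedback-slot operand [0] is dropped, and Ignition's real handlers are generated machine code, not JavaScript functions.

```javascript
// Hypothetical toy interpreter: one accumulator, argument registers a0/a1, local register r0.
const handlers = {
  // Ldar <reg>: load a register into the accumulator.
  Ldar(frame, reg) { frame.acc = frame.regs[reg]; },
  // Add <reg>: accumulator = register + accumulator.
  Add(frame, reg) { frame.acc = frame.regs[reg] + frame.acc; },
  // Star0: store the accumulator into register r0.
  Star0(frame) { frame.regs.r0 = frame.acc; },
  // Return: stop and hand back whatever is in the accumulator.
  Return(frame) { frame.done = true; },
};

// The bytecode printed for add, without the feedback-slot operand.
const addBytecode = [
  ['Ldar', 'a1'],
  ['Add', 'a0'],
  ['Star0'],
  ['Return'],
];

function run(bytecode, x, y) {
  const frame = { acc: undefined, regs: { a0: x, a1: y, r0: undefined }, done: false };
  for (let pc = 0; pc < bytecode.length && !frame.done; pc++) {
    const [name, ...operands] = bytecode[pc];
    handlers[name](frame, ...operands); // dispatch to the handler for this bytecode
  }
  return frame.acc;
}

console.log(run(addBytecode, 1, 2)); // 3
```

Note that no extra load is needed before Return: after Star0 the accumulator still holds the value of z, so it can be returned directly.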

V8 has a large number of bytecode instructions. For the full list, see https://github.com/v8/v8/blob/master/src/interpreter/bytecodes.h. Here is an excerpt from that file:

// The list of bytecodes which have unique handlers (no other bytecode is
// executed using identical code).
// Format is V(<bytecode>, <implicit_register_use>, <operands>).
#define BYTECODE_LIST_WITH_UNIQUE_HANDLERS(V)\
  /* Extended width operands */                                                \
  V(Wide, ImplicitRegisterUse::kNone)\
  V(ExtraWide, ImplicitRegisterUse::kNone)\
                                                                               \
  /* Debug Breakpoints - one for each possible size of unscaled bytecodes */   \
  /* and one for each operand widening prefix bytecode                    */   \
  V(DebugBreakWide, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(DebugBreakExtraWide, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(DebugBreak0, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(DebugBreak1, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kReg)\
  V(DebugBreak2, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kReg, OperandType::kReg)\
  V(DebugBreak3, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kReg, OperandType::kReg, OperandType::kReg)\
  V(DebugBreak4, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kReg, OperandType::kReg, OperandType::kReg,\
    OperandType::kReg)\
  V(DebugBreak5, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kRuntimeId, OperandType::kReg, OperandType::kReg)\
  V(DebugBreak6, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kRuntimeId, OperandType::kReg, OperandType::kReg,\
    OperandType::kReg)\
                                                                               \
  /* Side-effect-free bytecodes -- carefully ordered for efficient checks */   \
  /* - [Loading the accumulator] */                                            \
  V(Ldar, ImplicitRegisterUse::kWriteAccumulator, OperandType::kReg)\
  V(LdaZero, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaSmi, ImplicitRegisterUse::kWriteAccumulator, OperandType::kImm)\
  V(LdaUndefined, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaNull, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaTheHole, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaTrue, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaFalse, ImplicitRegisterUse::kWriteAccumulator)\
  V(LdaConstant, ImplicitRegisterUse::kWriteAccumulator, OperandType::kIdx)\
  V(LdaContextSlot, ImplicitRegisterUse::kWriteAccumulator, OperandType::kReg,\
    OperandType::kIdx, OperandType::kUImm)\
  V(LdaImmutableContextSlot, ImplicitRegisterUse::kWriteAccumulator,\
    OperandType::kReg, OperandType::kIdx, OperandType::kUImm)\
  V(LdaCurrentContextSlot, ImplicitRegisterUse::kWriteAccumulator,\
    OperandType::kIdx)\
  V(LdaImmutableCurrentContextSlot, ImplicitRegisterUse::kWriteAccumulator,\
    OperandType::kIdx)\
  /* - [Register Loads ] */                                                    \
  V(Star, ImplicitRegisterUse::kReadAccumulator, OperandType::kRegOut)\
  V(Mov, ImplicitRegisterUse::kNone, OperandType::kReg, OperandType::kRegOut)\
  V(PushContext, ImplicitRegisterUse::kReadAccumulator, OperandType::kRegOut)\
  V(PopContext, ImplicitRegisterUse::kNone, OperandType::kReg)\
  /* - [Test Operations ] */                                                   \
  V(TestReferenceEqual, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kReg)\
  V(TestUndetectable, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(TestNull, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(TestUndefined, ImplicitRegisterUse::kReadWriteAccumulator)\
  V(TestTypeOf, ImplicitRegisterUse::kReadWriteAccumulator,\
    OperandType::kFlag8)\
                                                                               \
  /* Globals */                                                                \
  V(LdaGlobal, ImplicitRegisterUse::kWriteAccumulator, OperandType::kIdx,\
    OperandType::kIdx)\
  V(LdaGlobalInsideTypeof, ImplicitRegisterUse::kWriteAccumulator,\
    OperandType::kIdx, OperandType::kIdx)\
  V(StaGlobal, ImplicitRegisterUse::kReadWriteAccumulator, OperandType::kIdx,\
    OperandType::kIdx)\
                                                                               \
...

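List of bytecode instructions:

[Figure: list of bytecode instructions]

Overall design of the V8 interpreter

There are generally two types of interpreters:

Stack-based interpreters: use a stack to hold function parameters, intermediate results, variables, and so on. Examples include the Java virtual machine, the .NET virtual machine, and early versions of the V8 virtual machine.
Register-based interpreters: support register-based instructions and use registers to hold parameters and intermediate results. The current V8 virtual machine is an example.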
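For contrast with the register-based bytecode printed earlier for add (Ldar a1, Add a0, [0], ...), here is a minimal sketch of how a stack-based interpreter could execute the same function body. The instruction names (push_arg, store_local, and so on) and the runStack function are hypothetical and are not taken from any real virtual machine.

```javascript
// Hypothetical stack-machine program for: var z = x + y; return z
const program = [
  ['push_arg', 0],    // push x
  ['push_arg', 1],    // push y
  ['add'],            // pop two values, push their sum
  ['store_local', 0], // pop the sum into local slot 0 (z)
  ['push_local', 0],  // push z again
  ['ret'],            // pop the top of the stack and return it
];

function runStack(instructions, args) {
  const stack = [];
  const locals = [];
  for (const [op, operand] of instructions) {
    switch (op) {
      case 'push_arg': stack.push(args[operand]); break;
      case 'push_local': stack.push(locals[operand]); break;
      case 'store_local': locals[operand] = stack.pop(); break;
      case 'add': {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      case 'ret': return stack.pop();
      default: throw new Error(`unknown instruction: ${op}`);
    }
  }
  return undefined;
}

console.log(runStack(program, [1, 2])); // 3
```

In the stack-based form the operands of add are implicit (whatever is on top of the stack), whereas the register-based listing names its operands explicitly (a0, a1) and keeps the intermediate result in the accumulator.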