PostgreSQL Database Network Layer: the libpq Frontend/Backend Protocol

_铁马冰河_ 2022-04-13, 92 views

PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). The protocol is supported over TCP/IP and also over Unix-domain sockets. Port number 5432 has been registered with IANA as the customary TCP port number for servers supporting this protocol, but in practice any non-privileged port number can be used.

This document describes version 3.0 of the protocol, implemented in PostgreSQL 7.4 and later. For descriptions of the earlier protocol versions, see previous releases of the PostgreSQL documentation. A single server can support multiple protocol versions. The initial startup-request message tells the server which protocol version the client is attempting to use. If the server does not support the major version requested by the client, the connection is rejected (for example, this would occur if the client requested protocol version 4.0, which does not exist as of this writing). If the server does not support the requested minor version (e.g., the client requests version 3.1, but the server supports only 3.0), the server may either reject the connection or respond with a NegotiateProtocolVersion message containing the highest minor protocol version it supports. The client may then choose either to continue with the connection using the specified protocol version or to abort the connection.
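To make the version field concrete, here is a minimal sketch of how a client might pack the requested protocol version into a startup message. It assumes the layout given in the Message Formats section of the PostgreSQL documentation: a 32-bit length, a 32-bit version number (major version in the upper 16 bits, minor version in the lower 16 bits), then null-terminated key/value pairs closed by one extra null byte. The function names and the `user`/`database` values are illustrative, not part of libpq.

```python
import struct
from typing import Tuple


def encode_protocol_version(major: int, minor: int) -> int:
    """Pack major/minor into the 32-bit version number sent at startup."""
    return (major << 16) | minor


def decode_protocol_version(code: int) -> Tuple[int, int]:
    """Split a 32-bit version number back into (major, minor)."""
    return code >> 16, code & 0xFFFF


def build_startup_message(user: str, database: str,
                          major: int = 3, minor: int = 0) -> bytes:
    """Build a startup message; note that it has no message-type byte."""
    body = struct.pack("!i", encode_protocol_version(major, minor))
    for key, value in (("user", user), ("database", database)):
        body += key.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    body += b"\x00"  # terminator after the last key/value pair
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


if __name__ == "__main__":
    print(encode_protocol_version(3, 0))  # 196608
    print(build_startup_message("alice", "appdb").hex())
```

A server that sees a different value in the upper 16 bits than it supports would reject the connection, while a mismatch only in the lower 16 bits is the case in which NegotiateProtocolVersion may be sent.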

In order to serve multiple clients efficiently, the server launches a new “backend” process for each client. In the current implementation, a new child process is created immediately after an incoming connection is detected. This is transparent to the protocol, however. For purposes of the protocol, the terms “backend” and “server” are interchangeable; likewise “frontend” and “client” are interchangeable.

Overview

The protocol has separate phases for startup and normal operation. In the startup phase, the frontend opens a connection to the server and authenticates itself to the satisfaction of the server. (This might involve a single message, or multiple messages depending on the authentication method being used.) If all goes well, the server then sends status information to the frontend, and finally enters normal operation. Except for the initial startup-request message, this part of the protocol is driven by the server.

During normal operation, the frontend sends queries and other commands to the backend, and the backend sends back query results and other responses. There are a few cases (such as NOTIFY) in which the backend sends unsolicited messages, but for the most part this portion of a session is driven by frontend requests.

Termination of the session is normally by frontend choice, but can be forced by the backend in certain cases. In any case, when the backend closes the connection, it will roll back any open (incomplete) transaction before exiting.

Within normal operation, SQL commands can be executed through either of two sub-protocols. In the “simple query” protocol, the frontend just sends a textual query string, which is parsed and immediately executed by the backend. In the “extended query” protocol, processing of queries is separated into multiple steps: parsing, binding of parameter values, and execution. This offers flexibility and performance benefits, at the cost of extra complexity.

Normal operation has additional sub-protocols for special operations such as COPY.

Messaging Overview

All communication is through a stream of messages. The first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message (this length count includes itself, but not the message-type byte). The remaining contents of the message are determined by the message type. For historical reasons, the very first message sent by the client (the startup message) has no initial message-type byte.
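The framing rule above can be sketched in a few lines of Python. This is an illustrative helper pair, not code from libpq: `encode_message` and `decode_message` are hypothetical names, and the example payload is a simple-query (`Q`) message, whose body is assumed to be a single null-terminated query string as described in the PostgreSQL message format reference.

```python
import struct
from typing import Tuple


def encode_message(msg_type: bytes, payload: bytes) -> bytes:
    """Frame a regular (non-startup) message: type byte, length, payload."""
    if len(msg_type) != 1:
        raise ValueError("message type must be exactly one byte")
    # The length counts its own four bytes but not the type byte.
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def decode_message(data: bytes) -> Tuple[bytes, bytes]:
    """Split one complete framed message into (type, payload)."""
    msg_type = data[:1]
    (length,) = struct.unpack("!i", data[1:5])
    # The whole message occupies 1 + length bytes.
    return msg_type, data[5:1 + length]


if __name__ == "__main__":
    framed = encode_message(b"Q", b"SELECT 1\x00")
    print(framed)                  # b'Q\x00\x00\x00\rSELECT 1\x00'
    print(decode_message(framed))  # (b'Q', b'SELECT 1\x00')
```

The payload here is 9 bytes, so the length field holds 13 and the message occupies 14 bytes on the wire.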

To avoid losing synchronization with the message stream, both servers and clients typically read an entire message into a buffer (using the byte count) before attempting to process its contents. This allows easy recovery if an error is detected while processing the contents. In extreme situations (such as not having enough memory to buffer the message), the receiver can use the byte count to determine how much input to skip before it resumes reading messages.
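One way to implement this buffering is sketched below, assuming the type-byte-plus-length framing of regular messages (the startup message, which has no type byte, would need separate handling). `MessageBuffer` is a hypothetical class written for this illustration; it accumulates raw bytes from the socket and only hands back messages that have arrived in full.

```python
import struct
from typing import List, Tuple


class MessageBuffer:
    """Accumulates stream bytes and yields only complete messages."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[Tuple[bytes, bytes]]:
        """Add received bytes; return every (type, payload) now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= 5:  # need type byte + 4-byte length
            (length,) = struct.unpack_from("!i", self._buf, 1)
            if length < 4:
                raise ValueError("invalid length; stream is out of sync")
            total = 1 + length  # the length excludes the type byte
            if len(self._buf) < total:
                break  # wait for the rest of this message
            messages.append((bytes(self._buf[:1]), bytes(self._buf[5:total])))
            del self._buf[:total]
        return messages


if __name__ == "__main__":
    buf = MessageBuffer()
    # A ReadyForQuery message ('Z', length 5, status 'I') split across reads.
    print(buf.feed(b"Z\x00\x00"))   # []
    print(buf.feed(b"\x00\x05I"))   # [(b'Z', b'I')]
```

Because nothing is processed until `total` bytes are present, a handler that fails on one message can discard it and continue with the next without losing the message boundary.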

Conversely, both servers and clients must take care never to send an incomplete message. This is commonly done by marshaling the entire message in a buffer before beginning to send it. If a communications failure occurs partway through sending or receiving a message, the only sensible response is to abandon the connection, since there is little hope of recovering message-boundary synchronization.

Extended Query Overview

In the extended-query protocol, execution of SQL commands is divided into multiple steps. The state retained between steps is represented by two types of objects: prepared statements and portals. A prepared statement represents the result of parsing and semantic analysis of a textual query string. A prepared statement is not in itself ready to execute, because it might lack specific values for parameters. A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in. (For SELECT statements, a portal is equivalent to an open cursor, but we choose to use a different term since cursors don’t handle non-SELECT statements.)

The overall execution cycle consists of a parse step, which creates a prepared statement from a textual query string; a bind step, which creates a portal given a prepared statement and values for any needed parameters; and an execute step that runs a portal’s query. In the case of a query that returns rows (SELECT, SHOW, etc.), the execute step can be told to fetch only a limited number of rows, so that multiple execute steps might be needed to complete the operation.

The backend can keep track of multiple prepared statements and portals (but note that these exist only within a session, and are never shared across sessions). Existing prepared statements and portals are referenced by names assigned when they were created. In addition, an “unnamed” prepared statement and portal exist. Although these behave largely the same as named objects, operations on them are optimized for the case of executing a query only once and then discarding it, whereas operations on named objects are optimized on the expectation of multiple uses.
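The parse, bind, and execute steps map onto the Parse (`P`), Bind (`B`), and Execute (`E`) messages, usually followed by Sync (`S`). The sketch below builds those messages under a few stated assumptions: field layouts follow the Message Formats section of the PostgreSQL documentation, all parameters and results use text format, no parameter type OIDs are supplied, and an empty name selects the unnamed statement or portal. The helper names are hypothetical and no network I/O is performed.

```python
import struct
from typing import List, Optional


def _frame(msg_type: bytes, payload: bytes) -> bytes:
    """Type byte + length (counting itself, not the type byte) + payload."""
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def _cstr(text: str) -> bytes:
    """Null-terminated string as used in protocol messages."""
    return text.encode("utf-8") + b"\x00"


def parse_message(statement: str, sql: str) -> bytes:
    # Zero parameter type OIDs: the server is left to infer the types.
    return _frame(b"P", _cstr(statement) + _cstr(sql) + struct.pack("!h", 0))


def bind_message(portal: str, statement: str,
                 params: List[Optional[str]]) -> bytes:
    payload = _cstr(portal) + _cstr(statement)
    payload += struct.pack("!h", 0)            # no format codes: all text
    payload += struct.pack("!h", len(params))  # number of parameter values
    for value in params:
        if value is None:
            payload += struct.pack("!i", -1)   # length -1 marks SQL NULL
        else:
            raw = value.encode("utf-8")
            payload += struct.pack("!i", len(raw)) + raw
    payload += struct.pack("!h", 0)            # result columns: all text
    return _frame(b"B", payload)


def execute_message(portal: str, max_rows: int = 0) -> bytes:
    # max_rows == 0 means "no limit"; a positive value fetches in batches.
    return _frame(b"E", _cstr(portal) + struct.pack("!i", max_rows))


def sync_message() -> bytes:
    return _frame(b"S", b"")


if __name__ == "__main__":
    # One-shot query through the unnamed statement and unnamed portal.
    batch = (parse_message("", "SELECT $1")
             + bind_message("", "", ["42"])
             + execute_message("")
             + sync_message())
    print(batch.hex())
```

Passing a non-empty statement name to `parse_message` would instead create a named prepared statement that later `bind_message` calls in the same session could reuse.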

Formats and Format Codes

Data of a particular data type might be transmitted in any of several different formats. As of PostgreSQL 7.4 the only supported formats are “text” and “binary”, but the protocol makes provision for future extensions. The desired format for any value is specified by a format code. Clients can specify a format code for each transmitted parameter value and for each column of a query result. Text has format code zero, binary has format code one, and all other format codes are reserved for future definition.

The text representation of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type. In the transmitted representation, there is no trailing null character; the frontend must add one to received values if it wants to process them as C strings. (The text format does not allow embedded nulls, by the way.)

Binary representations for integers use network byte order (most significant byte first). For other data types consult the documentation or source code to learn about the binary representation. Keep in mind that binary representations for complex data types might change across server versions; the text format is usually the more portable choice.
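As a small illustration of the two format codes, the sketch below encodes a 32-bit integer both ways. It assumes a 4-byte `int4` value, whose binary form is four bytes in network byte order and whose text form is the decimal string; `encode_int4` and `decode_int4` are hypothetical helper names, not libpq functions.

```python
import struct

TEXT_FORMAT = 0    # format code zero
BINARY_FORMAT = 1  # format code one


def encode_int4(value: int, format_code: int) -> bytes:
    """Encode a 32-bit integer in the requested wire format."""
    if format_code == TEXT_FORMAT:
        return str(value).encode("ascii")  # no trailing null on the wire
    if format_code == BINARY_FORMAT:
        return struct.pack("!i", value)    # most significant byte first
    raise ValueError(f"reserved format code: {format_code}")


def decode_int4(data: bytes, format_code: int) -> int:
    """Decode a 32-bit integer received in the given wire format."""
    if format_code == TEXT_FORMAT:
        return int(data.decode("ascii"))
    if format_code == BINARY_FORMAT:
        (value,) = struct.unpack("!i", data)
        return value
    raise ValueError(f"reserved format code: {format_code}")


if __name__ == "__main__":
    print(encode_int4(258, TEXT_FORMAT))    # b'258'
    print(encode_int4(258, BINARY_FORMAT))  # b'\x00\x00\x01\x02'
```

The text form varies in length with the value, while the binary form is always four bytes; in both cases the length travels separately in the enclosing message.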

Message Flow

This section describes the message flow and the semantics of each message type. (Details of the exact representation of each message appear in Section 53.7 of the PostgreSQL documentation.) There are several different sub-protocols depending on the state of the connection: start-up, query, function call, COPY, and termination. There are also special provisions for asynchronous operations (including notification responses and command cancellation), which can occur at any time after the start-up phase.

Start-up

Simple Query

Extended Query

Function Call

COPY Operations

Asynchronous Operations

Canceling Requests in Progress

Termination

SSL Session Encryption

GSSAPI Session Encryption

SASL Authentication

SCRAM-SHA-256 Authentication

Streaming Replication Protocol

Logical Streaming Replication Protocol

Logical Streaming Replication Parameters

Logical Replication Protocol Messages

Logical Replication Protocol Message Flow

Message Data Types

Message Formats

Error and Notice Message Fields

Logical Replication Message Formats

Summary of Changes since Protocol 2.0
