A cursory examination of AMQP reveals that it is a binary protocol. Methods like Queue.Create are defined as binary frames in which each field sits at a fixed position and is carefully encoded to make the best use of space. Binary encoding of numbers is much more compact than string encoding: “123” needs one byte in binary and three as a string. Because AMQP is a binary protocol, it is fast to parse. Because there is no text to parse, strings are safe to handle. Because AMQP uses binary encoding, AMQP software is not exposed to buffer overflow attacks. And because AMQP is specified as easy-to-parse XML, generating the encoding and decoding routines is simple.
Or not. The evidence, in my opinion, indicates that AMQP’s binary encoding was a mistake, and originally my mistake. Let’s examine the benefits and costs of this approach, and break down the basic assumptions it was built on. Finally, let’s compare it with an alternative approach based on what I think are more accurate assumptions.
The benefits of AMQP’s binary encoding are:
It is more compact.
It is easier to parse than a text format.
It is safer than string parsing.
The codecs can be fully generated.
It is easy to process in silicon.
The costs of AMQP’s binary encoding are:
It requires codecs in the first place.
It produces endlessly divergent variants of AMQP.
It is harder to understand and use than string encoding.
It puts heavy emphasis on data types.
It makes even the simplest client API very complex.
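To make the trade-off concrete, here is a minimal Python sketch that encodes the same command both ways. The command, its two fields, and the binary layout are hypothetical and chosen only for illustration; this is not AMQP's actual wire format.

```python
import struct

# Hypothetical command with one string field and one numeric field.
queue_name = "orders"
max_length = 123

# Binary form: a 1-byte name length, the name bytes, then the number
# as a single unsigned byte. Both sides must agree on this exact layout.
name_bytes = queue_name.encode("ascii")
binary_frame = struct.pack(
    f"!B{len(name_bytes)}sB", len(name_bytes), name_bytes, max_length
)

# Text form: "Header: value" lines (header names are illustrative).
text_frame = (
    f"Queue: {queue_name}\r\nMax-Length: {max_length}\r\n".encode("ascii")
)


def decode_binary(frame):
    """Decode the hypothetical binary layout; this is the codec cost."""
    (name_len,) = struct.unpack_from("!B", frame, 0)
    name, number = struct.unpack_from(f"!{name_len}sB", frame, 1)
    return name.decode("ascii"), number


# The number alone: 1 byte in binary versus 3 bytes as a string.
print(len(struct.pack("!B", 123)), len(b"123"))

# The whole command: the binary frame is smaller, but unreadable without
# the codec, while the text frame can be read as-is.
print(len(binary_frame), binary_frame)
print(len(text_frame), text_frame)
print(decode_binary(binary_frame))
```

The binary frame is more compact, but a reader needs the field layout to make sense of it, and any change to that layout means changing the codec on both sides.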
Now, a fast, efficient wire-level encoding is unquestionably worth having. The argument went that since AMQP is an enterprise technology built for speed, complicated APIs are not a major issue. A TELNET client can be used to experiment with text-based protocols like HTTP, but nobody actually writes their own HTTP client; they use existing libraries.
So the costs of binary encoding were accepted as the necessary price of a fast, reliable protocol. In our view, a text-encoded protocol like HTTP could not be expected to deliver the required performance.
The fundamental assumption behind AMQP’s encoding was that it was required for performance (fast parsing, compact data). If I can demonstrate that this assumption is wrong, I will have removed the primary justification for binary encoding.
A very quick protocol
Here is a quick test of how well you know your protocols. What is the fastest common messaging protocol, one that is present in every modern operating system, supported by every current web browser, and able to saturate ordinary networks? It is substantially faster than HTTP and significantly faster than AMQP. In fact, implementations of this protocol would be the fastest messaging software ever built if they did not depend on reading and writing data to disk.
The answer is FTP, the simple file transfer protocol, a favourite of network engineers who want to check whether a link is configured for 100Mbps or 1Gbps: FTP can push data down the connection fast enough to establish the network’s speed beyond doubt.
Here is where it gets interesting, and where my question comes in. What is it about FTP that lets it send data so fast? And what lesson does this hold for AMQP?
Interestingly, my views on AMQP performance come from the work done by iMatix and FastMQ on ZeroMQ, an AMQ-inspired system that can send millions of messages per second. ZeroMQ is so fast because it uses FTP-style techniques rather than AMQP’s.
FTP wins because it uses one connection for control commands and a separate one for message transfer. Later protocols, including HTTP, did not do this. Yet FTP is typically faster than HTTP, and simpler. Faster and simpler are desirable properties.
The false assumptions of binary encoding
The basic premise of AMQP, that binary encoding is required, can be broken down into more specific assumptions, each of which is false:
That control commands like Queue.Create need to be optimised. The assumption is that these commands matter for performance. In reality they make up a tiny fraction of messaging activity; the vast majority is message transfer, not control.
That control commands need to travel on the same network connection as messages. The assumption is that a logical package of control commands and message data must follow the same physical path. In fact they can travel on very different paths, even across different parts of the network.
That control commands and message transfer need the same encoding. In fact each part of the protocol can use whatever encoding suits it best, so there is no reason to force a single encoding model on both.
AMQP should, from the beginning, have separated control from data, using the simplest feasible encoding for commands and the simplest feasible encoding for messages. The value of simplicity cannot be overstated, especially in young protocols that must accommodate rapid growth.
Separating control from data greatly simplifies the dialogue between client and server. Writing AMQP clients is hard because it requires understanding a complicated series of exchanges that mix control and data. It turns out that splitting these into two dialogues makes each one considerably simpler than the single merged dialogue.
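As a rough illustration of the split, the Python sketch below runs a control dialogue and a data dialogue over two independent connections, FTP style. It uses in-process socket pairs in place of real network connections, and the command name and headers are hypothetical; it is a sketch under those assumptions, not a proposed protocol.

```python
import socket


def recv_exact(sock, count):
    """Read exactly `count` bytes from a stream socket."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all data arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_command(sock):
    """Read one blank-line-terminated block of 'Header: value' lines."""
    buffer = b""
    while not buffer.endswith(b"\r\n\r\n"):
        byte = sock.recv(1)
        if not byte:
            raise ConnectionError("peer closed mid-command")
        buffer += byte
    headers = {}
    for line in buffer.decode("ascii").split("\r\n"):
        if line:
            name, _, value = line.partition(":")
            headers[name.strip()] = value.strip()
    return headers


# Two independent connections: one for control, one for data.
control_client, control_server = socket.socketpair()
data_client, data_server = socket.socketpair()

payload = b"hello"

# Client side: a text command on the control connection,
# the raw message bytes on the data connection.
control_client.sendall(
    f"Command: publish\r\nLength: {len(payload)}\r\n\r\n".encode("ascii")
)
data_client.sendall(payload)

# Server side: the control dialogue says how much to read; the data
# dialogue is nothing but message bytes.
command = read_command(control_server)
body = recv_exact(data_server, int(command["Length"]))
print(command, body)

for sock in (control_client, control_server, data_client, data_server):
    sock.close()
```

Each dialogue stays simple on its own: the control side only ever parses short text commands, and the data side only ever moves bytes.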
Having separated them, we are free to choose the most straightforward encoding for each kind of work. The simplest feasible encoding for commands is text, using (for instance) the widely used “Header: value” syntax from HTTP, SMTP, etc. This is simple to parse with regular expressions. Attacks on this type of encoding are well known and simple to counter. Everything is a string, and there are no odd data types.
Text-encoded commands bring further advantages:
They are immediately readable by developers.
Maintaining backward compatibility becomes simple.
Writing clients becomes easier.
Testing and debugging AMQP becomes simpler.
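A minimal Python sketch of parsing “Header: value” commands with a regular expression follows. The header grammar, the line-length limit, and the command shown are assumptions made for illustration, not part of any AMQP specification.

```python
import re

# Hypothetical limit and grammar, chosen for illustration only.
MAX_LINE_LENGTH = 256
HEADER_LINE = re.compile(r"([A-Za-z][A-Za-z0-9-]*):[ \t]*(.*?)[ \t]*")


def parse_command(text):
    """Parse 'Header: value' lines into a dict, rejecting malformed input."""
    headers = {}
    for line in text.split("\r\n"):
        if not line:
            continue
        # Bounding the line length before matching is one simple
        # defence against oversized input.
        if len(line) > MAX_LINE_LENGTH:
            raise ValueError("header line too long")
        match = HEADER_LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"malformed header line: {line!r}")
        name, value = match.groups()
        # Every value stays a string; header names are case-insensitive here.
        headers[name.lower()] = value
    return headers


command = parse_command(
    "Command: queue.create\r\nQueue: orders\r\nDurable: true\r\n"
)
print(command)
```

Because every value is a string and every line is either accepted or rejected whole, the parser needs no type system, and a command can be typed by hand or read directly from a network trace.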