The Advanced Message Queuing Protocol, or AMQP, is the subject of this article. Some very large companies and some smaller suppliers are trying to change the rules of the commercial messaging industry with AMQP, and in some respects they are succeeding. This sector has two leading suppliers, IBM and Tibco, and only a few open-source rivals. That is detrimental to both customers and vendors. Simply put, when there is no genuine rivalry in a technological field, nothing rises to the top. Nobody benefits from sterility, since the old money eventually runs out. Take a look at Linux, HTTP, and SQL, and the new wealth that has developed on top of these now-basic technologies. Now look at messaging: the closest it has come to standardisation is in the Java world. And that is insufficient.
Let me briefly describe how standards operate. You might believe that the main goal is to make cutting-edge new ideas accessible to the general population. Standards actually work more to stifle innovation than to promote it. To be more precise, by stopping innovation at one level, more of it is permitted at higher levels.
Standardisation, then, inevitably progresses from more fundamental and widely used technology to more advanced and specifically utilised ones. The stack of standards thins out over time like sedimentary seams, and useless stacks disappear, leaving fewer and fewer fundamental standards supporting the whole computing industry. AMQP is distinctive in that it builds on protocols like TCP rather than HTTP at the bottom of the stack. Taking a single step from a very low level (data on the wire) to a very high level (application semantics) is likewise exceptionally ambitious. Later, we’ll talk more about how well that goal was executed.
The idea of a natural monopoly serves as the economic foundation for setting standards. This implies that a successful standard will draw in and retain all users (at least in its own context). Money is an example: when the state mandates that we all use a specific currency unit, that currency becomes a monopoly, and holding any other currency means you can only trade at a penalty.
The Internet Protocol, energy, telephones, and rail transport are examples of natural monopolies. Your toaster should fit with any electrical outlet.
A successful natural monopoly eliminates a lot of waste, variously known as “friction costs,” “transaction costs,” or perhaps “excess profits,” when it arises through chance, legislation, or market forces. Natural monopolies can generate a lot of wealth. Vendors, the people who sell things, have a commensurate incentive to try to capture that value, to replace the profits that too much of Adam Smith’s invisible hand would otherwise wipe out. A natural monopoly can reward its users by releasing value: the Internet is a good illustration. Or it can punish them, by capturing them and taxing them mercilessly: your cell phone bill is one example.
Thus, when vendors participate in standards development, there is frequently a basic conflict of interest, which is ironic given that nearly all conventional standards consortia are led by vendors. How can we free up all of that trapped value while still catching some of it? Patents and other forms of control are the solution in typical standards consortia. As a result, the Digital Standards Organisation defines an open standard as one that “is immune, at all stages of its life-cycle, from vendor capture” (disclaimer: I am one of the founders). A clever one-liner that irritated some individuals but encouraged many others to examine this conflict of interest further.
When people ask, “What makes iMatix different from other vendors, then?”, I’d like to say it’s because we are nicer people, but that would be misleading. Each vendor can choose to pursue the natural monopoly or to compete on quality and performance. The vendor’s size and position in the market determine which approach works best. Competition suits small, adaptable vendors like iMatix. In a competitive market, customers will gladly pay more for the best products. A successful open standard also paves the way for expansion into a lot of new markets in the future.
AMQP targets a market that has enormous captive value. The incumbents essentially set their own prices, significantly higher than the initial costs of software licences, in the aftermarket for expert services. No credible open source alternatives exist. Serious open standards don’t exist.
AMQP was created with the goal of turning messaging into a dull, solved problem and a natural monopoly, and so releasing the captive value of the messaging market to customers and to new businesses. To some extent this is working. First, people are seriously considering AMQP as a solution and have bought into this bold vision; it appears that everyone agrees that AMQP’s mission statement makes sense. Second, open-source AMQP-based solutions are already outperforming conventional messaging in some very big projects.
AMQP, however, faces several challenges, as any nascent technology does. Without a defined plan or design, the specification appears to be out of control and getting more and more complicated. Except in a few specific situations, AMQP products do not interoperate. More ambitious functionality is not happening, and core protocol components are unreliable. Many of the initial developers have burned out, and the Working Group is wearing out.
At this point, the majority of my iMatix team has written off AMQP as a “failure”. These engineers are capable: one member of my team began hacking the Linux kernel at the age of 11. In 2007 he joined me for the two-day AMQP face-to-face meeting in New York, after which he said to me, “Pieter, these are pointless discussions, and if this is how things happen, AMQP will never succeed. If iMatix wants to waste its time and money, fine, but please don’t ask me to work on this project, or I will quit.”
His immediate, harsh assessment of the inside of the sausage factory was regrettable, but not incorrect. I had been waiting for his reaction to either confirm or contradict my own. Soon after, iMatix made the decision to halt all investment in AMQP and wait for things to develop as they would, joining some of the other players. The AMQP strategy is currently on the verge of failing, and the vision is disintegrating. Moreover, the downward trend was predictable, and obvious, by late 2006.
Discussions in the Working Group are now open to the public, but discussing problems is frowned upon because “it scares potential customers.” AMQP/1.0 is anticipated for the end of 2008. I’m prepared to wager that unless there is a rollback to a simpler, earlier version of the specification, AMQP/1.0 won’t be released in 2008 or 2009.
The Working Group is more likely to slowly disintegrate as it finds itself on the verge of missing one more deadline while unable to decide whether to go ahead, sideways, or backward.
Since 2005, iMatix has made significant investments in AMQP and its supporting technologies, such as OpenAMQ. In previous years, we spent so much money on it that it almost put us out of business and burned out a number of important team members. If AMQP does not work, we will have a significant issue. It would be a management failure if I didn’t do everything I could to mitigate the risk to my organisation, including producing this post.
The chance of failure for AMQP has increased with time. Any potentially profitable protocol will be fair game for the worst type of battles, as I already outlined. People immediately begin to consider “how can we kill the competition” after realising “this protocol means new business,” not realising that their rivals are actually assisting in the development of their future market.
To some extent, this is a normal stage in the process, because AMQP needs to confront and overcome its own circumstances in order to survive. It was inevitable that a brand-new open standard in a wealthy industry would run into the “vendor capture” conflict of interest and either resolve it or fail. Developers and supporters of AMQP must face and resolve the problems stopping AMQP from progressing. That means carefully examining the causes of the wounds, however painful they may be, and carrying out the required surgery without self-pity, pride, or shame.
Making AMQP is a social challenge rather than a technological one.
I have a theory about design: “the more you know about something, the harder it is to solve it.” OK, I admit I have a thing with funny one-liners. The key point is that solving complicated technical issues does not necessitate a high level of technical expertise; in fact, this actually gets in the way. There are a lot of very qualified engineers who are deeply technical and who would appreciate working on something as exciting as AMQP. There are thousands of them.
Making difficult problems simple for others to solve is a good strategy for solving them. Competent engineers frequently fail in this area because they comprehend technology but not people. Complexity is a human problem, and excellent design is about getting over psychological barriers rather than technical ones.
In his essay, Paul Gerrard claimed that “software does not obey the laws of physics” and compared the development of software to the construction of bridges. At a roundtable discussion about this phrase at a conference, I responded, “Software does obey physics, but it’s the physics of people, not bricks.”
Standards for software are no different. All of AMQP’s problems stem from the Working Group’s inability to see and address its own human limits. We concentrated solely on messaging technology, failed to grasp the physics of people, and cycled through most of the Bible’s seven deadly sins along the way.
As a result, the Working Group is currently on the verge of imploding due to repeated failures to meet deadlines, ongoing expenses without a clear return, low-quality publications, abandonment of interoperability between implementations, loss of credibility in the eyes of potential users, failure to satisfy rising demand for open messaging, failure to compete with less ambitious, more adaptable grass roots initiatives like XMPP, burnout of many or most key contributors, and a lack of progress.
“Epic fail!” as my mum would say.
Some people would say that by writing this I’m making the situation worse. The published AMQP specifications are already enough of an indictment, the risk of failure is so great, and the price is so high, that it’s difficult to see how anything could be worse than what we already have. Like Brian in his Life Of, my response is: “Make it worse? Make it worse? How could it possibly be worse? I’m already going to be crucified!”
Failure can be forgiven. Failing to learn from mistakes and move on cannot. A highly publicised and costly project that fails without explanation is fatal to a career.
This article will examine AMQP and what is wrong with it, ideally without feeling pity, pride, or shame. In addition to helping AMQP function as it should, my goal is to offer insights that can be applied elsewhere. I’ll examine a different piece of the puzzle in each section and offer solutions.
Keeping Things Simple
Today, the most common criticism of the AMQP specs is that they are overly complicated. This complaint is well-founded. Complexity is deadly; it is the minefield and high wall surrounding a technology that keeps people at bay, the garden of thorns surrounding Sleeping Beauty. Small teams and small projects cannot use or implement a complicated AMQP. AMQP will fail everywhere if it fails in the FOSS community.
The first published specification, AMQP/0.8, had the benefit of being straightforward. We saw numerous teams take the specification, read it, and, without any prior testing, create implementations that interoperated correctly with iMatix’s OpenAMQ reference implementation. This is unusual, nearly unheard of, and it demonstrates the power of ignorance (we didn’t realise interoperability would be challenging, so we made it work).
As the wise programmer Leif Svalgaard once reminded me, complexity is simple to create. The difficulty lies in simplicity. However, simply requesting that individuals keep things simple is insufficient. There is no secret formula for simplicity, and you cannot impose, rent, or buy it.
One of the reasons for complexity is that it takes time to learn how to simplify a design before it is really put into use. We don’t know what we need until we’ve really used it, and even then we don’t know how to make it best. Many of the initial AMQP principles were over-engineered to address issues for which we had theories but no experience in practice. Direct waste made up about 30–40% of AMQP/0.8, but it was hard to predict in advance which components that would be. The 1.0 protocol ought to have been about 90% the size of the 0.8 protocol, with some new bits added, some sections removed, and changes and clarifications made.
Instead, much of the genuine waste has not been removed, and the AMQP/0-10 specification is much larger than AMQP/0.8. The 0-9SP1 release, a cleanup of the officially deprecated 0-9 branch, will do a lot to remedy this, although that version of the protocol is almost a skunkworks effort. Instead of becoming simpler, the official branch becomes more and more convoluted.
Growing intricacy is a sign of more serious issues. If you address these, you may, with some luck and effort, achieve simpler results. What then are the basic issues driving AMQP’s gradual increase in complexity?
“Sloppy work” would be the easy response. It would also be wrong and offensive. Writing protocol specifications takes a lot of work; it needs clear rules, precise vision, and a strong will, and getting that from a committee is notoriously tough. However, I do believe that AMQP’s failure so far is largely due to poor direction from some of the Working Group’s most influential members.
It’s simple to make sweeping wishes that, like King Midas’ request that everything he touched turn to gold, end up being disastrous if they are taken literally.
The first problem, in my opinion, is that AMQP has been marketed as an “Enterprise Technology”. Why does this matter? The idea seems to be that technology is compartmentalised like air travel, with one experience for the common people in jeans and trainers, and another, expensive, opulent, and comfortable, for high-value suits. Enterprise technology is different from regular, everyday technology. Since AMQP will be used in banks and other institutions, it must be the finest available. Users of AMQP are VIPs who want to travel in Business Class. Customers and suppliers alike want to go the luxury route because it means more spending and more air miles.
This may make sense for plane travel, but from what I can see, in the software industry the prole technology is typically superior to the alternatives in terms of speed, security, robustness, and simplicity. Proles have less money to spend and, more significantly, less money to waste.
The current proles in the software industry are, of course, FOSS users and developers. I’ve been creating free software code generators and tools since around 1992, therefore I have a good understanding of the industry. We do not believe that our software is inferior to closed-source goods; on the contrary, we believe that exposure makes it hard to conceal errors. The proletariat flies first class with unlimited champagne and neck rubs at Mach 10, while Enterprise customers pay a hundred times more for the right to stand in a cramped lavatory with seven screaming children all the way to Casablanca.
Thanks to AMQP’s branding as Enterprise Technology, the Working Group tolerates complexity that would not have lasted an hour on the Internet. The designs would have been posted on Slashdot and mocked to death with comments like “OMG? ROTFL!” Instead, they were incorporated into the official specifications and published as Gospel. As a subdued form of comedy (“we’re not ready yet, so here is some light farce to entertain you”) it would have a certain style. As a serious undertaking, and as a piece of recorded history, it fails.
In addition to tolerating subpar work that is overpriced and marketed as “extra value,” Enterprise Technology sets standards that are impossible for anyone to meet. “We must have 100% guaranteed reliability even if the server crashes” sounds good, but this one demand, emphasised and promoted frequently by several AMQP participants, is what has caused the majority of the AMQP slippage. In reality, no provider can promise 100 percent flawless dependability in any industry, whether computers, vehicles, food, texting, or software. And nobody requires it. As long as the likelihood of loss is minimal, that is okay. Or, to put it another way, stop making technology more dependable at the point where compensating a customer for a lost message, exploded computer, delayed aircraft, or broken brakes becomes cheaper. Better is the enemy of good, and a lot more expensive.
However, in AMQP terms, Enterprise Technology must be totally dependable. Transactions must be able to survive a failover from a primary to a backup server. This means the protocol needs mechanisms for transferring sessions between servers. As a result, we must redefine what a session is, as well as the atomicity of each activity, from queue creation to message transfer. Therefore we must treat everything as a message. Or possibly the inverse. Therefore thousands of new data types are required, including several variations on the apparently insufficiently complex notion of a “bit”. And so we arrive at the 300-page specification. Now there is a proper product for the Enterprise!
Okay, no. What we got is a foul mass that resembles Godzilla’s morning vomit. Tokyo is still there, but it’s no longer a place you’d want to call home. The major factor in messaging server reliability is making servers that don’t crash. That comes from building simpler components, which in turn comes from lowering the bar on perfectionist requirements. Ironically, AMQP’s attempts to achieve perfection have had the opposite result.
Later, I’ll discuss my personal opinions on reliability. They contradict the notion that AMQP servers must be large, complex, and trustworthy, which is uncomfortable reading for the existing AMQP worldview, but I haven’t seen any evidence-based arguments in opposition to them. The AMQP Working Group must learn to accept “good enough” when it is offered, regardless of the precise solution.
Here is the second major factor driving AMQP towards perfect entropy: “There can only ever be one version!”
Some of the larger participants have made this their guiding principle. A wire-level protocol upgrade is quite expensive from the customer’s point of view; HTTP, for example, stopped evolving at 1.1. Therefore, since AMQP/1.0 will never be altered again, it must be flawless. We only have one chance, so it must be perfect!
From the standpoint of the vendor, a protocol that is being controlled by a single team and becoming more complex presents a profitable chance for capture. Whether on purpose or by mistake, AMQP/0-10 has evolved into a repository for the internal architecture of a single implementation (Apache Qpid). This is fantastic: there will only be one (surviving) implementation that can monopolise the market when AMQP/1.0 is ultimately released. Unfortunately for the rivals, this is business and not philanthropy.
Therefore a persistent alliance of interests between the larger users and vendors is driving AMQP towards complexity and disorder, and stifling the voices of those who have been calling since 2006 for simplicity and closure. Although superficially reasonable, this coalition of interests is founded on faulty logic.
Let’s begin with the belief that we only have one opportunity to create AMQP. My thirty years of software development have taught me one thing: things change. AMQP was originally designed to last 50 years, which may sound like a long time but is actually not when you consider that it may take 5–10 years just to create the protocol. A 1.0 will be followed by a 2.x, 3.x, 4.x, and so on. The explanation is straightforward: to take advantage of new technological advances, a fundamental networking protocol needs to change. That is why my laptop is compatible with four or five Wi-Fi generations.
It’s not a significant concern; peers negotiate which version of the protocol they wish to use. The protocol has a version header. Servers and client libraries can support several versions if they choose to. Although ugly, the issue is neither novel nor extremely challenging.
It is increasingly clear that AMQP should have been released virtually as-is in 2006 as version 1.0, and it becomes progressively harder to defend not releasing the most interoperable specification we have today (the regrettably named but clean and “good enough” 0-9SP1) as “1.0”.
The AMQP Working Group should release 0-9SP1 as version 1.0 and insist on interoperability from all vendors, in my opinion. Once those are resolved, we can go to 2.0. We are aware that the Enterprise Reliability some of us desire won’t be implemented in 1.0. Fine. We can complete it later if it is possible. If not, we shouldn’t kill the child because it can’t fly like an angel.
The Qpid crew, who have strayed far from the 0-9 tree, will be extremely hurt by this suggestion. But sooner or later they will have to face reality and endure the pain. They have no other option: they are on their own, using an almost proprietary version of AMQP that resembles a dump of their software rather than a well-designed protocol, with no vendor interoperability. They will own exactly 100% of a market worth less than zero. The real profits are with the competition, because that is where the customers will spend their money.
It’s like the fable about where to build a successful Chinese restaurant. The answer is to build it near other Chinese restaurants, because that is where the customers congregate, and so you get a lovely, prosperous Chinatown. Alternatively, you can build a Chinese restaurant all by yourself and attract the customers who despise Chinese food but happened to be nearby and hungry. As they order steak, fries, and beer and ignore the expertly carved carrot dragons, you can contemplate the largely empty dining room and the idle staff, and wonder when your money will run out.
It will be difficult for the AMQP Working Group to commit to the 0-9SP1 definition, because doing so would be extremely embarrassing. Big Promises Have Been Made about what will be delivered in 1.0. Grandiose promises were oversold to influential people during exquisite presentations. There is a lot of credibility on the line, and I’ve only ever seen corporate pride submit to reality in truly dire situations. Therefore, it appears more likely than not that we will wind up in the situation I depicted in the first section: failure to offer a convincing 1.0 spec, collapse of trust, argument, blame, and inability to decide how to proceed.
But just because that outcome seems likely does not mean it is inevitable.
I’ve focused on palliative care in this part, namely how to lessen the grief of what will inevitably feel like a death in the family. Step one: eat a lot of humble pie and admit that we cannot build the perfect protocol just now. Step two: release the best specification we have produced so far as 1.0 and ask all vendors to support it. Step three: carefully consider the next move.
In the next section, I’ll begin to look at step three by reviewing some of the most painful events our team has gone through while trying to contribute to AMQP over the past three years. We eventually found ourselves burned out and pushed to the limit.
Pain is typically not a good sign.
I stated the purpose of this post in the earlier sections and offered some obnoxious but obvious suggestions for saving some of the earlier effort. Putting pride aside, it is possible to move AMQP to a Safe Place simply by publishing the 0-9SP1 text as 1.0 and requiring all implementations to support it. Humans make mistakes; it takes divinity (or a really smart parser) to acknowledge them and be willing to go back and correct them.
Securing the work that has already been done, most of which is investment in AMQP deployments and only partly in AMQP itself, is only half the battle. Yes, it would be wonderful to at least have this resolved. But what happens next? I said that AMQP must aim for 2.x, 3.x, and 4.x releases over a 50-year period. How?
I’m not an expert in messaging, but I do have the gift, or curse perhaps, of being direct and frequently right about things that don’t work. I’ve learned a lot about the best and worst organisational practices by working in hundreds of teams, ranging from professional to volunteer. The AMQP Working Group is better than some of them, and worse than others.
My own pain threshold and the pain thresholds of others I work with, like my coworker in the New York face-to-face, are the only true measures I have. An organisational paradigm is flawed if it causes harm to myself or to my fellow participants. Pain is typically not a positive indicator.
A key distinction between the AMQP Working Group participants is the way each of them experiences pain. Some are enormous and affluent, and are accustomed to the challenging and complex systems that govern larger organisations. But some participants are small, have minimal funding, and frequently use lean, agile organisational methods. These are the canaries in the coal mine. For the same reason, I want FOSS engineers to find AMQP appealing. The smaller a player is, the more vulnerable it is to problems that eventually harm everyone.
Although I’m sure everyone in the AMQP Working Group has their own opinions on this, there are three specific issues that have led iMatix to step back and, to some extent, regard AMQP as a risk rather than an opportunity.
The first source of discomfort is the fact that AMQP is one document (really, one text document and one XML file). The rationale was that a single, indivisible specification would facilitate interoperability more readily than a collection of requirements. But that one specification has grown to a little over 300 pages. While some standards are incredibly large, AMQP is obviously excessive when measured against the IETF RFC publications.
Before we released AMQP/0.8, the XML specs had a top-level file and one XML file for each class. Since the protocol could be expanded and amended by adding or removing classes, this made it easy to manage and simple to edit individual classes. However, this structure was flattened into a single file because XML parsers were thought to find it too challenging. I argued that a small Perl script could perform this flattening for code generators on demand, but the protocol required One File.
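To show how small a job that flattening is, here is a sketch in Python (rather than Perl), with simplified element names standing in for the real AMQP XML grammar:

    import xml.etree.ElementTree as ET

    # Toy stand-ins for the real top-level file and per-class files.
    top_level = "<amqp major='0' minor='8'/>"
    class_files = [
        "<class name='queue'><method name='declare'/></class>",
        "<class name='basic'><method name='publish'/></class>",
    ]

    def flatten(top_xml, class_xmls):
        spec = ET.fromstring(top_xml)
        for class_xml in class_xmls:
            spec.append(ET.fromstring(class_xml))   # inline each class definition
        return ET.tostring(spec, encoding="unicode")

    print(flatten(top_level, class_files))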
A single huge document, whether text or XML, does not make sense for a standard like AMQP, which covers numerous issues at numerous levels. One such Enterprise Technology tick-box item is compatibility with JMS, the Java Message Service, which is uninteresting to FOSS developers but seems to possess some AMQP participants like an evil demon possessing a weak-willed priest in a Nollywood horror movie. Why is the wire-level encoding of a 4-byte integer, a low-level feature, specified in the same text as JMS compatibility, a legacy high-level, application-language-specific feature? Or, to give another illustration, why is the distributed transaction specification for AMQP in the same document as the specification for connection negotiation?
When ideas like JMS, fundamental data types, connections, and transactions are all included in one document, it becomes feasible to suggest changes that affect multiple levels at once, as if there were no architecture at all. Hey, why not give messages Java-compatible JMS headers and include connections in transactions? Sounds fantastic, no? Quite quickly, fundamental ideas about structure and architecture are lost, and all that is left to improve is the typeface and page layout.
The functional range of AMQP greatly surpasses the confines of a single standard, and we already have a number of different specifications (for remote administration, for federation, for message bundling, etc.) that don’t fit within that single specification and therefore cannot even exist, on paper.
After AMQP 1.0 is released, we must redesign it as a stack of interoperable protocols, each of which defines a layer of the stack and permits competition and experimentation at that layer. Which brings me to the second big source of pain: ownership. Or rather, its absence.
When I observed sound, straightforward ideas being altered, discarded, and replaced with what appeared like juvenile complexity, for unclear reasons, and with little to no room for opposing viewpoints, I became disenchanted with the AMQP process.
The dynamic went like this: “We tried implementing the current spec but, to be honest, we didn’t like it. So we developed some new code, and here is a dump of our design. We will now push this through, no matter how much argument it takes. Resistance is futile.” Shamefully, the AMQP Working Group did nothing and let this happen repeatedly, as if the specification didn’t matter and was just a pointless fig leaf.
A wealth of excruciating examples labelled “Work in progress” for the reader’s convenience may be found throughout the AMQP/0-9 specification. If you ask yourself what procedure led to the publication of such a document, “lack of ownership” is a big part of the response.
Let me give an early illustration. Why do we write 0.8 with a dot but 0-10 with a dash? The original AMQP version header gave one byte each to the major version and the minor version. The minor version was encoded as two decimal digits, from 00 to 99, so 0.8 was:
    major = 0
    minor = 80
which let us patch smaller releases, like 0.81. A 0.10 release would have been impossible (it would be indistinguishable from 0.1). A good thing, you may be thinking: 99 minor versions must surely be sufficient.
Following one of the Working Group’s first major flamewars, and what felt like a thousand emails between opposing factions, the version header was altered to encode the minor version as a cardinal number (1 to 255). This made it possible to imagine 255 releases prior to 1.0, but impossible to signal an upgrade to a minor version.
The only justification given for this change was “we don’t like the current scheme, so here is a new one.” There was no explicit explanation of what was broken; there was merely disagreement over which side was right. I complained at the time to the Working Group leadership that change was being permitted for its own sake, but received little response. Those who could have maintained order either stayed out of the conversation or believed it was necessary to let it happen. The result was a failure, and it was only the first of many.
As a result, we now write 0-9SP1 instead of 0.91, and following the official protocol negotiation, we must negotiate the differences between 0-9 and 0-9SP1 by inserting magic values into one of the connection methods. This means that we lack reliable, straightforward, and robust protocol negotiation at the precise moment when we need it.
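To make the two numbering schemes concrete, here is a small sketch; the real AMQP protocol header carries more than these two octets, so this illustrates only the numbering, not the actual wire layout:

    def encode_decimal(version):
        # Original scheme: minor version as two decimal digits.
        # "0.8" -> (0, 80), "0.91" -> (0, 91); "0.10" cannot be expressed.
        major, minor = version.split(".")
        return int(major), int(minor.ljust(2, "0"))

    def encode_cardinal(version):
        # Later scheme: minor version as a cardinal number.
        # "0-9" -> (0, 9), "0-10" -> (0, 10); no way to signal "0-9SP1".
        major, minor = version.split("-")
        return int(major), int(minor)

    print(encode_decimal("0.8"), encode_decimal("0.91"))    # (0, 80) (0, 91)
    print(encode_cardinal("0-9"), encode_cardinal("0-10"))  # (0, 9) (0, 10)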
My point is not a “told you so”; being right here gives not even mild satisfaction. My argument is that the fundamentally sick process behind AMQP makes poor outcomes the norm rather than the exception. Despite being a part of many of the disputes, I hope that my explanation of how this process came to be, and why it fails, is objective.
If there was a distinct understanding of ownership, the “loudest wins” procedure would not endure.
Because AMQP is not separated into ownable components and because there are no regulations protecting such ownership, no one owns any portion of the protocol.
What does ownership of a standard mean? Clearly, it has nothing to do with legal ownership. It is about accountability and blame. Someone whose name is on a piece of work has very strong incentives to make it work. The incentives to safeguard and enhance a piece that is controlled by a committee simply aren’t the same.
A committee owns one large package called AMQP.
If AMQP’s design comprised a stack of protocols, as with any similarly sophisticated IETF definition, we would see the following benefits:
Anyone could redesign each layer in the stack. Consider how SASL handles security mechanisms: if you want to create a new one, it is obvious where and how to work.
Each piece of the specification could be consistently and correctly attributed to its authors. Credit, you may think; but blame and accountability are more practical tools for quality.
There could be a proper procedure, or etiquette, for altering any specific piece. With pieces as large as today’s single document, that is not practical.
The lack of a respectful change procedure has damaged many participants, and as the protocol gets progressively worse we lose exactly those people most qualified to fix it. AMQP sinks into this pit of incompetence and complexity like a lost beast. Contrast that with how change happens when a design has real modularity and ownership:
You need a change in a layer on which you rely.
You identify the fault or the gap.
You locate the broken component.
You identify the author or authors of that piece.
You let them know about the issue and request a resolution.
If you can, you propose a fix yourself.
If necessary, you coax, bribe, or threaten the authors.
If the authors don’t reply, insult you, treat your contribution poorly, or otherwise behave antisocially, you fork their work, start a flame war, and try to convert people to your (hopefully) improved version. At that moment you take ownership of that component: you saved it, so now you have to feed it.
The lessons for AMQP are clear. It needs:
a solid modular design that divides a complex issue into manageable parts that cooperate and compete in ways that are clearly outlined by written contracts;
a distinct ownership structure where each item is ownable and owned;
a fork etiquette that respects and relies on ownership while allowing forks to be used freely.
The idea is to harness natural impulses for competition and cooperation as motivators rather than sources of conflict. The same animosity that may ignite a thousand fires can propel rival teams to extraordinary feats. Only if the social architecture is properly implemented.
I’ve maintained that the issue with AMQP is social, not technological. Social architecture encompasses much more than just the Working Group’s structure. It includes contributors outside the core specification developers and AMQP users.
The lack of an online community for AMQP will be examined in the section after this one as another significant indication that something is wrong with the technology.
The meat is absent.
In this article I’ve already highlighted some of the less exciting aspects of how AMQP is created. I indicated that there are, in my opinion, thousands of people who would love to contribute to a project like AMQP and are quite qualified to do so.
Where are these contributors, now that there have been two years of open AMQP releases? Where is the community? Where are the forums, blogs, FOSS projects, and other signs that a standards process is functioning normally?
Their nearly complete absence is perhaps the clearest indication that AMQP is in bad shape. Although many individuals seem to believe in AMQP’s potential, few or none of the thousands of potential protocol engineers have signed up and invested their time and money in the specification game. It is a sign that the market may not always be right. The specification is being written not by protocol experts but by software engineers working for huge companies.
Online communities expand, as might be expected. Less visible is the fact that they appear to develop in reaction to social difficulties. For instance, I’d contend that the Wikipedia community is powerful and self-assured because it consistently deals with trolls, spies, liars, and manipulators and has come up with strategies to defeat them. It is not a problem that anyone can edit a Wikipedia page and ruin someone else’s well-done effort. Instead, it serves as Wikipedia’s main engine of success. Edit battles elicit feelings that unite the community.
The AMQP community therefore needs challenges to address, preferably social ones rather than technical ones. In other words, precisely the issues that have hampered the development of AMQP.
In my opinion, the absence of a community is a contributing cause as well as a symptom. There is hardly any AMQP community, merely a small, narrowly diverse group of self-selected experts.
If I need a justification for revealing AMQP’s internal strife and embarrassing those involved, it is that documenting and publicising problems enables others to address them. The physics of people, and social concerns generally, are of course never neat or easy.
The main distinction between AMQP and Wikipedia, however, is that the latter is a meritocracy where anyone can participate on their own and advance based on their shown contributions. AMQP is an invitation-only business club whose members represent their companies rather than themselves. Although the procedures for joining and voting have been significantly loosened, they are still cumbersome.
In the end, the number of AMQP participants was limited by the sheer effort of shipping around packs of contracts.
The driving force behind this thick layer of contracts was the fear that one of the parties would add patentable technology and attempt to syphon off some of that delightful natural-monopoly efficiency. It’s a valid concern, but there are less complicated ways to allay it.
To start with, multilateral agreements between several parties do not scale. AMQP needs to be hosted by a legitimate not-for-profit organisation, so that agreements are one-to-many and the cost of adding more participants is linear rather than exponential.
Second, the contracts could be much simpler. In essence, they are promises made by contributors, whether individuals or organisations, to whoever owns AMQP: not to contribute anything that is patented, and not to patent anything in the specification unless it is covered by a royalty-free, irrevocable licence granted to anyone willing to use it. Add a little boilerplate about copyrights and trademarks, and that is all.
Third, the AMQP development contracts cover the Working Group’s operations in great detail. Agreeing on policies for making changes, and then voting on them, is clearly sensible. Fixing those policies in the one place that is virtually impossible to change is not. Policies must be simple to implement and easy to modify. As things stand, the lengthy agreements we all signed about the means of cooperation have done absolutely nothing to help settle disputes.
In addition to the procedural obstacles, AMQP’s technical sophistication makes it extremely difficult for anyone who is not already familiar with it to participate. I’ll use an old joke again: if you understand any of the latest AMQP specification’s definitions, you probably wrote it.
The process is failing if complexity is rising quickly.
However, even when AMQP was still straightforward, the Working Group made no attempt to establish a separate community. The fast and simple wikis some of us tried to start were abandoned in favour of horribly slow Enterprise (yes) wikis poorly hosted by one of the participants, who insisted on SSL protection (for “Enterprise security measures”) for every page but was unable to install a valid SSL certificate. It felt like a fear of scrutiny: discussions were kept private, the public Google groups some of us had set-up at the start were not used.
iMatix is accustomed to building community infrastructure; it is part of our business. The free wiki farm Wikidot.com, which hosts this content, is one of my projects and was created specifically as a quick and adaptable community platform. We made several attempts to establish an easy-to-use infrastructure for AMQP, but each time we ran into resistance from members of the Working Group who felt there should be a better way, a way that never materialised.
When editing a wiki page takes between thirty and sixty seconds, you can bet that users stop adding content very quickly. Cynics might also point out that developers from one specific firm were not troubled by the slow SSL interface or the various login pages, therefore it is not unexpected that early versions of AMQP increasingly reflected the opinions of a single player.
The infrastructure and supporting processes around an emerging standard deserve attention in their own right. AMQP has to get them right, and it has not yet done so.
Community building needs:
Infrastructure that is independent of vendors and their hostility to outside expertise.
A suitable budget to cover system maintenance, community growth, and so on.
Awareness marketing aimed at innovators and professionals who enjoy new technologies.
Events, newsletters, forums, blogs, and other community-focused content.
Above all, a straightforward method for unaffiliated specialists to join in the process and make contributions with some degree of assurance. Which leads me back to the “Pain is a Bad Sign” tenets of a good modular design, ownership guidelines, and a transparent and moral change process, all of which are absent in AMQP and will prevent any community engagement other than on the periphery.
If I were offering the AMQP Working Group advice, I would advise them to place more emphasis on creating a thriving community than on developing standards.
Any expert’s cardinal sin is having faith in their own knowledge. We only succeed when we are aware of, prepared for, and able to work around our limitations. The most important lessons should come from our failures and blunders. Speaking of failures, the fundamental AMQP design for which I am responsible contains a sizable defect, a serious design flaw that was caused by hasty optimisation and based on false assumptions. This design problem appears to have gone unnoticed, maybe because all of the engineers still working on AMQP are making the same false assumptions. Sometimes an insufficiently diversified group will miss the obvious.
I’ll describe this weakness in more detail in the following part, along with how addressing it will make using AMQP faster, easier, and all around more fun.
Early optimisation is a direct route to hell
While I won’t get too technical about AMQP (in part because I haven’t tried to keep up with the tsunami of changes in the 0-10 family), some design choices are crucial, and if they are made incorrectly, they have a ripple effect throughout the entire system.
A cursory examination of AMQP will reveal that it is a binary protocol. Operations like Queue.Create are defined as binary frames in which each field has a specific position and is encoded as compactly as possible. String encoding is substantially less efficient than binary encoding for numbers: in binary, “123” needs one byte, while as a string it needs three. Because it is a binary protocol, AMQP can be parsed very quickly. Because it uses binary encoding rather than string parsing, AMQP software avoids buffer-overflow attacks. And because AMQP is specified as easy-to-parse XML, creating the encoders and decoders is also simple.
Overall, the binary encoding of AMQP is a huge success.
That last claim has turned out to be false. The evidence suggests that the binary encoding used by AMQP is a fundamental error. I fully accept responsibility for the error: this was my design, and I was really proud of how well it turned out. An expert error. To understand why I’m admitting this, let’s examine the benefits and drawbacks of this strategy, along with the underlying assumptions that I maintain have since been disproven, and then contrast it with a different strategy based on more reliable assumptions.
Benefits of binary encoding:
It is more compact.
It is easier to parse than a text format.
It is safer than string parsing.
The codecs can be fully generated.
It is easy to process in silicon.
Drawbacks of binary encoding:
It needs codecs in the first place.
It produces endlessly divergent variants of AMQP.
It is harder to use and comprehend than string encoding.
It places excessive emphasis on data types.
It makes even the most basic client API incredibly complicated.
Now, a quick, efficient wire-level encoding is unquestionably worthwhile. After all, the argument went, AMQP is an enterprise technology that is meant for speed, so complex APIs are not a major issue. Although a TELNET client can be used to experiment with text-based protocols like HTTP, nobody actually creates HTTP clients; instead, they use pre-existing libraries.
According to common AMQP thinking, binary encoding is expensive but necessary if one wants a fast, dependable protocol. We believed it was impossible to attain the required performance with a text-encoded protocol like HTTP.
So the primary premise of AMQP’s encoding is that binary encoding is required for performance (fast parsing, compact data). If I can demonstrate that this premise is false, I can knock down the main argument in favour of binary encoding.
What is the fastest common messaging protocol, present in all contemporary operating systems, supported by all current web browsers, and able to saturate ordinary networks? It is significantly faster than AMQP and substantially faster than HTTP. In fact, implementations of this protocol would be the fastest messaging programmes ever created if they weren’t reliant on reading and writing data to disc.
The answer is FTP, the simple file transfer protocol that network experts use to determine if a network link is set up for 100Mbps or 1Gbps. FTP is able to send data down the line quickly enough to establish the network’s speed beyond a shadow of a doubt.
Here’s where it becomes fascinating and where my question comes in. What makes FTP unique in that it can send data so quickly? And what guidance does this give AMQP?
Interestingly, our work on ZeroMQ, a messaging network capable of sending millions of messages per second, is where my opinions on AMQP performance first emerged. The reason ZeroMQ is so quick is that it employs FTP-style methods as opposed to AMQP’s.
FTP wins because it uses one connection for data transfer and a separate one for control commands. Later protocols, including HTTP, did not follow this design. Yet compared to HTTP, FTP is typically faster and simpler. Faster and simpler are both desirable.
The basic premise of AMQP, that binary encoding is required, can be broken down into more specific assumptions, each of which is false:
That optimising control commands like Queue.Create is necessary; that is, that these commands matter for performance. In reality they make up a very small portion of messaging activity; the vast majority is message transfer, not message control.
That control commands must use the same network connection as messages; that is, that a logical bundle of message data and control commands must follow the same physical path. In fact they can travel by quite different routes, even through different parts of the network.
That control commands and message transfer require the same encoding. In fact there is no reason to force a single encoding model; each part of the protocol can use whatever encoding suits it best.
The value of simplicity cannot be emphasised enough, especially in emerging protocols that must accommodate rapid expansion.
The simplest feasible encoding for commands is text, using (for instance) the widely used “Header: value” syntax from HTTP, SMTP, and so on. Parsing this with regular expressions is simple. Attacks on this type of encoding typically take the form of over-long strings and are simple to detect and block.
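To show how little machinery such parsing needs, here is a sketch; the command name, the header syntax, and the one-kilobyte limit are illustrative choices of mine, not anything AMQP actually defines:

    import re

    MAX_LINE = 1024                                       # reject over-long lines outright
    HEADER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):[ \t]*(.*)$")

    def parse_command(block):
        lines = block.splitlines()
        verb, headers = lines[0].strip(), {}
        for line in lines[1:]:
            if len(line) > MAX_LINE:
                raise ValueError("over-long header line rejected")
            match = HEADER_RE.match(line)
            if match:
                headers[match.group(1).lower()] = match.group(2)
        return verb, headers

    print(parse_command("QUEUE.CREATE\nQueue: orders\nDurable: yes\n"))
    # ('QUEUE.CREATE', {'queue': 'orders', 'durable': 'yes'})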
Using a straightforward text encoding for commands brings AMQP a number of benefits:
Developers can see it immediately.
Maintaining backward compatibility becomes simple.
Writing clients gets less difficult.
Debugging and writing AMQP test cases become simpler.
Is there a performance cost to text parsing? Yes, but it is negligible in the overall performance picture, and I will stand behind that claim.
So what is the most straightforward encoding for messages? AMQP wraps messages in a rather impressive envelope (about 60–100 octets), which is fine for large messages and low performance requirements but bad news for small messages and high throughput. When we were developing ZeroMQ, we asked how thin the message envelope could become. The answer, shockingly, is that you can reduce it to a single octet.
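As a sketch of the idea (not ZeroMQ’s actual wire format), here is a framing scheme in which a small message pays only a one-octet length prefix, escaping to a longer length field for big messages:

    import struct

    def encode_frame(payload):
        # One-octet length for small messages; 0xFF escape plus a
        # 64-bit length for anything 255 octets or longer.
        if len(payload) < 0xFF:
            return bytes([len(payload)]) + payload
        return b"\xff" + struct.pack(">Q", len(payload)) + payload

    def decode_frames(buffer):
        while buffer:
            if buffer[0] != 0xFF:
                size, buffer = buffer[0], buffer[1:]
            else:
                size, buffer = struct.unpack(">Q", buffer[1:9])[0], buffer[9:]
            yield buffer[:size]
            buffer = buffer[size:]

    wire = b"".join(encode_frame(m) for m in (b"hi", b"a" * 300, b"bye"))
    assert list(decode_frames(wire)) == [b"hi", b"a" * 300, b"bye"]
    print(len(encode_frame(b"hi")) - len(b"hi"), "octet of envelope on a small message")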
Of course, we can specify additional encodings, each with a unique cost-benefit analysis.
We can enclose binary messages in textual envelopes. This is how HTTP and BEEP work. This single-connection architecture appears simpler but is wasteful and actually becomes rather complex.
Like FTP, we can use separate connections for control commands and messages. This is straightforward, but it requires managing several ports.
We can begin with a straightforward text-based control dialogue and switch to straightforward binary message encoding if and when we decide to start transferring messages. This is comparable to the way TLS upgrades an insecure connection into an encrypted one.
I favour the last option. Control and data should always be kept apart. Combining them, as AMQP currently does, leads to some incredibly delicate issues, such as how to manage faults that can affect both synchronous and asynchronous dialogues. AMQP’s exception handling is a beautiful solution, but wouldn’t it be nicer to have something more conventional?
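Here is a toy sketch of that last design, assuming an invented STREAM command that flips a connection from the synchronous text control dialogue into asynchronous binary message transfer; none of the command names or framing here come from any published specification:

    import socket

    client, server = socket.socketpair()

    # Control phase: pessimistic synchronous text dialogue, every request answered.
    client.sendall(b"STREAM queue=orders\n")
    print("server got command:", server.recv(1024).decode().strip())
    server.sendall(b"OK\n")
    assert client.recv(1024) == b"OK\n"

    # Data phase: optimistic asynchronous monologue, binary frames, no acknowledgements.
    for payload in (b"msg-1", b"msg-2", b"msg-3"):
        client.sendall(bytes([len(payload)]) + payload)   # one-octet length prefix
    client.close()

    data = b""
    while chunk := server.recv(1024):
        data += chunk
    while data:
        size, data = data[0], data[1:]
        print("server got message:", data[:size].decode())
        data = data[size:]
    server.close()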
There is an idea I call “natural semantics”. These are simple patterns that just work. Like mathematical truths, natural semantics exist objectively, independent of particular tools or perspectives. They are precious things. The AMQ exchange-binding-queue model is one such natural semantic. A skilled designer is always looking for these natural semantics, and then codifies them in ways that make them inviolable, unavoidable, and simple to use.
The natural semantic for control commands is pessimistic synchronous dialogue, in which each request is acknowledged and given a success or failure status. The natural semantic for data transfer is optimistic asynchronous monologue, in which one side pushes data to the other as quickly as possible without any kind of response. In the next section I’ll address the question, “What happens if data gets lost?”
Although AMQP supports both synchronous and asynchronous communication, it does not tie them cleanly to control and data. The natural semantics are present, but they are neither inevitable nor enforced.
HTTP is slow because it uses the wrong semantics for data transfer. Wrapping data in control commands, as BEEP does, is also wrong. Using two distinct connections is wise, since they neatly separate the two natural semantics. However, if you’ve ever worked with FTP servers and clients (I have, and they are terrible in this regard), you’ll know that a significant part of the problem is FTP’s port negotiation, needed to get data connections past firewalls. AMQP does not need this, because we can use the same port for all connections and declare from the outset whether a connection is for data or for control.
While we’re at it, let’s drop connection multiplexing (known as “channels” in AMQP) altogether. Multiplexing fixes an HTTP problem: clients opening and closing several connections concurrently as they download web page elements. AMQP clients establish one connection and maintain it for a long time. Multiplexing is an expensive solution to a non-problem.
In a nutshell, my fundamental AMQP design contained a number of seriously faulty assumptions that made AMQP far slower, more complex, harder to comprehend, and more incompatible with itself than it should have been. I am truly sorry. In my defence, no one else has exposed the errors in these assumptions either, so they cannot be that obvious.
In the next section I’ll highlight the assumption that underlies the entire AMQP vision, one I’ve always found uncomfortable, and I’ll contend that it too is flawed.
Avoiding special cases
I’ve examined what, in my opinion, makes AMQP excessively complicated, why most people who have worked on it have found it difficult, why there is no AMQP community, and how and why I nearly entirely misdesigned the wire level framing when I wrote the original AMQP.
Although these accusations and admissions may appear serious, they are all both natural and recoverable. I’ll offer a more thorough analysis of the AMQP vision in this part. I have to acknowledge that my opinions on this specific subject clearly conflict with those of the AMQP Working Group, who must either consider me naive, trollish, or simply misinformed. But the weight of the evidence has pushed me to adopt my particular viewpoint, which was not where I started with AMQP.
My primary case will be that the AMQP vision of a central server that reliably stores and forwards messages is flawed, resting on two false assumptions: one, that we need a central server at all; and two, that there is a single model of reliability. I’ll try to explain where these ideas came from and why I believe they are incorrect.
AMQP has various sources of inspiration, but what most influenced and shaped the protocol was the idea of a central server offering capabilities similar to those of JMS, the Java Message Service. “Inspired by” does not have to mean “a literal copy of”; the sub-machine gun served as the model for the Kevlar vest. The exchange-binding-queue AMQ model is the Kevlar to JMS’s “destination”, which is an example of a possibly completely artificial semantic marketed as “Enterprise technology”.
But if you look at AMQP’s specifications for transactions, acknowledgements, message delivery, and message headers, you will notice numerous echoes of the JMS specification. AMQP does, more or less, strive to cover the same functional terrain as JMS. We were trying to reuse ideas that we assumed worked, or at least worked well enough to get us through to the next iteration of the protocol, in addition to making it simple to support JMS in the future.
Let me briefly review some of the relevant assumptions we inherited from the JMS specifications and from the JMS-style products we considered to be the competition:
That a central server, or a fault-tolerant cluster of central servers, exists. This is accepted wisdom, particularly in the Enterprise, which appears to enjoy large central boxes.
That the protocol must provide reliable “fire-and-forget” messaging. This is a logical assumption: if interoperable reliability is not built into the protocol, each implementation will create its own version.
That there is a single, perfect model of reliability, capable of handling all application scenarios, and that this model resembles relational database transactions.
That the central server(s) must be configured to provide such reliability. This makes sense because complete reliability necessitates things like exorbitantly expensive Enterprise-level storage area networks (SANs), which logically must be located somewhere in the centre.
That such reliability is implemented using standard transactions, which operate on published messages and on acknowledgements. This is lifted directly from JMS, which reflects how popular products like MQSeries operate, so it must be correct.
That these transactions must be able to survive a crash of the primary server and complete on a backup server. This simply follows from the earlier assumptions. It is the sting in the tail, though.
It is not hard to build JMS-style transactions; in fact we did, in an earlier version of OpenAMQ (around AMQP/0.4), but we removed them because they were ugly and unnecessary for our intended use.
Making (centralised) transactions resilient to server failure is hard. Many of the changes in AMQP/0-10 are mysterious to me, but from what I can tell most of them are driven by JMS compatibility and by the desire to get transactions that reliably survive crashes.
In other words, the main goal of the work done on AMQP over the past two years, possibly the majority of it, has been to obtain this "Enterprise level" reliability. As far as I can tell, this work has been driven by large corporate "almost-clients" who insist they cannot commit to AMQP (pardon the pun) until it offers this much-wanted capability. It has a lot of the feel of belief-based, rather than evidence-based, investment.
Transactions, the classic unit of reliability, do not sit well on top of asynchronous message transfer, which is the fundamental stuff of AMQP. When exactly has a message been "sent"? Is it when the application receives it, when the exchange routes it, when it lands in a queue, or when it has been processed and acknowledged? What happens when the message is passed across a federation of servers? What if we want to use a multicast transport? If reliability is to be built into the core protocol, these questions need answers. I have not seen any.
The most recent versions of AMQP seem to be telling us, loud and clear, that "this problem is too hard to solve." I don't see the basis for some of the AMQP editors' optimism, and from what I can tell AMQP will not be "fire and forget" reliable until there is a significant change of tactics. I have nothing against the fire-and-forget philosophy. But building it into the wire-level protocol has not worked, despite an effort that is by now orders of magnitude larger than the effort that went into inventing AMQP in the first place.
A high-level feature demand that adds disproportionate complexity to the protocol at every level has to be questioned, especially when it turns out to be nearly impossible to implement.
In my opinion, the current attempt to build perfect reliability into AMQP mixes innovation with standardisation, which is like mixing petrol and firecrackers. AMQP is not layered so that reliability could be added later, on top of a more basic protocol that is locked down now. Instead, big changes cause havoc across the whole specification, and because that havoc is structural, it cannot be patched over.
There are several possible reasons why implementing reliability in AMQP has not worked. The idea itself may be wrong. We may have chosen the wrong strategy. It may be that AMQP's lack of architecture prevents such work from being done at all, because every change breaks several other things. All of these are probably true to some extent, but in my view the reliability problem has proven impossible to solve at a purely technical level because we have focused on a narrow special case.
If we challenge our most fundamental beliefs about the function of the AMQP server, about centralization, and about what “the protocol” is, we might be able to solve the problem of perfect interoperable reliability.
We'll start by looking at performance, and at how a central server affects it. Forcing all messages through one point creates a bottleneck that gets worse as clients are added. It introduces two kinds of extra latency: first, the cost of reading, processing, and re-sending each message; second, the cost of waiting while the messages ahead in the queue are processed. Latency through a central server grows far faster than linearly as traffic increases, which is a nightmare for serious messaging users who want consistently low latencies.
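To make the queueing cost concrete, here is a small back-of-the-envelope sketch of my own (an illustration, not a benchmark of any real broker), using the textbook M/M/1 formula for mean time in system: as a single switch point approaches its capacity, latency grows without bound.

    # Illustration only: latency at a single switch point, modelled as a
    # textbook M/M/1 queue. W = 1 / (mu - lambda) is the mean time a message
    # spends in the system, where mu is the broker's service rate and
    # lambda the arrival rate (both in messages per second).

    def mean_latency(arrival_rate, service_rate):
        if arrival_rate >= service_rate:
            return float("inf")            # the backlog grows without bound
        return 1.0 / (service_rate - arrival_rate)

    SERVICE_RATE = 100_000                 # hypothetical broker: 100k msg/s

    for load in (0.10, 0.50, 0.90, 0.99):
        w = mean_latency(load * SERVICE_RATE, SERVICE_RATE)
        print(f"utilisation {load:4.0%}: mean latency {w * 1e6:7.1f} us")

    # utilisation  10%:    11.1 us
    # utilisation  50%:    20.0 us
    # utilisation  90%:   100.0 us
    # utilisation  99%:  1000.0 us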
Protocols like IP deal with similar issues brutally, by discarding excess data entering the network. Because AMQP aims to be a fully reliable protocol, that option is not available, and we are left with an insoluble problem.
The result of IP's choice is mostly positive. It lets any network node take the strain when the network is overloaded. It lets traffic route around damage. It lets networks scale to any number of intermediate routing points, from zero to hundreds. IP could be called an "optimistic/cynical" protocol, while AMQP is "pessimistic/idealistic": AMQP fears data loss but is confident it can prevent it.
A central server that routes messages also cuts the network's usable capacity, since every message must cross the network twice: once from the publisher to the server, and again from the server to the consumer.
So pumping every message through a single central point is bad for performance and bad for scalability. From the standpoint of general network design it is also a very particular scenario: every message passes through exactly one switch point.
If we favour centralisation, then we favour having exactly one switch point; call this the "N = 1" case. But evidence should come before belief: do we have real cases where N is not 1, and if so, do they matter?
There are certainly real cases of "federation", where N is 2 or 3: one server acts as a client of another, which may in turn act as a client of a third. Federation is a fundamental architectural pattern; it answers the question of how to grow AMQP networks beyond a single local network. We have used federation in real situations, so we can show empirically that "N > 1" matters.
What about N = 0? For that we can look at ZeroMQ, which pushes queueing and routing out to the edges, into the peers themselves. And, as you would expect from a peer-to-peer design, it keeps scaling smoothly as the number of peers on the network grows.
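To make the "N = 0" case concrete, here is a minimal broker-less publish-subscribe sketch. It uses the pyzmq Python binding as it exists today, which post-dates the period described here; the endpoint, port, and topic are my own illustrative choices.

    # Broker-less ("N = 0") publish-subscribe: peers connect to each other
    # directly, and queueing happens at the edges, inside the library.
    # Run "python n0.py pub" in one terminal, "python n0.py sub" in another.
    import sys
    import time
    import zmq

    ENDPOINT = "tcp://127.0.0.1:5556"      # illustrative address, no broker

    def publisher():
        ctx = zmq.Context()
        pub = ctx.socket(zmq.PUB)
        pub.bind(ENDPOINT)
        time.sleep(0.5)                    # crude allowance for late joiners
        for i in range(10):
            pub.send_string(f"prices {i}") # goes straight to subscribers
            time.sleep(0.1)

    def subscriber():
        ctx = zmq.Context()
        sub = ctx.socket(zmq.SUB)
        sub.connect(ENDPOINT)
        sub.setsockopt_string(zmq.SUBSCRIBE, "prices")
        for _ in range(10):
            print(sub.recv_string())

    if __name__ == "__main__":
        publisher() if sys.argv[1:] == ["pub"] else subscriber()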
In my opinion performance is not optional: any design that improves it substantially has to be taken seriously. So "N = 0" matters too.
It therefore seems clear that we need an "N-neutral" design that handles N = 0, 1, 2, 3, and so on. It seems equally clear that "N = 1" is a special case we should not be focusing on to the exclusion of other values of N, and that it is always better to solve the general case than a specific one.
The current AMQP designs care only about "N = 1" and treat it as a universal constant. The reliability work in AMQP is built on "N = 1". I've shown that "N = 1" is a narrow special case that ignores the needs of large networks and of high performance. So if AMQP's reliability model ever materialises, it will be irrelevant to the real deployments we have actually seen.
This should persuade most people, but I’ll give you one more argument in favour of my position that reliability shouldn’t be incorporated into AMQP’s fundamental protocols.
Reliability has several faces. How much reliability we want in a given situation depends heavily on the kind of work being done. At the start of any design discussion about reliability we must be clear about what kinds of messaging scenarios we are talking about. There is not one messaging model but several, each with its own cost-benefit trade-offs and its own definition of what "reliable" means. Here is a short, non-exhaustive list:
The request-response model, used to build service-oriented architectures. A service accepts a request from a caller, routes it to the right handler, does the work, and sends back a response. The simplest proven reliability model is retries on the caller's side, combined with the service's ability to detect and properly handle duplicate requests (see the sketch after this list).
The transient publish-subscribe model, used to distribute data to many consumers. In this model, if data is lost, clients simply wait for fresh data to arrive. Streaming audio or video is a good example.
The reliable publish-subscribe model, used when the cost of lost data is too high. In this model clients acknowledge data by sending a short reply back to the sender, and the sender can resend data when needed. This is comparable to TCP.
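As a sketch of the first model, request-response with retries plus duplicate detection, here is a small self-contained example of my own; the class and function names are invented for illustration and come from no AMQP specification.

    # Request-response reliability: the caller retries with the same request
    # id, and the service detects duplicates and replays its earlier answer,
    # so the work is done exactly once even when replies get lost.
    import random
    import uuid

    class FlakyService:
        def __init__(self):
            self.replies = {}                  # request_id -> cached reply

        def handle(self, request_id, payload):
            if request_id in self.replies:     # duplicate: replay old answer
                return self.replies[request_id]
            reply = f"processed({payload})"    # do the real work once
            self.replies[request_id] = reply
            return None if random.random() < 0.5 else reply  # maybe "lose" it

    def call_with_retries(service, payload, attempts=5):
        request_id = str(uuid.uuid4())         # the same id on every retry
        for _ in range(attempts):
            reply = service.handle(request_id, payload)
            if reply is not None:
                return reply
        raise TimeoutError("no reply after %d attempts" % attempts)

    service = FlakyService()
    print(call_with_retries(service, "debit account 42 by 10"))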
Each of these looks like a distinct reliability protocol, with its own semantics and its own interoperability needs, which suggests to me that they should be solved on top of AMQP, not inside it. To solve reliability inside AMQP we must either solve just one of these cases, which is a special case, or solve all of them, which gives an overly complex answer.
As a reminder, here are some issues with centralised reliability:
It cannot be used when there is no server in the middle.
It does not handle the scenario where we federate servers.
It does not fit on top of an asynchronous message flow.
It wrongly assumes there is a single semantic model for reliability.
And, circumstantially:
It has proven close to impossible to design.
It has damaged AMQP in numerous ways.
There is no evidence that it is better.
There are simpler, more proven designs.
Other successful protocols do not do this.
Thus, we arrive at the obvious conclusions:
The basic AMQP message transport protocol should contain no semantics for reliability, acknowledgements, transactions, and so on. It should be an optimistic, cynical protocol, like IP.
Different levels of reliability should be layered on top of this basic messaging protocol.
It is reasonable to assume that different layers of security should also sit on top of the basic messaging protocol. Again, this is how IP works, and no one has explained why that approach could not work for AMQP.
My conclusions will upset and irritate people who have invested a great deal of time and money in trying to build perfect reliability into the AMQP wire-level protocol. I'll offer two defences. First, this discussion is not new: we set it out early in 2007, when we started working on peer-to-peer messaging. Second, given the choice between upsetting some people and getting a simpler set of protocols, I'll take the simpler protocols every time. Upsets pass; a sound protocol lasts forever.
What I'd really like to see are AMQP networks where boxes can fail without causing problems, where data can route around damage, where excess load is handled by throwing data away, and where applications can use the full capacity of the network without depending on complex messaging architectures.
If we accept that solving N-neutrality is more valuable than solving "N = 1", the outlines of a real architecture for AMQP begin to take shape. ZeroMQ, thanks to Martin Sustrik's careful design work, already contains hints of this.
In the next section I'll set out what I believe should be the basic design for a new generation, appropriately called AMQP/2.0.
A Simplified Messaging Model
In this essay, "What's Wrong with AMQP (and How to Fix It)", I have come to the conclusion that the current work on AMQP should be halted, that the stable AMQP/0-9SP1 should be released as AMQP/1.0, and that a fresh approach is needed to do AMQP/2.0 properly.
Part of my justification for this proposal is that the existing protocol workgroup is close to burnout, and part is that I believe AMQP's core architecture has reached its breaking point and needs to be rethought. If my thesis is right, all improvements and additions to the basic architecture are pure waste, and are likely to hurt AMQP in the long run rather than help it.
In my opinion there is no way to gradually transform the current specifications into ones that work. We need to lock down the good material that is already in production, park the rest, and start again from scratch.
In this article I'll try to do exactly that: start from scratch once more and work up to a new, more compact messaging model that can serve as a firm foundation for AMQP/2.0.
Before everyone panics, let me state two things. First, for anyone who has already bought into AMQP: my concerns are with how AMQP is made, not with how it is used. Implementations can and must adapt as understanding improves; applications should barely notice. All the useful, straightforward features of AMQP took us several attempts to get right. Second, if you are unfamiliar with AMQP's internals, relax: the models I'll present are simple. I want to lighten everyone's load, not add to it. Under my plan, the core AMQP specification should be only a few pages long.
AMQP's one elegant and powerful natural semantic is the exchange-binding-queue wiring model, which gives application architects full control over how messages flow across the network.
A queue is a FIFO buffer that stores messages. An exchange is an algorithm that routes messages; it has no storage. A binding is a relationship between an exchange and a queue that tells the exchange which queues should receive which messages. Bindings are independent and orthogonal, except that a queue will not receive the same message more than once. Queues come in two flavours: private, for a single consuming application, and shared, for several consuming applications.
Applications can create and use exchange instances at runtime; AMQP defines several exchange types, each corresponding to a particular routing algorithm. Applications can dynamically bind queues to one or more exchanges while they run, and then consume messages from their own queue or queues.
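To make these semantics concrete, here is a toy, in-memory model of a direct-style exchange with bindings and queues. It is my own sketch of the idea, not the wire protocol and not the API of OpenAMQ or any other broker.

    # Toy model of the exchange-binding-queue semantic, in memory only.
    from collections import deque

    class Queue:
        """A FIFO buffer that stores messages."""
        def __init__(self, name):
            self.name = name
            self.messages = deque()
        def put(self, message):
            self.messages.append(message)
        def get(self):
            return self.messages.popleft() if self.messages else None

    class DirectExchange:
        """A routing algorithm with no storage of its own: it matches each
        message's routing key against its bindings and copies the message
        into every queue bound with that key (never twice into one queue)."""
        def __init__(self):
            self.bindings = {}                 # routing key -> set of queues
        def bind(self, queue, routing_key):
            self.bindings.setdefault(routing_key, set()).add(queue)
        def publish(self, routing_key, message):
            for queue in self.bindings.get(routing_key, ()):
                queue.put(message)

    # Applications create and wire these at runtime, then consume from
    # their own queues.
    exchange = DirectExchange()
    inbox = Queue("app-1.private")
    exchange.bind(inbox, "invoices")
    exchange.publish("invoices", "invoice 1001")
    print(inbox.get())                         # -> invoice 1001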
This exchange-binding-queue semantic is remarkably simple, and it may be the most attractive part of AMQP. Its design does have flaws, though, and identifying them is a good way to start building our new, simpler model, which I'll construct by fixing each flaw in turn.
The first flaw is that every message must always pass through a server-side queue. Why is that a flaw? Because in most situations the server-side queue is redundant. Consider three basic messaging scenarios:
Point-to-point, where one application sends a message to another. The sender publishes a message to an exchange, which delivers it to a private queue; the recipient reads and processes messages from that private queue.
Data distribution, where one application sends information to many others. The sender publishes a message to an exchange, which delivers it to a set of private queues; each recipient reads and processes messages from its own private queue.
Workload distribution, where one application spreads tasks across many others. The sender publishes a message to an exchange, which delivers it to a single shared queue; the recipients take messages from this shared queue on a round-robin basis.
What is missing from that description is that the client-side AMQP API implements a second, hidden queue for each recipient. Because AMQP delivers messages to recipients asynchronously, this hidden queue is invisible to the protocol. AMQP does have mechanisms for fetching messages synchronously, but I think these should be dropped: they came out of our reverse-engineering of JMS, and they are clumsy and unreliable.
In the point-to-point and data distribution scenarios, which are the mainstays of messaging, the server-side private queue and the hidden client-side queue do exactly the same job. From a JMS-style "N = 1" perspective, holding private queues on a central server looks like common sense. But this design adds latency, loads the server, and makes applications create and manage more entities than they need to.
So let's make client-side queues addressable from the rest of the network. Specifically, exchanges can deliver messages directly into these queues.
"Addressable" means that client-side queues can be used as binding targets, and that the protocol has a mechanism for delivering messages into them. We'll look at how this might work later on.
The clearest objection to letting exchanges deliver straight into client-side queues is that any transfer across a network really needs a queue at both ends: the sending side must also buffer outgoing messages, because a slow network will stop the exchange from delivering into a remote queue. In our experience, TCP/IP's own buffering is usually sufficient, and when the network is genuinely congested the best answer is to drop data rather than build up backlogs.
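ZeroMQ expresses this "drop rather than queue forever" choice as a per-socket high-water mark. The fragment below uses today's pyzmq option names (SNDHWM) and an illustrative endpoint, neither of which comes from the text itself.

    # Prefer bounded buffers and dropping to unbounded backlogs.
    import zmq

    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.setsockopt(zmq.SNDHWM, 1000)       # cap the per-peer send queue
    pub.bind("tcp://127.0.0.1:5557")

    # If a subscriber (or the link to it) cannot keep up and the high-water
    # mark is reached, a PUB socket drops further messages for that peer
    # instead of letting a backlog grow inside the publisher.
    for i in range(1_000_000):
        pub.send_string(f"tick {i}")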
The other answer to that objection is that server-side queues can always be added back, as a design decision, for the specific cases that need them.
Once we've made client-side queues first-class protocol objects (that is, addressable from the rest of the network), we can look at how they are created, managed, and named. The best answer is the simplest possible one:
Every client always has a queue for incoming messages. The queue's lifespan is the client's lifespan. The queue has no name.
Now let's deal with the AMQP concept of the "consumer". This concept is on my hit list of things to get rid of, because it is another JMS hangover. Consumers are vaguely defined relationships between message recipients and server-side queues.
Since client-side queues are now first-class objects, the hazy consumer concept can be replaced by a more robust, first-class relationship:
Applications can bind queues together, as well as binding queues to exchanges. When an application binds its client-side queue to a server-side queue, it controls the flow of messages into its own queue.
For clarity, when I say "A binds to B" I mean it in the sense of "A attaches itself to B": a binding is a connection that a queue makes to a message source.
So the consumer concept is replaced by a queue-to-queue binding, and applications control the flow of messages into their client-side queues by creating and destroying such bindings.
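Here is a toy, in-process sketch of that idea: the "consumer" is gone, and a client simply binds its own local queue to a shared server-side queue to start the flow, then destroys the binding to stop it. The mechanics and names are mine, for illustration only.

    # Queue-to-queue binding in place of the consumer concept (toy model).
    from collections import deque

    class Queue:
        def __init__(self):
            self.messages = deque()
            self.bound = []                    # queues bound to this queue
        def bind(self, source):
            """Attach this queue to a message source; returns an un-binder."""
            source.bound.append(self)
            return lambda: source.bound.remove(self)
        def put(self, message):
            if self.bound:                     # shared queue: round-robin
                target = self.bound.pop(0)
                self.bound.append(target)
                target.messages.append(message)
            else:                              # ordinary local queue
                self.messages.append(message)
        def get(self):
            return self.messages.popleft() if self.messages else None

    shared = Queue()                           # lives on the server
    client_a, client_b = Queue(), Queue()      # automatic client-side queues
    unbind_a = client_a.bind(shared)
    client_b.bind(shared)

    for n in range(4):
        shared.put(f"task {n}")                # work flows straight through
    print(client_a.get(), client_b.get())      # -> task 0 task 1
    unbind_a()                                 # no more messages for client_a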
Bindings now become much more interesting: rather than just being links between exchanges and queues, they are connectors that can span the network.
And if bindings really are elastic connectors that can stretch across the network, we can also fix the second flaw: that holding exchanges on a central server is, in some situations, bad design.
An exchange doing essential work must be hosted on a network node with natural authority: a node that cannot be moved without impact, cannot be allowed to fail, and cannot be stopped and started at random. In the point-to-point and workload distribution scenarios no application node has natural authority, so we must create an artificial authority in the form of a central server. So far so good. But in the data distribution scenario we already have a reliable natural authority: the publisher. Creating a second authority is pointless; the redundancy just adds complexity, latency, and opportunities for error.
We should take every chance to make things simpler, faster, and more reliable. So we change the AMQP model as follows:
Exchanges can be located anywhere on the network, and are addressable from anywhere on the network.
Because exchanges are nothing more than algorithms, they can be embedded into client APIs just as easily as into servers. There is a cost, but in critical scenarios it will be well worth paying. In other cases, users can simply keep using server-side exchanges by default.
So with a few changes, AMQP now supports the ideal "N = 0" model for point-to-point communication and data distribution, while remaining compatible with "N = 1" for workload distribution. What about "N > 1"? Mostly, federation does not seem to need any specific protocol support; OpenAMQ handles it well with the present semantics. To remind the reader, federation is done by attaching one exchange to another. It is not a binding; rather, the client exchange simulates a consuming or producing application. However, we have changed the AMQP model so that private queues, now client-side queues, move out from the centre towards the edges.
Because the client exchange simulates an application, we put a queue in front of it so that it can receive messages from its parent exchange. Special cases should always make us suspicious: does it make sense to put queues in front of some exchanges but not others? The simplest, most general answer is:
Every exchange always has a queue for incoming messages. The queue's lifespan is the exchange's lifespan. The queue has no name.
Which is neat, because it makes exchanges look more like ordinary applications and allows yet another major simplification:
Messages are only ever sent to queues.
The old distinction between "publish" and "deliver" has now gone, and queueing has been generalised so that it always happens at the receiving side, reducing delays and giving every receiver a first-class queue.
The semantics of message flow across an AMQP network are now fully N-neutral, whatever the value of N. A few diagrams will show the elegance of this new model. The old "N = 1" AMQP model looked like this:
               <==
    [S] --> [X --> Q] --> [q --> R]
Where "S" stands for "sender", "X" for "exchange", "Q" for "queue", "q" for "invisible queue", and "R" for "recipient". Each bracketed group is a network node, "-->" arrows show message flow, and the binding symbol "<==" is drawn above the flow it controls.
Redrawn with the new N-neutral semantics, the same "N = 1" case looks like this:
                      <==
    [S] --> [Q --> X] --> [Q --> R]
Now, to create a quick “N = 0” peer-to-peer network, we just rearrange the brackets to obtain:
                    <==
    [S --> Q --> X] --> [Q --> R]
And we can build a wider "N = 2" federation like this:
                      <==           <==
    [S] --> [Q --> X] --> [Q --> X] --> [Q --> R]
For workload distribution we use a queue-to-queue binding:
                     <==    <==
    [S] --> [Q --> X --> Q] --> [Q --> R]
The N-neutral model is both simple and flexible, and it clearly handles every messaging architecture we have come across.
Here is a brief summary of the new N-neutral AMQP model (a small sketch follows the list):
Exchanges can live anywhere on the network.
Every application has an automatic local queue for incoming messages.
Every exchange has an automatic local queue for incoming messages.
These queues are first-class objects that other network nodes can address.
Exchanges and shared queues can still be held on a central server.
Bindings can be made from exchange to exchange and from queue to queue.
Bindings can cross network links.
The only message-transfer semantic is "send message to queue".
Exchanges and queues send messages onwards according to their bindings.
The same semantics work for every value of N.
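To close, here is a toy, in-process sketch of the whole N-neutral model: every application and every exchange has an automatic inbound queue, exchanges are just algorithms that can live anywhere, and the only transfer primitive is "send message to queue". All of it is illustrative; none of it is a protocol definition.

    # Toy sketch of the N-neutral model. Not a protocol implementation.
    from collections import deque

    class Queue:
        """Automatic, unnamed inbound queue owned by an application or exchange."""
        def __init__(self):
            self.messages = deque()
        def send(self, message):               # the single transfer primitive
            self.messages.append(message)
        def get(self):
            return self.messages.popleft() if self.messages else None

    class Exchange:
        """A routing algorithm with its own inbound queue. It can be embedded
        in a publisher (N = 0), hosted on a server (N = 1), or chained into
        a federation (N = 2 and beyond)."""
        def __init__(self):
            self.queue = Queue()               # every exchange has a local queue
            self.bindings = []                 # queues bound to this exchange
        def bind(self, queue):
            self.bindings.append(queue)
        def route(self):
            message = self.queue.get()
            while message is not None:
                for queue in self.bindings:
                    queue.send(message)        # still just "send to queue"
                message = self.queue.get()

    # "N = 0": the exchange runs inside the publishing application.
    publisher_exchange = Exchange()
    reader = Queue()                           # automatic queue at a receiver
    publisher_exchange.bind(reader)
    publisher_exchange.queue.send("price update 1")
    publisher_exchange.route()
    print(reader.get())                        # -> price update 1

    # "N = 2": federate by binding one exchange's inbound queue to another.
    edge = Exchange()
    publisher_exchange.bind(edge.queue)
    remote_reader = Queue()
    edge.bind(remote_reader)
    publisher_exchange.queue.send("price update 2")
    publisher_exchange.route()                 # first hop, into the edge exchange
    edge.route()                               # second hop, out to the reader
    print(remote_reader.get())                 # -> price update 2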
We can begin to develop actual protocols based on this new, simpler AMQP model, which is what I'll do in the following section.