AMQP RPC single per process response queue performance improvement

Registered by Raymond Pekowski

This blueprint proposes a change to the AMQP-based OpenStack RPC implementations, specifically RabbitMQ and Qpid, to improve maximum RPC throughput. The proposal is to replace the response queues and exchanges that are dynamically created and deleted for every RPC call with a single response queue per RPC process. This improvement also resolves a RabbitMQ scalability problem in which maximum RPC throughput decreases as cluster nodes are added.

A Dell study on the performance benefits of this change can be found here:
https://docs.google.com/file/d/0B-droFdkDaVhVzhsN3RKRlFLODQ/edit

Blueprint information

Status:
Complete
Approver:
Mark McLoughlin
Priority:
Medium
Drafter:
Raymond Pekowski
Direction:
Approved
Assignee:
Raymond Pekowski
Definition:
Review
Series goal:
Accepted for grizzly
Implementation:
Implemented
Milestone target:
2013.1
Started by
Raymond Pekowski
Completed by
Mark McLoughlin

Whiteboard

A prototype has already been implemented. Here is more detail on what was done in that prototype, a good starting point for discussion:
- Add a reply ID to the RPC request, copy it onto the RPC response, and use it to correlate RPC responses with the calling thread.
- If an RPC request is received without the reply ID, assume the requester/caller is downlevel and send the RPC reply in the downlevel way, that is, on the queue identified by the message ID and without the message ID in the response.
- If an RPC response is received without the reply ID, assume the responder/callee is downlevel, log an error message, and discard the response. The call will eventually time out. Consider allowing the response to be returned if there is only one outstanding call, i.e. no concurrency.
- Spawn a greenthread (using the Connection class's consume_in_thread()) to receive all RPC responses.
- Create an eventlet LightQueue for passing the message data from the receive thread to the waiting thread.
- Add a backward compatibility option to fall back to the prior RPC behavior, since this change breaks backward compatibility, and/or make use of RPC versioning features if they exist.
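
The correlation scheme in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the prototype's code: it uses standard-library threads and queue.Queue in place of eventlet greenthreads and LightQueue, an in-process queue in place of the AMQP reply queue, and the names ReplyCaller, reply_id, reply_q, echo_server, and downlevel_server are hypothetical.

```python
import queue
import threading
import uuid


class ReplyCaller:
    """One reply queue and one receive thread shared by all calls in a process."""

    def __init__(self, send):
        # Hypothetical name for the single per-process reply queue.
        self.reply_q = "reply_" + uuid.uuid4().hex
        self._send = send                # callable(request, deliver)
        self._incoming = queue.Queue()   # stand-in for the broker reply queue
        self._waiters = {}               # reply_id -> per-call queue
        self._lock = threading.Lock()
        threading.Thread(target=self._consume, daemon=True).start()

    def deliver(self, response):
        """Called by the transport when a response arrives."""
        self._incoming.put(response)

    def _consume(self):
        # Single receive loop: hand each response to the caller that owns it.
        while True:
            response = self._incoming.get()
            reply_id = response.get("reply_id")
            with self._lock:
                waiter = self._waiters.get(reply_id)
            if waiter is None:
                # No reply ID (downlevel responder) or unknown ID: discard.
                # The matching call, if any, will time out.
                continue
            waiter.put(response["result"])

    def call(self, method, args, timeout=1.0):
        reply_id = uuid.uuid4().hex
        waiter = queue.Queue()
        with self._lock:
            self._waiters[reply_id] = waiter
        request = {"method": method, "args": args,
                   "reply_q": self.reply_q, "reply_id": reply_id}
        try:
            self._send(request, self.deliver)
            return waiter.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no reply for %s" % method) from None
        finally:
            with self._lock:
                self._waiters.pop(reply_id, None)


def echo_server(request, deliver):
    # Uplevel responder: copies the reply ID onto the response.
    deliver({"reply_id": request["reply_id"], "result": request["args"]})


def downlevel_server(request, deliver):
    # Downlevel responder: omits the reply ID, so the response is discarded.
    deliver({"result": request["args"]})
```

With these stand-ins, ReplyCaller(echo_server).call("ping", 42) returns the echoed argument, while a call routed to downlevel_server raises TimeoutError once the timeout expires, matching the discard-and-time-out behavior described above.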

---

Interesting!

As I said by mail, please start a discussion about this on <email address hidden>

Backwards compatibility is going to be a big concern. Please look over the message envelope patch and discussion here: https://review.openstack.org/17554
---
The message envelope patch addresses RPC versioning for RPC overall. This change only addresses AMQP-based RPC, so the message envelope is not applicable. I have provided a compatibility flag, which currently defaults to the faster, non-compatible behavior. I could make compatibility mode the default.

---
I discussed this change on the openstack-dev mailing list back in November. Here is the post that started it:
http://lists.openstack.org/pipermail/openstack-dev/2012-November/002730.html

The code review is at this link:
https://review.openstack.org/#/q/status:open+project:openstack/oslo-incubator+branch:master+topic:bp/amqp-rpc-fast-reply-queue,n,z

---

The following link is to a presentation of the results of a study measuring the scalability of RabbitMQ as used by OpenStack and the performance improvement that comes from the change proposed by this blueprint:
https://docs.google.com/file/d/0B-droFdkDaVhVzhsN3RKRlFLODQ/edit

---

Implemented by https://review.openstack.org/19721
