Complex (anti-)affinity policies
To meet typical telecommunication service provider SLAs, the failure
of any given host must not take down more than k VMs in each N+k pool
for a given application. More precisely, because the pools are
dynamically scaled, the requirement is that at no time is more than a
certain proportion of any pool instantiated on the same host.
An N+k pool is a pool of identical, stateless servers, any of which
can handle requests for any user. N is the number required purely
for capacity; k is the additional number required for redundancy.
k is typically greater than 1 to allow for multiple failures.
During normal operation, N+k servers should be running. Conceptually,
in this context, a pool is roughly analogous to Nova's concept of a
"server group", though the latter does not support the type of policy
described in this proposal.
Affinity/anti-affinity
is sufficient for a 1:1 active/passive architecture, but an N+k pool
needs something more subtle. Specifying that all members of the
pool should live on distinct hosts is clearly wasteful. Instead,
availability modelling shows that the overall availability of an N+k
pool is determined by the time to detect failures and spin up new
instances, the time between failures, and the proportion of the
overall pool that fails simultaneously. The OpenStack scheduler needs
to provide some way to control the last of these by limiting the
proportion of a group of related VMs that is scheduled on the same host.
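The per-host limit described above can be illustrated with a small, self-contained sketch. It is not the Nova scheduler filter; the function name `allowed_hosts`, the `max_percent` parameter, and the rule that an empty host is always acceptable are assumptions made purely for illustration.

```python
from collections import Counter


def allowed_hosts(member_hosts, candidate_hosts, max_percent):
    """Return the candidate hosts that may take one more pool member.

    member_hosts:    host name of each existing pool member (hypothetical input)
    candidate_hosts: hosts being considered for the new member
    max_percent:     largest share of the pool (0-100) allowed on one host
    """
    per_host = Counter(member_hosts)
    new_size = len(member_hosts) + 1  # pool size once the new member exists
    allowed = []
    for host in candidate_hosts:
        after = per_host[host] + 1  # members on this host after placement
        # Assumption: a host with no members is always acceptable, otherwise
        # a very small pool could never be placed at all.
        # Integer arithmetic avoids floating-point rounding on the limit.
        if after == 1 or after * 100 <= max_percent * new_size:
            allowed.append(host)
    return allowed


# Three members exist: two on hostA, one on hostB. With a 50% limit and a
# pool that grows to four, hostA (which would hold three) is rejected.
print(allowed_hosts(["hostA", "hostA", "hostB"],
                    ["hostA", "hostB", "hostC"], 50))
```

Because the limit is expressed as a share of the pool rather than a fixed count, the same check keeps working as the pool is scaled up or down.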
Blueprint information
- Status: Complete
- Approver: Matt Riedemann
- Priority: Low
- Drafter: Stephen Gordon
- Direction: Approved
- Assignee: Yikun Jiang
- Definition: Approved
- Series goal: Accepted for rocky
- Implementation: Implemented
- Milestone target: rocky-3
- Started by: Matt Riedemann
- Completed by: Matt Riedemann
Whiteboard
Moving to rocky for discussion; I believe our product team at Huawei has a similar requirement for a type of tolerance/
Gerrit topic: https:/
Addressed by: https:/
[WIP] Allow Specifying Limit For Soft (Anti-)Affinity Groups
Gerrit topic: https:/
Addressed by: https:/
Add rules column to instance_
Addressed by: https:/
Add policy to InstanceGroup object and api models.
Addressed by: https:/
Add policy field to ServerGroup notification object
Addressed by: https:/
WIP: Microversion 2.63 - Use new format policy in server group
Approved for Rocky. -- mriedem 20180516
Addressed by: https:/
Change the anti-affinity Filter to adapt to new policy
Addressed by: https:/
WIP: Adapt _validate_
Addressed by: https:/
Add InstanceGroupPolicy object
Addressed by: https:/
Fix all invalid obj_make_compatible test case
Gerrit topic: https:/
Addressed by: https:/
Refactor the policies to policy
Addressed by: https:/
DNM: use policy create()
Addressed by: https:/
Address nit in afc7650e64753ab
Addressed by: https:/
Change deprecated policies to policy
The compute API microversion 2.64 and corresponding novaclient change are merged so this is complete for Rocky. -- mriedem 20180718
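As a rough illustration of what a client request under microversion 2.64 might look like, the sketch below builds the headers and JSON body for creating a server group with a per-host limit. The helper name `build_server_group_request` is hypothetical. The field names `policy`, `rules`, and `max_server_per_host`, and the `OpenStack-API-Version` header, reflect the compute API as best recalled here, so check them against the Nova API reference before relying on them.

```python
import json


def build_server_group_request(name, max_server_per_host):
    """Build headers and a JSON body for a server group create request.

    Hypothetical helper: it only assembles the payload and sends nothing.
    """
    if not isinstance(max_server_per_host, int) or max_server_per_host < 1:
        raise ValueError("max_server_per_host must be a positive integer")
    body = {
        "server_group": {
            "name": name,
            # Single 'policy' string, replacing the older 'policies' list.
            "policy": "anti-affinity",
            # Assumed rule name limiting group members per host.
            "rules": {"max_server_per_host": max_server_per_host},
        }
    }
    headers = {
        "Content-Type": "application/json",
        # Requests the microversion mentioned in the whiteboard note.
        "OpenStack-API-Version": "compute 2.64",
    }
    return headers, json.dumps(body)


headers, payload = build_server_group_request("n-plus-k-pool", 2)
print(headers)
print(payload)
```

Note that this expresses the limit as an absolute count of servers per host, not as a proportion of the pool, so a dynamically scaled pool would need the value chosen with its smallest expected size in mind.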