Cross-cell resize
References:
* https:/
* https:/
* http://
High level notes:
* We are adding cross-cell resize because cells can be sharded by flavor, and resize is the only non-admin way for users to opt into migrating from one cell with an old flavor (gen1) to a new flavor (gen2) in a new cell. This makes it easier for admins/operators to drain old cells with old hardware.
* Currently a resize restricts the selected destination host to the existing cell; we'll add a policy rule to allow overriding that behavior in the scheduler so candidate target hosts are pulled from all cells. As part of this, we'll add a weigher which by default selects hosts from the current cell if possible to avoid unnecessary cross-cell migrations.
* We'll add a new task to conductor to orchestrate the cross-cell resize since it will be substantially different from the existing cold migrate / resize task.
* The conductor will perform pre-migration checks similar to those in the live migration task: the destination compute will be validated to make sure that things like volumes and ports attached to the instance will continue to work on the destination host in the target cell.
* A cross-cell resize will leverage the existing shelve offload operation in the compute so that we shelve offload from the source host in cell1 and unshelve into the target host in cell2.
* The new conductor task will orchestrate creating the instance and its related records (BDMs and tags) in the target cell and updating the instance mapping to point at the new cell. Two points are TBD: when the instance is deleted from the source cell, and when the instance mapping record is updated (during or after the unshelve to the new cell).
* The API will have to deal with the same instance living temporarily in multiple cells when listing instances and hide one of them based on the instance mapping (or simply the task_state/
* Cross-cell resize will support the same confirm/revert semantics as normal resize today. Reverting a cross-cell resize will delete the instance from the target cell and recreate it in the source cell (note that the original source host might change if the instance was offloaded).
* There are some shelve-related bugs that we should fix before building more functionality onto the shelve / unshelve workflow; those are linked to this blueprint.
* A formal design spec will follow once a proof of concept is written and basic testing has begun.
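The scheduling behavior described in the notes above (a policy-gated widening of candidate hosts to all cells, plus a weigher that prefers the current cell) can be illustrated with a minimal, self-contained sketch. This is not Nova's scheduler code: `Host`, `candidate_hosts`, `weigh`, `select_destination`, and the `same_cell_bonus` value are all hypothetical names chosen for illustration, and free RAM stands in for whatever the real weighers consider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for a compute host known to the scheduler."""
    name: str
    cell: str
    free_ram_mb: int


def candidate_hosts(hosts, source_cell, allow_cross_cell):
    """Return hosts from all cells only when policy allows a cross-cell move.

    Without the policy override, candidates stay in the instance's current
    cell, which mirrors the existing resize behavior described above.
    """
    if allow_cross_cell:
        return list(hosts)
    return [h for h in hosts if h.cell == source_cell]


def weigh(host, source_cell, same_cell_bonus=1_000_000.0):
    """Score a host; hosts in the source cell get a large bonus.

    The bonus is an assumed value, large enough to dominate the free-RAM
    term so that an in-cell host wins whenever one is available.
    """
    score = float(host.free_ram_mb)
    if host.cell == source_cell:
        score += same_cell_bonus
    return score


def select_destination(hosts, source_cell, allow_cross_cell, required_ram_mb):
    """Pick the best-weighted host that can fit the new flavor, or None."""
    candidates = [
        h
        for h in candidate_hosts(hosts, source_cell, allow_cross_cell)
        if h.free_ram_mb >= required_ram_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: weigh(h, source_cell))
```

With this shape, a cross-cell move happens only when the policy allows it and no host in the current cell can fit the requested flavor, which is the "avoid unnecessary cross-cell migrations" default the notes call for.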
Blueprint information
- Status: Complete
- Approver: Sylvain Bauza
- Priority: Medium
- Drafter: Matt Riedemann
- Direction: Approved
- Assignee: Matt Riedemann
- Definition: Approved
- Series goal: Accepted for ussuri
- Implementation: Implemented
- Milestone target: ussuri-2
- Started by: Matt Riedemann
- Completed by: Eric Fried
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
WIP: Cross-cell resize
Addressed by: https:/
Add Migration.
Addressed by: https:/
Add Destination.
Addressed by: https:/
Add InstanceAction/
Addressed by: https:/
Change HostManager to allow scheduling to other cells
Addressed by: https:/
Add CrossCellWeigher
Addressed by: https:/
Spec for cross-cell resize
Addressed by: https:/
Isolate cell-targeting code in MigrationTask
Addressed by: https:/
Extract compute API _create_image to compute.utils
Addressed by: https:/
Extract shelve API logic to compute.utils
Addressed by: https:/
Add can_connect_
Addressed by: https:/
WIP: Add initial cross-cell resize tasks
Addressed by: https:/
WIP: Add snapshot task for cross-cell resize
Addressed by: https:/
WIP: Create instance data in target cell prior to resize
Spec was approved on 2019-01-07 for Stein. -- melwitt 20190109
Addressed by: https:/
Add Instance.hidden field
Addressed by: https:/
WIP: Add CrossCellMigrat
Addressed by: https:/
WIP: Add prep_snapshot_
Addressed by: https:/
Move resize.
Addressed by: https:/
Execute TargetDBSetupTask
Addressed by: https:/
Move resize.(start|end) notification sending to helper method
Addressed by: https:/
Add prep_snapshot_
Addressed by: https:/
Move finish_
Addressed by: https:/
WIP: Add finish_
Addressed by: https:/
WIP: Add FinishResizeAtD
Addressed by: https:/
WIP: Execute CrossCellMigrat
Addressed by: https:/
WIP: Plumb allow_cross_
Addressed by: https:/
WIP: Filter duplicates from compute API get_migrations_
Addressed by: https:/
WIP: Start functional testing for cross-cell resize
Addressed by: https:/
Make Claim._claim_test handle SchedulerLimits object
Addressed by: https:/
RT: improve logging in _update_
Addressed by: https:/
Make move_allocations handle empty source allocations
Addressed by: https:/
Stub out port binding create/delete in NeutronFixture
Addressed by: https:/
WIP: Add confirm_
Addressed by: https:/
WIP: Add ConfirmResizeTask
Addressed by: https:/
WIP: Add confirm_
Addressed by: https:/
WIP: Confirm cross-cell resize from the API
Addressed by: https:/
Add nova.compute.
Addressed by: https:/
WIP: Add revert_
Addressed by: https:/
WIP: Add finish_
Addressed by: https:/
WIP: Add RevertResizeTask
Addressed by: https:/
WIP: Add revert_
Addressed by: https:/
WIP: Revert cross-cell resize from the API
Addressed by: https:/
Confirm cross-cell resize while deleting a server
Addressed by: https:/
Add cross-cell resize policy rule and enable in API
Addressed by: https:/
WIP: Fix the leak in the cross-cell revert resize code
Addressed by: https:/
Improve CinderFixtureNe
Addressed by: https:/
Deal with cross-cell resize in _remove_
I'm deferring this from Stein since we're two days from feature freeze and this has a long ways to go. Will re-propose for Train. -- mriedem 20190305
Addressed by: https:/
WIP: Fix RT usage issues in cross-cell resize functional tests
Addressed by: https:/
Fix ProviderUsageBa
Addressed by: https:/
Add functional recreate test for bug 1818914
Addressed by: https:/
Remove unused context parameter from RT._get_
Addressed by: https:/
Update usage in RT.drop_move_claim during confirm resize
Addressed by: https:/
Refactor ComputeManager.
Addressed by: https:/
Add power_on kwarg to ComputeDriver.
Addressed by: https:/
Add functional test for cross-cell migrate with target host
Addressed by: https:/
Validate image/create during cross-cell resize functional testing
Addressed by: https:/
Re-propose cross-cell-resize spec for Train
Addressed by: https:/
Add zones wrinkle to TestMultiCellMi
Addressed by: https:/
Add negative test for cross-cell finish_resize failing
Addressed by: https:/
Extract compute API _create_image to compute.utils
Addressed by: https:/
DNM: Add instance hard delete
Re-approved for Train. -- mriedem 20190410
Gerrit topic: https:/
Addressed by: https:/
Add archive_
Addressed by: https:/
FUP for I68498afd481f72
Gerrit topic: https:/
Addressed by: https:/
Fix ProviderUsageBa
Addressed by: https:/
Improve CinderFixtureNe
Addressed by: https:/
Add functional recreate test for bug 1818914
Addressed by: https:/
Remove unused context parameter from RT._get_
Addressed by: https:/
Update usage in RT.drop_move_claim during confirm resize
Addressed by: https:/
Add Migration.
Addressed by: https:/
Add InstanceAction/
Addressed by: https:/
DNM: Add instance hard delete
Addressed by: https:/
Add Instance.hidden field
Addressed by: https:/
Add TargetDBSetupTask
Addressed by: https:/
Add CrossCellMigrat
Addressed by: https:/
Execute TargetDBSetupTask
Addressed by: https:/
Add can_connect_
Addressed by: https:/
Add prep_snapshot_
Addressed by: https:/
Add PrepResizeAtDes
Addressed by: https:/
Add prep_snapshot_
Addressed by: https:/
Add nova.compute.
Addressed by: https:/
Add PrepResizeAtSou
Addressed by: https:/
Refactor ComputeManager.
Addressed by: https:/
Add power_on kwarg to ComputeDriver.
Addressed by: https:/
Add finish_
Addressed by: https:/
Add FinishResizeAtD
Addressed by: https:/
Add Destination.
Addressed by: https:/
Execute CrossCellMigrat
Addressed by: https:/
Plumb allow_cross_
Addressed by: https:/
Filter duplicates from compute API get_migrations_
Addressed by: https:/
Change HostManager to allow scheduling to other cells
Addressed by: https:/
Start functional testing for cross-cell resize
Addressed by: https:/
Add functional test for cross-cell migrate with target host
Addressed by: https:/
Validate image/create during cross-cell resize functional testing
Addressed by: https:/
Add zones wrinkle to TestMultiCellMi
Addressed by: https:/
Add negative test for cross-cell finish_resize failing
Addressed by: https:/
WIP: Add confirm_
Addressed by: https:/
WIP: Add ConfirmResizeTask
Addressed by: https:/
Add confirm_
Addressed by: https:/
Confirm cross-cell resize from the API
Addressed by: https:/
WIP: Add revert_
Addressed by: https:/
Deal with cross-cell resize in _remove_
Addressed by: https:/
WIP: Add finish_
Addressed by: https:/
WIP: Add RevertResizeTask
Addressed by: https:/
Add revert_
Addressed by: https:/
Revert cross-cell resize from the API
Addressed by: https:/
Confirm cross-cell resize while deleting a server
Addressed by: https:/
Add archive_
Addressed by: https:/
Add CrossCellWeigher
Addressed by: https:/
Add cross-cell resize policy rule and enable in API
Gerrit topic: https:/
Addressed by: https:/
WIP: Add nova-multi-cell job
Addressed by: https:/
Enable cross-cell resize in the nova-multi-cell job
Addressed by: https:/
Support cross-cell moves in external_
Addressed by: https:/
Robustify attachment tracking in CinderFixtureNe
Addressed by: https:/
Fix hard-delete of instance with soft-deleted referential constraints
Addressed by: https:/
Add functional test for anti-affinity cross-cell migration
Addressed by: https:/
Handle lazy-load of Migration.
Addressed by: https:/
Refresh instance in MigrationTask.
Addressed by: https:/
Add negative test for prep_snapshot_
Addressed by: https:/
FUP for I66d8f06f19c5c6
Deferring to Ussuri since we're 1 week from Train feature freeze and there is still a ton of code to land for this feature so I want to avoid this being a distraction for Train. Will re-propose the spec for Ussuri. -- mriedem 20190905
Addressed by: https:/
Re-propose cross-cell-resize spec for Ussuri
[efried 20190918] Fast approving per previously approved spec process http://
Addressed by: https:/
FUP to I30916d8d10d70c
Addressed by: https:/
FUP to I4d181b44494f3b
Addressed by: https:/
WIP: Add negative test to delete server during cross-cell resize claim
Addressed by: https:/
libvirt: flatten rbd image during cross-cell move spawn at dest
Addressed by: https:/
Pass exception through TaskBase.rollback
Addressed by: https:/
Follow up to I3e28c0163dc14d
Addressed by: https:/
Remove unused CannotMigrateWi
Addressed by: https:/
Make API always RPC cast to conductor for resize/migrate
Addressed by: https:/
Flesh out RevertResizeTas
Addressed by: https:/
Add functional cross-cell revert test with detached volume
Addressed by: https:/
Add test_resize_
Addressed by: https:/
Simplify FinishResizeAtD
Addressed by: https:/
Amend cross-cell-resize spec
Addressed by: https:/
Flesh out docs for cross-cell resize/cold migrate
Addressed by: https:/
WIP: Implement reschedule logic for cross-cell resize/migrate
Addressed by: https:/
WIP: Implement cleanup_
Addressed by: https:/
Follow up to I5b9d41ef343856
Addressed by: https:/
Add sequence diagrams for cross-cell-resize
Addressed by: https:/
DNM: debug cross-cell resize
Addressed by: https:/
Add cross-cell resize tests for _poll_unconfirm
Addressed by: https:/
Refresh target cell instance after finish_
Addressed by: https:/
Fix accumulated non-docs nits for cross-cell-resize series
Addressed by: https:/
Plumb graceful_exit through to EventReporter
Addressed by: https:/
Use graceful_exit=True in ComputeTaskMana
Addressed by: https:/
FUP for docs nits in cross-cell-resize series
Addressed by: https:/
FUP to Iff8194c868580f
[efried 20200107] Marking complete. Remaining patches in this bp as of right now:
- https:/
- https:/
- https:/
- https:/
Addressed by: https:/
Improve CinderFixtureNe
Addressed by: https:/
Robustify attachment tracking in CinderFixtureNe