Move ci.linaro.org into the data centre

Registered by Deepti B. Kalakeri

We want to consolidate ci.linaro.org and android-build.linaro.org onto a single server.
The first step in that direction is moving ci.linaro.org to hardware inside the Canonical data centre that is powerful enough to run both services; that move is what this blueprint covers.

Blueprint information

Status:
Complete
Approver:
Данило Шеган
Priority:
Medium
Drafter:
Deepti B. Kalakeri
Direction:
Approved
Assignee:
None
Definition:
Approved
Series goal:
None
Implementation:
Informational
Milestone target:
None
Started by
Данило Шеган
Completed by
Данило Шеган

Related branches

Sprints

Whiteboard

[deepti Feb 01, 2012]: Here is the set of hardware requirements for hosting ci.linaro.org and android-build.linaro.org. Please review.
Master hardware requirements:
Processor: The master can be a machine with 1 - 2 processors.
Memory: A master with 4GB of primary memory would be a good candidate. More memory would be a bonus.
Hard Disk: A disk with 100GB+ of space.
[pfalcon 2011-02-10] During Connect, Gesha and I tried to set up a test Android build on ci.linaro.org. It took installing the SSH publisher plugin and getting admin access so that we could configure an EC2 build slave type for Android builds, but in the end it worked without further issues: https://ci.linaro.org/jenkins/view/Andriod_builds/job/linaro-android_panda-ics-gcc44-aosp-stable-blob/4/ . So we know that the main part of supporting Android builds on the consolidated system is having the proper plugins installed and the EC2 slave type configured. The frontend is the remaining piece of the puzzle (to be tested).
[pfalcon 2012-02-14] Add WIs for ntpdate and for separating /var/lib/jenkins from /var/lib/jenkins/jobs, based on a discussion with Loic about the latest android-build issues.
[deepti, Feb 23, 2012]: This BP cannot be completed for the 2012.02 milestone, as the IS team has not got back to us with a machine.
Slave hardware requirements:
Kernel CI:
Processor: The slaves need to have more processing power. A slave with 2 - 4 processors is a good candidate.
Memory: A slave with around 8 - 16 GB of memory is a good candidate.
Hard Disk: A slave with 20GB+ of disk space should be fine.
Android CI:
Processor: 4 processors is the minimum. An option for an 8-CPU slave is a plus.
Memory: 16GB is the minimum. An option for 32GB is a plus.
Hard Disk: 100GB is a good start; an option for more is a plus.

[danilo, 2012-02-02]: We are going to keep using EC2 for slaves at this time. Let's get a 1 CPU, 4GB RAM, 200GB machine and get it set up with Jenkins as required for ci.linaro.org. I suggest we use "build.linaro.org" for that.
[danilo, 2012-02-24]: Discussed this with IS: by the end of next week we should have jenkins.linaro.org capable of serving our needs (enough memory and disk space). If disk space is not a blocker for configuring it further, we can have it even sooner (currently, jenkins.linaro.org has 20GB of disk space).
[pfalcon 2012-02-24] 20GB is too small for normal operation, considering that even without artifacts a build takes up to 100MB (logs, other internal info). But we can certainly start configuring with that, provided IS then copies it over to a bigger disk.
[dzin, Feb 24, 2012]: Blocked by hardware (infrastructure). Moving to the next cycle.
[danilo, 2012-03-05] RT: https://rt.linaro.org/Ticket/Display.html?id=298
[deepti, 2012-04-23]: Moving the BP to backlog until we hear from IS about the machine. Spoke to Danilo before I moved it to backlog.
[fboudra, 2012-06-13] Unblock the blueprint. RT#298 is resolved and the machine is available for setting up the Jenkins configuration.
[danilo, 2012-11-22] We are moving away from the DC, killing this.
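
The disk-space concern raised in the 2012-02-24 entries can be made concrete with a little arithmetic. The following is a rough sketch only: it takes the 20GB, 200GB and "up to 100MB per build" figures from the whiteboard, while the function name, the decimal unit conversion and the 5GB reserve are assumptions introduced for illustration.

```python
# Rough capacity estimate. Only the 20 GB / 200 GB disk sizes and the
# "up to 100 MB per build" figure come from the whiteboard; the reserve
# for the OS and Jenkins itself is a hypothetical value.

MB_PER_GB = 1000  # decimal units, as disk sizes are usually quoted


def builds_that_fit(disk_gb, per_build_mb=100, reserved_gb=0):
    """Return how many builds fit on a disk after setting aside reserved space."""
    usable_mb = max(disk_gb - reserved_gb, 0) * MB_PER_GB
    return usable_mb // per_build_mb


if __name__ == "__main__":
    for disk_gb in (20, 200):
        # reserved_gb=5 is an assumed allowance, not a measured one
        print(disk_gb, "GB:", builds_that_fit(disk_gb, reserved_gb=5), "builds")
```

Under these assumptions a 20GB disk holds on the order of a couple of hundred worst-case builds before any artifacts are stored, which is consistent with the view that it is enough to start configuring but not for normal operation.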

Meta:
Headline: ci.linaro.org is moved to Canonical DC as a basis for future consolidation with android-build.linaro.org.
Acceptance: ci.linaro.org is moved to Canonical DC and has resources to host android-build.linaro.org. Old EC2-based ci.linaro.org is discontinued.


Work Items

Work items:
Identify the software requirements for ci.linaro.org service: DONE
[pfalcon] Identify the software requirements for android-build.linaro.org service: DONE
[danilo] Review the hardware/software requirements identified for ci.linaro.org/android-build.linaro.org service: DONE
Raise an RT to request a machine that can meet the ci.linaro.org and android-build.linaro.org requirements: DONE
Update wiki to describe setting up new jobs (ci.linaro.org ones) on the new consolidation server: TODO
Get the machine described in the RT up and running: TODO
Migrate the ci.linaro.org jobs to the new consolidation server: TODO
Enable the ci.linaro.org builds on the new consolidated server: TODO
Verify that the migrated ci.linaro.org jobs on the consolidation server work fine and that the move has caused no regressions: TODO
Verify that LAVA test execution for migrated ci.linaro.org build jobs submitted from the consolidation server works fine and there are no regressions: TODO
Make the new consolidated server for the ci.linaro.org Build Service available for end users in Linaro: TODO
Kill the ci.linaro.org instance hosted on EC2: TODO
[pfalcon] Set up a test Android build on ci.linaro.org: DONE
[pfalcon] Set up android-frontend to work remotely against ci.linaro.org (as a test): TODO
Make sure that the new master runs ntpdate: TODO
Put /var/lib/jenkins and /var/lib/jenkins/jobs on separate partitions, so that a /var/lib/jenkins/jobs overflow (which has happened several times) does not affect storage of Jenkins state data, which can otherwise lead to a gross EC2 instance leak (this has also happened): TODO
Request extended monitoring for the master host from IS (disk space, load, etc.): TODO
Work out the master's security; by all means try to avoid running any user-definable code on the master: TODO
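
The partition-separation and disk-monitoring work items could be spot-checked on the master with something like the sketch below. This is not an agreed design: the function names and the 80% warning threshold are hypothetical, and comparing device IDs is only an approximation of "separate partitions" (bind mounts and some filesystems can blur the distinction).

```python
import os
import shutil


def on_separate_filesystems(path_a, path_b):
    """True if the two existing paths report different device IDs."""
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


def percent_used(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_jenkins_layout(home="/var/lib/jenkins",
                         jobs="/var/lib/jenkins/jobs",
                         warn_at=80.0):
    """Return a list of warnings; an empty list means nothing was flagged.

    warn_at is an assumed threshold (percent full), not a value from IS.
    """
    warnings = []
    if not on_separate_filesystems(home, jobs):
        warnings.append(f"{jobs} shares a filesystem with {home}")
    for path in (home, jobs):
        used = percent_used(path)
        if used >= warn_at:
            warnings.append(f"{path} is {used:.0f}% full")
    return warnings


if __name__ == "__main__":
    for line in check_jenkins_layout():
        print("WARNING:", line)
```

A check like this could run from cron as a stopgap until IS monitoring is in place; it only reports and never changes anything on disk.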


This blueprint contains Public information 
Everyone can see this information.