Provision staging vms through terraform (up to the first puppet run)
Abandoned, Public

Authored by ardumont on Jul 23 2019, 4:20 PM.

Details

Reviewers
vlorentz
douardda
Group Reviewers
Reviewers
Summary

It's done in multiple steps:

  • (once) prepare your workstation to orchestrate the terraform tooling (for proxmox here)
  • (once) create a template vm (through the hypervisor, though; it's scripted in this diff)
  • (once per vm) define the vm through the proxmox dsl in the staging.tf file (there are 2 here: gateway, storage0)

The current staging.tf file defines 2 vms that can be created from scratch (run terraform destroy, then terraform apply; also clean up the puppet master's certificates for those nodes):

  • gateway (192.168.128.1)
  • storage0 (192.168.128.2), a storage server node which routes its packets through the gateway
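
The from-scratch cycle above could be sketched as a small dry-run helper. This is only a sketch under assumptions, not what the diff ships: the `recreate_staging` and `run` names, the `DRY_RUN` switch, root ssh access to the puppet master, the gateway's fqdn, and the `puppet cert clean` syntax (Puppet 5 and earlier) are all illustrative. The certificates are cleaned between destroy and apply so that the fresh nodes can request new ones.

```shell
#!/bin/sh
# Hypothetical helper: print (or run) the steps that rebuild the staging vms.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
PUPPET_MASTER="pergamon.internal.softwareheritage.org"
# Assumption: node fqdns follow the internal.staging.swh.network domain.
NODES="gateway.internal.staging.swh.network storage0.internal.staging.swh.network"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_staging() {
    run terraform destroy
    for node in $NODES; do
        # Assumption: Puppet <= 5 syntax for dropping a node's certificate.
        run ssh "root@$PUPPET_MASTER" puppet cert clean "$node"
    done
    run terraform apply
}
```

With the default `DRY_RUN=1`, calling `recreate_staging` only lists the commands, which makes it easy to review them before running anything for real.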

This diff also defines a series of documents (markdown, scripts) to clarify how to:

  • prepare the workstation with the tools
  • create the template vm (debian 9/10)

Note:
This works for debian oldstable (9) and stable (10).

Prerequisites (for reproducibility):

  • swh-site: needs to be updated to add the expected roles for the new vms (branch new_staging)
  • pergamon:/etc/puppet/autosign.conf: the file was updated manually to add the new domain internal.staging.swh.network (I don't see any puppet stanza which deals with that yet)
  • louvre: one route has been added to louvre to allow ssh connections from the internal network (ssh root@192.168.128.1 for gateway access).
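
As long as no puppet stanza manages autosign.conf, the manual edit could at least be made repeatable. A minimal sketch, assuming the domain is whitelisted as a glob entry (`*.<domain>`, one entry per line, which autosign.conf accepts); the `add_autosign_domain` helper name is hypothetical and the exact line written on pergamon is not shown in this diff.

```shell
#!/bin/sh
# Hypothetical helper: add a domain glob to an autosign.conf file, only once.
add_autosign_domain() {
    conf="$1"     # e.g. /etc/puppet/autosign.conf
    domain="$2"   # e.g. internal.staging.swh.network
    entry="*.$domain"
    touch "$conf"
    # -x matches the whole line, -F treats the entry as a fixed string
    # (so the leading * is not interpreted as a pattern).
    if ! grep -qxF -- "$entry" "$conf"; then
        printf '%s\n' "$entry" >> "$conf"
    fi
}
```

Running it twice with the same domain leaves a single entry in the file.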

Related T1785

Test Plan
terraform init
terraform apply

...

Run from scratch (no manual intervention):

proxmox_vm_qemu.gateway (remote-exec): Notice: Applied catalog in 202.40 seconds
proxmox_vm_qemu.gateway (remote-exec): Node provisionned!
...
proxmox_vm_qemu.storage (remote-exec): Notice: Applied catalog in 229.62 seconds
proxmox_vm_qemu.storage: Still creating... (6m10s elapsed)
proxmox_vm_qemu.storage (remote-exec): Node provisionned!
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Check connectivity:

$ ssh ardumont@192.168.128.1 'echo hello from $(hostname)'
hello from gateway
$ ssh ardumont@192.168.128.2 'echo hello from $(hostname)'
hello from storage0
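
The same connectivity check could be scripted so it fails loudly when a node answers with the wrong hostname. A small sketch: the `check_hostname` name and the overridable `SSH` variable (handy for stubbing ssh out) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical check: does the node at a given ip report the expected hostname?
# SSH can be overridden (for instance with a stub) when no vm is reachable.
SSH="${SSH:-ssh}"

check_hostname() {
    user="$1"
    ip="$2"
    expected="$3"
    # Fail if the connection itself fails.
    actual="$($SSH "$user@$ip" hostname)" || return 1
    [ "$actual" = "$expected" ]
}
```

For example, `check_hostname ardumont 192.168.128.1 gateway && check_hostname ardumont 192.168.128.2 storage0`.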

Diff Detail

Repository
rSPRE sysadm-provisioning
Branch
provision-proxmox-through-terraform
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 7098
Build 10010: arc lint + arc unit

Event Timeline

  • storage: Update default hardware
  • storage0: Drop the swh- prefix which is redundant in fqdn form
  • init-template: Add default instructions adaptations to template
  • Add work around current limitation in api-proxmox-go
  • storage.tf: Format according to terraform conventions
  • Reference workstation preparation so that it's reproducible
  • variables: Use the actual root key referenced in the password store
  • Add FIXME about the new vlan
  • Update documentation to explain how to reproduce
This revision is now accepted and ready to land. Jul 24 2019, 6:15 PM
ardumont planned changes to this revision. Jul 24 2019, 6:16 PM
ardumont updated this revision to Diff 5971. Jul 24 2019, 6:19 PM

Remove in-progress work from diff

This revision is now accepted and ready to land. Jul 24 2019, 6:19 PM
ardumont added inline comments. Jul 24 2019, 6:21 PM
proxmox/terraform/init-template.md
96

This will be exploited by puppet to set things up location-wise.

proxmox/terraform/init-template.sh
35

The ip should probably be passed as a parameter of the script.
125 is currently unallocated, but that won't always be the case.
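
Passing the ip as a parameter could look like the sketch below. It is not the actual init-template.sh code: the `template_ip` and `is_ipv4` helpers and the `DEFAULT_TEMPLATE_IP` name are hypothetical, and the default simply keeps today's 192.168.100.125.

```shell
#!/bin/sh
# Hypothetical argument handling for a script like init-template.sh.
DEFAULT_TEMPLATE_IP="192.168.100.125"   # today's hardcoded value

is_ipv4() {
    # Shape check first: four dot-separated groups of 1 to 3 digits.
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # Then range check: every octet must be at most 255.
    old_ifs="$IFS"; IFS=.
    set -- $1
    IFS="$old_ifs"
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

template_ip() {
    # Use the first argument when given, else fall back to the default.
    ip="${1:-$DEFAULT_TEMPLATE_IP}"
    if ! is_ipv4 "$ip"; then
        echo "invalid template ip: $ip" >&2
        return 1
    fi
    echo "$ip"
}
```

The script would then start with something like `ip="$(template_ip "$1")" || exit 1`, so another free address can be picked once 125 gets allocated.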

ardumont planned changes to this revision. Jul 24 2019, 6:21 PM
ardumont updated this revision to Diff 5973. Jul 24 2019, 6:22 PM

Remove code that was not supposed to be committed

This revision is now accepted and ready to land. Jul 24 2019, 6:22 PM
ardumont planned changes to this revision. Jul 24 2019, 6:23 PM
ardumont updated this revision to Diff 5985. Jul 25 2019, 4:28 PM

Fix issues with debian 9 by using ssh connection to template

commits:

  • storage.tf: Add docstrings
  • init-template.sh: Use ssh to connect to template vm
This revision is now accepted and ready to land. Jul 25 2019, 4:28 PM
ardumont planned changes to this revision. Jul 25 2019, 4:28 PM
ardumont updated this revision to Diff 5986. Jul 25 2019, 4:34 PM
  • Docs: Improve phrasing sentence
This revision is now accepted and ready to land. Jul 25 2019, 4:34 PM
ardumont planned changes to this revision. Jul 25 2019, 4:37 PM
ardumont updated this revision to Diff 5991. Jul 25 2019, 6:36 PM
  • Use template-debian-9 as default to match production
This revision is now accepted and ready to land. Jul 25 2019, 6:36 PM
ardumont updated this revision to Diff 5992. Jul 25 2019, 7:20 PM
  • init-template: Update necessary steps for debian-9
ardumont planned changes to this revision. Jul 25 2019, 7:20 PM
ardumont edited the summary of this revision. Jul 25 2019, 7:26 PM

Heads up:

terraform apply
ssh root@192.168.100.125 "puppet agent --server pergamon.internal.softwareheritage.org --test --noop --environment=new_staging --waitforcert 60 --certname storage0.internal.staging.swh.network"

worked!

So the next step is to automate that manual step in the proxmox dsl.

ardumont edited the summary of this revision. Jul 25 2019, 7:40 PM
ardumont updated this revision to Diff 5993. Jul 25 2019, 8:17 PM
  • storage: Refactor into variables what will change
This revision is now accepted and ready to land. Jul 25 2019, 8:17 PM
ardumont planned changes to this revision. Jul 25 2019, 8:17 PM
ardumont updated this revision to Diff 5994. Jul 25 2019, 8:26 PM
  • storage: Refactor into variables what will change
  • storage: Delegate to puppet the provisioning
This revision is now accepted and ready to land. Jul 25 2019, 8:26 PM
ardumont added inline comments. Jul 25 2019, 8:33 PM
proxmox/terraform/storage.tf
97 ↗(On Diff #5994)

--noop should go away as we want to actually apply.
That means that we must already know what that node will be used for (but hey, we are calling puppet so that should be obvious ;).

It might be best not to use --certname (but I don't know why yet ;)

ardumont planned changes to this revision. Jul 25 2019, 8:37 PM
ardumont edited the summary of this revision. Jul 26 2019, 10:10 AM
ardumont retitled this revision from "Initialize terraform/proxmox to build vm" to "Provision staging vms through terraform (up to the first puppet run)". Jul 26 2019, 10:18 AM
ardumont edited the summary of this revision.
ardumont updated this revision to Diff 6003. Jul 26 2019, 3:49 PM
ardumont edited the summary of this revision.
  • storage: Update /etc/hosts and do not use the --certname flag
  • storage.tf: Expose the hostname as variable for reuse
  • storage.tf: Use sed to alter /etc/hosts
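
A sed-based /etc/hosts rewrite along these lines could look like the sketch below. It is not the provisioner code from staging.tf: the `set_hosts_fqdn` helper is hypothetical, and it assumes the template ships the usual Debian `127.0.1.1` line, which gets replaced by an `ip fqdn shortname` entry so that the node resolves its own fqdn without `--certname`.

```shell
#!/bin/sh
# Hypothetical helper: point a hosts file at the node's real ip and fqdn.
set_hosts_fqdn() {
    hosts_file="$1"   # /etc/hosts on the new vm
    ip="$2"           # e.g. 192.168.128.2
    short="$3"        # e.g. storage0
    domain="$4"       # e.g. internal.staging.swh.network
    tmp="$(mktemp)"
    # Assumption: the template has a "127.0.1.1 <name>" line (Debian default).
    sed "s/^127\.0\.1\.1[[:space:]].*/$ip $short.$domain $short/" "$hosts_file" > "$tmp" \
        && cat "$tmp" > "$hosts_file"
    rm -f "$tmp"
}
```

Writing through a temporary file keeps the sketch portable across sed implementations (no reliance on `-i`).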
This revision is now accepted and ready to land. Jul 26 2019, 3:49 PM
ardumont planned changes to this revision. Jul 26 2019, 4:01 PM
ardumont updated this revision to Diff 6009. Edited Jul 27 2019, 4:32 PM
  • Rename storage.tf to staging.tf as this describes staging vms
  • staging: Actually apply the first puppet run

(forgot to update yesterday, this should be good to land now).

This revision is now accepted and ready to land. Jul 27 2019, 4:32 PM
douardda requested changes to this revision. Jul 29 2019, 1:31 PM
douardda added a subscriber: douardda.

Looks globally ok but as discussed IRL:

  • we really do not want this hardcoded MAC address,
  • we really want this first staging vm to live in a dedicated /24 instead of prod's one (192.168.100.0/24),
  • it would be nice to check if these declared resources can be "templatized"; having to copy/paste this whole resource declaration for each and every VM we want to instantiate is the promise of a nightmare...
This revision now requires changes to proceed. Jul 29 2019, 1:31 PM
ardumont added a comment. Edited Jul 29 2019, 2:03 PM

we really do not want this hardcoded MAC address,

yup, I forgot to note in a comment that it's a workaround (I think it was a comment on the first version of the diff) for a current limitation/issue of the api client used underneath (api-proxmox-go).

But yeah, should be transparent here.
I'll remove it.

we really want this first staging vm to live in a dedicated /24 instead of prod's one (192.168.100.0/24),

Right

it would be nice to check if these declared resources can be "templatized"; having to copy/paste this whole resource declaration for each and every VM we want to instantiate is the promise of a nightmare...

That was on my todo list for later but sure ;)

ardumont updated this revision to Diff 6033. Jul 30 2019, 6:27 PM
  • staging: Add gateway node
  • staging: Update macaddress to new one

This cannot yet land though (as i need some work on the puppet side to add the
gateway role/profile ... which does not exist yet).

And now, this makes apparent the need to refactor between storage0 and
gateway.

ardumont planned changes to this revision. Jul 30 2019, 6:28 PM
ardumont added inline comments. Jul 30 2019, 7:35 PM
proxmox/terraform/staging.tf
84

should use the ${var.gateway} here (that's the point of the variable which should really be a constant actually)...

ardumont added inline comments. Jul 31 2019, 10:45 AM
proxmox/terraform/staging.tf
84

In the end, no.

But to avoid repeating this elsewhere, i should use ${resource_vm_qemu.storage.network.ip} in the other vm that needs it.

That creates a dependency that terraform knows how to deal with.

So in the end, that variable should go away.

ardumont updated this revision to Diff 6054. Jul 31 2019, 10:53 PM
  • init-template: Update documentation about some more needed steps
  • prepare-workstation: Fix instructions
  • gateway: Instantiate the staging gateway
  • storage0: Use the gateway for that node
  • staging: Work around the puppet agent non standard exit code
  • storage: Push dependency on gateway provisioning
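
For context on the exit code workaround: `puppet agent --test` turns on detailed exit codes, so a run that applied changes exits with 2 (0 means nothing changed; 1, 4 and 6 signal failures), and a provisioner that expects 0 would treat a good first run as an error. A wrapper could map that back to plain success or failure. This is a sketch, not the code in staging.tf; the `puppet_run_succeeded` and `run_puppet_agent` names are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around "puppet agent --test" exit codes.
puppet_run_succeeded() {
    case "$1" in
        0|2) return 0 ;;   # 0: nothing to do, 2: changes applied
        *)   return 1 ;;   # 1, 4, 6 or anything else: treat as failure
    esac
}

run_puppet_agent() {
    # Run the given command and translate its exit code.
    code=0
    "$@" || code=$?
    puppet_run_succeeded "$code"
}
```

It would be used as `run_puppet_agent puppet agent --server pergamon.internal.softwareheritage.org --test --environment=new_staging --waitforcert 60`.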
ardumont added inline comments. Jul 31 2019, 10:55 PM
proxmox/terraform/staging.tf
84

well, yes, i agree.
But it seems that's not exposed by the plugin so no.
I'll keep the variable.

ardumont edited the summary of this revision. Jul 31 2019, 10:58 PM
ardumont edited the summary of this revision. Jul 31 2019, 11:05 PM
ardumont edited the test plan for this revision. Jul 31 2019, 11:33 PM

we really want this first staging vm to live in a dedicated /24 instead of prod's one (192.168.100.0/24),

done.
It now lives in 192.168.128.0/24.

we really do not want this hardcoded MAC address,

yes, as explained, it's a current bug in the api-proxmox-go used underneath (well, I think it is; I need to double check).
I need some time to dig further.
I'll open an issue upstream eventually (with a fix if possible).

it would be nice to check if these declared resources can be "templatized"; having to copy/paste this whole resource declaration for each and every VM we want to instantiate is the promise of a nightmare...

It seems to be possible using a terraform module.
I'll investigate this and open that refactoring in another diff.

For me, it's now ready to land.
(I'll possibly need to rework the commits but no blocker).

I'll rework the git history and split the diffs in multiple smaller ones.

ardumont abandoned this revision. Aug 1 2019, 9:46 AM
ardumont edited the summary of this revision.
ardumont edited the summary of this revision.