Initial Rack Setup

Getting Started

In the first phase of rack configuration, you run the Rack Setup Service (RSS) interactively from either an Oxide-provided laptop or a secured jumpbox connected to one of the technician ports on the rack switches. With the assistance of Oxide technicians, you will locate the technician port IP addresses and interface names.

Important
The instructions below assume you start by connecting to Switch 0. You may connect to the other switch instead and adapt the command endpoint to match the switch you are using.

First, SSH into the wicket captive shell on one of the Switch 0 technician ports:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME}

This should give you an Oxide splash screen and land you in wicket showing a graphical display of the rack.
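
Because the same destination is reused by every later command, it can be convenient to build it once. The following is a minimal sketch: the address and interface values are placeholders (assumptions, not values from a real rack), and `wicket_target` is a hypothetical helper name.

```shell
# Placeholder values (assumptions): substitute the technician port address
# and the local interface name identified with the Oxide technicians.
IP_ADDRESS="fe80::1"
INTERFACE_NAME="eth0"

# Hypothetical helper: print the ssh destination used throughout this guide,
# in the form wicket@<address>%<interface>.
wicket_target() {
  printf 'wicket@%s%%%s\n' "$IP_ADDRESS" "$INTERFACE_NAME"
}

# Open the interactive wicket UI (commented out so the sketch is safe to source):
# ssh "$(wicket_target)"
```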

Overview

Since the rack does not have network access at this point in the process, the RSS covers only the minimum configuration necessary to:

  • Validate component end-to-end connectivity

  • Update rack software to the latest versions, if necessary

  • Upload an SSL certificate

  • Set up recovery account credentials

  • Configure basic networking such as upstream DNS, NTP, VLAN and routing information

Validate component connectivity

The RSS will communicate with the Management Gateway Service (MGS) to retrieve information about sled position and identity from the Service Processors (SP) in the Gimlet on each sled.

To view the sled and switch information:

  1. On the left pane of the wicket UI, you can use the up and down arrows (or j/k a la vim) to select a screen.

  2. Select OVERVIEW and press Tab to move focus into the rack. For every sled displayed, you can press Enter to see its details.

  3. On the sled detail screen, you can use left and right arrows (or h/l) to move left and right.

[Figure: Sled Details]

Confirm that the number of sleds with Ignition information matches the expected count (16 for a half-rack, 32 for a full rack), and that the two switches and PSC are all powered on.

Update rack software

The steps below assume that the Oxide software binaries are already downloaded to the technician laptop or copied into the secured jumpbox via a proxy server.

Upload software to the rack

Close wicket via Ctrl-C. Upload the software image zip file by executing the following command:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} upload < tuf-mupdate.zip

Once the upload is complete, you will see a "successfully uploaded repository to wicketd" message and you will be returned to your shell. Next, ssh back into wicket:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME}

Execute sled updates

You initiate the software update one sled at a time; each update should take about 20 minutes. The updates run in parallel, so you do not need to wait for one sled to finish before starting the next.

To initiate a sled update:

  1. On the left pane of the wicket UI, select UPDATE and press Tab to move focus into the rack.

     [Figure: Update Status]

  2. Arrow down to the target sled. You can press the right/left arrows to expand or collapse the short list of versions (there should be only one version available during the first rack install).

  3. Press Enter and this should take you to another pane with the versions listed at the top, and the bottom should say "Update ready: press Ctrl-U to start".

  4. Press Ctrl-U, then press Y on the popup to confirm you want to start the update. The bottom pane will be replaced by a list of steps that will be performed.

[Figure: Update Status Details]

At any time, you can move up and down the list (via up/down/j/k) and press Enter to see details about a step. The sleds are rebooted automatically after the update. Here is an example of the update step details:

[Figure: Update Steps]

Note
Wicket will refuse to update the sled attached to the switch that the laptop is plugged into. This is because wicket needs a live connection to the rack and cannot update itself. The section "Update Switch 0 and its adjacent sled" describes how to update this sled.

Execute Switch 1 and PSC updates

The following steps can be done in parallel with the sled updates above and should take no more than 10 minutes each:

  1. Select SWITCH 1 (the switch that you are not connected to through the current technician port) and initiate an update the same way as for the sleds.

  2. Select PSC 0 and initiate an update as well.

Update Switch 0 and its adjacent sled

Once the sled, Switch 1, and PSC updates started above have completed successfully, exit wicket and disconnect from Switch 0.

Next, connect to one of the technician ports on Switch 1. SSH into wicket and select the sled that was previously excluded from the update. Follow the same steps to initiate a software update for this sled.

After the sled comes back up, select SWITCH 0 and initiate an update. Upon completion, the rack should have the latest software on all components and be ready for configuration and setup.

Configure Rack Settings

On the left pane of the wicket UI, select RACK SETUP. The current rack status displayed on the right pane will be "Uninitialized" at this point.

While keeping the wicket UI open, start another terminal session and ssh into the setup command shell:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup

This should bring up the list of available subcommands:

Usage: wicket setup [OPTIONS] <COMMAND>

Commands:
  get-config        Get the current rack configuration as a TOML template
  set-config        Set the current rack configuration from a filled-in TOML template
  reset-config      Reset the configuration to its original (empty) state
  set-password      Set the password for the recovery user of the recovery silo
  set-bgp-auth-key  Set one or more BGP authentication keys
  upload-cert       Upload a certificate chain
  upload-key        Upload the private key of a certificate chain
  help              Print this message or the help of the given subcommand(s)

In this second terminal window, you will make use of the commands above to enter or upload the necessary rack configurations.
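
Since every subcommand in this section is invoked through the same ssh prefix, a small wrapper can reduce typing mistakes. This is only a sketch: `wicket_setup` is a hypothetical helper name, and it assumes `IP_ADDRESS` and `INTERFACE_NAME` are already set in your shell.

```shell
# Hypothetical wrapper around "ssh wicket@... setup <subcommand>".
# stdin and stdout pass through, so redirections work as in the plain commands.
wicket_setup() {
  if [ -z "${IP_ADDRESS:-}" ] || [ -z "${INTERFACE_NAME:-}" ]; then
    echo "error: set IP_ADDRESS and INTERFACE_NAME first" >&2
    return 1
  fi
  ssh "wicket@${IP_ADDRESS}%${INTERFACE_NAME}" setup "$@"
}

# Examples (commented out; they require a connection to the rack):
# wicket_setup get-config > rack.toml
# wicket_setup set-config < rack.toml
```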

Upload SSL certificate

The Oxide Console and API are hosted under a domain name controlled by your organization. In this step, you will upload the certificate and key files that correspond to the subdomain delegated to the Oxide Rack.

Execute the upload-cert subcommand to import the SSL certificate chain file:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup upload-cert < ${CERT_CHAIN}.pem

and then upload-key to import the key file:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup upload-key < ${CERT_KEY}.pem
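
Before uploading, it may be worth confirming locally that the private key actually belongs to the certificate. The sketch below uses the standard `openssl` command-line tool to compare public keys; `cert_matches_key` is a hypothetical helper name, and the check assumes the leaf certificate is the first entry in the chain file.

```shell
# Succeeds only if the first certificate in <chain.pem> carries the same
# public key as the private key in <key.pem>.
# Usage: cert_matches_key <chain.pem> <key.pem>
cert_matches_key() {
  # Public key embedded in the (first) certificate of the chain file.
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
  # Public key derived from the private key file.
  key_pub=$(openssl pkey -in "$2" -pubout) || return 1
  [ "$cert_pub" = "$key_pub" ]
}

# Example (file names are placeholders):
# cert_matches_key chain.pem key.pem || echo "certificate and key do not match" >&2
```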

Set Recovery User Password

The RSS will create a built-in silo for setup and recovery purposes. This is an ordinary silo backed by the local-only identity provider, with a system user named "recovery". This user has the privileges to create other silos and modify mutable pieces of their identity provider configuration.

Execute the following subcommand to enter the password for the recovery user:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup set-password

Configure basic networking

In this step, you will configure the endpoints of boundary services that integrate with the Oxide Rack. You will supply the information as a text file in TOML format.

To begin the configuration, retrieve the TOML template:

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup get-config > rack.toml

The content of the file should look like this:

# Delegated external DNS zone name
#
# The rack provides separate external API and console endpoints for each Silo.
# These are named `$silo_name.sys.$external_dns_zone_name`.  For a Silo called
# "eng" with delegated domain "oxide.example", the API would be accessible at
# "eng.sys.oxide.example".  The rack runs external DNS servers that serve A/AAAA
# records for these DNS names.
external_dns_zone_name = ""

# IP addresses for authoritative external DNS servers operated by the rack for
# the DNS domain delegated to the rack by the customer. Each of these addresses
# must be contained in one of the "internal services" IP Pool ranges listed
# below.
external_dns_ips = [
]

# External NTP servers; e.g., "ntp.eng.oxide.computer".
ntp_servers = [
]

# External DNS server IP Addresses; e.g., "1.1.1.1", "9.9.9.9".
dns_servers = [
]

# Ranges of the service IP pool which may be used for internal services.
#
# Elements of this list should be of the form:
#
#    { first = "first_ip", last = "last_ip" }
#
# where `last_ip` is equal to or higher than `first_ip`; e.g.,
#
#    { first = "172.20.26.1", last = "172.20.26.10" }
internal_services_ip_pool_ranges = [
]

# List of sleds to initialize.
#
# Confirm this list contains all expected sleds before continuing!
bootstrap_sleds = [
   # (the list of sleds auto-discovered from the rack will be displayed here)
]

# Allowlist of source IPs that can make requests to user-facing services.
#
# Use the key:
#
# allow = "any"
#
# to indicate any external IPs are allowed to make requests. This is the default.
#
# Use the below two lines to only allow requests from the specified IP subnets.
# Requests from any other source IPs are refused. Note that individual addresses
# must include the netmask, e.g., "1.2.3.4/32".
#
# allow = "list"
# ips = [ "1.2.3.4/5", "5.6.7.8/10" ]
[allowed_source_ips]
allow = "any"

# network config
[rack_network_config]
infra_ip_first = ""
infra_ip_last = ""

# A table of ports to initialize on the rack. The keys are the switch (switch0,
# switch1) and the port name (qsfp0, qsfp1, etc). Copy and paste this section
# for each port.

[rack_network_config.switch0.qsfp0]

    # Routes associated with this port.
    # { nexthop = "1.2.3.4", destination = "0.0.0.0/0" }
    routes = []

    # Addresses associated with this port.
    # "1.2.3.4/24"
    addresses = []

    # `speed40_g`, `speed100_g`, ...
    uplink_port_speed = ""

    # `none`, `firecode`, or `rs`
    uplink_port_fec = ""

    # Whether or not to set autonegotiation: `true` or `false`
    autoneg = false

    # A list of BGP peers for this port. Copy this section, changing the port name
    # as desired. Remove if not needed.
    [[rack_network_config.switch0.qsfp0.bgp_peers]]

        # The autonomous system number (required). This must match one of the `asn`
        # values in the `[[rack_network_config.bgp]]` section.
        asn = 0

        # The switch port the peer is reachable on (required).
        port = ""

        # The IPv4 address of the peer (required): e.g. 1.2.3.4.
        addr = ""

        # How long to keep a session alive without a keepalive, in seconds.
        hold_time = 6

        # How long to keep a peer in idle after a state machine reset, in seconds.
        idle_hold_time = 3

        # How long to delay sending open messages to a peer, in seconds.
        delay_open = 0

        # The interval in seconds between peer connection retry attempts.
        connect_retry = 3

        # The interval to send keepalive messages at, in seconds.
        keepalive = 2

        # Require that a peer has a specified ASN (optional).
        # remote_asn = 0

        # Require messages from a peer have a minimum IP time to live field (optional).
        # min_ttl = 0

        # If BGP authentication is desired, a key identifier. Multiple peers
        # can share the same key ID, if desired.
        #
        # The actual keys are provided via `wicket setup set-bgp-auth-key`.
        # Currently, only TCP-MD5 authentication is supported.
        # auth_key_id = "key1"

        # Apply the provided multi-exit discriminator (MED) for updates sent to the
        # peer (optional).
        # multi_exit_discriminator = 0

        # Include the provided communities in updates sent to the peer (optional).
        # communities = [28, 47]

        # Apply a local preference to routes sent to the peer (optional).
        # local_pref = 0

        # Enforce that the first AS in paths received from the peer is the
        # peer's AS.
        enforce_first_as = false

        # Apply import policy to this peer with an allowlist of prefixes
        # (optional). Defaults to allowing all prefixes. Use an empty list to
        # indicate that no prefixes are allowed.
        # allowed_import = ["224.0.0.0/8"]

        # Apply export policy to this peer with an allowlist of prefixes
        # (optional). Defaults to allowing all prefixes. Use an empty list to
        # indicate that no prefixes are allowed.
        # allowed_export = []

        # Associate a VLAN ID with this BGP session (optional).
        # vlan_id = 0

[rack_network_config.switch1]

# Optional BGP configuration, as a list of entries. Duplicate or remove this
# section as needed.
[[rack_network_config.bgp]]

# The autonomous system number.
asn = 0

# Prefixes to originate e.g., ["10.0.0.0/16"].
originate = []

Important
The internal_services_ip_pool_ranges are used for Control Plane DNS and API services. The pool range(s) must cover 16 or more IP addresses.
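
The 16-address minimum can be checked with simple arithmetic before uploading the file. The sketch below is one way to do it for IPv4 ranges; the function names are hypothetical, and it assumes well-formed dotted-quad addresses without leading zeros and a shell with 64-bit arithmetic. Note that the template's own example range, 172.20.26.1 to 172.20.26.10, covers only 10 addresses and would not be enough on its own.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( $o1 * 16777216 + $o2 * 65536 + $o3 * 256 + $o4 ))
}

# Number of addresses in an inclusive first..last range.
range_size() {
  echo $(( $(ip_to_int "$2") - $(ip_to_int "$1") + 1 ))
}

# Takes "first last" pairs as arguments, sums the sizes of all ranges, and
# succeeds only if the total is at least 16.
pool_is_large_enough() {
  total=0
  while [ "$#" -ge 2 ]; do
    total=$(( total + $(range_size "$1" "$2") ))
    shift 2
  done
  [ "$total" -ge 16 ]
}

# Example:
# pool_is_large_enough 172.20.26.1 172.20.26.10 || echo "pool too small" >&2
```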

Use a text editor such as vim to edit the TOML file. Once you have finished entering the configuration data, upload the file via

ssh wicket@${IP_ADDRESS}%${INTERFACE_NAME} setup set-config < rack.toml
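
Before running set-config, a quick scan for values still left at the template's empty-string placeholder can catch fields that were overlooked. This is a rough sketch using `grep`; `unfilled_fields` is a hypothetical helper name. It does not detect empty lists (`[]`), and it will also report fields in optional sections that you may intend to remove rather than fill in.

```shell
# Print, with line numbers, every key whose value is still "" in the given
# file. Comment lines are not reported because they begin with '#'.
# Exit status is 0 if at least one unfilled field was found, 1 if none.
unfilled_fields() {
  grep -nE '^[[:space:]]*[A-Za-z0-9_]+[[:space:]]*=[[:space:]]*""[[:space:]]*$' "$1"
}

# Example:
# if unfilled_fields rack.toml; then
#   echo "the fields listed above are still empty" >&2
# fi
```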

The wicket UI should refresh automatically to show the uploaded configuration. If everything looks correct, proceed to the next step; otherwise, revise the configuration with reset-config, get-config, and set-config as needed.

To continue with rack setup, press Ctrl-Alt-K in the wicket UI. The process may take 30 minutes or longer. Once initialization has completed, the current rack status changes to "Initialized". Here is an example of the final state:

[Figure: Rack Setup Status]

If you find misconfigurations after the rack has been initialized, you can reset the rack with Ctrl-R Ctrl-R, which removes the control plane and any VM instances, and then repeat rack initialization after correcting the settings.

Note
As part of rack initialization, RSS instructs sled-agents to generate the rack secret, split it into shares, and distribute the encrypted shares to different sleds over TCP links on the bootstrap network. The rack secret is used by the storage encryption scheme. After rack initialization, whenever a sled boots, it must recover a minimum number of shares of the rack secret (the "trust quorum" threshold) to reconstruct the rack secret and unlock its local storage. Once the unlock process completes, the rack secret is securely erased from memory.

Next: Log in to the web console to complete the rest of the rack configuration.
