Important Notes

  1. The Oxide CLI, Go SDK, and Terraform Provider have been updated to support the API enhancements described under New Features. Please upgrade these tools to their latest versions.

Installation

Oxide Computer Model 0 must be installed and configured under the guidance of Oxide technicians. This requirement may change in future releases.

Upgrade Compatibility

Upgrade from version 12 is supported. We recommend shutting down all running instances on the rack before the software update begins. Any instances that are not stopped before the update are transitioned to the failed state when the control plane comes back up. They can be configured to start automatically through an auto-restart policy, or they can be started manually by the user.

All existing setup and data (e.g., projects, users, instances) remain intact after the software update.

New Features

AMD Performance Counters

In this release, we have made CPU performance counters available for monitoring the behavior of a process or family of processes running on the system. You can consume the statistics through the perf package available in most Linux guests:

ubuntu@primary:~$ sudo perf stat -I 3000
# time counts unit events
3.007459118 12027.92 msec cpu-clock # 4.009 CPUs utilized
3.007459118 15447 context-switches # 0.001 M/sec
3.007459118 56 cpu-migrations # 0.005 K/sec
3.007459118 0 page-faults # 0.000 K/sec
3.007459118 6708280958 cycles # 0.558 GHz (50.02%)
3.007459118 0 stalled-cycles-frontend (50.02%)
3.007459118 0 stalled-cycles-backend # 0.00% backend cycles idle (50.13%)
3.007459118 3618134011 instructions # 0.54 insn per cycle (50.02%)
3.007459118 691518275 branches # 57.493 M/sec (50.02%)
3.007459118 35098092 branch-misses # 5.08% of all branches (49.91%)
...
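The derived figures in the right-hand column can be recomputed from the raw counts. The following is a minimal sketch, assuming the interval output has been captured as text in the single-space layout shown above; the regular expression and the helper name parse_counts are illustrative and not part of perf:

```python
import re

# Sample lines copied from the perf output above.
SAMPLE = """\
3.007459118 6708280958 cycles # 0.558 GHz (50.02%)
3.007459118 3618134011 instructions # 0.54 insn per cycle (50.02%)
3.007459118 691518275 branches # 57.493 M/sec (50.02%)
3.007459118 35098092 branch-misses # 5.08% of all branches (49.91%)
"""

# Matches: timestamp, count, optional "msec" unit, event name.
# Anything after the event name (the "#" annotation) is ignored.
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\d+(?:\.\d+)?)\s+(?:msec\s+)?([\w-]+)")


def parse_counts(text):
    """Return a mapping of event name to raw count for one interval."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(3)] = float(match.group(2))
    return counts


counts = parse_counts(SAMPLE)
ipc = counts["instructions"] / counts["cycles"]
miss_rate = counts["branch-misses"] / counts["branches"]
print(f"instructions per cycle: {ipc:.2f}")
print(f"branch miss rate: {miss_rate:.2%}")
```

Real perf output may use different column spacing or locale-specific digit grouping, so the pattern would need adjusting before use on live data.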

Performance and Power Management

In this release, we have enabled additional hardware power saving features in circumstances where processor cores are not being used. These power saving features also enable higher CPU frequencies for low thread count workloads, with single-threaded CPU-bound processes seeing as much as an 18% performance improvement on sleds with few competing workloads.

Link Layer Discovery Protocol (LLDP) Support

Starting with v13, fleet administrators can configure LLDP in the rack switch link settings to enable device discovery and ease network management. The new API endpoints for managing and viewing LLDP configurations are:

  • view/update config: /v1/system/hardware/switch-port/{port}/lldp/config

  • list neighbors: /v1/system/hardware/rack-switch-port/{rack_id}/{switch_location}/{port}/lldp/neighbors

You may also use the CLI for making LLDP requests. Please see the CLI docs for more information.
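As a rough illustration of how the two endpoint paths are filled in, the sketch below builds the URLs and prepares requests without sending them. The host name, token, port name, switch location, and rack ID are hypothetical placeholders, and the use of GET with a bearer token is an assumption; these notes do not list the HTTP methods or any required query parameters, so consult the API reference before relying on this:

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values for illustration only.
BASE_URL = "https://oxide.example.com"
API_TOKEN = "example-token"


def lldp_config_path(port):
    """Path for viewing or updating the LLDP config of a switch port."""
    return f"/v1/system/hardware/switch-port/{quote(port, safe='')}/lldp/config"


def lldp_neighbors_path(rack_id, switch_location, port):
    """Path for listing LLDP neighbors seen on a rack switch port."""
    segments = "/".join(
        quote(part, safe="") for part in (rack_id, switch_location, port)
    )
    return f"/v1/system/hardware/rack-switch-port/{segments}/lldp/neighbors"


def build_get(path):
    """Build (but do not send) a GET request; bearer auth is an assumption."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    return Request(BASE_URL + path, headers=headers, method="GET")


config_req = build_get(lldp_config_path("qsfp0"))
neighbors_req = build_get(
    lldp_neighbors_path("rack-id-placeholder", "switch0", "qsfp0")
)
print(config_req.full_url)
print(neighbors_req.full_url)
```

Percent-encoding each path segment keeps an unexpected character in a port or rack identifier from changing the shape of the URL.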

Web console

The auto-restart policy for failed instances can now be managed through the web console.

Full console changelog

Bug fixes and other enhancements

  • Upstairs write stats were not sent to oximeter (crucible#1615)

  • HTTP 500 errors were returned when creating multiple VPC subnets in parallel (omicron#7404)

  • Oxide telemetry now includes management network data link and switch port control data metrics for more operational visibility (omicron#6918)

  • Allow read-only activation with fewer than three downstairs to facilitate disk repair during maintenance (crucible#1608)

  • External DNS now supports more than 200 silos and TLS certificates (omicron#7291)

  • Various sled expungement bug fixes and supporting tool improvements

Firmware update

  • Rack firmware now supports Milan LRDIMMs that provide up to 2 TiB DRAM per sled (128 GiB x 16 channels); note that the 2 TiB sled configuration is only available with new hardware purchases and cannot be applied as an in-place upgrade to existing sleds

  • No third-party firmware change
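The 2 TiB figure follows directly from the DIMM size and channel count quoted above. A quick check, assuming one 128 GiB DIMM per channel (the variable names are illustrative):

```python
GIB_PER_TIB = 1024

dimm_size_gib = 128   # per-DIMM capacity quoted in the note above
channels = 16         # memory channels per sled quoted in the note above

total_gib = dimm_size_gib * channels
total_tib = total_gib / GIB_PER_TIB
print(f"{total_gib} GiB = {total_tib:g} TiB per sled")
```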

Known Behavior and Limitations

End-user features

Feature Area | Known Issue/Limitation | Issue Number
Image/snapshot management | Disks in importing_from_bulk_writes state cannot be deleted directly. The procedure for unsticking a canceled disk import can be used as a workaround. |
Image/snapshot management | Image upload sometimes stalls with HTTP/2 on Firefox. |
Image/snapshot management | The ability to modify image metadata is not available at this time. |
Instance orchestration | Instance hostname validation has been strengthened. Instances with a now-invalid hostname will fail to start, though they can still be listed and viewed. If the disks attached to such an instance need to be kept, they can be detached and re-attached to a new instance; the invalid instance can then be deleted. |
Instance orchestration | Instances fail to start when one of the switch zones is unavailable. |
Instance performance | The tsc clocksource is treated as unreliable by the guest, which then falls back to substantially slower timestamp syscalls. A workaround for this issue can be found in the Troubleshooting Guide. |
VPC internet gateway | Changing a silo’s default IP pool causes some instances to lose their outbound internet access. This is due to a mismatch between the pool containing the instances' external IPs (which are allocated from the new default pool) and the pool attached to the system-created internet gateways (which are linked to the old pool at creation time). See the Troubleshooting Guide for some possible options for restoring instance outbound connectivity. |
VPC routing | Subnet update clears the custom router ID when the field is left out of the request body. |
VPC routing | Network interface update clears transit IPs when the field is left out of the request body. | -
Telemetry | VM instance memory utilization and VPC network/firewall metrics are unavailable at this time. | -
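For the two VPC routing limitations, one possible client-side precaution is to read the current resource first and copy the affected field into every update body, so that it is never left out. The sketch below shows only the merging step on plain dictionaries. The helper build_update_body is hypothetical, and the field names custom_router and transit_ips and the sample values are assumptions for illustration, not confirmed by these notes:

```python
def build_update_body(current, changes, preserve):
    """Return an update body that carries over fields the caller did not set.

    current:  the resource as last read from the API
    changes:  the fields the caller actually wants to modify
    preserve: field names that would be cleared if omitted from the body
    """
    body = dict(changes)
    for field in preserve:
        # Only copy a field when the caller has not set it explicitly.
        if field not in body and field in current:
            body[field] = current[field]
    return body


# Hypothetical subnet as read from the API (field names are assumptions).
current_subnet = {
    "name": "web",
    "description": "front-end subnet",
    "custom_router": "example-router-id",
}

body = build_update_body(
    current_subnet,
    {"description": "front-end subnet (resized)"},
    preserve=("custom_router",),
)
print(body)
```

The same helper could be used for a network interface update by passing preserve=("transit_ips",). An explicit value in changes always wins, so a caller can still clear the field deliberately.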

Operator features

Feature Area | Known Issue/Limitation | Issue Number
Access control | Device tokens do not expire. | omicron#2302
Control plane | Sled and physical storage availability status are not available in the inventory UI and API yet. | omicron#2035
Control plane | The built-in test silo named "default-silo" has resource quotas and should be removed. | omicron#5731
Control plane | Operator-driven software update is currently unavailable. All updates need to be performed by Oxide technicians. | -
Control plane | Operator-driven instance migration across sleds is currently unavailable. | -
Control plane | New instances cannot be created when the total number of NAT entries (private-to-external IP mappings) in the system exceeds 1024. | omicron#6939
User management | User offboarding from the rack is not supported at this time. Apart from updating the identity provider to remove obsolete users from the relevant groups, operators will need to remove any IAM roles granted directly to those users in silos and projects. | omicron#2587