
Attempting to create a VM with more than 32 vcpus brings nexus down #3212

Closed
citrus-it opened this issue May 24, 2023 · 13 comments · Fixed by #3819

@citrus-it
Contributor

I tried to create an instance from the CLI with:

oxide instance create \
        --project ni \
        --description 'Ptang Zoo Boing!' \
        --hostname ekke2 \
        --memory 64G \
        --ncpus 64 \
        --name ekke2

which reported a timeout fairly quickly (around 5 seconds):

error
Communication Error: error sending request for url (http://venus.oxide-preview.com/v1/instances?project=ni): operation timed out

and nexus crashed.

I'll put the relevant log on shared storage somewhere, but here are the final few events that I extracted:

2023-05-24 10:33:10.006Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: received new runtime state from sled agent
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    runtime_state = InstanceRuntimeState { run_state: Failed, sled_id: aa7c82d9-6e59-406e-b1e3-3648890b4bec, propolis_id: e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f, dst_propolis_id: None, propolis_addr: Some([fd00:1122:3344:103::28]:12400), migration_id: None, propolis_gen: Generation(1), ncpus: InstanceCpuCount(64), memory: ByteCount(68719476736), hostname: "ekkke", gen: Generation(3), time_updated: 2023-05-24T10:33:09.987050789Z }
2023-05-24 10:33:10.006Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (dropshot_internal) on oxz_nexus: roles
    actor_id = 001de000-05e4-4000-8000-000000000002
    authenticated = true
    local_addr = [fd00:1122:3344:106::4]:12221
    method = PUT
    remote_addr = [fd00:1122:3344:103::1]:52079
    req_id = 13334114-d055-47ab-a328-b8d8662908c0
    roles = RoleSet { roles: {} }
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213

2023-05-24 10:33:10.049Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: instance updated by sled agent
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    new_state = failed
    propolis_id = e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f

2023-05-24 10:33:10.049Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (dropshot_internal) on oxz_nexus: request completed
    local_addr = [fd00:1122:3344:106::4]:12221
    method = PUT
    remote_addr = [fd00:1122:3344:103::1]:52079
    req_id = 13334114-d055-47ab-a328-b8d8662908c0
    response_code = 204
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213
2023-05-24 10:33:10.050Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: client response
    SledAgent = aa7c82d9-6e59-406e-b1e3-3648890b4bec
    result = Ok(Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv6(fd00:1122:3344:103::1)), port: Some(12345), path: "/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213/state", query: None, fragment: None }, status: 500, headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"} })
2023-05-24 10:33:10.050Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: Handling sled agent instance PUT result
    result = Err(Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "6164bc99-d4dd-4324-b3dd-409682cacc4f" })
2023-05-24 10:33:10.050Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "6164bc99-d4dd-4324-b3dd-409682cacc4f" } from instance_put!
2023-05-24 10:33:10.053Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Ok(false) from setting InstanceState::Failed after bad instance_put
2023-05-24 10:33:10.053Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saga log event
    new_state = N051 failed
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c
2023-05-24 10:33:10.053Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: recording saga event
    event_type = Failed(ActionFailed { source_error: Object {"InternalError": Object {"internal_message": String("Internal Server Error")}} })
    node_id = 51
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
2023-05-24 10:33:10.054Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: update for saga cached state
    new_state = Unwinding
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c
2023-05-24 10:33:10.054Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: updating state
    new_state = unwinding
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
2023-05-24 10:33:10.085Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saga log event
    new_state = N050 undo_started
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c

2023-05-24 10:33:10.678Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: client response
    SledAgent = aa7c82d9-6e59-406e-b1e3-3648890b4bec
    result = Err(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv6(fd00:1122:3344:103::1)), port: Some(12345), path: "/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213", query: None, fragment: None }, source: hyper::Error(IncompleteMessage) })
2023-05-24 10:33:10.678Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: Handling sled agent instance PUT result
    result = Err(Communication Error: error sending request for url (http://[fd00:1122:3344:103::1]:12345/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213): connection closed before message completed)
2023-05-24 10:33:10.678Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Communication Error: error sending request for url (http://[fd00:1122:3344:103::1]:12345/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213): connection closed before message completed from instance_put!
2023-05-24 10:33:10.682Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Ok(true) from setting InstanceState::Failed after bad instance_put
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: action failed', /home/build/.cargo/registry/src/github.com-1ecc6299db9ec823/steno-0.3.1/src/saga_exec.rs:1187:65
@citrus-it
Contributor Author

And here is what is in the logs on the sled that handled this request:

10:33:00.434Z INFO SledAgent (InstanceManager): ensuring instance is registered
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    propolis_id = e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:00.434Z INFO SledAgent (InstanceManager): registering new instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.434Z INFO SledAgent (InstanceManager): Instance::new w/initial HW: InstanceHardware { runtime: InstanceRuntimeState { run_state: Creating, sled_id: aa7c82d9-6e59-406e-b1e3-3648890b4bec, propolis_id: e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f, dst_propolis_id: None, propolis_addr: Some([fd00:1122:3344:103::28]:12400), migration_id: None, propolis_gen: Generation(1), ncpus: InstanceCpuCount(64), memory: ByteCount(68719476736), hostname: "ekkke", gen: Generation(1), time_updated: 2023-05-24T10:32:56.175817Z }, nics: [NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], source_nat: SourceNatConfig { ip: 172.20.26.17, first_port: 32768, last_port: 49151 }, external_ips: [], firewall_rules: [VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: None, filter_ports: None, filter_protocols: Some([Icmp]), action: Allow, priority: VpcFirewallRulePriority(65534) }, VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 
8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: Some([Vpc(Vni(1922673)), Vpc(Vni(1922673)), Vpc(Vni(1922673)), Vpc(Vni(1922673))]), filter_ports: None, filter_protocols: None, action: Allow, priority: VpcFirewallRulePriority(65534) }, VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: None, filter_ports: Some([L4PortRange { first: L4Port(22), last: L4Port(22) }]), filter_protocols: Some([Tcp]), action: Allow, priority: VpcFirewallRulePriority(65534) }], disks: [], cloud_init_bytes: Some("") }
10:33:00.434Z INFO SledAgent (dropshot (SledAgent)): request completed
    local_addr = [fd00:1122:3344:103::1]:12345
    method = PUT
    remote_addr = [fd00:1122:3344:106::4]:60126
    req_id = 9cfa20d5-441d-4284-a366-2756f63771e4
    response_code = 200
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.536Z INFO SledAgent (InstanceManager): Configuring new Omicron zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.564Z INFO SledAgent (InstanceManager): Installing Omicron zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:02.865Z INFO SledAgent (InstanceManager): Zone booting
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    zone = oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.020Z INFO SledAgent (InstanceManager): Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:103::28, prefix: 64 }))
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    zone = oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.603Z INFO SledAgent (InstanceManager): Created address fd00:1122:3344:103::28/64 for zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.635Z INFO SledAgent (InstanceManager): Adding service
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    smf_name = svc:/system/illumos/propolis-server:vm-e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.672Z INFO SledAgent (InstanceManager): Adding service property group 'config'
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.709Z INFO SledAgent (InstanceManager): Setting server address property
    address = [fd00:1122:3344:103::28]:12400
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.750Z INFO SledAgent (InstanceManager): Setting metric address property address [fd00:1122:3344:106::4]:12221
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.788Z INFO SledAgent (InstanceManager): Refreshing instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.826Z INFO SledAgent (InstanceManager): Enabling instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.866Z INFO SledAgent (InstanceManager): Started propolis in zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.943Z INFO SledAgent (InstanceManager): Sending ensure request to propolis: InstanceEnsureRequest { cloud_init_bytes: Some(""), disks: [], migrate: None, nics: [NetworkInterfaceRequest { name: "vopte5", slot: Slot(0) }], properties: InstanceProperties { bootrom_id: 00000000-0000-0000-0000-000000000000, description: "Test description", id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213, image_id: 00000000-0000-0000-0000-000000000000, memory: 65536, name: "ekkke", vcpus: 64 } }
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.968Z INFO SledAgent (InstanceManager): result of instance_ensure call is Err(Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "53f54795-e4d9-45b7-893a-63cb868dfc47", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "53f54795-e4d9-45b7-893a-63cb868dfc47" })
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:10.050Z INFO SledAgent (dropshot (SledAgent)): request completed
    error_message_external = Internal Server Error
    error_message_internal = Error managing instances: Instance error: Failure from Propolis Client: Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "53f54795-e4d9-45b7-893a-63cb868dfc47", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "53f54795-e4d9-45b7-893a-63cb868dfc47" }
    local_addr = [fd00:1122:3344:103::1]:12345
    method = PUT
    remote_addr = [fd00:1122:3344:106::4]:55245
    req_id = 6164bc99-d4dd-4324-b3dd-409682cacc4f
    response_code = 500
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213/state
10:33:10.124Z WARN SledAgent (InstanceManager): Halting and removing zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213

@citrus-it changed the title from "nexus saga unwrap failure" to "Attempting to create a VM with more than 32 vcpus brings nexus down" on May 24, 2023
@citrus-it
Contributor Author

Through some shenanigans, I managed to capture the propolis error:

error_message_external: Internal Server Error, error_message_internal: failed to create instance: maxcpu out of range, response_code: 500, uri: /instance, method: PUT

There's currently a limit of 32 vcpus per instance, and propolis returns a 500 error if asked for more.
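
Roughly, the check involved amounts to something like this (a minimal sketch; the constant and function names are illustrative, not propolis's actual code — the real cap comes from bhyve's VM_MAXCPU on illumos):

const MAX_CPU: u8 = 32; // mirrors bhyve's VM_MAXCPU

fn validate_vcpus(vcpus: u8) -> Result<(), String> {
    if vcpus == 0 || vcpus > MAX_CPU {
        // Today this detail surfaces as a 500 ("maxcpu out of range")
        // rather than a 400 carrying the message back to the caller.
        return Err(format!("maxcpu out of range: {} (max {})", vcpus, MAX_CPU));
    }
    Ok(())
}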

@citrus-it
Contributor Author

There are two issues here that might need splitting up:

  • If propolis returns a 500 in response to an InstanceEnsureRequest, we unwind to death. This is presumably true for any 500 response to such a request.
  • The useful part of the error from propolis is neither recorded nor propagated.

@citrus-it added this to the FCS milestone on May 24, 2023
@davepacheco
Collaborator

The crash seems likely to be oxidecomputer/steno#26.

Re: propagating the error message: If you use dropshot::HttpError::for_internal_error(internal_message) (which I think is what's happening here), you get a 500 where Dropshot sends a generic "internal server error" message to the client. But you can construct your own HttpError for a 500 where you provide whatever internal and external messages you want. The default behavior is aimed at situations where the human on the other end is not expected to know anything about the server's implementation details (as in the case of the external API). For our own internal stuff I think there'd be no harm in exposing the internal messages to clients.
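
For illustration, a minimal sketch of the two approaches (field and constructor names follow dropshot's HttpError as I recall it at the time of this issue; treat it as a sketch, not a verbatim excerpt):

use dropshot::HttpError;
use http::StatusCode;

// A 500 that keeps the detail visible to the (internal) client, rather
// than the generic message produced by HttpError::for_internal_error.
fn instance_ensure_error(detail: String) -> HttpError {
    HttpError {
        status_code: StatusCode::INTERNAL_SERVER_ERROR,
        error_code: Some("Internal".to_string()),
        internal_message: detail.clone(), // logged server-side
        external_message: detail,         // returned to the caller
    }
}

// Or, for input validation like the vcpu cap, a 400 whose message
// reaches the client:
fn vcpu_out_of_range(detail: String) -> HttpError {
    HttpError::for_bad_request(None, detail)
}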

I think there are a few other issues here:

  • It seems like this should be a 400 error from propolis-server. I think it doesn't hugely matter, because it's not actionable either way, but it helps debugging when the error code better reflects the problem. On the plus side, if this were a 400, the error message would have been sent back to Nexus.
  • I assume we also want to avoid having Nexus get this far? If Propolis has a cap of 32 vcpus, shouldn't Nexus impose that cap on VMs as well? (A sketch of such a check follows this list.)
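
A minimal sketch of what such a Nexus-side pre-check could look like (all names here are hypothetical, not Nexus's actual code):

use dropshot::HttpError;

const MAX_VCPU_PER_INSTANCE: u16 = 32; // hypothetical cap, per Propolis

fn check_instance_create(ncpus: u16) -> Result<(), HttpError> {
    if ncpus > MAX_VCPU_PER_INSTANCE {
        // Fail fast with a 400 before the create saga is even started.
        return Err(HttpError::for_bad_request(
            None,
            format!(
                "cannot provision instance with {} vcpus: maximum is {}",
                ncpus, MAX_VCPU_PER_INSTANCE
            ),
        ));
    }
    Ok(())
}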

@citrus-it
Contributor Author

  • I assume we also want to avoid having Nexus get this far? If Propolis has a cap of 32 vcpus, shouldn't Nexus impose that cap on VMs as well?

Propolis is the source of truth on the current vcpu cap, which actually comes from illumos bhyve. This cap may be changed or relaxed in the future (there is work in upstream FreeBSD bhyve around this). It feels cleaner to me to have this checked in the one place that knows the limit.

@davepacheco
Collaborator

That makes sense. But it seems like more work would be needed to raise that cap. If the cap varies across sleds, wouldn't we want Nexus to take this into account when choosing which sled to use for a provision? Just thinking out loud: Propolis could remain the source of truth. And we could propagate the cap outside of Propolis and into CockroachDB. Then Nexus could take this into account when selecting a sled for a new provision. If no sleds could possibly satisfy it, we could fail the request without even creating the saga. I think this would be a better user experience for the case where someone just inputs a number higher than we support.
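
Purely illustrative of that idea — if sleds advertised a per-sled cap (say, via inventory stored in CockroachDB), placement could filter on it; every name here is hypothetical:

struct SledCandidate {
    id: String,                    // sled identifier
    max_vcpus_per_instance: u16,   // cap advertised by this sled
}

fn eligible_sleds(
    candidates: Vec<SledCandidate>,
    requested_vcpus: u16,
) -> Vec<SledCandidate> {
    candidates
        .into_iter()
        .filter(|s| s.max_vcpus_per_instance >= requested_vcpus)
        .collect()
}

// If the filtered list is empty, the request can fail with a 400 up
// front, before any saga is created.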

@pfmooney
Contributor

pfmooney commented May 24, 2023

Just thinking out loud: Propolis could remain the source of truth. And we could propagate the cap outside of Propolis and into CockroachDB.

I'd probably just encode the limit into sled-agent for now, rather than propolis, so the cap can be communicated into nexus w/o requiring a propolis instance/zone to exist first. Once we get around to lifting that arbitrary 32-vcpu limit in bhyve, then we can make sled-agent aware of its dynamic nature, and the rest would fall out (assuming logic for handling differing limits was built into the control plane at that point).

I'm sorry that the VM_MAXCPU limit hasn't been lifted yet. There have been more pressing issues, and we'll probably need to look at how we're doing vCPU scheduling in the OS before we get too wild with VM sizing.

@askfongjojo

Will apply validations at the API level for FCS: max 32 vCPUs and 64 GB of DRAM.

@zephraph
Contributor

The mitigation for this landed in #3574

@askfongjojo

askfongjojo commented Jul 13, 2023

@zephraph brought up the need for a disk size limit. This is what I propose:

  • VMware's VMDK size limit is 2 TB minus 512 B for ESXi 5.0 and 5.1, and 62 TB for ESXi 5.5 and later.
  • We can probably set the max to 1 TB (updated per Alan's test*) and make it a tunable.

The size limit won't be an FCS blocker (we can always raise it if a customer needs a higher limit).

*Update from @leftwo: I just tried on a bench gimlet and 1TiB is the largest disk size I can create.

@askfongjojo

I've moved this to "unscheduled"; we can revisit whether sled-agent should own the validation.

@askfongjojo

askfongjojo commented Aug 4, 2023

@zephraph - With oxidecomputer/propolis#474 landed (and VMM reservoir #3223 just before FCS), can you please raise the VM instance size limit to the following:

  • number of vcpus: 64
  • memory: 256 GiB

Also, I was off by 1 GiB on the max disk size: it should have been 1023 GiB, not 1 TiB. Would you please change that as well?

Finally, on second thought, the API may actually be the right place for setting all these limits, since it's where the documentation lives. If we do the checks in sled-agent, the checks and the API docs will drift apart. As such, you can mark this ticket closed once you are done with the size limit changes.
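
For reference, the limits settled on in this thread, expressed as the kind of API-level validation described above (values come from the comments above; the names are hypothetical, not Nexus's actual identifiers):

const MAX_VCPU: u16 = 64;
const MAX_MEMORY_BYTES: u64 = 256 * (1u64 << 30);  // 256 GiB
const MAX_DISK_BYTES: u64 = 1023 * (1u64 << 30);   // 1023 GiB

fn validate_instance(
    ncpus: u16,
    memory_bytes: u64,
    disk_bytes: u64,
) -> Result<(), String> {
    if ncpus == 0 || ncpus > MAX_VCPU {
        return Err(format!("vcpus must be between 1 and {}", MAX_VCPU));
    }
    if memory_bytes > MAX_MEMORY_BYTES {
        return Err("memory must not exceed 256 GiB".to_string());
    }
    if disk_bytes > MAX_DISK_BYTES {
        return Err("disk size must not exceed 1023 GiB".to_string());
    }
    Ok(())
}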

@askfongjojo removed this from the Unscheduled milestone on Aug 4, 2023
@askfongjojo added this to the 1.0.2 milestone on Aug 4, 2023
@zephraph
Contributor

zephraph commented Aug 4, 2023

Yes, absolutely, I'll get on that.
