
Attempting to create a VM with more than 32 vcpus brings nexus down #3212

Closed
citrus-it opened this issue May 24, 2023 · 13 comments · Fixed by #3819

@citrus-it
Contributor

I tried to create an instance from the CLI with:

oxide instance create \
        --project ni \
        --description 'Ptang Zoo Boing!' \
        --hostname ekke2 \
        --memory 64G \
        --ncpus 64 \
        --name ekke2

which reported a timeout fairly quickly (around 5 seconds):

error
Communication Error: error sending request for url (http://venus.oxide-preview.com/v1/instances?project=ni): operation timed out

and nexus crashed.

I'll put the relevant log on shared storage somewhere, but here are the final few events that I extracted:

2023-05-24 10:33:10.006Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: received new runtime state from sled agent
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    runtime_state = InstanceRuntimeState { run_state: Failed, sled_id: aa7c82d9-6e59-406e-b1e3-3648890b4bec, propolis_id: e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f, dst_propolis_id: None, propolis_addr: Some([fd00:1122:3344:103::28]:12400), migration_id: None, propolis_gen: Generation(1), ncpus: InstanceCpuCount(64), memory: ByteCount(68719476736), hostname: "ekkke", gen: Generation(3), time_updated: 2023-05-24T10:33:09.987050789Z }
2023-05-24 10:33:10.006Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (dropshot_internal) on oxz_nexus: roles
    actor_id = 001de000-05e4-4000-8000-000000000002
    authenticated = true
    local_addr = [fd00:1122:3344:106::4]:12221
    method = PUT
    remote_addr = [fd00:1122:3344:103::1]:52079
    req_id = 13334114-d055-47ab-a328-b8d8662908c0
    roles = RoleSet { roles: {} }
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213

2023-05-24 10:33:10.049Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: instance updated by sled agent
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    new_state = failed
    propolis_id = e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f

2023-05-24 10:33:10.049Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (dropshot_internal) on oxz_nexus: request completed
    local_addr = [fd00:1122:3344:106::4]:12221
    method = PUT
    remote_addr = [fd00:1122:3344:103::1]:52079
    req_id = 13334114-d055-47ab-a328-b8d8662908c0
    response_code = 204
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213
2023-05-24 10:33:10.050Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: client response
    SledAgent = aa7c82d9-6e59-406e-b1e3-3648890b4bec
    result = Ok(Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv6(fd00:1122:3344:103::1)), port: Some(12345), path: "/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213/state", query: None, fragment: None }, status: 500, headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"} })
2023-05-24 10:33:10.050Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: Handling sled agent instance PUT result
    result = Err(Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "6164bc99-d4dd-4324-b3dd-409682cacc4f" })
2023-05-24 10:33:10.050Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "6164bc99-d4dd-4324-b3dd-409682cacc4f", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "6164bc99-d4dd-4324-b3dd-409682cacc4f" } from instance_put!
2023-05-24 10:33:10.053Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Ok(false) from setting InstanceState::Failed after bad instance_put
2023-05-24 10:33:10.053Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saga log event
    new_state = N051 failed
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c
2023-05-24 10:33:10.053Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: recording saga event
    event_type = Failed(ActionFailed { source_error: Object {"InternalError": Object {"internal_message": String("Internal Server Error")}} })
    node_id = 51
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
2023-05-24 10:33:10.054Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: update for saga cached state
    new_state = Unwinding
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c
2023-05-24 10:33:10.054Z INFO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: updating state
    new_state = unwinding
    saga_id = 71fc0701-d3dc-4223-81d4-3965e59ae5d0
2023-05-24 10:33:10.085Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saga log event
    new_state = N050 undo_started
    sec_id = 496eabae-738d-4497-a183-1b75bf912c7c

2023-05-24 10:33:10.678Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: client response
    SledAgent = aa7c82d9-6e59-406e-b1e3-3648890b4bec
    result = Err(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Ipv6(fd00:1122:3344:103::1)), port: Some(12345), path: "/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213", query: None, fragment: None }, source: hyper::Error(IncompleteMessage) })
2023-05-24 10:33:10.678Z DEBG 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: Handling sled agent instance PUT result
    result = Err(Communication Error: error sending request for url (http://[fd00:1122:3344:103::1]:12345/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213): connection closed before message completed)
2023-05-24 10:33:10.678Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Communication Error: error sending request for url (http://[fd00:1122:3344:103::1]:12345/instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213): connection closed before message completed from instance_put!
2023-05-24 10:33:10.682Z ERRO 496eabae-738d-4497-a183-1b75bf912c7c/29152 (ServerContext) on oxz_nexus: saw Ok(true) from setting InstanceState::Failed after bad instance_put
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: action failed', /home/build/.cargo/registry/src/github.com-1ecc6299db9ec823/steno-0.3.1/src/saga_exec.rs:1187:65
@citrus-it
Contributor Author

And here is what is in the logs on the sled that handled this request:

10:33:00.434Z INFO SledAgent (InstanceManager): ensuring instance is registered
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    propolis_id = e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:00.434Z INFO SledAgent (InstanceManager): registering new instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.434Z INFO SledAgent (InstanceManager): Instance::new w/initial HW: InstanceHardware { runtime: InstanceRuntimeState { run_state: Creating, sled_id: aa7c82d9-6e59-406e-b1e3-3648890b4bec, propolis_id: e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f, dst_propolis_id: None, propolis_addr: Some([fd00:1122:3344:103::28]:12400), migration_id: None, propolis_gen: Generation(1), ncpus: InstanceCpuCount(64), memory: ByteCount(68719476736), hostname: "ekkke", gen: Generation(1), time_updated: 2023-05-24T10:32:56.175817Z }, nics: [NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], source_nat: SourceNatConfig { ip: 172.20.26.17, first_port: 32768, last_port: 49151 }, external_ips: [], firewall_rules: [VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: None, filter_ports: None, filter_protocols: Some([Icmp]), action: Allow, priority: VpcFirewallRulePriority(65534) }, VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 
8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: Some([Vpc(Vni(1922673)), Vpc(Vni(1922673)), Vpc(Vni(1922673)), Vpc(Vni(1922673))]), filter_ports: None, filter_protocols: None, action: Allow, priority: VpcFirewallRulePriority(65534) }, VpcFirewallRule { status: Enabled, direction: Inbound, targets: [NetworkInterface { id: 0552f278-55a4-44a8-adb8-2ce71230e6d9, kind: Instance { id: 8f0a78f8-5cf1-444f-8b1c-7a89820f9ee4 }, name: Name("net0"), ip: 172.30.0.7, mac: MacAddr(MacAddr6([168, 64, 37, 245, 144, 127])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 127d34e4-cde2-4f12-aeff-81eb81ca9347, kind: Instance { id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213 }, name: Name("net0"), ip: 172.30.0.8, mac: MacAddr(MacAddr6([168, 64, 37, 246, 205, 200])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 27a5055d-0fec-4d47-938d-6ecef2036d4f, kind: Instance { id: 47ac095c-716a-4926-816c-f95ed28f4111 }, name: Name("net0"), ip: 172.30.0.5, mac: MacAddr(MacAddr6([168, 64, 37, 251, 157, 68])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }, NetworkInterface { id: 8c779a09-7447-46c9-81c4-cebcba3c66d3, kind: Instance { id: e34426cd-c8ab-45d4-acb7-8de6f3d28ae0 }, name: Name("net0"), ip: 172.30.0.6, mac: MacAddr(MacAddr6([168, 64, 37, 249, 17, 84])), subnet: V4(Ipv4Net(Ipv4Network { addr: 172.30.0.0, prefix: 22 })), vni: Vni(1922673), primary: true, slot: 0 }], filter_hosts: None, filter_ports: Some([L4PortRange { first: L4Port(22), last: L4Port(22) }]), filter_protocols: Some([Tcp]), action: Allow, priority: VpcFirewallRulePriority(65534) }], disks: [], cloud_init_bytes: Some("") }
10:33:00.434Z INFO SledAgent (dropshot (SledAgent)): request completed
    local_addr = [fd00:1122:3344:103::1]:12345
    method = PUT
    remote_addr = [fd00:1122:3344:106::4]:60126
    req_id = 9cfa20d5-441d-4284-a366-2756f63771e4
    response_code = 200
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.536Z INFO SledAgent (InstanceManager): Configuring new Omicron zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:00.564Z INFO SledAgent (InstanceManager): Installing Omicron zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:02.865Z INFO SledAgent (InstanceManager): Zone booting
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    zone = oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.020Z INFO SledAgent (InstanceManager): Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:103::28, prefix: 64 }))
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    zone = oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.603Z INFO SledAgent (InstanceManager): Created address fd00:1122:3344:103::28/64 for zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.635Z INFO SledAgent (InstanceManager): Adding service
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
    smf_name = svc:/system/illumos/propolis-server:vm-e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
10:33:09.672Z INFO SledAgent (InstanceManager): Adding service property group 'config'
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.709Z INFO SledAgent (InstanceManager): Setting server address property
    address = [fd00:1122:3344:103::28]:12400
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.750Z INFO SledAgent (InstanceManager): Setting metric address property address [fd00:1122:3344:106::4]:12221
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.788Z INFO SledAgent (InstanceManager): Refreshing instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.826Z INFO SledAgent (InstanceManager): Enabling instance
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.866Z INFO SledAgent (InstanceManager): Started propolis in zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.943Z INFO SledAgent (InstanceManager): Sending ensure request to propolis: InstanceEnsureRequest { cloud_init_bytes: Some(""), disks: [], migrate: None, nics: [NetworkInterfaceRequest { name: "vopte5", slot: Slot(0) }], properties: InstanceProperties { bootrom_id: 00000000-0000-0000-0000-000000000000, description: "Test description", id: 95f8f4f6-b70c-4207-b1a4-8467ebde3213, image_id: 00000000-0000-0000-0000-000000000000, memory: 65536, name: "ekkke", vcpus: 64 } }
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:09.968Z INFO SledAgent (InstanceManager): result of instance_ensure call is Err(Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "53f54795-e4d9-45b7-893a-63cb868dfc47", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "53f54795-e4d9-45b7-893a-63cb868dfc47" })
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213
10:33:10.050Z INFO SledAgent (dropshot (SledAgent)): request completed
    error_message_external = Internal Server Error
    error_message_internal = Error managing instances: Instance error: Failure from Propolis Client: Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "53f54795-e4d9-45b7-893a-63cb868dfc47", "content-length": "124", "date": "Wed, 24 May 2023 10:33:09 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "53f54795-e4d9-45b7-893a-63cb868dfc47" }
    local_addr = [fd00:1122:3344:103::1]:12345
    method = PUT
    remote_addr = [fd00:1122:3344:106::4]:55245
    req_id = 6164bc99-d4dd-4324-b3dd-409682cacc4f
    response_code = 500
    uri = /instances/95f8f4f6-b70c-4207-b1a4-8467ebde3213/state
10:33:10.124Z WARN SledAgent (InstanceManager): Halting and removing zone: oxz_propolis-server_e0091a98-c23d-4dc0-bcf5-5d60cf4dca9f
    instance_id = 95f8f4f6-b70c-4207-b1a4-8467ebde3213

@citrus-it changed the title from "nexus saga unwrap failure" to "Attempting to create a VM with more than 32 vcpus brings nexus down" on May 24, 2023
@citrus-it
Contributor Author

Through some shenanigans, I managed to capture the propolis error:

error_message_external: Internal Server Error, error_message_internal: failed to create instance: maxcpu out of range, response_code: 500, uri: /instance, method: PUT

There's currently a limit of 32 vcpus per instance, and propolis returns a 500 error if asked for more.
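
Roughly, the check involved amounts to something like this (a minimal sketch; the constant and function names are illustrative, not propolis's actual code — the real cap comes from bhyve's VM_MAXCPU on illumos):

const MAX_CPU: u8 = 32; // mirrors bhyve's VM_MAXCPU

fn validate_vcpus(vcpus: u8) -> Result<(), String> {
    if vcpus == 0 || vcpus > MAX_CPU {
        // Today this detail surfaces as a 500 ("maxcpu out of range")
        // rather than a 400 carrying the message back to the caller.
        return Err(format!("maxcpu out of range: {} (max {})", vcpus, MAX_CPU));
    }
    Ok(())
}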

@citrus-it
Contributor Author

There are two issues here that might need splitting up:

  • If propolis returns a 500 in response to an InstanceEnsureRequest, we unwind to death. This is presumably true for any 500 response to such a request.
  • The useful part of the error from propolis is neither recorded nor propagated.

@citrus-it added this to the FCS milestone on May 24, 2023
@davepacheco
Collaborator

The crash seems likely to be oxidecomputer/steno#26.

Re: propagating the error message: If you use dropshot::HttpError::for_internal_error(internal_message) (which I think is what's happening here), you get a 500 where Dropshot sends a generic "internal server error" message to the client. But you can construct your own HttpError for a 500 where you provide whatever internal and external messages you want. The default behavior is aimed at situations where the human on the other end is not expected to know anything about the server's implementation details (as in the case of the external API). For our own internal stuff I think there'd be no harm in exposing the internal messages to clients.
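
For illustration, a minimal sketch of the two approaches (field and constructor names follow dropshot's HttpError as I recall it at the time of this issue; treat it as a sketch, not a verbatim excerpt):

use dropshot::HttpError;
use http::StatusCode;

// A 500 that keeps the detail visible to the (internal) client, rather
// than the generic message produced by HttpError::for_internal_error.
fn instance_ensure_error(detail: String) -> HttpError {
    HttpError {
        status_code: StatusCode::INTERNAL_SERVER_ERROR,
        error_code: Some("Internal".to_string()),
        internal_message: detail.clone(), // logged server-side
        external_message: detail,         // returned to the caller
    }
}

// Or, for input validation like the vcpu cap, a 400 whose message
// reaches the client:
fn vcpu_out_of_range(detail: String) -> HttpError {
    HttpError::for_bad_request(None, detail)
}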

I think there are a few other issues here:

  • It seems like this should be a 400 error from propolis-server. I think it doesn't hugely matter, because it's not actionable either way, but it helps debugging when the error code better reflects the problem. On the plus side, if this were a 400, the error message would have been sent back to Nexus.
  • I assume we also want to avoid having Nexus get this far? If Propolis has a cap of 32 vcpus, shouldn't Nexus impose that cap on VMs as well? (A sketch of such a check follows this list.)
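
A minimal sketch of what such a Nexus-side pre-check could look like (all names here are hypothetical, not Nexus's actual code):

use dropshot::HttpError;

const MAX_VCPU_PER_INSTANCE: u16 = 32; // hypothetical cap, per Propolis

fn check_instance_create(ncpus: u16) -> Result<(), HttpError> {
    if ncpus > MAX_VCPU_PER_INSTANCE {
        // Fail fast with a 400 before the create saga is even started.
        return Err(HttpError::for_bad_request(
            None,
            format!(
                "cannot provision instance with {} vcpus: maximum is {}",
                ncpus, MAX_VCPU_PER_INSTANCE
            ),
        ));
    }
    Ok(())
}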

@citrus-it
Contributor Author

  • I assume we also want to avoid having Nexus get this far? If Propolis has a cap of 32 vcpus, shouldn't Nexus impose that cap on VMs as well?

Propolis is the source of truth on the current vcpu cap, which actually comes from illumos bhyve. This cap may be changed or relaxed in the future (there is work in upstream FreeBSD bhyve around this). It feels cleaner to me to have this checked in the one place that knows the limit.

@davepacheco
Collaborator

That makes sense. But it seems like more work would be needed to raise that cap. If the cap varies across sleds, wouldn't we want Nexus to take this into account when choosing which sled to use for a provision? Just thinking out loud: Propolis could remain the source of truth. And we could propagate the cap outside of Propolis and into CockroachDB. Then Nexus could take this into account when selecting a sled for a new provision. If no sleds could possibly satisfy it, we could fail the request without even creating the saga. I think this would be a better user experience for the case where someone just inputs a number higher than we support.
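
Purely illustrative of that idea — if sleds advertised a per-sled cap (say, via inventory stored in CockroachDB), placement could filter on it; every name here is hypothetical:

struct SledCandidate {
    id: String,                    // sled identifier
    max_vcpus_per_instance: u16,   // cap advertised by this sled
}

fn eligible_sleds(
    candidates: Vec<SledCandidate>,
    requested_vcpus: u16,
) -> Vec<SledCandidate> {
    candidates
        .into_iter()
        .filter(|s| s.max_vcpus_per_instance >= requested_vcpus)
        .collect()
}

// If the filtered list is empty, the request can fail with a 400 up
// front, before any saga is created.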

@pfmooney
Contributor

pfmooney commented May 24, 2023

Just thinking out loud: Propolis could remain the source of truth. And we could propagate the cap outside of Propolis and into CockroachDB.

I'd probably just encode the limit into sled-agent for now, rather than propolis, so the cap can be communicated into nexus w/o requiring a propolis instance/zone to exist first. Once we get around to lifting that arbitrary 32-vcpu limit in bhyve, then we can make sled-agent aware of its dynamic nature, and the rest would fall out (assuming logic for handling differing limits was built into the control plane at that point).

I'm sorry that the VM_MAXCPU limit hasn't been lifted yet. There have been more pressing issues, and we'll probably need to look at how we're doing vCPU scheduling in the OS before we get too wild with VM sizing.

@askfongjojo

Will apply validations at the API level for FCS: max 32 vCPUs and 64 GB of DRAM.

@zephraph
Contributor

The mitigation for this landed in #3574

@askfongjojo

askfongjojo commented Jul 13, 2023

@zephraph brought up the need for a disk size limit. This is what I propose:

  • VMware's VMDK size limit is 2 TB minus 512 B for ESXi 5.0 and 5.1, and 62 TB for ESXi 5.5 and later.
  • We can probably set the max to 1 TB (updated per Alan's test*) and make it a tunable.

The size limit won't be an FCS blocker (we can always raise it if a customer needs a higher limit).

*Update from @leftwo: I just tried on a bench gimlet and 1TiB is the largest disk size I can create.

@askfongjojo

I've moved this to "unscheduled"; we can revisit whether sled-agent should own the validation.

@askfongjojo

askfongjojo commented Aug 4, 2023

@zephraph - With oxidecomputer/propolis#474 landed (and VMM reservoir #3223 just before FCS), can you please raise the VM instance size limit to the following:

  • number of vcpus: 64
  • memory: 256 GiB

Also, I was off by 1 GiB on the max disk size: it should have been 1023 GiB, not 1 TiB. Would you please change that as well?

Finally, on second thought, the API may actually be the right place for setting all these limits, since it's where the documentation lives. If we do the checks in sled-agent, the checks and the API docs will drift apart. As such, you can mark this ticket closed once you are done with the size limit changes.
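
For reference, the limits settled on in this thread, expressed as the kind of API-level validation described above (values come from the comments above; the names are hypothetical, not Nexus's actual identifiers):

const MAX_VCPU: u16 = 64;
const MAX_MEMORY_BYTES: u64 = 256 * (1u64 << 30);  // 256 GiB
const MAX_DISK_BYTES: u64 = 1023 * (1u64 << 30);   // 1023 GiB

fn validate_instance(
    ncpus: u16,
    memory_bytes: u64,
    disk_bytes: u64,
) -> Result<(), String> {
    if ncpus == 0 || ncpus > MAX_VCPU {
        return Err(format!("vcpus must be between 1 and {}", MAX_VCPU));
    }
    if memory_bytes > MAX_MEMORY_BYTES {
        return Err("memory must not exceed 256 GiB".to_string());
    }
    if disk_bytes > MAX_DISK_BYTES {
        return Err("disk size must not exceed 1023 GiB".to_string());
    }
    Ok(())
}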

@askfongjojo removed this from the Unscheduled milestone on Aug 4, 2023
@askfongjojo added this to the 1.0.2 milestone on Aug 4, 2023
@zephraph
Contributor

zephraph commented Aug 4, 2023

Yes, absolutely, I'll get on that.
