Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OHCI: Fixed a bug in the OHCI implementation from QEMU #1531

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

faha223
Copy link
Contributor

@faha223 faha223 commented Sep 23, 2023

This fixes the issue with USB passthrough in which a race condition is created when using the async api in libusb to write to and read from multiple endpoints simultaneously

Addresses this bug:
#1239

@faha223 faha223 changed the title Fixed a bug in the OHCI implementation from QEMU OHCI: Fixed a bug in the OHCI implementation from QEMU Sep 23, 2023
@GXTX
Copy link
Contributor

GXTX commented Sep 23, 2023

Since this an issue with QEMU, should this perhaps be posted to the ML for comments / future merge?

@faha223
Copy link
Contributor Author

faha223 commented Sep 23, 2023

Yes, it should. I'll need to look into what their patch submission process is

@mborgerson mborgerson self-assigned this Oct 17, 2023
hw/usb/hcd-ohci.h Outdated Show resolved Hide resolved
@@ -1999,9 +2096,6 @@ const VMStateDescription vmstate_ohci_state = {
VMSTATE_UINT32(hreset, OHCIState),
VMSTATE_UINT32(htest, OHCIState),
VMSTATE_UINT32(old_ctl, OHCIState),
VMSTATE_UINT8_ARRAY(usb_buf, OHCIState, 8192),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • How are in-flight packets handled now during vmsave/load events?
  • The structure change breaks snapshot compatibility

Copy link
Contributor Author

@faha223 faha223 Apr 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When loading a snapshot, vmstate_load_state invokes pre_load function pointer in the VMStateDescription if one was defined. Then it copies the data stored for each field in the VMState into the appropriate memory as defined in the VMStateDescription. Then it invokes the post_load function pointer in the VMStateDescription if one was defined.

Saving a snapshot does the same thing, but using the pre_save and post_save function pointers.

These fields are not defined in vmstate_ohci_state so there's no logic that runs when loading or saving a snapshot that interacts with these fields.

I suspect there's a real possibility that the VMState saving/restoring the async_td and usb_packet fields in snapshots can cause problems, but the structure change that would be needed to save and load the new list of async_packets would break snapshot compatibility.

I tested saving and loading snapshots with a libusb controller attached, both while interacting with the controller and while not interacting with the controller and it seemed to be mostly stable. I was able to crash the emulator during vmload a few times but the point of failure was clearing out the existing async packets before restoring the snapshot data. It didn't have anything to do with the actual snapshot data.

I've added back the fields I removed from OHCIState and vmstate_ohci_state since their removal breaks snapshot compatibility.

hw/usb/hcd-ohci.c Outdated Show resolved Hide resolved
ohci_set_interrupt(ohci, OHCI_INTR_UE);
ohci_bus_stop(ohci);
}

void ohci_clear_active_packets(struct OHCIState *ohci)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why clear and not cancel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clear because this function makes the list empty. I am not opposed to changing it to cancel

uint8_t usb_buf[8192];
uint32_t async_td;
bool async_complete;
OHCIState *ohci;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need OHCIState pointer here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in ohci_async_complete_packet when calling ohci_process_lists. This is the only place it is used.

We get USBActivePacket by using container_of on the USBPacket argument. It might be possible to use container_of to get the OHCI pointer we need here if we can use one of the QTAILQ macros to get to the head of the list.

hw/usb/hcd-ohci.c Outdated Show resolved Hide resolved
QTAILQ_FOREACH(iter, &s->active_packets, next) {
if (iter->async_td) {
usb_cancel_packet(&iter->usb_packet);
iter->async_td = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you clean up packets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

I removed this foreach loop because the call right after this (ohci_stop_endpoints) actually calls (ohci_clear_active_packets) and does the work of this foreach loop already.

typedef struct USBActivePacket USBActivePacket;
typedef struct OHCIState OHCIState;

struct USBActivePacket {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is basically 1:1 with endpoints, does it makes sense to just move the async tracking into USBEndpoint and avoid all the list iteration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

USBEndpoint is shared between OHCI, EHCI, UHCI, and XHCI. Moving the async tracking into USBEndpoint would make sense from the perspective of OHCI, but wouldn't be used by EHCI, UHCI, or XHCI

hw/usb/hcd-ohci.c Outdated Show resolved Hide resolved
hw/usb/hcd-ohci.c Outdated Show resolved Hide resolved
@mborgerson
Copy link
Member

This fixes the issue with USB passthrough in which a race condition is created when using the async api in libusb to write to and read from multiple endpoints simultaneously

Can you share more specifics about the race condition to help me understand the root cause of the issue?

@faha223
Copy link
Contributor Author

faha223 commented Oct 22, 2023

Every USBEndpoint that shares a single OCHI Controller shares a single buffer for reading and writing data.
The libsub async API returns after sending a request and responses are handled later.

If a read request is sent and a write request is sent after the read request is sent but before the response comes in, then the read request has a chance to return data from the write request. I observed this while using the Steel Battalion controller but also while using the gamepad.

Converting the libusb passthrough code to use the sync API fixes the issue by blocking on every read and write request. This introduces latency so fixing the code to work properly with the async API is preferred.

@LoveMHz
Copy link
Contributor

LoveMHz commented Nov 16, 2023

And here I was, spending my day thinking I borked something in my USB driver.

This PR appears to resolve the issue stated in #1239. I was able to verify this with Stellar's Xbox One driver where we were running into endpoint data appearing in endpoint interrupts under xemu. With the patch, I was able to verify that the correct behavior matches that of our hw tests.

I look forward to reviewing the patch once @mborgerson's comments are addressed.

@faha223 would you mind dropping me an email? ([email protected]). Thank you for your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants