Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] low level console i/o (up_putc) can interfere interrupt-based uart i/o #14662

Open
1 task done
yamt opened this issue Nov 6, 2024 · 7 comments
Open
1 task done
Labels
Arch: all Issues that apply to all architectures Area: Drivers Drivers issues OS: Mac Issues related to MacOS (building system, etc) Type: Bug Something isn't working

Comments

@yamt
Copy link
Contributor

yamt commented Nov 6, 2024

Description / Steps to reproduce the issue

for some configurations, it's possible for a system to use both of low-level up_putc and the uart (drivers/serial/serial.c) on the console.
for an example, printf(stdout) goes to the uart and syslog() goes to up_putc.

the uart uses a semaphore (dev->xmitsem) interlocked with the kernel critical section to deal with interrupts.

although how up_putc is implemented is device-specific, it usually disables interrupts, performs polling-based i/o, and then restores the interrupts.

unfortunately, with SMP, the latter sometimes can "consume" the hardware events which would otherwise have triggered tx interrupts, which the former might be waiting for. if it happens, the former can block forever.

i was mainly looking at the esp32 implementation. but other implementations look similar.

an obvious fix would be to wrap up_putc with the critical section. but i'm not feeling too happy with the approach.

any suggestions?

On which OS does this issue occur?

[OS: Mac]

What is the version of your OS?

macOS 14.7

NuttX Version

master

Issue Architecture

[Arch: all]

Issue Area

[Area: Drivers]

Verification

  • I have verified before submitting the report.
@yamt yamt added the Type: Bug Something isn't working label Nov 6, 2024
@github-actions github-actions bot added Arch: all Issues that apply to all architectures Area: Drivers Drivers issues OS: Mac Issues related to MacOS (building system, etc) labels Nov 6, 2024
@acassis
Copy link
Contributor

acassis commented Nov 6, 2024

@patacongo what do you think? I know this implementation exists since initial NuttX versions

@patacongo
Copy link
Contributor

@patacongo what do you think? I know this implementation exists since initial NuttX versions

up_putc was only added originally to get some very early output to assist in CPU bringup and analyzing assertions. That was about 17 or more years ago. It normally just printed things like ABCDEF to let you know where the power-up reset logic failed..

For some of the early CPUs, I had only LEDs and low level serial output to perform the entire bring-up. It is really amazing what you can do with very little if you have to.

up_putc was used in various other, less appropriate ways since... Such as getting interrupt level serial output.

Using serial output to support CPU bring-up with serial output is not so important with today's modern, cheap debuggers. Perhaps up_putc() could even just be removed.

yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 7, 2024
yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 7, 2024
@yamt
Copy link
Contributor Author

yamt commented Nov 7, 2024

@patacongo what do you think? I know this implementation exists since initial NuttX versions

up_putc was only added originally to get some very early output to assist in CPU bringup and analyzing assertions. That was about 17 or more years ago. It normally just printed things like ABCDEF to let you know where the power-up reset logic failed..

For some of the early CPUs, I had only LEDs and low level serial output to perform the entire bring-up. It is really amazing what you can do with very little if you have to.

up_putc was used in various other, less appropriate ways since... Such as getting interrupt level serial output.

Using serial output to support CPU bring-up with serial output is not so important with today's modern, cheap debuggers. Perhaps up_putc() could even just be removed.

it's still useful for early in the boot, where i guess it's fine. (maybe until we boot secondary CPUs.)

i guess we can avoid most of problematic cases by removing CONFIG_SYSLOG_DEFAULT. how do you think?

@acassis
Copy link
Contributor

acassis commented Nov 7, 2024

@yamt I agree! It still useful and we need to keep it. No idea why Syslog is using it by default, it could be an option, instead of default output channel

yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 8, 2024
Because the implementation is problematic.
See apache#14662 for the discussion.
Also, make it depend on EXPERIMENTAL to discourage it.

Instead, enable SYSLOG_CONSOLE for a bit more cases.

An alternative would be to introduce/extend some serialization
mechanism (eg. enter_critical_section, which is currently used by
the serial driver to interact with interrupts) to cover up_putc
users including SYSLOG_DEFAULT. While it works, it doesn't sound
attractive to me to introduce this kind of complexity to up_putc,
which is supposed to be a very low-level machinary used early in
the boot. Also, the implementation of such serialization can be
complex because how up_putc works is basically device-specific and
how it corresponds to other devices on the system (eg. uarts) isn't
obvious to the upper layers.
yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 11, 2024
This would avoid the undesirable intertactions with the serial driver
described in apache#14662.

Although I'm not entirely happy with this fix because it assumes
the particular implementations of up_putc/up_nputc and its association
to the serial devices, I haven't come up with better ideas for now.

An alternative is to place some serializations inside the target
specific serial (and/or whatever provides up_putc api) implementaitons.
But it isn't too attractive to put potentially complex logic into the
low-level machinaries, especially when we have a lot of similar copies
of it.

Another alternative is to deprecate up_putc. (at least for the purpose
of syslog.) But it seems at least some of users are relying on what
the current implementation provides heavily.

This commit also removes g_lowputs_lock because the critical section
would serve the purpose of the lock as well.
xiaoxiang781216 pushed a commit that referenced this issue Nov 12, 2024
This would avoid the undesirable intertactions with the serial driver
described in #14662.

Although I'm not entirely happy with this fix because it assumes
the particular implementations of up_putc/up_nputc and its association
to the serial devices, I haven't come up with better ideas for now.

An alternative is to place some serializations inside the target
specific serial (and/or whatever provides up_putc api) implementaitons.
But it isn't too attractive to put potentially complex logic into the
low-level machinaries, especially when we have a lot of similar copies
of it.

Another alternative is to deprecate up_putc. (at least for the purpose
of syslog.) But it seems at least some of users are relying on what
the current implementation provides heavily.

This commit also removes g_lowputs_lock because the critical section
would serve the purpose of the lock as well.
yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 14, 2024
yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 14, 2024
yamt added a commit to yamt/incubator-nuttx that referenced this issue Nov 14, 2024
Because the implementation is problematic.
See apache#14662 for the discussion.
Also, make it depend on EXPERIMENTAL to discourage it.

Instead, enable SYSLOG_CONSOLE for a bit more cases.

An alternative would be to introduce/extend some serialization
mechanism (eg. enter_critical_section, which is currently used by
the serial driver to interact with interrupts) to cover up_putc
users including SYSLOG_DEFAULT. While it works, it doesn't sound
attractive to me to introduce this kind of complexity to up_putc,
which is supposed to be a very low-level machinary used early in
the boot. Also, the implementation of such serialization can be
complex because how up_putc works is basically device-specific and
how it corresponds to other devices on the system (eg. uarts) isn't
obvious to the upper layers.
@yamt
Copy link
Contributor Author

yamt commented Nov 19, 2024

unfortunately, with SMP,

i guess this was not really SMP-specific.
consider that:

  1. a thread calls syslog, which ends up with calling up_putc. it's preempted in the middle of up_putc.
  2. another thread calls uart_write. the tx buffer happens to be full and it blocks.
  3. the first thread is scheduled again and breaks the expectation of the blocking thread. (eg. by "restore"ing the tx interrupt status.)

#14761 should be enough to fix the case as well.

@xiaoxiang781216
Copy link
Contributor

but is uart_write protected by the critical section too?

@yamt
Copy link
Contributor Author

yamt commented Nov 19, 2024

but is uart_write protected by the critical section too?

i'm not sure what you meant. but a literal answer is "yes".

JaeheeKwon pushed a commit to JaeheeKwon/nuttx that referenced this issue Nov 28, 2024
This would avoid the undesirable intertactions with the serial driver
described in apache#14662.

Although I'm not entirely happy with this fix because it assumes
the particular implementations of up_putc/up_nputc and its association
to the serial devices, I haven't come up with better ideas for now.

An alternative is to place some serializations inside the target
specific serial (and/or whatever provides up_putc api) implementaitons.
But it isn't too attractive to put potentially complex logic into the
low-level machinaries, especially when we have a lot of similar copies
of it.

Another alternative is to deprecate up_putc. (at least for the purpose
of syslog.) But it seems at least some of users are relying on what
the current implementation provides heavily.

This commit also removes g_lowputs_lock because the critical section
would serve the purpose of the lock as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: all Issues that apply to all architectures Area: Drivers Drivers issues OS: Mac Issues related to MacOS (building system, etc) Type: Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants