Skip to content

faux-mgs logging and stdout output are sometimes racy #418

@hawkw

Description

@hawkw

faux-mgs, like most other Oxide software using slog, logs to the console using slog-async:

let drain = slog_term::FullFormat::new(decorator)
.build()
.filter_level(level)
.fuse();
let (drain, guard) = slog_async::Async::new(drain).build_with_guard();
(drain.fuse(), guard)

This means that logging is carried out on a background thread.

When a log file is not provided via the CLI, it uses the default slog_term::TermDecorator returned by TermDecorator::new:

make_drain(level, slog_term::TermDecorator::new().build())

In addition, the faux-mgs CLI will print text to stdout. Because slog-async is writing to stderr on a background thread, the log lines and CLI output are not synchronized with each other. Sometimes, pieces of the log line being written out by slog-async and the stdout output printed by faux-mgs' CLI can crash into each other unpleasantly, like this:

eliza@castle ~ $ pilot sp exec -e 'ereports' BRM42220036
Aug 06 17:14:08.269 INFO creating SP handle on interface london_sw0tp0, component: faux-mgs
Aug 06 17:14:08.272 INFO initial discovery completerestart ID: afb94641-961d-2dea-8706-faefc33a4972
restart IDs did not match (requested 00000000-0000-0000-0000-000000000000)
count: 2

ereports:
, addr:0x1: {
    "baseboard_part_number": String("913-0000019"),
    "baseboard_serial_number": String("BRM42220036"),
    "ereport_message_version": Number(0),
    "hubris_archive_id": String("YGMrNrtk_JI"),
    "hubris_task_gen": Number(0),
    "hubris_task_name": String("packrat"),
    "hubris_uptime_ms": Number(0),
    "lost": Null,
    "rev": Number(6),
}
 [fe80::aa40:25ff:fe04:202%3]:11111,
interface: london_sw0tp0, socket: control-plane-agent, component: faux-mgs
0x2: {
    "avg_vin": Number(9.748998641967773),
    "baseboard_part_number": String("913-0000019"),
    "baseboard_serial_number": String("BRM42220036"),
    "ereport_message_version": Number(0),
    "hubris_archive_id": String("YGMrNrtk_JI"),
    "hubris_task_gen": Number(0),
    "hubris_task_name": String("gimlet_seq"),
    "hubris_uptime_ms": Number(210703),
    "k": String("seq.vcore.undervolt"),
    "max_vin": Number(11.690000534057617),
    "min_vin": Number(7.950000762939453),
    "rail": Number(0),
    "rev": Number(6),
    "time": Number(210660),
}

Here, part of the second log line was interrupted by the CLI output, and then it printed the rest of it in between the two ereports printed by the CLI.

Here's what the output from that command should look like (and what it normally looks like):

eliza@castle ~ $ pilot sp exec -e 'ereports' BRM42220036
Aug 06 17:20:29.965 INFO creating SP handle on interface london_sw0tp0, component: faux-mgs
Aug 06 17:20:29.968 INFO initial discovery complete, addr: [fe80::aa40:25ff:fe04:202%3]:11111, interface: london_sw0tp0, socket: control-plane-agent, component: faux-mgs
restart ID: afb94641-961d-2dea-8706-faefc33a4972
restart IDs did not match (requested 00000000-0000-0000-0000-000000000000)
count: 2

ereports:
0x1: {
    "baseboard_part_number": String("913-0000019"),
    "baseboard_serial_number": String("BRM42220036"),
    "ereport_message_version": Number(0),
    "hubris_archive_id": String("YGMrNrtk_JI"),
    "hubris_task_gen": Number(0),
    "hubris_task_name": String("packrat"),
    "hubris_uptime_ms": Number(0),
    "lost": Null,
    "rev": Number(6),
}

0x2: {
    "avg_vin": Number(9.748998641967773),
    "baseboard_part_number": String("913-0000019"),
    "baseboard_serial_number": String("BRM42220036"),
    "ereport_message_version": Number(0),
    "hubris_archive_id": String("YGMrNrtk_JI"),
    "hubris_task_gen": Number(0),
    "hubris_task_name": String("gimlet_seq"),
    "hubris_uptime_ms": Number(210703),
    "k": String("seq.vcore.undervolt"),
    "max_vin": Number(11.690000534057617),
    "min_vin": Number(7.950000762939453),
    "rail": Number(0),
    "rev": Number(6),
    "time": Number(210660),
}

It would be nice if the logging and CLI output couldn't crash into each other.

Of course, this isn't a huge deal --- the logging can always be pointed at a file instead, and if the user is redirecting stdout to a file (common when making faux-mgs output JSON), the logs are written to stderr, so they won't screw up the JSON. Still, it would be nicer for this to be less of a mess.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions