Fix --time-stamp-precision when used with -V #1078

Open
jtippet wants to merge 5 commits into master from multi-file-timestamp

Conversation

@jtippet commented Aug 28, 2023

tcpdump does the wrong thing if you combine --time-stamp-precision with -V to open multiple input files. As a trivial example, observe that printing the exact same 1-line file twice gives different timestamps:

$ printf "tests/reason_code-0.pcap\ntests/reason_code-0.pcap" | tcpdump -V - --time-stamp-precision=nano
reading from file tests/reason_code-0.pcap, link-type IEEE802_11_RADIO (802.11 plus radiotap header), snapshot length 65535
22:15:22.659099000 1.0 Mb/s 2412 MHz 11b 100 sq antenna 0 57dB signal DeAuthentication (00:0d:93:82:36:3a (oui Unknown)): Reserved
reading from file tests/reason_code-0.pcap, link-type IEEE802_11_RADIO (802.11 plus radiotap header), snapshot length 65535
22:15:22.000659099 1.0 Mb/s 2412 MHz 11b 100 sq antenna 0 57dB signal DeAuthentication (00:0d:93:82:36:3a (oui Unknown)): Reserved

The problem is that when -V was added, the code to open the 2nd and subsequent files was copy-pasted into the inductive part of a loop. When timestamp support was added, the patch was only applied to the first copy of the code, which is no doubt an oversight.

I figured I'd fix this and prevent future similar problems by unifying the two code clones for opening files, so I created a new helper routine open_pcap_file to hold the common code.
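For readers who want the shape of the change without the diff, here is a rough sketch (not the actual open_pcap_file from the patch; the helper name, parameters, and error handling are simplified) of a unified open path that honors the requested precision via libpcap's pcap_open_offline_with_tstamp_precision():

```c
#include <pcap/pcap.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Simplified sketch of a unified file-opening helper.  The real helper
 * in the patch lives inside tcpdump and takes more parameters; the point
 * here is only that every file, including the 2nd and subsequent ones
 * read via -V, goes through the same call that honors the
 * --time-stamp-precision setting.
 */
static pcap_t *
open_capture(const char *fname, u_int tstamp_precision)
{
	char ebuf[PCAP_ERRBUF_SIZE];
	pcap_t *pd;

	pd = pcap_open_offline_with_tstamp_precision(fname,
	    tstamp_precision, ebuf);
	if (pd == NULL) {
		fprintf(stderr, "%s\n", ebuf);
		exit(1);
	}
	fprintf(stderr,
	    "reading from file %s, link-type %d, snapshot length %d\n",
	    fname, pcap_datalink(pd), pcap_snapshot(pd));
	return pd;
}
```

With a single helper like this, the first open and the per-file opens in the -V loop can no longer drift apart.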

As a minor side effect, there is a behavior change around error handling, which may or may not be desired. Previously, if the 2nd file had a different link type, tcpdump would bail out early with "$filename: new dlt does not match original". Now it gets a little further, printing the file banner "reading from file $filename, link-type $type, snapshot length $length" before giving up. This was always an error case, so I don't imagine the different sequence of messages is a big deal.
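To make the new ordering concrete, here is a hedged sketch (hypothetical function name, simplified error handling, not the patch itself): the banner is printed as part of opening the file, and only afterwards does the caller compare the link-layer type against the first file's.

```c
#include <pcap/pcap.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch only: with a shared open helper that prints the banner, this
 * link-type check runs in the caller, after the banner has already been
 * emitted for the just-opened file.
 */
static void
check_dlt_matches(pcap_t *pd, int original_dlt, const char *fname)
{
	if (pcap_datalink(pd) != original_dlt) {
		fprintf(stderr, "%s: new dlt does not match original\n", fname);
		exit(1);
	}
}
```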

I didn't see a clean way to test this using the existing harness, so I added a minor hack to TESTrun: if $options appears to contain -V, then use that instead of the usual -r switch. I added a new test that exhibits the bug; after the fix, the test passes cleanly.

I've attempted to match the local code philosophy and style, but if I've overlooked anything, please let me know; I'm happy to improve the patch.

@jtippet force-pushed the multi-file-timestamp branch from c98114e to a478f28 on August 28, 2023 at 21:17
Capture files were opened in two different bits of code. Only the
first bit of code acquired support for --time-stamp-precision, so
the 2nd and subsequent files opened with -V would always have the
default timestamp precision. This fix unifies file opening into a
single routine, ensuring consistent behavior for each opened file.
@jtippet force-pushed the multi-file-timestamp branch from a478f28 to 87501b7 on August 28, 2023 at 21:18
@jtippet (Author) commented Aug 29, 2023

Thanks for setting up a rigorous CI system - it found a few issues in my original patch. I believe the remaining CI failures are not related to my payload.

win32 failed with

appveyor DownloadFile https://npcap.com/dist/npcap-sdk-1.12.zip
Error downloading remote file: One or more errors occurred.
Inner Exception: The request was aborted: Could not create SSL/TLS secure channel

which is pretty clearly a transient issue.

netbsd-mips64 failed with

$ make releasetar
tcpdump-5.0.0-PRE-GIT/tests/stp-heapoverflow-4.out: Write failed
tcpdump-5.0.0-PRE-GIT/tests/stp-heapoverflow-4.pcap: Write to restore size failed

which also doesn't seem to be payload-related, although I can't really rule it out given my unfamiliarity with the codebase. Is this a known flaky CI leg? Could you try re-running it? (I don't have permissions to re-run.)

As you might have noticed if you follow all the commit history here, I had to do some deeper edits to the test harness than I'd originally anticipated. Unfortunately, the existing harness was somewhat hostile to exercising the -V switch, since it unconditionally hardcodes the -r switch in all the tcpdump command lines. My workaround is to change the harness to detect if -V is passed, and if so, treat the input file as a list of pcap files to read. But because the test harness sometimes runs in a different directory, I had to educate the harness to rewrite the list of relative filenames into absolute filenames.

This ends up with some clumsy code for ultimately just one test case, so I'm not thrilled with it. But it really is the only way to exercise -V, which is a documented tcpdump interface, so it's certainly worth being able to test.

Let me know if you have any suggestions on how I can exercise -V more cleanly than what's being done here.

Assuming the CI re-runs don't turn up any surprises, I'm ready for discussion & review of this proposed change.

@guyharris (Member) commented
win32 failed with ... which is pretty clearly a transient issue.

Yes. AppVeyor is prone to those sorts of transient issues; I restarted that build with "rebuild the failed build and continue from there". (I tried switching to Cirrus CI, but they don't have any pre-packaged containers complete with Visual Studio, and I wasn't going to have each build spend a long time downloading VS, so I gave up on that.)

netbsd-mips64 failed with [a bunch of inadequate error messages about writes failing]

Yeah, whatever program is writing those messages should at least use errno to report what particular problem the writes encountered. My money's on ENOSPC, a/k/a "file system full", here, as I think that's by far the most common error for failing writes. (I suspect the others would be, in decreasing order of likelihood, ETIMEDOUT or whatever file server connection/network/crash problems turn into, ESTALE for NFS servers if the file gets removed or replaced out from under the client, and EIO in what I suspect is a very unlikely case these days, namely a hardware problem on the storage device.)
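For illustration only (this is not code from the releasetar tooling), a write-error message that includes the errno string would immediately distinguish ENOSPC from the other cases:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Report a failed write with the reason, e.g. "No space left on device". */
static int
write_or_warn(int fd, const void *buf, size_t len, const char *path)
{
	if (write(fd, buf, len) < 0) {
		fprintf(stderr, "%s: Write failed: %s\n", path, strerror(errno));
		return -1;
	}
	return 0;
}
```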

which also doesn't seem to be payload-related,

Extremely unlikely to be related to anything done to tcpdump, as it's only constructing a source tarball there, not running tcpdump.

Is this a known flaky CI leg?

I think it's run out of disk space in the past.

Could you try re-running it? (I don't have permissions to re-run.)

I don't know which account I'd need to log in with. @infrastation, could you poke this?

@jtippet (Author) commented Sep 12, 2023

It looks like the CI passed on re-run. Is there anything else needed on my side to move this PR forward? Thanks ;)
