At LogDNA, consuming log files and making them searchable is what we do!
It all starts with the ability to efficiently watch log files on a local
host and send new lines up to the LogDNA service. This Node.js class provides
functionality like Unix's `tail -f` command, and we use it in our agents to
get the job done. Of course, anything needing `tail` functionality in Node.js
could also benefit from using this.
- Features
- Installation
- Usage
- Events
- API
- Program Flow
- How Log Rolling is Handled
- Backpressure Pauses Polling
## Features

- Zero dependencies! It's lightweight and uses 100% Node.js core modules.
- It implements a `Readable` stream, which is efficient and flexible in terms
  of being able to `pipe` to other streams or consume via events.
- Stream backpressure is properly respected, so at no time is data pushed
  through the stream unless it is requested.
- It handles log rolling. Renaming files is handled gracefully without losing
  lines written to the "old" file, no matter what the poll interval is.
- It handles file truncation, continuing to tail the file even after it has
  been truncated.
## Installation

```sh
npm install @logdna/tail-file
```
## Usage

Instantiate an instance by passing the full path of a file to tail.
This will return a stream that can be piped to other streams or consumed
via `data` events. To begin the tailing, call the `start()` method.
```js
const TailFile = require('@logdna/tail-file')

const tail = new TailFile('/path/to/your/logfile.txt', {encoding: 'utf8'})
  .on('data', (chunk) => {
    console.log(`Received a utf8 character chunk: ${chunk}`)
  })
  .on('tail_error', (err) => {
    console.error('TailFile had an error!', err)
  })
  .on('error', (err) => {
    console.error('A TailFile stream error was likely encountered', err)
  })
  .start()
  .catch((err) => {
    console.error('Cannot start. Does the file exist?', err)
  })
```
This example is more realistic. It pipes the output to a transform stream
which breaks the data up by newlines, emitting its own `data` event for
every line.
```js
const TailFile = require('@logdna/tail-file')
const split2 = require('split2') // A common and efficient line splitter

const tail = new TailFile('/path/to/your/logfile.txt')

tail
  .on('tail_error', (err) => {
    console.error('TailFile had an error!', err)
    throw err
  })
  .start()
  .catch((err) => {
    console.error('Cannot start. Does the file exist?', err)
    throw err
  })

// Data won't start flowing until piping
tail
  .pipe(split2())
  .on('data', (line) => {
    console.log(line)
  })
```
This is an easy way to get a "line splitter" using only Node.js core modules.
For tailing files with high throughput, an official `Transform` stream is
recommended, since it will slightly edge out `readline` in performance.
```js
const readline = require('readline')
const TailFile = require('@logdna/tail-file')

async function startTail() {
  const tail = new TailFile('./somelog.txt')
    .on('tail_error', (err) => {
      console.error('TailFile had an error!', err)
    })

  try {
    await tail.start()
    const linesplitter = readline.createInterface({
      input: tail
    })
    linesplitter.on('line', (line) => {
      console.log(line)
    })
  } catch (err) {
    console.error('Cannot start. Does the file exist?', err)
  }
}

startTail().catch((err) => {
  process.nextTick(() => {
    throw err
  })
})
```
## Events

`TailFile` is a `Readable` stream, so it can emit any events from that
superclass. Additionally, it will emit the following custom events.
### Event: `'flush'`

This event is emitted when the underlying stream is done being read.
If backpressure is in effect, then `_read()` may be called multiple
times until it's flushed, so this event signals the end of that process.
It is used primarily during shutdown to make sure the data is exhausted,
but users may listen for this event if the relative "read position" in the
file is of interest. For example, the `lastReadPosition` may be persisted to
memory or a database for resuming `tail-file` on a separate execution without
missing any lines or duplicating them.
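As a sketch of that resume pattern (assuming the `'flush'` payload carries the
`lastReadPosition`, and using a hypothetical `tail-position.json` file for
persistence):

```js
const fs = require('fs')
const TailFile = require('@logdna/tail-file')

const POSITION_FILE = './tail-position.json' // Hypothetical persistence target

async function main() {
  let startPos = null // Default: tail from EOF

  try {
    // Resume from the position saved by a previous run, if one exists
    startPos = JSON.parse(fs.readFileSync(POSITION_FILE, 'utf8')).lastReadPosition
  } catch (_) {
    // No saved position yet; fall back to tailing from EOF
  }

  const tail = new TailFile('./somelog.txt', {encoding: 'utf8', startPos})

  // Assumes the 'flush' payload includes lastReadPosition, as described above
  tail.on('flush', ({lastReadPosition}) => {
    // Persist the read position so a later run can pick up where this one left off
    fs.writeFileSync(POSITION_FILE, JSON.stringify({lastReadPosition}))
  })

  await tail.start()

  tail.on('data', (chunk) => {
    process.stdout.write(chunk)
  })
}

main().catch(console.error)
```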
### Event: `'renamed'`

This event is emitted when a file with the same name is found, but it has a
different inode than the previous poll. Commonly, this happens during a log
rotation.
### Event: `'retry'`

If a file that was successfully being tailed goes away, `TailFile` will
re-poll for the file up to `maxPollFailures` times. For each of those retries,
this event is emitted for informative purposes. Typically, this could happen
if log rolling is done manually, or is timed in a way where the poll happens
during the window in which the "new" filename has not yet been created.
### Event: `'tail_error'`

When an error happens that is specific to `TailFile`, it cannot emit an
`error` event without causing the main stream to end (because it's a
`Readable` implementation). Therefore, if an error happens in a place such as
reading the underlying file resource, a `tail_error` event will be emitted
instead.
### Event: `'truncated'`

If a file is shortened or truncated without moving or renaming the file,
`TailFile` will assume it to be a new file, and it will start consuming
lines from the beginning of the file. This event is emitted for informational
purposes about that behavior.
### Event: (Any `Readable` event)

`TailFile` implements a `Readable` stream, so it may also emit those events.
The most common ones are `close` (when `TailFile` exits) or `data` events
from the stream.
## API

### `new TailFile(filename[, options])`

- `filename` `<String>` - The filename to tail. Poll errors do not happen
  until `start()` is called.
- `options` `<Object>` - Optional
  - `pollFileIntervalMs` `<Number>` - How often to poll `filename` for
    changes. Default: `1000` ms
  - `pollFailureRetryMs` `<Number>` - After a polling error (e.g. `ENOENT`),
    how long to wait before retrying. Default: `200` ms
  - `maxPollFailures` `<Number>` - The number of times to retry a failed poll
    before exiting/erroring. Default: `10` times
  - `readStreamOpts` `<Object>` - Options to pass to the `fs.createReadStream`
    function. This is used for reading the bytes that have been added to
    `filename` between every poll.
  - `startPos` `<Number>` - An integer representing the initial read position
    in the file. Useful for reading from `0`. Default: `null` (start tailing
    from EOF)
  - Any additional key-value options get passed to the `Readable` superclass
    constructor of `TailFile`
- Throws: `<TypeError>` | `<RangeError>` if parameter validation fails
- Returns: `TailFile`, which is a `Readable` stream
Instantiating `TailFile` will return a readable stream, but nothing will
happen until `start()` is called. After that, follow Node's standard procedure
to get the stream into flowing mode. Typically, this means using `pipe` or
attaching `data` listeners to the readable stream.

As the underlying `filename` is polled for changes, `fs.createReadStream` is
called to efficiently read the changed bytes since the last poll. To control
the options of that stream, the key-values in `readStreamOpts` will be passed
to the `fs.createReadStream` constructor. Similarly, options for controlling
`TailFile`'s stream can be passed in via `options`, and they will get passed
through to the `Readable`'s `super()` constructor. Useful settings such as
`encoding: 'utf8'` can be used this way.
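For example, a sketch that combines these option types (the path and values
here are illustrative):

```js
const TailFile = require('@logdna/tail-file')

const tail = new TailFile('/var/log/myapp.log', {
  encoding: 'utf8',          // Passed through to the Readable superclass
  pollFileIntervalMs: 500,   // Poll for changes twice per second
  startPos: 0,               // Read from the beginning of the file instead of EOF
  readStreamOpts: {
    highWaterMark: 64 * 1024 // Passed to fs.createReadStream for each poll's read
  }
})
```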
### `tail.start()`

- Returns: `<Promise>` - Resolves after the file is polled successfully
- Rejects: If `filename` is not found

Calling `start()` begins the polling of `filename` to watch for added or
changed bytes. `start()` may be called before or after data is set up to be
consumed with a `data` listener or a `pipe`. Standard Node stream rules apply,
which say that data will not flow through the stream until it's consumed.
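For instance, consumers may be attached before `start()` is ever called (a
minimal sketch):

```js
const TailFile = require('@logdna/tail-file')

const tail = new TailFile('./somelog.txt', {encoding: 'utf8'})

tail.on('data', (chunk) => {
  process.stdout.write(chunk)
})

// Data begins flowing once start() succeeds; the setup order doesn't matter
tail.start().catch((err) => {
  console.error('Cannot start. Does the file exist?', err)
})
```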
### `tail.quit()`

- Returns: `undefined`
- Emits: `close` when the parent `Readable` stream is ended

This function closes all streams and exits cleanly. The parent `TailFile`
stream will be properly ended by pushing `null`, therefore an `end` event may
be emitted as well.
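As a sketch, a graceful shutdown might look like this (the signal handling is
illustrative; the key point is calling `quit()` and waiting for `close`):

```js
const TailFile = require('@logdna/tail-file')

const tail = new TailFile('./somelog.txt')

async function main() {
  await tail.start()
  tail.on('data', (chunk) => process.stdout.write(chunk))

  process.once('SIGINT', () => {
    tail.once('close', () => {
      console.log('TailFile has closed all of its streams')
      process.exit(0)
    })
    tail.quit() // Ends the stream by pushing null; 'end' and 'close' follow
  })
}

main().catch(console.error)
```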
Using "file watcher" events don't always work across different operating systems,
therefore the most effective way to "tail" a file is to continuously poll
it for changes and read those changes when they're detected.
Even Unix's tail -f
command works similarly.
Once start()
is called, TailFile
will being this polling process. As changes
are detected through a .size
comparison, it uses fs.openReadStream
to
efficiently read to the end of the file using async/await iterators.
This allows backpressure to be supported throughout the process.
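To make that flow concrete, here is a heavily simplified sketch of one poll
cycle. This is not `TailFile`'s actual implementation; the function and
variable names are illustrative:

```js
const fs = require('fs')

// Read any bytes appended to `filename` since the last poll, then return the
// new size so the caller can use it as `lastSize` on the next poll.
async function pollOnce(filename, lastSize) {
  const {size} = await fs.promises.stat(filename)

  if (size > lastSize) {
    // Read only the newly-added byte range (`end` is inclusive)
    const stream = fs.createReadStream(filename, {start: lastSize, end: size - 1})

    // Async iteration naturally respects backpressure from the consumer
    for await (const chunk of stream) {
      process.stdout.write(chunk) // The real code would push() into the Readable
    }
  }

  return size
}
```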
## How Log Rolling is Handled

`TailFile` keeps a `FileHandle` open for `filename`, which is attached to an
inode. If log rolling happens, `TailFile` uses the `FileHandle` to read the
rest of the "old" file before starting the process from the beginning of the
newly-created file. This ensures that no data is lost due to the
rolling/renaming of `filename`. This functionality assumes that `filename` is
re-created with the same name, otherwise an error is emitted if `filename`
does not re-appear.
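For visibility into that process, one can listen for the rotation-related
events named in the Events section above (a sketch; consult that section for
the exact payloads, as the `info` argument here is an assumption):

```js
const TailFile = require('@logdna/tail-file')

const tail = new TailFile('./somelog.txt')

tail
  .on('renamed', (info) => {
    console.log('Log was rolled; the old file will be drained first', info)
  })
  .on('retry', (info) => {
    console.log('Poll failed; waiting for the new file to appear', info)
  })
  .on('tail_error', (err) => {
    console.error('The file never re-appeared', err)
  })

tail.start().catch((err) => {
  console.error('Cannot start. Does the file exist?', err)
})
```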
## Backpressure Pauses Polling

Because `TailFile` won't be consumed until it is in a reading mode, this may
cause backpressure to be enacted. In other words, if `.start()` is called, but
`pipe` or `data` events are not immediately set up, `TailFile` may encounter
backpressure if its `push()` calls exceed the high water mark. Backpressure
can also happen if `TailFile` becomes unpiped. In these cases, `TailFile` will
stop polling and wait until data is flowing before polling resumes.

While polling is off during backpressure, `TailFile` can handle a single log
roll or rename, but if the log is renamed more than once, there will most
likely be data loss, as polling for changes will be off. This is an extremely
unlikely edge case; however, we recommend consuming the `TailFile` stream
almost immediately upon creation.