'\"
.\" (C) Copyright 2001-2018 David A. Wheeler ([email protected])
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; if not, write to the Free Software
.\" Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
.\"
.\"
.\"
.\" Man page created 17 May 2001 by David A. Wheeler ([email protected])
.\"
.TH FLAWFINDER 1 "3 Jun 2021" "Flawfinder" "Flawfinder"
.SH NAME
flawfinder \- lexically find potential security flaws ("hits") in source code
.SH SYNOPSIS
.B flawfinder
.\" Documentation:
.RB [ \-\-help | \-h ]
.RB [ \-\-version ]
.RB [ \-\-listrules ]
.br
.\" Selecting Input Data:
.RB [ \-\-allowlink ]
.RB [ \-\-followdotdir ]
.RB [ \-\-nolink ]
.br
.RB [ \-\-patch=\fIfilename\fR | \-P\ \fIfilename\fR ]
.br
.\" Selecting Hits to Display:
.RB [ \-\-inputs | \-I ]
[ \fB\-\-minlevel=\fR\fIX\fR | \fB\-m\fR\ \fIX\fR ]
.RB [ \-\-falsepositive | \-F ]
.br
.RB [ \-\-neverignore | \-n ]
.br
[\fB\-\-regex=\fR\fIPATTERN\fR | \fB\-e\fR \fIPATTERN\fR]
.br
.\" Selecting Output Format:
.RB [ \-\-context | \-c ]
.RB [ \-\-columns | \-C ]
.RB [ \-\-csv ]
.RB [ \-\-dataonly | \-D ]
.RB [ \-\-html | \-H ]
.RB [ \-\-immediate | \-i ]
.RB [ \-\-sarif ]
.RB [ \-\-singleline | \-S ]
.RB [ \-\-omittime ]
.RB [ \-\-quiet | \-Q ]
.RB [ \-\-error\-level=\fILEVEL\fR ]
.br
.\" Managing hit list.
[\fB\-\-loadhitlist=\fR\fIF\fR]
[\fB\-\-savehitlist=\fR\fIF\fR]
[\fB\-\-diffhitlist=\fR\fIF\fR]
.br
.RB [ \-\- ]
.I [ source code file or source root directory ]+
.SH DESCRIPTION
.PP
Flawfinder searches through C/C++ source code looking for
potential security flaws.
To run flawfinder, simply give flawfinder a list of directories or files.
For each directory given, all files that have C/C++ filename extensions
in that directory (and its subdirectories, recursively) will be examined.
Thus, for most projects, simply give flawfinder the name of the source
code's topmost directory (use ``.'' for the current directory),
and flawfinder will examine all of the project's C/C++ source code.
Flawfinder does \fInot\fR require that you be able to build your software,
so it can be used even with incomplete source code.
If you only want to have \fIchanges\fR reviewed, save a unified diff
of those changes (created by GNU "diff -u" or "svn diff" or "git diff")
in a patch file and use the \-\-patch (\-P) option.
.PP
Flawfinder will produce a list of ``hits'' (potential
security flaws, also called findings),
sorted by risk; the riskiest hits are shown first.
The risk level is shown inside square brackets and
varies from 0, very little risk, to 5, great risk.
This risk level depends not only on the function, but on the values of the
parameters of the function.
For example, constant strings are often less risky than fully variable
strings in many contexts, and in those contexts the hit will have a
lower risk level.
Flawfinder knows about gettext (a common library for internationalized
programs) and will treat constant strings
passed through gettext as though they were constant strings; this reduces
the number of false hits in internationalized programs.
Flawfinder will do the same sort of thing with _T() and _TEXT(),
common Microsoft macros for handling internationalized programs.
.\" For more info, see: http://www.rpi.edu/~pudeyo/articles/unicode.html
Flawfinder correctly ignores text inside comments and strings.
Normally flawfinder shows all hits with a risk level of at least 1,
but you can use the \-\-minlevel option
to show only hits with higher risk levels if you wish.
Hit descriptions also note the relevant
Common Weakness Enumeration (CWE) identifier(s) in parentheses,
as discussed below.
Flawfinder is officially CWE-Compatible.
Hit descriptions with "[MS-banned]" indicate functions that are in the
banned list of functions released by Microsoft; see
http://msdn.microsoft.com/en-us/library/bb288454.aspx
for more information about banned functions.
.PP
Not every hit (aka finding) is actually a security vulnerability,
and not every security vulnerability is necessarily found.
Nevertheless, flawfinder can be an aid in finding and removing
security vulnerabilities.
A common way to use flawfinder is to first
apply flawfinder to a set of source code and examine the
highest-risk items.
Then, use \-\-inputs to examine the input locations, and check to
make sure that only legal and safe input values are
accepted from untrusted users.
.PP
Once you've audited a program, you can mark source code lines that
are actually fine but cause spurious warnings so that flawfinder will
stop complaining about them.
To mark a line so that these warnings are suppressed,
put a specially-formatted comment either on the same
line (after the source code) or all by itself in the previous line.
The comment must have one of the two following formats:
.IP \(bu
// Flawfinder: ignore
.IP \(bu
/* Flawfinder: ignore */
.PP
For compatibility's sake, you can replace "Flawfinder:" with
"ITS4:" or "RATS:" in these specially-formatted comments.
Since it's possible that such lines are wrong, you can use
the \-\-neverignore option, which causes flawfinder to never ignore any line
no matter what the comment directives say
(more confusingly, \-\-neverignore ignores the ignores).
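.PP
The suppression rule described above can be sketched in Python.
This is only an illustration, not flawfinder's actual implementation:
the names has_ignore_directive and is_suppressed are hypothetical, and
matching the prefix case-insensitively is an assumption.
.nf
```python
# Comment prefixes the man page says are accepted (lower-cased for comparison).
PREFIXES = ("flawfinder:", "its4:", "rats:")


def has_ignore_directive(line):
    """Return True if the line carries a comment such as // Flawfinder: ignore."""
    for opener in ("//", "/*"):
        start = line.find(opener)
        if start < 0:
            continue
        comment = line[start + 2:].strip().lower()
        for prefix in PREFIXES:
            rest = comment[len(prefix):].strip()
            if comment.startswith(prefix) and rest.startswith("ignore"):
                return True
    return False


def is_suppressed(lines, index):
    """lines: source lines; index: 0-based index of the line with the hit.

    A hit is suppressed by a directive on the same line, or by a directive
    that sits all by itself on the previous line.
    """
    if has_ignore_directive(lines[index]):
        return True
    if index > 0:
        previous = lines[index - 1].strip()
        return previous.startswith(("//", "/*")) and has_ignore_directive(previous)
    return False
```
.fi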
.PP
Flawfinder uses an internal database called the ``ruleset'';
the ruleset identifies functions that are common causes of security flaws.
The standard ruleset includes a large number of different potential
problems, including both general issues that can impact any
C/C++ program, as well as a number of specific Unix-like and Windows
functions that are especially problematic.
The \-\-listrules option reports the list of current rules and their
default risk levels.
As noted above, every potential security flaw found in a given source code file
(matching an entry in the ruleset)
is called a ``hit,'' and the set of hits found during any particular
run of the program is called the ``hitlist.''
Hitlists can be saved (using \-\-savehitlist), reloaded for redisplay
(using \-\-loadhitlist), and you can show only the hits that are different
from another run (using \-\-diffhitlist).
.PP
Flawfinder is a simple tool, leading to some fundamental pros and cons.
Flawfinder works by doing simple lexical tokenization
(skipping comments and correctly tokenizing strings),
looking for token matches to the database
(particularly to find function calls).
Flawfinder is thus similar to RATS and ITS4, which also
use simple lexical tokenization.
Flawfinder then examines the
text of the function parameters to estimate risk.
Unlike tools such as splint, gcc's warning flags,
and clang, flawfinder does \fInot\fR use or have access to
information about control flow, data flow, or data types when
searching for potential vulnerabilities or estimating the level of risk.
Thus, flawfinder will necessarily
produce many false positives for vulnerabilities
and fail to report many vulnerabilities.
On the other hand, flawfinder can find vulnerabilities in programs that
cannot be built or cannot be linked.
It can often work with programs that cannot even be compiled
(at least by the reviewer's tools).
Flawfinder also doesn't get as confused by macro definitions
and other oddities that more sophisticated tools have trouble with.
Flawfinder can also be useful as a simple
introduction to static analysis tools in general,
since it is easy to start using and easy to understand.
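.PP
To make the lexical approach concrete, here is a minimal Python sketch of
the idea: skip comments and string literals, then look up each identifier
in a table of risky names.
It is not flawfinder's code; RULESET is a hypothetical three-entry table,
find_hits is an invented name, and the sketch does no parameter analysis.
.nf
```python
# Hypothetical mini-ruleset: token -> default risk level (not flawfinder's data).
RULESET = {"gets": 5, "strcpy": 4, "sprintf": 4}
BACKSLASH = chr(92)  # the escape character inside C string literals


def find_hits(source):
    """Return (line_number, token, level) for ruleset tokens found outside
    comments, string literals, and character literals."""
    hits = []
    in_comment = False  # True while inside a /* ... */ comment
    for lineno, text in enumerate(source.splitlines(), 1):
        i = 0
        while i < len(text):
            if in_comment:
                end = text.find("*/", i)
                if end < 0:
                    break  # the comment continues on the next line
                in_comment = False
                i = end + 2
            elif text.startswith("/*", i):
                in_comment = True
                i += 2
            elif text.startswith("//", i):
                break  # the rest of the line is a comment
            elif text[i] in ('"', "'"):
                quote = text[i]
                i += 1
                while i < len(text) and text[i] != quote:
                    i += 2 if text[i] == BACKSLASH else 1  # skip escaped chars
                i += 1  # skip the closing quote
            elif text[i].isalpha() or text[i] == "_":
                j = i
                while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                    j += 1
                token = text[i:j]
                if token in RULESET:
                    hits.append((lineno, token, RULESET[token]))
                i = j
            else:
                i += 1
    return hits
```
.fi
.PP
Because the scan never looks at types or control flow, a variable that
merely shares a name with a risky function would also be reported,
which is the kind of false positive discussed above.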
.PP
Any filename given on the command line will be examined (even if
it doesn't have a usual C/C++ filename extension); thus you can force
flawfinder to examine any specific files you desire.
While searching directories recursively, flawfinder only opens and
examines regular files that have C/C++ filename extensions.
Flawfinder presumes that files are C/C++ files if they have the extensions
".c", ".h", ".ec", ".ecp", ".pgc", ".C", ".cpp",
".CPP", ".cxx", ".c++", ".cc", ".CC", ".pcc", ".hpp", or ".H".
The filename ``\-'' means the standard input.
To prevent security problems,
special files (such as device special files and named pipes) are
always skipped, and by default symbolic links are skipped
(the \-\-allowlink option follows symbolic links).
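.PP
The extension check can be pictured with a short Python sketch.
The set below copies the extensions listed above; the names
C_EXTENSIONS and looks_like_c are hypothetical, and the comparison is
assumed to be case-sensitive because the list names both ".c" and ".C".
.nf
```python
import os

# Extensions the man page lists as presumed C/C++ files.
C_EXTENSIONS = {
    ".c", ".h", ".ec", ".ecp", ".pgc", ".C", ".cpp", ".CPP",
    ".cxx", ".c++", ".cc", ".CC", ".pcc", ".hpp", ".H",
}


def looks_like_c(filename):
    """Return True if the file name ends in one of the listed extensions."""
    return os.path.splitext(filename)[1] in C_EXTENSIONS
```
.fi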
.PP
After the list of hits is a brief summary of the results
(use \-D to remove this information).
It will show the number of hits, lines analyzed (as reported by wc \-l),
and the physical source lines of code (SLOC) analyzed.
A physical SLOC is a non-blank, non-comment line.
It will then show the number of hits at each level; note that there will
never be a hit at a level lower than minlevel (1 by default).
Thus, "[0] 0 [1] 9" means that at level 0 there were 0 hits reported,
and at level 1 there were 9 hits reported.
It will next show the number of hits at a given level or larger
(so level 3+ has the sum of the number of hits at level 3, 4, and 5).
Thus, an entry of "[0+] 37" shows that at level 0 or higher there were
37 hits (the 0+ entry will always be the same as the "hits" number above).
Hits per KSLOC is next shown; this is each of the "level or higher"
values multiplied by 1000 and divided by the physical SLOC.
If symlinks were skipped, the count of those is reported.
If hits were suppressed (using the "ignore" directive
in source code comments as described above), the number suppressed is reported.
The minimum risk level to be included in the report
is displayed; by default this is 1 (use \-\-minlevel to change this).
The summary ends with important reminders:
Not every hit is necessarily a security vulnerability, and
there may be other security vulnerabilities not reported by the tool.
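.PP
The arithmetic behind the "level or higher" and hits-per-KSLOC figures
can be written out in a few lines of Python.
This is a sketch of the calculation as described above, not flawfinder's
own code; the function name summarize and its argument layout are
assumptions.
.nf
```python
def summarize(hits_per_level, sloc):
    """hits_per_level: six counts, for risk levels 0 through 5.
    sloc: physical source lines of code analyzed (must be nonzero).

    Returns (cumulative, per_ksloc), where cumulative[n] is the number of
    hits at level n or higher and per_ksloc[n] is that number per 1000 SLOC.
    """
    cumulative = [sum(hits_per_level[level:]) for level in range(6)]
    per_ksloc = [count * 1000 / sloc for count in cumulative]
    return cumulative, per_ksloc
```
.fi
.PP
For instance, with counts of 0, 9, 5, 3, 2, and 1 hits at levels 0
through 5, the level 3+ entry is 3 + 2 + 1 = 6.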
.PP
Flawfinder can easily integrate into a continuous integration system.
The \-\-error\-level option helps with this; for example,
\-\-error\-level=4 causes flawfinder to return an error code if it
finds a hit of level 4 or higher.
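.PP
One way a continuous integration job might wrap this is sketched below in
Python.
The function name run_gate and the injectable run parameter are
hypothetical conveniences; only options documented in this page are passed,
and flawfinder is assumed to be on the PATH.
Note that a nonzero status could also come from some other failure,
so a real job may want to inspect the output as well.
.nf
```python
import subprocess


def run_gate(paths, error_level=4, run=subprocess.run):
    """Run flawfinder on the given paths and return True if the job
    should fail, i.e. if flawfinder exited with a nonzero status."""
    cmd = ["flawfinder", "--quiet", "--dataonly",
           "--error-level=%d" % error_level, "--"] + list(paths)
    result = run(cmd)
    return result.returncode != 0
```
.fi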
.PP
Flawfinder is released under the GNU General Public License (GPL),
version 2 or later (GPLv2+).
.PP
Flawfinder works similarly to another program, ITS4, which is not
fully open source software (as defined in the Open Source Definition)
nor free software (as defined by the Free Software Foundation).
The author of Flawfinder has never seen ITS4's source code.
Flawfinder is similar in many ways to RATS, if you are familiar with RATS.
.SH "BRIEF TUTORIAL"
Here's a brief example of how flawfinder might be used.
Imagine that you have the C/C++ source code for some program named xyzzy
(which you may or may not have written), and you're
searching for security vulnerabilities (so you can fix them before
customers encounter the vulnerabilities).
For this tutorial, I'll assume that you're using a Unix-like system,
such as Linux, OpenBSD, or MacOS X.
.PP
If the source code is in a subdirectory named xyzzy, you would probably
start by opening a text window and using flawfinder's default settings, to
analyze the program and report a prioritized list of potential
security vulnerabilities (the ``less'' just makes sure the results
stay on the screen):
.RS
flawfinder xyzzy | less
.RE
.PP
At this point, you will see a large number of entries.
Each entry has a filename, a colon, a line number, a
risk level in brackets (where 5 is the most risky), a category,
the name of the function, and
a description of why flawfinder thinks the line is a vulnerability.
Flawfinder normally sorts by risk level, showing the riskiest items
first; if you have limited time, it's probably best to start working on
the riskiest items and continue until you run out of time.
If you want to limit the display to risks with only
a certain risk level or higher, use
the \-\-minlevel option.
If you're getting an extraordinary number of false positives because
variable names look like dangerous function names, use the \-F option
to remove reports about them.
If you don't understand the error message, please see documents such as the
.UR "https://dwheeler.com/secure-programs"
.I "Secure Programming HOWTO"
.UE
at
https://dwheeler.com/secure-programs
which provides more information on writing secure programs.
.PP
Once you identify the problem and understand it, you can fix it.
Occasionally you may want to re-do the analysis, both because the
line numbers will change \fIand\fP to make sure that the new code
doesn't introduce a different vulnerability.
.PP
If you've determined that some line isn't really a problem, and
you're sure of it, you can insert just before or on the offending
line a comment like
.RS
/* Flawfinder: ignore */
.RE
to keep it from showing up in the output.
.PP
Once you've done that, you should go back and search for the
program's inputs, to make sure that the program strongly filters
any of its untrusted inputs.
Flawfinder can identify many program inputs by using the \-\-inputs
option, like this:
.RS
flawfinder \-\-inputs xyzzy
.RE
.PP
Flawfinder can integrate well with text editors and
integrated development environments; see the examples for
more information.
.PP
Flawfinder includes many other options, including ones to
create HTML versions of the output (useful for prettier displays) and
OASIS Static Analysis Results Interchange Format (SARIF) output.
The next section describes those options in more detail.
.SH OPTIONS
Flawfinder has a number of options, which can be grouped into options that
control its own documentation,
select input data,
select which hits to display,
select the output format,
and perform hitlist management.
The commonly-used flawfinder options
support the standard option syntax defined in the
POSIX (Issue 7, 2013 Edition) section ``Utility Conventions''.
Flawfinder also supports the GNU long options
(double-dash options of form \-\-\fIoption\fR)
as defined in the \fIGNU C Library Reference Manual\fR
``Program Argument Syntax Conventions''
and \fIGNU Coding Standards\fR ``Standards for Command Line Interfaces''.
Long option arguments can be provided as ``\-\-name=value'' or ``\-\-name value''.
All options can be accessed using the more
readable GNU long option conventions;
some less commonly used options can \fIonly\fR be accessed
using long option conventions.
.SS "Documentation"
.TP 12
.BI \-\-help
.TP
.BI \-h
.\" Leave -? undocumented... it also invokes help.
Show usage (help) information.
.TP
.BI \-\-version
Shows (just) the version number and exits.
.TP 12
.BI \-\-listrules
List the terms (tokens)
that trigger further examination, their default risk level,
and the default warning (including the CWE identifier(s), if applicable),
all tab-separated.
The terms are primarily names of potentially-dangerous functions.
Note that the reported risk level and warning
for some specific code may be different than the default,
depending on how the term is used.
Combine with \-D if you do not want the usual header.
Flawfinder version 1.29 changed the separator from spaces to tabs, and
added the default warning field.
.SS "Selecting Input Data"
.TP 12
.BI \-\-allowlink
Allow the use of symbolic links; normally symbolic links are skipped.
Don't use this option if you're analyzing code by others;
attackers could do many things to cause problems for an analysis
with this option enabled.
For example, an attacker
could insert symbolic links to files such as /etc/passwd
(leaking information about the file) or create a circular loop,
which would cause flawfinder to run ``forever''.
Another problem with enabling this option is that
if the same file is referenced multiple times using symbolic links,
it will be analyzed multiple times (and thus reported multiple times).
Note that flawfinder already includes some protection against symbolic links
to special file types such as device file types (e.g., /dev/zero or
C:\\mystuff\\com1).
Note that for flawfinder version 1.01 and before, this was the default.
.TP
.BI \-\-followdotdir
Enter directories whose names begin with ".".
Normally such directories are ignored, since they normally
include version control private data (such as .git/ or .svn/),
build metadata (such as .makepp),
configuration information, and so on.
.TP
.BI \-\-nolink
Ignored.
Historically this disabled following symbolic links;
this behavior is now the default.
.TP 12
\fB\-\-patch=\fR\fIpatchfile\fR
.TP
\fB\-P\fR \fIpatchfile\fR
Examine the selected files or directories, but only report hits in lines
that are added or modified as described in the given patch file.
The patch file must be in a recognized unified diff format
(e.g., the output of GNU "diff -u old new", "svn diff", or "git diff [commit]").
Flawfinder assumes that the patch has already been applied to the files.
The patch file can also include changes to irrelevant files
(they will simply be ignored).
The line numbers given in the patch file are used to determine which
lines were changed, so if you have modified the files since the
patch file was created, regenerate the patch file first.
Beware that the file names of the new files
given in the patch file must match exactly,
including upper/lower case, path prefix, and directory
separator (\\ vs. /).
Only unified diff format is accepted (GNU diff, svn diff, and
git diff output is okay);
if you have a different format, again regenerate it first.
Only hits that occur on resultant changed lines, or immediately
above and below them, are reported.
This option implies \-\-neverignore.
\fBWarning\fR: Do \fInot\fR pass a patch file without the
\fB\-P\fR, because flawfinder will then try to treat the file as a
source file.
This will often work, but the line numbers will be relative
to the beginning of the patch file, not the positions in the
source code.
Note that you \fBmust\fR also provide the actual files to analyze,
and not just the patch file; when using \fB\-P\fR files are only reported
if they are both listed in the patch and also listed (directly or indirectly)
in the list of files to analyze.
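.IP
To illustrate how the line numbers in a unified diff identify changed
lines, here is a rough Python sketch.
It is not flawfinder's parser: the name changed_lines is hypothetical,
file names are assumed to contain no whitespace, and the sketch records
only added lines, not the lines immediately above and below them.
.nf
```python
def changed_lines(patch_text):
    """Map each new-file name in a unified diff to the set of line numbers
    (in the new file) that the patch adds or modifies."""
    changed = {}
    name, lineno = None, 0
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            name = line[4:].split()[0]  # assumes no whitespace in the name
            changed.setdefault(name, set())
        elif line.startswith("@@ ") and name is not None:
            # Hunk header: "@@ -old_start,old_len +new_start,new_len @@"
            new_range = line.split()[2]
            lineno = int(new_range[1:].split(",")[0])
        elif name is not None and line.startswith("+"):
            changed[name].add(lineno)
            lineno += 1
        elif name is not None and line.startswith(" "):
            lineno += 1  # context line: present in both old and new file
        # Lines starting with "-" exist only in the old file, so the
        # new-file line counter does not advance for them.
    return changed
```
.fi
.IP
The keys are the names exactly as they appear after "+++" in the patch,
which is why the names must match the analyzed files exactly.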
.SS "Selecting Hits to Display"
.TP
.BI "\-\-inputs"
.TP
.BI \-I
Show only functions that obtain data from outside the program;
this also sets minlevel to 0.
.TP
\fB\-\-minlevel=\fIX\fR
.TP
.BI -m " X"
Set minimum risk level to X for inclusion in hitlist.
This can be from 0 (``no risk'') to 5 (``maximum risk'');
the default is 1.
.TP
.BI "\-\-falsepositive"
.TP
.BI \-F
Do not include hits that are likely to be false positives.
Currently, this means that function names are ignored if they're
not followed by "(", and that declarations of character arrays aren't
noted.
Thus, if you use a variable named "access" everywhere, this will
eliminate references to this ordinary variable.
This isn't the default, because this also increases the likelihood
of missing important hits; in particular, function names in #define
clauses and calls through function pointers will be missed.
.TP
.BI \-\-neverignore
.TP
.BI -n
Never ignore security issues, even if they have an ``ignore'' directive
in a comment.
.TP
\fB\-\-regex=\fR\fIPATTERN\fR
.TP
\fB-e\fR \fIPATTERN\fR
Only report hits with text that matches the regular expression pattern PATTERN.
For example, to only report hits containing the text "CWE-120",
use ``\-\-regex CWE-120''.
These option flag names are the same as grep.
.SS "Selecting Output Format"
.TP 12
.BI \-\-columns
.TP
.BI \-C
Show the column number (as well as the file name and line number)
of each hit; this is shown after the line number by adding a colon
and the column number in the line (the first character in a line is
column number 1).
This is useful for editors that can jump to specific columns, or
for integrating with other tools (such as those to further filter out
false positives).
.TP
.BI \-\-context
.TP
.BI \-c
Show context, i.e., the line having the "hit"/potential flaw.
By default the line is shown immediately after the warning.
.TP
.BI \-\-csv
Generate output in comma-separated-value (CSV) format.
This is the recommended format for sending to other tools for processing.
It will always generate a header row, followed by 0 or more data rows
(one data row for each hit).
Selecting this option automatically enables \-\-quiet and
\-\-dataonly.
The headers are mostly self-explanatory.
"File" is the filename, "Line" is the line number,
"Column" is the column (starting from 1),
"Level" is the risk level (0-5, 5 is riskiest),
"Category" is the general flawfinder category,
"Name" is the name of the triggering rule,
"Warning" is text explaining why it is a hit (finding),
"Suggestion" is text suggesting how it might be fixed,
"Note" is other explanatory notes,
"CWEs" is the list of one or more CWEs,
"Context" is the source code line triggering the hit,
and "Fingerprint" is the SHA-256 hash of the context once
its leading and trailing whitespace have been removed
(the fingerprint may help detect and eliminate later duplications).
If you use Python3, the hash is of the context when encoded as UTF-8.
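.IP
The fingerprint described above can be reproduced with Python's standard
hashlib module.
This is a sketch based only on the description here; the function name
fingerprint is hypothetical, and the hexadecimal representation of the
hash is an assumption.
.nf
```python
import hashlib


def fingerprint(context):
    """SHA-256 of the context line with leading and trailing whitespace
    removed, encoded as UTF-8, shown as hexadecimal digits."""
    return hashlib.sha256(context.strip().encode("utf-8")).hexdigest()
```
.fi
.IP
Two hits whose context lines differ only in indentation therefore share
a fingerprint, which is what makes it useful for spotting duplicates.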
.TP
.BI "\-\-dataonly"
.TP
.BI \-D
Don't display the header and footer.
Use this along with \-\-quiet to see just the data itself.
.TP
.BI \-\-html
.TP
.BI \-H
Format the output as HTML instead of as simple text.
.TP
.BI "\-\-immediate"
.TP
.BI -i
Immediately display hits (don't just wait until the end).
.TP
.BI \-\-sarif
Produce output in the OASIS
Static Analysis Results Interchange Format (SARIF), a JSON-based format.
The goals of the SARIF format, as explained in
version 2.1.0 (27 March 2020) of its specification, include being able to
"comprehensively capture the range of data produced by commonly
used static analysis tools."
SARIF output identifies the tool name as "Flawfinder".
The flawfinder levels 0 through 5 are mapped to SARIF rank (by dividing by 5),
SARIF level, and the default viewer action as follows:
.br
Flawfinder 0: SARIF rank 0.0, SARIF level note, Does not display by default
.br
Flawfinder 1: SARIF rank 0.2, SARIF level note, Does not display by default
.br
Flawfinder 2: SARIF rank 0.4, SARIF level note, Does not display by default
.br
Flawfinder 3: SARIF rank 0.6, SARIF level warning, Displays by default, does not break build / other processes
.br
Flawfinder 4: SARIF rank 0.8, SARIF level error, Displays by default, breaks build / other processes
.br
Flawfinder 5: SARIF rank 1.0, SARIF level error, Displays by default, breaks build / other processes
.br
A big thanks to Yong Yan for implementing SARIF output generation for flawfinder!
For more about the SARIF format, see:
https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=sarif
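.IP
The level mapping listed above amounts to the following small Python
function.
It is written from the list in this page, not taken from flawfinder's
source, and the name sarif_rank_and_level is hypothetical.
.nf
```python
def sarif_rank_and_level(flawfinder_level):
    """Map a flawfinder risk level (0 to 5) to (SARIF rank, SARIF level)."""
    if not 0 <= flawfinder_level <= 5:
        raise ValueError("flawfinder levels run from 0 to 5")
    rank = flawfinder_level / 5
    if flawfinder_level >= 4:
        level = "error"
    elif flawfinder_level == 3:
        level = "warning"
    else:
        level = "note"
    return rank, level
```
.fi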
.TP
.BI "\-\-singleline"
.TP
.BI -S
Display a single line of text output for each hit.
Useful for interacting with compilation tools.
.TP
.BI "\-\-omittime"
Omit timing information.
This is useful for regression tests of flawfinder itself, so that
the output doesn't vary depending on how long the analysis takes.
.TP
.BI "\-\-quiet"
.TP
.BI \-Q
Don't display status information (i.e., which files are being examined)
while the analysis is going on.
.TP
.BI "\-\-error-level=LEVEL"
Return a nonzero (false) error code if there is at least one
hit of LEVEL or higher. If a diffhitlist is provided,
hits noted in it are ignored.
This option can be useful within a continuous integration script,
especially if you mark known-okay lines as "flawfinder: ignore".
Usually you want level to be fairly high, such as 4 or 5.
By default, flawfinder returns 0 (true) on a successful run.
.SS "Hitlist Management"
.\" This isn't sorted as usual, because logically saving comes
.\" before loading and differencing.
.TP 12
\fB\-\-savehitlist=\fR\fIF\fR
Save all resulting hits (the "hitlist") to F.
.TP
\fB\-\-loadhitlist=\fR\fIF\fR
Load the hitlist from F instead of analyzing source programs.
Warning: Do \fInot\fR load hitlists from untrusted sources
(for security reasons).
These are internally implemented using Python's "pickle" facility,
which trusts the input.
Note that stored hitlists often cannot be read when using an older version
of Python; in particular, if \-\-savehitlist was used while
flawfinder was run using Python 3,
the hitlist can't be loaded by running flawfinder with Python 2.
.TP
\fB\-\-diffhitlist=\fR\fIF\fR
Show only hits (loaded or analyzed) not in F.
F was presumably created previously using \-\-savehitlist.
Warning: Do \fInot\fR diff hitlists from untrusted sources
(for security reasons).
If the \-\-loadhitlist option is not provided, this will show the hits in
the analyzed source code files that were not previously stored in F.
If used along with \-\-loadhitlist, this will show the hits in the
loaded hitlist not in F.
The difference algorithm is conservative;
hits are only considered the ``same'' if they have the same
filename, line number, column position, function name, and risk level.
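.IP
The conservative comparison can be sketched as a set difference over the
five fields named above.
This is an illustration only: hits are represented here as plain
dictionaries with hypothetical key names, whereas flawfinder stores its
own objects in the pickled hitlist.
.nf
```python
def hit_key(hit):
    """The fields that must all match for two hits to count as the same."""
    return (hit["filename"], hit["line"], hit["column"],
            hit["name"], hit["level"])


def new_hits(current, baseline):
    """Return the hits in current that do not appear in baseline."""
    seen = {hit_key(hit) for hit in baseline}
    return [hit for hit in current if hit_key(hit) not in seen]
```
.fi
.IP
Because the line number is part of the key, a hit that merely moved to a
different line is reported as new.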
.SS "Character Encoding Errors"
Flawfinder uses the character encoding rules set by Python.
Sometimes source code does not perfectly follow some encoding rules.
If you run flawfinder with Python 2
these non-conformities often do not impact processing in practice.
However, if you run flawfinder with Python 3, this can be a problem.
Python 3 developers want the world to always use encodings perfectly correctly,
everywhere, and in general want everyone to only use UTF-8.
UTF-8 is a great encoding, and it is very popular, but
the world often doesn't care what the Python 3 developers want.
When running flawfinder using Python 3, the program will crash hard if
\fIany\fR source file has \fIany\fR non-conforming text.
It will do this even if the non-conforming text is in comments or strings
(where it often doesn't matter).
Python 3 fails to provide useful built-ins to deal with
the messiness of the real world, so it's
non-trivial to deal with this problem without depending on external
libraries (which we're trying to avoid).
A symptom of this problem
is an error message like this when you run flawfinder:
.PP
\fIError: encoding error in ,1.c\fR
.br
\fI'utf-8' codec can't decode byte 0xff in position 45: invalid start byte\fR
.PP
What you are seeing is the result of an internal UnicodeDecodeError.
If this happens to you, there are several options:
.PP
Option #1 (special case):
if your system normally uses an encoding other than UTF-8,
is properly set up to use that encoding (using LC_ALL and maybe LC_CTYPE),
and the input files are in that non-UTF-8 encoding,
it may be that Python3 is (incorrectly) ignoring your configuration.
In that case, simply tell Python3 to use your
configuration by setting the environment variable PYTHONUTF8=0, e.g.,
run flawfinder as:
"PYTHONUTF8=0 python3 flawfinder ...".
.PP
Option #2 (special case): If you know what the encoding of the files is,
you can force use of that encoding. E.g., if the encoding
is BLAH, run flawfinder as:
"PYTHONUTF8=0 LC_ALL=C.BLAH python3 flawfinder ...".
You can replace "C" after LC_ALL= with your real language locale
(e.g., "en_US").
.PP
Option #3: If you don't know what the encoding is, or the encoding is
inconsistent (e.g., the common case of UTF-8 files with some
characters encoded using Windows-1252 instead),
then you can force the system to use the
ISO-8859-1 (Latin-1) encoding in which all bytes are allowed.
If the inconsistencies are only in comments and strings, and the
underlying character set is "close enough" to ASCII, this can get you
going in a hurry.
You can do this by running:
"PYTHONUTF8=0 LC_ALL=C.ISO-8859-1 python3 flawfinder ...".
In some cases you may not need the "PYTHONUTF8=0".
You may be able to replace "C" after LC_ALL= with your real language locale
(e.g., "en_US").
.PP
Option #4: Convert the encoding of the files to be analyzed so that it's
a single encoding - it's highly recommended to convert to UTF-8.
For example, the system program "iconv"
or the Python program cvt2utf
can be used to convert encodings.
(You can install cvt2utf with "pip install cvtutf").
This works well if some files have one encoding, and some have another,
but they are consistent within a single file.
If the files have encoding errors, you'll have to fix them.
.PP
Option #5: Run flawfinder using Python 2 instead of Python 3.
E.g., "python2 flawfinder ...".
.PP
To be clear:
I strongly recommend using the UTF-8 encoding for all source code,
and using continuous integration tests to ensure that the source code
is always valid UTF-8.
If you do that, many problems disappear.
But in the real world this is not always the situation.
Hopefully
this information will help you deal with real-world encoding problems.
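.PP
As an illustration of the conversion approach in Option #4, here is a
minimal Python sketch, not part of flawfinder.
It assumes each file is either valid UTF-8 or entirely in one
single-byte encoding (Windows-1252 is only a guess used as the default);
the helper names decode_source, to_utf8, and convert_file are hypothetical.
.PP
.nf
```python
from pathlib import Path


def decode_source(raw, fallback="cp1252"):
    """Decode bytes as UTF-8, else as an assumed single-byte encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" substitutes U+FFFD for the few byte values
        # that the fallback encoding leaves undefined (e.g. 0x81).
        return raw.decode(fallback, errors="replace")


def to_utf8(raw, fallback="cp1252"):
    """Return the content of raw re-encoded as UTF-8."""
    return decode_source(raw, fallback).encode("utf-8")


def convert_file(path, fallback="cp1252"):
    """Rewrite one file in place as UTF-8 (keep a backup first)."""
    target = Path(path)
    target.write_bytes(to_utf8(target.read_bytes(), fallback))
```
.fi
.PP
This only helps when each file is internally consistent;
a file that mixes encodings still has to be fixed by hand.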
.SH EXAMPLES
Here are various examples of how to invoke flawfinder.
The first examples show various simple command-line options.
Flawfinder is designed to work well with text editors and
integrated development environments, so the next sections
show how to integrate flawfinder into vim and emacs.
.SS "Simple command-line options"
.TP 12
.B "flawfinder /usr/src/linux-3.16"
Examine all the C/C++ files in the directory
/usr/src/linux-3.16 and all its subdirectories (recursively),
reporting on all hits found.
By default flawfinder will skip symbolic links and
directories with names that start with a period.
.TP
.B "flawfinder \-\-error-level=4 ."
Examine all the C/C++ files in the current directory
and its subdirectories (recursively);
return an error code if there are vulnerabilities
at level 4 or higher (the two highest risk levels).
This is a plausible way to use flawfinder in a continuous integration system.
.TP
.B "flawfinder \-\-minlevel=4 ."
Examine all the C/C++ files in the current directory
and its subdirectories (recursively);
only report vulnerabilities at level 4 or higher (the two highest risk levels).
.TP
.B "flawfinder \-\-inputs mydir"
Examine all the C/C++ files in mydir
and its subdirectories (recursively), and report functions
that take inputs (so that you can ensure that they filter the
inputs appropriately).
.TP
.B "flawfinder \-\-neverignore mydir"
Examine all the C/C++ files in the directory mydir and its subdirectories,
including even the hits marked for ignoring in the code comments.
.TP
.B "flawfinder \-\-csv ."
Examine the current directory down (recursively), and report all
hits in CSV format.
This is the recommended form if you want to further process
flawfinder output using other tools
(such as data correlation tools).
.TP
.B "flawfinder \-QD mydir"
Examine mydir and report only the actual results
(removing the header and footer of the output).
This form may be useful
if the output will be piped into other tools for further analysis,
though CSV format is probably the better choice in that case.
The \-C (\-\-columns) and \-S (\-\-singleline)
options can also be useful if you're piping the data
into other tools.
.TP
.B "flawfinder \-QDSC mydir"
Examine mydir, reporting only the actual results (no header or footer).
Each hit is reported on one line, and column numbers are reported.
This can be a useful command if you are feeding
flawfinder output to other tools.
.TP
.B "flawfinder \-\-quiet \-\-html \-\-context mydir > results.html"
Examine all the C/C++ files in the directory mydir and its subdirectories,
and produce an HTML formatted version of the results.
Source code management systems (such as SourceForge and Savannah)
might use a command like this.
.TP
.B "flawfinder \-\-quiet \-\-savehitlist saved.hits *.[ch]"
Examine all .c and .h files in the current directory.
Don't report on the status of processing, and save the resulting hitlist
(the set of all hits) in the file saved.hits.
.TP
.B "flawfinder \-\-diffhitlist saved.hits *.[ch]"
Examine all .c and .h files in the current directory, and show any
hits that weren't already in the file saved.hits.
This can be used to show only the ``new'' vulnerabilities in a
modified program, if saved.hits was created from the
older version of the program being analyzed.
.TP 12
.B "flawfinder \-\-patch recent.patch ."
Examine the current directory recursively, but only report lines
that were changed or added in the already-applied patchfile named
\fIrecent.patch\fR.
.TP
\fBflawfinder \-\-regex "CWE-120|CWE-126" src/\fR
Examine directory \fIsrc\fR recursively, but only report hits
where CWE-120 or CWE-126 apply.
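.PP
The \-\-csv example above is meant for further processing by other tools.
As a hedged illustration, the following Python sketch filters such output
by risk level.
It assumes the CSV output begins with a header row and has a numeric
column named "Level"; check the header your flawfinder version
actually prints, and pass a different column name if needed.
The function name hits_at_or_above is hypothetical.
.PP
.nf
```python
import csv
import io


def hits_at_or_above(csv_text, min_level, level_column="Level"):
    """Return the CSV rows whose risk level is at least min_level.

    level_column is an assumed header name, not a documented guarantee.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if int(row[level_column]) >= min_level]
```
.fi
.PP
For example, after saving the output of "flawfinder \-\-csv ." to a file,
read that file's text and pass it to the function with min_level set to 4
to keep only the two highest risk levels.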
.SS "Invoking from vim"
.PP
The text editor
vim includes a "quickfix" mechanism that works well with flawfinder,
so that you can easily view the warning messages and jump to
the relevant source code.
.PP
First, you need to invoke flawfinder to create a list of hits, and
there are two ways to do this.
The first way is to run flawfinder, and then (using its output)
invoke vim.
The second way is to start (or continue to run) vim, and then invoke
flawfinder (typically from inside vim).
.PP
For the first way, run flawfinder and store its output in some
FLAWFILE (say "flawfile"),
then invoke vim using its -q option, like this: "vim -q flawfile".
The second way (starting flawfinder after starting vim) can be done
in many ways.
One is to invoke flawfinder using a shell command,
":!flawfinder-command > FLAWFILE", then follow that with the command
":cf FLAWFILE".
Another way is to store the flawfinder command in your makefile
(as, say, a pseudocommand like "flaw"), and then run
":make flaw".
.PP
In all these cases you need a command for flawfinder to run.
A plausible command, which places each hit on its own line (\-S) and
removes headers and footers that would confuse vim, is:
.PP
.B "flawfinder \-SQD ."
.PP
You can now use various editing commands to view the results.
The command ":cn" displays the next hit; ":cN" displays the
previous hit, and ":cr" rewinds back to the first hit.
":copen" will open a window to show the current list of hits, called
the "quickfix window"; ":cclose" will close the quickfix window.
If the buffer in the current window has unsaved changes, and the hit is in
another file, jumping to the hit will fail.
You have to make sure the window contains a buffer which can be abandoned
before trying to jump to a new file, say by saving the file;
this prevents accidental data loss.
.SS "Invoking from emacs"
The text editor / operating system
emacs includes "grep mode" and "compile mode" mechanisms
that work well with flawfinder, making it easy to
view warning messages, jump to the relevant source code, and fix
any problems you find.
.PP
First, you need to invoke flawfinder to create a list of warning messages.
You can use "grep mode" or "compile mode" to create this list.
Often "grep mode" is more convenient;
it leaves compile mode untouched so you can easily recompile
once you've changed something.
However, if you want to jump to the exact column position of a hit,
compile mode may be more convenient because emacs can use
the column output of flawfinder to directly jump to the right location
without any special configuration.
.PP
To use grep mode,
enter the command "M-x grep"
and then enter the needed flawfinder command.
To use compile mode, enter the command
"M-x compile" and enter the needed flawfinder command.
This is a meta-key command, so you'll need to use the meta key for your
keyboard (this is usually the ESC key).
As with all emacs commands, you'll need to press RETURN after
typing "grep" or "compile".
So on many systems, the grep mode is invoked by typing
ESC x g r e p RETURN.
.PP
You then need to enter a command, removing whatever was there before if
necessary.
A plausible command is:
.PP
.B "flawfinder \-SQDC ."
.PP
This command puts every hit report on a single line,
which is much easier for tools to handle.
The quiet and dataonly options remove the other status information not needed
for use inside emacs.
The trailing period means that the current directory and all its descendants
are searched for C/C++ code and analyzed for flaws.
.PP
Once you've invoked flawfinder, you can use emacs to jump around
in its results.
The command C-x \`
(Control-x backtick)
visits the source code location for the next warning message.
C-u C-x \` (Control-u Control-x backtick)
restarts from the beginning.
You can visit the source for any particular error message by moving
to that hit message in the *compilation* buffer or *grep* buffer
and typing the return key.
(Technical note: in the compilation buffer, this invokes
compile-goto-error.)
You can also click the Mouse-2 button on the error message
(you don't need to switch to the *compilation* buffer first).
.PP
If you want to use grep mode to jump to specific columns of a hit,
you'll need to configure emacs specially
by modifying the emacs variable "grep-regexp-alist".
This variable tells Emacs how to
parse output of a "grep" command, similar to the
variable "compilation-error-regexp-alist" which lists various formats
of compilation error messages.
.SS "Invoking from Integrated Development Environments (IDEs)"
.PP
For (other) IDEs, consult your IDE's set of plug-ins.
.SH COMMON WEAKNESS ENUMERATION (CWE)
.PP
The Common Weakness Enumeration (CWE)
is ``a formal list or dictionary of common software weaknesses
that can occur in software's architecture, design, code or implementation
that can lead to exploitable security vulnerabilities...
created to serve as a common language for
describing software security weaknesses''
(https://cwe.mitre.org/about/faq.html).
For more information on CWEs, see https://cwe.mitre.org.
.PP
Flawfinder supports the CWE and is officially CWE-Compatible.
Hit descriptions typically include a relevant
Common Weakness Enumeration (CWE) identifier in parentheses
where there is known to be a relevant CWE.
For example, many of the buffer-related hits mention
CWE-120, the CWE identifier for
``buffer copy without checking size of input''
(aka ``Classic Buffer Overflow'').
In a few cases more than one CWE identifier may be listed.
The HTML report also includes hypertext links to the CWE definitions
hosted at MITRE.
In this way, flawfinder is designed to meet the CWE-Output requirement.
.PP
In some cases there are CWE mapping and usage challenges; here is how
flawfinder handles them.
If the same entry maps to multiple CWEs simultaneously,
all the CWE mappings are listed, separated by commas;
for example, the report "CWE-676, CWE-120" maps to two CWEs.
This often occurs with CWE-20, Improper Input Validation.
In addition, flawfinder provides additional information for those who
are interested in the 2011 CWE/SANS top 25 list (https://cwe.mitre.org/top25/)
when mappings are not directly to them.
Many people will want to search for specific CWEs in this top 25 list,
such as CWE-120 (classic buffer overflow).
The challenge is that some flawfinder hits map
to a more general CWE that would include a top 25 item, while in some
other cases hits map to a more specific vulnerability that is
only a subset of a top 25 item.
To resolve this, in some cases flawfinder will list a sequence of CWEs
in the format "more-general/more-specific", where the CWE actually
being mapped is followed by a "!".
This is always done whenever a flaw is not mapped directly to
a top 25 CWE, but the mapping is related to such a CWE.
So "CWE-119!/CWE-120" means that the vulnerability is mapped
to CWE-119 and that CWE-120 is a subset of CWE-119.
In contrast, "CWE-362/CWE-367!" means that the hit is mapped to
CWE-367, a subset of CWE-362.
Note that this is a subtle syntax change from flawfinder version 1.31;
in flawfinder version 1.31,
the form "more-general:more-specific" meant what is now listed as
"more-general!/more-specific", while
"more-general/more-specific" meant "more-general/more-specific!".
Tools can handle both the version 1.31 and the current format,
if they wish, by noting that the older format did not use "!" at all
(and thus this is easy to distinguish).
These mapping mechanisms simplify searching for certain CWEs.
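.PP
To make the notation above concrete, here is a minimal Python sketch
of how a tool might interpret one CWE list, handling both the current
format and the version 1.31 format as described above.
It is not flawfinder's own code;
the function names parse_cwe_entry and parse_cwe_field are hypothetical,
and each result is a (mapped, related) pair in which "related" is None
for a plain identifier.
.PP
.nf
```python
def parse_cwe_entry(entry):
    """Return (mapped, related) for one entry such as CWE-119!/CWE-120."""
    entry = entry.strip()
    if ":" in entry:
        # Version 1.31 form "general:specific": the general CWE is mapped.
        general, specific = entry.split(":", 1)
        return general, specific
    if "/" not in entry:
        # A plain identifier maps directly, with no related CWE.
        return entry.rstrip("!"), None
    general, specific = entry.split("/", 1)
    if general.endswith("!"):
        # Current form "general!/specific": the general CWE is mapped.
        return general[:-1], specific
    # "general/specific!" (current) or "general/specific" (version 1.31):
    # in both cases the more specific CWE is the mapped one.
    return specific.rstrip("!"), general


def parse_cwe_field(field):
    """Split a comma-separated CWE list and parse every entry."""
    return [parse_cwe_entry(e) for e in field.split(",") if e.strip()]
```
.fi
.PP
Under these assumptions, "CWE-119!/CWE-120" yields CWE-119 as the mapped
identifier with CWE-120 as the related one, while "CWE-362/CWE-367!"
yields CWE-367 as mapped and CWE-362 as related.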
.PP
CWE version 2.7 (released June 23, 2014) was used for the mapping.
The current CWE mappings select the most specific CWE the tool can determine.
In theory, most CWE security elements (signatures/patterns that the
tool searches for) could be mapped to
CWE-676 (Use of Potentially Dangerous Function), but such a mapping would
not be useful.