@@ -44,7 +44,7 @@ wrappers.

 The distribution does contain a set of C wrapper functions for the 8-bit
 library that are based on the POSIX regular expression API (see the pcre2posix
-man page). These can be found in a library called libpcre2posix. Note that this
+man page). These can be found in a library called libpcre2-posix. Note that this
 just provides a POSIX calling interface to PCRE2; the regular expressions
 themselves still follow Perl syntax and semantics. The POSIX API is restricted,
 and does not give full access to all of PCRE2's facilities.
@@ -58,8 +58,8 @@ renamed or pointed at by a link.
 If you are using the POSIX interface to PCRE2 and there is already a POSIX
 regex library installed on your system, as well as worrying about the regex.h
 header file (as mentioned above), you must also take care when linking programs
-to ensure that they link with PCRE2's libpcre2posix library. Otherwise they may
-pick up the POSIX functions of the same name from the other library.
+to ensure that they link with PCRE2's libpcre2-posix library. Otherwise they
+may pick up the POSIX functions of the same name from the other library.

 One way of avoiding this confusion is to compile PCRE2 with the addition of
 -Dregcomp=PCRE2regcomp (and similarly for the other POSIX functions) to the
@@ -204,13 +204,6 @@ library. They are also documented in the pcre2build man page.
   --enable-newline-is-crlf, --enable-newline-is-anycrlf, or
   --enable-newline-is-any to the "configure" command, respectively.

-  If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
-  the standard tests will fail, because the lines in the test files end with
-  LF. Even if the files are edited to change the line endings, there are likely
-  to be some failures. With --enable-newline-is-anycrlf or
-  --enable-newline-is-any, many tests should succeed, but there may be some
-  failures.
-
 . By default, the sequence \R in a pattern matches any Unicode line ending
   sequence. This is independent of the option specifying what PCRE2 considers
   to be the end of a line (see above). However, the caller of PCRE2 can
@@ -253,13 +246,13 @@ library. They are also documented in the pcre2build man page.
   sizes in the pcre2stack man page.

 . In the 8-bit library, the default maximum compiled pattern size is around
-  64K. You can increase this by adding --with-link-size=3 to the "configure"
-  command. PCRE2 then uses three bytes instead of two for offsets to different
-  parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
-  the same as --with-link-size=4, which (in both libraries) uses four-byte
-  offsets. Increasing the internal link size reduces performance in the 8-bit
-  and 16-bit libraries. In the 32-bit library, the link size setting is
-  ignored, as 4-byte offsets are always used.
+  64K bytes. You can increase this by adding --with-link-size=3 to the
+  "configure" command. PCRE2 then uses three bytes instead of two for offsets
+  to different parts of the compiled pattern. In the 16-bit library,
+  --with-link-size=3 is the same as --with-link-size=4, which (in both
+  libraries) uses four-byte offsets. Increasing the internal link size reduces
+  performance in the 8-bit and 16-bit libraries. In the 32-bit library, the
+  link size setting is ignored, as 4-byte offsets are always used.

 . You can build PCRE2 so that its internal match() function that is called from
   pcre2_match() does not call itself recursively. Instead, it uses memory
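
The link-size rules in the hunk above can be modelled in a few lines. This is only an illustrative sketch in Python, not code from PCRE2: the function names effective_link_size() and offset_ceiling() are hypothetical, and the ceiling is the plain arithmetic bound for an offset of that width, which is why the README says "around" 64K rather than giving an exact limit.

```python
def effective_link_size(library_bits: int, requested: int) -> int:
    """Offset width in bytes that results from --with-link-size=<requested>.

    Hypothetical helper that follows the wording of the README only.
    """
    if library_bits not in (8, 16, 32):
        raise ValueError("library_bits must be 8, 16 or 32")
    if requested not in (2, 3, 4):
        raise ValueError("link size must be 2, 3 or 4")
    if library_bits == 32:
        # The setting is ignored: 4-byte offsets are always used.
        return 4
    if library_bits == 16 and requested == 3:
        # In the 16-bit library, 3 is treated the same as 4.
        return 4
    return requested


def offset_ceiling(link_bytes: int) -> int:
    """Number of distinct values an offset of link_bytes bytes can hold."""
    return 256 ** link_bytes


print(effective_link_size(8, 3), offset_ceiling(3))
print(effective_link_size(16, 3), effective_link_size(32, 2))
```

With two-byte offsets the ceiling is 65536, which matches the "around 64K bytes" default quoted for the 8-bit library.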
@@ -339,12 +332,23 @@ library. They are also documented in the pcre2build man page.

   Of course, the relevant libraries must be installed on your system.

-. The default size (in bytes) of the internal buffer used by pcre2grep can be
-  set by, for example:
+. The default starting size (in bytes) of the internal buffer used by pcre2grep
+  can be set by, for example:

   --with-pcre2grep-bufsize=51200

-  The value must be a plain integer. The default is 20480.
+  The value must be a plain integer. The default is 20480. The amount of memory
+  used by pcre2grep is actually three times this number, to allow for "before"
+  and "after" lines. If very long lines are encountered, the buffer is
+  automatically enlarged, up to a fixed maximum size.
+
+. The default maximum size of pcre2grep's internal buffer can be set by, for
+  example:
+
+  --with-pcre2grep-max-bufsize=2097152
+
+  The default is either 1048576 or the value of --with-pcre2grep-bufsize,
+  whichever is the larger.

 . It is possible to compile pcre2test so that it links with the libreadline
   or libedit libraries, by specifying, respectively,
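
The two pcre2grep buffer defaults added in the hunk above interact in a simple way, sketched here in Python. The function names are hypothetical and the code only restates the arithmetic given in the text (three times the starting size, and a maximum that is never smaller than the starting size); it says nothing about how pcre2grep grows the buffer in between.

```python
DEFAULT_BUFSIZE = 20480      # default for --with-pcre2grep-bufsize
DEFAULT_MAX_FLOOR = 1048576  # lower bound for the default maximum size


def initial_memory(bufsize: int = DEFAULT_BUFSIZE) -> int:
    """Memory used at start-up: three times the configured buffer size,
    to allow for "before" and "after" lines."""
    return 3 * bufsize


def default_max_bufsize(bufsize: int = DEFAULT_BUFSIZE) -> int:
    """Default maximum when --with-pcre2grep-max-bufsize is not given:
    1048576 or the starting size, whichever is the larger."""
    return max(DEFAULT_MAX_FLOOR, bufsize)


print(initial_memory(51200))         # the README's example starting size
print(default_max_bufsize(2097152))  # a starting size above the floor
```

With the README's example value of 51200, the starting allocation is 153600 bytes and the default maximum stays at 1048576.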
@@ -368,6 +372,22 @@ library. They are also documented in the pcre2build man page.
   If you get error messages about missing functions tgetstr, tgetent, tputs,
   tgetflag, or tgoto, this is the problem, and linking with the ncurses library
   should fix it.
+
+. There is a special option called --enable-fuzz-support for use by people who
+  want to run fuzzing tests on PCRE2. At present this applies only to the 8-bit
+  library. If set, it causes an extra library called libpcre2-fuzzsupport.a to
+  be built, but not installed. This contains a single function called
+  LLVMFuzzerTestOneInput() whose arguments are a pointer to a string and the
+  length of the string. When called, this function tries to compile the string
+  as a pattern, and if that succeeds, to match it. This is done both with no
+  options and with some random option bits that are generated from the string.
+  Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
+  be created. This is normally run under valgrind or used when PCRE2 is
+  compiled with address sanitizing enabled. It calls the fuzzing function and
+  outputs information about what it is doing. The input strings are specified
+  by arguments: if an argument starts with "=" the rest of it is a literal
+  input string. Otherwise, it is assumed to be a file name, and the contents of
+  the file are the test string.

 The "configure" script builds the following files for the basic C library:

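
The argument convention described for pcre2fuzzcheck in the hunk above (a leading "=" marks a literal string, anything else is a file name) can be sketched as follows. This is a hypothetical Python illustration, not the program's actual source: the name load_fuzz_input() is invented here, and encoding the literal as UTF-8 is an assumption made only so that both branches return bytes.

```python
from pathlib import Path


def load_fuzz_input(arg: str) -> bytes:
    """Turn one command-line argument into a test string.

    "=text" yields the literal text after the "=" sign; any other
    argument is treated as a file name and the file's contents are used.
    """
    if arg.startswith("="):
        # Assumption: the literal is encoded as UTF-8 for illustration.
        return arg[1:].encode("utf-8")
    return Path(arg).read_bytes()


print(load_fuzz_input("=a(b|c)+d"))
```

A real driver would then pass the resulting bytes and their length to LLVMFuzzerTestOneInput(), which is not shown here.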
@@ -543,7 +563,7 @@ script creates the .txt and HTML forms of the documentation from the man pages.


 Testing PCRE2
-------------
+-------------

 To test the basic PCRE2 library on a Unix-like system, run the RunTest script.
 There is another script called RunGrepTest that tests the pcre2grep command.
@@ -757,6 +777,7 @@ The distribution should contain the files listed below.
   src/pcre2_xclass.c       )

   src/pcre2_printint.c     debugging function that is used by pcre2test,
+  src/pcre2_fuzzsupport.c  function for (optional) fuzzing support

   src/config.h.in          template for config.h, when built by "configure"
   src/pcre2.h.in           template for pcre2.h when built by "configure"
@@ -814,7 +835,7 @@ The distribution should contain the files listed below.
   libpcre2-8.pc.in         template for libpcre2-8.pc for pkg-config
   libpcre2-16.pc.in        template for libpcre2-16.pc for pkg-config
   libpcre2-32.pc.in        template for libpcre2-32.pc for pkg-config
-  libpcre2posix.pc.in      template for libpcre2posix.pc for pkg-config
+  libpcre2-posix.pc.in     template for libpcre2-posix.pc for pkg-config
   ltmain.sh                file used to build a libtool script
   missing                  ) common stub for a few missing GNU programs while
                            ) installing, generated by automake
@@ -845,4 +866,4 @@ The distribution should contain the files listed below.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 01 April 2016
+Last updated: 01 November 2016