@@ -54,18 +54,20 @@ Unicode.tables The files in this directory were downloaded from the Unicode
 ucptest.c A short C program for testing the Unicode property macros
 that do lookups in the pcre2_ucd.c data, mainly useful after
 rebuilding the Unicode property table. Compile and run this in
- the "maint" directory (see comments at its head).
+ the "maint" directory (see comments at its head). This program
+ can also be used to find characters with specific properties.

- ucptestdata A directory containing two files, testinput1 and testoutput1,
- to use in conjunction with the ucptest program.
+ ucptestdata A directory containing four files, testinput{1,2} and
+ testoutput{1,2}, for use in conjunction with the ucptest
+ program.

 utf8.c A short, freestanding C program for converting a Unicode code
 point into a sequence of bytes in the UTF-8 encoding, and vice
 versa. If its argument is a hex number such as 0x1234, it
 outputs a list of the equivalent UTF-8 bytes. If its argument
 is a sequence of concatenated UTF-8 bytes (e.g. e188b4) it
 treats them as a UTF-8 character and outputs the equivalent
- code point in hex.
+ code point in hex. See comments at its head for details.


 Updating to a new Unicode release
@@ -96,9 +98,10 @@ lists of scripts.

 The ucptest program can be compiled and used to check that the new tables in
 pcre2_ucd.c work properly, using the data files in ucptestdata to check a
- number of test characters. The source file ucptest.c should also be updated
- whenever new Unicode script names are added, and adding a few tests for new
- scripts is a good idea.
+ number of test characters. It used to be necessary to update the source
+ ucptest.c whenever new Unicode scripts were added, but this is no longer
+ required because that program now uses the lists in the PCRE2 source. However,
+ adding a few tests for new scripts to the files in ucptestdata is a good idea.


 Preparing for a PCRE2 release
@@ -437,4 +440,4 @@ very sensible; some are rather wacky. Some have been on this list for years.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
- Last updated: 03 June 2019
+ Last updated: 01 April 2020
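
Note on the utf8.c entry in the first hunk: the conversion it describes can be illustrated with a minimal sketch. This is not the code in utf8.c. The function names utf8_encode and utf8_decode are hypothetical, and the sketch assumes the standard Unicode range only (U+0000 to U+10FFFF, surrogates excluded), which may differ from what the real program accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 if cp is a
   surrogate or lies above U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not encodable */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: decode exactly one UTF-8 character occupying all
   len bytes of in. Returns 1 and stores the code point in *cp on
   success, or 0 if the bytes are malformed, overlong, or out of range. */
static int utf8_decode(const unsigned char *in, size_t len, uint32_t *cp)
{
    size_t need, i;
    uint32_t c, min;

    assert(in != NULL && cp != NULL);
    if (len == 0)
        return 0;
    if (in[0] < 0x80)                { need = 1; c = in[0];        min = 0; }
    else if ((in[0] & 0xE0) == 0xC0) { need = 2; c = in[0] & 0x1F; min = 0x80; }
    else if ((in[0] & 0xF0) == 0xE0) { need = 3; c = in[0] & 0x0F; min = 0x800; }
    else if ((in[0] & 0xF8) == 0xF0) { need = 4; c = in[0] & 0x07; min = 0x10000; }
    else
        return 0;                       /* stray continuation or invalid lead byte */
    if (len != need)
        return 0;
    for (i = 1; i < need; i++) {
        if ((in[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        c = (c << 6) | (uint32_t)(in[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                       /* overlong form or out of range */
    *cp = c;
    return 1;
}
```

With these helpers, the README's own example holds in both directions: encoding 0x1234 yields the three bytes e1 88 b4, and decoding those bytes yields 0x1234.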