Releases: RunDevelopment/refa
Releases · RunDevelopment/refa
v0.12.1
Added
- Generally added some documentation.
- Added
CharMap#{size,entryCount}
to get the size of a map. - Added
CharMap#copy
to create a (mapped) copy of a map. - Added
StringSet#{is{Proper,}{Subset,Superset}Of,isDisjointWith}
for set relations. - Added
UnicodeSet#{is{Proper,}{Subset,Superset}Of,isDisjointWith}
for set relations. - Added
UnicodeSet#{wordSets,maximum}
for a more ergonomic API. UnicodeSet#{equals,union,intersect,without}
allowCharSet
s.
Fixed
- Made
StringSet#words
private. This should have never been public in the first place. - Made
StringSet.empty
readonly. - Removed ignored
range
parameter fromCharMap#entries
. - Fixed some bugs in
StringSet#{equals,union}
. - Removed
StringSet#{filter,map}
. These methods should have never been public in the first place.
v0.12.0
Breaking changes
- Added support for the
v
flag.- This significantly changes the interfaces and types around the parser.
- There are now 2 never classes
JS.UnicodeSet
andJS.StringSet
to represent a Unicode set with strings. - Much more.
combineTransformers
has been deprecated. UseCombinedTransformer
instead.
Added
JS.parseCharSet
andJS.parseUnicodeSet
have been added to easily parse a character AST into aCharSet
orUnicodeSet
.JS.toLiteral
now supports thev
flag.- Added
CharSet#fromCharacter
to easily create a character set from a single character. - Allow string argument for
JS.Parser.fromLiteral
. - Added
CombinedTransformer
class to combine multiple transformers into one. - Added transform events which allows the caller to observe the transformation.
- Added
Transformers.makeGreedy
to make quantifiers greedy whenever possible. - Added
Transformers.simplify
as a stable way to get the best combination of transformers to simplify a regex.
Changed
- Transformers can now have an optional name.
- Major internal improvements to some transformers, especially
applyAssertions
.
Fixed
isEmpty
has been fixed.Transformers.moveUpEmpty
should now work correctly.- Transformers are now guaranteed to be called with the correct
this
argument.
v0.11.0
Breaking changes
- Upgraded to
@eslint-community/regexpp
v4.5.0 and dropregexpp
. - Drop support for NodeJS 10.
- Changed default character-set-to-string function of
{DFA,ENFA,NFA}#toDot
toCharSet#toUnicodeString
. - Changed character-set-to-string function of
{DFA,ENFA,NFA}#toString
toCharSet#toUnicodeString
. - Renamed
ToDotInfo
toNodeInfo
. - Removed
createSimpleToDotOptions
.
Added
- Added
toMermaid
as part of theFAIterators
namespace andFiniteAutomaton
interface. - Added a unified interface for the namespaced
toDot
andtoMermaid
functions. - Many DFA, ENFA, and NFA operations now take optional node factory arguments to control the number of nodes created. All operations that create nodes no take factory arguments.
{DFA,ENFA,NFA}.emptyWord
will create a new FA that makes exactly the empty word.- Added
withInitial
,withGetOut
, andwithIsFinal
to more easily derive new FA iterators. - Added
assertions: "ignore"
to JS parser,ENFA.fromRegex
, andNFA.fromRegex
. This is mostly for convenience and performance. The same behavior could previously be achieved using transformers. - Added
CharSet#toUnicodeString
to provide an easy way to convert a character set into a human-readable string. - Added
CharSet#isProper{Subset,Superset}Of
. CharSet#equals
now supportsCharRange
s.
Fixed
- Fixed
ENFA#isEmpty
for non-normalized graphs.
Changed
{DFA,NFA}.fromCharSet
and{ENFA,NFA}.all
now use constructions with fewer states.approximateRejectingWordSet
will now returnundefined
instead of throwing an error if the input character set is empty.- Changed behavior of
ENFA#countNodes
to be consistent with NFA and DFA. - Upgraded from Unicode 13.0.0 to Unicode 15.0.0.
- Generally added and improved documentation.
0.10.0
Breaking changes
CharSet
: Theintersect
andwithout
methods now only takeCharSet
s andCharRange
s as arguments.{DFA,ENFA,NFA}#{isDisjointWith,getIntersectionWords,getIntersectionWordSets}
were removed.- Removed
NodeList
API for all FA implements. This is a very significant change as to how FAs are implemented but doesn't affect the main FA APIs too much. This change gives users a lot more control over FA implements. - Removed
{DFA,ENFA,NFA}.CreationOptions
interfaces. Use the newNodeFactory
API instead. - Removed
{DFA,ENFA,NFA}#options
. Just pass the FA as is instead. - Removed
FACreationOptions
interface. getIntersection{Iterator,Words,WordSets}
,isDisjointWith
: Replaced optionalFACreationOptions
parameter with optionalmaxNodes
parameter.- DFA nodes can now only be linked using
CharSet
s. Linking withChar
s andCharRange
s is no longer supported. - Removed
ENFA
'sunorderedResolveEpsilon
. UseresolveEpsilon
instead. FAIterator.MapFABuilder
: Removed optionalkind
argument.
Added
NodeFactory
interface. This new interface is the basis of allFABuilder
s and FA implemented.{DFA,ENFA,NFA}.nodeFactory
: This is an unlimited node factory.{DFA,ENFA,NFA}.LimitedNodeFactory
: This node factory can be used to limit the number of nodes an FA operation is allowed to create.CharSet#resize
CharSet.fromRange
FAIterator.fromWords
will create a new iterator from a list of words.FAIterator.fromWordSets
will create a new iterator from a list of word sets.{DFA,ENFA,NFA}.fromCharSet
will create new FAs from a givenCharSet
.{DFA,ENFA,NFA}.fromWordSets
will create new FAs from a list of word sets.{DFA,ENFA,NFA}#countNodes
will return the number of nodes in the FA.{DFA,ENFA,NFA}#nodes
will iterate through all nodes in the FA.ENFA.withoutEmptyWord
ENFA#{append,prepend,union}Into
will move the nodes of the given ENFA instead of copying them. This can be used to improve performance.
Changed
JS.toLiteral
: Settingunicode: false
in theflags
option will now always succeed.
Improved
CharMap
is now implemented using a sorted array instead of an AVL tree. This is significantly faster. Most DFA operation are now 10% faster.FAIterators.iterateWordSets
will now use the natural iteration order of the given FAIterator for words of the same length. This makesENFA#wordSets
a lot more logical.ENFA
'sresolveEpsilon
is now implemented non-recursively.- The docs now have a dark mode thanks to TypeDoc v0.22.0.
0.9.1
Fixed
- Fixed that some ENFA operations created unnecessary states.
- Fixed internal functions used to traverse graphs. This fixes the bug that some
FAIterators
functions had trouble with falsy state values.
Improved
JS.toLiteral
: The heuristic used to decide flags has been improved to prevent unnecessaryi
flags.
0.9.0
Breaking changes
FAIterators.intersection
no longer accepts options.- Removed the
IntersectionOptions
interface. Use the newFACreationOptions
interface or any of the FA-specific interfaces instead. - Removed custom equality functions for
CharMap
. - The constructor of
FAIterators.MapFABuilder
changed slightly. It now accepts arguments as parameters instead of as an object. - Some renaming:
FAIterator#deterministicOut
->FAIterator#stableOut
FAIterators.ensureDeterministicOut
->FAIterators.ensureStableOut
CharMap#{delete,set}Every
->CharMap#{delete,set}Range
Added
- Added support for the new JS RegExp
hasIndices
flag. - New
WordSet
andReadonlyWordSet
types. - New
CharBase
class. This provides methods to remap alphabets. - Added
CharMap#clear
. - Added
CharMap#filter
. - Added
CharMap#invert
to convertCharMap
s toMap
s. - Added
CharMap#setCharSet
to more efficiently set many ranges. - Added
CharSet#characters
to iterate over all characters in a set. - Added
CharSet#toRangesString
to print only the ranges of a set. - Added
CharSet.fromCharacters
to create a set from a collection of characters. FAIterators.shortestAcceptingPath
returns the shortest accepting path of arbitrary iterators.FAIterators.shortestWordSet
returns the shortest accepted word set of an iterator.FAIterators.makeInitialFinal
andFAIterators.makeInitialNonFinal
changes whether the initial state is also a final state.FAIterators.approximateRejectingWordSet
tries to find a rejected word set.FAIterators.makeDeterministic
builds a deterministic version of an iterator. This is a general DFA construction.Words.wordSetsToWords
converts a collections of word sets into a collection of words.
Changed
- refa is now allowed to assume all given
Char
s andCharRange
s conform to the guarantees given by the interface. This includes guarantees that cannot be verified at compile time (e.g.min <= max
forCharRange
). Words.pickMostReadableWord
will now always return a word.{DFA,ENFA,NFA}#{isDisjointWith,getIntersectionWords,getIntersectionWordSets}
are now deprecated and will be removed in future releases.Words.wordSetToWords
is now deprecated. UseWords.wordSetsToWords
instead (mind the s).
Fixed
ReadonlyCharMap#isEmpty
is now a readonly property.- Fixed
JS.Parser
incorrectly caching parsed characters. - Fixed
JS.Parser
incorrectly canonicalizing Unicode property escapes. - Fixed
DFA.NodeList#removeUnreachable
removing reachable states sometimes.
Improved
- Many, many minor improvements (code quality, documentation, etc.).
- 10x faster
wordSetToWords
. All methods iterating words will now be faster. - Pretty much all DFA operations will be faster. DFA minimization is up to 20x faster.
- Faster NFA creation.
0.8.0
Breaking changes
- New RE AST node:
Unknown
. This node is used to represent parts of a regex that cannot be represented using RE AST. - Removed
JS.ParseOptions.disableOptimizations
. UseJS.ParseOptions.simplify
instead. - Removed
TransitionIterableFA
interface. TransitionIterable
is now generic over the state type.- Renamed
{DFA,NFA}.intersectionWordSets
togetIntersectionWordSets
. - Renamed
{DFA,NFA}.intersectionWords
togetIntersectionWords
.
Added
- ENFA - a non-deterministic finite automaton with epsilon transitions.
- FAIterators - a new namespace containing methods can consume and produce FA iterators.
- New
toDot
method for finite automata. This will make it easier to visualize the state machines. - New
isDisjointWith
,getIntersectionWordSets
, andgetIntersectionWords
functions. These free functions can be used with any FA types. - New
JS.ParseOptions.simplify
option. - New
FAIterator.deterministicOut
property. - New
TransitionIterator
. (This only gives an already commonly used type a name.) - New
MaxCharacterError
for incompatible finite automata. - New
FABuilder
interface to allow algorithms to construct FA without knowing the actual FA implementation. {DFA,ENFA,NFA}.NodeList
: Added staticwithLimit
method to be able to limit the number of nodes aNodeList
is allowed to create.
Changed
JS.Parser.parseElement
now accepts more parsable elements.JS.toLiteral
now accepts any RE AST node.NFA.fromRegex
now accepts any RE AST node.{DFA,ENFA,NFA}.NodeList
now implement theFABuilder
interface.- Many, many internal changes that do not affect the API.
Fixed
DFA.fromIntersection
now correctly computes the intersection for non-DFA arguments.getBaseSets
(a util function) now guarantees O(n*log n) run time. I accidentally implemented this in O(n^2) before which caused some DFA operations to be extremely slow.iterateWordSets
(a util function) now correctly eliminates dead states. This fixes the bug that some FA with infinite languages only yielded finitely many words when iterating over them.
0.7.1
0.7.0
Breaking changes
FiniteAutomaton.test
now requires aReadonlyArray
instead of anIterable
.Words.wordSetToWords
now returns anIterable
instead of anIterableIterator
.- Removed
toPatternString
function. - Removed
NFA.FromRegexOptions.disableLookarounds
. UseNFA.FromRegexOptions.assertions
instead. - AST format: Quantifier nodes now have a lazy property to enable non-greedy quantifiers.
JS.Parser
no longer implementsJS.Literal
. Use theJS.Parser.literal
property instead.JS.Parser
now resolves backreferences differently. It now supports resolving capturing groups with finite small languages. How small the language is required to be can be controlled via the newJS.ParseOptions.maxBackreferenceWords
option (defaults to 100 words).JS.ParseOptions.backreferences
also works differently now. See theJS.ParseOptions
documentation for more details.- Some renaming:
JS.ParseOptions.lookarounds
->JS.ParseOptions.assertions
ToRegexOptions.maximumNodes
->ToRegexOptions.maxNodes
ToRegexOptions.maximumOptimizationPasses
->ToRegexOptions.maxOptimizationPasses
Fixed
Words.fromStringToUTF16
now works properly.JS.toLiteral
will now properly detect predefined character sets in character classes. This didn't work properly before.
Added
- Documentation. A lot of code documentation and a TypeDoc-generated website have been added.
- New
Char
,Word
, andReadonlyWord
types replace the old plain number and iterable types. - AST transformers. They can efficiently modify a given AST and are used to e.g. apply assertions.
JS.ParseOptions
now has amaxNodes
option to limit the size of the parsed AST.JS.Parser
now has amaxCharacter
property.
Changed
NFA.test
now implements Thompson's algorithm which guarantees efficient execution.- The
toRegex
methods of the DFA and NFA classes now use AST transformers under the hood to produce smaller ASTs. - The default value of
ToRegexOptions.maxOptimizationPasses
is now implementation-defined.
0.6.0
Breaking changes
DFA#clone
has been renamed toDFA#copy
to be compatible withNFA#copy
.- The
source
property of RE AST nodes is now optional. This results in the removal/change of several types and functions. TheSimple
type has been removed; useNoParent
instead.
Added
JS.toLiteral
now has aflags
options to force/disallow certain flags and afastCharacters
options for up to 10x better performance.JS.toLiteral
now detects builtin assertions.
Changed
- All DFA and NFA creation methods now have safe defaults and will throw if the FA that is being created is too large. The limit can be controlled using the
maxNodes
option.