Skip to content

Commit d474ea7

Browse files
committed
refactor: Complete separation of condition categories in language readers
This commit finalizes the refactoring of condition handling by successfully separating mixed condition concepts into distinct categories: control flow keywords, logical operators, case keywords, and ternary operators. Key changes include: - Updated all 21+ language readers to utilize the new structured approach. - Enhanced extensions to leverage semantic names for better clarity. - Fixed multiple identified bugs, including adding the missing 'elif' keyword in GDScript and including AND/OR logical operators in ST. - Added comprehensive tests to document and verify the behavior of various language constructs. These changes improve code maintainability, enhance accuracy in complexity calculations, and ensure backward compatibility with existing functionality.
1 parent 0798d06 commit d474ea7

11 files changed

+881
-1562
lines changed

.vscode/settings.json

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
{
2+
"makefile.configureOnOpen": false
3+
}

ongoing/REFACTORING_COMPLETE.md

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
# Separate Conditions Refactoring - COMPLETE ✅
2+
3+
## Status: Production Ready
4+
5+
All phases of the conditions refactoring have been successfully completed.
6+
7+
## What Was Done
8+
9+
Separated mixed condition concepts into 4 distinct categories across the entire Lizard codebase:
10+
11+
1. **Control Flow Keywords** - `if`, `for`, `while`, `catch`
12+
2. **Logical Operators** - `&&`, `||`, `and`, `or`
13+
3. **Case Keywords** - `case`, `when`
14+
4. **Ternary Operators** - `?`, `??`, `?:`
15+
16+
## Results
17+
18+
-**28 files updated** (1 base + 21 languages + 4 extensions + 2 cleanups)
19+
-**7 bugs fixed** (Perl, Rust, TTCN, GDScript x2, ST, R)
20+
-**8 new tests added** (bug reproduction and validation)
21+
-**1021 tests passing** (100% pass rate, 0 regressions)
22+
-**Fully backward compatible** (no breaking changes)
23+
24+
## Documentation Available
25+
26+
### For Language Implementers
27+
- `language-implementation-guide.md` - How to add new language support
28+
- `condition-categories-reference.md` - Complete reference for all categories
29+
- `code-structure-reference.md` - Code organization and patterns
30+
31+
### For Reference
32+
- `todo_list.md` - Current tasks (refactoring complete)
33+
34+
## Benefits
35+
36+
- **Clarity**: Each token's purpose is explicit
37+
- **Correctness**: 7 bugs fixed, all edge cases handled
38+
- **Maintainability**: Extensions use semantic names
39+
- **Extensibility**: Clear API for future enhancements
40+
- **Compatibility**: 100% backward compatible
41+
42+
## Test Coverage
43+
44+
```
45+
1021 tests passing
46+
- 1013 existing regression tests
47+
- 8 new bug reproduction tests
48+
6 tests skipped (unchanged)
49+
0 regressions
50+
```
51+
52+
## Key Improvements
53+
54+
### Extensions Can Now Use Semantic Names
55+
56+
```python
57+
# lizardnonstrict.py
58+
reader.conditions -= reader.logical_operators # Clear intent
59+
```
60+
61+
### Languages Are Properly Categorized
62+
63+
Each of 23+ languages has explicit categorization showing exactly which keywords and operators contribute to complexity.
64+
65+
### All Edge Cases Documented
66+
67+
Special cases (R element-wise operators, Erlang macros, etc.) are documented with rationale.
68+
69+
## Completion Date
70+
71+
Phase 1-6: Core refactoring complete
72+
Phase 7: Documentation complete
73+
74+
**Status**: ✅ **PRODUCTION READY**
75+
Lines changed: 242 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,242 @@
1+
# Lizard Code Structure Reference
2+
3+
## Condition Handling System
4+
5+
### Base Class: CodeReader
6+
7+
**Location**: `lizard_languages/code_reader.py`
8+
9+
The `CodeReader` class defines four condition category fields:
10+
11+
```python
12+
class CodeReader:
13+
# Separated condition categories
14+
_control_flow_keywords = {'if', 'for', 'while', 'catch'}
15+
_logical_operators = {'&&', '||'}
16+
_case_keywords = {'case'}
17+
_ternary_operators = {'?'}
18+
19+
@classmethod
20+
def _build_conditions(cls):
21+
"""Combines all categories into one set."""
22+
return (cls._control_flow_keywords |
23+
cls._logical_operators |
24+
cls._case_keywords |
25+
cls._ternary_operators)
26+
27+
def __init__(self, context):
28+
# Combined set for compatibility
29+
self.conditions = copy(self.__class__._build_conditions())
30+
31+
# Individual sets for extensions
32+
self.control_flow_keywords = copy(self.__class__._control_flow_keywords)
33+
self.logical_operators = copy(self.__class__._logical_operators)
34+
self.case_keywords = copy(self.__class__._case_keywords)
35+
self.ternary_operators = copy(self.__class__._ternary_operators)
36+
```
37+
38+
### Language Reader Pattern
39+
40+
Each language overrides the category fields:
41+
42+
```python
43+
class LanguageReader(CodeReader):
44+
_control_flow_keywords = {'if', 'for', 'while'}
45+
_logical_operators = {'&&', '||'}
46+
_case_keywords = {'case'}
47+
_ternary_operators = {'?'}
48+
```
49+
50+
### How CCN is Calculated
51+
52+
**Location**: `lizard.py:condition_counter()`
53+
54+
```python
55+
def condition_counter(tokens, reader):
56+
conditions = reader.conditions # Combined set of all categories
57+
for token in tokens:
58+
if token in conditions:
59+
reader.context.add_condition() # Adds +1 to CCN
60+
yield token
61+
```
62+
63+
Each token in `reader.conditions` adds +1 to the function's cyclomatic complexity.
64+
65+
## Extension Patterns
66+
67+
### Pattern 1: Remove Specific Category
68+
69+
**Example**: `lizardnonstrict.py` removes logical operators
70+
71+
```python
72+
class LizardExtension(object):
73+
def __call__(self, tokens, reader):
74+
# Remove logical operators from conditions
75+
reader.conditions -= reader.logical_operators
76+
return tokens
77+
```
78+
79+
**Effect**: Excludes logical operators from CCN calculation.
80+
81+
### Pattern 2: Special Handling for Category
82+
83+
**Example**: `lizardmccabe.py` handles consecutive case statements
84+
85+
```python
86+
class LizardExtension(ExtensionBase):
87+
def _state_global(self, token):
88+
if token == "case": # Detect case keywords
89+
self._state = self._in_case
90+
91+
def _after_a_case(self, token):
92+
if token == "case": # Consecutive case
93+
self.context.add_condition(-1) # Subtract complexity
94+
```
95+
96+
**Effect**: Only first case in sequence adds to CCN.
97+
98+
### Pattern 3: Track All Categories
99+
100+
**Example**: `lizardcomplextags.py` records all complexity keywords
101+
102+
```python
103+
class LizardExtension(object):
104+
def __call__(self, tokens, reader):
105+
conditions = reader.conditions # Use combined set
106+
for token in tokens:
107+
if token in conditions:
108+
# Record [token, line_number]
109+
context.current_function.complex_tags.append([token, line])
110+
yield token
111+
```
112+
113+
**Effect**: Logs all complexity-contributing tokens.
114+
115+
## Language-Specific Implementations
116+
117+
### Python
118+
```python
119+
_control_flow_keywords = {'if', 'elif', 'for', 'while', 'except', 'finally'}
120+
_logical_operators = {'and', 'or'}
121+
_case_keywords = set() # No case in Python
122+
_ternary_operators = set() # Uses 'x if c else y' syntax
123+
```
124+
125+
### TypeScript
126+
```python
127+
_control_flow_keywords = {'if', 'elseif', 'for', 'while', 'catch'}
128+
_logical_operators = {'&&', '||'}
129+
_case_keywords = {'case'}
130+
_ternary_operators = {'?'}
131+
```
132+
133+
### R
134+
```python
135+
_control_flow_keywords = {'if', 'else if', 'for', 'while', 'repeat', 'switch', 'tryCatch', 'try', 'ifelse'}
136+
_logical_operators = {'&&', '||', '&', '|'} # Both short-circuit and element-wise
137+
_case_keywords = set()
138+
_ternary_operators = set()
139+
```
140+
141+
### Rust
142+
```python
143+
_control_flow_keywords = {'if', 'for', 'while', 'catch', 'match', 'where'}
144+
_logical_operators = {'&&', '||'}
145+
_case_keywords = set() # Rust uses match arms, not case
146+
_ternary_operators = {'?'} # Error propagation operator
147+
```
148+
149+
### Fortran (Case-insensitive)
150+
```python
151+
_control_flow_keywords = {'IF', 'DO', 'if', 'do'}
152+
_logical_operators = {'.AND.', '.OR.', '.and.', '.or.'}
153+
_case_keywords = {'CASE', 'case'}
154+
_ternary_operators = set()
155+
```
156+
157+
## Special Cases
158+
159+
### Languages with Multiple Operator Forms
160+
161+
**Perl** - Both word and symbol forms:
162+
```python
163+
_logical_operators = {'&&', '||'} # Symbol form
164+
# Note: Also has 'and', 'or' with different precedence (not included)
165+
```
166+
167+
**PHP** - Symbol primary, word exists:
168+
```python
169+
_logical_operators = {'&&', '||'}
170+
# Note: Also has 'and', 'or' with different precedence
171+
```
172+
173+
### Languages with Unusual Operators
174+
175+
**Erlang** - `?` is macro expansion:
176+
```python
177+
_ternary_operators = {'?'} # Macro operator, not ternary
178+
```
179+
180+
**Zig** - Multiple special operators:
181+
```python
182+
_logical_operators = {'and', 'or', 'orelse'} # orelse = null coalescing
183+
_ternary_operators = {'=>'} # Error union and switch cases
184+
```
185+
186+
**Kotlin** - Elvis operator:
187+
```python
188+
_ternary_operators = {'?:'} # Elvis operator
189+
```
190+
191+
## Instance Attributes Available
192+
193+
After initialization, each reader instance has:
194+
195+
```python
196+
reader.conditions # Set: Combined all categories
197+
reader.control_flow_keywords # Set: Control flow only
198+
reader.logical_operators # Set: Logical operators only
199+
reader.case_keywords # Set: Case keywords only
200+
reader.ternary_operators # Set: Ternary operators only
201+
```
202+
203+
All are mutable sets that extensions can modify.
204+
205+
## Backward Compatibility
206+
207+
The base class supports old-style `_conditions`:
208+
209+
```python
210+
# Old style (still supported):
211+
class OldReader(CodeReader):
212+
_conditions = {'if', 'for', '&&', '||', 'case', '?'} # Mixed
213+
214+
# New style (recommended):
215+
class NewReader(CodeReader):
216+
_control_flow_keywords = {'if', 'for'}
217+
_logical_operators = {'&&', '||'}
218+
_case_keywords = {'case'}
219+
_ternary_operators = {'?'}
220+
```
221+
222+
Both work identically. The new style provides better semantics.
223+
224+
## Quick Reference Table
225+
226+
| Category | Purpose | Example Tokens | Use Case |
227+
|----------|---------|----------------|----------|
228+
| `_control_flow_keywords` | Decision points | if, for, while, catch | Core control structures |
229+
| `_logical_operators` | Compound conditions | &&, \|\|, and, or | Combining conditions |
230+
| `_case_keywords` | Switch branches | case, when | Multi-way branches |
231+
| `_ternary_operators` | Inline conditions | ?, ??, ?: | Conditional expressions |
232+
233+
## Summary
234+
235+
The separated condition categories provide:
236+
- **Semantic clarity**: Each token's role is explicit
237+
- **Extension flexibility**: Target specific types
238+
- **Easier validation**: Check each category independently
239+
- **Better maintenance**: Self-documenting structure
240+
241+
All categories combine into `reader.conditions` for CCN calculation, maintaining compatibility while providing better organization.
242+

0 commit comments

Comments
 (0)