Scan for more complex messages (constants and concatenations in arguments) #110

Open
pczi opened this issue Sep 16, 2017 · 0 comments

pczi commented Sep 16, 2017

This is not a bug report but a suggested enhancement, based on work I did.
The following code scans ::t calls and, in the first two arguments, replaces constants and concatenations with their resolved values. All scanning is done with regular expressions.

It will correctly resolve:
Yii::t('appMenu', self::LBL_A);
Yii::t('appMenu', self::LBL_A, otherArg);
Yii::t('appMenu', OtherClass::LBL_A);
Yii::t('appMenu', OtherClassUsedAs::LBL_A);
Yii::t('appMenu', selfImplementingClass::LBL_A);
Yii::t('appMenu', self::LBL_A . 'xxxx' );
Yii::t( self::LBL_A . 'appMenu', OtherClass::LBL_B . 'xxxx' . OtherClassUsedAs::LBL_C);
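
To illustrate the idea behind these examples, here is a minimal standalone sketch of resolving a single argument: split it on the concatenation operator, look up each self:: constant, and join the parts. The function resolveArgument() and the $constants table are hypothetical names used only for this illustration, and only the self:: case is handled.

```php
<?php
// Hypothetical lookup table: "Namespace\Class\CONST" => raw PHP literal as found in the source.
$constants = [
    'app\\models\\Menu\\LBL_A' => "'Home'",
];

/**
 * Resolve one Yii::t() argument such as "self::LBL_A . ' page'".
 * Illustrative helper, not part of the extension.
 */
function resolveArgument(string $argument, string $currentClass, array $constants): string
{
    // Match quoted strings (honouring escapes) or bare tokens;
    // the dots and whitespace between them are skipped.
    preg_match_all('/(["\'])(?:\\\\.|(?!\1).)*\1|[^\s.]+/', $argument, $m);
    $out = '';
    foreach ($m[0] as $part) {
        if (strpos($part, 'self::') === 0) {
            // Replace self::NAME with the stored literal, if known.
            $key = $currentClass . '\\' . substr($part, 6);
            $part = $constants[$key] ?? $part;
        }
        // Strip the surrounding quotes from string literals.
        $out .= trim($part, '"\'');
    }
    return $out;
}

echo resolveArgument("self::LBL_A . ' page'", 'app\\models\\Menu', $constants), PHP_EOL;
```

Unknown constants are left as written, so an unresolved argument remains visible in the result instead of silently disappearing.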

The following items are taken into account to resolve constants:

  • namespace ...;
  • use ...;
  • use ... as ...;
  • class ... (implements ...);
  • interface ...;
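
As a rough sketch of how the use and use ... as statements can map a short class name to a fully qualified one, consider the following. resolveClassName() is a hypothetical helper written for this illustration; the fallback to the current namespace is an assumption added here and is not something the code in this issue does.

```php
<?php
/**
 * Resolve a short class name to a fully qualified name using the
 * "use" statements found in $source. Illustrative helper only.
 */
function resolveClassName(string $short, string $source, string $namespace): string
{
    $q = preg_quote($short, '/');
    // "use Some\Namespace\Short;"
    if (preg_match('/^\s*use\s+((?:[\w\\\\]+\\\\)?' . $q . ')\s*;/m', $source, $m)) {
        return ltrim($m[1], '\\');
    }
    // "use Some\Namespace\Original as Short;"
    if (preg_match('/^\s*use\s+([\w\\\\]+)\s+as\s+' . $q . '\s*;/m', $source, $m)) {
        return ltrim($m[1], '\\');
    }
    // Assumption: with no import, the class lives in the current namespace.
    return $namespace . '\\' . $short;
}

$source = <<<'PHP'
namespace app\controllers;

use app\models\OtherClass;
use app\models\Labels as L;
PHP;

echo resolveClassName('OtherClass', $source, 'app\\controllers'), PHP_EOL;
echo resolveClassName('L', $source, 'app\\controllers'), PHP_EOL;
```

Quoting the class name with preg_quote() and requiring a backslash (or nothing) in front of it avoids matching a longer class name that merely ends with the same characters.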

It works well and runs fast on a large project.

    // which folder should be scanned
    private $scanroot = '@_protected' . DIRECTORY_SEPARATOR;
    // list of the php function for translating messages.
    private $phpTranslators = ['::t'];    
    // list of file extensions that contain language elements.
    private $phpPatterns = ['*.php'];
    // these categories won't be included in the language database.
    private $ignoredCategories = ['yii'];
    // these files will not be processed.
    private $ignoredItems = [
        '.gitignore',
        '/messages',
        '/vendor',
        '/frontend/assets',
        'runtime',
        'bower',
        'nikic',
    ];
    // Regular expression to match PHP namespace definitions.
    public $patternNamespace = '/namespace\s*(.+?)\s*;/i';
    // Regular expression to match PHP class definitions.
    public $patternClass = '/(?<!(\/\/\s)|\*\s|\/\*\*\s)(?:class|interface)\s*(.+?)\s/i';
    // Regular expression to match PHP Implements definitions.
    public $patternImplements = '/class\s*.+?\simplements\s(.+?)\s/i';
    // Regular expression to match PHP const assignments.
    public $patternConst = '/const\s*(.+?)\s*=\s*(.+);/i';
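    // Regular expression to match PHP use and use as definitions.
    // The four backslashes yield a literal "\" in the regex, so the class name
    // must directly follow a namespace separator.
    public $patternUse = '/use\s*(.+?\\\\<Searchstring>)\s*;/i';
    public $patternUseas = '/use\s*(.+?)\s*as\s*<Searchstring>\s*;/i';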
    // Regular expression to match PHP Yii::t functions.
    public $patternPhp = '/::t\s*\(\s*(((["\'])(?:(?=(\\\\?))\4.)*?\3|[\w\d:$>-]*?|[\s\.]*?)+?)\s*\,\s*(((["\'])(?:(?=(\\\\?))\8.)*?\7|[\w\d:$>-]*?|[\s\.]*?)+?)\s*[,\)]/';
    // Regular expression to split up PHP concat "./dot" into parts
    public $patternConcatSplit = '/(["\'])(?:\\\\.|[^\1])*?\1|[^\s\.]+/';
    // holds all the const elements
    private $_constElements = [];
    // holds all the language elements
    private $_languageElements = [];

    /**
     * Scan source code for messages.
     *
     * @return \yii\web\Response
     */
    public function actionScan()
    {
        // FILES LIST
        $rootpath = Yii::getAlias($this->scanroot);

        $files = FileHelper::findFiles($rootpath, [
            'except' => $this->ignoredItems,
            'only' => $this->phpPatterns,
        ]);
        
        // first fetch constants
        $implementstack = [];
        foreach($files as $file) {
            $text = file_get_contents($file);
            $namespace = $this->regexFirstMatch($this->patternNamespace, $text);
            if ($namespace) {
                $class = $this->regexSecondMatch($this->patternClass, $text);
                $implements = $this->regexFirstMatch($this->patternImplements, $text);
                $namespaceclass = $namespace.'\\'.$class;
                $namespaceimplements = $namespace.'\\'.$implements;
                preg_match_all($this->patternConst, $text, $matches, PREG_SET_ORDER, 0);
                foreach($matches as $match) {
                    $const = $match[1];
                    $msg = $match[2];
                    $this->_constElements[$namespaceclass.'\\'.$const] = $msg;                    
                }
                if ($implements) {
                    $implementstack[$namespaceclass] = $namespaceimplements;
                }
            }
        }
        // apply all the implements
        foreach ($implementstack as $kis => $vis) {
            foreach($this->_constElements as $constname => $msg) {
                if (0 === strpos($constname.'\\', $vis.'\\')) {
                    $this->_constElements[str_replace($vis, $kis, $constname)] = $msg;                    
                }
            }
        }

//        dd($this->_constElements);

        // parse all files that contain a namespace definition
        foreach($files as $file) {
            $text = file_get_contents($file);
            $namespace = $this->regexFirstMatch($this->patternNamespace, $text);
            // process files only if we have a namespace
            if ($namespace) {
                $class = $this->regexSecondMatch($this->patternClass, $text);
                $implements = $this->regexFirstMatch($this->patternImplements, $text);
                $namespaceclass = $namespace.'\\'.$class;
                $namespaceimplements = $namespace.'\\'.$implements;
                if ($this->containsTranslator($this->phpTranslators, $text)) {
                    preg_match_all($this->patternPhp, $text, $matches, PREG_SET_ORDER, 0);
                    foreach($matches as $match) {
                        $cat = $match[1];
                        $msg = $match[5];
                        // find the 1-based number of the first line containing the match;
                        // each() was removed in PHP 8, so iterate with foreach instead
                        $line_number = false;
                        foreach (file($file) as $key => $line) {
                            if (strpos($line, $match[0]) !== false) {
                                $line_number = $key + 1;
                                break;
                            }
                        }
                        // split up the category by . and replace constants
                        $cat = $this->replaceConst($cat, $namespaceclass, $text);
                        $msg = $this->replaceConst($msg, $namespaceclass, $text);
                        if (!in_array($cat, $this->ignoredCategories)) {
                            $this->_languageElements[$cat][$msg][] = $namespaceclass.':'.$line_number;
                        }
                    }
                }
            }
        }
        
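        // debug output only: assuming dd() is a dump-and-die helper, execution stops here
        // and the redirect below is never reached
        dd($this->_languageElements);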
        
        return $this->redirect(Yii::$app->request->referrer);
    }
    
    protected function regexFirstMatch($pattern, &$subject) {
        preg_match($pattern, $subject, $match);
        if (count($match)>0) {
            return $match[1];
        }
    }

    protected function regexSecondMatch($pattern, &$subject) {
        preg_match($pattern, $subject, $match);
        if (count($match)>0) {
            return $match[2];
        }
    }

    protected function replaceConst(string &$argument, string $namespaceclass, string &$text) :string 
    {
        preg_match_all($this->patternConcatSplit, $argument, $argumentdotparts, PREG_PATTERN_ORDER, 0);
        $argumentdotparts = $argumentdotparts[0];
        foreach ($argumentdotparts as $adkey => $adpart) {
            $constparts = explode('::', $adpart);
            if (count($constparts)>1) {
                // we have a constant, replace it with value
                if ($constparts[0] == 'self') {
                    // const within self::
                    $constparts[0] = $namespaceclass;
                    $constname = implode('\\', $constparts);
                    if (isset($this->_constElements[$constname])) {
                        // self refers to itself
                        $argumentdotparts[$adkey] = $this->_constElements[$constname];
                    }
                } else {
                    // const within some other class: first try a plain "use" import
                    $otherclass = $constparts[0];
                    $useclass = $this->regexFirstMatch(str_replace('<Searchstring>', $otherclass, $this->patternUse), $text);
                    $constparts[0] = $useclass;
                    $constname = implode('\\', $constparts);
                    if (isset($this->_constElements[$constname])) {
                        // the class name was resolved through a "use" statement
                        $argumentdotparts[$adkey] = $this->_constElements[$constname];
                    } else {
                        // otherwise try a "use ... as" alias
                        $useasclass = $this->regexFirstMatch(str_replace('<Searchstring>', $otherclass, $this->patternUseas), $text);
                        $constparts[0] = $useasclass;
                        $constname = implode('\\', $constparts);
                        if (isset($this->_constElements[$constname])) {
                            // the class name was resolved through a "use ... as" alias
                            $argumentdotparts[$adkey] = $this->_constElements[$constname];
                        }
                    }
                }
                $constname = implode('\\', $constparts);
                if (isset($this->_constElements[$constname])) {
                    $argumentdotparts[$adkey] = $this->_constElements[$constname];
                }
            }
            $argumentdotparts[$adkey] = trim($argumentdotparts[$adkey],'"');
            $argumentdotparts[$adkey] = trim($argumentdotparts[$adkey],"'");
        }
        return implode('', $argumentdotparts);

    }
    /**
     * Determines whether the text contains a call to any of the translators.
     *
     * @param string[] $translators Array of translator patterns to search (for example: `['::t']`).
     * @param string $text Contents of the file.
     *
     * @return bool
     */
    protected function containsTranslator($translators, &$text)
    {
        return preg_match(
            '#(' . implode('\s*\()|(', array_map('preg_quote', $translators)) . '\s*\()#i',
            $text
        ) > 0;
    }
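
The "apply all the implements" step in actionScan() can be looked at in isolation. The sketch below uses hypothetical sample data ($constElements, $implementStack) and copies every constant defined on an interface to the class that implements it, so that self::LBL_A inside the class resolves as well. Note that the code above builds the interface name from the namespace of the implementing class, so it appears to assume both live in the same namespace; the sample data here does not rely on that.

```php
<?php
// Hypothetical constant table keyed by "Namespace\Class\CONST".
$constElements = [
    'app\\i18n\\Labels\\LBL_A' => "'Home'",
];
// Implementing class => interface, both fully qualified.
$implementStack = [
    'app\\models\\Menu' => 'app\\i18n\\Labels',
];

foreach ($implementStack as $class => $interface) {
    // foreach iterates over a copy, so adding entries inside the loop is safe.
    foreach ($constElements as $name => $value) {
        // Only copy constants whose key starts with "Interface\".
        if (strpos($name, $interface . '\\') === 0) {
            $copy = $class . substr($name, strlen($interface));
            $constElements[$copy] = $value;
        }
    }
}

print_r($constElements);
```

Replacing only the leading prefix with substr(), instead of str_replace() over the whole key, avoids touching a later part of the key that happens to contain the interface name again.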
@pczi pczi changed the title Scan for more complex messages (constants and concatenations) Scan for more complex messages (constants and concatenations in arguments) Sep 16, 2017
@moltam moltam added this to the Nice to have milestone Jun 4, 2018