Merge branch 'release/2.0.0'

Release 2.0.0
PaperMtn · Jan 2, 2021 · cc036eb
2 parents 2ad84b0 + 5dabff5
commit cc036eb
Showing 15 changed files with 600 additions and 309 deletions.
diff --git a/.github/workflows/pythonpublish.yml b/.github/workflows/pythonpublish.yml
@@ -0,0 +1,31 @@
+# This workflow will upload a Python Package using Twine when a release is created
+# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
+
+name: Upload Python Package
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: '3.x'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install setuptools wheel twine
+    - name: Build and publish
+      env:
+        TWINE_USERNAME: ${{ '__token__' }}
+        TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
+      run: |
+        python setup.py sdist bdist_wheel
+        twine upload dist/*
diff --git a/.gitignore b/.gitignore
@@ -125,6 +125,8 @@ venv.bak/
 dmypy.json
 
 # package related
-src/additional_matches.txt
-src/duplicate_passwords.txt
-src/HIBP_matches.txt
+lil_pwny/additional_matches.txt
+lil_pwny/duplicate_passwords.txt
+lil_pwny/HIBP_matches.txt
+*.json
+lil_pwny/caching_test*.py
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,10 @@
+## 2.0.0 - 2021-01-02
+### Added
+- Massive enhancements to make much better use of multiprocessing for the large HIBP password file, as well as more efficient importing and handling of Active Directory user hashes. 
+- Updated directory structure to play more nicely with more OS versions and flavours, rather than installing in the `src` directory.
+- Logging: Removed outdated text file output and implemented JSON formatted logging to either stdout or a .log file.
+- New option to obfuscate genuine password NTLM hashes in logging output. This is achieved by further hashing the hash with a randomly generated salt.
+- Active Directory computer accounts are no longer imported with AD user hashes. There is little value in assessing these, so there is no point importing them.
+
+## 1.2.0 - 2020-03-22
+Initial Release
diff --git a/README.md b/README.md
@@ -1,22 +1,38 @@
+<img src="https://i.imgur.com/Q0pPSjN.png" width="450">
+
 # Lil Pwny
 ![Python 2.7 and 3 compatible](https://img.shields.io/badge/python-2.7%2C%203.x-blue.svg)
 ![PyPI version](https://img.shields.io/pypi/v/lil-pwny.svg)
 ![License: MIT](https://img.shields.io/pypi/l/lil-pwny.svg)
 
-A multiprocessing approach to auditing Active Directory passwords using Python.
+Fast, offline auditing of Active Directory passwords using Python.
 
 ## About Lil Pwny
 
-Lil Pwny is a Python application to perform an offline audit of NTLM hashes of users' passwords, recovered from Active Directory, against known compromised passwords from Have I Been Pwned. The usernames of any accounts matching HIBP will be returned in a .txt file
+Lil Pwny is a Python application to perform an offline audit of NTLM hashes of users' passwords, recovered from Active Directory, against known compromised passwords from Have I Been Pwned. Results are output in JSON format containing the username, the matching hash (which can be obfuscated), and how many times the matching password has been seen in HIBP.
 
 There are also additional features:
-- Ability to provide a list of your own passwords to check AD users against. This allows you to check user passwords against passwords relevant to your organisation that you suspect people might be using. These are NTLM hashed, and AD hashes are then compared with this as well as the HIBP hashes.
+- Ability to provide a list of your own custom passwords to check AD users against. This allows you to check user passwords against passwords relevant to your organisation that you suspect people might be using. These are NTLM hashed, and AD hashes are then compared with this as well as the HIBP hashes.
 - Return a list of accounts using the same passwords. Useful for finding users using the same password for their administrative and standard accounts.
+- Obfuscate hashes in output, in case you don't want to handle or store live user NTLM hashes.
+
+More information about Lil Pwny can be found [on my blog](https://papermtn.co.uk/category/tools/lil-pwny/)
+
+## Resources
+This application has been developed to make the most of multiprocessing in Python, with the aim of it working as fast as possible on consumer level hardware.
+
+Because it uses multiprocessing, the more cores you have available, the faster Lil Pwny should run. I have still had very good results with a low number of logical cores:
+- Test env of ~8500 AD accounts and HIBP list of 613,584,246 hashes:
+    - 6 logical cores - 0:05:57.640813
+    - 12 logical cores - 0:04:28.579201
 
-More information about Lil Pwny can be found [on my blog](https://papermtn.co.uk/)
+## Output
+Lil Pwny will output results in JSON format, either to stdout or to a file:
 
-## Recommendations
-This application was developed to ideally run on high resource infrastructure to make the most of Python multiprocessing. It will run on desktop level hardware, but the more cores you use, the faster the audit will run.
+```json
+{"localtime": "2021-00-00 00:00:00,000", "level": "NOTIFY", "source": "Lil Pwny", "match_type": "hibp", "detection_data": {"username": "RICKON.STARK", "hash": "0C02C50B2B08F2979DFDE12EDA472FC1", "matches_in_hibp": "24230577", "obfuscated": "True"}}
+```
+This JSON formatted logging can be easily ingested into a SIEM or other log analysis tool, and can be fed to other scripts or platforms for automated resolution actions.
 
 ## Installation
 Install via pip
@@ -27,25 +43,33 @@ pip install lil-pwny
 ## Usage
 Lil-pwny will be installed as a global command, use as follows:
 
-```bash
-usage: lil-pwny [-h] -hibp HIBP [-a A] -ad AD_HASHES [-d] [-m] [-o OUTPUT]
+```
+usage: lil-pwny [-h] -hibp HIBP [-c CUSTOM] -ad AD_HASHES [-d]
+                   [-output {file,stdout}] [-o]
 
 optional arguments:
-  -hibp, --hibp-path    The HIBP .txt file of NTLM hashes
-  -a, --a               .txt file containing additional passwords to check for
-  -ad, --ad-hashes      The NTLM hashes from of AD users
-  -d, --find-duplicates Output a list of duplicate password users
-  -m, --memory          Load HIBP hash list into memory (over 24GB RAM
-                        required)
-  -o, --out-path        Set output path. Uses working dir when not set
+  -h, --help            show this help message and exit
+  -hibp HIBP, --hibp-path HIBP
+                        The HIBP .txt file of NTLM hashes
+  -c CUSTOM, --custom CUSTOM
+                        .txt file containing additional custom passwords to
+                        check for
+  -ad AD_HASHES, --ad-hashes AD_HASHES
+                        The NTLM hashes of AD users
+  -d, --duplicates      Output a list of duplicate password users
+  -output {file,stdout}, --output {file,stdout}
+                        Where to send results
+  -o, --obfuscate       Obfuscate hashes from discovered matches by hashing
+                        with a random salt
+
 ```
 
 Example:
 ```bash
-lil-pwny -hibp ~/hibp_hashes.txt -ad ~/ad_ntlm_hashes.txt -a ~/additional_passwords.txt -o ~/Desktop/Output -m -d
+lil-pwny -hibp ~/hibp_hashes.txt -ad ~/ad_user_hashes.txt -c ~/custom_passwords.txt -output stdout -do
 ```
 
-use of the `-m` flag will load the HIBP hashes into memory, which will allow for faster searching. Note this will require at least 24GB of available memory.
+
 
 ## Getting input files
 ### Step 1: Get an IFM AD database dump
@@ -71,9 +95,9 @@ Get-ADDBAccount -All -DBPath '.\Active Directory\ntds.dit' -BootKey $bootKey | F
 ```
 
 ### Step 3: Download the latest HIBP hash file
-The file can be downloaded from [here](https://downloads.pwnedpasswords.com/passwords/pwned-passwords-ntlm-ordered-by-count-v5.7z)
+The file can be downloaded from [here](https://downloads.pwnedpasswords.com/passwords/pwned-passwords-ntlm-ordered-by-count-v7.7z)
 
-The latest version of the hash file contains around 551 million hashes.
+The latest version of the hash file contains around 613 million hashes.
 
 ## Resources
 - [ntdsutil & IFM](https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/cc732530(v=ws.11))
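
Reviewer note: the README's point that the JSON output can be fed to other scripts can be illustrated with a small consumer. This is a minimal sketch, assuming one JSON object per line shaped like the sample record in the README hunk above; the helper name `group_matches` is hypothetical and not part of Lil Pwny.

```python
import json
from collections import defaultdict

# Sample record copied from the README hunk above.
SAMPLE = (
    '{"localtime": "2021-00-00 00:00:00,000", "level": "NOTIFY", '
    '"source": "Lil Pwny", "match_type": "hibp", "detection_data": '
    '{"username": "RICKON.STARK", "hash": "0C02C50B2B08F2979DFDE12EDA472FC1", '
    '"matches_in_hibp": "24230577", "obfuscated": "True"}}'
)


def group_matches(lines):
    """Group usernames by match_type from an iterable of JSON log lines.

    Hypothetical helper: assumes one JSON object per line, shaped like SAMPLE.
    """
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        # Only NOTIFY records carry detection data in the sample format
        if not isinstance(record, dict) or record.get('level') != 'NOTIFY':
            continue
        data = record.get('detection_data') or {}
        if 'username' in data:
            grouped[record.get('match_type')].append(data['username'])
    return dict(grouped)


if __name__ == '__main__':
    print(group_matches([SAMPLE]))
```

Grouping by `match_type` keeps HIBP, custom, and duplicate findings separate, which is usually what a downstream remediation script would want.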
diff --git a/lil_pwny/__about__.py b/lil_pwny/__about__.py
@@ -0,0 +1,19 @@
+
+__all__ = [
+    '__title__',
+    '__summary__',
+    '__uri__',
+    '__version__',
+    '__author__',
+    '__email__',
+    '__license__',
+]
+
+__title__ = 'Lil Pwny'
+__summary__ = 'Fast offline auditing of Active Directory passwords using Python and multiprocessing'
+__uri__ = 'https://github.com/PaperMtn/lil-pwny'
+__version__ = '2.0.0'
+__author__ = 'PaperMtn'
+__email__ = '[email protected]'
+__license__ = 'GPL-3.0'
+__copyright__ = '2021 {}'.format(__author__)
diff --git a/lil_pwny/__init__.py b/lil_pwny/__init__.py
@@ -0,0 +1,159 @@
+import os
+import time
+import builtins
+import argparse
+import uuid
+from datetime import timedelta
+
+from lil_pwny import hashing
+from lil_pwny import password_audit
+from lil_pwny import logger
+from lil_pwny import __about__
+
+OUTPUT_LOGGER = ''
+
+
+def main():
+    global OUTPUT_LOGGER
+    custom_count = 0
+    duplicate_count = 0
+
+    try:
+        start = time.time()
+
+        parser = argparse.ArgumentParser()
+        parser.add_argument('-hibp', '--hibp-path', help='The HIBP .txt file of NTLM hashes',
+                            dest='hibp', required=True)
+        parser.add_argument('--version', action='version',
+                            version='lil-pwny {}'.format(__about__.__version__))
+        parser.add_argument('-c', '--custom', help='.txt file containing additional custom passwords to check for',
+                            dest='custom')
+        parser.add_argument('-ad', '--ad-hashes', help='The NTLM hashes of AD users', dest='ad_hashes',
+                            required=True)
+        parser.add_argument('-d', '--duplicates', action='store_true', dest='d',
+                            help='Output a list of duplicate password users')
+        parser.add_argument('-output', '--output', choices=['file', 'stdout'], dest='logging_type',
+                            help='Where to send results')
+        parser.add_argument('-o', '--obfuscate', action='store_true', dest='obfuscate',
+                            help='Obfuscate hashes from discovered matches by hashing with a random salt')
+
+        args = parser.parse_args()
+        hibp_file = args.hibp
+        custom_passwords = args.custom
+        ad_hash_file = args.ad_hashes
+        duplicates = args.d
+        logging_type = args.logging_type
+        obfuscate = args.obfuscate
+
+        hasher = hashing.Hashing()
+
+        if logging_type:
+            if logging_type == 'file':
+                OUTPUT_LOGGER = logger.FileLogger(log_path=os.getcwd())
+            elif logging_type == 'stdout':
+                OUTPUT_LOGGER = logger.StdoutLogger()
+        else:
+            OUTPUT_LOGGER = logger.StdoutLogger()
+
+        if isinstance(OUTPUT_LOGGER, logger.StdoutLogger):
+            print = OUTPUT_LOGGER.log_info
+        else:
+            print = builtins.print
+
+        print('*** Lil Pwny started execution ***')
+        print('Loading AD user hashes...')
+        try:
+            ad_users = password_audit.import_users(ad_hash_file)
+            ad_lines = 0
+            for ls in ad_users.values():
+                ad_lines += len(ls)
+        except FileNotFoundError as not_found:
+            raise Exception('AD user file not found: {}'.format(not_found.filename))
+        except Exception as e:
+            raise e
+
+        print('Comparing {} AD users against HIBP compromised passwords...'.format(ad_lines))
+        try:
+            hibp_results = password_audit.search(OUTPUT_LOGGER, hibp_file, ad_hash_file)
+            hibp_count = len(hibp_results)
+            # Raw results are not printed here: the hashes are not yet obfuscated at this point
+            for hibp_match in hibp_results:
+                if obfuscate:
+                    hibp_match['hash'] = hasher.obfuscate(hibp_match.get('hash'))
+                    hibp_match['obfuscated'] = 'True'
+                else:
+                    hibp_match['obfuscated'] = 'False'
+                OUTPUT_LOGGER.log_notification(hibp_match, 'hibp')
+        except FileNotFoundError as not_found:
+            raise Exception('HIBP file not found: {}'.format(not_found.filename))
+        except Exception as e:
+            raise e
+
+        if custom_passwords:
+            try:
+                # Import custom strings from file and convert them to NTLM hashes
+                custom_content = hasher.get_hashes(custom_passwords)
+
+                # Create a tmp file to store the converted hashes and pass to the search function
+                # Filename is a randomly generated uuid
+                f = open('{}.tmp'.format(str(uuid.uuid4().hex)), 'w')
+                for h in custom_content:
+                    # Replicate HIBP format: "hash:occurrence"
+                    f.write('{}:{}'.format(h, 0) + '\n')
+                f.close()
+
+                print('Comparing {} Active Directory users against {} custom password hashes...'
+                      .format(ad_lines, len(custom_content)))
+                custom_matches = password_audit.search(OUTPUT_LOGGER, f.name, ad_hash_file)
+                custom_count = len(custom_matches)
+
+                # Remove the tmp file
+                os.remove(f.name)
+
+                for custom_match in custom_matches:
+                    if obfuscate:
+                        custom_match['hash'] = hasher.obfuscate(custom_match.get('hash'))
+                        custom_match['obfuscated'] = 'True'
+                    else:
+                        custom_match['obfuscated'] = 'False'
+                    OUTPUT_LOGGER.log_notification(custom_match, 'custom')
+            except FileNotFoundError as not_found:
+                raise Exception('Custom password file not found: {}'.format(not_found.filename))
+            except Exception as e:
+                raise e
+
+        if duplicates:
+            try:
+                print('Finding users with duplicate passwords...')
+                duplicate_results = password_audit.find_duplicates(ad_users)
+                duplicate_count = len(duplicate_results)
+                for duplicate_match in duplicate_results:
+                    if obfuscate:
+                        duplicate_match['hash'] = hasher.obfuscate(duplicate_match.get('hash'))
+                        duplicate_match['obfuscated'] = 'True'
+                    else:
+                        duplicate_match['obfuscated'] = 'False'
+                    OUTPUT_LOGGER.log_notification(duplicate_match, 'duplicate')
+            except Exception as e:
+                raise e
+
+        time_taken = time.time() - start
+        total_comp_count = custom_count + hibp_count
+
+        print('Audit completed')
+        print('Total compromised passwords: {}'.format(total_comp_count))
+        print('Passwords matching HIBP: {}'.format(hibp_count))
+        print('Passwords matching custom password dictionary: {}'.format(custom_count))
+        print('Passwords duplicated (being used by multiple user accounts): {}'.format(duplicate_count))
+        print('Time taken: {}'.format(str(timedelta(seconds=time_taken))))
+
+    except Exception as e:
+        if isinstance(OUTPUT_LOGGER, logger.StdoutLogger):
+            OUTPUT_LOGGER.log_critical(e)
+        else:
+            print = builtins.print
+            print(e)
+
+
+if __name__ == '__main__':
+    main()
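
Reviewer note: the custom-password branch above writes its temporary `hash:0` file into the working directory and removes it only after a successful search, so an exception in between leaves the file behind. One possible alternative is sketched below, assuming the search function only needs a file path; `hibp_style_tmpfile` is a hypothetical helper and not part of this commit.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def hibp_style_tmpfile(hashes):
    """Yield the path of a temp file holding one 'HASH:0' line per hash.

    Hypothetical helper: the file is deleted when the with-block exits,
    even if the body raises.
    """
    fd, path = tempfile.mkstemp(suffix='.tmp', text=True)
    try:
        with os.fdopen(fd, 'w') as f:
            for h in hashes:
                # Replicate the HIBP line format: "hash:occurrence"
                f.write('{}:0\n'.format(h))
        yield path
    finally:
        os.remove(path)
```

With this in place, the search call could sit inside `with hibp_style_tmpfile(custom_content) as path:` and receive `path` in place of `f.name`.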
diff --git a/lil_pwny/__main__.py b/lil_pwny/__main__.py
@@ -0,0 +1,3 @@
+from lil_pwny import main
+
+main()
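
Reviewer note on the duplicates step in `lil_pwny/__init__.py`: `password_audit.find_duplicates` is not shown in this part of the diff. The sketch below shows one way such a step could work, assuming `ad_users` maps each NTLM hash to a list of usernames (the loop that sums `len(ls)` over `ad_users.values()` suggests this, but it is an assumption). The function name `find_duplicate_users` and the `users` key are hypothetical.

```python
def find_duplicate_users(ad_users):
    """Return one record per hash that is shared by more than one user.

    Assumes ad_users is a dict of {ntlm_hash: [username, ...]}.
    """
    results = []
    for ntlm_hash, users in ad_users.items():
        if len(users) > 1:
            results.append({'hash': ntlm_hash, 'users': sorted(users)})
    return results


if __name__ == '__main__':
    sample = {
        'HASH_A': ['admin.jon', 'jon'],
        'HASH_B': ['arya'],
    }
    print(find_duplicate_users(sample))
```

Each record carries a `hash` key so that it could pass through the same obfuscation loop that `main()` applies to the other match types.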
diff --git a/lil_pwny/hashing.py b/lil_pwny/hashing.py
@@ -0,0 +1,52 @@
+import binascii
+import hashlib
+import secrets
+
+
+class Hashing(object):
+    def __init__(self):
+        self.salt = secrets.token_hex(8)
+
+    @staticmethod
+    def _hashify(input_string):
+        """Converts the input string to an NTLM hash and returns the hash
+
+        Parameters:
+            input_string: string to be converted to NTLM hash
+        Returns:
+            Converted NTLM hash
+        """
+
+        output = hashlib.new('md4', input_string.encode('utf-16le')).digest()
+
+        return binascii.hexlify(output).decode('utf-8').upper()
+
+    def get_hashes(self, input_file):
+        """Reads the input file of passwords, converts them to NTLM hashes
+
+        Parameters:
+            input_file: file containing strings to convert to NTLM hashes
+        Returns:
+            Dict that replicates HIBP format: 'hash:occurrence_count'
+        """
+
+        output_dict = {}
+        with open(input_file, 'r') as f:
+            for item in f:
+                if item.strip():  # skip blank lines (a bare newline is still truthy)
+                    output_dict[self._hashify(item.strip())] = '0'
+
+        return output_dict
+
+    def obfuscate(self, input_hash):
+        """Further hashes the input NTLM hash with a random salt
+
+        Parameters:
+            input_hash: hash to be obfuscated
+        Returns:
+            String containing obfuscated hash
+        """
+
+        output = hashlib.new('md4', (input_hash + self.salt).encode('utf-16le')).digest()
+
+        return binascii.hexlify(output).decode('utf-8').upper()
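
Reviewer note: `obfuscate` uses `hashlib.new('md4', ...)`, which only works when the underlying OpenSSL build exposes MD4, and some builds do not enable it by default. The obfuscated value is never compared against real NTLM hashes, so a different digest should serve the same purpose. Below is a minimal sketch that swaps in SHA-256; the class name `Obfuscator` is hypothetical and this is not the commit's implementation.

```python
import hashlib
import secrets


class Obfuscator(object):
    """Hypothetical stand-in for Hashing.obfuscate, using SHA-256 instead of MD4."""

    def __init__(self):
        # One random salt per instance, as in the commit: equal input hashes
        # map to equal outputs within a run, but outputs differ between runs.
        self.salt = secrets.token_hex(8)

    def obfuscate(self, input_hash):
        """Return an uppercase hex digest of the input hash plus the salt."""
        salted = (input_hash + self.salt).encode('utf-8')
        return hashlib.sha256(salted).hexdigest().upper()


if __name__ == '__main__':
    obfuscator = Obfuscator()
    print(obfuscator.obfuscate('0C02C50B2B08F2979DFDE12EDA472FC1'))
```

Keeping a single salt per run means duplicate passwords still produce matching obfuscated values in one report, while values from separate runs cannot be correlated.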