diff --git a/README.md b/README.md
index 06ab864..3093336 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,25 @@
 # reverseCentaur
-
+Tool to turn bro logs into data for all sorts of machine learning/statisticable csvs
+1. AVG/TOT across domains
+2. Time series in domain
+3. All n length fingerprints for a period across domains
+4. All n length fingerprints across domains
 ## To-Do:
+### Easy
+* Add PCR(s)
+* Remove grep
+* Directory Load
 * see if I should be using pandas pivots
+* Aggregate each domain file
+### Hard
 * Time Based Split
-* Directory Load
+* Find CDX data
+* Data cleaning
+* Shitty anomaly detect
+* Shitty clustering
 * Periodicity & jitter test w/ fake data in iPynb
-## Long Term:
+### Long Term:
 * https://github.com/featuretools/featuretools
 * TPOT
 * Random Anomaly Detection