Added normalisation and unit test cases #118
base: release/0.9
Conversation
Can you make this functionality based on the Transformation step and also put it in the correct module?
See comments
Thank you for your contribution @sritha272, really appreciate it. I do agree with Mikita about the placement of the modules that you chose for this one. Also, can you explain the intended use for this a bit more? Would this be for ML use cases with the input being something like pandas, perhaps? Or did you have something else in mind? I propose that we have a small meetup to discuss, as I would love to add your contribution to our library.
See my earlier comment
Please also see: #129
Description
Implemented multiple data transformation functions (normalize, scale, clipping, exponential, standardize, z_score_normalize) to enhance the framework's data processing capabilities. Each function comes with comprehensive unit tests to ensure correctness and to handle edge cases.
Breakdown of Each Implemented Function
Normalize
Scales data to a specified range [min_value, max_value].
Includes edge case handling for zero range and empty data.
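A minimal sketch of what such a helper might look like (the name, signature, and plain-list input are assumptions for illustration, not the PR's actual code):

```python
def normalize(data, min_value=0.0, max_value=1.0):
    """Min-max scale the values in `data` into [min_value, max_value]."""
    if not data:               # empty input: nothing to normalize
        return []
    lo, hi = min(data), max(data)
    if hi == lo:               # zero range: every point maps to min_value
        return [min_value for _ in data]
    span = max_value - min_value
    return [min_value + (x - lo) / (hi - lo) * span for x in data]
```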
Scale
Scales data by a specified multiplier.
Useful for linear scaling transformations.
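A sketch of the same idea, again assuming a plain list of numbers (illustrative only):

```python
def scale(data, multiplier=1.0):
    """Multiply every value in `data` by `multiplier` (linear scaling)."""
    return [x * multiplier for x in data]
```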
Clipping
Clips data to fall within a specified range [min_value, max_value].
Prevents extreme outliers in datasets.
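A possible shape for this, with the name `clip` chosen only for illustration:

```python
def clip(data, min_value, max_value):
    """Limit every value in `data` to the closed range [min_value, max_value]."""
    return [min(max(x, min_value), max_value) for x in data]
```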
Exponential Transformation
Applies an exponential transformation to data with a specified base.
Handles exponential growth scenarios effectively.
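One plausible reading of this, raising the given base to each data point (illustrative; the actual PR may interpret the transformation differently):

```python
import math

def exponential(data, base=math.e):
    """Raise `base` to the power of each value in `data`."""
    return [base ** x for x in data]
```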
Standardize
Standardizes data to have a mean of 0 and a standard deviation of 1.
Includes custom mean and standard deviation parameters.
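A minimal sketch under the same assumptions, with optional caller-supplied mean and standard deviation:

```python
import statistics

def standardize(data, mean=None, std=None):
    """Shift and scale `data` to mean 0 and std 1, or use supplied mean/std."""
    if not data:
        return []
    mean = statistics.fmean(data) if mean is None else mean
    std = statistics.pstdev(data) if std is None else std
    if std == 0:               # identical values: avoid division by zero
        return [0.0 for _ in data]
    return [(x - mean) / std for x in data]
```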
Z-Score Normalize
Computes the z-score of each data point for standardization.
Handles mixed positive and negative datasets effectively.
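In its simplest form this overlaps with standardize above; a sketch under the same assumptions:

```python
import statistics

def z_score_normalize(data):
    """Replace each value with its z-score relative to the whole sample."""
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    if std == 0:
        return [0.0 for _ in data]
    return [(x - mean) / std for x in data]
```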
Related Issue
Motivation and Context
The newly added functions provide robust data normalization and transformation capabilities, which are critical for preparing data for machine learning, statistical analysis, and other computational tasks. These functions solve the problem of inconsistent data scaling and ensure uniform preprocessing pipelines.
How Has This Been Tested?
Unit Tests: Added unit tests for each function:
Verified outputs for standard, edge, and invalid inputs.
Tests include large numbers, small numbers, mixed data types, and edge cases like empty datasets or identical values.
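For illustration, a test in the style described might look like the following, assuming the normalize sketch above lives in a hypothetical transformations module:

```python
import unittest

from transformations import normalize  # hypothetical module path, for illustration

class TestNormalize(unittest.TestCase):
    def test_standard_input(self):
        self.assertEqual(normalize([0, 5, 10], 0.0, 1.0), [0.0, 0.5, 1.0])

    def test_empty_data_returns_empty_list(self):
        self.assertEqual(normalize([]), [])

    def test_identical_values_map_to_min_value(self):
        self.assertEqual(normalize([7, 7, 7], 0.0, 1.0), [0.0, 0.0, 0.0])

if __name__ == "__main__":
    unittest.main()
```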
Environment: Testing performed on:
Python 3.12
OS: Windows 11
Commands:
Ran python -m unittest discover tests to ensure all test cases passed successfully.
Validated compatibility with existing project components.
Screenshots (if appropriate):
Types of changes
Checklist: