Add ION extra to Kestra Python Package #16

Open
wants to merge 2 commits into
base: main


japerry911
Contributor

@japerry911 japerry911 commented Feb 7, 2025

What changes are being made and why?

  • added the Kestra ION package as an extra of the Kestra Python package

  • kept the Kestra-ION repo code about 99% the same; only the file names changed, so that from kestra_ion import read_ion works as before, which keeps the change backward compatible

    • I am happy to change the file structure; let me know your thoughts
  • updated the documentation and tests to cover the new ION extra

  • we will probably want to update the other docs that reference the Kestra-ION repo, and keep that repo running for the time being in case users do not see the new extra (I probably won't have time to switch my own usage to the new extra for a week or two)

  • closes Add kestra-ion as subpackage to kestra #11
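
As a rough illustration of the backward-compatibility point above, the snippet below checks whether a top-level module can be imported in the current environment without actually importing it. It is only a sketch: the helper name `module_available` is hypothetical and not part of this PR, and it assumes the ION code is exposed as a top-level `kestra_ion` module, as the import line in the description suggests.

```python
import importlib.util


def module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment.

    Hypothetical helper for illustration; it is not part of the kestra package.
    """
    # find_spec returns None when no importable top-level module has this name.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "kestra_ion" is the module name used in the PR description.
    if module_available("kestra_ion"):
        print("kestra_ion is importable; the ION extra appears to be installed")
    else:
        print("kestra_ion is not importable in this environment")
```

A guard like this could let calling code fall back gracefully, or print an install hint, when the extra is missing.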


How have the changes been QAed?

pip install 'kestra[ion] @ git+https://github.com/japerry911/libs.git@japerry911/imp/add-kestra-ion-extra#subdirectory=python'
  • confirmed that a plain pip install kestra does not install the Kestra ION package

  • main.py file:

import pandas as pd
import requests
from kestra_ion import read_ion

file_path = "employees.ion"
url = "https://huggingface.co/datasets/kestra/datasets/resolve/main/ion/employees.ion"

# Download the sample ION file. Fail fast on an HTTP error so that read_ion
# is never called on a file that was not written.
response = requests.get(url, timeout=30)
response.raise_for_status()
with open(file_path, "wb") as file:
    file.write(response.content)

# Parse the ION file and load the records into a DataFrame.
data = read_ion(file_path)
df = pd.DataFrame(data)
print(df.info())
python main.py
(.venv) jackperry@FDMACDE-JPER test_kestra_ion % python main.py    
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 11 columns):
 #   Column          Non-Null Count  Dtype         
---  ------          --------------  -----         
 0   employee_id     100 non-null    float64       
 1   first_name      100 non-null    object        
 2   last_name       100 non-null    object        
 3   email           100 non-null    object        
 4   phone_number    100 non-null    object        
 5   hire_date       100 non-null    datetime64[ns]
 6   job_id          100 non-null    object        
 7   salary          100 non-null    float64       
 8   commission_pct  35 non-null     float64       
 9   manager_id      99 non-null     float64       
 10  department_id   99 non-null     float64       
dtypes: datetime64[ns](1), float64(5), object(5)
memory usage: 8.7+ KB
None
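
The manual check that the `ion` extra is declared could also be scripted. The sketch below reads a distribution's metadata with the standard library and lists the extras it declares. The helper name `declared_extras` is hypothetical, and the distribution name `kestra` is taken from the pip command above; what the installed metadata contains depends on how the package is actually built.

```python
from importlib import metadata


def declared_extras(dist_name: str) -> list[str]:
    """List the extras an installed distribution declares.

    Returns an empty list when the distribution is not installed.
    Hypothetical helper for illustration; it is not part of the kestra package.
    """
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # "Provides-Extra" appears once per declared extra; get_all returns None
    # when the field is absent.
    return meta.get_all("Provides-Extra") or []


if __name__ == "__main__":
    extras = declared_extras("kestra")
    print(f"extras declared by kestra: {extras}")
    print(f"'ion' extra declared: {'ion' in extras}")
```

This reports only what the package declares, not which extras were selected at install time, so it complements the import check on `kestra_ion` and does not replace it.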
