Help with multinomial logistic regression implementation #289
Hi, I am trying to implement an approach from a paper that attempts to predict the causal structure of data. As part of the approach, I need to perform multinomial logistic regression tests over my data to predict conditional independence amongst the variables (i.e. are X and Y independent given a set of variables S?). This basically involves applying two different conditional independence tests depending on the type of variables:
I am a little unsure how this could be implemented in Pingouin, and I was wondering if I could have some guidance, particularly for the second point. Here's an example dataset that might help:

    import numpy as np
    import pandas as pd


    def generate_data(n_data_points: int) -> pd.DataFrame:
        # Education:
        #   0 = None
        #   1 = High school
        #   2 = Undergraduate
        #   3 = Postgraduate
        education = np.random.choice([0, 1, 2, 3], size=n_data_points)

        # Previous experience (years); abs() rules out negative experience
        experience = np.abs(np.random.normal(10, 5, size=n_data_points).astype(int))

        # Job offer:
        #   0 = No
        #   1 = Yes
        #   2 = Alternative (less experienced) role
        job_offer = np.zeros_like(experience)
        for j in range(n_data_points):
            if education[j] == 3 and experience[j] > 5:
                job_offer[j] = np.random.choice([0, 1, 2], p=[0.01, 0.1, 0.89])
            elif education[j] > 1 and experience[j] > 3:
                job_offer[j] = np.random.choice([0, 1, 2], p=[0.3, 0.6, 0.1])
            else:
                job_offer[j] = np.random.choice([0, 1, 2], p=[0.89, 0.1, 0.01])

        simulated_data = pd.DataFrame({"education": education,
                                       "experience": experience,
                                       "job_offer": job_offer})
        return simulated_data


    if __name__ == "__main__":
        df = generate_data(1000)
        print(df)
Replies: 1 comment
Hi @AndrewC19, This unfortunately cannot be implemented in Pingouin for the simple reason that it does not support multinomial logistic regression. However, I think this should be doable with statsmodels. These links may be useful: