ENH: Implement "Pydantic protocol" #53999
Comments
cc @Dr-Irv |
@Kludex You wrote:
Can you give a short example of what things look like today if you try to use Pydantic with pandas, i.e., the nature of the customization that is required? |
Yep! I'll use pd.Series as the example. Right now, if you want a pd.Series field on a Pydantic model, you need to do the following:
import numpy as np
import pandas as pd
from pydantic import BaseModel, ConfigDict
arr = [1, 2, 3, np.nan, 6, 8]
series = pd.Series(arr)
class Model(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    series: pd.Series
The thing is that usage with Pydantic is limited this way: I can't generate the JSON schema (interesting for users of FastAPI, for example), nor populate the field from lists or dicts. Implementing __get_pydantic_core_schema__ lifts those limits:
import numpy as np
import pandas as pd
from pydantic import GetCoreSchemaHandler, BaseModel
from pydantic_core import core_schema
from typing import Any
arr = [1, 2, 3, np.nan, 6, 8]
series = pd.Series(arr)
class Series(pd.Series):
    @classmethod
    def __get_pydantic_core_schema__(
        cls, __source: type[Any], __handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        return core_schema.no_info_before_validator_function(
            pd.Series, core_schema.list_schema()
        )
class Model(BaseModel):
    series: Series
model = Model(series=series)
model = Model(series=arr) # You can now use the `arr`
model = Model(series={"a": 1, "b": 2}) # Or the dict... Or any other type that constructs a pd.Series
model.model_json_schema()  # You are also able to generate the JSON schema
You can see more about this in "Creating custom classes using __get_pydantic_core_schema__". Hope it's clear. |
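As a variation on the snippet above, here is a sketch that validates into a pd.Series and also says how the field is dumped. It follows the third-party-type pattern from the Pydantic docs (json_or_python_schema plus a plain serializer). The to_series helper and the choice to dump only the values as a list are assumptions made for illustration; this is not the code from the pandas PR.

```python
from typing import Any

import pandas as pd
from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class Series(pd.Series):
    """Hypothetical subclass used only for this sketch."""

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: type[Any], handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        def to_series(value: Any) -> pd.Series:
            # Pass an existing Series through; otherwise let pd.Series build one.
            return value if isinstance(value, pd.Series) else pd.Series(value)

        # JSON input: validate a JSON array first, then build the Series.
        from_list = core_schema.chain_schema(
            [
                core_schema.list_schema(),
                core_schema.no_info_plain_validator_function(to_series),
            ]
        )
        return core_schema.json_or_python_schema(
            json_schema=from_list,
            # Python input: lists, dicts, and Series all go through to_series.
            python_schema=core_schema.no_info_plain_validator_function(to_series),
            # Assumption: dump only the values as a plain list (the index is dropped).
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda s: s.tolist()
            ),
        )


class Model(BaseModel):
    series: Series


from_list_model = Model(series=[1, 2, 3])
from_dict_model = Model(series={"a": 1, "b": 2})
schema = Model.model_json_schema()
```

With this shape the field holds a pd.Series after validation, and the JSON schema for the field is derived from the list schema in the JSON branch.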
if you're willing to submit a PR, this seems nice to have! |
I am! Thanks! 🙏 |
cool - note that |
Is the |
No, it's not. I'll push a PR soon.
They don't need it for this to be implemented in |
Sure, but it would just be good to see if there's a precedent for it. Has any other library accepted this yet? |
Pydantic V2 was released a week ago, and |
could you clarify please? |
Yes... What I mean is that the one I provided was just an example of what it would look like... 😅 The one implemented in https://github.com/pandas-dev/pandas/pull/54034/files is "general" enough to fit all users' use cases. |
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I wish I could use Pydantic with pandas without needing customization or an external library.
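To make the problem concrete, here is a minimal sketch of today's behavior, assuming Pydantic V2 and pandas are installed. It uses only the arbitrary_types_allowed setup discussed in the comments; the names list_accepted and schema_generated are illustrative flags, not part of any API.

```python
import pandas as pd
from pydantic import BaseModel, ConfigDict, ValidationError
from pydantic.errors import PydanticInvalidForJsonSchema


class Model(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    series: pd.Series


# An actual pd.Series instance passes the isinstance check.
ok = Model(series=pd.Series([1, 2, 3]))

# A plain list is rejected rather than converted to a Series.
try:
    Model(series=[1, 2, 3])
    list_accepted = True
except ValidationError:
    list_accepted = False

# JSON schema generation fails for the arbitrary type.
try:
    Model.model_json_schema()
    schema_generated = True
except PydanticInvalidForJsonSchema:
    schema_generated = False
```

Both flags end up False here, which is the customization gap this request is about.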
Feature Description
The idea would be to implement the __get_pydantic_core_schema__ mentioned here on pd.Series, pd.DataFrame, and others.
Alternative Solutions
There's a PR in pydantic-extra-types to include a custom type for pd.Series.
Additional Context
We have more information in the migration guide.