Incorrect credentials handling in pandas.GBQTableDataset and pandas.GBQQueryDataset
#975
Labels
Community
Issue/PR opened by the open-source community
Description
The credentials handling in pandas.GBQTableDataset and pandas.GBQQueryDataset is incorrect in 2 ways:
The type of the credentials passed to pandas_gbq.read_gbq is incorrect. The credentials should be of type google.auth.credentials.Credentials but are incorrectly annotated as google.oauth2.credentials.Credentials.
From https://googleapis.dev/python/pandas-gbq/latest/api.html#pandas_gbq.read_gbq
The Credentials class google.auth.credentials.Credentials is the base class for all credentials implemented in https://github.com/googleapis/google-auth-library-python. It is not meant to be instantiated directly.
Instantiating only google.oauth2.credentials.Credentials also doesn't seem correct. The user should have the flexibility to instantiate any credentials class as long as it is based on google.auth.credentials.Credentials.
Possible implementation
To support Python API:
Simply change the type annotation to google.auth.credentials.Credentials.
To support YAML API:
This poses the generic issue of supporting non-native type instantiation through YAML. Ideally the user should have the flexibility to instantiate any credentials class as long as it is based on google.auth.credentials.Credentials.
This can be done by implementing functionality natively in Kedro that allows instantiating any arbitrary object with arguments (named or otherwise).
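A minimal sketch of what such a mechanism could look like, assuming the YAML entry is parsed into a mapping with a dotted class path plus optional positional and keyword arguments. The `instantiate` helper and the `type`, `args` and `kwargs` keys are hypothetical names chosen for illustration, not an existing Kedro API:

```python
import importlib
from typing import Any, Mapping


def instantiate(spec: Mapping[str, Any]) -> Any:
    """Build an object from a parsed config mapping.

    Hypothetical layout (not an existing Kedro API):
      type:   dotted path to the class, e.g. "package.module.ClassName"
      args:   optional list of positional arguments
      kwargs: optional mapping of keyword arguments
    """
    # Split "package.module.ClassName" into the module path and the class name.
    module_path, _, class_name = spec["type"].rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a dotted path, got {spec['type']!r}")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(*spec.get("args", ()), **spec.get("kwargs", {}))


# A standard-library class is used only so the sketch runs without extra dependencies.
delta = instantiate({"type": "datetime.timedelta", "kwargs": {"hours": 1}})
```

For credentials, `type` would point at any class based on google.auth.credentials.Credentials, with the remaining entries passed to its constructor.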
Then in code we can load whatever object is specified and pass the remaining arguments.
However, most use cases would likely only require the service account JSON when a dictionary is passed, so it makes sense to assume that if a dictionary is passed, the intended credentials are Service Account credentials.
So, inside the code we can do the following:
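A minimal sketch of that logic, assuming google-auth is installed when credentials are actually supplied. The `resolve_credentials` helper name is hypothetical, and how it would be wired into the datasets is left open; `from_service_account_info` is the google-auth constructor that builds service account credentials from a parsed JSON key:

```python
from typing import Any, Dict, Optional, Union


def resolve_credentials(
    credentials: Union[Dict[str, Any], "Credentials", None],
) -> Optional["Credentials"]:
    """Hypothetical helper: normalise the dataset's `credentials` argument
    into a google.auth.credentials.Credentials object, or None."""
    if credentials is None:
        # Nothing supplied: pass None through so default authentication applies.
        return None

    # Imported here so the sketch itself can be imported without google-auth.
    from google.auth.credentials import Credentials
    from google.oauth2 import service_account

    if isinstance(credentials, dict):
        # A dictionary is assumed to hold the service account JSON key.
        return service_account.Credentials.from_service_account_info(credentials)
    if isinstance(credentials, Credentials):
        # Any subclass of the base class is passed through unchanged.
        return credentials
    raise TypeError(
        "credentials must be a dict or a google.auth.credentials.Credentials "
        f"instance, got {type(credentials).__name__}"
    )
```

The returned object could then be handed to pandas_gbq.read_gbq through its `credentials` argument.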
The same pattern is also followed in the gcsfs library, which allows any google.auth.credentials.Credentials object. If a string is passed it calls ServiceAccountCredentials.from_service_account_file, and if a dictionary is passed it calls ServiceAccountCredentials.from_service_account_info(credentials).
See here: https://github.com/fsspec/gcsfs/blob/main/gcsfs/credentials.py
Happy to open a PR if above implementation makes sense! 🙂