[Bug]: Minor bug in gemini batch prediction tutorial (trailing slash in gcs_uri breaks fsspec.glob pattern) #2179

@baeseongsu

File Name

gemini/batch-prediction/intro_batch_prediction.ipynb

What happened?

In the current gemini/batch-prediction/intro_batch_prediction.ipynb notebook, the gcs_batch_job.dest.gcs_uri attribute of a completed job includes a trailing slash. When this URI is interpolated into the notebook's f-string glob pattern, the resulting pattern contains a double slash (it ends in //*/predictions.jsonl), so fs.glob() matches nothing and returns an empty list instead of the path to the predictions.jsonl file.

Could this issue be resolved by using .rstrip('/') on the URI before concatenation?
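
A minimal sketch of that suggestion, shown on plain strings so it runs without GCS access. The helper name predictions_glob and the example bucket path are hypothetical and not part of the notebook; only the rstrip('/') call and the pattern suffix come from the report above.

```python
def predictions_glob(dest_uri: str) -> str:
    """Build the glob pattern for predictions.jsonl under a job's output URI.

    Hypothetical helper for illustration; it is not defined in the notebook.
    """
    # Drop any trailing slashes so the pattern never contains "//"
    # between the URI and the wildcard segment.
    return f"{dest_uri.rstrip('/')}/*/predictions.jsonl"


# Made-up bucket path, used only to compare the two URI shapes.
with_slash = predictions_glob("gs://example-bucket/batch-output/")
without_slash = predictions_glob("gs://example-bucket/batch-output")

# Both forms of the URI produce the same single-slash pattern.
assert with_slash == without_slash
assert with_slash == "gs://example-bucket/batch-output/*/predictions.jsonl"
```

In the notebook this would presumably be used as fs.glob(predictions_glob(gcs_batch_job.dest.gcs_uri)), or equivalently by calling .rstrip('/') inline inside the existing f-string. Whether the trailing slash is always present on dest.gcs_uri is not confirmed here, which is why stripping it is safer than assuming either form.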

Relevant log output

fs = fsspec.filesystem("gcs")

# Since this pattern includes a double slash, file_paths will be an empty list.
file_paths = fs.glob(f"{gcs_batch_job.dest.gcs_uri}/*/predictions.jsonl")

if gcs_batch_job.state == "JOB_STATE_SUCCEEDED":
    # Load the JSONL file into a DataFrame
    df = pd.read_json(f"gs://{file_paths[0]}", lines=True)

    df = df.join(pd.json_normalize(df["response"], "candidates"))
    display(df)
