Skip to content

[Converter] FailedPreconditionError when output shard path is a directory (Missing path validation) #8635

@amadhan882

Description

@amadhan882

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js): Yes (Verification script provided below)
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04.5 LTS (Jammy Jellyfish)
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow.js installed from (npm or script link): pip (tensorflowjs)
  • TensorFlow.js version (use command below): 4.22.0
  • Browser version: N/A
  • Tensorflow.js Converter Version: 4.22.0
  • Python version: 3.12.12
  • TensorFlow version: 2.19.0

Describe the current behavior

The tensorflowjs converter fails with a low-level FailedPreconditionError when a directory exists at the target path where a weight shard file (e.g., group1-shard1of1.bin) is expected to be written. This happens because the Python layer in write_weights.py does not check if the target path is a directory before attempting to write via tf.io.gfile.GFile, leading to a backend crash instead of a graceful Python exception.

Describe the expected behavior

The converter should perform high-level path validation. If a directory is found at the target file path, it should raise a clear Python exception (e.g., IsADirectoryError) instead of triggering a backend FailedPreconditionError.

Standalone code to reproduce the issue

The following script demonstrates the bug by creating a conflicting directory and then verifies the fix by patching the local library:

import os
import tensorflow as tf
import tensorflowjs as tfjs
import importlib
import tensorflowjs.write_weights as ww

# 1. Setup conflict directory
file_path = ww.__file__
target_dir = "real_fix_test"
shard_name = "group1-shard1of1.bin"
os.makedirs(target_dir, exist_ok=True)
os.makedirs(os.path.join(target_dir, shard_name), exist_ok=True)

# 2. Test BEFORE fix (Triggers Bug)
print("\n[*] --- TESTING BEFORE FIX ---")
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
try:
    tfjs.converters.save_keras_model(model, target_dir)
except Exception as e:
    print(f"[!] Caught Original Bug: {type(e).__name__}")

# 3. APPLY REAL PATCH TO LIBRARY (Adding validation check)
with open(file_path, 'r') as f:
    code = f.read()

old_line = "    with tf.io.gfile.GFile(filepath, 'wb') as f:"
new_line = """    if tf.io.gfile.isdir(filepath):
        raise IsADirectoryError(f"Cannot write shard: {filepath} is a directory.")
    with tf.io.gfile.GFile(filepath, 'wb') as f:"""

patched_code = code.replace(old_line, new_line)
with open(file_path, 'w') as f:
    f.write(patched_code)

importlib.reload(ww)
importlib.reload(tfjs.converters)

# 4. Test AFTER fix (Graceful Handling)
print("\n[*] --- TESTING AFTER REAL PATCH ---")
try:
    tfjs.converters.save_keras_model(model, target_dir)
except Exception as e:
    print(f"[SUCCESS] Caught with Patched Logic: {type(e).__name__}")
    print(f"Error Message: {e}")

Other info / logs

Terminal Output Observed:
[*] Modifying library file at: /usr/local/lib/python3.12/dist-packages/tensorflowjs/write_weights.py
[*] --- TESTING BEFORE FIX ---
[!] Caught Original Bug: FailedPreconditionError
[*] --- TESTING AFTER REAL PATCH ---
[SUCCESS] Caught with Patched Logic: IsADirectoryError
Error Message: Cannot write shard: real_fix_test/group1-shard1of1.bin is a directory.

The logs confirm that adding a `tf.io.gfile.isdir()` check in `write_weights.py` (around line 305) prevents the backend crash and handles the error at the Python layer as expected.

Conclusion

The current implementation of _shard_group_bytes_to_disk in write_weights.py lacks proactive path validation. By allowing the write operation to proceed on a path that is already a directory, the system relies on the C++ backend to catch the error, resulting in a cryptic FailedPreconditionError.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions