-
Notifications
You must be signed in to change notification settings - Fork 2k
Description
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js): Yes (Verification script provided below)
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04.5 LTS (Jammy Jellyfish)
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
- TensorFlow.js installed from (npm or script link): pip (tensorflowjs)
- TensorFlow.js version (use command below): 4.22.0
- Browser version: N/A
- Tensorflow.js Converter Version: 4.22.0
- Python version: 3.12.12
- TensorFlow version: 2.19.0
Describe the current behavior
The tensorflowjs converter fails with a low-level FailedPreconditionError when a directory exists at the target path where a weight shard file (e.g., group1-shard1of1.bin) is expected to be written. This happens because the Python layer in write_weights.py does not check if the target path is a directory before attempting to write via tf.io.gfile.GFile, leading to a backend crash instead of a graceful Python exception.
Describe the expected behavior
The converter should perform high-level path validation. If a directory is found at the target file path, it should raise a clear Python exception (e.g., IsADirectoryError) instead of triggering a backend FailedPreconditionError.
Standalone code to reproduce the issue
The following script demonstrates the bug by creating a conflicting directory and then verifies the fix by patching the local library:
import os
import tensorflow as tf
import tensorflowjs as tfjs
import importlib
import tensorflowjs.write_weights as ww
# 1. Setup conflict directory
file_path = ww.__file__
target_dir = "real_fix_test"
shard_name = "group1-shard1of1.bin"
os.makedirs(target_dir, exist_ok=True)
os.makedirs(os.path.join(target_dir, shard_name), exist_ok=True)
# 2. Test BEFORE fix (Triggers Bug)
print("\n[*] --- TESTING BEFORE FIX ---")
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
try:
tfjs.converters.save_keras_model(model, target_dir)
except Exception as e:
print(f"[!] Caught Original Bug: {type(e).__name__}")
# 3. APPLY REAL PATCH TO LIBRARY (Adding validation check)
with open(file_path, 'r') as f:
code = f.read()
old_line = " with tf.io.gfile.GFile(filepath, 'wb') as f:"
new_line = """ if tf.io.gfile.isdir(filepath):
raise IsADirectoryError(f"Cannot write shard: {filepath} is a directory.")
with tf.io.gfile.GFile(filepath, 'wb') as f:"""
patched_code = code.replace(old_line, new_line)
with open(file_path, 'w') as f:
f.write(patched_code)
importlib.reload(ww)
importlib.reload(tfjs.converters)
# 4. Test AFTER fix (Graceful Handling)
print("\n[*] --- TESTING AFTER REAL PATCH ---")
try:
tfjs.converters.save_keras_model(model, target_dir)
except Exception as e:
print(f"[SUCCESS] Caught with Patched Logic: {type(e).__name__}")
print(f"Error Message: {e}")
Other info / logs
Terminal Output Observed:
[*] Modifying library file at: /usr/local/lib/python3.12/dist-packages/tensorflowjs/write_weights.py
[*] --- TESTING BEFORE FIX ---
[!] Caught Original Bug: FailedPreconditionError
[*] --- TESTING AFTER REAL PATCH ---
[SUCCESS] Caught with Patched Logic: IsADirectoryError
Error Message: Cannot write shard: real_fix_test/group1-shard1of1.bin is a directory.
The logs confirm that adding a `tf.io.gfile.isdir()` check in `write_weights.py` (around line 305) prevents the backend crash and handles the error at the Python layer as expected.
Conclusion
The current implementation of _shard_group_bytes_to_disk in write_weights.py lacks proactive path validation. By allowing the write operation to proceed on a path that is already a directory, the system relies on the C++ backend to catch the error, resulting in a cryptic FailedPreconditionError.