Add sequential example #24
Conversation
Functions need a simple docstring.
examples/sequential/preprocessing.py
Outdated
from tf_tabular.utils import get_vocab


def divide_ratings_by_mean_user_rating(ratings: pd.DataFrame, user_id_column='user_id'):
Why divide? Please add a simple docstring: "Subtract mean rating for each user".
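One way the suggestion could look. Only the function signature appears in the diff above, so the body below is an illustrative assumption (it centers each user's ratings by subtracting that user's mean, which is what the suggested docstring describes); the column names `user_id` and `user_rating` are taken from the other snippets in this PR.

```python
import pandas as pd


def center_ratings_by_user_mean(ratings: pd.DataFrame, user_id_column: str = "user_id") -> pd.DataFrame:
    """Subtract mean rating for each user."""
    # transform("mean") keeps the original index, so the subtraction
    # aligns row by row with the input frame
    user_mean = ratings.groupby(user_id_column)["user_rating"].transform("mean")
    return ratings.assign(user_rating=ratings["user_rating"] - user_mean)
```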
examples/sequential/preprocessing.py
Outdated
ratings = ratings[["user_id", "movie_id", "user_rating"]].groupby(["user_id"], as_index=False).agg(list)


def cutoff(x):
    return min(int(len(x) * 0.2), max_y_cutoff)
parametrize 0.2
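A sketch of what the parametrized version might look like; the parameter name `y_ratio` is a hypothetical choice, and the default of 5 for `max_y_cutoff` is taken from the call site shown later in this thread.

```python
def cutoff(x, y_ratio: float = 0.2, max_y_cutoff: int = 5) -> int:
    """Return the number of trailing items to hold out for a user."""
    # hold out a fraction of the user's history, capped at max_y_cutoff items
    return min(int(len(x) * y_ratio), max_y_cutoff)
```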
examples/sequential/preprocessing.py
Outdated
def preprocess_dataset(ratings_df, movies_df):
    train_df, val_df = split_by_user(ratings_df, max_y_cutoff=5)
parametrize 5 in max_y_cutoff
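The suggestion amounts to threading `max_y_cutoff` through `preprocess_dataset` instead of hard-coding 5 at the call site. A self-contained sketch, with an illustrative stand-in for `split_by_user` (the real one lives in the example module and its body is not shown in this diff):

```python
import pandas as pd


def split_by_user(ratings_df: pd.DataFrame, max_y_cutoff: int = 5):
    """Illustrative stand-in: hold out each user's last ratings as targets."""
    train_rows, val_rows = [], []
    for _, group in ratings_df.groupby("user_id"):
        # same cutoff rule as in the diff above: 20% of history, capped
        k = min(int(len(group) * 0.2), max_y_cutoff)
        train_rows.append(group.iloc[: len(group) - k])
        val_rows.append(group.iloc[len(group) - k:])
    return pd.concat(train_rows), pd.concat(val_rows)


def preprocess_dataset(ratings_df, movies_df, max_y_cutoff: int = 5):
    """Split ratings into train/validation sets per user."""
    # forward the parameter instead of hard-coding 5 here
    return split_by_user(ratings_df, max_y_cutoff=max_y_cutoff)
```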
src/tf_tabular/utils.py
Outdated
@@ -142,7 +145,7 @@ def build_categorical_input(name, embedding_dim, vocab, is_multi_hot, embedding_
     return (x, inp)


-def get_vocab(series, max_size):
+def get_vocab(series, max_size: int = None):
Optional[int] or int | None
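The point being made: a default of `None` is not a valid value for a plain `int` annotation, so the hint should be `Optional[int]` (or `int | None` on Python 3.10+). A sketch of the corrected signature; the body here is an illustrative assumption, since the real implementation in `tf_tabular.utils` is not shown in this diff.

```python
from typing import Optional

import pandas as pd


def get_vocab(series: pd.Series, max_size: Optional[int] = None) -> list:
    """Return the unique values of a series, most frequent first."""
    # illustrative body: value_counts() sorts by descending frequency
    vocab = series.value_counts().index.tolist()
    return vocab if max_size is None else vocab[:max_size]
```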
examples/sequential/preprocessing.py
Outdated
val_users = unique_users[: int(num_users * 0.2)]
train_users = unique_users[int(num_users * 0.2) :]
replace 0.2 with val_split
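What the suggested change might look like, pulled into a small helper for clarity (the helper itself is a hypothetical refactoring, not something in the diff):

```python
def split_users(unique_users, val_split: float = 0.2):
    """Partition the user list into validation and train users."""
    # the same fraction drives both slices, so a single parameter suffices
    n_val = int(len(unique_users) * val_split)
    return unique_users[:n_val], unique_users[n_val:]
```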
Still seeing a mix of imperative and descriptive docstrings
Otherwise, LGTM!
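For context, PEP 257 recommends the imperative mood ("Subtract ...") over the descriptive third person ("Subtracts ..."). A minimal illustration of the two styles the reviewer is contrasting (function names are hypothetical):

```python
def normalize_ratings(ratings):
    """Subtract the per-user mean from each rating."""  # imperative (PEP 257)
    ...


def normalize_ratings_descriptive(ratings):
    """Subtracts the per-user mean from each rating."""  # descriptive
    ...
```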
examples/sequential/preprocessing.py
Outdated
train_set = ratings[ratings.user_id.isin(train_users)]
val_set = ratings[ratings.user_id.isin(val_users)]

print(f"Train set size: {train_set.shape}")
I don't much like having prints in the code. Can we change them to logging.info?
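A sketch of the swap: use lazy `%s` formatting so the message is only built when INFO is enabled. The helper name is hypothetical; only the `print` line appears in the diff.

```python
import logging

logger = logging.getLogger(__name__)


def report_split_sizes(train_set, val_set):
    """Log split sizes instead of printing them."""
    # %s-style arguments defer string formatting until the record is emitted
    logger.info("Train set size: %s", train_set.shape)
    logger.info("Validation set size: %s", val_set.shape)
```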
LGTM
Also added some fixes needed for this use case and removed the unused conftest file.