Retail eCommerce Funnel Analysis Demo with 1 million members and 87 million record dataset using StarRocks #26075
Hello, sorry to bother you, I have a question:
The dataset
The data comes from a user behavior dataset that records the actions of about 1 million randomly selected users on Taobao from November 25, 2017 to December 3, 2017.
The dataset contains 86,953,525 records with five dimensions: userID, itemID, categoryID, behaviorType, and timestamp. The data spans eight days from November 25, 2017, to December 2, 2017. It involves a total of 987,982 unique users, 3,962,559 unique products, and 9,377 product categories. The dataset includes four types of user behaviors: page view (PV), purchase (Buy), add to cart (Cart), and favorite (Fav).
Environment
You need at least 16 GB of RAM to run this tutorial.
Where to download it
The 1.1 GB file can be downloaded at https://cdn.starrocks.io/dataset/user_behavior_sample_data.parquet
Loading the data
If you get an error like
type:LOAD_RUN_FAIL; msg:Memory of Query3a47bc7f-46b5-428f-bfe7-43dd10d6a141 exceed limit. Pipeline Backend: 10.0.25.171, fragment: 3a47bc7f-46b5-428f-bfe7-43dd10d6a142 Used: 2333611624, Limit: 2147483648. Mem usage has exceed the limit of single query, You can change the limit by set session variable query_mem_limit.
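you can execute
SET GLOBAL query_mem_limit = 137438953472;
to raise the per-query memory limit (the error above reports a limit of 2147483648 bytes, or 2 GiB; this statement raises it to 128 GiB), and then retry the load.
Scenario 1: High-level view of users completing the conversion path (PV, then Cart, then Buy) within 1800 seconds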
This indicates that only 30% of users who viewed a product added it to their cart, and only 12% of users who added it to their cart went on to place an order. The conversion rate is low at both steps.
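The funnel logic behind figures like these can be illustrated in plain Python. This is a minimal sketch, not the StarRocks query used for the demo: the step order (pv, cart, buy), the inclusive 1800-second window measured from the first page view, and the names `funnel_level` and `users_reaching_each_step` are all assumptions made for illustration, and the sample rows are invented.

```python
from collections import Counter, defaultdict

STEPS = ("pv", "cart", "buy")   # assumed funnel order
WINDOW_SECONDS = 1800           # window length taken from the scenario title


def funnel_level(events, steps=STEPS, window=WINDOW_SECONDS):
    """Return the deepest funnel step one user reached (0 = never viewed).

    events: list of (timestamp_in_seconds, behavior) tuples for a single user.
    A chain starts at a 'pv' event and may only use later events that fall
    within `window` seconds of that starting event (inclusive, by assumption).
    """
    events = sorted(events)
    best = 0
    for i, (start_ts, behavior) in enumerate(events):
        if behavior != steps[0]:
            continue
        level = 1
        for ts, b in events[i + 1:]:
            if ts - start_ts > window:
                break
            if level < len(steps) and b == steps[level]:
                level += 1
        best = max(best, level)
    return best


def users_reaching_each_step(rows, steps=STEPS):
    """rows: iterable of (user_id, timestamp, behavior).

    Returns a list where entry k is the number of users who reached
    at least step k+1 of the funnel.
    """
    per_user = defaultdict(list)
    for user_id, ts, behavior in rows:
        per_user[user_id].append((ts, behavior))
    levels = Counter(funnel_level(ev, steps) for ev in per_user.values())
    return [
        sum(count for level, count in levels.items() if level >= k)
        for k in range(1, len(steps) + 1)
    ]


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = [
        (1, 0, "pv"), (1, 100, "cart"), (1, 200, "buy"),
        (2, 0, "pv"), (2, 500, "cart"),
        (3, 0, "pv"),
        (4, 0, "pv"), (4, 5000, "cart"),  # cart falls outside the window
    ]
    reached = users_reaching_each_step(sample)
    print(reached)  # [4, 2, 1]
    for prev, cur, name in zip(reached, reached[1:], STEPS[1:]):
        print(f"{name}: {cur / prev:.0%} of the previous step")
```

Dividing each count by the one before it gives the step-to-step conversion rates, which is the kind of figure quoted above.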
Scenario 2: Examine the item IDs of the top ten products with the worst conversion rate from PV (page views) to buy.
At this point, we have identified items with poor conversion, such as item_id=3563468.
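One way to rank items by PV-to-buy conversion is sketched below in plain Python. It is a simplified stand-in for the demo's query, under stated assumptions: conversion is taken to be the share of distinct viewers of an item who also bought it (ignoring event order and any time window), and the function name `worst_converting_items` and the `min_viewers` threshold are hypothetical.

```python
from collections import defaultdict


def worst_converting_items(rows, top_n=10, min_viewers=1):
    """rows: iterable of (user_id, item_id, behavior).

    Returns up to `top_n` tuples (item_id, viewers, converted, rate),
    sorted by ascending conversion rate. Ties are broken by putting
    items with more viewers first, then by item_id.
    """
    viewers = defaultdict(set)
    buyers = defaultdict(set)
    for user_id, item_id, behavior in rows:
        if behavior == "pv":
            viewers[item_id].add(user_id)
        elif behavior == "buy":
            buyers[item_id].add(user_id)

    ranked = []
    for item_id, viewed_by in viewers.items():
        if len(viewed_by) < min_viewers:
            continue  # skip items with too few viewers to be meaningful
        converted = len(viewed_by & buyers.get(item_id, set()))
        ranked.append((item_id, len(viewed_by), converted, converted / len(viewed_by)))

    ranked.sort(key=lambda r: (r[3], -r[1], r[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = [
        (1, 10, "pv"), (2, 10, "pv"), (3, 10, "pv"), (4, 10, "pv"), (1, 10, "buy"),
        (1, 20, "pv"), (2, 20, "pv"), (1, 20, "buy"), (2, 20, "buy"),
        (5, 30, "pv"),
    ]
    for item_id, n_viewers, n_converted, rate in worst_converting_items(sample):
        print(item_id, n_viewers, n_converted, f"{rate:.2f}")
```

On the full dataset a `min_viewers` threshold well above 1 is likely needed; otherwise items viewed once and never bought would dominate the list.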
Scenario 3: What do the user paths of those who dropped off look like?
We can see that the majority of people just viewed the products and left without taking any further actions.
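The drop-off analysis can be sketched in plain Python as follows. This is an illustration under assumptions, not the demo's query: a "drop-off" is taken to be any user with no buy event, consecutive repeats of the same behavior are collapsed into one step, and the function name `dropoff_paths` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import groupby


def dropoff_paths(rows):
    """rows: iterable of (user_id, timestamp, behavior).

    Returns (path, user_count) pairs for users who never bought,
    most common path first.
    """
    per_user = defaultdict(list)
    for user_id, ts, behavior in rows:
        per_user[user_id].append((ts, behavior))

    paths = Counter()
    for events in per_user.values():
        ordered = [behavior for _, behavior in sorted(events)]
        if "buy" in ordered:
            continue  # this user converted, so is not a drop-off
        # Collapse runs such as pv, pv, pv into a single "pv" step.
        collapsed = [behavior for behavior, _ in groupby(ordered)]
        paths[" -> ".join(collapsed)] += 1
    return paths.most_common()


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = [
        (1, 0, "pv"), (1, 10, "pv"),
        (2, 0, "pv"),
        (3, 0, "pv"), (3, 20, "cart"),
        (4, 0, "pv"), (4, 30, "buy"),
    ]
    for path, count in dropoff_paths(sample):
        print(f"{path}: {count}")
```

A result dominated by the bare "pv" path would correspond to the observation above that most users only viewed products and left.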