Explanation on bin packing result generation time #14
Comments
Hi @S1LV3RJ1NX, the performance is tied substantially to the size of the graph replicating all valid patterns. The thing with the most impact is the number of items that fit in each bin: when there are many dimensions, multiple bin types, large capacities, and long patterns (many items fit in a single bin), it is hard to create a compact representation of every valid packing pattern. Note that adding a cardinality constraint (e.g., a maximum of 3 or 4 items per bin) may reduce the size of the graph enough to find a solution. One thing that may be useful is that heuristic methods tend to work well on exactly the instances that are hard to solve exactly with VPSolver: for heuristics, a bigger set of possible patterns typically helps them find good solutions, whereas for VPSolver a larger number of valid packing patterns makes it hard to generate a model small enough to be solvable.
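To illustrate why a cardinality constraint helps, here is a toy sketch (plain Python, not VPSolver's API, and a made-up 1-D instance) that counts the feasible packing patterns of a single bin with and without a cap on the number of items per bin:

```python
from itertools import combinations_with_replacement

def count_patterns(sizes, capacity, max_items=None):
    """Count distinct feasible packing patterns for one bin: multisets of
    item sizes whose total fits in `capacity`, optionally capped in length."""
    limit = max_items if max_items is not None else capacity // min(sizes)
    count = 0
    for r in range(1, limit + 1):
        for combo in combinations_with_replacement(sizes, r):
            if sum(combo) <= capacity:
                count += 1
    return count

# Hypothetical instance: items small relative to the bin capacity.
sizes = [2, 3, 5, 7]
print(count_patterns(sizes, 30))                # all feasible patterns
print(count_patterns(sizes, 30, max_items=3))   # cardinality <= 3
```

The uncapped count grows combinatorially as items get smaller relative to the capacity, which mirrors how the pattern graph blows up; the capped count stays small.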
Thanks @fdabrandao, for the explanation! There were cases when VPSolver was able to find bin-packing solutions immediately. For example, I had a sample trace with 100 tasks and ~250 bins, where all 100 tasks were given at once, and I got the solution within a minute. How can I explain those? Is there a mathematical way to say that, given this many bins and this many tasks, it will take such-and-such amount of time?
What was the maximum number of tasks in a single bin in the optimal solution? If you could provide the instances, it should be easy to see why some are easy and some are hard. Just having small items in an instance can make it substantially harder for VPSolver, as the number of valid packing patterns grows really fast. Grouping small tasks into a bigger task would be a way to reduce the graph size.
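A minimal sketch of that grouping idea, assuming hypothetical 1-D sizes and thresholds (note that grouping is a restriction: items merged into one aggregate must now be placed together, so the optimum can only get worse or stay the same):

```python
def group_small_tasks(sizes, threshold, target):
    """Greedily merge tasks smaller than `threshold` into aggregate tasks
    of size at most `target`, reducing the number of distinct items."""
    small = sorted(s for s in sizes if s < threshold)
    large = [s for s in sizes if s >= threshold]
    groups, current = [], 0
    for s in small:
        if current + s > target and current > 0:
            groups.append(current)   # close the current aggregate
            current = 0
        current += s
    if current:
        groups.append(current)
    return large + groups

# Example: five tiny tasks collapse into two aggregates.
print(group_small_tasks([1, 1, 1, 2, 2, 9, 10], threshold=3, target=5))
```

Fewer, larger items mean shorter patterns and a much smaller pattern graph, at the cost of some packing flexibility.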
Hello @fdabrandao,
Here I am using:
For another experiment:
For the first experiment it looks like patterns should be very short. How many tasks were placed on each machine? For the second one, I believe you sent google_vms_10k_max_load.csv twice, so I could not see the tasks. However, with that number of items, even if only two or three tasks can be placed on each machine, the graph representing the patterns can get quite big.
@fdabrandao apologies for that; I have updated the CSVs, please check.
In the first dataset you have big items such as:
for machines such as:
It looks like most solutions will have one or two tasks assigned to each machine for the first test. Therefore, the patterns can be reasonably short. In the second dataset you have many tiny items:
for big machines:
This results in very long patterns that are very hard to represent in a compact way. For this type of dataset you can use an assignment-based model more effectively. You can learn about the graph generation method at https://research.fdabrandao.pt/papers/PhDThesisBrandao.pdf, where the assignment-based models are also mentioned.
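For intuition, an assignment-based model uses variables x[i] = j ("item i goes to bin j") instead of enumerating patterns, so its size grows with items × bins rather than with the number of patterns. A toy sketch, solved by brute force only so it stays self-contained (in practice you would hand these variables to a MIP solver such as Gurobi):

```python
from itertools import product

def assign_min_bins(items, capacity, n_bins):
    """Assignment-based bin packing on a tiny 1-D instance: try every
    assignment x[i] = j and keep the feasible one using fewest bins."""
    best = None
    for x in product(range(n_bins), repeat=len(items)):
        loads = [0] * n_bins
        for i, j in enumerate(x):
            loads[j] += items[i]
        if all(load <= capacity for load in loads):
            used = sum(1 for load in loads if load > 0)
            if best is None or used < best[0]:
                best = (used, x)
    return best

# Hypothetical instance: four items, capacity 7; two bins suffice.
print(assign_min_bins([4, 4, 3, 3], capacity=7, n_bins=4))
```

The model size here is independent of how many items fit per bin, which is why it copes better with many tiny items, though it is generally weaker (more symmetric, looser LP relaxation) than the pattern-based formulation.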
@fdabrandao Thank you for your suggestions; these are very helpful. Can you please recommend some GitHub repos for using assignment-based models, like VPSolver?
@fdabrandao I had a unique observation; can you please explain the reasoning behind it? I take 100 samples from the first dataset and around 267 bins (a mix of each category). If my bin cost is equal to the machine cost (multiplied by 1000, as we need an int) as shown in dataset 1 (please refer to your comment), then the ILP is not solved even after 6-8 hours. But if I change the cost to a function of the CPU and memory of that bin's capacity, such as 9*cpu + 1*mem, I get an answer within seconds. How is the cost of a bin affecting the ILP solving time?
Hello @fdabrandao,
I have a doubt regarding the time it takes to generate bin packing results. I am currently using the Gurobi solver.
The experiment setup is as follows:
I have 8K 4D bins and close to 10K 4D tasks. I am basically trying to simulate a cloud workload in which task requests arrive in groups of anywhere from 10 to 200. The scenario is as follows:
The issue I am facing is that if task request groups are anywhere between 10 and 30, VPSolver is able to provide a solution. But if task request groups go beyond 30-36, VPSolver takes a long time to produce results. Sometimes even after 6 hours of running the solver I do not have a result, and sometimes the entire system RAM is exhausted and VPSolver halts. Can I know the reason for this? Is there a bound on the time within which results can be generated? It would be really helpful for me to understand this. Requesting your help.
Thanks!