Assign element to cpu thread #467
Comments
|
So StaticThreadSched(FromDevice(eth1) 1)? And when I run the Click configuration file, should I run it with -a? |
Check https://github.com/kohler/click/wiki/Language for the language ;) Well, if you use the DPDK elements (from the other discussion I guess you want to), you need to launch with --dpdk. For -a, it depends on what you want. Nowadays people advocate for run-to-completion, so you should use -a. This will give you the Click basics, such as naming, too: https://github.com/tbarbette/fastclick/wiki/Tutorial |
Thanks @tbarbette, I did not manage to integrate Click with DPDK yet, so I'm using just user-level Click. I have a scenario where I run a sink (FastUDPSource -> ToDevice), a VNF (FromDevice -> Queue -> ToDevice) and a sink (FromDevice -> Counter -> Discard), each running on a separate node. First I want to measure the throughput: I change the RATE of packets sent per second, and I count the rate that reaches the "sink" |
George, I have a vague idea of the issue you are trying to solve with thread pinning and suspect the solution might not work as intended, but I think in theory what you are asking could be achieved with:
click -a -j3 sink-sender.click
click -a -j3 vnf-bridge.click
click -a -j3 sink-receiver.click
This assumes each node has access to 3 threads. It also assumes CPU thread 0 on node 1, CPU thread 0 on node 2, etc. will be scheduled on the same host CPU thread, which may or may not be the case. Perhaps limiting each node to 1 thread and pinning each node to a specific core on the host would give you more control? |
Thank you @ahenning, to be more precise, i want to test 2 different scenarios.
So if I understood what you wrote, I must write a Click configuration file with something like StaticThreadSched(FromDevice 1, ToDevice 1)? And how do I link them with the Queue element? And after this I have to execute click -a -j3 vnf-bridge.click. I'm sorry if I'm wrong, but I have only just started working with Click |
@p4pe StaticThreadSched takes element names, not a declaration. Please take the time to read the links I gave you :) Looking at the examples in the "conf" folder will help too. You can give a value to -a to pin at a certain offset. So you can simply use "-a 2 -j 1" and every element of that click will run on core 2. If you want to use multiple cores in a single instance, you can use "-a 2 -j 2" and use StaticThreadSched(elementnameA 0, elementnameB 1); to pin elementnameA and elementnameB to core 2 and 3. |
OK, now I think I got it. Thank you @tbarbette. I built Click with I had this warning |
For the second scenario the config would look something like:
FromDevice should run on thread 0, and packets pushed to and processed by ToDevice should run on thread 1. If the configuration and elements are more complicated, Click has a home thread function that you could add to your elements to verify if needed. Not all elements are thread safe, so one way would be to place the elements you want to run on specific core between two queues e.g.
This is assuming the whole config is more complicated and the idea is to only run the resource heavy elements on a dedicated core and the rest on thread 0. This info might not be relevant to your use case but I am just adding that here for posterity's sake. Also, if the element timers need to run on say the same thread 1, then the actual element also needs to be scheduled via StaticThreadSched and not just the pull to push converter like Unqueue. |
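The "between two queues" idea described above can be sketched as follows. This is only an illustration under assumptions: the interface names eth0 and eth1 are placeholders, and Counter merely stands in for whatever resource-heavy element you want on its own core.

```click
// Hypothetical sketch: isolate one heavy element on thread 1, keep the rest on thread 0.
// Launch with at least two threads, e.g.: click -a -j 2 vnf-bridge.click

in    :: FromDevice(eth0);   // eth0 is a placeholder interface name
out   :: ToDevice(eth1);     // eth1 is a placeholder interface name
q1    :: Queue;
q2    :: Queue;
uq    :: Unqueue;            // pull-to-push converter, owns the task that drives "heavy"
heavy :: Counter;            // stand-in for the resource-heavy element

// in pushes into q1; uq pulls from q1 and pushes through heavy into q2;
// out pulls from q2.
in -> q1 -> uq -> heavy -> q2 -> out;

// Only elements that own a task are listed: heavy runs on whichever thread runs uq.
StaticThreadSched(in 0, out 0, uq 1);
```

If the heavy element also has timers that must fire on thread 1, it would need its own entry in StaticThreadSched as well, as noted above.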
I appreciate your help @ahenning. I'm playing with this now, but I realized that, despite what I ran, Click was not built with multithreading enabled, and I'm trying to see why |
Did you make clean then make again? Weird. |
I did a new installation in new machine, and now it is ok |
Hello @tbarbette, I'm trying to pin elements to threads but I got this error: My Click configuration file is: And I'm running Click with: |
Hi, First, you need to create an instance of From/ToDevice as follows: in:: FromDevice(enp4s0f1); Then describe your pipeline: in -> Queue -> out; Finally, pin each instance to the correct thread: StaticThreadSched(in 1, out 0); The way you did it, Click creates a different instance for every From/ToDevice call. This is why the output states that router configuration is specified twice. |
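Put together, the three steps above give a configuration along these lines. It is a minimal sketch: the thread only shows the declaration of in, so the interface used for out (enp4s0f0) is a made-up placeholder.

```click
// Step 1: declare each device element once and give it a name.
in  :: FromDevice(enp4s0f1);
out :: ToDevice(enp4s0f0);    // placeholder interface, not taken from the thread

// Step 2: describe the pipeline using the names, not new FromDevice(...)/ToDevice(...) calls.
in -> Queue -> out;

// Step 3: pin the named instances to Click thread indexes.
StaticThreadSched(in 1, out 0);
```

Because StaticThreadSched refers to thread indexes 0 and 1, Click has to be started with at least two threads (for example with -j 2).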
Thank you @gkatsikas, but I have the same issue with this click configuration file: in::FromDevice(enp4s0f1); in -> Queue -> out; StaticThreadSched(in 1, out 0); |
I'm following the issue because I also have to study a very similar case. Node1: FastUDPSource(800000, 10000000, 60, 3c:fd:fe:04:64:42, 192.168.6.2, 1234, 14:18:77:26:68:15, 192.168.6.5, 1234) Node2: Node3: In node2 we are running Click with the command Whereas running it with What is the actual difference between these two commands? I mean, how does triggering the affinity switch work and why does it change the rate? Also trying to use both the affinity and thread switches like this: How would I go about pinning the threads to two cores a) of a CPU in one socket, b) in different sockets? Kindly thank you @tbarbette @ahenning and @gkatsikas for your input |
I'd advise keeping "-a" empty and playing with the affinity inside Click. If you want to offset by two, just pin elements to threads 2 and 3. With the forwarder using two cores, I'd expect the sink or source to become the bottleneck. But I'd advise using DPDK as soon as performance matters. |
Random advice:
|
@IoakeimFotoglou, I see we have the same issues. @tbarbette, I will follow your advice too. Every time I try to play with the affinity inside Click I get this warning: forwarder.click:6: While configuring ‘StaticThreadSched@4 :: StaticThreadSched’: I'm using the configuration that @gkatsikas proposed. Thank you in advance |
With the configuration in -> Queue -> out; StaticThreadSched(in 0, out 1); and click -a -j 2 forwarder.click, it works fine. With the configuration in::FromDevice(enp6s0f1); in -> Queue -> out; StaticThreadSched(in 2, out 3); and click -a -j 2 forwarder.click, I have the issue I mentioned above. |
Well, in the second configuration you explicitly ask for threads 2 and 3 in StaticThreadSched, but you call Click with only 2 threads (i.e., -j 2, which implies that only threads 0 and 1 will be allocated). If you bump -j to 4 instead of 2 it should work. |
Obviously I did not understand something correctly. What I had understood so far is: if I have StaticThreadSched(in 0, out 1), this means that I ask for two threads (0, 1), and with -a -j 2 in the call this configuration runs on cores 0 and 1. The "conflict" came when I tried to run Click on cores other than 0 and 1. I configured StaticThreadSched(in 2, out 3) and thought this meant that Click would run on cores 2 and 3. Kindly thank you for your input and advice @gkatsikas |
No, the thread index in StaticThreadSched does not necessarily correspond to a physical CPU core ID; it is simply the index of a Click thread. |
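As a sketch of that fix, the second configuration can stay as it is, provided Click is launched with enough threads for indexes 2 and 3 to exist. The interface for out is a placeholder, since the thread never shows its declaration.

```click
// Launch so that threads 0..3 exist (with -a they are pinned to cores 0..3):
//   click -a -j 4 forwarder.click
// With -j 2 only threads 0 and 1 exist, so indexes 2 and 3 are rejected.

in  :: FromDevice(enp6s0f1);
out :: ToDevice(enp6s0f0);    // placeholder interface, not taken from the thread

in -> Queue -> out;

// Valid only when the thread count given to -j is at least 4.
StaticThreadSched(in 2, out 3);
```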
Ok.. Thanks for the explanation. I want to use Click first. So what do you suggest for better management of core pinning? If I want 2 threads in one core i will have StaticThreadSched(in 0, out 0) |
Your first two points were correct. I think what you missed is that a Click thread can run multiple elements. It's like user-level threads. So with
You pin the two elements to thread 0. As you pass -a, thread 0 means core 0. Core 1 is there but does nothing. Similarly you can pass -j 4 and assign threads 2 and 3 to the in and out elements; 0 and 1 will do nothing. It is not correct to assign elements to threads 2 and 3 if you launched Click with 2 threads, as 2 and 3 are out-of-bounds indexes. That is the error you get. Taskset will not work because if two elements are on the same thread there is nothing you can do about it. For completeness, -a takes a parameter that allows you to offset the assignment of threads to cores. With -a 2, thread 0 will be pinned to core 2 and thread 1 to core 3. So in that case you would pin in and out to threads 0 and 1, which will be running on cores 2 and 3. What DPDK gives is the ability to further define a list of cores, so if you pass 3,7,10, thread 0 would be pinned to core 3, thread 1 to core 7 and thread 2 to core 10. My suggestion would be to run Click with -a -j 16 if you have 16 cores and never think about this anymore. You then pin elements to thread indexes that match core IDs exactly. If a core has nothing assigned to it then it won't run anything; you don't really care... |
OK, now I think that I get it. Say I want to run FromDevice and ToDevice in two different threads and assign these threads to cores that are on different sockets, and assume that core 2 and core 4 are on different sockets. The configuration will be And with click -a -j 16 I will have what I want. Thank you @tbarbette |
Your configuration should work even with -j 5. |
I know @gkatsikas, this performance degradation is what I want to observe! I have 4 different scenarios
If I'm right, scenario (4) will have the worst performance (more or less) due to the inter-socket communication |
Yes, this is likely the case, although QPI effects may be obscured by some artefacts of your setup, such as memory copies from/to user space. |
Unfortunately I did not manage to install kernel-based Click (I think it is not compatible with new Linux headers). The next step is to try Click with DPDK, but first I have to take a look at DPDK because I am a newbie. One last question, just for confirmation. in::FromDevice(enp6s0f1); without StaticThreadSched And just run click forwarder.click |
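The offset variant described above can be sketched like this: the configuration keeps thread indexes 0 and 1, and the choice of cores is made at launch time. The interface for out and the file name are placeholders.

```click
// Launch with an affinity offset of 2 and two threads:
//   click -a 2 -j 2 forwarder.click
// Click thread 0 is then pinned to core 2 and Click thread 1 to core 3.

in  :: FromDevice(enp6s0f1);
out :: ToDevice(enp6s0f0);    // placeholder interface, not taken from the thread

in -> Queue -> out;

// Indexes are Click thread indexes (0 .. j-1), not core IDs.
StaticThreadSched(in 0, out 1);

// Alternative from the same comment: StaticThreadSched(in 0, out 0);
// would run both elements on thread 0 and leave thread 1 idle.
```

The design trade-off is where the core numbers live: with -a -j N and N equal to the core count, thread indexes and core IDs coincide, so the pinning is readable directly from the configuration file.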
Yes (provided that you have disabled HT) |
I'd say to always pin them, even for case 1. Also, you have to consider that without DPDK you're not actually pinning most of the RX work. Packets will be received by the kernel on probably all cores (the default for the NIC is to use as many queues as cores) through the interrupt handlers, no matter how the application is pinned. They will go through the kernel stack on all those cores, which is some heavy work, before the app reads the packets from a single given core. Therefore, if you really want to test QPI with the kernel sockets, you'll need to consider the number of queues (ethtool -L) and IRQ affinity. Just a thought: similarly, as your device is attached to a specific CPU, the packets will actually never be moved to the second core in the setup you present, just the Click metadata. You may want to "touch" the bytes on the second CPU. "CheckIPHeader -> SetTCP(or UDP)Checksum" should do the trick. |
Thank you all for your help, guys. I believe that I managed to install FastClick, and now I will try to run the same "experiment" and see the difference. If I understood well, the only changes that I have to make are to replace ToDevice and FromDevice with ToDPDKDevice and FromDPDKDevice, using the interfaces that are bound to DPDK, and then run Click with click --dpdk . |
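One possible way to "touch" the packet bytes on the second thread is sketched below. It assumes UDP-over-IPv4 traffic in Ethernet frames (hence the offset of 14 bytes), a placeholder interface for out, and that thread 1 lands on the core you want to test. Whether that core sits on another socket depends on your machine.

```click
// Launch with pinning, e.g.: click -a -j 2 forwarder.click

in  :: FromDevice(enp6s0f1);
out :: ToDevice(enp6s0f0);    // placeholder interface, not taken from the thread

// Everything after the Queue sits in the pull path, so it runs on the thread
// that runs "out". CheckIPHeader validates the header and sets the IP header
// annotation; SetUDPChecksum then reads the whole UDP packet to recompute
// its checksum, which forces the payload bytes to be loaded on that core.
in -> Queue
   -> CheckIPHeader(OFFSET 14)
   -> SetUDPChecksum
   -> out;

StaticThreadSched(in 0, out 1);
```

This only controls where Click touches the data. The kernel-side RX work mentioned above still needs the queue count and IRQ affinity to be set separately.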
Mostly yes. You don't need a Queue also ;) |
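A minimal sketch of that DPDK version, assuming two DPDK-bound ports that show up as port 0 and port 1 (placeholder numbers) and a hypothetical file name:

```click
// forwarder-dpdk.click (hypothetical file name)
// Launch with DPDK EAL arguments before "--", for example a core list:
//   click --dpdk -l 0-1 -- forwarder-dpdk.click

// No Queue in between: packets received on port 0 are pushed
// directly to the transmit element for port 1.
FromDPDKDevice(0) -> ToDPDKDevice(1);
```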
Hi. I am studying a case similar to yours. |
Hello, in my case SRCETH and SRCIP are the MAC and the IP of node1, but DSTETH and DSTIP are those of node3 (sink).
My topology is source--->VNF--->sink
|
Thanks! But what script runs on your node2(VNF)? Simple in -> Queue -> out ? |
Yes, just FD->Queue->TD. You have to enable promiscuous mode on the in and out interfaces so that all the traffic is able to pass. |
But it seems my packets are directly sent to node3 by node1 and ignore node2. |
If you run tcpdump on the ingress port of node2, what do you see? |
Oh, using tcpdump I can see the packets from node1 to node3 |
You are welcome! |
Sorry to trouble you after a long time. I am still a newbie to Click and suspect my setup doesn't work well. The three nodes' information is as follows: Node1: FastUDPSource(800000, -1, 60, 00:0c:29:92:68:92, 192.168.32.128, 1234, Node2: FromDevice(ens38, PROMISC true) Node3: FromDevice(ens33, PROMISC true) -> c:: Counter -> Discard; I can see the result on node3. I use tcpdump on node2 and see the packet 192.168.32.128.1234 > 192.168.32.132.1234, but with the IPPrint element I see nothing. |
Hello, I think you have to rewrite the MAC address on every node that a packet arrives at. |
Thanks! Do you mean I should set node1's DST to node2, then on node2 use the EtherRewrite element and set its DST to node3? |
Hello, I am a newbie to Click, and I am using Click at user level. I have two questions.
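If that is the intended approach, node2 could look roughly like the sketch below. Everything specific in it is an assumption: the MAC addresses and the output interface are placeholders, and it relies on EtherRewrite taking the new source address followed by the new destination address.

```click
// Hypothetical node2 configuration: receive frames addressed to node2,
// rewrite the Ethernet addresses, and forward towards node3.

in  :: FromDevice(ens38, PROMISC true);
out :: ToDevice(ens39);                  // placeholder output interface

in -> EtherRewrite(00:00:00:00:00:02,    // placeholder: node2's outgoing MAC (new source)
                   00:00:00:00:00:03)    // placeholder: node3's MAC (new destination)
   -> Queue
   -> out;
```

On node1, the destination MAC given to FastUDPSource would then be node2's ingress MAC rather than node3's.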
Is it possible to: