Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RemoveEventListener cause connection leak #911

Open
shabicheng opened this issue Apr 27, 2022 · 6 comments · Fixed by #928
Open

RemoveEventListener cause connection leak #911

shabicheng opened this issue Apr 27, 2022 · 6 comments · Fixed by #928
Labels

Comments

@shabicheng
Copy link

Recently both our app and docker engine found fd count increase in a short time, after looking into the go-dockerclient source code and test, we find the AddEventListener and RemoveEventListener cause the problem: RemoveEventListener will not close connection, but after remove all listener and call AddEventListener again, the client will create a new connection.

After looking into the code, we find that the eventHijack create the gorountine and will not return when we call RemoveEventListener :

https://github.com/fsouza/go-dockerclient/blob/main/event.go#L383

image

Now the newest go-dockerclient version has this problem.

The test:

image

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"time"

	docker "github.com/fsouza/go-dockerclient"
)

func main() {
	fmt.Println("Start running daemon")

	//Init Docker client
	endpoint := "unix:///var/run/docker.sock"

	if len(os.Args) < 3 {
		fmt.Printf(`Usage %s [unix-socket-path] [fd-path]
		example: %s unix:///var/run/docker.sock /proc/$(pidof docker)/fd`, os.Args[0], os.Args[0])
		os.Exit(2)
	}
	endpoint = os.Args[1]
	fdDir := os.Args[2]

	client, err := docker.NewClient(endpoint)

	if err != nil {
		log.Fatal(err)
	}

	for {
		connect(client, endpoint, fdDir)
	}
}

func connect(client *docker.Client, endpoint, fdDir string) {
	arr, err := ioutil.ReadDir(fdDir)
	if err != nil {
		log.Fatalf("read fd dir error %s: %v", fdDir, err)
	}
	fmt.Println(time.Now().Format("Mon Jan 02 15:04:05"), "before connect fd count:", len(arr))
	events := make(chan *docker.APIEvents)
	client.AddEventListener(events)

	defer func() {
		client.RemoveEventListener(events)

	}()
	timeout := time.After(3 * time.Second)
	select {
	case msg := <-events:
		if msg.Status == "start" {
			fmt.Println("EVENT:" + msg.Status)
		}
	case <-timeout:
		break
	}
}

@fsouza
Copy link
Owner

fsouza commented Apr 27, 2022

Thanks for the detailed issue and for providing a reproducer!

Yeah, if nothing is listening for the events we should probably close the connection. I'll take a closer look at this on Thursday.

@fsouza fsouza added the bug label Apr 27, 2022
@yyuuttaaoo
Copy link

Is there a plan to fix this issue as there is an official alternative now?

@fsouza
Copy link
Owner

fsouza commented Aug 3, 2022

Is there a plan to fix this issue as there is an official alternative now?

Yeah I plan to get to it before the end of the Summer, just been away for a while.

@fsouza
Copy link
Owner

fsouza commented Aug 19, 2022

@yyuuttaaoo @shabicheng can you try #928?

@yyuuttaaoo
Copy link

After testing, this patch does not work. So I think the issue should be reopened.
First, let me fix a small bug in the test program. $(pidof docker) should be $(pidof dockerd).
Now we can use the test program to test.
We can see, the fd number is still climbing. Why? There are 2 reasons:

  1. If there is no new event for some reason, the json.Decode will block forever.
  2. res.Body.Close actually hangs forever if it is called before conn.Close.

There are 2 other problems in this solution:

  1. There is a chance that eventChannel and errorChannel are closed before the last event was sent to the channel. This may cause panic.
  2. If the last event was an error, it will cause reconnect regardless of the number of listeners.

So I have made another patch based on the current one.
#930

Note:

  1. The patch still contains some debug printf code, because it is not fully tested.
  2. I have not run any UT for regression yet.
  3. I've only test it on Linux using unencrypted docker.sock. SSH/TLS are not tested. Windows platform is not tested.

@fsouza Would you help/coauthor to make the new patch complete?

@fsouza fsouza reopened this Aug 31, 2022
@fsouza
Copy link
Owner

fsouza commented Aug 31, 2022

@yyuuttaaoo thank you for detailed explanation! I can definitely work on that patch, but this week is a bit busy. I can come around some time next week and look into that patch and make sure it's ready for prime time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants