Fix deadlock on harbor-core initialization

During harbor-core initialization, if the database takes a long time to
become ready, there is a risk of deadlock while checking the TCP
connection to the database.

The `TestTCPConn` function uses unbuffered channels to signal when the
connection succeeds or times out. The timeout check runs in parallel
with the connection check, which runs in a goroutine. The deadlock
happens when the function timeout expires (so the caller blocks on
`cancel <- 1`) while a `DialTimeout` call in the goroutine is still in
flight and then succeeds (so the goroutine blocks on `success <- 1`).
At this point each goroutine is waiting for the other to read from its
channel, and neither ever does.
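
The interleaving can be reproduced without a database. The following is
a minimal sketch, not harbor code: `callerStuck` and the durations are
hypothetical, a sleep stands in for a slow `DialTimeout` call, and a
watchdog timer reports whether the timeout branch ever returns.

```go
package main

import (
	"fmt"
	"time"
)

// callerStuck simulates the race with unbuffered channels and reports
// whether the caller is still blocked after a grace period.
// All names and durations are illustrative, not taken from harbor.
func callerStuck() bool {
	success := make(chan int) // unbuffered: a send waits for a receiver
	cancel := make(chan int)
	returned := make(chan struct{})

	// Worker: stands in for a DialTimeout call that succeeds late.
	go func() {
		time.Sleep(100 * time.Millisecond)
		success <- 1 // blocks: the caller has stopped reading success
	}()

	// Caller: stands in for the timeout branch of the function.
	go func() {
		select {
		case <-success:
		case <-time.After(20 * time.Millisecond):
			cancel <- 1 // blocks: the worker never reads cancel
		}
		close(returned)
	}()

	// Watchdog: if the caller has not returned by now, it is stuck.
	select {
	case <-returned:
		return false
	case <-time.After(300 * time.Millisecond):
		return true
	}
}

func main() {
	fmt.Println("caller stuck:", callerStuck())
}
```

With these durations the timeout always fires before the simulated dial
completes, so both sends end up waiting on each other.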

This is reproducible mostly on slow systems, where the database
finishes initializing during the fifth `DialTimeout` call and that call
runs past the `TestTCPConn` timeout.

This fix makes the `success` and `cancel` channels buffered
(capacity 1), so a single send on either one does not block.

Signed-off-by: Flávio Ramalho <framalho@suse.com>
Flávio Ramalho 2020-10-28 16:45:35 +01:00
parent 723695b3e9
commit ef6414be3e
GPG Key ID: 33F16B0C872DC583


@@ -89,8 +89,8 @@ func GenerateRandomString() string {
 // with the connection, in second
 // interval: the interval time for retring after failure, in second
 func TestTCPConn(addr string, timeout, interval int) error {
-	success := make(chan int)
-	cancel := make(chan int)
+	success := make(chan int, 1)
+	cancel := make(chan int, 1)
 	go func() {
 		n := 1