WSREP: wsrep::connect() failed: 7 #4

Open
trompx opened this issue May 27, 2015 · 3 comments

trompx commented May 27, 2015

Hello,

After I launch the cluster through marathon, the seed is running on host 1 (IP 192.168.33.101). I then scale the number of node instances to 1 (it will try to launch the node on host 2, IP 192.168.33.102) and I get the error "WSREP: wsrep::connect() failed: 7". Here is the full log:

CLUSTERCHECK_PASSWORD=62d50cf100bcbb8755a85f9936df32a9a39b1830171d25e59de17cb56e9d10da
+ QCOMM=
+ CLUSTER_NAME=cluster
+ MYSQL_MODE_ARGS=
+ case "$1" in
+ '[' -z galera.service.consul ']'
+ ADDRS=galera.service.consul
+ SEP=
+ for ADDR in '${ADDRS//,/ }'
+ expr galera.service.consul : '^[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$'
++ paste -sd ,
++ awk '{ print $4 }'
++ host -t A galera.service.consul
+ QCOMM+=192.168.33.102
+ SEP=,
+ shift 2
+ echo 'Starting node, connecting to qcomm://192.168.33.102'
Starting node, connecting to qcomm://192.168.33.102
+ set +e -m
+ trap shutdown TERM INT
+ wait 19
+ /mysqld.sh --console --wsrep_cluster_name=cluster --wsrep_cluster_address=gcomm://192.168.33.102 --wsrep_sst_auth=xtrabackup:3240fd7as9f8798 --default-time-zone=+00:00
+ /bin/galera-healthcheck -password=62d50cf100bcbb8755a85f9936df32a9a39b1830171d25e59de17cb56e9d10da -pidfile=/var/run/galera-healthcheck.pid -user clustercheck
/usr/sbin/mysqld
Docker startscript:  Get the GTID positon
150527  1:21:58 [Note] mysqld (mysqld 10.0.19-MariaDB-1~trusty-wsrep-log) starting as process 19 ...
150527  1:21:59 [Note] WSREP: Read nil XID from storage engines, skipping position init
150527  1:21:59 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
150527  1:21:59 [Note] WSREP: wsrep_load(): Galera 3.9(rXXXX) by Codership Oy <[email protected]> loaded successfully.
150527  1:21:59 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
150527  1:21:59 [Warning] WSREP: Could not open saved state file for reading: /var/lib/mysql//grastate.dat
150527  1:21:59 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150527  1:21:59 [Note] WSREP: Passing config to GCS: base_host = 172.17.0.96; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery
150527  1:21:59 [Note] WSREP: Service thread queue flushed.
150527  1:21:59 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150527  1:21:59 [Note] WSREP: wsrep_sst_grab()
150527  1:21:59 [Note] WSREP: Start replication
150527  1:21:59 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150527  1:21:59 [Note] WSREP: protonet asio version 0
150527  1:21:59 [Note] WSREP: Using CRC-32C for message checksums.
150527  1:21:59 [Note] WSREP: backend: asio
150527  1:21:59 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)
150527  1:21:59 [Note] WSREP: restore pc from disk failed
150527  1:21:59 [Note] WSREP: GMCast version 0
150527  1:21:59 [Note] WSREP: (c5446e45, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150527  1:21:59 [Note] WSREP: (c5446e45, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150527  1:21:59 [Note] WSREP: EVS version 0
150527  1:21:59 [Note] WSREP: gcomm: connecting to group 'cluster', peer '192.168.33.102:'
150527  1:22:02 [Warning] WSREP: no nodes coming from prim view, prim not possible
150527  1:22:02 [Note] WSREP: view(view_id(NON_PRIM,c5446e45,1) memb {
        c5446e45,0
} joined {
} left {
} partitioned {
})
150527  1:22:02 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50748S), skipping check
150527  1:22:32 [Note] WSREP: view((empty))
150527  1:22:32 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():161
150527  1:22:32 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():206: Failed to open backend connection: -110 (Connection timed out)
150527  1:22:32 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1379: Failed to open channel 'cluster' at 'gcomm://192.168.33.102': -110 (Connection timed out)
150527  1:22:32 [ERROR] WSREP: gcs connect failed: Connection timed out
150527  1:22:32 [ERROR] WSREP: wsrep::connect() failed: 7
150527  1:22:32 [ERROR] Aborting

150527  1:22:32 [Note] WSREP: Service disconnected.
150527  1:22:33 [Note] WSREP: Some threads may fail to exit.
150527  1:22:33 [Note] mysqld: Shutdown complete

+ RC=1
+ test -s /var/run/galera-healthcheck.pid
++ cat /var/run/galera-healthcheck.pid
+ kill 18
+ exit 1

In the start script, the gcomm address is built by:

ADDRS="$2" # with $2 = galera.service.consul
for ADDR in ${ADDRS//,/ }; do
    if expr "$ADDR" : '^[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$' >/dev/null; then
        QCOMM+="$SEP$ADDR"
    else
        QCOMM+="$SEP$(host -t A "$ADDR" | awk '{ print $4 }' | paste -sd ",")"
    fi
    SEP=,
done
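For reference, here is the resolution branch exercised standalone. This is only a sketch with simulated `host` output (the two addresses are the ones from my hosts); the real script pipes `host -t A` directly:

```shell
# Simulated output of `host -t A galera.service.consul` for the case where
# Consul has two healthy galera instances registered:
host_output='galera.service.consul has address 192.168.33.101
galera.service.consul has address 192.168.33.102'

# The start script keeps field 4 (the address) and joins the results with commas:
QCOMM=$(printf '%s\n' "$host_output" | awk '{ print $4 }' | paste -sd ,)
echo "$QCOMM"   # 192.168.33.101,192.168.33.102
```

So the pipeline itself produces the full list when DNS returns every node; in my case the DNS answer only contains the local node.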

The result is always the IP of the host where the Galera node is being launched, not the list of all IPs of the Galera cluster. Do you think that is the culprit?
In the case of galera.service.consul, the branch used is QCOMM+="$SEP$(host -t A "$ADDR" | awk '{ print $4 }' | paste -sd ",")". Do you think I have to change this, or does it work out of the box for you?

I was trying to get the IPs of all the Galera cluster hosts with:

dig galera.service.consul +tcp SRV
or
curl http://192.168.33.101:8500/v1/catalog/service/galera

but I am not sure that is the way to go...
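If the Consul HTTP catalog turns out to be the better source, the addresses could be extracted like this (a sketch only; the JSON below is a hand-written sample of what /v1/catalog/service/galera returns, trimmed to the relevant field, and the python one-liner stands in for a proper JSON parser):

```shell
# Hand-written sample of the catalog response (trimmed to the relevant field):
response='[{"Node":"host1","Address":"192.168.33.101"},{"Node":"host2","Address":"192.168.33.102"}]'

# In a live setup this would instead be:
#   response=$(curl -s http://192.168.33.101:8500/v1/catalog/service/galera)

# Join every registered node address with commas, ready for gcomm://
ADDRS=$(printf '%s' "$response" \
  | python3 -c 'import json,sys; print(",".join(e["Address"] for e in json.load(sys.stdin)))')
echo "$ADDRS"   # 192.168.33.101,192.168.33.102
```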

Hopefully you have some tips :)

Thank you


trompx commented May 27, 2015

I even tried hardcoding the IPs of the two hosts in gcomm, but it still does not work.

+ /mysqld.sh --console --wsrep_cluster_name=cluster --wsrep_cluster_address=gcomm://192.168.33.101,192.168.33.102 --wsrep_sst_auth=xtrabackup:3240fd7as9f8798 --default-time-zone=+00:00
+ /bin/galera-healthcheck -password=62d50cf100bcbb8755a85f9936df32a9a39b1830171d25e59de17cb56e9d10da -pidfile=/var/run/galera-healthcheck.pid -user clustercheck
/usr/sbin/mysqld
Docker startscript:  Get the GTID positon
150527  2:01:11 [Note] mysqld (mysqld 10.0.19-MariaDB-1~trusty-wsrep-log) starting as process 22 ...
150527  2:01:11 [Note] WSREP: Read nil XID from storage engines, skipping position init
150527  2:01:11 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
150527  2:01:11 [Note] WSREP: wsrep_load(): Galera 3.9(rXXXX) by Codership Oy <[email protected]> loaded successfully.
150527  2:01:11 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
150527  2:01:11 [Warning] WSREP: Could not open saved state file for reading: /var/lib/mysql//grastate.dat
150527  2:01:11 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150527  2:01:11 [Note] WSREP: Passing config to GCS: base_host = 172.17.0.7; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery
150527  2:01:11 [Note] WSREP: Service thread queue flushed.
150527  2:01:11 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150527  2:01:11 [Note] WSREP: wsrep_sst_grab()
150527  2:01:11 [Note] WSREP: Start replication
150527  2:01:11 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150527  2:01:11 [Note] WSREP: protonet asio version 0
150527  2:01:11 [Note] WSREP: Using CRC-32C for message checksums.
150527  2:01:11 [Note] WSREP: backend: asio
150527  2:01:11 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)
150527  2:01:11 [Note] WSREP: restore pc from disk failed
150527  2:01:11 [Note] WSREP: GMCast version 0
150527  2:01:11 [Note] WSREP: (3f9a02cd, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150527  2:01:11 [Note] WSREP: (3f9a02cd, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150527  2:01:11 [Note] WSREP: EVS version 0
150527  2:01:11 [Note] WSREP: gcomm: connecting to group 'cluster', peer '192.168.33.101:,192.168.33.102:'
150527  2:01:14 [Warning] WSREP: no nodes coming from prim view, prim not possible
150527  2:01:14 [Note] WSREP: view(view_id(NON_PRIM,3f9a02cd,1) memb {
        3f9a02cd,0
} joined {
} left {
} partitioned {
})
150527  2:01:15 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.5041S), skipping check
150527  2:01:44 [Note] WSREP: view((empty))
150527  2:01:44 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():161
150527  2:01:44 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():206: Failed to open backend connection: -110 (Connection timed out)
150527  2:01:44 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1379: Failed to open channel 'cluster' at 'gcomm://192.168.33.101,192.168.33.102': -110 (Connection timed out)
150527  2:01:44 [ERROR] WSREP: gcs connect failed: Connection timed out
150527  2:01:44 [ERROR] WSREP: wsrep::connect() failed: 7
150527  2:01:44 [ERROR] Aborting

150527  2:01:44 [Note] WSREP: Service disconnected.
150527  2:01:45 [Note] WSREP: Some threads may fail to exit.
150527  2:01:45 [Note] mysqld: Shutdown complete

+ RC=1
+ test -s /var/run/galera-healthcheck.pid
++ cat /var/run/galera-healthcheck.pid
+ kill 21
+ exit 1


trompx commented Jun 2, 2015

Finally I hardcoded the node's (host2) gcomm to point to the seed, host1 (192.168.33.101). Running nmap against the seed host, I get the following:

nmap -sT -p 3306,4567 192.168.33.101

Starting Nmap 6.40 ( http://nmap.org ) at 2015-06-02 10:17 UTC
Nmap scan report for 192.168.33.101
Host is up (0.00074s latency).
PORT     STATE  SERVICE
3306/tcp open   mysql
4567/tcp closed tram

I created a rule in haproxy to listen on *:3306, so it makes sense that the mysql port is open. Concerning port 4567, however, marathon/mesos assigns dynamic ports in the range 31000-32000, so when the node tries to connect to host1 at 192.168.33.101:4567, it finds nothing there. (If I launch the containers from the command line and publish the ports with -p 3306:3306 -p 4567:4567, everything works as expected.)
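For cross-host replication, Galera needs more than just 3306 reachable. A small sketch that builds the docker publish flags for the usual Galera port set (the image name is a placeholder):

```shell
# Ports Galera expects between hosts:
# 3306 = MySQL clients, 4444 = SST (e.g. xtrabackup),
# 4567 = group communication (gcomm), 4568 = IST
GALERA_PORTS="3306 4444 4567 4568"
PUBLISH=""
for p in $GALERA_PORTS; do
  PUBLISH="$PUBLISH -p $p:$p"
done
echo "docker run$PUBLISH <image>"
# -> docker run -p 3306:3306 -p 4444:4444 -p 4567:4567 -p 4568:4568 <image>
```

With marathon assigning dynamic host ports instead, none of these land where the peers expect them, which matches the nmap result above.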

In your demo it does not look like you have this kind of problem. Did you manually add some other configuration? Is it possible to dynamically configure which port Galera listens on for new nodes (i.e. get the ports assigned by marathon/mesos and listen on those instead of 4567)?

Thanks for the help.


trompx commented Jun 3, 2015

Well, it appears that your Galera nodes connect directly (between Docker containers) thanks to weave. Due to bad performance, I was not using weave, so when the node tried to connect to the seed at 192.168.33.101:4567, the port was not available on the host.

I finally managed to map all the ports on the host with marathon by changing the mesos-slave resources and setting hostPort in the marathon.json file.

One weird thing is that when running nmap against the seed host I now get:

Starting Nmap 6.40 ( http://nmap.org ) at 2015-06-03 13:08 UTC
Nmap scan report for mysql1 (192.168.33.101)
Host is up (0.00033s latency).
PORT     STATE  SERVICE
3306/tcp open   mysql
4444/tcp closed krb524
4567/tcp open   tram
4568/tcp closed unknown

I don't quite get how the node can sync with the seed via xtrabackup SST, as SST is supposed to use port 4444, which is closed on the seed host...

Anyway, the seed is running and its healthcheck passes, but when I scale the number of nodes to 1 (which deploys on host 2), the node's healthcheck fails. However, when I run "show status like 'wsrep%';" on the seed, I get:

| wsrep_local_state            | 4                                                                               |
| wsrep_local_state_comment    | Synced                                                                          |
| wsrep_cert_index_size        | 0                                                                               |
| wsrep_causal_reads           | 0                                                                               |
| wsrep_cert_interval          | 0.000000                                                                        |
| wsrep_incoming_addresses     | 192.168.33.101:3306,192.168.33.102:3306 |
| wsrep_evs_delayed            | 87bcc037-0995-11e5-8b5a-d619af593f37:tcp://192.168.33.102:4567:1                |
| wsrep_evs_evict_list         |                                                                                 |
| wsrep_evs_repl_latency       | 0/0/0/0/0                                                                       |
| wsrep_evs_state              | OPERATIONAL                                                                     |
| wsrep_gcomm_uuid             | 4f243f04-098a-11e5-be08-6fee31f18f6a                                            |
| wsrep_cluster_conf_id        | 4                                                                               |
| wsrep_cluster_size           | 2                                                                               |
| wsrep_cluster_state_uuid     | 4f24c5d9-098a-11e5-b6a2-ef3776cc6bd7                                            |
| wsrep_cluster_status         | Primary                                                                         |
| wsrep_connected              | ON                                                                              |
| wsrep_local_bf_aborts        | 0                                                                               |
| wsrep_local_index            | 0                                                                               |
| wsrep_provider_name          | Galera                                                                          |
| wsrep_provider_vendor        | Codership Oy <[email protected]>                                               |
| wsrep_provider_version       | 3.9(rXXXX)                                                                      |
| wsrep_ready                  | ON                                                                              |
| wsrep_thread_count           | 2                                                                               |
+------------------------------+---------------------------------------------------------------------------------+

Here is the log I get on the node :

I0603 12:58:10.696002 11214 exec.cpp:132] Version: 0.22.1
I0603 12:58:10.727375 11243 exec.cpp:206] Executor registered on slave 20150603-102449-1696704704-5050-45-S1
+ '[' -z 3240fd7as9f8798 ']'
++ sha256sum
++ awk '{print $1;}'
++ echo 3240fd7as9f8798
+ CLUSTERCHECK_PASSWORD=62d50cf100bcbb8755a85f9936df32a9a39b1830171d25e59de17cb56e9d10da
+ QCOMM=
+ CLUSTER_NAME=cluster
+ MYSQL_MODE_ARGS=
+ case "$1" in
+ '[' -z galera.service.consul ']'
+ ADDRS=galera.service.consul
+ SEP=
+ for ADDR in '${ADDRS//,/ }'
+ expr galera.service.consul : '^[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$'
++ paste -sd ,
++ awk '{ print $4 }'
++ host -t A galera.service.consul
+ QCOMM+=found:
+ SEP=,
+ NODE_ADDR=192.168.33.102
+ INCOMING_ADDR=192.168.33.101:3306,192.168.33.102:3306
+ QCOMM=192.168.33.101
+ shift 2
+ echo 'Starting node, connecting to qcomm://192.168.33.101'
+ set +e -m
+ trap shutdown TERM INT
+ echo 'Node address :' 192.168.33.102
+ /mysqld.sh --console --wsrep_cluster_name=cluster --wsrep_cluster_address=gcomm://192.168.33.101 --wsrep_node_address=192.168.33.102 --wsrep_node_incoming_address=192.168.33.101:3306,192.168.33.102:3306 --wsrep_sst_auth=xtrabackup:3240fd7as9f8798 --default-time-zone=+00:00
+ /bin/galera-healthcheck -password=62d50cf100bcbb8755a85f9936df32a9a39b1830171d25e59de17cb56e9d10da -pidfile=/var/run/galera-healthcheck.pid -user clustercheck
150603 12:58:13 [Note] mysqld (mysqld 10.0.19-MariaDB-1~trusty-wsrep-log) starting as process 18 ...
150603 12:58:13 [Note] WSREP: Read nil XID from storage engines, skipping position init
150603 12:58:13 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
150603 12:58:13 [Note] WSREP: wsrep_load(): Galera 3.9(rXXXX) by Codership Oy <[email protected]> loaded successfully.
150603 12:58:13 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
150603 12:58:13 [Note] WSREP: Found saved state: 4f24c5d9-098a-11e5-b6a2-ef3776cc6bd7:-1
150603 12:58:13 [Note] WSREP: Passing config to GCS: base_host = 192.168.33.102; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recov
150603 12:58:13 [Note] WSREP: Service thread queue flushed.
150603 12:58:13 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
150603 12:58:13 [Note] WSREP: wsrep_sst_grab()
150603 12:58:13 [Note] WSREP: Start replication
150603 12:58:13 [Note] WSREP: Setting initial position to 4f24c5d9-098a-11e5-b6a2-ef3776cc6bd7:0
150603 12:58:13 [Note] WSREP: protonet asio version 0
150603 12:58:13 [Note] WSREP: Using CRC-32C for message checksums.
150603 12:58:13 [Note] WSREP: backend: asio
150603 12:58:13 [Note] WSREP: restore pc from disk successfully
150603 12:58:13 [Note] WSREP: GMCast version 0
150603 12:58:13 [Note] WSREP: (87bcc037, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150603 12:58:13 [Note] WSREP: (87bcc037, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150603 12:58:13 [Note] WSREP: EVS version 0
150603 12:58:13 [Note] WSREP: gcomm: connecting to group 'cluster', peer '192.168.33.101:'
150603 12:58:13 [Note] WSREP: (87bcc037, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
150603 12:58:13 [Note] WSREP: declaring 4f243f04 at tcp://192.168.33.101:4567 stable
150603 12:58:13 [Note] WSREP: Node 4f243f04 state prim
150603 12:58:13 [Note] WSREP: view(view_id(PRIM,4f243f04,11850) memb {
    4f243f04,0
    87bcc037,0
} joined {
} left {
} partitioned {
})
150603 12:58:13 [Note] WSREP: save pc into disk
150603 12:58:13 [Note] WSREP: clear restored view
150603 12:58:14 [Note] WSREP: gcomm: connected
150603 12:58:14 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150603 12:58:14 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150603 12:58:14 [Note] WSREP: Opened channel 'cluster'
150603 12:58:14 [Note] WSREP: Waiting for SST to complete.
150603 12:58:14 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150603 12:58:14 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150603 12:58:14 [Note] WSREP: STATE EXCHANGE: sent state msg: 31c4a8d4-09f0-11e5-a718-e325a19de76c
150603 12:58:14 [Note] WSREP: STATE EXCHANGE: got state msg: 31c4a8d4-09f0-11e5-a718-e325a19de76c from 0 (e707458f0c20)
150603 12:58:14 [Note] WSREP: STATE EXCHANGE: got state msg: 31c4a8d4-09f0-11e5-a718-e325a19de76c from 1 (d4e6219d12d2)
150603 12:58:14 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 1,
    members    = 2/2 (joined/total),
    act_id     = 0,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 4f24c5d9-098a-11e5-b6a2-ef3776cc6bd7
150603 12:58:14 [Note] WSREP: Flow-control interval: [23, 23]
150603 12:58:14 [Note] WSREP: Restored state OPEN -> JOINED (0)
150603 12:58:14 [Note] WSREP: New cluster view: global state: 4f24c5d9-098a-11e5-b6a2-ef3776cc6bd7:0, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
150603 12:58:14 [Note] WSREP: SST complete, seqno: 0
150603 12:58:14 [Note] WSREP: Member 1.0 (d4e6219d12d2) synced with group.
150603 12:58:14 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
150603 12:58:14 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150603 12:58:14 [Note] InnoDB: The InnoDB memory heap is disabled
150603 12:58:14 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150603 12:58:14 [Note] InnoDB: Memory barrier is not used
150603 12:58:14 [Note] InnoDB: Compressed tables use zlib 1.2.8
150603 12:58:14 [Note] InnoDB: Using Linux native AIO
150603 12:58:14 [Note] InnoDB: Not using CPU crc32 instructions
150603 12:58:14 [Note] InnoDB: Initializing buffer pool, size = 256.0M
150603 12:58:14 [Note] InnoDB: Completed initialization of buffer pool
150603 12:58:14 [Note] InnoDB: Highest supported file format is Barracuda.
150603 12:58:14 [Note] InnoDB: 128 rollback segment(s) are active.
150603 12:58:14 [Note] InnoDB: Waiting for purge to start
150603 12:58:14 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.23-72.1 started; log sequence number 1631760
150603 12:58:14 [Note] Plugin 'FEEDBACK' is disabled.
150603 12:58:14 [Note] Server socket created on IP: '0.0.0.0'.
150603 12:58:14 [Note] Event Scheduler: Loaded 0 events
150603 12:58:14 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150603 12:58:14 [Note] WSREP: REPL Protocols: 7 (3, 2)
150603 12:58:14 [Note] WSREP: Service thread queue flushed.
150603 12:58:14 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
150603 12:58:14 [Note] WSREP: Service thread queue flushed.
150603 12:58:14 [Note] WSREP: Synchronized with group, ready for connections
150603 12:58:14 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150603 12:58:14 [Note] Reading of all Master_info entries succeded
150603 12:58:14 [Note] Added new Master_info '' to hash table
150603 12:58:14 [Note] mysqld: ready for connections.
Version: '10.0.19-MariaDB-1~trusty-wsrep-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution, wsrep_25.10.r4144
150603 12:58:16 [Note] WSREP: (87bcc037, 'tcp://0.0.0.0:4567') turning message relay requesting off

As the healthcheck fails, the node keeps being redeployed, even though it seems to be running fine.
When I "curl 192.168.33.102:host_port_for_8080", I get "Galera Cluster Node status: synced".

Any idea why the node healthcheck could be failing while the seed one is passing ?
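For what it's worth, the core of such a check usually reduces to inspecting wsrep_local_state_comment. A sketch with simulated mysql output (a live check would run something like `mysql -N -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"` instead):

```shell
# Simulated row, as `mysql -N -e` would print it (tab-separated):
status_row='wsrep_local_state_comment	Synced'

# Field 2 is the state; "Synced" means the node is healthy
state=$(printf '%s\n' "$status_row" | awk '{ print $2 }')
if [ "$state" = "Synced" ]; then
  result="Galera Cluster Node status: synced"
else
  result="Node not synced (state: $state)"
fi
echo "$result"
```

Since curling the node by hand returns "synced", the discrepancy is presumably in how marathon reaches the healthcheck port, not in the node's state itself.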

Thank you
