MariaDB – Galera Cluster "failed to report last committed"

galera mariadb

I'm experimenting with a 3-node Galera Cluster and I'm running into issues I can't explain. I'm seeing many "failed to report last committed (number), -110 (connection timed out)" messages. I'm on a 100Mb/s connection; a throughput test showed 97Mb/s between hosts, so the network looks fine. I also noticed that one of the nodes was sending many flow control messages, and I don't know exactly why. When I turned that node off and ran with the two remaining nodes, there was no problem at all. Below are the hardware characteristics (the machines are heterogeneous) and the wsrep_ status output of each node, in case they hint at the cause or suggest some tuning:

In all cases the OS is Ubuntu 18.04 LTS.

Computer 1: L502X Dell XPS

Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz 4 cores/8 threads
16GB RAM Kingston
1TB HD ST1000LM024 HN-M101MBB

Running MariaDB 10.3 natively

Computer 2: Toshiba Satellite A200-220

Intel(R) Core(TM)2 Duo T7500 @ 2.20GHz 2 cores/4 threads
4GB RAM
200GB HD Toshiba MK2046GSX

Running MariaDB 10.3 on VirtualBox in a Windows Host:
2 vCPUs
1.5 GB RAM

Computer 3: Dell Inspiron

Intel(R) Core(TM) i3 4005U @ 1.70GHz 2 cores/4 threads
4GB RAM
1TB HD ST1000LM024 HN-M101MBB

Running MariaDB 10.3 on VirtualBox in a Windows Host:
2 vCPUs
1.5 GB RAM

The curious thing is that I tested the nodes in pairs to check whether the timeouts would stop, and they stopped only with computers 1 and 2 in the cluster. When I took computer 2 out and put computer 3 in, the timeouts returned. I don't know why the newer hardware is the culprit, and I'm afraid of testing on AWS in the hope of having no timeouts and then facing them there too. I'm running these tests because I plan to migrate my company's DB to AWS and use Galera for HA, backup and better performance. I'm new to Galera and need to be sure that I'm not doing anything wrong.

In all cases I'm using wsrep_slave_threads = 4

Node 1 wsrep_%

wsrep_apply_oooe    0.000000
wsrep_apply_oool    0.000000
wsrep_apply_window  1.000000
wsrep_causal_reads  0
wsrep_cert_deps_distance    9.436170
wsrep_cert_index_size   94473
wsrep_cert_interval 0.000000
wsrep_cluster_conf_id   11
wsrep_cluster_size  3
wsrep_cluster_state_uuid    e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_cluster_status    Primary
wsrep_commit_oooe   0.000000
wsrep_commit_oool   0.000000
wsrep_commit_window 1.000000
wsrep_connected ON
wsrep_desync_count  0
wsrep_evs_delayed   
wsrep_evs_evict_list    
wsrep_evs_repl_latency  0.00379364/0.0224714/0.0381743/0.00500254/200
wsrep_evs_state OPERATIONAL
wsrep_flow_control_paused   0.418272
wsrep_flow_control_paused_ns    580464988618
wsrep_flow_control_recv 123
wsrep_flow_control_sent 0
wsrep_gcomm_uuid    378af41f-6ddc-11e8-82f4-8aec32d4faa4
wsrep_incoming_addresses    10.0.0.16:3306,10.0.0.19:3306,10.0.0.12:3306
wsrep_last_committed    4534
wsrep_local_bf_aborts   0
wsrep_local_cached_downto   4312
wsrep_local_cert_failures   0
wsrep_local_commits 495
wsrep_local_index   1
wsrep_local_recv_queue  0
wsrep_local_recv_queue_avg  0.007519
wsrep_local_recv_queue_max  2
wsrep_local_recv_queue_min  0
wsrep_local_replays 0
wsrep_local_send_queue  1
wsrep_local_send_queue_avg  0.094382
wsrep_local_send_queue_max  2
wsrep_local_send_queue_min  0
wsrep_local_state   4
wsrep_local_state_comment   Synced
wsrep_local_state_uuid  e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_protocol_version  8
wsrep_provider_name Galera
wsrep_provider_vendor   Codership Oy <info@codership.com>
wsrep_provider_version  25.3.23(r3789)
wsrep_ready ON
wsrep_received  2016
wsrep_received_bytes    1453297536
wsrep_repl_data_bytes   952332692
wsrep_repl_keys 3381123
wsrep_repl_keys_bytes   27069920
wsrep_repl_other_bytes  0
wsrep_replicated    716
wsrep_replicated_bytes  979451040
wsrep_thread_count  5

Node 2 wsrep_%

wsrep_apply_oooe    0.444043
wsrep_apply_oool    0.032491
wsrep_apply_window  2.660650
wsrep_causal_reads  0
wsrep_cert_deps_distance    9.296029
wsrep_cert_index_size   105593
wsrep_cert_interval 0.000000
wsrep_cluster_conf_id   11
wsrep_cluster_size  3
wsrep_cluster_state_uuid    e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_cluster_status    Primary
wsrep_commit_oooe   0.000000
wsrep_commit_oool   0.000000
wsrep_commit_window 1.273723
wsrep_connected ON
wsrep_desync_count  0
wsrep_evs_delayed   
wsrep_evs_evict_list    
wsrep_evs_repl_latency  0.0209565/0.0265051/0.0300225/0.00397011/3
wsrep_evs_state OPERATIONAL
wsrep_flow_control_paused   0.451273
wsrep_flow_control_paused_ns    155151831643
wsrep_flow_control_recv 122
wsrep_flow_control_sent 0
wsrep_gcomm_uuid    2ef79ae7-6de7-11e8-b8bb-621e468c1867
wsrep_incoming_addresses    10.0.0.16:3306,10.0.0.19:3306,10.0.0.12:3306
wsrep_last_committed    4525
wsrep_local_bf_aborts   0
wsrep_local_cached_downto   4310
wsrep_local_cert_failures   0
wsrep_local_commits 0
wsrep_local_index   0
wsrep_local_recv_queue  8
wsrep_local_recv_queue_avg  2.460976
wsrep_local_recv_queue_max  16
wsrep_local_recv_queue_min  0
wsrep_local_replays 0
wsrep_local_send_queue  0
wsrep_local_send_queue_avg  0.000000
wsrep_local_send_queue_max  1
wsrep_local_send_queue_min  0
wsrep_local_state   4
wsrep_local_state_comment   Synced
wsrep_local_state_uuid  e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_protocol_version  8
wsrep_provider_name Galera
wsrep_provider_vendor   Codership Oy <info@codership.com>
wsrep_provider_version  25.3.23(r3789)
wsrep_ready ON
wsrep_received  406
wsrep_received_bytes    319335762
wsrep_repl_data_bytes   0
wsrep_repl_keys 0
wsrep_repl_keys_bytes   0
wsrep_repl_other_bytes  0
wsrep_replicated    0
wsrep_replicated_bytes  0
wsrep_thread_count  5

Node 3 wsrep_%

wsrep_apply_oooe    0.481061
wsrep_apply_oool    0.018939
wsrep_apply_window  1.742424
wsrep_causal_reads  0
wsrep_cert_deps_distance    8.821970
wsrep_cert_index_size   100278
wsrep_cert_interval 0.000000
wsrep_cluster_conf_id   11
wsrep_cluster_size  3
wsrep_cluster_state_uuid    e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_cluster_status    Primary
wsrep_commit_oooe   0.000000
wsrep_commit_oool   0.000000
wsrep_commit_window 1.064639
wsrep_connected ON
wsrep_desync_count  0
wsrep_evs_delayed   
wsrep_evs_evict_list    
wsrep_evs_repl_latency  0.00271209/0.00877237/0.0228643/0.00731529/11
wsrep_evs_state OPERATIONAL
wsrep_flow_control_paused   0.407408
wsrep_flow_control_paused_ns    554009596591
wsrep_flow_control_recv 121
wsrep_flow_control_sent 121
wsrep_gcomm_uuid    b0924918-6de1-11e8-80bb-c607140f7861
wsrep_incoming_addresses    10.0.0.16:3306,10.0.0.19:3306,10.0.0.12:3306
wsrep_last_committed    4514
wsrep_local_bf_aborts   0
wsrep_local_cached_downto   4309
wsrep_local_cert_failures   0
wsrep_local_commits 0
wsrep_local_index   2
wsrep_local_recv_queue  29
wsrep_local_recv_queue_avg  23.671569
wsrep_local_recv_queue_max  30
wsrep_local_recv_queue_min  0
wsrep_local_replays 0
wsrep_local_send_queue  0
wsrep_local_send_queue_avg  0.000000
wsrep_local_send_queue_max  1
wsrep_local_send_queue_min  0
wsrep_local_state   4
wsrep_local_state_comment   Synced
wsrep_local_state_uuid  e3c270c9-6dae-11e8-86a2-0f18d007f9fa
wsrep_protocol_version  8
wsrep_provider_name Galera
wsrep_provider_vendor   Codership Oy <info@codership.com>
wsrep_provider_version  25.3.23(r3789)
wsrep_ready ON
wsrep_received  1089
wsrep_received_bytes    940803284
wsrep_repl_data_bytes   378
wsrep_repl_keys 1
wsrep_repl_keys_bytes   32
wsrep_repl_other_bytes  0
wsrep_replicated    1
wsrep_replicated_bytes  480
wsrep_thread_count  3
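
To make the three outputs easier to compare, here is a minimal Python sketch that parses a few of the values above and shows which node is sending flow control messages and how deep its receive queue is. The helper names (parse_status, summarize, flow_control_senders) are hypothetical and only a handful of variables are copied in; wsrep_flow_control_paused is read as the fraction of time replication was paused, so it is printed as a percentage.

```python
# Selected wsrep_ status values copied from the three node outputs above.
STATUS_DUMPS = {
    "Node 1": """
wsrep_flow_control_paused   0.418272
wsrep_flow_control_recv 123
wsrep_flow_control_sent 0
wsrep_local_recv_queue_avg  0.007519
wsrep_local_recv_queue_max  2
""",
    "Node 2": """
wsrep_flow_control_paused   0.451273
wsrep_flow_control_recv 122
wsrep_flow_control_sent 0
wsrep_local_recv_queue_avg  2.460976
wsrep_local_recv_queue_max  16
""",
    "Node 3": """
wsrep_flow_control_paused   0.407408
wsrep_flow_control_recv 121
wsrep_flow_control_sent 121
wsrep_local_recv_queue_avg  23.671569
wsrep_local_recv_queue_max  30
""",
}


def parse_status(dump):
    """Turn 'name   value' lines into a dict; a name with no value maps to ''."""
    status = {}
    for line in dump.splitlines():
        parts = line.split(None, 1)
        if parts:
            status[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return status


def summarize(dumps):
    """Extract the flow control and receive queue figures for each node."""
    rows = []
    for node, dump in dumps.items():
        s = parse_status(dump)
        rows.append({
            "node": node,
            "fc_sent": int(s["wsrep_flow_control_sent"]),
            "recv_queue_avg": float(s["wsrep_local_recv_queue_avg"]),
            "paused_pct": 100 * float(s["wsrep_flow_control_paused"]),
        })
    return rows


def flow_control_senders(rows):
    """Nodes that asked the cluster to pause replication at least once."""
    return [r["node"] for r in rows if r["fc_sent"] > 0]


if __name__ == "__main__":
    rows = summarize(STATUS_DUMPS)
    for r in rows:
        print("{node}: fc_sent={fc_sent} recv_queue_avg={recv_queue_avg:.2f} "
              "paused={paused_pct:.1f}%".format(**r))
    print("Nodes sending flow control:", flow_control_senders(rows))
```

With the values above, only Node 3 has a non-zero wsrep_flow_control_sent, and it also has by far the largest average receive queue, which matches what I saw when swapping computers in and out of the cluster.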