This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Node has stopped synchronizing but still responds to some RPC requests. #11793

Closed
max-block opened this issue Jun 20, 2020 · 9 comments

Comments

@max-block

  • OpenEthereum version: v3.0.1
  • Operating system: Linux
  • Installation: binary from github
  • Fully synchronized: yes
  • Network: ethereum
  • Restarted: no

The node has stopped synchronizing, and there are no errors in the logs:

2020-06-19 23:03:27    28/50 peers   434 MiB chain 2 GiB db 0 bytes queue 15 MiB sync  RPC:  0 conn,    0 req/s,  312 µs
2020-06-19 23:03:54  Imported #10298504 0x5cba…5038 (206 txs, 12.00 Mgas, 479 ms, 39.35 KiB)
2020-06-19 23:03:56  Imported #10298505 0xa4c4…f88a (295 txs, 12.01 Mgas, 634 ms, 44.37 KiB)
2020-06-19 23:03:57    28/50 peers   434 MiB chain 2 GiB db 0 bytes queue 15 MiB sync  RPC:  0 conn,    0 req/s,  312 µs
2020-06-19 23:04:00  Imported #10298506 0x924c…f7fa (287 txs, 11.71 Mgas, 368 ms, 84.31 KiB)
2020-06-19 23:04:14  Imported #10298507 0xed7a…5826 (143 txs, 12.02 Mgas, 805 ms, 26.68 KiB)
2020-06-19 23:04:32    27/50 peers   435 MiB chain 2 GiB db 0 bytes queue 15 MiB sync  RPC:  0 conn,    0 req/s,  312 µs
2020-06-19 23:04:41  Imported #10298508 0xbcc3…808e (222 txs, 12.01 Mgas, 531 ms, 38.88 KiB)
2020-06-19 23:05:02    27/50 peers   435 MiB chain 2 GiB db 0 bytes queue 15 MiB sync  RPC:  0 conn,    0 req/s,  312 µs
2020-06-19 23:05:03  Imported #10298509 0x59db…384b (187 txs, 11.99 Mgas, 791 ms, 34.02 KiB)
2020-06-19 23:05:16  Imported #10298510 0x2aab…a665 (220 txs, 12.02 Mgas, 311 ms, 34.42 KiB)
2020-06-19 23:14:27  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 00:21:24  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 01:42:05  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 01:43:59  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 02:17:50  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 03:28:45  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 04:12:32  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 04:31:55  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 04:35:25  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 05:16:39  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 06:25:43  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 06:29:09  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 06:32:30  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-06-20 07:20:27  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.

I checked the node with my script, which makes two JSON-RPC requests:

  1. eth_blockNumber -- it returned an outdated block number
  2. eth_syncing -- it didn't return anything; a timeout occurred
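
A check like the one described above could be sketched as follows. This is not the original script, only a minimal illustration: the helper names (build_payload, rpc_call, parse_block_number, check_node) are hypothetical, and the URL assumes the node serves JSON-RPC over HTTP on 127.0.0.1:8545. A per-request timeout is what lets the script tell "stale answer" apart from "no answer".

```python
import json
import urllib.request


def build_payload(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as bytes."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    }
    return json.dumps(body).encode("utf-8")


def rpc_call(url, method, params=None, timeout=5.0):
    """POST one JSON-RPC request; raises on timeout or transport error."""
    request = urllib.request.Request(
        url,
        data=build_payload(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def parse_block_number(hex_quantity):
    """Convert a hex quantity such as '0x9d248e' to an integer."""
    return int(hex_quantity, 16)


def check_node(url, timeout=5.0):
    """Call both methods and record either the result or the failure."""
    report = {}
    for method in ("eth_blockNumber", "eth_syncing"):
        try:
            report[method] = rpc_call(url, method, timeout=timeout)
        except Exception as exc:  # timeouts and connection errors end up here
            report[method] = f"failed: {exc!r}"
    return report


if __name__ == "__main__":
    # Assumed endpoint; adjust to the node's actual JSON-RPC address.
    print(check_node("http://127.0.0.1:8545"))
```

Comparing the parsed eth_blockNumber value against a second, independent source of the chain head would show how far behind the node is.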

I started the node with these flags:
openethereum --mode active --tracing off --pruning fast --db-compaction ssd --cache-size 8000 --no-ancient-blocks --ws-interface=0.0.0.0 --ws-hosts=all --ws-origins=all --ws-apis=all --jsonrpc-interface=0.0.0.0 --jsonrpc-hosts=all --jsonrpc-apis=all

I haven't found any errors in the system logs. How can I run the node next time so that I can track down the cause of the problem? Why does openethereum hang?

@adria0

adria0 commented Jun 22, 2020

@max-block, thank you for reporting. We are looking into this.

Could you give the following information, please?

  • How long had you been syncing before the sync stopped? Which flags did you use?
  • How much memory do you have? What is the current virtual memory size of the process (top/htop)?
  • Are you running inside a container?
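
Besides top/htop, the process memory figures asked about above can be captured on Linux from /proc/<pid>/status. The sketch below is only an illustration with hypothetical helper names (parse_proc_status, read_memory); it assumes the usual "VmSize:" and "VmRSS:" lines reported in kB.

```python
def parse_proc_status(text):
    """Return VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            # The value looks like "  53582744 kB"; keep only the number.
            result[key] = int(value.split()[0])
    return result


def read_memory(pid):
    """Read the virtual and resident memory size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())
```

Calling read_memory with the PID of the openethereum process from a cron job or a loop would leave a record of memory growth leading up to a hang.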

@max-block
Author

max-block commented Jun 23, 2020

* How much time have you been syncing before sync stopped?

1-2 days

Which flags did you use?

openethereum --mode active --tracing off --pruning fast --db-compaction ssd --cache-size 8000 --no-ancient-blocks --ws-interface=0.0.0.0 --ws-hosts=all --ws-origins=all --ws-apis=all --jsonrpc-interface=0.0.0.0 --jsonrpc-hosts=all --jsonrpc-apis=all

* How much memory do you have? What is the current process virtual memory size (top/htop)

32 GB. Sorry, I don't have information about the process memory size. I'll record it next time if it freezes again.

* Are you running inside a container?

No.

Could you please give me recommendations for --cache-size? I have nodes on different machines with 1 / 8 / 16 / 32 GB of RAM. Is there any correlation between --cache-size and RAM? I'm interested in the fastest possible synchronization of a node.

@thomsh

thomsh commented Jun 24, 2020

Hi,
I'm encountering the same issue here.
Same version (v3.0.1), OS Linux. It has happened a couple of times on different servers; here is one that failed today.
RAM usage: 8 GB of 93 GB total.
The node had been fully synced before; after a restart, the sync process starts again.

Launched with a generated command:

openethereum-v3.0.1/openethereum --config=ethereum-foundation.toml --no-persistent-txqueue --ports-shift=0

And generated config:

[parity]
chain="foundation"
base_path = "/data/eth"

[network]

[rpc]
interface = "all"
server_threads = 48
port = 30000
apis = [ "eth", "parity", "parity_set", "traces", "web3" ]

[footprint]
tracing = "on"
pruning = "archive"
db_compaction = "ssd"
scale_verifiers = true
num_verifiers = 64
cache_size = 1000

[snapshots]
disable_periodic = true

@max-block
Author

I have the same problem on the same server, but this time I saved the memory usage:

Tasks: 142 total,   1 running, 141 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31360.9 total,    285.8 free,  17922.2 used,  13152.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  13036.7 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  21851 root      20   0   51.1g  17.1g      0 S   0.7  55.8   7786:03 openethereum
  28230 root      20   0   13780   8752   7316 S   0.7   0.0   0:00.04 sshd
  17322 root      20   0       0      0      0 I   0.3   0.0   0:15.31 kworker/7:1-ata_sff
  28269 root      20   0    9232   3820   3180 R   0.3   0.0   0:00.02 top
      1 root      20   0  170328   7440   3412 S   0.0   0.0   0:14.28 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.55 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp

App logs:

Jun 24 16:20:24 eth-4 openethereum[21851]: 2020-06-24 16:20:24  Imported #10328960 0x85fe…6259 (232 txs, 11.95 Mgas, 886 ms, 39.63 KiB)
Jun 24 16:20:26 eth-4 openethereum[21851]: 2020-06-24 16:20:26  Imported #10328961 0xc0d2…64f4 (172 txs, 11.19 Mgas, 629 ms, 28.35 KiB) + another 1 block(s) containing 165 tx(s)
Jun 24 16:20:31 eth-4 openethereum[21851]: 2020-06-24 16:20:31  Imported #10328962 0xa8ab…d662 (173 txs, 11.98 Mgas, 950 ms, 39.53 KiB)
Jun 24 16:20:32 eth-4 openethereum[21851]: 2020-06-24 16:20:32  Imported #10328959 0xb439…2d8e (146 txs, 11.95 Mgas, 475 ms, 23.81 KiB)
Jun 24 16:20:42 eth-4 openethereum[21851]: 2020-06-24 16:20:42  Imported #10328963 0xb549…6937 (180 txs, 12.00 Mgas, 636 ms, 42.47 KiB)
Jun 24 16:20:44 eth-4 openethereum[21851]: 2020-06-24 16:20:44    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   13 MiB sync  RPC:  1 conn,    2 req/s, 1596 µs
Jun 24 16:20:50 eth-4 openethereum[21851]: 2020-06-24 16:20:50  Imported #10328964 0xe07f…96d3 (96 txs, 12.00 Mgas, 260 ms, 37.29 KiB)
Jun 24 16:21:07 eth-4 openethereum[21851]: 2020-06-24 16:21:07  Imported #10328965 0x2370…5c47 (187 txs, 12.02 Mgas, 914 ms, 52.94 KiB)
Jun 24 16:21:11 eth-4 openethereum[21851]: 2020-06-24 16:21:11  Imported #10328966 0x2d20…4d36 (164 txs, 11.99 Mgas, 723 ms, 47.87 KiB)
Jun 24 16:21:14 eth-4 openethereum[21851]: 2020-06-24 16:21:14    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   13 MiB sync  RPC:  1 conn,    1 req/s, 1518 µs
Jun 24 16:21:41 eth-4 openethereum[21851]: 2020-06-24 16:21:41  Imported #10328967 0x2802…fb97 (158 txs, 12.01 Mgas, 467 ms, 38.79 KiB)
Jun 24 16:21:44 eth-4 openethereum[21851]: 2020-06-24 16:21:44    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   12 MiB sync  RPC:  1 conn,    2 req/s, 1518 µs
Jun 24 16:22:11 eth-4 openethereum[21851]: 2020-06-24 16:22:11  Imported #10328968 0xbb4d…0874 (198 txs, 11.99 Mgas, 360 ms, 44.81 KiB)
Jun 24 16:22:14 eth-4 openethereum[21851]: 2020-06-24 16:22:14    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   12 MiB sync  RPC:  1 conn,    1 req/s, 1518 µs
Jun 24 16:22:22 eth-4 openethereum[21851]: 2020-06-24 16:22:22  Imported #10328969 0x202a…4b21 (160 txs, 12.01 Mgas, 448 ms, 51.70 KiB)
Jun 24 16:22:24 eth-4 openethereum[21851]: 2020-06-24 16:22:24  Imported #10328970 0x7e62…1113 (0 txs, 0.00 Mgas, 17 ms, 0.52 KiB)
Jun 24 16:22:44 eth-4 openethereum[21851]: 2020-06-24 16:22:44    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   11 MiB sync  RPC:  1 conn,    0 req/s, 1717 µs
Jun 24 16:22:48 eth-4 openethereum[21851]: 2020-06-24 16:22:48  Imported #10328971 0xedf6…ed39 (208 txs, 11.98 Mgas, 499 ms, 51.91 KiB)
Jun 24 16:22:59 eth-4 openethereum[21851]: 2020-06-24 16:22:59  Imported #10328972 0x0e13…f3b4 (155 txs, 11.97 Mgas, 658 ms, 43.90 KiB)
Jun 24 16:23:14 eth-4 openethereum[21851]: 2020-06-24 16:23:14    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   12 MiB sync  RPC:  1 conn,    0 req/s, 147451 µs
Jun 24 16:23:44 eth-4 openethereum[21851]: 2020-06-24 16:23:44    32/50 peers    283 MiB chain    2 GiB db  0 bytes queue   11 MiB sync  RPC:  1 conn,    0 req/s, 147451 µs
Jun 24 16:23:49 eth-4 openethereum[21851]: 2020-06-24 16:23:49  Imported #10328973 0xa6a1…28f4 (195 txs, 12.00 Mgas, 488 ms, 40.50 KiB)
Jun 24 16:48:22 eth-4 openethereum[21851]: 2020-06-24 16:48:22  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
Jun 24 17:16:17 eth-4 openethereum[21851]: 2020-06-24 17:16:17  eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.

The node is able to respond to JSON-RPC requests, but it takes longer than usual.
This time the node was started with a lower cache size:

[footprint]
tracing = "off"
db_compaction = "ssd"
pruning = "fast"
cache_size = 4096
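
Since the node logs no error when it stalls, one way to notice the stall is to watch for a gap after the last "Imported #..." line, as in the logs above. The following is a rough sketch with hypothetical names (IMPORT_RE, last_import, seconds_since_last_import); it assumes the log line format shown in this thread and is not part of OpenEthereum.

```python
import re
from datetime import datetime

# Matches the node's own timestamp followed by an "Imported #<number>" message.
IMPORT_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+Imported #(\d+)")


def last_import(lines):
    """Return (timestamp, block_number) of the most recent import, or None."""
    latest = None
    for line in lines:
        match = IMPORT_RE.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        entry = (stamp, int(match.group(2)))
        # Compare by time, because blocks may be imported out of numeric order.
        if latest is None or entry[0] >= latest[0]:
            latest = entry
    return latest


def seconds_since_last_import(lines, now):
    """Seconds between the last import and `now`, or None if nothing was imported."""
    found = last_import(lines)
    if found is None:
        return None
    return (now - found[0]).total_seconds()
```

A monitoring job could alert when this value exceeds a few minutes on mainnet and, at that moment, capture memory usage and a thread dump before the node is restarted.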

@adria0

adria0 commented Jun 25, 2020

@max-block @thomsh There is the https://github.com/openethereum/openethereum/tree/adria0/cache-db-sizeof branch, which tries to patch the sync stop problem (unfortunately, it does not solve the quick-memory-usage one).

This is not a release and is not "production-ready", but it would be nice if you could test it.

@max-block
Author

max-block commented Jun 27, 2020

@adria0
I'm not able to compile your branch.

This was on a fresh Ubuntu 20.04 install:

$ curl https://sh.rustup.rs -sSf | sh
$ git clone https://github.com/openethereum/openethereum
$ cd openethereum/
$ git checkout adria0/cache-db-sizeof
$ apt install cargo
$ cargo build --release --features final

And here is the output:

config.status: executing include/jemalloc/jemalloc_mangle_jet.h commands
config.status: executing include/jemalloc/jemalloc.h commands
===============================================================================
jemalloc version   : 0.0.0-0-g0000000000000000000000000000000000000000
library revision   : 2

CONFIG             : --disable-cxx --with-jemalloc-prefix=_rjem_ --with-private-namespace=_rjem_ --host=x86_64-unknown-linux-gnu --build=x86_64-unknown-linux-gnu --prefix=/root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out build_alias=x86_64-unknown-linux-gnu host_alias=x86_64-unknown-linux-gnu CC=cc 'CFLAGS=-O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall' 'LDFLAGS=-O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall' 'CPPFLAGS=-O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall'
CC                 : cc
CONFIGURE_CFLAGS   : -std=gnu11 -Wall -Wsign-compare -Wundef -Wno-format-zero-length -pipe -g3 -fvisibility=hidden -O3 -funroll-loops
SPECIFIED_CFLAGS   : -O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall
EXTRA_CFLAGS       :
CPPFLAGS           : -O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall -D_GNU_SOURCE -D_REENTRANT
CXX                :
CONFIGURE_CXXFLAGS :
SPECIFIED_CXXFLAGS :
EXTRA_CXXFLAGS     :
LDFLAGS            : -O3 -ffunction-sections -fdata-sections -fPIC -m64 -Wall
EXTRA_LDFLAGS      :
DSO_LDFLAGS        : -shared -Wl,-soname,$(@F)
LIBS               : -lm  -lpthread -ldl
RPATH_EXTRA        :

XSLTPROC           : false
XSLROOT            :

PREFIX             : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out
BINDIR             : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/bin
DATADIR            : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/share
INCLUDEDIR         : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/include
LIBDIR             : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/lib
MANDIR             : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/share/man

srcroot            : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/jemalloc/
abs_srcroot        : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/jemalloc/
objroot            :
abs_objroot        : /root/repo/openethereum/target/release/build/jemalloc-sys-d8d87fcf5203da04/out/build/

JEMALLOC_PREFIX    : _rjem_
JEMALLOC_PRIVATE_NAMESPACE
                   : _rjem_je_
install_suffix     :
malloc_conf        :
autogen            : 0
debug              : 0
stats              : 1
prof               : 0
prof-libunwind     : 0
prof-libgcc        : 0
prof-gcc           : 0
fill               : 1
utrace             : 0
xmalloc            : 0
log                : 0
lazy_lock          : 0
cache-oblivious    : 1
cxx                : 0
===============================================================================
running: "make" "srcroot=../jemalloc/" "-j" "8"

--- stderr
thread 'main' panicked at 'failed to execute command: No such file or directory (os error 2)', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.3.2/build.rs:389:19
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

warning: build failed, waiting for other jobs to finish...
error: build failed

@max-block
Author

@adria0
It happened again on one of my nodes. I have two nodes on mainnet: one with and one without the flag --no-ancient-blocks.
The problem happens only on the node running with --no-ancient-blocks.

@vorot93

vorot93 commented Jun 29, 2020

Looks like #11758.

@vorot93 vorot93 closed this as completed Jun 29, 2020

5 participants