BitMEX Research Launches Ethereum Node Metrics Platform Nodestats.org

BitMEX Research is pleased to announce the launch of a new platform for monitoring the Ethereum network: Nodestats.org. The website connects to five different Ethereum nodes and collects data every five seconds. Its main focus is on metrics related to the computational resources each node requires. While analyzing some of these metrics, BitMEX Research identified potential issues with the integrity of the data the nodes report, which may be of interest to Ethereum users. Nodestats.org was developed in collaboration with TokenAnalyst, the Ethereum network data and analytics partner of BitMEX Research.

Overview of Nodestats.org

Nodestats.org compares statistics from the two most widely adopted Ethereum node client implementations, Geth and Parity. For these clients, the platform evaluates the performance of different node configurations: full, archive, and fast sync nodes.

The primary objectives of Nodestats.org are:

Data collection for Nodestats.org began in early March 2019, making it too early to draw definitive conclusions. However, all data is being stored for future analysis of long-term trends. The platform generates data by querying each of the five Ethereum nodes—or the machines running them—every five seconds (720 times per hour), storing the results in a database. Various rolling averages and derived metrics are displayed on the Nodestats.org website.
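As a rough illustration of the sampling scheme described above, the sketch below keeps one hour of five-second samples (720 values) and reports their rolling average. The RollingMetric class and the sample readings are hypothetical and are not taken from the Nodestats.org codebase.

```python
from collections import deque

SAMPLE_INTERVAL_SECONDS = 5
SAMPLES_PER_HOUR = 3600 // SAMPLE_INTERVAL_SECONDS  # 720, as stated in the text


class RollingMetric:
    """Keeps the most recent samples of one metric and averages them (hypothetical helper)."""

    def __init__(self, window=SAMPLES_PER_HOUR):
        # A deque with maxlen discards the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def add(self, value):
        self.samples.append(float(value))

    def average(self):
        # No samples yet means there is nothing to average.
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)


# Illustrative readings only, not measured data.
cpu = RollingMetric(window=3)
for reading in [0.2, 0.4, 0.9, 0.5]:
    cpu.add(reading)
print(cpu.average())  # average of the last three readings
```

A fixed-length window is only one way to compute a rolling average; the website may use a different window or weighting.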

Explanation of Node Metrics

Time Synced Percentage

This metric indicates the percentage of time a node has downloaded and verified all block data up to the chain tip it has been made aware of by the P2P network. The hourly metric is calculated by checking every 5 seconds, across 720 queries, whether the node reports being at the tip; the proportion of queries in which it does is the reported figure. The metric is based on web3’s isSyncing field, which appears to compare the node’s verified height with the highest block the node has seen (“highestBlock”) to decide whether it is behind the highest block seen by its peers.
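The hourly calculation described above can be sketched as follows. The function name and the sample hour are assumptions made for illustration; the sketch only shows the arithmetic of turning 720 yes/no answers into a percentage.

```python
def time_synced_percentage(at_tip_flags):
    """Percentage of queries in which the node reported being at the chain tip."""
    if not at_tip_flags:
        raise ValueError("no samples")
    at_tip = sum(1 for flag in at_tip_flags if flag)
    return 100.0 * at_tip / len(at_tip_flags)


# One hour of five-second queries: 720 samples, one of which reports "not at tip".
hour = [True] * 719 + [False]
print(round(time_synced_percentage(hour), 2))  # 99.86
```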

Preliminary Findings: Nodes generally report being at the tip 99.8% of the time, meaning that only about one of the 720 hourly queries finds a node not at the tip. The exception is the Ethereum Parity full node, discussed later. Data integrity for this metric appears weak, especially for the Parity full node. Future efforts aim to develop a more robust method for calculating this indicator.

Time on Conflicting Chain Percentage

This represents the percentage of time a node follows a different or conflicting chain compared to other nodes. It is determined by storing all block hashes in a database; nodes are considered on different chains if they have different block hashes at the same height.
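A minimal sketch of the comparison described above, assuming each observation is stored as a (node, height, block hash) record. The node names, hashes, and function name are hypothetical.

```python
from collections import defaultdict


def conflicting_heights(observations):
    """Return {height: {block_hash: [nodes]}} for heights where nodes disagree.

    observations: iterable of (node_name, height, block_hash) tuples.
    """
    by_height = defaultdict(lambda: defaultdict(list))
    for node, height, block_hash in observations:
        by_height[height][block_hash].append(node)
    # A height is in conflict when more than one distinct hash was seen for it.
    return {
        height: dict(hashes)
        for height, hashes in by_height.items()
        if len(hashes) > 1
    }


# Illustrative records only.
obs = [
    ("geth-full", 100, "0xaaa"),
    ("parity-full", 100, "0xaaa"),
    ("geth-full", 101, "0xbbb"),
    ("parity-full", 101, "0xccc"),  # differs from geth-full at height 101
]
print(conflicting_heights(obs))
# {101: {'0xbbb': ['geth-full'], '0xccc': ['parity-full']}}
```

Turning these conflicts into a percentage of time would also require timestamps for each observation, which this sketch omits.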

Preliminary Findings: Nodestats.org has not identified instances of clients following different chains, so this metric is typically 0%.

CPU Usage

This shows the average percentage of machine CPU resources used. All machines used by Nodestats.org feature a “Xeon(R) CPU E5-2686 @ 2.30GHz” processing unit with dual cores, except the archive node, which has 16 cores. All nodes use AWS “i3.large” instances, except the archive node, which uses “i3.4xlarge.”

Preliminary Findings: CPU usage tends to range between 0.01% and 1.0%. Parity often reaches around 1%, while Geth generally uses less CPU power. Geth’s CPU usage is less stable, occasionally spiking to approximately 1%.

Memory Usage

Every 5 seconds, Nodestats.org reads the memory usage of the Ethereum client. All machines have 14GB of RAM, except the archive node, which has 120GB.

Preliminary Findings: Nodes typically utilize over 95% of available memory, regardless of the total amount. Memory demands appear relatively stable across clients.

Peer Count

The number of network peers reported by the node every 5 seconds.

Preliminary Findings: Parity typically maintains around 450 peers, while Geth averages about 8. Geth’s peer count is more volatile, occasionally dropping to around 6.

Upload Bandwidth

The total network upload bandwidth of the server, measured every 5 seconds.

Preliminary Findings: Parity, with more peers, often uses over 100KB/s bandwidth (in each direction). In contrast, Geth uses about 4KB/s, with occasional spikes to around 60KB/s.

Download Bandwidth

The total network download bandwidth of the server, measured every 5 seconds.

Chain Data Size

The total data used by all directories dedicated to the client. Unlike other metrics, this is an absolute value, not a rolling average.

Preliminary Findings: Currently, Parity requires about 180GB, Geth uses under 200GB, and the full archive node needs 2.36TB.

Parity Full Node Synchronization Status

The Parity full node began syncing on March 1, 2019. As of March 12, 2019, it had not fully synchronized with the Ethereum chain, lagging by approximately 450,000 blocks. Based on its trajectory, it was expected to catch up within a few days. Due to this slow initial sync, the “time synced percentage” metric showed near 0%, as the client was never synchronized.

The machine specifications for the Ethereum Parity full node were:

The fact that a machine with these specifications required over 12 days to sync may indicate that initial synchronization is a more significant concern for the Ethereum network than post-sync issues like block propagation. Although slow initial sync is a potential problem, Ethereum has not reached a point where nodes cannot catch up, as sync speed currently exceeds blockchain growth.
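The catch-up condition above comes down to simple arithmetic: a node closes the gap only if it verifies blocks faster than the chain produces them. The sketch below makes that explicit. The two daily rates are hypothetical values chosen for illustration, not measurements from the Parity node; only the 450,000-block lag comes from the article.

```python
def days_to_catch_up(blocks_behind, sync_blocks_per_day, chain_blocks_per_day):
    """Days until the node reaches the tip, or None if it can never catch up."""
    net_rate = sync_blocks_per_day - chain_blocks_per_day
    if net_rate <= 0:
        # The chain grows at least as fast as the node syncs.
        return None
    return blocks_behind / net_rate


# Hypothetical rates: node verifies 156,000 blocks/day, chain adds 6,000 blocks/day.
print(days_to_catch_up(450_000, 156_000, 6_000))  # 3.0
print(days_to_catch_up(450_000, 5_000, 6_000))    # None
```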

Data Integrity Concerns

Despite being hundreds of thousands of blocks behind the chain tip, the Parity full node occasionally reported itself as synced. For example, in the screenshot included in the original article, the website reported the node as fully synced 0.02% of the time, indicating that the node incorrectly believed it was at the chain tip during those periods.

As shown in a chart generated from Parity full node logs, the “highest block seen on the network” value (blue) sometimes decreased over time and consistently lagged far behind the actual chain tip (green). Occasionally this potentially erroneous value fell to the verified chain height (orange), causing the website to falsely report the node as synced. This could concern Ethereum users: because the Parity full node maintained many connections to the network, poor connectivity is an unlikely explanation, which suggests a possible software bug.
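To illustrate the failure mode, the sketch below assumes that sync status is derived by comparing the verified height with the highest block the node reports having seen, as the article suggests but does not confirm. It also flags samples where the reported highest block goes down, which should not happen on a healthy node. The function names and block numbers are hypothetical.

```python
def reports_synced(current_block, highest_block):
    """A node looks synced when its verified height reaches the highest block it reports."""
    return current_block >= highest_block


def highest_block_regressions(samples):
    """Indices where the reported highest block is lower than in the previous sample.

    samples: list of (current_block, highest_block) tuples in time order.
    """
    return [
        i for i in range(1, len(samples))
        if samples[i][1] < samples[i - 1][1]
    ]


# Illustrative log samples only; the real chain tip is assumed to be far above all of them.
samples = [
    (6_900_000, 7_100_000),
    (6_900_500, 7_100_200),
    (6_901_000, 6_901_000),  # reported highest block drops to the verified height
    (6_901_500, 7_100_900),
]
print([reports_synced(c, h) for c, h in samples])  # [False, False, True, False]
print(highest_block_regressions(samples))          # [2]
```

In the third sample the node appears synced even though it is still far behind, which matches the kind of false positive described above.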

This potential flaw may undermine the reliability of this metric on the website, even for other nodes, as the field for the highest chain tip might not function correctly. However, the metric continues to be included because Nodestats.org displays data as reported by the nodes, regardless of any doubts about its integrity. Future improvements to the metric are planned.

In limited scenarios, exploiting this vulnerability could have serious implications. For example, a user might perceive a transaction or smart contract execution as validated because their node claims to be at the network chain tip, while it is not. An attacker could then double-spend at a height the vulnerable node mistakenly believes is the tip, where proof-of-work requirements might be lower than the main chain tip. However, the likelihood of successfully executing such an attack is extremely low, as users rarely rely solely on the highest block function.

Conclusion and Future Developments

Like its counterpart Forkmonitor.info, Nodestats.org has room for improvement. BitMEX Research plans the following enhancements, in collaboration with TokenAnalyst, over the coming months and years:

Currently, Nodestats.org serves as a useful tool for estimating the system requirements of running an Ethereum node. At a basic level, it also provides mechanisms for assessing the reliability of the Ethereum network and its various software implementations. The “time synced percentage” metric may be unreliable, but it does highlight potential areas of concern.

Frequently Asked Questions

What is the purpose of Nodestats.org?
Nodestats.org is designed to monitor and compare the computational resource usage of different Ethereum node implementations and configurations. It provides metrics on CPU, memory, bandwidth, and storage to help users understand the requirements and performance of running a node.

How often is data collected on Nodestats.org?
Data is collected every five seconds from each of the five monitored nodes, resulting in 720 queries per hour. This high frequency allows for detailed analysis of node performance and network behavior.

Why does the Parity full node show a low time synced percentage?
The Parity full node was still performing its initial synchronization during the monitoring period, a process that took over 12 days. Because it was never actually at the chain tip during that time, its time synced percentage was close to zero; the small non-zero readings came from moments when the node incorrectly reported itself as synced.

Are the metrics on Nodestats.org reliable?
While most metrics are based on direct machine readings, the “time synced percentage” metric may suffer from data integrity issues, as nodes can occasionally misreport their sync status. The platform is transparent about these limitations and aims to improve metric reliability in future updates.

Can Nodestats.org detect blockchain forks?
The platform currently includes a “time on conflicting chain” metric to identify forks, but no forks have been detected during the monitoring period. A dedicated fork detection system is planned for future implementation.

How can I use this data to run my own node?
The metrics provide realistic estimates of the storage, bandwidth, and processing power needed to operate different types of Ethereum nodes. This can guide hardware choices and configuration settings for individuals or organizations interested in participating in the network.