Thursday 17 September 2015

Bitcoin peer to peer network

Bitcoin is based on a peer to peer network, where each peer, or node, verifies, stores and forwards the transactions taking place on the network. Some nodes, the miners, collect valid transactions into blocks and publish these. Each block references the previous block, thus creating a chain of them, the Blockchain. Although creating blocks is hard work and requires specialized hardware, anyone can subscribe to these blocks, then inspect and verify them.

By connecting to a node on the network, we can receive a stream of real-time transactions as well as the blocks being published as miners create them. For robustness and security, we should probably connect to several independent nodes, thus reducing the chance that transactions are being censored by malicious nodes, or dropped by a faulty one. Ensuring that we're well connected is far from trivial, so in this first pass we'll just connect to a handful of nodes and hope that they're not all colluding against us. Unresponsive or otherwise broken connections will be dropped and new ones created to replace them.

Implementation

I've written a simple implementation of this network layer based on the protocol documentation (refer to this for details on the various messages mentioned below, or terms that are unfamiliar). I chose Go as the implementation language, and as an experiment the code relies solely on channels for synchronization, which is elegant at times, but far from perfect. I'll write a separate blog post about interesting problems arising from these choices.

The code consist of two layers, the connection and the network. The former focuses on a setting up and maintaining a single connection, including sending and responding to pings. The latter will acquire the addresses of as many peers as possible, score the quality of them by various metrics and create and set up connections to a number of the best scoring ones. New peers' addresses are found by performing DNS look-ups, asking already connected peers by sending a getaddr message, or loading previously known peers from local storage. Usually this will quickly result in a pretty good overview of the network, with hundreds of nodes to chose between for connections.

Users of the library will get a channel to send messages on (either addressed messages or broadcasts) and the ability to subscribe to received messages by type. They don't have to worry about managing the connections and won't even be aware of the low level messages, with the exception of received version messages, that can be subscribed to by users who need to be aware of new peers being connected to (eg. because they want to address messages to them -- the example application included with the library will listen for version messages and send a mempool message to each new peer, thus initiating one or more inv messages containing transactions in the respective peer's mempool to be sent our way). This is a good starting point for beginning to write software interacting with the Bitcoin network.

Inspecting current connectivity

The example application contains a simple web server that can be used to see current connections as well as the applications view of the network. The page /peers on localhost:8080 list all known peers, with their connection status, address, time of last activity and a quality score.

Example peer list as seen on http://localhost:8080/peers:
(connected) 75.189.201.141:8333   2015-09-17 13:10:48.073832944 -0500 CDT Q(1000)
(connected) 73.162.143.196:8333   2015-09-17 13:10:42.964351941 -0500 CDT Q(1000)
(connected) 79.98.137.63:8333   2015-09-17 13:10:47.438899944 -0500 CDT Q(1000)
(spare)     52.11.33.24:8333   2015-09-17 13:10:32.825600001 -0500 CDT Q(800)
(pending)   50.147.76.86:8333   2015-09-17 13:10:32.825595045 -0500 CDT Q(800)
(spare)     69.143.149.38:8333   2015-09-17 13:10:32.825507889 -0500 CDT Q(800)
(pending)   24.52.35.44:8333   2015-09-17 13:10:32.695354838 -0500 CDT Q(800)
(spare)     54.153.97.109:8333   2015-09-17 13:10:32.825616717 -0500 CDT Q(800)
(spare)     25.96.107.191:8333   2015-09-17 11:08:03 -0500 CDT Q(678)
[...]
(spare)     71.93.135.167:8333   2015-09-17 05:02:53 -0500 CDT Q(313)
(spare)     155.133.19.109:8333   2015-09-17 05:00:28 -0500 CDT Q(310)
(failed)    50.177.196.160:8333, failed at 2015-09-17 13:10:43.100961354 -0500 CDT Q(300)
(failed)    173.69.49.106:8333, failed at 2015-09-17 13:10:42.699932334 -0500 CDT Q(300)
(failed)    198.38.93.227:8333, failed at 2015-09-17 13:10:42.700074666 -0500 CDT Q(300)
(spare)     68.238.62.169:8333   2015-09-17 04:49:01 -0500 CDT Q(299)
(spare)     72.241.129.31:8333   2015-09-15 16:09:59 -0500 CDT Q(0)
(spare)     84.104.142.12:8333   2015-09-09 23:23:42 -0500 CDT Q(0)
(spare)     87.220.72.124:8333   2015-08-20 08:24:59 -0500 CDT Q(0)

The application contains a simple In the example above there are 3 established connections, with 2 more pending. 3 nodes have had recent failures and are ranked lower because of this, while the rest are spares, ranked by the time of the last known activity from them (as seen by us, or as communicated by a peer in an addr message).

No comments:

Post a Comment