The goal of this post is to show how quickly you can write code that calculates some metrics about the Lightning Network.
It might later turn into the first episode of a series about the improvements to my lightning node channel advisor, and what I learnt building it.
As an example, we will build a simple script that shows, for a specific lightning node, how many new nodes can be reached for a given number of hops.

Getting the lightning network graph

The first step is to get the actual lightning network data to analyze. This is actually possible from any running lightning node, because nodes gossip with their peers about what they know of the network, so that each node can calculate the hops to take to route each payment. Given enough time listening to other nodes’ gossip, a node can know most of the public nodes and channels that exist.
For LND, we can get this information with the command: lncli describegraph (redirect its output to a file, for example graph.json, to analyze it later)
The output in JSON format describes all the nodes and channels it knows:
{
  "nodes": [
    {
      "last_update": 1703385534,
      "pub_key": "0200000000a3eff613189ca6c4070c89206ad658e286751eca1f29262948247a5f",
      "alias": "pay.lnrouter.app",
      "addresses": [ … ],
      …
    },
    { info about some other node… },
    …
  ],
  "edges": [
    {
      "channel_id": "753718519383195649",
      "chan_point": "339fefe258dd33bce6f988b26c48cdf5029e061d9a7c8a28db3b7a0dc8ae8fa0:1",
      "last_update": 1706978165,
      "node1_pub": "0242a4ae0c5bef18048fbecf995094b74bfb0f7391418d71ed394784373f41e4f3",
      "node2_pub": "0376f520c6304b66045a62524eb6ab8cd0861bad6d7e5fd14609a92e63441560a0",
      "capacity": "93140",
      "node1_policy": {
        "time_lock_delta": 40,
        "min_htlc": "1000",
        "fee_base_msat": "1000",
        "fee_rate_milli_msat": "1",
        "disabled": true,
        "max_htlc_msat": "92209000",
        "last_update": 1706978165,
        "custom_records": {}
      },
      "node2_policy": {
        "time_lock_delta": 40,
        "min_htlc": "1000",
        "fee_base_msat": "1000",
        "fee_rate_milli_msat": "1",
        "disabled": false,
        "max_htlc_msat": "92209000",
        "last_update": 1623436267,
        "custom_records": {}
      },
      …
    },
    { info about some other edge… },
    …
  ]
}
nodes is an array with one element per known lightning node.
edges is an array with one element per known lightning channel.
Some notable parameters of nodes:
  • pub_key : the node’s public key hex string
  • alias : the alias the node chose
Some notable parameters of edges/channels:
  • node1_pub, node2_pub : public key of both nodes that share the channel
  • capacity : channel size, in sats
  • channel_id, chan_point : identifiers of the channel
  • node1_policy , node2_policy : fee policy for the node at each side of the channel

Loading the network graph in memory

Let’s now write the Python code to load this network graph in memory.
We import the json Python module:
import json
Then we load the graph file (the with block makes sure the file is closed afterwards):
with open('c:/graph.json', encoding='utf-8') as f:
    graph = json.load(f)
The object can then be used like a python dictionary:
nodes = graph['nodes']
edges = graph['edges']
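As a quick illustration of this dictionary access, the sketch below builds a pub_key-to-alias lookup and sums the channel capacities. The sample_graph data and the helper names alias_index and total_capacity_sats are made up for this example. Note that capacity comes out of the JSON as a string, so it needs an int() conversion.

```python
# Tiny hand-made graph in the same shape as the describegraph output
# (hypothetical data, only the fields used here).
sample_graph = {
    "nodes": [
        {"pub_key": "02aa", "alias": "alice"},
        {"pub_key": "02bb", "alias": "bob"},
        {"pub_key": "02cc", "alias": ""},
    ],
    "edges": [
        {"node1_pub": "02aa", "node2_pub": "02bb", "capacity": "100000"},
        {"node1_pub": "02bb", "node2_pub": "02cc", "capacity": "50000"},
    ],
}


def alias_index(graph):
    """Map each pub_key to its alias (falling back to the pub_key itself)."""
    return {n["pub_key"]: n.get("alias") or n["pub_key"] for n in graph["nodes"]}


def total_capacity_sats(graph):
    """Sum channel capacities; the JSON stores them as strings."""
    return sum(int(e["capacity"]) for e in graph["edges"])


aliases = alias_index(sample_graph)
print(aliases["02aa"])
print(total_capacity_sats(sample_graph))
```

With the real graph, the same two helpers could be called on the graph object loaded from the file.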

Analysing the graph

We choose the node to analyse by its pubkey:
test_pubkey="02c521e5e73e40bad13fb589635755f674d6a159fd9f7b248d286e38c3a46f8683"
We loop over the nodes until we find ours, then print some of its metadata:
# Print some information about our node
node_found = False
for node in nodes:
    if node['pub_key'] == test_pubkey:
        node_found = True
        print("Test node alias : " + node['alias'])
        break
if not node_found:
    print("Test node not found in the graph")
print("Known network nodes: " + str(len(nodes)))
print("Known network edges: " + str(len(edges)))
Now the network analysis starts. For this exercise, we will look at how many new nodes our own node can reach with each additional hop.
We first loop over the list of channels to reorganize this information: for each node, we will have a list of the nodes it is connected to.
# For each edge, note in both peers the other peer they are connected with
node_edges = dict()
for edge in edges:
    pk1 = edge['node1_pub']
    pk2 = edge['node2_pub']
    if pk1 not in node_edges:
        node_edges[pk1] = list()
    if pk2 not in node_edges:
        node_edges[pk2] = list()
    node_edges[pk1].append(pk2)
    node_edges[pk2].append(pk1)
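Two nodes can share more than one channel, in which case the list-based version above records the same peer several times. A possible variant, sketched here with collections.defaultdict and sets, keeps each peer only once. The function name build_adjacency and the sample edges are assumptions for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return {pub_key: set of peer pub_keys} from describegraph-style edges."""
    adjacency = defaultdict(set)
    for edge in edges:
        pk1 = edge["node1_pub"]
        pk2 = edge["node2_pub"]
        adjacency[pk1].add(pk2)
        adjacency[pk2].add(pk1)
    return adjacency


# Hypothetical edges: "A" and "B" share two channels.
sample_edges = [
    {"node1_pub": "A", "node2_pub": "B"},
    {"node1_pub": "A", "node2_pub": "B"},
    {"node1_pub": "B", "node2_pub": "C"},
]
adjacency = build_adjacency(sample_edges)
print(sorted(adjacency["B"]))
```

The hop counts come out the same either way, since a peer is only counted the first time it is reached; the sets just avoid scanning duplicate entries.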
Now we can quickly access all the peers connected to a node. Starting from our node, we enumerate which peers it can reach, which peers those can reach, and so on:
# Store for each node the hop count at which it is reached
node_reach = dict()
# List of pubkeys reached at the current hop level
new_reach = list()

# At 0 hops, we only reach ourselves
hop_count = 0
new_reach.append(test_pubkey)
node_reach[test_pubkey] = hop_count
reachable_nodes = 1  # ourselves

# Loop once per hop, until a new hop reaches no new node
while len(new_reach) > 0:
    # For each node newly reached at the previous hop, add the nodes they are
    # connected to (that we did not reach yet) to the reached nodes for
    # the next hop
    hop_count = hop_count + 1
    prev_reach = list(new_reach)
    new_reach.clear()
    for pubkey in prev_reach:
        # A node without any known channel has no entry in node_edges
        peers = node_edges.get(pubkey, [])
        for peer in peers:
            if peer not in node_reach:
                new_reach.append(peer)
                node_reach[peer] = hop_count
    new_hop_reach = len(new_reach)
    reachable_nodes = reachable_nodes + new_hop_reach
    print("Hop " + str(hop_count) + ": " + str(new_hop_reach) + " nodes reached")

print("Reachable nodes: " + str(reachable_nodes))
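The loop above is a breadth-first search. The same idea can be packaged as a reusable function; the sketch below uses collections.deque as the queue. The names hops_from and adjacency, and the five-node sample graph, are assumptions for illustration, not part of the original script.

```python
from collections import Counter, deque


def hops_from(start, adjacency):
    """Breadth-first search: return {pub_key: hop count from start}."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Nodes without known channels simply have no peers to visit
        for peer in adjacency.get(current, ()):
            if peer not in hops:
                hops[peer] = hops[current] + 1
                queue.append(peer)
    return hops


# Hypothetical 5-node graph: A-B, A-C, B-D, D-E
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}
hops = hops_from("A", adjacency)
per_hop = Counter(hops.values())
for hop in sorted(per_hop):
    if hop > 0:
        print("Hop " + str(hop) + ": " + str(per_hop[hop]) + " nodes reached")
print("Reachable nodes: " + str(len(hops)))
```

Returning the full hop dictionary, rather than only printing counts, makes it possible to reuse the result for other metrics later, such as listing the nodes that are farthest away.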
Run this in your Python interpreter, and you will get this kind of output:
Test node alias : LNShortcut.ovh
Known network nodes: 17190
Known network edges: 62806
Hop 1: 21 nodes reached
Hop 2: 5335 nodes reached
Hop 3: 7503 nodes reached
Hop 4: 2692 nodes reached
Hop 5: 468 nodes reached
Hop 6: 79 nodes reached
Hop 7: 8 nodes reached
Hop 8: 2 nodes reached
Hop 9: 0 nodes reached
Reachable nodes: 16109
Voilà. In less than 70 lines of code, we get a view of the number of hops needed to reach most of the LN from our node.
There are various reasons why this particular metric is not really useful as is:
  • Not all nodes are active/reachable
  • Not all channels can route in both directions, for the same amount, at the same cost
  • The current centralization of the network makes this kind of metric give similar results for most nodes as soon as they are connected to other peers
But it shows how quickly one can start analyzing the network, learning from it, and then building more advanced network analysis tools.
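As a first step towards the second limitation listed above, here is a rough sketch that only follows a channel in a direction whose policy is present and not disabled. It assumes that node1_policy governs forwarding from node1 towards node2 (and node2_policy the reverse), and that a missing policy means the direction is unusable; both are assumptions to check against the LND documentation. The helper name build_directed_adjacency and the sample edges are hypothetical.

```python
from collections import defaultdict


def build_directed_adjacency(edges):
    """Return {pub_key: set of peers it can forward to}.

    Assumption: node1_policy applies to the node1 -> node2 direction,
    and node2_policy to node2 -> node1. A missing (None) or disabled
    policy is treated as "cannot forward in that direction".
    """
    adjacency = defaultdict(set)
    for edge in edges:
        pk1, pk2 = edge["node1_pub"], edge["node2_pub"]
        policy1 = edge.get("node1_policy")
        policy2 = edge.get("node2_policy")
        if policy1 is not None and not policy1.get("disabled", False):
            adjacency[pk1].add(pk2)
        if policy2 is not None and not policy2.get("disabled", False):
            adjacency[pk2].add(pk1)
    return adjacency


# Hypothetical edges: A->B is disabled, C has no known policy towards B.
sample_edges = [
    {"node1_pub": "A", "node2_pub": "B",
     "node1_policy": {"disabled": True},
     "node2_policy": {"disabled": False}},
    {"node1_pub": "B", "node2_pub": "C",
     "node1_policy": {"disabled": False},
     "node2_policy": None},
]
directed = build_directed_adjacency(sample_edges)
print(sorted(directed["B"]))
print(sorted(directed["A"]))
```

Feeding such a directed structure into the hop-counting loop would count only nodes reachable through enabled directions, though it still ignores amounts and fees.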
Nice work! I hope to see future updates from you on this subject.
Keep up the good work!
You can also stream gossip into a graphdb (memgraph or neo4j). That approach allows one to both query and visualize the network.
Yet, so many LN node runners never got their hands dirty with such simple scripting. It's the Umbrel effect...
I agree, umbrel (and others) is dangerous that way. I use it myself, but try to dig in when I have the time
Hey these posts are so cool. Are you considering publishing it somewhere additional?
Anyhow, nice work!
Impressive, nice work
Nice! I always wanted to analyze the lightning network too.
I am not too good at coding but somehow I found it interesting and keep doing what you're doing. Don't overuse 'Umbrel' that's all I can say.
Good stuff. I have been contemplating starting an LN channel, not as a business, but to support the network and to learn more, and these are the kinds of insights that I find compelling.
This is awesome!