I run c-otto.de and, despite my best efforts, sometimes have to pay for force-closes. Because the one from last night was rather expensive, I'd like to share my insights.
2023-12-28 15:12:35.748: My node accepts a forward request coming in from bfx-lnd0, requesting to send to Gravity21:
2023-12-28 15:12:35.748 [DBG] HSWC: ChannelLink(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): queueing keystone of ADD open circuit: (Chan ID=798901:1550:1, HTLC ID=1259)->(Chan ID=805200:3121:1, HTLC ID=16830)
2023-12-28 15:12:35.820 [DBG] HSWC: ChannelLink(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): removing Add packet (Chan ID=798901:1550:1, HTLC ID=1259) from mailbox
Sadly, my peer didn't respond to this, leaving my node with a stuck HTLC. After five minutes, lnd decided to disconnect:
2023-12-28 15:17:36.415 [INF] PEER: Peer(03238001dec7155a367248ed7f9a1e6940f3f372f4d6f2586b31c91ae32cc1628f): disconnecting 03238001dec7155a367248ed7f9a1e6940f3f372f4d6f2586b31c91ae32cc1628f
I believe that lnd should not have added the HTLC to a channel whose peer doesn't respond, an issue discussed in https://github.com/lightningnetwork/lnd/issues/2992 (from 2019!). Ideally, if lnd knew that a peer was offline, it would not add a new HTLC and would instead fail the payment back to the sender. Sadly, that's not what lnd currently does.
As lnd does not know whether my peer received the HTLC, it has to wait for a timeout. This timeout ended in block 823615 (2023-12-30 23:12:43). My node uses some crude scripts to reconnect to peers that are offline, even disconnecting from those that appear to be offline if there's some stuck HTLC in one of their channels. Despite these efforts, my peer did not reconnect and, thus, did not settle/fail the HTLC in time, forcing my node to reclaim the funds on-chain:
2023-12-30 23:12:43.051 [DBG] CNCT: ChannelArbitrator(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): new block (height=823615) examining active HTLC's
2023-12-30 23:12:43.055 [DBG] CNCT: ChannelArbitrator(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): checking commit chain actions at height=823615, in_htlc_count=0, out_htlc_count=1
2023-12-30 23:12:43.058 [INF] CNCT: ChannelArbitrator(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): go to chain for outgoing htlc 6ac792db9b79f262bb941dec776ac676980c2bd7c294443316330c742c2ac706: timeout=823615, blocks_until_expiry=0, broadcast_delta=0
2023-12-30 23:12:43.076 [DBG] CNCT: ChannelArbitrator(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): attempting state step with trigger=chainTrigger from state=StateBroadcastCommit
2023-12-30 23:12:43.077 [INF] CNCT: ChannelArbitrator(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): force closing chan
2023-12-30 23:12:43.103 [INF] CNCT: Broadcasting force close transaction bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42, ChannelPoint(43a63045a6b7fc2be07424afd5eda561ea01009c057ab8693b2d2212a4a015d5:1): [...]
Here's the close transaction: https://mempool.space/tx/bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42
As you can see, it includes the two anchor outputs (worth 330 sats each), the balances (52k and 4.9M), and one additional output (13k sats), which is the HTLC in question. As my peer opened the channel, the fees (7.7k sats) were deducted from their channel reserve and I did not pay for this, even though my node sent out the close transaction.
Warning: If there were other pending HTLCs in the channel at this time, they'd also be included in the transaction, no matter how recently they were received! Make sure to limit the number of in-flight HTLCs (per channel), as a single HTLC timeout might cause an extremely large/costly force-close if many other (healthy?) HTLCs have to be included.
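To get a feeling for how quickly this adds up, here is a rough sketch in Python. It assumes the weight figures that BOLT 3 gives for anchor channels (1124 weight units for the bare commitment transaction, 172 per untrimmed HTLC output, 666 for a second-level HTLC-timeout transaction). The function names are made up for this illustration, the fee rate is just an example, and the numbers are estimates, not what lnd will actually pay:

```python
import math

# Upper-bound weight estimates from BOLT 3 for anchor channels (assumed here).
ANCHOR_COMMITMENT_BASE_WEIGHT = 1124  # commitment tx without any HTLC outputs
HTLC_OUTPUT_WEIGHT = 172              # added per untrimmed HTLC output
HTLC_TIMEOUT_TX_WEIGHT = 666          # second-level HTLC-timeout tx, no extra inputs


def commitment_weight(num_htlcs: int) -> int:
    """Estimated weight of the commitment tx carrying num_htlcs HTLC outputs."""
    return ANCHOR_COMMITMENT_BASE_WEIGHT + HTLC_OUTPUT_WEIGHT * num_htlcs


def fee_sats(weight: int, sat_per_vbyte: float) -> int:
    """Fee in sats for a tx of the given weight (4 weight units = 1 vByte)."""
    return math.ceil(weight / 4 * sat_per_vbyte)


def force_close_estimate(num_htlcs: int, sat_per_vbyte: float) -> tuple[int, int]:
    """Return (commitment fee, sum of second-level fees) in sats.

    Simplification: every HTLC is outgoing, times out, and is resolved by its
    own HTLC-timeout tx without additional wallet inputs, so treat the second
    number as a lower bound.
    """
    commit_fee = fee_sats(commitment_weight(num_htlcs), sat_per_vbyte)
    second_level = num_htlcs * fee_sats(HTLC_TIMEOUT_TX_WEIGHT, sat_per_vbyte)
    return commit_fee, second_level


if __name__ == "__main__":
    for n in (1, 10, 50):
        commit_fee, second_level = force_close_estimate(n, sat_per_vbyte=24)
        print(f"{n:>3} HTLCs: commitment ~{commit_fee} sats, second level ~{second_level} sats")
```

Whoever opened the channel pays the commitment fee, but the second-level transactions are paid for by whoever has to claim the HTLCs, which is why the number of in-flight HTLCs matters even if you did not open the channel.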
Sadly, the fee rate of 24 sat/vByte did not suffice to get the close transaction confirmed in time. As explained in Elle Mouton's excellent blog series (https://ellemouton.com/posts/htlc-deep-dive/), my peer can always reveal the HTLC's preimage and claim the funds. However, my peer on the incoming side (bfx-lnd0) can also claim the funds after some timeout, which is why my node needs to make sure Gravity21 cannot claim the funds after I already lost them to bfx-lnd0. To prepare for this, lnd needed to get the close transaction confirmed and, thus, "pulled" it via the anchor output (CPFP):
2023-12-31 03:44:16.708 [INF] SWPR: Creating sweep transaction e65d88492b1344be0bf7f2b4f435e671dfa6fd68bf374b82105878c598ff00ed for 3 inputs (bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:0 (CommitmentAnchor), 0529999cd28817dd9666804d28e88c1badf55a260fd7e996677ee6df9dd7c22a:0 (TaprootPubKeySpend), 39fcf06df92450afb3766911d41689007af95fb36f0c386ba05419ed6c530d30:0 (TaprootPubKeySpend)) using 45093 sat/kw, tx_weight=956, tx_fee=0.00093401 BTC, parents_count=1, parents_fee=0.00007787 BTC, parents_weight=1288
Here's the transaction: https://mempool.space/tx/e65d88492b1344be0bf7f2b4f435e671dfa6fd68bf374b82105878c598ff00ed
Note that my node has to pay the fees for this sweep (93k sats!).
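The numbers in the sweeper log line can be checked by hand. The following Python sketch only uses values from that log line (parents_fee, parents_weight, tx_fee, tx_weight); the helper names are mine. It suggests that the reported 45093 sat/kw is the rate of the whole package (close transaction plus sweep), not of the sweep alone:

```python
# Values copied from the SWPR log line above.
PARENT_FEE_SATS = 7_787    # parents_fee=0.00007787 BTC (the close transaction)
PARENT_WEIGHT = 1_288      # parents_weight
CHILD_FEE_SATS = 93_401    # tx_fee=0.00093401 BTC (the anchor sweep)
CHILD_WEIGHT = 956         # tx_weight


def sat_per_kw(fee_sats: int, weight: int) -> float:
    """Fee rate in sats per 1000 weight units."""
    return fee_sats * 1000 / weight


def sat_per_vbyte(fee_sats: int, weight: int) -> float:
    """Fee rate in sats per vByte (1 vByte = 4 weight units)."""
    return fee_sats * 4 / weight


# Close transaction on its own: roughly the 24 sat/vByte mentioned earlier.
parent_rate = sat_per_vbyte(PARENT_FEE_SATS, PARENT_WEIGHT)

# Parent and child together, which is what miners look at for CPFP.
package_fee = PARENT_FEE_SATS + CHILD_FEE_SATS
package_weight = PARENT_WEIGHT + CHILD_WEIGHT
package_rate_kw = sat_per_kw(package_fee, package_weight)
package_rate_vb = sat_per_vbyte(package_fee, package_weight)

# The sweep alone pays a much higher rate to make up for the cheap parent.
child_rate = sat_per_vbyte(CHILD_FEE_SATS, CHILD_WEIGHT)

if __name__ == "__main__":
    print(f"parent alone: {parent_rate:.1f} sat/vByte")
    print(f"package:      {package_rate_kw:.0f} sat/kw = {package_rate_vb:.1f} sat/vByte")
    print(f"child alone:  {child_rate:.1f} sat/vByte")
```

In other words, to lift the 322 vByte close transaction from about 24 to about 180 sat/vByte, the small sweep transaction has to overpay massively on its own.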
This sweep transaction got mined a few minutes later:
2023-12-31 03:47:17.917 [INF] LNWL: Marking unconfirmed transaction e65d88492b1344be0bf7f2b4f435e671dfa6fd68bf374b82105878c598ff00ed mined in block 823653
Now lnd wants to claim the HTLC funds and fail the payment upstream to bfx-lnd0 (also to avoid a force-close from bfx-lnd0!). For this, an HTLC timeout transaction is created, which spends the HTLC output to an intermediate address, from which only my node can take funds (leaving the issue of revocation aside, nobody is cheating here):
2023-12-31 03:47:43.333 [INF] SWPR: Creating sweep transaction ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf for 3 inputs (bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2 (HtlcOfferedTimeoutSecondLevelInputConfirmed), e65d88492b1344be0bf7f2b4f435e671dfa6fd68bf374b82105878c598ff00ed:0 (TaprootPubKeySpend), ea3a137ce1beb1c5e53937d3fff1a80cb041dd1c35ffd0452da20bd3b5178202:0 (TaprootPubKeySpend)) using 37114 sat/kw, tx_weight=1300, tx_fee=0.00048248 BTC, parents_count=0, parents_fee=0 BTC, parents_weight=0
2023-12-31 03:47:43.347 [INF] NTFN: Found input bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2, spent in ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf
2023-12-31 03:47:43.347 [DBG] CNCT: Found mempool spend of HTLC output bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2 in tx=ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf
2023-12-31 03:47:43.347 [DBG] CNCT: HTLC output bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2 spent doesn't reveal preimage
2023-12-31 03:51:54.669 [INF] NTFN: Found input bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2, spent in ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf
2023-12-31 03:51:54.966 [INF] NTFN: Dispatching confirmed spend notification for outpoint=bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2, script=0 49ae7fe296198a7dcc8b50c6f6e613afc5810d49783926bdc43b3dc7a6d5948c at current height=823654: ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf[0] spending bfea88acb1b00ac48cbe6e571f41d8016012ba31813f1b7e94105a349c890d42:2 at height=823654
Here's the transaction: https://mempool.space/tx/ee070ca1a7c29ba96bb38eb4b6ac1ff76e3229033eb833e9a7b7ed7db30e15cf
Some observations:
- even though lnd created this transaction, other parts of the code still see it added to the mempool and check for any preimage being revealed
- my node has to pay the fees for this HTLC timeout transaction (48k sats!)
- the HTLC is only worth 13k sats, so paying 140k sats to avoid losing it seems pointless. However, as not claiming timed out HTLCs could lead to peers stealing lots of small-ish amounts, being strict seems to be the better approach overall.
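For reference, the fee of the HTLC timeout transaction and the total bill can be reproduced from the log values with a few lines of Python. The fee rate, weight and sweep fee are taken from the logs; the HTLC amount is only the rounded 13k sats from the block explorer, so the final ratio is approximate:

```python
# From the SWPR log line of the HTLC timeout transaction.
HTLC_TIMEOUT_RATE_SAT_PER_KW = 37_114
HTLC_TIMEOUT_WEIGHT = 1_300

# From the earlier anchor sweep (tx_fee=0.00093401 BTC).
ANCHOR_SWEEP_FEE_SATS = 93_401

# Rounded value of the HTLC output, exact amount not used here (assumption).
HTLC_AMOUNT_SATS_APPROX = 13_000

# Fee = rate (sat per 1000 weight units) * weight / 1000, rounded down to whole sats.
htlc_timeout_fee_sats = HTLC_TIMEOUT_RATE_SAT_PER_KW * HTLC_TIMEOUT_WEIGHT // 1000

total_fees_sats = ANCHOR_SWEEP_FEE_SATS + htlc_timeout_fee_sats
cost_ratio = total_fees_sats / HTLC_AMOUNT_SATS_APPROX

if __name__ == "__main__":
    print(f"HTLC timeout fee: {htlc_timeout_fee_sats} sats")
    print(f"total fees paid:  {total_fees_sats} sats")
    print(f"fees / HTLC value: ~{cost_ratio:.1f}x")
```

The computed timeout fee matches the tx_fee=0.00048248 BTC from the log, and the total is the "140k sats" figure used in this post.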
Now, after spending 93k + 48k sats, my funds are safe, but locked. In around two weeks my node will be able to claim both the channel balance and the HTLC amount, but this isn't urgent (and I tweaked my node to not spend a whole lot of sats for this rather optional step, which I believe is an ongoing issue in lnd).
As you can see, even though I didn't open the channel and my node was up and running the whole time, having an unreliable peer still caused me to spend 140k+ sats.