I'm calling on all LN devs, coders, enthusiasts, LN node operators, etc.

As node operators, we have all been in that lousy situation where some channels get force closed and, in our frustration, we dig deep into the LN logs trying to understand why it happened.
It usually looks like there was no reason, but there MUST be one. So I'm asking you to post here your reasonable explanations and causes, and what a node operator should do to see this behavior less often, or at least where to look and what to do to avoid it.
Nobody in their right mind force closes a channel manually without a reason. We all do a cooperative close if the peer is still online and we really need to close the channel.
But automated force closes are really frustrating.
Let's make a list here of all the situations and causes that make a channel force close automatically, plus any steps you know to avoid them.
We see this on many LND nodes, on a few CLN nodes, and on even fewer Eclair nodes. Possible causes:
  • some incompatibility between LN implementations?
  • different CLTV delta settings between peers?
  • too many HTLCs in flight for a long time?
  • a limit on how long a node can be offline?
  • too many commitment updates for a channel within a certain time?
  • poor reconnection behavior?
Thanks in advance to all devs for your help. I really want to find out all possible causes for this. Users need to know how to deal with it, especially noobs.
I'd like to raise awareness of https://github.com/lightningnetwork/lnd/issues/6363. We need more information to identify bugs and/or understand what either of the two nodes (or their operators) did wrong. Currently it takes a lot of log diving, which makes analysis extremely hard.
reply
Based on my experience with around 100 force closes, most of them are the result of HTLCs timing out. This happens if you start a payment and, before it settles, your peer disconnects (or you make your own node unavailable). This isn't a bug, but it's still annoying.
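To make the timeout case concrete, here is a minimal sketch in Python. The function name, the `safety_margin` parameter and the block heights are illustrative assumptions, not any implementation's real API; each implementation applies its own margins.

```python
def must_go_onchain(htlc_expiry_height: int,
                    current_height: int,
                    safety_margin: int = 0) -> bool:
    """Return True once an unresolved HTLC is at (or within `safety_margin`
    blocks of) its expiry height, i.e. the point where a node can no longer
    wait for the peer to come back and has to resolve the HTLC on-chain.

    All values are block heights; `safety_margin` is a hypothetical knob.
    """
    return current_height + safety_margin >= htlc_expiry_height


# Hypothetical HTLC that expires at block 800_040.
print(must_go_onchain(800_040, 800_000))                    # still time left
print(must_go_onchain(800_040, 800_040))                    # expiry reached
print(must_go_onchain(800_040, 800_030, safety_margin=10))  # margin reached
```

If the peer reconnects and the HTLC is settled or failed before the check turns true, the channel stays open.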
reply
I also tried limiting the maximum number of pending HTLCs in lnd.conf to 10 or 20. But sometimes your node has to be restarted, or simply stops responding, while HTLCs are in flight, and then there is practically nothing you can do about them. You try to restart the node as fast as you can so those HTLCs don't expire, but if you have to compact the db it can take some hours.
Then the question becomes: why don't those pending HTLCs wait longer for you to come back online? And how can these aspects be controlled?
reply
Yes, I saw the issue you posted and I'm looking at it more closely. This is really annoying.
reply
Most force closes happen because your peer (or you) becomes unresponsive while an HTLC is in flight. This HTLC then has to be 'settled', meaning the channel gets force closed. This can be very annoying, especially if the HTLC is too small (e.g. below the dust limit) to get its own output. In those cases, the HTLC in flight is donated to miners. There are a few proposals on how to improve on this, but generally such a mechanism should exist, or else your peers can just "steal" such small HTLCs.
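A rough sketch of the "too small to get its own output" check, in Python. The function name and all numbers are hypothetical; the rule follows my reading of BOLT #3 (an HTLC is trimmed when its amount minus the fee of its second-stage transaction falls below the dust limit), so treat it as an illustration rather than a reference.

```python
def is_trimmed(htlc_amount_sat: int,
               dust_limit_sat: int,
               htlc_tx_fee_sat: int) -> bool:
    """Return True if the HTLC would NOT get its own output on the
    commitment transaction. In that case its value is simply added to the
    transaction fee if the channel is force closed while it is pending."""
    return htlc_amount_sat - htlc_tx_fee_sat < dust_limit_sat


# Hypothetical values: 546 sat dust limit, 200 sat second-stage fee.
print(is_trimmed(300, 546, 200))     # tiny HTLC: no output of its own
print(is_trimmed(10_000, 546, 200))  # larger HTLC: gets its own output
```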
reply
So could an increased CLTV delta avoid this? By default LND uses 40, CLN uses 34, and Eclair uses 144. Those numbers are the number of blocks until an HTLC can expire (if I remember correctly). So would it be advisable for peers to have the same CLTV delta on both sides, even if they use different LN implementations? Why does each implementation have its own default CLTV delta? Why can't we have one common value for all? Are we going to end up like mobile charger connectors?
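For a feel of what those block counts mean in wall-clock time, here is a small Python calculation. It assumes the usual 10-minute average block interval and takes the defaults quoted above at face value; the dictionary and function names are made up for the example.

```python
AVG_BLOCK_MINUTES = 10  # assumption: long-run average Bitcoin block interval

# Defaults as quoted in this comment, not verified against current releases.
DEFAULT_CLTV_DELTA = {"LND": 40, "CLN": 34, "Eclair": 144}


def blocks_to_hours(blocks: int) -> float:
    """Convert a number of blocks into approximate hours."""
    return blocks * AVG_BLOCK_MINUTES / 60


for name, delta in DEFAULT_CLTV_DELTA.items():
    print(f"{name}: {delta} blocks is roughly {blocks_to_hours(delta):.1f} h")
# LND: 40 blocks is roughly 6.7 h
# CLN: 34 blocks is roughly 5.7 h
# Eclair: 144 blocks is roughly 24.0 h
```

Block times vary a lot in practice, so these are averages, not guarantees.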
reply
I'd be worried if it was an incompatibility between LN implementations, since that would further segment the network. This is a really important question to ask, because ideally we wouldn't need to force close at all.
reply
I wish I knew more about this so I could help but I'm completely ignorant. I suspect each implementation is different. I also suspect the implementation tries to do a cooperative close unless the peers can't communicate or come to an agreement.
I asked on your behalf on Bitcoin Stack Exchange to see if we can get some more answers: https://bitcoin.stackexchange.com/questions/113132/under-what-conditions-would-lightning-channels-be-force-closed-automatically
reply
LN channels can be force closed automatically for multiple reasons:
  • Stuck HTLCs and both nodes failing to cooperate
  • Channel backups or LND detecting old states
  • Failure to reach agreement on states
  • Your implementation or your peer having an error
You can see all possible reasons by searching "fail the channel" (BOLTs jargon for force closing and/or forgetting a channel) in the BOLTs repository.
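If you have a local clone of the BOLTs repository, the search can be scripted. This is only a sketch: `find_phrase` is a made-up helper name, and the `bolts` directory in the usage comment is an assumed path to your clone.

```python
from pathlib import Path


def find_phrase(repo_dir: str, phrase: str = "fail the channel"):
    """Return (file name, line number, line text) for every line in the
    Markdown files under `repo_dir` that contains `phrase`, ignoring case."""
    hits = []
    for md_file in sorted(Path(repo_dir).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if phrase.lower() in line.lower():
                hits.append((md_file.name, lineno, line.strip()))
    return hits


# Usage, assuming the repository was cloned into ./bolts:
#   for name, lineno, line in find_phrase("bolts"):
#       print(f"{name}:{lineno}: {line}")
```

Each hit points at a requirement you can read in context to see which condition triggers it.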
reply
I'm anxiously waiting for the responses.
reply
Yes, I have been reading a lot of GitHub issues from all the LN implementations over the past few days, but some of them are really technical and I can't draw simple conclusions from them.
I would like to gather all possible answers into something like a noob guide on what to do and what not to do to keep channels and nodes healthy. We have all these LN implementations and some good documentation, but almost nothing about what can cause force closures, especially at the increasing rate we are witnessing now.
reply
My suggestion to strengthen the channels is to always open a shadow 2of2 multi-sig wallet on the side, which can only ever be "closed" cooperatively. Funds can be added to it at any time, and both sides agree on balance changes (see below).
Example:
A and B have both a 1M sat LN channel and a 2of2 multi-sig wallet. Each of them holds their own BIP39 seed for that wallet, but the key is not managed by a bot or LN node that could sign anything according to the rules of the LN protocol; it is managed by humans who can work under any conditions they agree on. They start with 1M for A and 100k for B in the 2of2 multi-sig wallet.
The 1M sat LN channel was opened from B's side, so on the 2of2 multi-sig wallet A just notes (and "signs", in any way acceptable to B) that the state of the 2of2 is now 500k for A and 600k for B. B then either pays an invoice or sends 500k via keysend to A.
If the channel gets force closed because of a software issue, they can still negotiate a deal, because the 2of2 multi-sig wallet keeps them incentivized to cooperate.
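The bookkeeping in the example can be sketched in a few lines of Python. This only models the agreed balance sheet of the 2of2 wallet; the `transfer` helper is hypothetical, and nothing here signs or broadcasts anything.

```python
def transfer(balances: dict, sender: str, receiver: str, amount_sat: int) -> dict:
    """Return a new agreed balance sheet with `amount_sat` moved from
    `sender` to `receiver`. The total held in the 2of2 wallet never changes."""
    if amount_sat <= 0 or balances[sender] < amount_sat:
        raise ValueError("invalid transfer")
    updated = dict(balances)
    updated[sender] -= amount_sat
    updated[receiver] += amount_sat
    return updated


start = {"A": 1_000_000, "B": 100_000}      # initial split from the example
after = transfer(start, "A", "B", 500_000)  # A credits B 500k in the 2of2
print(after)                                # {'A': 500000, 'B': 600000}
assert sum(after.values()) == sum(start.values())
```

In return for that credit, B sends 500k sats to A over the LN channel, as described above.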
What do you think about this? I guess many are already using similar techniques, but I have never seen it described in general terms, suitable for bedtime reading. Yes, this approach adds a level of trust, but the trust is local: it is with a party you (can) know and have real-life interaction with. It does not add any general trust requirement as with banks or governments.
reply
I am not looking to change the way LN works right now, or for ways to create other types of LN or contracts. I am looking for answers to "why does a force close happen" and how the user can control it. If we have good documentation of how channels work right now, users will know what to do. My opinion is that LN works and is designed well enough, but normal users don't know exactly how it works because of the lack of information. Maybe some LN devs know these aspects very well, but the large number of noob users have no clue what they should take into consideration.
That's why I called on all LN devs here to speak loud and clear and explain why these force closures happen and what a normie can do about them.
reply
I see your point. It's just that what I propose does not change LN at all. It actually accepts it in whatever state it is right now (whether it causes a lot of unintended channel force closes or not).
reply