Hello, I was curious to explore what insights can be gained about Bitcoin's dev history from GitHub data. While there's a lot to uncover, in this first iteration I have looked at:
1: How many people have contributed to Bitcoin Core? Who are the contributors?
2: Can their contributions be quantified? Not all commits are equal, but commit counts are nevertheless a decent starting point.
3: Can GitHub data, such as commits, contributors, stars, and more, provide additional context and insights into Bitcoin's history? (WIP)
Here is a high-level summary:
1: Fewer than ~400 dev contributors so far.
2: The first five to commit to the GitHub repo:
| Login | Name | First Contribution | Contributions |
| non-github-bitcoin | Non-Github User Bitcoin Commits | 2009-08-30 03:46:39+00:00 | 271 |
| gavinandresen | Gavin Andresen | 2010-07-14 15:54:31+00:00 | 1120 |
| dooglus | Chris Moore | 2011-01-21 10:37:34+00:00 | 31 |
| luke-jr | Luke Dashjr | 2011-01-28 19:39:31+00:00 | 526 |
| mgiuca | Matt Giuca | 2011-02-25 21:45:38+00:00 | 6 |
3: Top 5 contributors by number of contributions:
| Login | Name | First Contribution | Contributions |
| laanwj | None | 2011-05-07 20:13:39+00:00 | 7370 |
| Anonymous | merge-script, MacroFake, MarcoFalke, Marco | NaT | 6272 |
| fanquake | fanquake | 2012-02-28 12:31:56+00:00 | 3904 |
| sipa | Pieter Wuille | 2011-03-17 21:51:59+00:00 | 2195 |
| achow101 | Ava Chow | 2016-01-23 13:58:17+00:00 | 1604 |
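For anyone who wants to reproduce tables like the two above, here is a minimal sketch of the aggregation, assuming you already have one `(login, timestamp)` record per commit. The sample records and the function names (`summarize`, `earliest`, `top`) are made up for illustration and are not the code behind the tables.

```python
from datetime import datetime, timezone

# Hypothetical sample records, one per commit: (login, commit timestamp).
# These are NOT real Bitcoin Core data.
commits = [
    ("alice", datetime(2011, 3, 1, tzinfo=timezone.utc)),
    ("bob", datetime(2010, 7, 14, tzinfo=timezone.utc)),
    ("alice", datetime(2012, 5, 2, tzinfo=timezone.utc)),
    ("carol", datetime(2011, 1, 21, tzinfo=timezone.utc)),
    ("alice", datetime(2010, 12, 25, tzinfo=timezone.utc)),
]

def summarize(commits):
    """Return {login: {"first": earliest timestamp, "count": number of commits}}."""
    stats = {}
    for login, when in commits:
        entry = stats.setdefault(login, {"first": when, "count": 0})
        entry["first"] = min(entry["first"], when)
        entry["count"] += 1
    return stats

def earliest(stats, n=5):
    """Logins ordered by first contribution, oldest first."""
    return sorted(stats, key=lambda login: stats[login]["first"])[:n]

def top(stats, n=5):
    """Logins ordered by number of contributions, largest first."""
    return sorted(stats, key=lambda login: stats[login]["count"], reverse=True)[:n]

stats = summarize(commits)
print(earliest(stats))
print(top(stats))
```

The same two sorts over the full commit list would give the "first five" and "top 5" views.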
Here is a chart that has data on all the contributors. Please have a look at the interactive version at https://sorukumar.github.io/plebdashboard/v0%3A%20for%20feedback/Aug042024_Bitcoin_Contributors.html The tooltip contains info about each developer and their contributions.
Here's how to read the chart:
1: The x-axis represents the year each developer made their first contribution to Bitcoin Core. For example, Pieter Wuille's first contribution was in 2011, so he's part of the "Class of 2011". The y-axis shows the number of contributions as of last week.
2: The size of each bubble is correlated with the number of contributions, so larger bubbles indicate top contributors. The tooltip also displays each developer's contributions as a percentage of the total number of commits to Bitcoin Core.
3: The color of each bubble represents the year of the developer's last contribution. This helps identify who is currently active.
4: All anonymous contributors are grouped under the "Class of 2009". Upon examining the data, I found that many of these individuals are non-anonymous developers making commits anonymously. I even spotted a few high-profile names from big tech companies.
5: The dark blue bubble is Satoshi. https://github.com/non-github-bitcoin?tab=overview&from=2017-12-01&to=2017-12-31
6: You may notice that the chart shows fewer contributors than GitHub reports. According to the GitHub API, the numbers are as follows:
| Repository | Commits | Contributors | Forks | Stars |
| bitcoin/bitcoin | 42041 | 1200 | 35988 | 77846 |
However, out of the 1200, only ~350 are identifiable. Looking at the data, I am sure the majority of these anonymous accounts are the same ~350 devs making commits anonymously. Apart from the dark blue bubble representing Satoshi, only about 1k commits come from these anonymous accounts. Remember that only v0.1.5 onward is available on GitHub. https://satoshi.nakamotoinstitute.org/code/
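To make the chart encoding concrete, here is a rough sketch of how one bubble's fields could be derived. The function name `bubble`, the field names, and the last-contribution date in the example are placeholders; using 42041 as the denominator for the percentage is also an assumption based on the GitHub API table above.

```python
from datetime import datetime, timezone

def bubble(first, last, contributions, total_commits):
    """Map one contributor to the chart fields described in the post."""
    return {
        "class_of": first.year,          # x-axis: year of first contribution
        "contributions": contributions,  # y-axis and bubble size
        "share_pct": round(100 * contributions / total_commits, 2),  # tooltip
        "last_active": last.year,        # bubble color
    }

# First-contribution date and count taken from the tables in the post.
# The last-contribution date is a placeholder, not real data.
example = bubble(
    first=datetime(2011, 3, 17, tzinfo=timezone.utc),
    last=datetime(2024, 1, 1, tzinfo=timezone.utc),
    contributions=2195,
    total_commits=42041,
)
print(example)
```

With those inputs the contributor lands in the "Class of 2011" with a share of roughly 5% of all commits.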
I ended up doing hours of work to manage data quality issues (account type: user/anonymous), so let me know if I missed anything obvious.
Also, let me know if you would like anything added to the analysis/data.
@sorukumar 🙏🙏
Did you distinguish merge commits from other commits?
No. It would be a good next step.
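One possible way to make that split, sketched under the assumption that the commit log is exported with `git log --format="%H %P"` (commit hash followed by its parent hashes): a commit with more than one parent is a merge. The function name `split_merges` and the short hashes in the sample are made up.

```python
def split_merges(log_text):
    """Split commits into (merges, regular) based on parent count.

    log_text is expected to look like the output of
    `git log --format="%H %P"`: one commit per line, the commit hash
    first, then zero or more parent hashes.
    """
    merges, regular = [], []
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        commit, parents = fields[0], fields[1:]
        if len(parents) > 1:
            merges.append(commit)
        else:
            regular.append(commit)
    return merges, regular

# Hypothetical log with shortened, made-up hashes.
sample_log = "\n".join([
    "d4 c3 b2",  # two parents: a merge commit
    "c3 a1",
    "b2 a1",
    "a1",        # root commit, no parents
])

merges, regular = split_merges(sample_log)
print(len(merges), "merge commits,", len(regular), "other commits")
```

For a quick count without any parsing, `git rev-list --count --merges HEAD` and `git rev-list --count --no-merges HEAD` should give the two totals directly.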
Additionally, I am debating whether it makes sense to break commits into components like wallet, consensus, networking, GUI, and others. It would add insight into the kind of work that is being done. What could those components be? Then I need to figure out a way to group commits by component from the commit log.
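A first pass at the grouping could go by the file paths each commit touches, for example from `git log --name-only`. The sketch below assumes a hand-written prefix-to-component mapping; the prefixes are my guess at the repo layout and would need review by someone who knows the code base, and the sample commits are invented.

```python
from collections import Counter

# ASSUMED mapping from path prefix to component; a starting guess only.
COMPONENT_PREFIXES = [
    ("src/wallet/", "wallet"),
    ("src/qt/", "GUI"),
    ("src/consensus/", "consensus"),
    ("src/net", "networking"),
    ("test/", "tests"),
]

def component_for(path):
    """Return the first component whose prefix matches the path."""
    for prefix, name in COMPONENT_PREFIXES:
        if path.startswith(prefix):
            return name
    return "other"

def components_for_commit(paths):
    """A commit can touch several components; return each one once."""
    return sorted({component_for(p) for p in paths})

def tally(commits):
    """Count how many commits touch each component."""
    counts = Counter()
    for paths in commits.values():
        counts.update(components_for_commit(paths))
    return counts

# Hypothetical commits: made-up hash -> list of touched file paths.
sample_commits = {
    "a1": ["src/wallet/wallet.cpp", "test/functional/wallet_basic.py"],
    "b2": ["src/net.cpp"],
    "c3": ["src/qt/bitcoingui.cpp", "README.md"],
}

print(tally(sample_commits))
```

Counting a commit once per component it touches avoids inflating the numbers for commits that change many files in the same area.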
A lot of coding around testing... in Python, which is an interesting choice given the code base is C++.
Is this always the case for large C++ projects?
No, that would certainly not always be the case. In this case, it’s used for the tests that treat Bitcoin Core as a black box, especially the functional tests, and benchmarks that call RPC commands of the binary and expect specific outcomes. It’s just simpler to use a scripting language for these tests.
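To illustrate the black-box idea, here is a generic sketch, not the actual Bitcoin Core test framework. The test only builds a JSON-RPC request and inspects the reply; `fake_node` is a stand-in transport I made up so the example runs without a node, and a real test would send the same body to a running binary instead.

```python
import json

def rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC request body for the given RPC method."""
    return json.dumps({
        "jsonrpc": "1.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    })

def check_block_count(send):
    """Black-box check: only the RPC reply is inspected, never internals."""
    reply = json.loads(send(rpc_request("getblockcount")))
    assert reply["error"] is None
    assert isinstance(reply["result"], int) and reply["result"] >= 0
    return reply["result"]

def fake_node(body):
    """Hypothetical stand-in for a node; returns a canned reply."""
    request = json.loads(body)
    assert request["method"] == "getblockcount"
    return json.dumps({"result": 0, "error": None, "id": request["id"]})

height = check_block_count(fake_node)
print("block count:", height)
```

Because the test never links against the C++ code, the language it is written in does not have to match the code base.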
gotcha
Thank you for sharing.
Wow great research, thanks for sharing as I wasn’t aware!
Thanks to all the significant contributions to the network! Wow - that is a lot of commits!!!
This is high level stuff. Bitcoin noobies like me are not quite qualified to contribute to this awesome initiative of yours - or am I wrong to think that?
This doesn’t require any Bitcoin skills. Just data analysis of commits in a git repo.
yes.
If it interests you, we can certainly do more. The next step could be log analysis, and this is a useful link I found with a quick search. We can try any of the tools/libs that look useful. https://livablesoftware.com/tools-mine-analyze-github-git-software-data/
Got it. Thanks for the link.
Thanks for that!
really cool!
Is there any stacker among them?
Yes ;)
Who? You're not there? And I see someone zapped you 500 Sats for a simple yes! :) and nothing for the greatest of all questions! ;)
I think I'm on there. Class of 2015. https://m.stacker.news/46142
I wonder why in the viz you are in the class of 2014.
https://m.stacker.news/46226
I opened my first pull request in 2014, but it got merged in 2015. Maybe you are counting by the date the PR was opened?
@CHADBot /eli5
You have summoned CHADBot. Please zap this post 21 sats to receive service.
Made with 🧡 by CASCDR
Carloschida was trying to figure out more about Bitcoin's past by looking at its history on Github, a site where people share and work on code together. Here are some of the things he found out:
BTC present and future 🤑