
Research in Public #02: Posting fees affect post quantity

Introduction

In #1243188 I suggested to @k00b that we engage in a research project using SN data. The idea would be to use this data to study: A) how micropayments with real money affect internet discourse; and B) barriers to the adoption of self-custody. I also promised @Undisciplined that I'd carry out the research in public, since many people might not know what economics research looks like and may be curious about how the process plays out. You can follow all of the updates here.

Reconstructing territory fee histories

Today I decided to look at how territory fees affect the number of posts. Are users sensitive to posting fees, or do they not care?
To do this, I had to reconstruct the posting fee history of each territory. As I mentioned in #1250442, this is harder than it sounds because the database is not designed to keep historical records. AFAIK, there is no table that records the entire history of posting fees for each territory. Thus, the history needs to be inferred from the costs that users actually paid to post in each territory.
This was, again, quite a difficult task. Posting costs can be modified for a variety of reasons. For example, if a user posts multiple times within a 10-minute window, their cost increases by 10x for each additional post. The database only records the total cost of the post, not a breakdown of why the cost was what it was. As another example, posts can be charged extra if they include large file uploads. There were other oddities as well, like when the founder of a territory temporarily reduces the fee for their own regular posts, then raises it again. Thus, to properly get the historical posting fee for a territory in a particular period, I had to somehow account for all the reasons the paid fee might differ from the base cost.
After struggling with this for a while, I think I was able to reconstruct a fairly accurate history of each territory's daily posting fees. The histories for ~bitcoin, ~meta, ~econ, and ~BooksAndArticles are shown below. I verified with their founders that the histories look accurate.
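To give a flavor of the inference problem, here is a minimal sketch of one possible heuristic: treat the most common cost paid in a territory on a given day as that day's base fee. This is only an illustration and not necessarily what the code in the repo does. The tuple layout, the function name `infer_daily_fees`, and the sample costs are all made up for the example.

```python
from collections import Counter, defaultdict

def infer_daily_fees(posts):
    """posts: iterable of (territory, day, cost_paid) tuples (hypothetical layout).
    Returns {(territory, day): inferred base fee}."""
    costs = defaultdict(list)
    for territory, day, cost in posts:
        costs[(territory, day)].append(cost)

    fees = {}
    for key, paid in costs.items():
        counts = Counter(paid)
        top = max(counts.values())
        # Use the most common cost; break ties with the smaller value,
        # on the assumption that surcharges push costs up, not down.
        fees[key] = min(c for c, n in counts.items() if n == top)
    return fees

# Made-up example data, not real SN records.
sample = [
    ("bitcoin", "2025-09-01", 100),
    ("bitcoin", "2025-09-01", 100),
    ("bitcoin", "2025-09-01", 1000),  # repeat post inside the 10-minute window
    ("bitcoin", "2025-09-02", 100),
    ("bitcoin", "2025-09-02", 250),   # hypothetical upload surcharge
]
fees = infer_daily_fees(sample)
print(fees)
```

A heuristic like this would still be fooled by the oddities described above, such as a founder's temporary fee cut on a day when they are the main poster, so it would need manual checks on top.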

Do posting fees affect post quantity?

The next natural question to ask is whether posting fees affect post quantity. Once the posting fee histories were constructed, this was pretty easy to do (thus illustrating a key principle in empirical research: the actual statistical modeling is the easy part; getting the data in the right format is the hard part!).
For each contiguous period of time in which a territory had the same posting fee (called a spell), I counted the number of posts made in that territory over that time, and divided by the number of days to get the posts per day. I then made a scatter plot where each dot is a spell, and the x-axis measures the posting fee during that spell and the y-axis measures the posts-per-day during that spell. The chart is shown below:
Each dot is a spell of contiguous posting fees in a territory
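The spell construction described above can be sketched roughly as follows. This assumes one row per calendar day for a single territory, already sorted by date and including days with zero posts. The function name `build_spells` and the sample numbers are hypothetical.

```python
from itertools import groupby

def build_spells(daily):
    """daily: list of (day_index, fee, n_posts) for ONE territory,
    sorted by day, with one row per day (zero-post days included).
    Returns one dict per spell of contiguous days with the same fee."""
    spells = []
    # groupby only merges consecutive rows with equal keys, so a fee that
    # returns after a change starts a new spell rather than extending the old one.
    for fee, rows in groupby(daily, key=lambda r: r[1]):
        rows = list(rows)
        n_days = len(rows)
        n_posts = sum(r[2] for r in rows)
        spells.append({"fee": fee, "days": n_days,
                       "posts_per_day": n_posts / n_days})
    return spells

# Made-up daily series: fee 10 for two days, 100 for three days, then 10 again.
daily = [(0, 10, 8), (1, 10, 6), (2, 100, 3), (3, 100, 1), (4, 100, 2), (5, 10, 7)]
spells = build_spells(daily)
for s in spells:
    print(s)
```

Each resulting dict corresponds to one dot in the scatter plot: `fee` on the x-axis and `posts_per_day` on the y-axis.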
There is a clear downward correlation between posting fee and posts-per-day. I also highlighted spells from the top 5 territories from the month of September, illustrating that generally the correlation holds even for fee changes within the same territory. The only one here that doesn't seem to go in the right direction is ~Stacker_Sports. Maybe @grayruby raises the fee when the territory becomes popular? Haha, I'm not actually suggesting that, but it does highlight that the negative correlation is not a given, since reverse causality might imply an opposite relation.
A log-log regression of the data results in the equation $\log_{10} Y = 0.41 - 0.18 \log_{10} X$, where $Y$ is posts per day and $X$ is posting fee. These numbers imply that if you 10x the posting fee in your territory, you can expect about 33% fewer posts per day. The results suggest that users are indeed sensitive to posting fees.
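For anyone who wants to see where a number like that comes from, here is a small sketch of a log-log least-squares fit and the implied effect of a 10x fee increase. The data points are synthetic, generated directly from the reported equation, so this only demonstrates the arithmetic and does not reproduce the actual estimate. The function names are made up for the example.

```python
import math

def fit_loglog(fees, posts_per_day):
    """Ordinary least squares of log10(posts_per_day) on log10(fee)."""
    xs = [math.log10(f) for f in fees]
    ys = [math.log10(p) for p in posts_per_day]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return intercept, slope

def pct_change_for_10x(slope):
    """Percent change in posts per day when the fee is multiplied by 10."""
    return (10 ** slope - 1) * 100

# Synthetic points lying exactly on the reported line (not real data).
fees = [1, 10, 100, 1000]
ppd = [10 ** (0.41 - 0.18 * math.log10(f)) for f in fees]

intercept, slope = fit_loglog(fees, ppd)
change = pct_change_for_10x(-0.18)
print(round(intercept, 2), round(slope, 2), round(change, 1))
```

With the rounded slope of -0.18, a 10x fee increase multiplies posts per day by 10^(-0.18), a drop of roughly a third. Any small gap from the figure quoted above presumably comes from rounding of the published coefficients.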

Next steps

To be honest, I was a little bit surprised by this result. I wasn't sure that we'd find anything because the fees are pretty small (in terms of real purchasing power). Yet, it does seem that users are responsive to even these very small changes in fees. This implies that micro-incentives really might matter for how people behave online, which is a promising sign for the rest of the research project.
As a next step, it might be interesting to see whether the dollar value of the fees matters more than the sats value. I haven't figured out exactly how to do that yet.
Another next step would be to check whether posting fees are correlated with the quality, not just the quantity, of posts.
Anyway, that's all I have for today. Anyone who wants to vet the code can go to https://github.com/ed-kung/sn-research. I'll keep posting any time I spend a day doing substantial work on this project.
50 sats \ 1 reply \ @optimism 28m
This made me think about the median zap vs fee. Unless something is truly slop, I baseline zap ~AI posts. I'm quite sure that others like @siggy47, @grayruby and @Undisciplined do the same on their territories. This may also affect the volume?
reply
Yeah, it's quite possible that users pick up on the generosity of the territory founders too. Would be interesting to explore that as well.
reply
You're getting to this in one of your next steps, but as I'm sure you expected I have questions about bias and endogeneity.
It helps that you can show the relationship holds within territories, so we know it's not just some artifact of composition. However, there still are lots of other things going on.
  1. Reverse causality: Some territories might raise their fees because they have low volume but expect that volume to be inelastic. I think this describes what @realBitcoinDog did with ~HealthAndFitness and it's what @jeff was doing with ~econ before he handed it over to me.
  2. Spurious correlation: There may have simply been lower fees when territories were first introduced, which corresponded to a busier time at SN. During the early price discovery process founders were fairly slow to raise fees because no one had any idea what the optimal fees were. Unless somewhat higher fees caused the decline of SN activity (maybe they did!), this is a problem.
  3. Selection: Territories have different focuses. ~news is geared towards higher volume shorter posts, while ~mostly_harmless is for more thoughtful content. I suspect the intention of the territory is doing work in addition to the fees.
I'm sure we'll both think of more, but that's enough to start with.
Some sort of diff-in-diff design could get at the reverse causality issue.
You might be able to use a relative measure of fees and activity to check the spurious correlation problem.
I bet something as simple as territory fixed effects will take care of the selection issue.
reply
Yeah next up is probably a two way fixed effects regression with time effects and territory effects. Can throw in bitcoin price as a regressor as well.
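For readers unfamiliar with the term, a linear two-way fixed effects regression with one regressor can be sketched with the "double demeaning" trick below. This assumes a balanced territory-by-week panel (every territory observed in every week). The panel here is synthetic, built with a known slope of -0.18 so the estimator can be checked, and `twfe_slope` is a made-up name for the example.

```python
from collections import defaultdict

def twfe_slope(panel):
    """panel: list of (territory, week, x, y) rows forming a BALANCED panel.
    Returns the two-way fixed effects (within) estimate of the slope on x."""
    def group_means(idx, col):
        sums, counts = defaultdict(float), defaultdict(int)
        for row in panel:
            sums[row[idx]] += row[col]
            counts[row[idx]] += 1
        return {k: sums[k] / counts[k] for k in sums}

    n = len(panel)
    demeaned = {}
    for col in (2, 3):  # column 2 is x, column 3 is y
        unit_mean = group_means(0, col)
        time_mean = group_means(1, col)
        grand = sum(r[col] for r in panel) / n
        # Subtract territory and week means, add back the grand mean.
        demeaned[col] = [r[col] - unit_mean[r[0]] - time_mean[r[1]] + grand
                         for r in panel]
    xt, yt = demeaned[2], demeaned[3]
    return sum(a * b for a, b in zip(xt, yt)) / sum(a * a for a in xt)

# Synthetic panel: y = territory effect + week effect - 0.18 * x (no noise).
territory_fx = {"a": 1.0, "b": 2.5, "c": -0.5}
week_fx = {0: 0.0, 1: 0.3, 2: -0.2, 3: 0.7}
panel = []
for i, (terr, a) in enumerate(territory_fx.items()):
    for week, g in week_fx.items():
        x = float((i + 1) * (week + 1))  # varies across both dimensions
        panel.append((terr, week, x, a + g - 0.18 * x))

slope = twfe_slope(panel)
print(round(slope, 4))
```

One caveat in this setup: a regressor that only varies by week, such as a weekly bitcoin price, would be collinear with the week effects, so it would need to enter some other way (for example, interacted with a territory characteristic).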
If, as I'm imagining it, it'll be a territory-by-week panel, there will probably be lots of zeros. I wonder if you know how to efficiently run a tobit model with lots of fixed effects.
reply
I'd probably start with the subsample of territories that have posts almost every week.
Have you looked at the distribution of post output by territory? I suspect the territories that have posts every week account for 95%+ of total posts.
I forget exactly when asinh has problems, but it can perform more or less like a log that accepts zeros.
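A quick numerical check of the asinh point, as a sketch: asinh(x) = ln(x + sqrt(x^2 + 1)), which is defined at zero and approaches ln(2x) as x grows. The helper name `gap_to_log` is made up for the example.

```python
import math

def gap_to_log(v):
    """Absolute difference between asinh(v) and log(2v), for v > 0."""
    return abs(math.asinh(v) - math.log(2 * v))

# asinh is defined at zero, unlike log.
zero_value = math.asinh(0)

# The gap to log(2v) shrinks quickly as v grows.
for v in [1, 10, 100, 1000]:
    print(v, round(math.asinh(v), 4), round(math.log(2 * v), 4), gap_to_log(v))
```

The approximation is poor for small values, where asinh is close to linear. As far as I know, that is where the usual concerns come in: with many small counts, estimates can depend on the units in which the outcome is measured.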
reply