Research in Public #04: Post fee influences post quality

Introduction

In #1243188 I suggested to @k00b that we engage in a research project using SN data. The idea would be to use this data to study: A) how micropayments with real money affect internet discourse; and B) barriers to the adoption of self-custody. I also promised @Undisciplined that I'd carry out the research in public, since many people might not know what economics research looks like, and may be curious as to how the process plays out. You can follow all of the updates here.

Recap: Territory posting fees affect quantity of posts

Last time (#1253062), I used a two-way fixed-effects regression to show that territories with lower posting fees attracted a greater number of posts.
  • The correlation could not be explained by territory subject matter, since the model allowed every territory to have a different baseline. Thus, the relationship is identified off of changes to territory posting fees and the change in subsequent quantity of posts.
  • The correlation could not be explained by global time trends in SN usage, since the model allowed every week to have a different baseline effect. Thus, the relationship is also identified off differences in the size and timing of posting fee changes in different territories.
The results were both surprising and unsurprising. They're unsurprising because demand slopes downwards. But they're surprising because the costs are small, yet people are still demonstrably responsive to them.

What about post quality?

Ok, so territory posting fees affect the quantity of posts. Do they also affect the quality of posts? As it turns out, the answer seems to be yes.

Measuring quality

First, let me explain how to measure post quality. I use two methods:
  1. The sat value of zaps within 48 hours of posting,
  2. The number of comments within 48 hours of posting.
I use a 48-hour window so that older posts do not have an advantage over newer posts simply for having been around longer. Little is lost by doing so: 95% of zaps and 90% of comments are earned in the first 48 hours.
In addition to applying the 48-hour window, I exclude:
  • Posts from the ~AMA or ~jobs territories. ~AMA seems to operate on a different set of incentives: it attracts by far the largest number of zaps, but also has an especially high posting cost. ~jobs probably also operates on very different incentives from normal posts.
  • Posts where the poster is the territory owner.
  • Bios, freebies, and territory-less posts such as Saloon posts.
The resulting dataset contains about 180,000 posts, with which we can regress post quality on posting cost.
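To make the sample construction concrete, here is a minimal sketch of the 48-hour zap total and the exclusion rules described above. It is not the code from the project repository. The record layout and every field name (`created_at`, `territory`, `territory_owner`, `is_bio`, `is_freebie`) are hypothetical, and dropping posts that are less than 48 hours old is my assumption about what a complete window requires.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=48)
EXCLUDED_TERRITORIES = {"AMA", "jobs"}


def sats_within_window(post_time, zaps):
    """Sum the sat value of zaps that arrive within 48 hours of posting.

    `zaps` is a list of (zap_time, sats) pairs (hypothetical layout).
    """
    cutoff = post_time + WINDOW
    return sum(sats for zap_time, sats in zaps if post_time <= zap_time <= cutoff)


def keep_post(post, now):
    """Return True if a post record survives the exclusions described above."""
    if now - post["created_at"] < WINDOW:
        return False  # assumption: the post needs a full 48 hours of history
    if post["territory"] is None:
        return False  # territory-less posts such as Saloon posts
    if post["territory"] in EXCLUDED_TERRITORIES:
        return False  # ~AMA and ~jobs run on different incentives
    if post["is_bio"] or post["is_freebie"]:
        return False
    if post["user"] == post["territory_owner"]:
        return False  # poster owned the territory when the post was made
    return True


# Toy usage with made-up records.
now = datetime(2025, 1, 10)
t0 = datetime(2025, 1, 1)
example_post = {
    "created_at": t0,
    "territory": "bitcoin",
    "territory_owner": "bob",
    "user": "alice",
    "is_bio": False,
    "is_freebie": False,
}
example_zaps = [(t0 + timedelta(hours=1), 100), (t0 + timedelta(hours=72), 50)]

print(keep_post(example_post, now))
print(sats_within_window(t0, example_zaps))
```

In the toy usage, the second zap arrives 72 hours after posting, so only the first one counts toward the 48-hour total.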

Posting fees and zaps

I run the following regression:
\log Y_{istu} = \beta \log X_{istu} + \mu_{s} + \delta_{t} + \eta_{u} + \epsilon_{istu}
where i indexes a post, s indexes the territory that the post was made in, t indexes the week when the post was made, and u indexes the user who made the post. Y_{istu} is the sat-value of zaps in the first 48 hours, and X_{istu} is the posting cost paid.
  • \mu_{s} allows for arbitrary baseline differences in post quality across territories
  • \delta_{t} allows for arbitrary time trends in post quality across weeks
  • \eta_{u} allows for arbitrary baseline differences in post quality across users
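For readers who want to see what "allowing arbitrary baselines" means mechanically, here is a small sketch of one standard way to estimate \beta with several fixed effects: repeatedly demean both variables within each grouping, then regress the demeaned outcome on the demeaned cost. This illustrates the technique on synthetic data and is not the code from the project repository. The function names and the toy numbers are my own.

```python
from collections import defaultdict


def demean_by(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def absorb_fixed_effects(values, group_columns, sweeps=50):
    """Alternate demeaning over every grouping to remove all fixed effects."""
    out = list(values)
    for _ in range(sweeps):
        for groups in group_columns:
            out = demean_by(out, groups)
    return out


def fe_slope(log_y, log_x, group_columns):
    """Slope of log_y on log_x after absorbing the fixed effects."""
    y = absorb_fixed_effects(log_y, group_columns)
    x = absorb_fixed_effects(log_x, group_columns)
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Synthetic balanced panel: 3 territories x 4 weeks, true slope 0.25, no noise.
territory = [s for s in range(3) for t in range(4)]
week = [t for s in range(3) for t in range(4)]
mu = [1.0, -2.0, 0.5]         # made-up territory baselines
delta = [0.0, 0.3, -0.1, 0.7]  # made-up week baselines
log_x = [float(s * t) for s, t in zip(territory, week)]
log_y = [0.25 * x + mu[s] + delta[t] for x, s, t in zip(log_x, territory, week)]

beta = fe_slope(log_y, log_x, [territory, week])
print(round(beta, 6))
```

Because the territory and week baselines are removed before the slope is computed, the estimate recovers the slope used to generate the data no matter what values `mu` and `delta` take. Adding a user grouping to `group_columns` would absorb \eta_{u} in the same way.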
The regression results, with various inclusions of the fixed effects, are shown here:
===================================================================================================
                                                  Dependent variable:                              
                    -------------------------------------------------------------------------------
                                                      log_sats48                                   
                            (1)                 (2)                 (3)                 (4)        
---------------------------------------------------------------------------------------------------
log_cost                 0.335***            0.226***            0.237***            0.104***      
                          (0.003)             (0.003)             (0.007)             (0.007)      
                                                                                                   
Constant                 2.775***                                                                  
                          (0.008)                                                                  
                                                                                                   
---------------------------------------------------------------------------------------------------
Territory FE                 N                   Y                   Y                   Y         
Week FE                      N                   N                   Y                   Y         
User FE                      N                   N                   N                   Y         
Observations              179,653             179,653             179,653             179,653      
R2                         0.079               0.159               0.181               0.406       
Adjusted R2                0.079               0.158               0.180               0.390       
Residual Std. Error 2.268 (df = 179651) 2.167 (df = 179539) 2.139 (df = 179313) 1.846 (df = 174835)
===================================================================================================
Note:                                                                   *p<0.1; **p<0.05; ***p<0.01
The results show that posting fees are positively associated with zaps, even after controlling for arbitrary territory effects, time trends, and user effects.
  • The relationship cannot be explained by territories with higher posting costs having subject matter that is zapped more frequently. This is accounted for by the territory fixed effects.
  • The relationship cannot be explained by global time trends on SN. This is accounted for by the week fixed effects.
  • The relationship cannot be explained by better users preferring to post in territories with higher costs. This is accounted for by the user fixed effects.
My preferred specification is actually model (3), which doesn't include user fixed effects. After all, isn't part of setting territory fees about what kinds of users you attract? The coefficient of 0.237 in model (3) implies that posts that paid double the posting cost can expect to have 18% more zaps within the first 48 hours of posting (since 2^0.237 ≈ 1.18).
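Because both sides of the regression are in logs, the coefficient is an elasticity, and the effect of multiplying the cost by some factor is that factor raised to the coefficient. A minimal sketch of the arithmetic (the function name is mine, for illustration only):

```python
def pct_change_from_elasticity(beta, cost_ratio):
    """Percent change in the outcome implied by a log-log coefficient
    when the cost is multiplied by `cost_ratio`."""
    return (cost_ratio ** beta - 1.0) * 100.0


# Model (3) zap coefficient from the table above, with the posting cost doubled.
zap_effect = pct_change_from_elasticity(0.237, 2.0)
print(f"{zap_effect:.1f}% more zaps when the posting cost doubles")
```

The same function works for any other fee change, for example `cost_ratio=10.0` for a tenfold fee.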
You might be interested in the territory fixed effects, to see which territories attract the most zapped posts. Here they are:
[Chart: territory fixed effects for zaps; territories with a minimum of 30 posts]

Posting fees and comments

I run the same regressions above, but with number of comments in the first 48 hours as the dependent variable instead of zaps. The results are as follows:
===================================================================================================
                                                  Dependent variable:                              
                    -------------------------------------------------------------------------------
                                                    log_ncomments48                                
                            (1)                 (2)                 (3)                 (4)        
---------------------------------------------------------------------------------------------------
log_cost                 0.065***            0.042***            0.059***            0.025***      
                          (0.001)             (0.001)             (0.003)             (0.003)      
                                                                                                   
Constant                 0.644***                                                                  
                          (0.004)                                                                  
                                                                                                   
---------------------------------------------------------------------------------------------------
Territory FE                 N                   Y                   Y                   Y         
Week FE                      N                   N                   Y                   Y         
User FE                      N                   N                   N                   Y         
Observations              179,653             179,653             179,653             179,653      
R2                         0.017               0.107               0.117               0.351       
Adjusted R2                0.017               0.106               0.115               0.334       
Residual Std. Error 0.971 (df = 179651) 0.926 (df = 179539) 0.921 (df = 179313) 0.799 (df = 174835)
===================================================================================================
Note:                                                                   *p<0.1; **p<0.05; ***p<0.01
Again using my preferred specification (3), the coefficient of 0.059 implies that posts that paid double the posting cost can be expected to get 4.2% more comments in the first 48 hours (since 2^0.059 ≈ 1.042).
Again, the relationship cannot be explained by baseline territory differences or time differences, and it remains positive in model (4), which also controls for user differences.
Here is a chart of the territory fixed effects for comments, so you can see which territories attract the most comments:
[Chart: territory fixed effects for comments; territories with a minimum of 30 posts]

Interpretation?

What's driving the result? As I already mentioned, it can't be because higher cost territories simply have better subject matter. It also can't be because of anything happening across time to SN as a whole. And lastly, it can't be because better users tend to self-select into higher cost territories.
Instead, the result may be driven by one or more of the following channels:
  1. High posting fees discourage low quality posts by bots
  2. Users want to post in relevant territories, but if the territory cost is high, they won't post there unless they know the content is good
  3. Users are willing to pay extra fees (e.g. 10x fees for posting twice in 10 minutes) if they know their content is good
  4. Territory founders who run high-cost territories are more generous with zaps (I realized that the territory fixed effects don't rule this out, because founders can change throughout a territory's lifetime).
Channels 1-3 are consistent with the hypothesis that pay-to-post increases post quality. Channel 4 isn't, because it is less about post quality and more about the behavior of the territory founder. In future work I will need to try to rule out channel 4, probably using a founder fixed effect.

Next steps

As mentioned above, an immediate next step is to re-try these regressions with founder fixed effects.
Another interesting question is whether the moneyness of the zaps really matters. That is, does it matter that zaps are real money? Or would users be responsive to these micro-incentives even if they were just points on a scoreboard (e.g. Reddit reputation)? It's plausible that the moneyness doesn't matter, and that people treat SN as a game to earn points. But it's also plausible (and more likely) that the moneyness does matter. This may be hard to test, but I think it's a worthy question to explore.
Anyway that's all I have for today. Anyone who wants to vet the code can go to https://github.com/ed-kung/sn-research. I'll keep posting any time I spend a day doing substantial work on this project.
Pretty cool! Indeed, never saw econ research being done, much less so "in public".
Your interpretation resonates with my personal experience.
If I put a lot of effort into a post, I know that the posting fee will be offset by the zaps. And even if it doesn't, I'm just happy to show the work I've done and I'm ready to pay for it.
On the other hand, when I share a simple link with a few quick comments and quotes, it takes me a few minutes, and I look for the cheaper territory to post it. If I don't find any cheap territory, I'd rather not post, as I don't think paying to post for links is worth the same as paying to post for actual POW.
Did you differentiate between the lower effort "link" posts from the higher effort "discussion" posts in your analysis?
reply
Pretty cool! Indeed, never saw econ research being done, much less so "in public".
One thing that makes me feel very good about this project is that I'm doing it in a way that I've long thought research should be done... in public, with open-source everything. It's my first time doing research like this and it's refreshing. It helps that so far the results have actually come out pretty smoothly haha.
Did you differentiate between the lower effort "link" posts from the higher effort "discussion" posts in your analysis?
I didn't, but that's a nice idea to try!
reply
I do hope you get at least some messy and confusing results. That'll really help people see how the sausage is made.
reply
30 sats \ 1 reply \ @adlai 11h
whether the moneyness of the zaps really matters. That is, does it matter that zaps are real money? Or would users be responsive to these micro-incentives even if they were just points on a scoreboard? (i.e. Reddit reputation, etc.) It's plausible that the moneyness doesn't matter, and that people treat SN as a game to earn points. But it's also plausible (and more likely) that the moneyness does matter. This may be hard to test
is any of the data about CCs vs Sats publicly readable?
reply
It's a bit tricky to break down sat vs CC zaps, but I think it's doable. I haven't done the work to tease them out yet, though.
reply
How much is too much?
reply
Sorry, I didn't understand the meme. Are you asking how much posting fee is too much?
reply
Yes please. How much posting fees, is too much?
reply
Not really sure. Depends on your objectives. Curiously, it seems that the combined elasticity of posting costs on quantity of posts (-0.246, #1253062) and on zaps per post (+0.237, this post), are pretty close in magnitude. It may net out to nearly zero in terms of total sats earned on posts.
@Undisciplined, I think you might be interested in the above
reply
That fits with what some of us have noted anecdotally. It's very strange that total amount zapped would be conserved.
We haven't really thought about zapper behavior though. If stackers have a relatively fixed zapping budget, then they might spread them out in such a manner.
reply
I share your preference for the third specification, but it's nice to have the results from the fourth to gauge how large that effect is. To that end, people probably don't write the same kinds of posts in totally different territories, and even if they do there are different readers, so you might want to use pairwise user-territory fixed effects in place of the two separate fixed effects.
It's also a little confusing to focus on the results from specification 3, but then also state that the results aren't due to user differences.
I doubt the territory founder issue will be significant, since so few territories have changed hands. I was curious though if you excluded my ~econ posts from before I was the owner: i.e. is that designation static or dynamic?
You also might want to highlight how large these R-squareds are and what that implies for people who aren't trained in this stuff. It's interesting that so much variation is explained by these few variables.
Also, also, I was struck by how fat the tails are on the zap and comment distributions.
Again, this is a super cool project and I think you already have some very noteworthy results.
reply
Again, this is a super cool project and I think you already have some very noteworthy results.
Imagine that, SN data being used for a future peer-reviewed paper~~ pinnacle of mainstream-ness.
reply
I doubt the territory founder issue will be significant, since so few territories have changed hands. I was curious though if you excluded my ~econ posts from before I was the owner: i.e. is that designation static or dynamic?
It's dynamic. So, the posts you made while you didn't own ~econ are included in the regression.
You also might want to highlight how large these R-squareds are and what that implies for people who aren't trained in this stuff. It's interesting that so much variation is explained by these few variables.
The high R squareds are mainly due to the fixed effects. Week/territory/user results in a pretty saturated model already.
It's also a little confusing to focus on the results from specification 3, but then also state that the results aren't due to user differences.
Thanks for the feedback. I'll definitely tighten up the writing for an actual manuscript, but for now I'm just trying to mind-dump the results as fast as I can haha.
Again, this is a super cool project and I think you already have some very noteworthy results.
Again, thanks. I'm actually very surprised at how easily the results have flowed out. Pretty much every specification I've tried has worked from the start. As I'm sure you know, that's not the case for most research projects.
reply
As I'm sure you know, that's not the case for most research projects.
Haha. Yeah, this actually happened on one of the most recent things I worked on and it was awesome.
reply