Fatal flaw found in Ederer et al. (2023)
“Expect a lot more [doxxing] to start flowing, very soon!”
This is Part 4 in an ongoing investigation.
To get you up to speed, I had ChatGPT generate summaries for the first three parts.
Three Yale professors have written a research paper investigating EJMR, using computational methods to geolocate the majority of EJMR posts. The findings reveal that 10% of posts originate from universities, including top-ranked institutions in the US. This paper has generated controversy, with some accusing the authors of intending to "doxx" (reveal the identities of) anonymous EJMR users. Critics of EJMR argue that its negativity has detrimental effects on the field, while defenders claim it's a necessary platform for free speech and academic discourse.
Yale University researchers used a GPU-based software to compute the SHA-1 hashes and developed a multi-step procedure to identify IP addresses, exploiting a cybersecurity mistake. The researchers identified EJMR's anonymous owner's IP address in Leeds, UK. After the information leaked online, it caused significant controversy, with critics alleging ethical and legal violations, and expressing concern over potential harm and loss of anonymity for users.
The discovery of a link between EJMR and Mastodon, both associated with Yale, risks exposing users' identities. Accusations of hacking, likened to illegal wiretapping, have arisen due to Yale's exploitation of a security vulnerability to gain IP data. This activity has ignited debate about the boundaries of power, privacy, and legality in academia and digital platforms. Ironically, those advocating for 'doxxing' EJMR users may face exposure themselves. Minors under age 18 could be IDed using the Mastodon data.
Fittingly, since Mastodon users just had their data hacked, Part 4 is sponsored by Aura.
Scammers and Spammers Hate Aura - and That's a Good Thing for You
Are you tired of unwanted emails, texts, and calls from scammers and spammers? Aura uses cutting-edge AI technology to scan the internet to identify where your personal information is exposed and being sold. With Aura, you can take control of your online privacy and secure your personal information. Reduce robocalls and keep your information safe from identity thieves. Try Aura free for 2 weeks and see if your information has been compromised.
Following an initial outburst of coverage from the MSM and blogosphere, two more blog posts on this topic have trickled in. First, Tyler Cowen clarifies that this is indeed a hack:
https://marginalrevolution.com/marginalrevolution/2023/07/dear-chat-gpt.html
The second is from Mike Makowsky, associate professor of economics at Clemson University, whose blog post titled “On EJMR, status competitions, and tapeworms” agrees this was a hack, comparing the Yale researchers to “hackers”.
I asked ChatGPT to summarize Makowsky’s blog in 100 words:
The article discusses a paper presented at the NBER summer meetings, which revealed that toxic posts on the Economics Job Market Rumors (EJMR) forum weren't limited to fringe participants but included users from elite universities. The author suggests that this toxic behavior reflects the cutthroat, status-driven nature of academia. The highly competitive and subjective nature of academic achievement can give rise to resentment and a tendency to denigrate others. The author calls for a shift from this negative-sum game towards a more cooperative model. They propose solutions like increased transparency, sharing resources, and strict content moderation.
Makowsky’s blog concludes by suggesting that economists should go to Reddit.
And, yes, an anonymous message board (or a identified board with a special anonymous section), but with strict content moderation. We know it can be done. You don’t see any of the same filth on statalist or the economics subreddit.
Wait. Just how toxic is Reddit?
Quantifying Toxicity on Reddit
The word “toxic” appears in the now-infamous paper a total of 42 times.
Toxic is not a legal or scientific term — it is deliberately nebulous, chosen to provoke strong emotions in the reader. “Anything I don’t like is toxic”.
As per Ederer et al. (2023), to quantify toxicity, “we selected ToxiGen Roberta (Hartvigsen et al., 2022). This is a checkpoint of the Roberta model (Liu et al., 2019) fine-tuned for toxicity detection.”
Using this off-the-shelf LLM, the conclusion of their paper is that 10% of EJMR posts qualify as toxic. This 10% of toxicity resulted in a slew of headlines from major outlets, characterizing (libeling?) EJMR as a hotbed of racist, sexist, and generally inappropriate content.
The media have decided this is a hate site, so you obviously deserve to get doxxed if you ever posted there — who cares if you were doxxed as a result of an illegal hack? The ends justify the means, chud.
Evidence of a Toxic Environment for Women in Economics (New York Times)
Economics website is filled with racist and sexist speech, some blame the nation’s top universities (Associated Press)
Toxic Posts on Economist Job Website Traced to Users from Elite Universities (Bloomberg)
Racist, sexist anonymous posts are linked to Harvard, Yale, and other top institutions (Business Insider)
‘Toxic’ anonymous online posts linked to university IP addresses (Times Higher Ed)
Harvard, Stanford, other elite schools linked to racist, sexist messages posted for years (Fortune)
Is the Economics Profession Toxic for Women? (Mother Jones)
These “journalists” are so convinced that their “enemies” are virulent racist/sexist/evil people that not a single one bothered to fact-check what “toxic” actually means. If they were honest, their headlines would read: “90% of EJMR content is pretty good, actually.” Sadly, that truth would interfere with the preferred narrative that EJMR is a bastion of hate speech.
One anonymous EJMR hero then took it upon himself to run Florian’s Toxigen classifier on Reddit.
That guy created a Twitter account, naively wanting to engage in good faith.
Here is his thread:
This is a thread about the toxicity measure in the EJMR paper, which found that 10% of EJMR posts are toxic. How does this compare to other social media? According to their own toxicity classifier, EJMR is much less toxic than Reddit.
In his exciting talk, Florian informally compared the toxicity of EJMR against other social media. The authors use a large language model, ToxiGen-Roberta, which is tuned for detecting hate speech. The authors do not benchmark their findings against any social media.
How does a 10% toxic comment share compare to other popular websites? I ran the ToxiGen Roberta model on Reddit comments scraped from some of Reddit's popular 'front page' subreddits, which new Redditors are subscribed to by default.
Nearly all popular subreddits - for instance, r/news, r/politics, r/funny, r/movies, r/todayilearned - exhibit substantially higher rates of toxicity, using exactly the same classifier as Ederer et al. EJMR toxicity most comparable to r/pokemon
How do we interpret? Perhaps the classifier is accurate and the share of toxic comments on EJMR is actually lower than most subreddits. This contradicts the prevailing wisdom from Florian's talk that EJMR is a bastion of hate speech on the internet.
A second possibility is that this classifier in Ederer et al. is incapable of reliably detecting toxic speech. Certainly, the classifier does not distinguish intensity of toxic speech. If so the 10% toxicity statistic repeatedly cited in their paper is uninformative.
Emphatically, this is NOT a claim that toxic content does not exist on EJMR. I believe Ederer et al's analysis does not provide any evidence EJMR is more toxic than other mainstream social media, and if anything works the other way.
I believe the authors were aware of this limitation. Twitter and Reddit are two of the most widely scraped platforms on the internet. This exercise took 15 minutes to do. It is almost impossible not to ask how their statistic compares against other online sources.
There are only two possibilities: the classifier is accurate, and EJMR is less toxic than Reddit, or it's not accurate, and nothing that comes from it is meaningfully quantifying share of toxic posts on EJMR or another platform.
Ederer provided shocking examples of actual hateful content on EJMR. Are those representative? Likely not. Comments with any amt of profanity are often marked toxic. I take [Ederer’s] point that the comments from his talk wouldn't be tolerated on Reddit. So is there a classifier that distinguishes run of the mill reddit toxic from the toxic screencaps in [Ederer’s] talk? What share of the toxic 10% on EJMR look like those posts? 10% likely significant overestimate of true EJMR toxicity to neutral observer
— Anonymous aspiring economist
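The benchmark in the thread boils down to one number per source: the share of comments the classifier labels toxic. Below is a minimal sketch of that tally, assuming the comments have already been collected into a dict keyed by source. `toy_score` is a hypothetical stand-in so the example runs without downloading a model; a real run would swap in the ToxiGen RoBERTa checkpoint's toxicity probability. The 0.5 threshold and the sample data are also assumptions for illustration, not values taken from the paper or the thread.

```python
from typing import Callable, Dict, List, Tuple


def toxic_share(comments: List[str],
                score: Callable[[str], float],
                threshold: float = 0.5) -> float:
    """Fraction of comments whose toxicity score is at or above threshold."""
    if not comments:
        raise ValueError("need at least one comment")
    flagged = sum(1 for text in comments if score(text) >= threshold)
    return flagged / len(comments)


def rank_sources(samples: Dict[str, List[str]],
                 score: Callable[[str], float],
                 threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (source, toxic share) pairs, highest share first."""
    shares = {name: toxic_share(texts, score, threshold)
              for name, texts in samples.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def toy_score(text: str) -> float:
    """Hypothetical placeholder scorer: flags text containing 'idiot'.

    A real run would return the classifier's toxicity probability instead.
    """
    return 1.0 if "idiot" in text.lower() else 0.0


if __name__ == "__main__":
    # Made-up comments, for illustration only.
    samples = {
        "source_a": ["nice paper", "you idiot", "good point", "thanks"],
        "source_b": ["idiot", "what an idiot", "agreed", "ok"],
    }
    for name, share in rank_sources(samples, toy_score):
        print(f"{name}: {share:.0%} flagged")
```

Because the scorer is passed in as a function, the same tally can be run on any source with the same classifier and threshold, which is the only way the shares are comparable.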
He seemed pretty dismayed that the authors ignored him.
Wu (2019) is fatally flawed in the exact same way as Ederer et al. (2023).
Wu had the perfect dataset at her fingertips, formatted in the exact same format:
https://www.poliscirumors.com/
https://www.socjobrumors.com/
https://www.econjobrumors.com/
Yet, she only chose to scrape/analyze economists and screech about how sexist economics is... Why not compare them to political scientists and sociologists?
Why not use other professions as placebos?
Because just like Ederer et al. (2023)’s narrative gets blown apart by someone comparing his data to Reddit, Wu (2019)’s narrative gets blown apart when you compare her data to other professions. This “research” is all fake, cherrypicked bullshit.
Quantifying Toxicity on EconTwitter
Next, I fired up python and ran Florian’s toxicity code myself.
Here are some big names on EconTwitter compared to EJMR:
Major caveat: my sample size is small (8*50=400 tweets), which is why e.g. Florian has a score of 0. In a perfect world, I would scrape thousands or millions of EVERYONE’s tweets. But I don’t know how to scrape Twitter in light of Musk’s recent API changes (does anyone know? email me chrisbrunet@protonmail.com or leave a comment), and I was doing this analysis quickly, so I manually copy-pasted a few hundred tweets as a proof-of-concept.
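To put a rough number on that caveat: with 50 tweets per account, a measured toxic share of 0 is still compatible with a true rate of several percent. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not something taken from the paper; the only input from the text is the 50-tweet sample size.

```python
import math
from typing import Tuple


def wilson_interval(flagged: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= flagged <= n:
        raise ValueError("need n > 0 and 0 <= flagged <= n")
    p_hat = flagged / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # 0 flagged tweets out of 50: the upper bound comes out at roughly 7%.
    low, high = wilson_interval(0, 50)
    print(f"0/50 flagged: 95% interval {low:.3f} to {high:.3f}")
```

In other words, a bar of height 0 on a 50-tweet sample cannot be distinguished from an account whose true toxic share is a few percent.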
I am sure that Anna Gifty, a PhD student at Harvard, the biggest bar on the chart, is reading this right now and thinking, hey! I am not toxic! I am one of the good guys! EJMR is a nazi hate site! There is no way I am more toxic than them!
To that, I would respond with this meme:
If you agree Ederer et al. (2023)’s toxicity measure is bad, then anyone with academic integrity who believed the “EJMR is a hate site!” narrative now needs to issue an apology. The Yale authors did not even bother to validate their toxicity classifier. Or, more likely, they validated it in private and did not report the results, because the inexorable conclusion is either that (1) the classifier is bad at identifying toxic speech or (2) EJMR is actually less toxic than other online platforms. That would explain why no relevant benchmarks are reported in the paper, despite this implicit comparison being made several times in the talk.
If you agree Ederer et al. (2023)’s toxicity measure is good, then you must also believe that EJMR is less toxic than Reddit and Twitter.
Following my proof-of-concept bar chart, I am now working on scraping econtwitter profiles more comprehensively — RePEc maintains a list of Top 25% Economists by Twitter Followers, maybe that is a good place to start. Alternatively, maybe I will start by scraping every tweet from every economist who “liked” Paul/Florian’s tweets announcing the EJMR hack. Scrape all of their tweets, run Toxigen, rank them based on toxicity, and bootstrap the results to show confidence intervals.
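The rank-and-bootstrap step described above could look something like the following. This is a sketch under stated assumptions: each account's tweets have already been run through the classifier and reduced to 0/1 labels, the interval is a plain percentile bootstrap, and the account names and labels are made up for illustration.

```python
import random
from typing import Dict, List, Tuple


def bootstrap_ci(labels: List[int], n_boot: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the mean of 0/1 toxicity labels."""
    if not labels:
        raise ValueError("need at least one label")
    rng = random.Random(seed)
    n = len(labels)
    # Resample with replacement n_boot times and record each resample's mean.
    means = sorted(sum(rng.choices(labels, k=n)) / n for _ in range(n_boot))
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return means[lo_idx], means[hi_idx]


def rank_accounts(labels_by_account: Dict[str, List[int]]):
    """Sort accounts by toxic share, attaching a bootstrap interval to each."""
    rows = []
    for account, labels in labels_by_account.items():
        share = sum(labels) / len(labels)
        rows.append((account, share, bootstrap_ci(labels)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Made-up labels for illustration only: 1 = flagged toxic, 0 = not.
    fake = {
        "account_x": [1] * 5 + [0] * 45,
        "account_y": [1] * 1 + [0] * 49,
    }
    for account, share, (low, high) in rank_accounts(fake):
        print(f"{account}: {share:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With only 50 labels per account the intervals are wide and overlap heavily, which is the reason to collect far more tweets before ranking anyone.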
This research will take me at least a few weeks to do properly. If you want to chat about it, please join my discord server:
Second potential fatal flaw?
If that math is correct, misattributed postings are virtually guaranteed. In that case, getting that knowledge out before any of that stuff happens would seem quite important.
The Witch Hunt Gathers Steam
Paul initially promised that he would “not identify users (in either the paper or the presentation).”
He then immediately leaked someone’s identity, spitting in the IRB’s face:
Four days later, he posted step by step instructions that a toddler could follow on how to do the hack.
Wake up.
It’s already happening.
The woke crowd have already started doing mob justice.
They will use Paul’s toddler-hacking-tutorial to doxx any poster they don't like.
There will be a searchable database soon, linking every EJMR post to an IP.
They will say they are only doxxing the "most egregious" posters or whatever, but really it will just be whoever they want.
The cancellations have not even begun yet.
More will come soon.
Ederer tried canceling EJMR on Twitter the other day, tagging Google and Capital One.
Notice how Florian cherrypicked one racist EJMR post to make his point.
Your screenshots aren’t an argument.
You want to play the screenshot game???
When I search a certain slur on Twitter, for example, it appears more than once per second:
Contrast this firehose of racial slurs with the 20-year history of EJMR, which has produced a grand total of 5 (five!) instances of the same slur.
EconTwitter should perhaps focus on cleaning its own house before moving to censor or demonetize other platforms.
Doleac followed Ederer’s request for demonetization by literally writing “can I please speak to your manager”.
She argues that EJMR needs to be “demonetized” for being “virulently racist” and having “no content moderation.”
Is this not defamation?
She is trying to ruin a small business by lying to its hosting website (AT&T / Google) that there is “no content moderation”, when in fact there is strict content moderation, much stricter than Twitter or Reddit. She is lying to AT&T to inflict financial damages.
Doleac is a real piece of work.
If you want to know her villain origin story, read this article:
She has explicitly declared her intent to use this data to attack her peers' careers.
She also plans to “guess” who owns EJMR.
UPDATE: Yesterday, she escalated the jihad, proclaiming, “expect a lot more [doxxing] to start flowing, very soon!”
It is a little concerning the types of people empowered by modern conflicts taking on this form.
Where is the IRB on this?
Next steps with Yale IRB: the next communication should be legal
Possibly even a letter of spoliation to Yale IRB and the authors asking them to preserve all records relating to the gathering, writing, and publication of the data, along with all communications between review boards and the authors.
What a letter of spoliation does is notify parties that they may become party to civil or criminal cases in the future and requires them to preserve documents and communications and makes it an offense to delete or otherwise get rid of those records.
Someone (maybe who has experience with IRBs) should sit down with a lawyer, because a letter of spoliation needs to be fairly specific about exactly what types of documents, files, communications you want preserved, so we'd need people who are familiar with the verbiage. The authors' communications on this project, I believe, would be incredibly damaging for them.
The other thing is that, by sending this letter, if you get to depositions, you can depose the IRB and the authors about whether the information was ever disseminated after receiving the letter.
If they lie, that's perjury. If they did disseminate it to anyone else, that person also can be deposed and nobody wants to get caught up in a long-running legal battle.
In short, we can make the data radioactive and hold Yale accountable if they actually followed correct procedure to protect or destroy dangerous data.
— Anonymous
Overlooked aspect on the paper’s ethics
Disclosing potential conflicts of interest is widely accepted as a basic tenet of research ethics. FE and PGP have both been discussed on this site, in some cases fairly, in many cases I imagine not. Regardless, potentially uncovering the identities of those who discussed you creates a clear conflict of interest. Even if you believe uncovering identities on EJMR is fair game generally, potentially learning those particular identities presents a conflict that anybody remotely concerned about ethics or the appearance of propriety would seek guidance from an IRB on. How should that small number of threads be handled? That the authors were likely aware of the existence of these threads, and potentially relied on semantics to initially claim exemption from IRB review, for me says a lot. Everyone else is free to interpret this as they wish. But as far as I can tell, no debate about the definition of hacking or anything else can justify this specific lapse.
— Anonymous