Evaluative practices and persuasion in r/changemyview

Corpus pattern discovery and computer assisted annotation: movies versus religion

Daria Dayter (Tampere University) · Thomas C. Messerli (Universität Basel)
ICAME 47 · Koblenz · 2026

Researchers and acknowledgements

Project investigators: Daria Dayter (Tampere University), Thomas C. Messerli (Universität Basel)

Research network: Communicative Practices on Reddit (CopRe), www.copre.org

Messerli, T. C., Dayter, D., Leuckert, S., Liimatta, A., Mahler, H., Bohmann, A., Kozma, G., & Tosin, R. (2025). Digital debating cultures. DSH, 40(1), 227–240.


What is r/changemyview?

“You must personally hold the view and demonstrate that you are open to it changing.” r/changemyview, Rule B

Debate forum. A subreddit where original posters (OPs) state a view as a starting point.

Original posters. OPs claim willingness to have their opinion changed.

Responses. Comments aim to change the OP’s opinion.

Delta award (∆). Given by the OP, it marks successful persuasion.

r/changemyview: example of a delta awarded comment

Submission (original post)
CMV: I don’t believe suffering the death of a spouse is inherently more painful than being left by your spouse.
There seems to be much stronger sympathy for widows than for people who survived an unwanted divorce, while I think each situation is different. A divorce does not necessarily mean the relationship was not good …

Delta awarded response
I think the biggest sympathetic difference would be in the actual timing. In divorce there is a long process. Death can often be sudden. There are no details to negotiate, no easing into, it just is. …

DeltaBot Confirmed: 1 delta awarded …

Our research aim and questions

Overarching goal. Explore persuasion in r/changemyview, where the delta (∆) marks a comment that changed the OP’s view.

This subproject. Examine evaluative practices in delta and non-delta comments, focused on the assessment of the OP’s argumentation.

Research questions

How do commenters evaluate the OP’s argument, and how explicit or implicit is that evaluation?
Does it differ across two contrasting topics, movie_character (taste) and god_religion (value)?
Does assessing the argument relate to persuasive success (the delta)?

Data and two topic design

Corpus preparation

Source: the r/changemyview corpus, submissions and comments from 2013 to 2020.
Tagged for part of speech, lemma and dependency with spaCy. A first pass with en_core_web_sm encoded the corpus into NoSketch Engine; we then re-tagged the same token streams with en_core_web_trf, a transformer model built on RoBERTa, for higher accuracy while keeping the tokenisation fixed. All queries use the transformer tags.
Three sub-corpora: submissions (cmvsub), delta winning comments (cmvdelta), and a general comment reference (cmv500k).
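As a rough illustration of the encoding step: NoSketch Engine reads corpora in a vertical format, one token per line with tab-separated attributes and XML-like structure tags. The sketch below serialises one tagged document that way. The attribute order (word, pos, lemma, dep), the `doc` attributes and the function name `to_vertical` are assumptions made for this example, not the project's actual configuration. Because tokenisation is kept fixed, re-tagging would change only the tag columns, never the word column.

```python
from xml.sax.saxutils import quoteattr


def to_vertical(doc_id, subcorpus, sentences):
    """Serialise one tagged document to vertical text.

    sentences: list of sentences, each a list of
    (word, pos, lemma, dep) tuples. Column order is an assumption.
    """
    lines = [f"<doc id={quoteattr(doc_id)} subcorpus={quoteattr(subcorpus)}>"]
    for sentence in sentences:
        lines.append("<s>")
        for word, pos, lemma, dep in sentence:
            # One token per line, attributes separated by tabs.
            lines.append("\t".join((word, pos, lemma, dep)))
        lines.append("</s>")
    lines.append("</doc>")
    return "\n".join(lines) + "\n"


# Hand-tagged toy sentence; the tags are illustrative, not model output.
example = to_vertical(
    "t1",
    "cmvdelta",
    [[
        ("Your", "PRP$", "your", "poss"),
        ("argument", "NN", "argument", "nsubj"),
        ("is", "VBZ", "be", "ROOT"),
        ("flawed", "JJ", "flawed", "acomp"),
    ]],
)
```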

Topic modelling

We ran BERTopic over the submission texts (title plus body) to group threads by what they are about. Embeddings came from all-MiniLM-L6-v2, and we set min_topic_size = 200 so that the model returns a few dozen well populated topics rather than hundreds of tiny ones.
That produced 75 fine grained topics, which we merged hierarchically and curated by hand into broader labels: movie_character, god_religion, us_politics, relationship_sex, animals_meat and school_college, among others.
We drew a stratified sample by topic and role, and for this talk compare the two that sit at opposite ends: movie_character (taste) and god_religion (value).
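A minimal sketch of stratified sampling by topic and role, using only the standard library. The record fields (`topic`, `role`), the role labels, the equal allocation per stratum and the fixed seed are all assumptions for illustration; the slides do not state how the sample was allocated.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, per_stratum, seed=0):
    """Draw up to per_stratum records from each (topic, role) stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[(rec["topic"], rec["role"])].append(rec)
    sample = []
    for key in sorted(strata):
        pool = strata[key]
        # Take the whole stratum if it is smaller than the quota.
        sample.extend(rng.sample(pool, min(per_stratum, len(pool))))
    return sample


# Toy records, invented to exercise the function.
records = [
    {"id": f"{topic}-{role}-{i}", "topic": topic, "role": role}
    for topic in ("movie_character", "god_religion")
    for role in ("delta", "non_delta")
    for i in range(5)
]
sample = stratified_sample(records, per_stratum=2)
counts = Counter((r["topic"], r["role"]) for r in sample)
```

Sorting the stratum keys before drawing keeps the result independent of the input order of the strata.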

Methodology: a corpus and annotation cycle

Stage 1, corpus. Slot alternation patterns in NoSketch Engine (Hanks, 2013) surface candidate evaluative structures, but reach only about 2.7% of sentences.
Stage 2, annotation. Assessing an argument is semantic, not surface detectable, so we hand code it in INCEpTION (Klie et al., 2018) on the Evaluation layer.
Stage 3, triage. A cue and LLM pre-classifier surfaces candidates at high recall; the analyst adjudicates every one.
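The cue half of Stage 3 could look roughly like the sketch below: a deliberately over-generous lexical filter that flags any sentence containing an argument noun or a flaw term, so that recall stays high and precision is left to the later steps. The cue list is a hypothetical one assembled from words that appear elsewhere on these slides, not the project's actual list, and the LLM pass and the analyst's adjudication are not shown.

```python
import re

# Hypothetical cue patterns (regex fragments), chosen for illustration.
CUE_PATTERNS = (
    r"arguments?", r"reasoning", r"premises?", r"points?", r"views?",
    r"conclusions?", r"straw ?m[ae]n", r"circular",
    r"fallac(?:y|ies)", r"assumptions?",
)
CUE_RE = re.compile(r"\b(?:" + "|".join(CUE_PATTERNS) + r")\b", re.IGNORECASE)


def triage(sentences):
    """Return (index, sentence, cues) for each sentence with at least one cue."""
    candidates = []
    for i, sentence in enumerate(sentences):
        cues = sorted({m.group(0).lower() for m in CUE_RE.finditer(sentence)})
        if cues:
            candidates.append((i, sentence, cues))
    return candidates


candidates = triage([
    "The conclusion does not follow from the premises.",
    "I loved that film.",
    "That is a straw man.",
])
```

Broad cues such as "point" will flag many irrelevant sentences; that is the intended trade-off when every candidate is adjudicated by hand afterwards.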

Stage 1: corpus patterns (NoSketch Engine)

CQL: [word="your"] [pos="NN.*"] []{0,4} [lemma=COP] []{0,5} [pos="JJ.*"] (your NOUN … COPULA … ADJ; COP stands for the set of copula lemmas)

I think	your argument is flawed	because the premise …
honestly	your reasoning is circular	and assumes the conclusion …
to me	your view of marriage is mostly questionable	given the evidence …
so	your point is weak	once you consider …
but	your premise seems inconsistent	with what you said …

Flexible seed “your NOUN … COPULA … ADJ” (Hanks, 2013): up to 4 tokens between the noun and the copula, up to 5 between the copula and the adjective, and 16 copula lemmas (be, seem, look, feel …). 315 hits, curated to 53 argument nouns and 28 evaluative adjectives.
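For readers less used to CQL, the slot pattern can be approximated in plain Python over one tagged sentence. This is a sketch, not how NoSketch Engine evaluates the query: it works sentence by sentence, returns only the shortest span per starting position, and uses just the four copula lemmas named on the slide (the full set of 16 is not listed). The function name and the (word, lemma, pos) tuple layout are assumptions.

```python
# Only the copula lemmas named on the slide; the rest are not given.
COPULA_LEMMAS = {"be", "seem", "look", "feel"}


def find_your_noun_cop_adj(tokens, max_before=4, max_after=5):
    """Find 'your NOUN ... COPULA ... ADJ' spans in one sentence.

    tokens: list of (word, lemma, pos) tuples.
    Returns a list of (start, end) token indices, end inclusive.
    """
    hits = []
    n = len(tokens)
    for i in range(n - 1):
        # Case-sensitive match on "your", followed directly by a noun tag.
        if tokens[i][0] != "your" or not tokens[i + 1][2].startswith("NN"):
            continue
        hit = None
        for gap1 in range(max_before + 1):      # 0..4 tokens before the copula
            c = i + 2 + gap1
            if c >= n:
                break
            if tokens[c][1] not in COPULA_LEMMAS:
                continue
            for gap2 in range(max_after + 1):   # 0..5 tokens after the copula
                a = c + 1 + gap2
                if a >= n:
                    break
                if tokens[a][2].startswith("JJ"):
                    hit = (i, a)
                    break
            if hit:
                break
        if hit:
            hits.append(hit)
    return hits


# Hand-tagged version of a concordance line from this slide.
sentence = [
    ("your", "your", "PRP$"), ("view", "view", "NN"), ("of", "of", "IN"),
    ("marriage", "marriage", "NN"), ("is", "be", "VBZ"),
    ("mostly", "mostly", "RB"), ("questionable", "questionable", "JJ"),
]
spans = find_your_noun_cop_adj(sentence)
matched = [" ".join(w for w, _, _ in sentence[s:e + 1]) for s, e in spans]
```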

Stage 2: annotation in INCEpTION

A span in a god_religion thread, coded on the Evaluation layer (Klie et al., 2018).

The Evaluation layer: our tagset

Stance_type
affective_relational, epistemic

Explicitness
explicit, implicit

Polarity
negative, positive

Object_stance
Argumentation_submission (our focus this round), Content_submission

this round codes only the assessment of the OP’s argumentation, not of the content
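The tagset can be written down as a small data structure with validation, which is handy for checking annotations after export. This sketch is not INCEpTION's export format; the class name, field names and the example values are assumptions chosen for illustration only, and the example coding is not taken from the annotated data.

```python
from dataclasses import dataclass

# Feature names and values as listed on this slide.
TAGSET = {
    "Stance_type": {"affective_relational", "epistemic"},
    "Explicitness": {"explicit", "implicit"},
    "Polarity": {"negative", "positive"},
    "Object_stance": {"Argumentation_submission", "Content_submission"},
}


@dataclass(frozen=True)
class EvaluationSpan:
    text: str
    stance_type: str
    explicitness: str
    polarity: str
    object_stance: str

    def __post_init__(self):
        values = {
            "Stance_type": self.stance_type,
            "Explicitness": self.explicitness,
            "Polarity": self.polarity,
            "Object_stance": self.object_stance,
        }
        for feature, value in values.items():
            if value not in TAGSET[feature]:
                raise ValueError(f"{value!r} is not a valid {feature}")


# Illustrative coding only; not a record from the project.
span = EvaluationSpan(
    text="your premise seems inconsistent",
    stance_type="epistemic",
    explicitness="explicit",
    polarity="negative",
    object_stance="Argumentation_submission",
)
```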

What assessing the argument looks like

“The conclusion does not follow from the premises, and the premises themselves are highly questionable.”
delta, on CMV: I don’t think God exists (explicit)

“I actually think making one small change would help make your argument more apt.”
delta, on CMV: I don’t need any evidence to claim God does not exist (implicit)

Coded for polarity and explicitness, the assessments are overwhelmingly negative and implicit.

Explicit and implicit

Finding 1: religion argues over logical validity


Finding 1: movies argue by taste


Finding 1: a value to taste spectrum

In god_religion, commenters go after the argument’s logical validity: assumptions, contradictions, fallacies.
In movie_character, they argue over taste: whether the view is just opinion or a question of quality.
The two topics sit at opposite ends, value on one side and subjective taste on the other, which is why we picked them.

Finding 2: calling out fallacies does not win

All four flaw terms are rarer in delta winning comments.

Finding 2: what it shows

The argument flaw words (straw man, circular, fallacy, assumption) are all rarer in the comments that won a delta than in ordinary comments.
Comments that win a delta tend to soften their criticism rather than confront the OP directly.
Potential future focus: are challenging questions doing confrontational work? The aim is to tell genuinely undermining rhetorical questions (Ilie, 1994) apart from ordinary ones. (Descriptively, questions are somewhat more common in non-delta comments, 6.0% versus 4.8%.)

“Do you really ‘know’ that God does not exist?” a question, non-delta, on CMV: I know that there is no god
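One simple way to compare the flaw terms across the two groups is the share of comments that contain each term. The sketch below assumes that per-comment measure; the slides do not state the unit behind the comparison, and the regular expressions and function names are hypothetical. The toy comments are invented to exercise the function and say nothing about the corpus.

```python
import re

# Hypothetical patterns for the four flaw terms named on the slide.
FLAW_TERMS = {
    "straw man": r"\bstraw ?m[ae]n\b",
    "circular": r"\bcircular\b",
    "fallacy": r"\bfallac(?:y|ies)\b",
    "assumption": r"\bassumptions?\b",
}


def share_containing(comments, pattern):
    """Share of comments in which the pattern occurs at least once."""
    if not comments:
        return 0.0
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for c in comments if rx.search(c)) / len(comments)


def compare(delta, non_delta):
    """Map each term to (share in delta comments, share in non-delta comments)."""
    return {
        term: (share_containing(delta, p), share_containing(non_delta, p))
        for term, p in FLAW_TERMS.items()
    }


# Invented toy comments, not corpus data.
delta = [
    "I see it a little differently.",
    "Have you considered the timing?",
    "One assumption may be worth revisiting.",
    "That makes sense, though I would add one thing.",
]
non_delta = [
    "That is a fallacy.",
    "Your assumption is wrong and circular.",
    "Classic straw man.",
    "Nice weather.",
]
result = compare(delta, non_delta)
```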

Finding 3: does assessing the argument win?


Finding 3: it depends on the topic

In movies, argument assessment is more common in delta winning comments (41% versus 21%); in religion the pattern reverses (32% versus 47%).
We make no claim either way: assessing the OP’s argument neither reliably wins nor loses a delta. The pattern depends on the topic.
(exploratory, document level, small n)
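A document-level comparison of this kind reduces to a cross-tabulation: for each topic and delta status, the share of documents in which the argument is assessed. The sketch below shows that calculation under assumed field names (`topic`, `delta`, `assesses_argument`). The five records are placeholders made up so the code runs; they are not the study's counts and do not reproduce the percentages above.

```python
from collections import Counter


def assessment_share(docs):
    """Share of documents with argument assessment per (topic, delta) cell."""
    totals, hits = Counter(), Counter()
    for d in docs:
        key = (d["topic"], d["delta"])
        totals[key] += 1
        hits[key] += bool(d["assesses_argument"])  # True counts as 1
    return {key: hits[key] / totals[key] for key in totals}


# Placeholder records, invented for illustration.
docs = [
    {"topic": "movie_character", "delta": True, "assesses_argument": True},
    {"topic": "movie_character", "delta": True, "assesses_argument": False},
    {"topic": "movie_character", "delta": False, "assesses_argument": False},
    {"topic": "god_religion", "delta": True, "assesses_argument": False},
    {"topic": "god_religion", "delta": False, "assesses_argument": True},
]
shares = assessment_share(docs)
```

With cells this small the shares swing on single documents, which is one reason the comparison on this slide is flagged as exploratory.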

Finding: the spectrum across all six topics

Each topic is placed by its logic and taste lexis.

Finding: where the other topics fall

god_religion stands alone on the logical-validity side. It is the one topic where commenters routinely name assumptions, contradictions and fallacies.
The other four topics, us_politics, relationship_sex, animals_meat and school_college, all sit closer to movie_character on the taste side: they argue more about opinion and quality than about the argument’s logical validity.
Value disputes look like the exception. Most topics lean toward the subjectivity end of the spectrum.

Conclusions

Across both topics, commenters only sometimes evaluate the OP’s argument itself. When they do, the evaluation is mostly negative and implicit: the weakness is implied rather than spelled out.
The topic decides the register. In value disputes (god_religion) commenters use the language of logical validity, naming assumptions, contradictions and fallacies. In taste disputes (movie_character) they argue instead about whether a view is just opinion or a question of quality.

Conclusions, continued

Open confrontation does not look persuasive. The words that name argument flaws, and questions used to undermine, are both more common in comments that did not win a delta. The comments that did win lean on softer, implicit criticism.
Whether evaluating the argument goes with persuasion depends on the topic: it goes with winning in movie disputes, but not in religion disputes (exploratory, small sample).
Methodologically, the corpus and annotation cycle, with a high recall cue and LLM triage, is a workable way to find a rare and semantic phenomenon at scale.

Outlook

Complete the balanced sample and inter-coder reliability.
The sequential organisation of evaluation across turns.
Extend the persuasion analysis to all six topics.
A corpus of OP edit statements, where OPs meta-engage with the comments.

References

Bednarek, M. (2009). Dimensions of evaluation: Cognitive and linguistic perspectives. Pragmatics & Cognition, 17(1), 146–175.

Dayter, D., & Messerli, T. C. (2022). Persuasive language and features of formality on the r/ChangeMyView subreddit. Internet Pragmatics, 5(1), 165–195. doi.org/10.1075/ip.00072.day

Du Bois, J. W. (2007). The stance triangle. In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction (pp. 139–182). John Benjamins.

Hanks, P. (2013). Lexical analysis: Norms and exploitations. MIT Press.

Hunston, S., & Thompson, G. (Eds.). (2000). Evaluation in text: Authorial stance and the construction of discourse. Oxford University Press.

Hyland, K. (2005). Metadiscourse: Exploring interaction in writing. Continuum.

Ilie, C. (1994). What else can I tell you? A pragmatic study of English rhetorical questions as discursive and argumentative acts. Almqvist & Wiksell International.

Klie, J.-C., Bugert, M., Boullosa, B., Eckart de Castilho, R., & Gurevych, I. (2018). The INCEpTION platform: Machine-assisted and knowledge-oriented interactive annotation. In Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations (pp. 5–9). Association for Computational Linguistics.

Martin, J. R., & White, P. R. R. (2005). The language of evaluation: Appraisal in English. Palgrave Macmillan.

Messerli, T. C., Dayter, D., Leuckert, S., Liimatta, A., Mahler, H., Bohmann, A., Kozma, G., & Tosin, R. (2025). Digital debating cultures: Communicative practices on Reddit. Digital Scholarship in the Humanities, 40(1), 227–240. doi.org/10.1093/llc/fqaf005

Stefanowitsch, A., & Gries, S. Th. (2003). Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics, 8(2), 209–243.

Walton, D. N. (1996). Argumentation schemes for presumptive reasoning. Lawrence Erlbaum Associates.