Evaluative practices and persuasion in r/changemyview

Corpus pattern discovery and computer assisted annotation: movies versus religion

Daria Dayter (Tampere University)  ·  Thomas C. Messerli (Universität Basel)
ICAME 47  ·  Koblenz  ·  2026

Researchers and acknowledgements

Project investigators: Daria Dayter (Tampere University)  ·  Thomas C. Messerli (Universität Basel)

Research network: Communicative Practices on Reddit (CopRe), www.copre.org

Messerli, T. C., Dayter, D., Leuckert, S., Liimatta, A., Mahler, H., Bohmann, A., Kozma, G., & Tosin, R. (2025). Digital debating cultures: Communicative practices on Reddit. Digital Scholarship in the Humanities, 40(1), 227–240.

What is r/changemyview?

“You must personally hold the view and demonstrate that you are open to it changing.” r/changemyview, Rule B

Debate forum. A subreddit where original posters (OPs) state a view as a starting point.
Original posters. OPs claim willingness to have their opinion changed.
Responses. Comments aim to change the OP’s opinion.
Delta award (∆). Given by the OP, it marks successful persuasion.

r/changemyview: example of a delta awarded comment

CMV: I don’t believe suffering the death of a spouse is inherently more painful than being left by your spouse.
There seems to be much stronger sympathy for widows than for people who survived an unwanted divorce, while I think each situation is different. A divorce does not necessarily mean the relationship was not good …
submission / original post

Delta awarded response
I think the biggest sympathetic difference would be in the actual timing. In divorce there is a long process. Death can often be sudden. There are no details to negotiate, no easing into, it just is. …

DeltaBot Confirmed: 1 delta awarded …

Our research aim and questions

Overarching goal. Explore persuasion in r/changemyview, where the delta (∆) marks a comment that changed the OP’s view.

This subproject. Examine evaluative practices in delta and non-delta comments, focused on the assessment of the OP’s argumentation.

Research questions

  1. How do commenters evaluate the OP’s argument, and how explicit or implicit is that evaluation?
  2. Does this evaluation differ across two contrasting topics, movie_character (taste) and god_religion (value)?
  3. Does assessing the argument relate to persuasive success (the delta)?

Data and two topic design

Corpus preparation

  • Source: the r/changemyview corpus, submissions and comments from 2013 to 2020.
  • Re-tagged for part of speech, lemma and dependency, then loaded into NoSketch Engine.
  • Three sub-corpora: submissions (cmvsub), delta winning comments (cmvdelta), and a general comment reference (cmv500k).
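
A minimal sketch of the loading step, assuming the re-tagged tokens are already available as (word, pos, lemma, deprel) tuples. NoSketch Engine reads a "vertical" file with one token per line, tab-separated attributes and XML-like structure tags. The attribute order, the `doc` attributes (`id`, `subcorpus`) and the example thread id are illustrative assumptions, not the project's actual configuration.

```python
from xml.sax.saxutils import quoteattr


def to_vertical(doc_id, subcorpus, sentences):
    """Render one document as vertical text: one token per line, tab-separated."""
    # Structure tag with hypothetical attributes (id, subcorpus).
    lines = [f"<doc id={quoteattr(doc_id)} subcorpus={quoteattr(subcorpus)}>"]
    for sentence in sentences:
        lines.append("<s>")
        for word, pos, lemma, deprel in sentence:
            lines.append("\t".join((word, pos, lemma, deprel)))
        lines.append("</s>")
    lines.append("</doc>")
    return "\n".join(lines)


# Hypothetical, hand-tagged example sentence.
example = [[
    ("your", "PRP$", "your", "nmod:poss"),
    ("argument", "NN", "argument", "nsubj"),
    ("is", "VBZ", "be", "cop"),
    ("flawed", "JJ", "flawed", "root"),
]]
vertical = to_vertical("t3_example", "cmvdelta", example)
print(vertical)
```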

Topic modelling

  • We ran BERTopic over the submissions to group threads by what they are about.
  • The micro-topics were curated into readable labels, among them movie_character, god_religion, us_politics, relationship_sex, animals_meat and school_college.
  • We drew a stratified sample from these, and for this talk compare the two that sit at opposite ends: movie_character (taste) and god_religion (value).
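
The stratified draw can be sketched as follows. The slides do not state the sample size or the allocation scheme, so the equal `per_topic` quota, the fixed seed and the toy thread ids are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(threads, per_topic, seed=0):
    """Draw up to `per_topic` thread ids from each topic label.

    threads: iterable of (thread_id, topic_label) pairs.
    """
    by_topic = defaultdict(list)
    for thread_id, topic in threads:
        by_topic[topic].append(thread_id)

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = {}
    for topic in sorted(by_topic):
        ids = by_topic[topic]
        k = min(per_topic, len(ids))  # small strata contribute all they have
        sample[topic] = sorted(rng.sample(ids, k))
    return sample


# Toy input: 30 hypothetical threads spread over two topic labels.
threads = [
    (f"t{i}", "movie_character" if i % 3 == 0 else "god_religion")
    for i in range(30)
]
sample = stratified_sample(threads, per_topic=5)
```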

Methodology: a corpus and annotation cycle

  • Stage 1, corpus. Slot alternation patterns in NoSketch Engine (Hanks, 2013) surface candidate evaluative structures, but reach only about 2.7% of sentences.
  • Stage 2, annotation. Assessing an argument is semantic, not surface detectable, so we hand code it in INCEpTION (Klie et al., 2018) on the Evaluation layer.
  • Stage 3, triage. A cue and LLM pre-classifier surfaces candidates at high recall; the analyst adjudicates every one.
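
A rough sketch of the cue half of Stage 3, assuming a small hand-written cue list. The cue pattern below is hypothetical (the project's actual cues are not given in the slides), and the LLM step is left out entirely. The filter only nominates candidates; every hit would still go to the analyst.

```python
import re

# Hypothetical cue list: a determiner followed by an argumentation noun.
CUES = re.compile(
    r"\b(your|the|this)\s+"
    r"(argument|reasoning|premise|premises|logic|claim|point|conclusion)\b",
    re.IGNORECASE,
)


def triage(sentences):
    """Return (index, sentence) pairs that contain a cue."""
    return [(i, s) for i, s in enumerate(sentences) if CUES.search(s)]


sentences = [
    "I think your argument is flawed.",
    "I loved that film.",
    "The conclusion does not follow from the premises.",
]
candidates = triage(sentences)
```

A deliberately loose pattern trades precision for recall, which suits a setup where a human adjudicates each candidate.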

Stage 1: corpus patterns (NoSketch Engine)

CQL: [pos="PRP$"] [pos="NN.*"] [lemma="be"] [pos="JJ.*"]  (your NOUN is ADJ)
I think your argument is flawed because the premise …
honestly your reasoning is circular and assumes the conclusion …
to me your claim is questionable given the evidence …
so your point is weak once you consider …
but your view is inconsistent with what you said …
315 hits, distilled to 53 argumentation nouns and 28 evaluative adjectives
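
For readers without a NoSketch Engine installation, the same four-slot pattern can be approximated in plain Python over tagged tokens. This is a sketch, not the query engine: it assumes Penn Treebank tags and tokens given as (word, pos, lemma) tuples, and it mirrors CQL's whole-value regex matching with `re.fullmatch`.

```python
import re
from collections import Counter


def match_your_noun_is_adj(tokens):
    """Return (noun lemma, adjective lemma) for each PRP$ NN.* be JJ.* window."""
    hits = []
    for i in range(len(tokens) - 3):
        poss, noun, verb, adj = tokens[i:i + 4]
        if (
            poss[1] == "PRP$"
            and re.fullmatch(r"NN.*", noun[1])
            and verb[2] == "be"
            and re.fullmatch(r"JJ.*", adj[1])
        ):
            hits.append((noun[2], adj[2]))
    return hits


# Hand-tagged version of the first concordance line (tags are assumed).
tokens = [
    ("I", "PRP", "I"),
    ("think", "VBP", "think"),
    ("your", "PRP$", "your"),
    ("argument", "NN", "argument"),
    ("is", "VBZ", "be"),
    ("flawed", "JJ", "flawed"),
]
hits = match_your_noun_is_adj(tokens)
noun_counts = Counter(noun for noun, _ in hits)
adjective_counts = Counter(adj for _, adj in hits)
```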

Stage 2: annotation in INCEpTION

A span in a god_religion thread, coded on the Evaluation layer (Klie et al., 2018).

The Evaluation layer: our tagset

  • Stance_type: affective_relational, epistemic
  • Explicitness: explicit, implicit
  • Polarity: negative, positive
  • Object_stance: Argumentation_submission (our focus this round), Content_submission

Note: this round codes only the assessment of the OP’s argumentation, not of the content.
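
One way to keep exported annotations consistent with this tagset is a small validated record type. The sketch below is an assumption about downstream processing, not part of INCEpTION; the class name `EvaluationSpan` and the lower-cased field names are hypothetical.

```python
from dataclasses import dataclass

# Feature values as listed on the Evaluation layer.
TAGSET = {
    "stance_type": {"affective_relational", "epistemic"},
    "explicitness": {"explicit", "implicit"},
    "polarity": {"negative", "positive"},
    "object_stance": {"Argumentation_submission", "Content_submission"},
}


@dataclass(frozen=True)
class EvaluationSpan:
    text: str
    stance_type: str
    explicitness: str
    polarity: str
    object_stance: str

    def __post_init__(self):
        # Reject any value that is not part of the tagset.
        for feature, allowed in TAGSET.items():
            value = getattr(self, feature)
            if value not in allowed:
                raise ValueError(f"{feature}={value!r} is not in the tagset")


# Illustrative span; stance_type and polarity here are assumed codings.
span = EvaluationSpan(
    text="The conclusion does not follow from the premises",
    stance_type="epistemic",
    explicitness="explicit",
    polarity="negative",
    object_stance="Argumentation_submission",
)
```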

What assessing the argument looks like

“The conclusion does not follow from the premises, and the premises themselves are highly questionable.”
delta, on CMV: I don’t think God exists (explicit)
“I actually think making one small change would help make your argument more apt.”
delta, on CMV: I don’t need any evidence to claim God does not exist (implicit)

Coded for polarity and explicitness, the assessments are overwhelmingly negative and implicit.

Explicit and implicit

Finding 1: religion argues by logic

Finding 1: movies argue by taste

Finding 1: a value to taste spectrum

  • In god_religion, commenters go after the logic: assumptions, contradictions, fallacies.
  • In movie_character, they argue over taste: whether the view is just opinion or a question of quality.
  • The two topics sit at opposite ends, value on one side and subjective taste on the other, which is why we picked them.

Finding 2: calling out fallacies does not win

All four argument flaw words are rarer in delta winning comments.

Finding 2: what it shows

  • The argument flaw words (straw man, circular, fallacy, assumption) are all rarer in the comments that won a delta than in ordinary comments.
  • A preliminary indicator points the same way: the share of sentences phrased as questions is higher in non-delta comments (6.0%) than in delta comments (4.8%).
  • Comments that win a delta tend to soften their criticism rather than confront the OP directly.
  • Future trajectory: tell genuinely undermining rhetorical questions (Ilie, 1994) apart from ordinary ones.
“Do you really ‘know’ that God does not exist?”
a question, non-delta, on CMV: I know that there is no god
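
The question share indicator can be sketched as below. This assumes a naive punctuation-based sentence splitter and counts any sentence ending in "?" as a question; the project's own sentence segmentation may differ, and the example comments are invented.

```python
import re


def question_share(comments):
    """Share of sentences ending in '?' across a list of comment strings."""
    total = 0
    questions = 0
    for comment in comments:
        # Naive split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", comment.strip()):
            if sentence:
                total += 1
                questions += sentence.endswith("?")
    return questions / total if total else 0.0


# Invented comments: five sentences, two of them questions.
comments = [
    "Do you really know? I doubt it.",
    "That is fine. Why? Because.",
]
share = question_share(comments)
```

Because it counts every question, a measure like this cannot separate undermining rhetorical questions from ordinary ones.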

Finding 3: does assessing the argument win?

Finding 3: it depends on the topic

  • In movies, commenters who engage the argument win more often (delta 41% versus 21%).
  • In religion, it is the opposite (delta 32% versus 47%); the effect flips, although the interaction falls just short of the conventional .05 threshold (p = .054).
  • We do not claim that assessing the argument wins. If anything we find the reverse: in religion, non-delta comments carry more of it. The movie subset points the other way, but is small and exploratory.
  • (exploratory, document level, small n)
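
The slides do not say which test produced the interaction p-value. As one possible illustration, the sketch below compares two 2×2 tables (delta or non-delta by argument assessed or not) with a Wald z-test on the difference between their log odds ratios. The counts are made up purely to show the arithmetic; they are not the study's data and will not reproduce p = .054.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance for a 2x2 table [[a, b], [c, d]]."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def interaction_test(table_1, table_2):
    """Wald z-test for the difference between two log odds ratios."""
    lor_1, var_1 = log_odds_ratio(*table_1)
    lor_2, var_2 = log_odds_ratio(*table_2)
    z = (lor_1 - lor_2) / math.sqrt(var_1 + var_2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Hypothetical counts: (delta & assessed, delta & not,
#                       non-delta & assessed, non-delta & not)
topic_a = (12, 8, 6, 14)
topic_b = (7, 13, 11, 9)
z, p = interaction_test(topic_a, topic_b)
```

With small cell counts the normal approximation is rough, which is one more reason to read such a result as exploratory.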

Finding: the spectrum across all six topics

Each topic placed by its logic and taste lexis.

Finding: where the other topics fall

  • god_religion stands alone on the logic side. It is the one topic where commenters routinely name assumptions, contradictions and fallacies.
  • The other four topics, us_politics, relationship_sex, animals_meat and school_college, all sit closer to movie_character on the taste side: they argue more about opinion and quality than about formal logic.
  • Value disputes look like the exception. Most topics lean toward the subjectivity end of the spectrum.
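
Placing a topic by its logic and taste lexis can be sketched as two normalised frequencies. The word lists below are short illustrative stand-ins built from words mentioned on these slides, not the project's actual lexis, and the per-1,000-token normalisation is an assumption.

```python
import re
from collections import Counter

# Illustrative stand-in word lists (assumed, not the project's lexis).
LOGIC = {"assumption", "assumptions", "contradiction", "contradictions",
         "fallacy", "fallacies"}
TASTE = {"opinion", "opinions", "subjective", "taste", "quality"}


def lexis_rates(text, per=1000):
    """Return (logic rate, taste rate) per `per` tokens of `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0, 0.0
    counts = Counter(tokens)
    logic = sum(counts[word] for word in LOGIC)
    taste = sum(counts[word] for word in TASTE)
    return per * logic / len(tokens), per * taste / len(tokens)


# Two invented ten-token comments, one per end of the spectrum.
logic_end = lexis_rates("your premise rests on an assumption and a clear fallacy")
taste_end = lexis_rates("this is just your opinion and a matter of taste")
```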

Conclusions

  • Across both topics, commenters only sometimes evaluate the OP’s argument itself. When they do, the evaluation is mostly negative and implicit: the weakness is implied rather than spelled out.
  • The topic decides the register. In value disputes (god_religion) commenters use the language of logic, naming assumptions, contradictions and fallacies. In taste disputes (movie_character) they argue instead about whether a view is just opinion or a question of quality.

Conclusions, continued

  • Open confrontation does not look persuasive. The words that name argument flaws are rarer in comments that won a delta, and questions, a preliminary proxy for undermining moves, make up a larger share of sentences in non-delta comments. The comments that did win lean on softer, implicit criticism.
  • Whether evaluating the argument goes with persuasion depends on the topic: it goes with winning in movie disputes, but not in religion disputes (exploratory, small sample).
  • Methodologically, the corpus and annotation cycle, with a high recall cue and LLM triage, is a workable way to find a rare and semantic phenomenon at scale.

Outlook

  • Complete the balanced sample and inter-coder reliability.
  • The sequential organisation of evaluation across turns.
  • The full persuasion test across all six topics.
  • A corpus of OP edit statements, where OPs meta-engage with the comments.
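
For the planned inter-coder reliability check, one common option is Cohen's kappa for two coders. The slides do not name a coefficient, so this choice is an assumption, and the label sequences below are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(coder_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented codings of four spans on the Explicitness feature.
coder_a = ["explicit", "implicit", "implicit", "explicit"]
coder_b = ["explicit", "implicit", "explicit", "explicit"]
kappa = cohens_kappa(coder_a, coder_b)
```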

References

Bednarek, M. (2009). Dimensions of evaluation: Cognitive and linguistic perspectives. Pragmatics & Cognition, 17(1), 146–175.

Dayter, D., & Messerli, T. C. (2022). Persuasive language and features of formality on the r/ChangeMyView subreddit. Internet Pragmatics, 5(1), 165–195. doi.org/10.1075/ip.00072.day

Du Bois, J. W. (2007). The stance triangle. In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction (pp. 139–182). John Benjamins.

Hanks, P. (2013). Lexical analysis: Norms and exploitations. MIT Press.

Hunston, S., & Thompson, G. (Eds.). (2000). Evaluation in text: Authorial stance and the construction of discourse. Oxford University Press.

Hyland, K. (2005). Metadiscourse: Exploring interaction in writing. Continuum.

Ilie, C. (1994). What else can I tell you? A pragmatic study of English rhetorical questions as discursive and argumentative acts. Almqvist & Wiksell International.

Klie, J.-C., Bugert, M., Boullosa, B., Eckart de Castilho, R., & Gurevych, I. (2018). The INCEpTION platform: Machine-assisted and knowledge-oriented interactive annotation. In Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations (pp. 5–9). Association for Computational Linguistics.

Martin, J. R., & White, P. R. R. (2005). The language of evaluation: Appraisal in English. Palgrave Macmillan.

Messerli, T. C., Dayter, D., Leuckert, S., Liimatta, A., Mahler, H., Bohmann, A., Kozma, G., & Tosin, R. (2025). Digital debating cultures: Communicative practices on Reddit. Digital Scholarship in the Humanities, 40(1), 227–240. doi.org/10.1093/llc/fqaf005

Stefanowitsch, A., & Gries, S. Th. (2003). Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics, 8(2), 209–243.

Walton, D. N. (1996). Argumentation schemes for presumptive reasoning. Lawrence Erlbaum Associates.