Table 4

Sample cases of the Polish Ratio (PR) based on the Levenshtein distance, where the parts edited by ChatGPT are highlighted in red. Subjectively, these cases illustrate the rationale behind our PR regression task: the PR score for human-written abstracts tends to be close to 0, indicating no modification, and a larger PR corresponds to more text introduced by ChatGPT, shown as more red parts.

| Text | Source | PR |
| --- | --- | --- |
| 1a) Theoretical work in morphological typology offers the possibility of measuring morphological diversity on a continuous scale. However, literature in Natural Language Processing (NLP) typically labels a whole language with a strict type of morphology, e.g. fusional or agglutinative. In this work, we propose to reduce the rigidity of such claims, by quantifying morphological typology at the word and segment level. We consider Payne (2017)’s approach to classify morphology using two indices: synthesis (e.g. analytic to polysynthetic) and fusion (agglutinative to fusional). For computing synthesis, we test unsupervised and supervised morphological segmentation methods for English, German and Turkish, whereas for fusion, we propose a semi-automatic method using Spanish as a case study. Then, we analyse the relationship between machine translation quality and the degree of synthesis and fusion at word (nouns and verbs for English-Turkish, and verbs in English-Spanish) and segment level (previous language pairs plus English-German in both directions). We complement the word-level analysis with human evaluation, and overall, we observe a consistent impact of both indexes on machine translation quality. | Human | 0.14% |
| 1b) Theoretical work in morphological typology provides a means of quantifying morphological diversity on a continuous scale. However, NLP literature often labels a language with a strict morphological type, such as fusional or agglutinative. This work aims to reduce the rigidity of these claims by quantifying morphological typology at both the word and segment level. We adopt Payne (2017)’s approach, which classifies morphology using two indices: synthesis (e.g. analytic to polysynthetic) and fusion (agglutinative to fusional). To compute synthesis, we evaluate unsupervised and supervised morphological segmentation methods for English, German, and Turkish. For fusion, we propose a semi-automatic method using Spanish as a case study. We examine the relationship between machine translation quality and synthesis and fusion at both the word level (nouns and verbs for English-Turkish and verbs in English-Spanish) and segment level (English-German in both directions). We supplement the word-level analysis with human evaluation, and we observe a consistent impact of both indices on machine translation quality. | ChatGPT-polished | 34.48% |
| 2a) Framing is a communication strategy to bias discussion by selecting and emphasizing. Frame detection aims to automatically analyze framing strategy. Previous works on frame detection mainly focus on a single scenario or issue, ignoring the special characteristics of frame detection that new events emerge continuously and policy agenda changes dynamically. To better deal with various context and frame typologies across different issues, we propose a two-stage adaptation framework. In the framing domain adaptation from pre-training stage, we design two tasks based on pivots and prompts to learn a transferable encoder, verbalizer, and prompts. In the downstream scenario generalization stage, the transferable components are applied to new issues and label sets. Experiment results demonstrate the effectiveness of our framework in different scenarios. Also, it shows superiority both in full-resource and low-resource conditions. | Human | 0.14% |
| 2b) The communication strategy of framing involves selecting and emphasizing certain aspects in order to bias discussion. To analyze this strategy, frame detection is used. However, previous works in this field have mainly focused on a single scenario or issue, ignoring the fact that new events and policy agendas are constantly emerging. To address this issue, we propose a two-stage adaptation framework. The first stage involves adapting the framing domain through pretraining, using two tasks based on pivots and prompts to learn a transferable encoder, verbalizer, and prompts. In the second stage, the transferable components are applied to new issues and label sets. Our framework has been shown to be effective in different scenarios, and to perform better than other methods, both in conditions of full resources and low resources. | ChatGPT-polished | 66.61% |
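
As a rough illustration of how a Levenshtein-based ratio between an original abstract and its polished version might be computed, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization and normalization by the length of the longer token sequence; both choices, and the function names `levenshtein` and `polish_ratio`, are hypothetical. The PR values in the table may come from the regression model rather than from a direct computation like this one, so the sketch is not expected to reproduce them.

```python
def levenshtein(a, b):
    """Edit distance between two sequences; insert, delete and substitute each cost 1."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def polish_ratio(original, polished):
    """Normalized token-level edit distance in [0, 1].

    Assumption (not taken from the source): tokens are whitespace-separated
    words and the distance is divided by the longer sequence's length.
    """
    src = original.split()
    dst = polished.split()
    longest = max(len(src), len(dst))
    if longest == 0:
        return 0.0  # two empty texts: nothing was modified
    return levenshtein(src, dst) / longest


if __name__ == "__main__":
    human = "we propose to reduce the rigidity of such claims"
    polished = "this work aims to reduce the rigidity of these claims"
    print(f"{polish_ratio(human, human):.2%}")     # identical texts
    print(f"{polish_ratio(human, polished):.2%}")  # partially rewritten text
```

Under these assumptions an unmodified text scores 0 and a fully rewritten one approaches 1, which matches the interpretation given in the caption: the more text ChatGPT replaces, the larger the ratio.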
