Investigating Paraphrase Generation as a Data Augmentation Strategy for Low-Resource AMR-to-Text Generation

  • Marco Antonio Sobrevilla Cabezudo
  • Marcio Lima Inácio
  • Thiago Alexandre Salgueiro Pardo

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

3 Citations (Scopus)

Abstract

Abstract Meaning Representation (AMR) is a meaning representation (MR) designed to abstract away from syntax, allowing syntactically different sentences to share the same AMR graph. Unlike other MRs, existing AMR corpora typically link one AMR graph to a single reference. This paper investigates the value of paraphrase generation in low-resource AMR-to-Text generation by testing various paraphrase generation strategies and evaluating their impact. The findings show that paraphrase generation significantly outperforms the baseline and traditional data augmentation methods, even with fewer training instances. Human evaluations indicate that this strategy often produces syntactic-based paraphrases and can exceed the performance of previous approaches. Additionally, the paper releases a paraphrase-extended version of the AMR corpus.

Original language: English
Title of host publication: INLG 2024 - 17th International Natural Language Generation Conference, Proceedings of the Conference
Editors: Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Publisher: Association for Computational Linguistics (ACL)
Pages: 663-675
Number of pages: 13
ISBN (electronic): 9798891761223
Publication status: Published - 2024
Event: 17th International Natural Language Generation Conference, INLG 2024 - Tokyo, Japan
Duration: 23 Sep 2024 to 27 Sep 2024

Publication series

Name: INLG 2024 - 17th International Natural Language Generation Conference, Proceedings of the Conference

Conference

Conference: 17th International Natural Language Generation Conference, INLG 2024
Country/Territory: Japan
City: Tokyo
Period: 23/09/24 to 27/09/24
