User translation goes to unmodified MT on very quick edits after placeholder click
Closed, Resolved, Public

Description

Steps to reproduce

  1. Start a fresh translation and translate one small section of text using an MT engine such as Apertium.
  2. BEFORE that section is saved, add four words. MT usage drops to 85% in my case.
  3. After the four added words are saved, go back to the dashboard and open the draft again.
  4. Result: the marker is missing.
  5. Add one more word: the percentage shows 96%, which means everything (including my four added words) was counted as MT content.

What is happening:

Changes are debounced by 500 ms. Any edit made within this debounce window after the placeholder has been filled is therefore considered part of the unmodified MT content. This is because the section state has not been populated yet, so whatever content arrives first is treated as unmodified MT.
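The effect can be reproduced with a small simulation. This is only a sketch in Python, not the extension's JavaScript code: the class name, its fields and the explicit clock values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebouncedSectionTracker:
    """Toy model of a debounced section-change handler (hypothetical names)."""

    delay_ms: int = 500
    unmodified_mt: Optional[str] = None  # baseline captured on first processing
    _pending: Optional[str] = None
    _due_at: Optional[int] = None

    def on_change(self, now_ms: int, content: str) -> None:
        # Every change restarts the timer and overwrites the pending content.
        self._pending = content
        self._due_at = now_ms + self.delay_ms

    def tick(self, now_ms: int) -> None:
        # Process the pending change once the debounce delay has elapsed.
        if self._due_at is not None and now_ms >= self._due_at:
            if self.unmodified_mt is None:
                # The section state is still empty, so whatever content
                # arrives first is recorded as unmodified MT.
                self.unmodified_mt = self._pending
            self._pending = None
            self._due_at = None


tracker = DebouncedSectionTracker(delay_ms=500)
tracker.on_change(0, "MT output")                     # placeholder filled by MT
tracker.on_change(300, "MT output plus four words")   # user edit inside the window
tracker.tick(800)                                     # handler finally runs
baseline = tracker.unmodified_mt
```

Here `baseline` ends up holding the text that already contains the user's words, which is the misattribution described above.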

Event Timeline

As part of T200683: CX2: Adjust when to check warnings for a paragraph, validation no longer happens on the first edit of a section, so processSectionChange does less work than it used to. Reducing the debounce delay might therefore prevent edits from slipping into this half-second gap.

I don't want to attempt throttled change handling, since it is very important to provide a smooth editing experience. Doing work while the user is typing does not behave nicely.

Change 459977 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] Reduce the debounce delay for section change handler

https://gerrit.wikimedia.org/r/459977

Change 459978 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] Before saving, make sure change queue is processed

https://gerrit.wikimedia.org/r/459978

This patch reduces the window in which this can happen from 500 ms to 100 ms (that is, if the user is faster than this, their edits will still be considered part of the source text).

Before closing this task, I recommend filing a tech-debt follow-up task to refactor the code so that this never happens.

Change 459977 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Reduce the debounce delay for section change handler

https://gerrit.wikimedia.org/r/459977

Change 459978 merged by Santhosh:
[mediawiki/extensions/ContentTranslation@master] Before saving, make sure change queue is processed

https://gerrit.wikimedia.org/r/459978

Petar.petkovic subscribed.

I would say the behavior isn't better with the two patches merged. The problem from the description still exists and the MT percentage in the issue card no longer updates while the user types.

> MT percentage in the issue card no longer updates while the user types.

It is not meant to update as the user types. As per T200683: CX2: Adjust when to check warnings for a paragraph, the calculation and the issue card update happen 15 seconds after the last edit in a paragraph.

> The problem from the description still exists

Did you manage to make an edit within the 100 ms gap and have it added to the unmodified MT? I was not happy with the reduced-time solution. We need to improve it. One solution I can think of is:

  1. sectionChange event triggers TranslationController#addToChangeQueue
  2. TranslationController#addToChangeQueue calls this.translationTracker.pushToChangeQueue( sectionNumber );.
  3. In translationTracker, if we have a way to check whether this is the first change for the current MT engine in the section state, we can process it immediately, without debouncing, so that we don't miss the 'pure form of unmodified MT'
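The three steps above can be sketched as follows. This is a Python illustration of the idea only, not the ContentTranslation code: the class, the `(section, engine)` key and the explicit clock are assumptions, and only the 100 ms delay comes from the merged patch.

```python
from typing import Dict, Tuple

Key = Tuple[int, str]  # (section number, MT engine), a hypothetical key


class TranslationTrackerSketch:
    def __init__(self, delay_ms: int = 100) -> None:
        self.delay_ms = delay_ms
        self.unmodified_mt: Dict[Key, str] = {}
        self.user_translation: Dict[Key, str] = {}
        self._pending: Dict[Key, Tuple[int, str]] = {}  # key -> (due_at, content)

    def push_to_change_queue(self, section: int, engine: str,
                             now_ms: int, content: str) -> None:
        key = (section, engine)
        if key not in self.unmodified_mt:
            # First change for this section and engine: process it at once,
            # so the pure MT output is captured before any user edit.
            self.unmodified_mt[key] = content
            return
        # Later changes are debounced as before.
        self._pending[key] = (now_ms + self.delay_ms, content)

    def process_due(self, now_ms: int) -> None:
        for key, (due_at, content) in list(self._pending.items()):
            if now_ms >= due_at:
                self.user_translation[key] = content
                del self._pending[key]


sketch = TranslationTrackerSketch(delay_ms=100)
sketch.push_to_change_queue(1, "Apertium", 0, "MT output")
sketch.push_to_change_queue(1, "Apertium", 50, "MT output plus four words")
sketch.process_due(150)
```

With this ordering, an edit made 50 ms after the placeholder is filled is stored as user translation, and the baseline still holds the raw MT output.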

> Did you manage to do any edit in 100ms gap and add that to unmodified MT? I was not happy with the reduced time solution. We need to improve it.

I was not attempting to make an edit in such a short gap. At a normal pace, I performed the following actions:

  1. Start a fresh translation
  2. Add one paragraph
  3. Before the first save, add some words to that paragraph
  4. Add one more paragraph, so that the MT abuse warning is triggered for the first one. I saw 86% in my example
  5. Wait for the first save, return to the dashboard and load the translation again
  6. Start editing the same paragraph that was edited in the first session
  7. As soon as you start typing, the warning is shown and says that the MT percentage is 100%

This looks like a problem in saving. If both the unmodified MT and the user translation are in the save queue, only the user translation gets saved.
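One way this kind of loss can arise is a queue keyed by section alone. The Python sketch below is purely illustrative and makes no claim about how the actual save queue is implemented; the origin labels and function names are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical origin labels for the two versions of a section.
UNMODIFIED_MT = "unmodified-mt"
USER = "user"


def enqueue_by_section(queue: Dict[int, Tuple[str, str]],
                       section: int, origin: str, content: str) -> None:
    # Keyed by section only: a later entry silently replaces an earlier one.
    queue[section] = (origin, content)


def enqueue_by_section_and_origin(queue: Dict[Tuple[int, str], str],
                                  section: int, origin: str, content: str) -> None:
    # Keyed by (section, origin): both versions survive until the save runs.
    queue[(section, origin)] = content


lossy: Dict[int, Tuple[str, str]] = {}
enqueue_by_section(lossy, 1, UNMODIFIED_MT, "MT output")
enqueue_by_section(lossy, 1, USER, "MT output plus four words")

safe: Dict[Tuple[int, str], str] = {}
enqueue_by_section_and_origin(safe, 1, UNMODIFIED_MT, "MT output")
enqueue_by_section_and_origin(safe, 1, USER, "MT output plus four words")
```

In the first queue only the user entry remains for section 1; in the second, both the unmodified MT and the user translation are still present when the save runs.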

image.png (310×768 px, 50 KB)

Change 463741 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] WIP: Do not miss the unmodified MT in saving

https://gerrit.wikimedia.org/r/463741

Change 463741 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Save: Don't miss unmodified MT while there is modified content

https://gerrit.wikimedia.org/r/463741

Etonkovidova subscribed.

Re-checked both scenarios - @santhosh (in the task description) and @Petar.petkovic - the fix is in place: added (typed) words to the first translated paragraph are not counted as MT.

Notes:
(1) If a user edits a paragraph that is marked with MT abuse, the update does not happen promptly; another paragraph needs to be added or the page needs to be refreshed to see the updated MT abuse stats.

(2) There is another issue: pasted text triggers the MT abuse warning "Your translation cannot be published because it contains too much unmodified machine-translated text". I think this is quite a valid use case: a user might translate or edit in some text editor and paste the text into the destination article on the ContentTranslation page. I filed it as T207913: CX2: User created content triggers too much unmodified text error when typing and pasting.

And there are still issues with progress calculation. If one paragraph is 86% MT and another one is 100% - the progress bar will say that 50% is MT.

> Notes:
> (1) If a user edits a paragraph that is marked with MT abuse, the update does not happen promptly; another paragraph needs to be added or the page should be refreshed to see the updated MT Abuse stats.

This is following the design, according to "Check other pending paragraphs when a new paragraph is added to the translation" in T200683. We don't want to disturb users too early.

> And there are still issues with progress calculation. If one paragraph is 86% MT and another one is 100% - the progress bar will say that 50% is MT.

That is expected given the way progress calculation works.

The total number of sections in the article, which we call X, is used as the 100% mark. If Y (Y ≤ X) sections have been added to the translation, total progress is calculated as Y / X.

MT usage is calculated per section. In your example, the MT amount for the first section is calculated as 86%, and the other section is unchanged, so its MT percentage is 100%. Only the number of completely unchanged sections, those with 100% MT content, is used when calculating MT usage for the translation as a whole. If we call the number of such sections Z, the calculation becomes Z / Y. In your example Z = 1 and Y = 2, so the progress bar reports 50% MT.
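The arithmetic can be written out as a short Python sketch. The function name and the article size of 10 sections are assumptions for illustration; only the Y / X and Z / Y formulas and the 86% and 100% figures come from this discussion.

```python
from typing import List, Tuple


def translation_progress(total_sections: int,
                         section_mt_ratios: List[float]) -> Tuple[float, float]:
    """Return (total progress, MT usage) using the X, Y, Z definitions above."""
    x = total_sections                                   # X: sections in the article
    y = len(section_mt_ratios)                           # Y: sections added so far
    z = sum(1 for r in section_mt_ratios if r >= 1.0)    # Z: fully unmodified MT sections
    progress = y / x if x else 0.0
    mt_usage = z / y if y else 0.0
    return progress, mt_usage


# Two translated sections: one at 86% MT, one at 100% MT.
# The total of 10 sections is an assumed value.
progress, mt_usage = translation_progress(10, [0.86, 1.0])
```

Here `mt_usage` is 0.5: the 86% section does not count toward Z at all, which is why the progress bar shows 50% rather than an average of the two per-section percentages.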