Help:ProofreadPage extension

From Wikisource

Wikisource uses the ProofreadPage extension, which allows you to render text along with its corresponding scanned image.

Discussion

The side-by-side ProofreadPage extension displays a transcribed text and a scan of the original document on one page. These pages use the prefix 'Page:', and collections of them are displayed on a page beginning with 'Index:'. While the extension supports many file types, a document at Wikisource is usually a DjVu file with OCR.
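
As a rough illustration of the naming scheme, the sketch below builds titles for an index and for one of its pages. The function names are hypothetical, and the "/N" page-number suffix is an assumption about the usual convention rather than something stated above.

```python
def index_title(file_name: str) -> str:
    """Return the title of the index page for a scanned file."""
    return f"Index:{file_name}"


def page_title(file_name: str, page_number: int) -> str:
    """Return the title of a single transcribed page.

    Assumption: pages are addressed as 'Page:<file>/<number>',
    with numbering starting at 1.
    """
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    return f"Page:{file_name}/{page_number}"


if __name__ == "__main__":
    print(index_title("Sandbox.djvu"))
    print(page_title("Sandbox.djvu", 3))
```

Using the sandbox file mentioned on this page, the two calls would produce the titles of the index and of its third page.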

The ProofreadPage extension is enabled by default at Wikisource and should come up automatically when a page in the "Page:" namespace is edited. Your gadget settings allow you to control certain features, such as whether the OCR button is enabled and whether the text and scan by default appear side-by-side or one above another. Some aspects of the interface need JavaScript to be enabled in your browser; without it, only a basic interface is available.

Users new to proofreading can experiment with the concept at Index:Sandbox.djvu. Working examples can be seen by finding a project in progress, such as Wikisource:Proofread of the Month.

Once you've found a project you want to work on, go to its index page. There, you'll find links to the project's pages, colored by their status (or red if they don't exist yet). Select a page that needs work (any status except Validated or Without text), open it in the editor, make whatever changes are appropriate (to the text, the status, or both), then preview and save.

Anybody can proofread and correct most pages at Wikisource. However, editors must log in to an account in order to change the proofread status; edits made from an IP address cannot change it.

When corrections and formatting are complete, mark the page as Proofread; it is then ready for the main namespace. Leave the page as Not Proofread until it is done. Mark it as Problematic if appropriate (e.g. damaged scan, missing image, or characters that need special care). See Help:Page status for more on these statuses.
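
The selection rule described above (work on any page whose status is not Validated or Without text) can be written down as a small sketch. The `PageStatus` enum and `needs_work` function are hypothetical names used only for illustration; they are not part of the extension.

```python
from enum import Enum


class PageStatus(Enum):
    """The page statuses named on this help page."""
    WITHOUT_TEXT = "Without text"
    NOT_PROOFREAD = "Not Proofread"
    PROBLEMATIC = "Problematic"
    PROOFREAD = "Proofread"
    VALIDATED = "Validated"


def needs_work(status: PageStatus) -> bool:
    """Return True for any status except Validated or Without text."""
    return status not in (PageStatus.VALIDATED, PageStatus.WITHOUT_TEXT)


if __name__ == "__main__":
    for status in PageStatus:
        print(f"{status.value}: needs work = {needs_work(status)}")
```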

Rationale

The ProofreadPage extension is intended to allow easy comparison of text to the original. It has the following advantages:

  • Credibility: it makes it possible for Wikisource to guarantee that the text corresponds to its scanned source.
  • Improved collaboration: texts can be proofread and typos can be fixed by everyone, by providing direct access to the book. This restores the wiki way of collaborating.
  • Security: text is better protected against vandalism (any falsification can be detected immediately; texts are not accessed directly, but through transclusion, which deters inexperienced vandals).
  • No limitations on rendering: a book can be rendered in two different ways, without duplicating data:
      1. As a set of pages. Each page is a column of OCR text beside a column of scanned image. This mode is meant for contributors.
      2. Broken into its logical organization (such as chapters or poems) using transclusion. This mode is meant for readers.
  • Fairness of comparisons: since book pages are not in the 'main' namespace, they are not included in the statistical count of text units. A count of pages is available here. This method of comparison uses the same unit of measure for all texts (the page), which puts an end to the temptation of slicing texts into arbitrarily small units in order to increase statistics.

Limitations

The <poem> tag does not work well because it adds a carriage return at the end of a block. It is also not possible to use <pre> formatting, since the line breaks are suppressed during transclusion. To work around this, add <br /> tags to the beginning of lines, or use the {{ppoem}} template, which offers enhanced formatting for poetry.
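
For longer passages, the <br /> workaround can be applied mechanically before pasting the text into the editor. The following is a minimal sketch, not a Wikisource tool: `add_line_breaks` is a hypothetical helper, and the choice to leave the first line of each stanza unprefixed (treating blank lines as stanza breaks) is an assumption.

```python
def add_line_breaks(text: str) -> str:
    """Prefix '<br />' to each line that continues a stanza.

    Blank lines are kept as stanza separators, and the first line of
    each stanza is left without a prefix (an assumption of this sketch).
    """
    out = []
    prev_blank = True  # treat the start of the text like a stanza start
    for line in text.splitlines():
        blank = not line.strip()
        if blank or prev_blank:
            out.append(line)
        else:
            out.append("<br />" + line)
        prev_blank = blank
    return "\n".join(out)


if __name__ == "__main__":
    poem = "First line\nSecond line\n\nNew stanza\nIts second line"
    print(add_line_breaks(poem))
```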

To ease proofreading images that are rotated, the Rotate Image Firefox extension can be used.

See also
