The need for good redaction keeps coming up around Barack Obama. See this earlier post.
After Obama resigned from his Senate seat to become president, Rod Blagojevich, governor of Illinois, had the job of appointing a successor. Blagojevich was accused of selling the Senate seat, among a long list of other corrupt dealings.
In the trial, Blagojevich's lawyers tried to subpoena Obama, either because his testimony was material to the case or simply to complicate things by dragging the president into it. Their motion was duly posted online, with key paragraphs redacted. Here is the actual file.
The political implications are one thing (see this) but we're interested in something else.
Two interesting points here:
A. Note the double level of redaction:
1. Personally identifying information, such as personal names, is replaced with a semantic category like "labor union official." This leaves the text comprehensible but makes it impossible to identify individuals.
2. Paragraphs are blacked out, eliminating the context even for redacted entities, so that the meaning of parts of the document can no longer be understood.
B. The apparently blacked-out paragraphs were simply hidden behind black layers. Selecting the section and hitting Control-C recovers the hidden text.
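The first level of redaction in point A.1, swapping names for semantic categories, can be sketched in a few lines of Python. This is only a rough illustration, not the method used on the motion: the names, the category labels, and the `redact_names` helper are all hypothetical, and a real system would need to find the names rather than be handed a list.

```python
import re

# Hypothetical name-to-category mapping, invented for illustration only.
CATEGORIES = {
    "Jane Doe": "Labor Union Official",
    "John Roe": "Senate Candidate A",
}

def redact_names(text, categories):
    """Replace each listed name with its bracketed category label."""
    if not categories:
        return text
    # Try longer names first so a short name never clips a longer one.
    names = sorted(categories, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda m: "[" + categories[m.group(0)] + "]", text)

sample = "Jane Doe met John Roe on Tuesday."
redacted = redact_names(sample, CATEGORIES)
print(redacted)  # [Labor Union Official] met [Senate Candidate A] on Tuesday.
```

The point of the design is that the name is removed from the text itself and the category is written in its place, so nothing remains underneath to be copied out.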
You cannot redact with ad hoc tools! A redacted document must not contain any private data, including data that is hidden from the human eye. Because a human reviewer cannot know what sits in hidden sections, a redacted document must not have any unseen sections at all.
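One modest safeguard is to check the text that can actually be extracted from the redacted file, which is what select-all and copy would yield, against the terms that were supposed to be removed. The sketch below assumes the text has already been extracted by some other tool (not shown), and the `find_leaks` helper and sample strings are hypothetical. It catches only known terms, so passing it does not prove that a document has no hidden content.

```python
def find_leaks(extracted_text, sensitive_terms):
    """Return the sensitive terms that still appear in the extracted text."""
    # Collapse whitespace and ignore case, since copied text often has
    # line breaks and irregular spacing in the middle of a name.
    haystack = " ".join(extracted_text.split()).casefold()
    return [
        term for term in sensitive_terms
        if " ".join(term.split()).casefold() in haystack
    ]

# Hypothetical text, as it might come out of a copy and paste.
extracted = "The official met\nJane   Doe to discuss the appointment."
leaks = find_leaks(extracted, ["Jane Doe", "John Roe"])
print(leaks)  # ['Jane Doe']
```

Any non-empty result means the document still carries data that the black boxes only appeared to remove.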
Obama as a case study, part II