Tuesday, September 8, 2015

On Releasing the Hugo Nomination Ballots

Glenn Glazer, Vice-Chair of Business and Finance at Sasquan, released a statement regarding the motion to release the nomination ballots for the Hugo Awards this year:
Back at Sasquan, the BM passed a non-binding resolution to request that Sasquan provide anonymized nomination data from the 2015 Hugo Awards. I stood before the BM and said, as its official representative, that we would comply with such requests. However, new information has come in which has caused us to reverse that decision. Specifically, upon review, the administration team believes it may not be possible to anonymize the nominating data sufficiently to allow for a public release. We are investigating alternatives.
Thank you for your patience in this matter. While we truly wish to comply with the resolution and fundamentally believe in transparent processes, we must hold the privacy of our members paramount and I hope that you understand this set of priorities.
This is a little strange; one would think that it really should not be that hard to anonymize the ballots.

As a result, conspiracy theories abound. Vox Day has weighed in:
This is not acceptable. This is not even REMOTELY acceptable. If you voted in the 2015 Hugo Awards, I encourage you to contact Sasquan and demand that they released the anonymized nomination data.  
I find it very difficult to believe they are refusing to release it because it might make the Rabid Puppies look bad; we already know that the SJW message that the Puppies voted in lockstep is completely false. So, the question is: what voting patterns tend to embarrass whom?
Let's look at the usual suspects. Patrick Nielsen Hayden had 65 votes for Best Long Form Editor. John Scalzi had 168 votes for Best Novel and 78 votes for Best Novella. Not exactly suspicious, although I expect there is considerable overlap between Editor and Novella there.
as has John C. Wright:
If you voted, please write Sasquan, and demand, not ask, that they release the nomination data. The idea that the data must be kept private to avoid someone from deducing the voter’s identities is an absurd lie, not worth wasting ink to refute. They are trying to hide a bloc voting pattern, or a large number of votes that were entered after voting closed or something of the sort.
Turns out that anonymization is harder than I thought. Hugo Administrator John Lorentz, in an exchange on File 770, elaborated:
[Commenter] “With the Hugo data, the only identifying info is the membership number. Remove that, and the ballot has been anonymized.” 
[Brian C] No, it’s not nearly that simple. You also need to eliminate any nominations that are unique to one or a handful of people, as otherwise those nominations could be used to identify people. But then those ballots aren’t actually representative for the purpose of testing the algorithm. So you need to actually replace those with other nominations, that happen not to perturb the algorithm in any way. 
[John Lorentz] And that is the problem that our Hugo system admin folks have been running into. When one of them generated a draft of anonymized nominating data, it didn’t take the other very long to determine who some of the voters were, simply from the voting patterns.
The following article gives several examples of attempted anonymization of data that unraveled when the data were probed more deeply or combined with other data sources. Most of the examples would not apply here: the ballots would not include zip codes or birth dates. But one can easily imagine identifying some voters with high probability. For example, suppose one piece of fiction received only one nomination. Anyone who knows who championed that work can pick out that voter's ballot and read off everything else on it.
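To make the single-nomination problem concrete, here is a minimal Python sketch. It is not anything the Hugo administrators actually ran; the ballots, the function names, and the threshold k are all made up for illustration. It takes ballots with the membership numbers already stripped and flags every ballot that contains a work named by fewer than k voters.

```python
from collections import Counter

def rare_nominations(ballots, k=2):
    """Return the works that appear on fewer than k ballots."""
    # Count each work once per ballot, even if a ballot lists it twice.
    counts = Counter(work for ballot in ballots for work in set(ballot))
    return {work for work, n in counts.items() if n < k}

def exposed_ballots(ballots, k=2):
    """Return the indices of ballots that contain at least one rare work."""
    rare = rare_nominations(ballots, k)
    return [i for i, ballot in enumerate(ballots) if rare & set(ballot)]

# Hypothetical ballots; no membership numbers, just lists of nominated works.
ballots = [
    ["Novel A", "Novel B"],
    ["Novel A", "Novel B", "Obscure Novel C"],
    ["Novel A", "Novel D"],
    ["Novel B", "Novel D"],
]

print(rare_nominations(ballots))  # works named by only one voter
print(exposed_ballots(ballots))   # ballots that those works give away
```

In this toy data only one ballot names "Obscure Novel C", so anyone who knows who has been talking up that book learns the rest of that voter's ballot too. Raising k flags more ballots, which is the trade-off Brian C describes: the more you suppress or replace, the less the released data resembles the real ballots.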

I personally would not mind a system in which all nominations were public (or made public after the event); if I am going to champion a book, I am happy to do so publicly. But, then again, I am not an author who has to worry about offending friends who may retaliate professionally.
