Monday, February 27, 2012

How do we peer-review code?

Posted on 8:12 AM by Unknown

There's a fascinating article in the current issue of Nature magazine online: The case for open computer programs, by Darrel C. Ince, Leslie Hatton & John Graham-Cumming.

The article deals with the problem of successfully and adequately peer-reviewing scientific research in this age of experiments which are supported by extensive computation.

A central difficulty, the article notes, is reproducibility, by which the authors mean the reproduction of a scientific paper's central finding, rather than exact replication of each specific numerical result down to several decimal places.

There are some philosophy-of-science issues that are debated in the article, but in addition one of the core questions is this: when attempting to reproduce the results of another's experiment, the reviewers may need to reproduce the computational aspects as well as the data-collection aspects. Is the reproduction of the computational aspects of the experiment best performed by:

  1. taking the original experiment's literal program source code, possibly code-reviewing it, and then re-building and re-running it on the new data set, or
  2. taking a verbal specification of the original experiment's computations, possibly design-reviewing that specification, and then re-implementing and re-running it on the new data set?

Hidden within the discussion is the challenge that, in order for the first approach to be possible, the original experiment must disclose and share its source code, which is currently not a common practice. The authors catalog a variety of current positions on the question, noting specifically that “Nature does not require authors to make code available, but we do expect a description detailed enough to allow others to write their own code to do similar analysis.”

The authors find pros and cons to both approaches. Regarding the question of trying to reproduce a computation from a verbal specification, they observe that:

    Ambiguity in program descriptions leads to the possibility, if not the certainty, that a given natural language description can be converted into computer code in various ways, each of which may lead to different numerical outcomes. Innumerable potential issues exist, but might include mistaken order of operations, reference to different model versions, or unclear calculations of uncertainties. The problem of ambiguity has haunted software development from its earliest days.

which is certainly true. It is very, very hard to reproduce a computation given only a verbal description of it.
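
To see how easily that can happen, here is a tiny Python sketch of my own (purely hypothetical, not an example from the article): the imagined verbal specification "report the average signal-to-background ratio" admits two perfectly reasonable readings, and they do not produce the same number.

    # Hypothetical verbal spec: "report the average signal-to-background ratio".
    # Two defensible readings of that sentence follow; they disagree.

    signal = [1.0, 4.0, 9.0]
    background = [2.0, 2.0, 3.0]

    # Reading 1: the mean of the per-sample ratios.
    mean_of_ratios = sum(s / b for s, b in zip(signal, background)) / len(signal)

    # Reading 2: the ratio of the means.
    ratio_of_means = (sum(signal) / len(signal)) / (sum(background) / len(background))

    print(mean_of_ratios)   # 1.8333333333333333
    print(ratio_of_means)   # 2.0

Both snippets faithfully implement the sentence, yet a reviewer working only from the prose has no way to know which one the original experimenters actually ran.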

Meanwhile, they observe that computer programming is also very hard, and there may be errors in the original experiment's source code, which could be detected by code review:

    First, there are programming errors. Over the years, researchers have quantified the occurrence rate of such defects to be approximately one to ten errors per thousand lines of source code.

    Second, there are errors associated with the numerical properties of scientific software. The execution of a program that manipulates the floating point numbers used by scientists is dependent on many factors outside the consideration of a program as a mathematical object.

    ...

    Third, there are well-known ambiguities in some of the internationally standardized versions of commonly used programming languages in scientific computation.

which is also certainly true.
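
The floating-point point, in particular, is easy to demonstrate. A minimal sketch (my own illustration, not the authors'): floating-point addition is not associative, so merely regrouping the same sum changes the result, and a prose description such as "add up the corrections" says nothing about the grouping.

    # Floating-point addition is not associative: the same three numbers,
    # summed with two different groupings, give two different doubles.

    a = (0.1 + 0.2) + 0.3   # left-to-right evaluation
    b = 0.1 + (0.2 + 0.3)   # same numbers, different grouping

    print(a)        # 0.6000000000000001
    print(b)        # 0.6
    print(a == b)   # False

The discrepancy here is tiny, but in a long-running computation that accumulates millions of such operations (or evaluates them in parallel, where the order may not even be deterministic), the drift can become visible in the final figures.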

The authors conclude that high-quality science would be best served by encouraging, even requiring, published experimental science to disclose and share the code that the experimenters use for the computational aspects of their findings.

Seems like a pretty compelling argument to me.

One worry I have, which doesn't seem to be explicitly discussed in the article, is that programming is hard, so if experimenters routinely disclose their source code, others attempting to reproduce their results might simply take the existing code and re-use it without studying it thoroughly. That could lead to a worse outcome: an undetected bug in the original program would propagate into the reproduction and gain further apparent validity. Had the second team instead re-written the code from first principles, their independent implementation would quite likely not contain the same bug, and the problem would be more likely to come to light.

Anyway, it's a great discussion and I'm glad to see it going on!
