How to open source your stuff

One of the most rewarding things about open sourcing your stuff – for example, some code we developed for an internal project – is seeing what clever people do with it in ways you wouldn’t expect.

At my talk at eDemocracy 08, I put up the chart below to try to illustrate four of the key areas in which open source helps to build community, and why you might want to open source your stuff:

[Chart: Open Source]

  • Support: a group of people who talk about, and can help each other solve problems with, the code
  • Direction: people who contribute ideas about where the code should go in future, what features would be useful and so on
  • Rigour: more eyeballs and fingers exposing the weaknesses and limitations in your code, and gently pointing them out to you (and sometimes fixing them themselves)
  • Marketing: advocates who talk about your code online, and help spread the word – generating great PR for your organisation

In spite of the UK Government’s new action plan, there are some understandable and usually fatal barriers to open sourcing the code and information we produce as civil servants:

  • Ownership and sunk costs: We paid for this. Why give it away for free? Or at least, only give it to other nice public sector people.
  • Licensing: Is it Crown Copyright? What’s Click-Use? Who decides this stuff anyway? Let’s not bother the lawyers with this trivial stuff – they probably wouldn’t know anyway.
  • Publishing: How do you publish open source stuff? Can’t really put it on the corporate site. Don’t you need some complicated version control repository?
  • Documentation & support: I can’t be bothered to take the hard coded values out of the scripts and comment it all. I’ve got a day job – what if people keep asking for help using the code?
  • Lack of incentive: I’ve got more urgent and important things to do. Nobody is chasing me (or paying me) to do this.

That list certainly explains why I never got round to it before. Sure, doing open source really well – a true community endeavour like Linux or WordPress – can take a lot of work. But in my experience open sourcing your stuff can be done fairly easily:

1. Clean up the code (a bit)
Make a copy of your scripts. Go through deleting the obsolete bits, the dead ends and the useless commented-out stuff. Where you can, move your hard-coded variables to a configuration file, or at least move them to the top of each script and comment them clearly. Double check you’ve taken out the hard-coded email addresses, usernames and passwords for your server. But don’t go mad trying to refactor everything to be super-efficient – getting it out there is more important than getting it perfect.
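
As a rough sketch of what that tidy-up might look like in Python: the first function reads settings from a configuration file instead of hard-coding them, and the second flags lines that still seem to contain an email address or a password-like value. The section name, setting names and search patterns are all made up for illustration, and the patterns will miss things, so treat this as a prompt for a manual check rather than a guarantee.

```python
import configparser
import re

# Illustrative patterns only; this is not an exhaustive secret scanner.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CREDENTIAL = re.compile(
    r"(?i)\w*(password|passwd|pwd|username|secret|api_key)\w*\s*[:=]\s*['\"][^'\"]+['\"]"
)


def load_settings(config_text):
    """Read values that used to be hard-coded from INI-style config text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    site = parser["site"]  # hypothetical section name
    return {
        "contact_email": site["contact_email"],
        "items_per_page": site.getint("items_per_page"),
    }


def find_leftovers(source):
    """Return (line number, kind, matched text) for each suspicious line."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in (("email", EMAIL), ("credential", CREDENTIAL)):
            match = pattern.search(line)
            if match:
                hits.append((number, kind, match.group(0)))
    return hits
```

You could run find_leftovers over the text of each script before packaging it up, and keep the real configuration file out of the release, shipping a sample one with dummy values instead.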

2. Write a Read Me
Create a new text file and call it ‘Read Me’. Describe the code briefly, give it a name and a version number. Make sure you’ve credited other people whose code you’ve used, and link to the original libraries (or include them in your package for redistribution). Describe the steps for installation, any special system requirements, known bugs and any beartraps to watch out for. If you’re releasing an update, briefly add a description of what the new version offers. If you can’t realistically provide support, say so – people will understand. Add a disclaimer so users are clear that liability for the code once they use it is theirs, not yours. But do provide an email address or URL for feedback in any case, so people can help you by reporting bugs and requesting features. Worst case, you can always just file the emails until you have time.
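
If it helps to have a starting point, here is a minimal sketch in Python that assembles a Read Me from the pieces listed above. The function name, section headings and disclaimer wording are my own assumptions rather than any standard, so adapt them freely (and check the disclaimer against whatever licence you choose).

```python
def build_readme(name, version, summary, credits, install_steps,
                 known_issues, feedback_contact, supported=False):
    """Assemble a plain-text Read Me from the pieces described in this step."""
    support_note = (
        "Support is available via the contact below."
        if supported
        else "Sorry, no support can be provided for this code."
    )
    lines = [
        f"{name} {version}",
        "",
        summary,
        "",
        "Credits",
        *[f"  - {credit}" for credit in credits],
        "",
        "Installation",
        *[f"  {number}. {step}" for number, step in enumerate(install_steps, start=1)],
        "",
        "Known issues",
        *[f"  - {issue}" for issue in known_issues],
        "",
        "Support",
        f"  {support_note}",
        "",
        "Disclaimer",
        # Placeholder wording, not legal advice.
        "  This code is provided as is; you use it at your own risk.",
        "",
        "Feedback",
        f"  Bug reports and feature requests: {feedback_contact}",
    ]
    return "\n".join(lines) + "\n"
```

Calling build_readme("Example Tool", "1.0", ...) and writing the result to a text file gives you a skeleton to edit by hand, which is usually quicker than starting from a blank page.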

3. Decide on your licence
(This is slightly squiffy so tell me in the comments if I’m advising people wrongly on this)

If you wrote the code on your employer’s time, and you’re working for the civil service, it’s probably Crown Copyright. That’s who formally owns the code. But you can decide how that property is used. As a rule, if you want people to be free to re-use your information, the OPSI’s Click Use licence is a fairly simple and very flexible licensing scheme. But if you’re talking about code, it’s a bit of a grey area. Smarter people than me reckon it’s OK to use the GNU Affero licence (AGPL). Strictly speaking that’s a copyleft licence rather than a permissive one: people are free to adapt the code, but if they distribute a modified version, or run one as a service over a network, they have to make their modified source available under the same licence. It does remove even the small barrier of requiring your users to sign up for a Click Use licence. Make sure you describe and link to the licence you’re using in your Read Me file and preferably within the main parts of the code itself.
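
For the ‘within the code itself’ part, a small helper can stamp a one-line licence notice at the top of each script. This is only a sketch: the notice format is my own invention, the licence name and URL are whatever you pass in, and some licences ask for a longer header, so check the text of the one you pick.

```python
def add_licence_notice(source, licence_name, licence_url, comment_prefix="#"):
    """Return source text with a one-line licence notice near the top.

    The notice format is a made-up convention for illustration.
    """
    notice = f"{comment_prefix} Licence: {licence_name} ({licence_url})"
    if notice in source:
        return source  # already stamped, so leave the file alone
    lines = source.splitlines(keepends=True)
    # Keep a shebang line (e.g. "#!/usr/bin/env python") as the first line.
    if lines and lines[0].startswith("#!"):
        shebang = lines[0] if lines[0].endswith("\n") else lines[0] + "\n"
        return shebang + notice + "\n" + "".join(lines[1:])
    return notice + "\n" + source
```

Because the function returns the text unchanged when the notice is already there, it should be safe to run over the same files more than once.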

4. Set up a static page somewhere, upload your code to it, and promote it
One day, there might be a lovely Sourceforge-like open source repository for government code. In the meantime, you can always just put up a nasty-looking page or a simple blog with a description and a link to a Zip file of your code. Or you could use Sourceforge or Google Code if you don’t mind too much about the branding. Then tell people about the code on your blog or via Twitter.
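
Making the Zip file itself can be scripted too. Here is a minimal sketch using Python’s standard zipfile module; the function name and the list of file types to leave out are assumptions, so adjust them to whatever clutter your own project accumulates.

```python
import zipfile
from pathlib import Path


def make_release_zip(source_dir, zip_path, skip_suffixes=(".pyc", ".log")):
    """Zip every file under source_dir, skipping unwanted file types.

    Write the Zip file somewhere outside source_dir so it does not
    end up inside itself.
    """
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file() and path.suffix not in skip_suffixes:
                # Store paths relative to the parent, so the archive
                # unpacks into a single top-level folder.
                arcname = path.relative_to(source_dir.parent).as_posix()
                archive.write(path, arcname=arcname)
    return zip_path
```

Putting the version number in the file name (something like mytool-1.0.zip) makes it easier for people to tell you which release broke.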

5. Provide a feedback channel
Ideally, set up a blog or forum where people can come together to discuss your code, report bugs and request features. At the very least, promote a URL, email address or Twitter handle where people can reach you to tell you when it breaks their server (sorry again, Paul).

The point I’m making here is that you don’t have to be purist about open sourcing your code. Feel good about making a contribution, not bad about the mess you think you’re exposing. Feel happy that you’re letting other taxpayers benefit from your time and experience, not anxious about losing control over ‘your work’. And above all, just get version 1.0 out there for people to use, critique and improve.

The Someday List: 1. Licensing

Seems all the cool kids are doing a series of themed blog posts, so I’ll join the party: over the next few weeks I’m going to cover four topics from my ‘Someday/Maybe’ list of applying social media in government:

  • Licensing
  • Accessibility
  • Guidance
  • Evaluation

Let’s start with the one I’m most sketchy about: licensing.

Punched paper tape
Image credit: Marcin Wichary (licensed under Creative Commons)

A couple of weeks ago, Richard Allen from the Power of Information Taskforce posted a useful set of links for local authorities looking to unlock the power of their information, including some basic information about Click-Use licensing, which I’ve come across but never fully got my head around.

I like all the PoI data reuse stuff – it speaks to me. For a couple of years now, I’ve run a website which tries to get public sector jobs information out to a wider audience (and 20,000 visitors/month seem to want it). Like others (I suspect), I have a soft spot for Neighbourhood Statistics, from my mis-spent time trying in vain to find a way to make it usable. And I love services like UniStats when I come across them at work.

But I’ve got a day job which is mainly about other things. I said in a plaintive comment on the PoI blog:

“This isn’t a whine, but to be honest these issues are just too far down my list to really get the attention they need, and my team is too small for us to really get under the skin of it. I suspect there are many of us in government temperamentally predisposed to open up the information we help to manage, but never quite managing to get it done. Similarly when it comes to building APIs to data.

Could the Taskforce provide some kind of help – boiled down practical guidance, a helpdesk, some priorities, template business cases or model approaches – that we could use to help us move forward in this area quickly and confidently?”

Contrary to Wired’s provocative nonsense, that comment led to a flurry of activity. Adrian Norman got in touch via this blog, and we met today to chat about markets and precedents in public sector information, marginal cost, FoI, and the problem of incentives for people in government to make their data readily re-usable. He has an ambitious solution of his own: use software to auto-generate Information Asset Registers for public sector organisations, linked to a Europe-wide marketplace where the costs and value of the data can be more transparently assessed and the information more easily traded. If nothing else, he reminded me of the market value of what we hold, and that it’s not necessarily about giving stuff away for free.

Another response to my query came from Carol Tullo, Director of Information Policy and Systems at OPSI, who gently suggested I make contact and tap into their help, which I’m doing at a meeting next week.

Which brings me to my point: what would be useful to know, as busy, jobbing webbies – the gatekeepers and enthusiasts for low-cost web publishing – to help us kick start more data syndication, licensing and re-use in our organisations?

Here’s my starter for ten (eagle-eyed readers will spot that I don’t have the foggiest about any of this, and a seriously non-legal mind):

  • We have increasing amounts of content (pictures, video, blog posts, methodology documents etc) which I’d like to share with the world, for others to comment on, adapt and reuse. What’s the best way to do that?
  • Can we license stuff under GPL or Creative Commons?
  • What if we use open source stuff and build upon it – can we ‘share alike’ under the same terms?
  • If it’s created by a Civil Servant, I understand it’s probably Crown Copyright, but I’m not sure what that means from a reuse perspective. I know it sometimes gets waived anyway. So what’s the deal there?
  • I’ve heard dark things about the legal terms imposed by some of the online services out there such as YouTube. What should we be watching out for, if anything?
  • What really is ‘Click Use’ and is it the solution to my quest for a simple Creative Commons-style licence I can slap on stuff we create?
  • What should I say when talking to data holders in my department about this, to convince them to (i) look for and (ii) store and publish in reusable ways the data they hold?

That’s my list so far: what would you like me to ask about? Or what has your experience been?