UPDATED: corrected the last part of the expression from [0-9][ABD-HJLNP-UXYZ]{2} to [0-9][ABD-HJLNP-UWXYZ]{2}

Just doing a bit of DataMapper validation, and found myself wanting a regular expression for UK postcodes. Well, it turns out the UK government specifies exactly what an address is (possibly the hardest data modelling problem I know of!), and also what a postcode is. What's more, their XML schema gives the official regular expression for a postcode.

It needs a bit of translation for Ruby 1.8, because the regex library doesn't support character class subtraction, e.g. [A-Z-[BTV]]. So I've had to expand these. If there are any errors, it's my fault. Please check them if you're going to use the regex yourself, and let me know what you find.
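
One way to check an expansion mechanically is to compare the letters an expanded class matches against A-Z minus the subtracted letters. This is only a sketch: expansion_matches? and CHECKS are made-up names, and the subtracted sets (QVX, IJZ, CIKMOV) are inferred by taking the complement of each expanded class, not copied from the schema, so verify them against the official pattern.

```ruby
# Hypothetical helper: true if `expanded` (a character class given as a
# string) matches exactly the letters A-Z minus those listed in `removed`.
def expansion_matches?(expanded, removed)
  wanted = ('A'..'Z').to_a - removed.split(//)
  actual = ('A'..'Z').select { |c| c =~ /\A#{expanded}\z/ }
  actual == wanted
end

# Expanded class => letters assumed to be subtracted in the schema.
# These sets are inferred from the expansions, not taken from the schema.
CHECKS = {
  '[A-PRSTUWYZ]'      => 'QVX',    # assumed [A-Z-[QVX]]
  '[A-HK-Y]'          => 'IJZ',    # assumed [A-Z-[IJZ]]
  '[ABD-HJLNP-UWXYZ]' => 'CIKMOV'  # assumed [A-Z-[CIKMOV]]
}

CHECKS.each do |expanded, removed|
  puts "#{expanded} vs [A-Z-[#{removed}]]: #{expansion_matches?(expanded, removed)}"
end
```

Under the same assumption, the earlier class from the update note, [ABD-HJLNP-UXYZ], fails the check because it leaves out W.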

I've broken the full expression onto multiple lines because it's about a mile long, but it's a one-line expression. I haven't investigated using free format mode (the /x flag) with it. There are only two spaces in the expression: one inside GIR 0AA and one before [0-9][ABD-HJLNP-UWXYZ]{2}. Re-assemble before use!

POSTCODE_REGEX = /
  (GIR 0AA)|
  ((([A-PRSTUWYZ][0-9][0-9]?)|
  (([A-PRSTUWYZ][A-HK-Y][0-9][0-9]?)|
  (([A-PRSTUWYZ][0-9][A-HJKSTUW])|
  ([A-PRSTUWYZ][A-HK-Y][0-9][ABEHMNPRVWXY])))) [0-9][ABD-HJLNP-UWXYZ]{2})
/
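
A minimal sketch of what a free format (/x) version might look like, so it can stay on multiple lines. It rests on a few assumptions: POSTCODE_REGEX_X is a made-up name, the two literal spaces are written as "\ " because /x ignores unescaped whitespace, the nested capturing groups are flattened into non-capturing ones, and the \A and \z anchors are an addition (not part of the official pattern) so that the whole string has to be a postcode. It hasn't been checked against the official schema.

```ruby
# Extended-mode (/x) sketch of the expression above.
# Unescaped whitespace and newlines are ignored under /x, so both
# literal spaces in the pattern are escaped as "\ ".
# \A and \z are an addition here; the official pattern is unanchored.
POSTCODE_REGEX_X = /\A(?:
  GIR\ 0AA |
  (?:
    [A-PRSTUWYZ][0-9][0-9]? |
    [A-PRSTUWYZ][A-HK-Y][0-9][0-9]? |
    [A-PRSTUWYZ][0-9][A-HJKSTUW] |
    [A-PRSTUWYZ][A-HK-Y][0-9][ABEHMNPRVWXY]
  )
  \ [0-9][ABD-HJLNP-UWXYZ]{2}
)\z/x

# Quick spot checks with well-known postcode shapes.
%w[GIR\ 0AA M1\ 1AE B33\ 8TH W1A\ 0AX SW1A\ 1AA].each do |code|
  puts "#{code}: #{code =~ POSTCODE_REGEX_X ? 'ok' : 'rejected'}"
end
```

The pattern only accepts upper-case input with a single space before the final three characters, so user input would probably need upcasing and whitespace tidying first.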