A friend of mine asked for a regular expression for UK postcodes. I got so many confusing results on the net that I decided to write one myself.
First I had to find the rules for UK postcodes. A quick search led me to a UK government site, http://interim.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx (as of July 2011; the old address, as of October 2009, was http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx), which gives the rules as follows, where A stands for a letter and N for a digit:
Permitted Format | Example Postcode |
AN NAA | M1 1AA |
ANN NAA | M60 1NW |
AAN NAA | CR2 6XH |
AANN NAA | DN55 1PT |
ANA NAA | W1A 1HQ |
AANA NAA | EC1A 1BB |
Also
* The letters Q, V and X are not used in the first position.
* The letters I, J and Z are not used in the second position.
* The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
* The only letters to appear in the fourth position are A, B, E, H, M, N, P, R, V, W, X and Y.
* The second half of the Postcode is always in a consistent numeric, alpha, alpha format, and the letters C, I, K, M, O and V are never used.
* GIR 0AA is a Postcode that was issued historically and does not conform to current rules on valid Postcode formats. It is, however, still in use.
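I was able to come up with this basic regular expression for UK postcode validation:
^([A-PR-UWYZ](([0-9](([0-9]|[A-HJKSTUW])?)?)|([A-HK-Y][0-9]([0-9]|[ABEHMNPRVWXY])?)) [0-9][ABD-HJLNP-UW-Z]{2}|GIR 0AA)$
This validates the format exactly as per the rules above; both alternatives sit inside one anchored group, so ^ and $ apply to GIR 0AA too. It checks the format only, not whether a postcode has actually been issued.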
Note, however, that it is certainly not optimized. I couldn't find any online regular expression optimizer, and I'll have to become a regex expert to do anything about it. Maybe I can take on an automatic regex optimizer as a mini-project. But for now, the unoptimized version will have to do.
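As a possible first step towards that optimization, here is a rough sketch of a more compact form that I believe accepts the same strings. It merges each nested optional group into a single character class, uses non-capturing groups, and assembles the pattern from one named piece per rule. The class name PostcodePieces, the constant names and the method matchesFormat are made up for this illustration, and the equivalence is only spot-checked against the examples from the table, not proven.

```java
import java.util.regex.Pattern;

// Hypothetical helper: assembles the postcode pattern from one piece per rule.
final class PostcodePieces {
    static final String FIRST  = "[A-PR-UWYZ]";               // position 1: no Q, V, X
    static final String SECOND = "[A-HK-Y]";                  // position 2 letter: no I, J, Z
    static final String THIRD  = "[0-9A-HJKSTUW]";            // position 3: digit or permitted letter
    static final String FOURTH = "[0-9ABEHMNPRVWXY]";         // position 4: digit or permitted letter
    static final String INWARD = "[0-9][ABD-HJLNP-UW-Z]{2}";  // second half: no C, I, K, M, O, V

    // One anchored, non-capturing group holds both the general form and GIR 0AA.
    static final Pattern COMPACT = Pattern.compile(
        "^(?:" + FIRST
            + "(?:[0-9]" + THIRD + "?"              // AN, ANN, ANA
            + "|" + SECOND + "[0-9]" + FOURTH + "?)" // AAN, AANN, AANA
            + " " + INWARD
            + "|GIR 0AA)$");

    static boolean matchesFormat(String code) {
        return COMPACT.matcher(code).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"M1 1AA", "M60 1NW", "CR2 6XH", "DN55 1PT",
                            "W1A 1HQ", "EC1A 1BB", "GIR 0AA", "Q1 1AA"};
        for (String s : samples) {
            System.out.println(s + " -> " + matchesFormat(s));
        }
    }
}
```

Keeping one constant per rule also makes it easier to adjust a single position later without re-reading the whole expression.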
Here is the regular expression in action inside Java code:
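import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Prints whether the given code matches the UK postcode format rules above.
public static void validate(String code) {
    String regexp = "^([A-PR-UWYZ](([0-9](([0-9]|[A-HJKSTUW])?)?)|([A-HK-Y][0-9]([0-9]|[ABEHMNPRVWXY])?)) [0-9][ABD-HJLNP-UW-Z]{2}|GIR 0AA)$";
    Pattern pattern = Pattern.compile(regexp);
    // The expression only accepts upper-case letters, so normalise the input first.
    Matcher matcher = pattern.matcher(code.toUpperCase());
    if (matcher.matches()) {
        System.out.println("This is a valid UK Postcode.");
    } else {
        System.out.println("This is not a valid UK Postcode.");
    }
}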
Note that validate converts the received code to upper case before matching, because the expression itself only accepts upper-case letters. Better yet, handle upper and lower case inside the regular expression itself! Plus, of course, the optimization…
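One way to push the case handling into the expression, as a minimal sketch: compile the pattern once with the Pattern.CASE_INSENSITIVE flag, so the same character classes accept lower-case input. The class name PostcodeValidator and the method isValidFormat are hypothetical. The expression is the one from this post with GIR 0AA placed inside the outer anchored group, and treating null as invalid is my own assumption.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the expression from this post.
final class PostcodeValidator {
    // Compiled once and reused. CASE_INSENSITIVE makes the character
    // classes match lower-case letters too, so no toUpperCase() is needed.
    private static final Pattern UK_POSTCODE = Pattern.compile(
        "^([A-PR-UWYZ](([0-9](([0-9]|[A-HJKSTUW])?)?)|([A-HK-Y][0-9]([0-9]|[ABEHMNPRVWXY])?))"
            + " [0-9][ABD-HJLNP-UW-Z]{2}|GIR 0AA)$",
        Pattern.CASE_INSENSITIVE);

    // Returns true when the code has a valid UK postcode format.
    // Assumption: null input is treated as invalid rather than throwing.
    static boolean isValidFormat(String code) {
        return code != null && UK_POSTCODE.matcher(code).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"EC1A 1BB", "ec1a 1bb", "gir 0aa", "q1 1aa"}) {
            System.out.println(s + " -> " + isValidFormat(s));
        }
    }
}
```

With this in place, "ec1a 1bb" and "EC1A 1BB" are treated the same, and returning a boolean instead of printing leaves the caller free to decide what to do with the result.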
Hi,
Very good article. Very close, but not quite right: the postcode N1P 1AA etc. doesn't validate.
Thanks for posting, Owen.
But N1P 1AA doesn't seem to be a valid code, as the letter "P" is not in the permitted list of letters for the third position (as per rule no. 3)?
http://www.britishpostcodes.info/district/N1P
Thank you for this Java tutorial about regular expressions.
Hi Sachin,
I tried the regex on an invalid postcode and it still matches: HA8 3PW