Regular Expressions for Validating and Formatting GB Telephone Numbers

From aa-asterisk.org.uk wiki


Territory ID: GB or UK
Country Code: +44
National Trunk Prefix: 0
International Dialling Prefix: 00


This is a short guide to validating and formatting GB telephone numbers.

The regular expressions shown here are very similar to those found in the data files used by the open source libphonenumber project. Here the data for GB, GG, IM and JE is combined, whereas libphonenumber treats each of them as a separate country.

Allow the user to enter the telephone number in any format that seems sensible. Do a few basic checks on the prefix and length, then strip the +44 country code or 0 trunk code and any extension details from the number, storing them for later use.

Next, remove all punctuation and spaces from the remainder so that you're left with the 7, 9 or 10 digit National Significant Number.

Use the regular expressions below to determine whether the remaining number is in a valid range, is the correct length for that range, what type of number it is and then how to correctly format it.


Selecting valid GB telephone number ranges

Before you start, there are four concepts that need to be separated. These are "input format", "valid number range and valid number length for this range", "storage format" and "display format".

Many systems try to constrain the user to typing numbers in a particular format, and this is usually a very bad idea. The London number 020 3000 5555 can be written as (020) 3000 5555 or as +44 20 3000 5555, but you'll equally see people writing 0203 000 5555, (0203) 000 5555, 02030 005 555, +44 (0) 20 3000 5555, +44(0)203 000 5555, 00 (44) 2030 005 555, (+44 203) 000 5555, (+44) 203 000 5555, 011 44 203 000 5555, and many others, and the same again each with hyphens in various positions.

The user should be allowed to do that. Most users do not fully understand how telephone numbers work, nor the significance of the spaces and hyphens between country code, area code and local number. Check the number for validity and length first, then let later, more detailed logic reformat it for display.


Match GB telephone number in any format

Pattern: ^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)(?:\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})|\d{5}\)?[\s-]?\d{4,5}|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$

The above pattern matches an optional opening parenthesis, followed by 00 or 011 and an optional closing parenthesis, followed by an optional space or hyphen, followed by an optional opening parenthesis. Alternatively, the initial opening parenthesis is followed by a literal + without a following space or hyphen. Either of these two options is then followed by 44 with an optional closing parenthesis, followed by an optional space or hyphen, followed by an optional 0 in optional parentheses, followed by an optional space or hyphen, followed by an optional opening parenthesis (international format). Alternatively, the pattern matches an optional initial opening parenthesis followed by the 0 trunk code (national format).

The previous part is then followed by the NDC (area code) and the subscriber phone number in 2+8, 3+7, 3+6, 4+6, 4+5, 5+5 or 5+4 format with or without spaces and/or hyphens. This also includes provision for optional closing parentheses and/or optional space or hyphen after where the user thinks the area code ends and the local subscriber number begins. The pattern allows any format to be used with any GB number. The display format must be corrected by later logic if the wrong format for this number has been used by the user on input.

The pattern ends with an optional extension number arranged as an optional space or hyphen followed by x, ext and optional period, or #, followed by the extension number digits. The entire pattern does not bother to check for balanced parentheses as these will be removed from the number in the next step.
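As a quick check of the behaviour described above, the pattern can be compiled and run against some of the input styles listed earlier. This is only a sketch in Python; the names GB_ANY_FORMAT and looks_like_gb_number are illustrative and not part of the original article.

```python
import re

# The "any format" pattern from the text, split across lines for readability.
# Adjacent raw strings are concatenated unchanged by Python.
GB_ANY_FORMAT = re.compile(
    r"^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)"
    r"(?:\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}"
    r"|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}"
    r"|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})"
    r"|\d{5}\)?[\s-]?\d{4,5}"
    r"|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))"
    r"(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$"
)


def looks_like_gb_number(raw: str) -> bool:
    """Return True if the raw user input resembles a GB telephone number."""
    return GB_ANY_FORMAT.match(raw) is not None


accepted = [
    "020 3000 5555",
    "(020) 3000 5555",
    "+44 (0) 20 3000 5555",
    "00 (44) 2030 005 555",
    "011 44 203 000 5555",
    "020 3000 5555 x123",
    "0800 1111",
]
rejected = ["2030005555", "020 3000 555", "+1 212 555 0100"]

for raw in accepted + rejected:
    print(f"{raw!r}: {looks_like_gb_number(raw)}")
```

Note that a match here only says the input has a plausible shape. It does not confirm that the number range exists.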


Extract NSN, prefix and extension

After checking the input looks like a GB telephone number using the pattern above, the next step is to extract the NSN part so that it can be checked in greater detail for validity and then formatted in the right way for the applicable number range.

Pattern: ^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)(44)\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)([1-9]\d{1,4}\)?[\s\d-]+)(?:((?:x|ext\.?\s?|\#)\d+)?)$

Use the above pattern to extract the '44' from $1 to know that international format was used, otherwise assume national format if $1 is null.

Extract the optional extension number details from $3 and store them for later use.

Extract the NSN (including spaces, hyphens and parentheses) from $2. Remove those spaces, hyphens and parentheses and use the RegEx patterns below to first determine the number type and then how to correctly format it.
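The extraction step might look like the following Python sketch. Python exposes the captures as group(1) to group(3) rather than $1 to $3, and the function name split_gb_number is a hypothetical helper, not something defined by the original article.

```python
import re

# The extraction pattern from the text: group 1 is the optional "44",
# group 2 the NSN with its punctuation, group 3 the optional extension.
GB_EXTRACT = re.compile(
    r"^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)(44)\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)"
    r"([1-9]\d{1,4}\)?[\s\d-]+)"
    r"(?:((?:x|ext\.?\s?|\#)\d+)?)$"
)


def split_gb_number(raw: str):
    """Return (international, nsn, extension), or None if there is no match."""
    m = GB_EXTRACT.match(raw)
    if m is None:
        return None
    international = m.group(1) == "44"          # None means national format
    nsn = re.sub(r"[\s()-]", "", m.group(2))    # drop spaces, hyphens, parentheses
    extension = m.group(3)                      # e.g. "x42", or None if absent
    return international, nsn, extension


print(split_gb_number("+44 (0) 20 3000 5555 ext. 123"))
print(split_gb_number("(020) 3000 5555"))
print(split_gb_number("01750 62555 x42"))
```

The extension is returned exactly as typed, including its x, ext or # marker, so that it can be stored and redisplayed later.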


Validate NSN by initial digits and length

If you need a very simple pattern to broadly accept various GB telephone numbers as potentially valid or not, try one of these listed immediately below. They are ordered in increasing detail.

Simple patterns grouped by initial digit

  • Pattern: ^((1\d\d|800)\d{6,7}|[235789]\d{9}|500\d{6}|8(001111|45464\d))$
  • Pattern: ^((1\d\d|800)\d{6,7}|(2[03489]|3[0347]|5[56]|7[04-9]|8[047]|9[018])\d{8}|500\d{6}|8(001111|45464\d))$

Simple patterns grouped by number length

  • Pattern: ^([1235789]\d{9}|(1[2-9]\d|[58]00)\d{6}|8(001111|45464\d))$
  • Pattern: ^((1[1-9]|2[03489]|3[0347]|5[56]|7[04-9]|8[047]|9[018])\d{8}|(1[2-9]\d|[58]00)\d{6}|8(001111|45464\d))$

The above patterns are very basic. Use the detailed patterns in the "Validating GB telephone numbers" and "Formatting GB telephone numbers" sections of this article for much better control. Those patterns are grouped by number range usage for validation and by initial digits for formatting.
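To illustrate the difference in detail between the simple patterns, here is a small Python sketch using the two "grouped by number length" patterns. The names SIMPLE_BY_LENGTH, DETAILED_BY_LENGTH and looks_valid are assumptions made for this example only.

```python
import re

# Both patterns are copied from the "grouped by number length" list above.
SIMPLE_BY_LENGTH = re.compile(
    r"^([1235789]\d{9}|(1[2-9]\d|[58]00)\d{6}|8(001111|45464\d))$"
)
DETAILED_BY_LENGTH = re.compile(
    r"^((1[1-9]|2[03489]|3[0347]|5[56]|7[04-9]|8[047]|9[018])\d{8}"
    r"|(1[2-9]\d|[58]00)\d{6}|8(001111|45464\d))$"
)


def looks_valid(nsn: str, strict: bool = False) -> bool:
    """Broad plausibility check on a digits-only NSN."""
    pattern = DETAILED_BY_LENGTH if strict else SIMPLE_BY_LENGTH
    return pattern.match(nsn) is not None


for nsn in ["2030005555", "175062555", "8001111", "2130005555", "4030005555"]:
    print(nsn, looks_valid(nsn), looks_valid(nsn, strict=True))
```

The NSN 2130005555 passes the first pattern because it has ten digits and starts with 2, but fails the second because 21 is not an allocated two-digit prefix in that pattern.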


Storage and display

The number should be stored (in a database or wherever) in E.164 format, with + sign, country code and NSN (i.e. area code/NDC if applicable, and local number); e.g. +442030005555 or +44175062555.
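A minimal Python sketch of the storage step, assuming the NSN has already been validated and reduced to digits. The helper name to_e164 is hypothetical, and keeping the extension in a separate field is a design assumption rather than something the article prescribes.

```python
import re


def to_e164(nsn: str) -> str:
    """Build the E.164 storage form from a digits-only GB NSN."""
    # GB NSNs in this article are 7, 9 or 10 digits long.
    if not re.fullmatch(r"[0-9]{7}|[0-9]{9,10}", nsn):
        raise ValueError(f"not a 7, 9 or 10 digit NSN: {nsn!r}")
    return "+44" + nsn


print(to_e164("2030005555"))
print(to_e164("175062555"))
```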

For display, GB telephone numbers follow a fairly involved set of rules because different number ranges have a different total number of digits (usually 10 or 9, occasionally 7) and area codes vary in length (from 2 to 5 digits). The individual rules are easy to understand; they are listed on the number format page and detailed in the RegEx patterns in the "Formatting GB telephone numbers" section of this article. Use the E.123 format for display. You can choose to display the number in national format, e.g. 020 3000 5555 or (020) 3000 5555, or in full international format, e.g. +44 20 3000 5555. Do not use (0) in the international format.


Validating GB telephone numbers

The following RegEx patterns select valid GB telephone number ranges:

Ranges 2d, 11d, 1d1, 1ddd (and 1dddd) with 10 digits, 1ddd, 1dddd with 9 digits; including NDO (National Dialling Only)

  • Pre-match Pattern: ^\d{9,10}$
  • Pattern: ^(2(?:0[01378]|3[0189]|4[017]|8[0-46-9]|9[012])\d{7}|1(?:(?:1(?:3[0-48]|[46][0-4]|5[012789]|7[0-49]|8[01349])|21[0-7]|31[0-8]|[459]1\d|61[0-46-9]))\d{6}|1(?:2(?:0[024-9]|2[3-9]|3[3-79]|4[1-689]|[58][02-9]|6[0-4789]|7[013-9]|9\d)|3(?:0\d|[25][02-9]|3[02-579]|[468][0-46-9]|7[1235679]|9[24578])|4(?:0[03-9]|2[02-5789]|[37]\d|4[02-69]|5[0-8]|[69][0-79]|8[0-5789])|5(?:0[1235-9]|2[024-9]|3[0145689]|4[02-9]|5[03-9]|6\d|7[0-35-9]|8[0-468]|9[0-5789])|6(?:0[034689]|2[0-689]|[38][013-9]|4[1-467]|5[0-69]|6[13-9]|7[0-8]|9[0124578])|7(?:0[0246-9]|2\d|3[023678]|4[03-9]|5[0-46-9]|6[013-9]|7[0-35-9]|8[024-9]|9[02-9])|8(?:0[35-9]|2[1-5789]|3[02-578]|4[0-578]|5[124-9]|6[2-69]|7\d|8[02-9]|9[02569])|9(?:0[02-589]|2[02-689]|3[1-5789]|4[2-9]|5[0-579]|6[234789]|7[0124578]|8\d|9[2-57]))\d{6}|1(?:2(?:0(?:46[1-4]|87[2-9])|545[1-79]|76(?:2\d|3[1-8]|6[1-6])|9(?:7(?:2[0-4]|3[2-5])|8(?:2[2-8]|7[0-4789]|8[345])))|3(?:638[2-5]|647[23]|8(?:47[04-9]|64[015789]))|4(?:044[1-7]|20(?:2[23]|8\d)|6(?:0(?:30|5[2-57]|6[1-8]|7[2-8])|140)|8(?:052|87[123]))|5(?:24(?:3[2-79]|6\d)|276\d|6(?:26[06-9]|686))|6(?:06(?:4\d|7[4-79])|295[567]|35[34]\d|47(?:24|61)|59(?:5[08]|6[67]|74)|955[0-4])|7(?:26(?:6[13-9]|7[0-7])|442\d|50(?:2[0-3]|[3-68]2|76))|8(?:27[56]\d|37(?:5[2-5]|8[239])|84(?:3[2-58]))|9(?:0(?:0(?:6[1-8]|85)|52\d)|3583|4(?:66[1-8]|9(?:2[01]|81))|63(?:23|3[1-4])|9561))\d{3}|176888[234678]\d{2}|16977[23]\d{3})$
  • Number Type: Fixed Line
  • Example Number: 1212345678

Note: Pattern matches geographic NSN=10 numbers as follows:

  • area code and subscriber number first digit for 2+8,
  • area code and subscriber number first digit for 3+7,
  • area code only for 4+6 (including mixed areas with embedded 5+5).

Note: Pattern matches geographic NSN=9 numbers as follows:

  • area code and subscriber number first two digits for most 4+5 numbers,
  • area code and subscriber number first three digits for 4+5 special case (01768) 88Ddd,
  • area code and subscriber number first digit for 5+4 special case (016977) Dddd.


Ranges 2d, 11d, 1d1, 1ddd (and 1dddd) with 10 digits, 1ddd, 1dddd with 9 digits; excluding NDO

  • Pre-match Pattern: ^\d{9,10}$
  • Pattern: ^(2\d[2-9]\d{7}|1(?:1\d|\d1)[2-9]\d{6}|1(?:[248][02-9]\d[2-9]\d{4,5}|(?:3(?:[02-79]\d|8[0-69])|5(?:[04-9]\d|2[0-35-9]|3[0-8])|6(?:[02-8]\d|9[0-689])|7(?:[02-5789]\d|6[0-79])|9(?:[0235-9]\d|4[0-5789]))[2-9]\d{4,5}|(?:387(?:3[2-9]|[24-9]\d)|5(?:24(?:2[2-9]|[3-9]\d)|39(?:[456][2-9]|[23789]\d))|697(?:[347][2-9]|[25689]\d)|768(?:[347][2-9]|[25679]\d)|946(?:7[2-9]|[2-689]\d))\d{3,4}))$
  • Number Type: Fixed Line, Area Code Optional
  • Example Number: 1332456789

Note: These are a subset of the fixed-line rules, with digits 2 to 9 as the leading digit of the subscriber number. There are patterns for 2+8, 3+7 and a combined pattern for all 4+6/4+5 and 5+5/5+4 numbers. Note that numbers matching this pattern are not necessarily valid numbers. Enclose the area code part in parentheses when formatting these numbers.


Ranges 7ddd (including 7624) (not 70, 76) (excluding protected ranges) with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^7(?:[1-4]\d\d|5(?:0[0-8]|[13-9]\d|2[0-35-9])|624|7(?:0[1-9]|[1-7]\d|8[02-9]|9[0-689])|8(?:[014-9]\d|[23][0-8])|9(?:[04-9]\d|1[02-9]|2[0-35-9]|3[0-689]))\d{6}$
  • Number Type: Mobile
  • Example Number: 7400123456


Ranges 76 (excluding 7624) with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^76(?:0[012]|2[356]|4[0134]|5[49]|6[0-369]|77|81|9[39])\d{6}$
  • Number Type: Pager
  • Example Number: 7640123456


Ranges 800 1111 with 7 digits, 800 with 9 or 10 digits, 808 with 10 digits, 500 with 9 digits

  • Pre-match Pattern: ^\d{7}(?:\d{2,3})?$
  • Pattern: ^(?:80(?:0(?:1111|\d{6,7})|8\d{7})|500\d{6})$
  • Number Type: Toll Free (Freefone)
  • Example Number: 8001234567


Ranges 871, 872, 873, 90d, 91d, 982, 983, 984, 989 with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^(?:87[123]|9(?:[01]\d|8[2349]))\d{7}$
  • Number Type: Premium Rate
  • Example Number: 9012345678


Ranges 845 46 4d with 7 digits, 842, 843, 844, 845, 870 with 10 digits

  • Pre-match Pattern: ^\d{7}(?:\d{3})?$
  • Pattern: ^8(?:4(?:5464\d|[2-5]\d{7})|70\d{7})$
  • Number Type: Shared Cost
  • Example Number: 8431234567


Ranges 70 with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^70\d{8}$
  • Number Type: Personal Number
  • Example Number: 7012345678


Ranges 56 with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^56\d{8}$
  • Number Type: VoIP
  • Example Number: 5612345678


Ranges 30d, 33d, 34d, 37d, 55 with 10 digits

  • Pre-match Pattern: ^\d{10}$
  • Pattern: ^(?:3[0347]|55)\d{8}$
  • Number Type: UAN
  • Example Number: 5512345678


Notes

  • Mark all other number ranges as non-valid.
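The pre-match and pattern pairs above can be applied in a simple loop to classify an NSN. The Python sketch below is illustrative only: the names NUMBER_TYPES and classify_nsn are assumptions, and the two long fixed-line patterns are deliberately left out for brevity, so geographic numbers return None here.

```python
import re

# (number type, pre-match pattern, detailed pattern), taken from the lists above.
NUMBER_TYPES = [
    ("Mobile", r"^\d{10}$",
     r"^7(?:[1-4]\d\d|5(?:0[0-8]|[13-9]\d|2[0-35-9])|624"
     r"|7(?:0[1-9]|[1-7]\d|8[02-9]|9[0-689])|8(?:[014-9]\d|[23][0-8])"
     r"|9(?:[04-9]\d|1[02-9]|2[0-35-9]|3[0-689]))\d{6}$"),
    ("Pager", r"^\d{10}$",
     r"^76(?:0[012]|2[356]|4[0134]|5[49]|6[0-369]|77|81|9[39])\d{6}$"),
    # The alternation is wrapped in a group so that both anchors
    # apply to the 80x and the 500 branches alike.
    ("Toll Free", r"^\d{7}(?:\d{2,3})?$",
     r"^(?:80(?:0(?:1111|\d{6,7})|8\d{7})|500\d{6})$"),
    ("Premium Rate", r"^\d{10}$", r"^(?:87[123]|9(?:[01]\d|8[2349]))\d{7}$"),
    ("Shared Cost", r"^\d{7}(?:\d{3})?$",
     r"^8(?:4(?:5464\d|[2-5]\d{7})|70\d{7})$"),
    ("Personal Number", r"^\d{10}$", r"^70\d{8}$"),
    ("VoIP", r"^\d{10}$", r"^56\d{8}$"),
    ("UAN", r"^\d{10}$", r"^(?:3[0347]|55)\d{8}$"),
]
COMPILED = [(name, re.compile(pre), re.compile(pat))
            for name, pre, pat in NUMBER_TYPES]


def classify_nsn(nsn: str):
    """Return the number type for a digits-only NSN, or None if no range matches."""
    for name, pre_match, pattern in COMPILED:
        # The cheap length check runs first, then the detailed range check.
        if pre_match.match(nsn) and pattern.match(nsn):
            return name
    return None  # treat as non-valid (fixed-line patterns are omitted here)


for nsn in ["7400123456", "7640123456", "7624123456", "8001111", "4012345678"]:
    print(nsn, classify_nsn(nsn))
```

Running the pre-match first rejects numbers of the wrong length before the more expensive range pattern is evaluated.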


Formatting GB telephone numbers

Valid GB telephone number ranges can be formatted as follows:

Ranges 2d, 55, 56, 70, 76 (excluding 7624) with 10 digits

  • Leading Digits: ^(?:2|5[56]|7(?:0|6(?:[013-9]|2[0-35-9])))
  • Pattern: ^(\d{2})(\d{4})(\d{4})$
  • Format: $1 $2 $3


Ranges 11d, 1d1, 3dd, 9dd with 10 digits

  • Leading Digits: ^(?:1(?:1|\d1)|3[0347]|9[018])
  • Pattern: ^(\d{3})(\d{3})(\d{4})$
  • Format: $1 $2 $3


Ranges 1dddd with 9 or 10 digits

  • Leading Digits: ^(?:1(?:3873|5(?:242|39[456])|697[347]|768[347]|9467))
  • Pattern: ^(\d{5})(\d{4,5})$
  • Format: $1 $2

Note: These area codes are very rare in GB, and are only available in the following places: 13873 (Langholm), 15242 (Hornby-with-Farleton), 15394 (Hawkshead), 15395 (Grange-over-Sands), 15396 (Sedbergh), 16973 (Wigton), 16974 (Raughton Head), 16977 (Brampton), 17683 (Appleby-in-Westmorland), 17684 (Pooley Bridge), 17687 (Keswick), 19467 (Gosforth).


Ranges 1ddd with 9 or 10 digits

  • Leading Digits: ^1
  • Pattern: ^(1\d{3})(\d{5,6})$
  • Format: $1 $2


Ranges 7ddd (including 7624) (not 70, 76) with 10 digits

  • Leading Digits: ^7(?:[1-5789]|624)
  • Pattern: ^(7\d{3})(\d{6})$
  • Format: $1 $2


Ranges 800 1111 with 7 digits : UK ChildLine

  • Leading Digits: ^8001111
  • Pattern: ^(800)(\d{4})$
  • Format: $1 $2


Ranges 845 46 47 with 7 digits : UK NHS Direct

  • Leading Digits: ^845464
  • Pattern: ^(845)(46)(4\d)$
  • Format: $1 $2 $3

Note: This format actually applies to 0845 46 4x, not just to 0845 46 47, but the other nine numbers are unallocated.


Ranges 84d, 87d with 10 digits

  • Leading Digits: ^8(?:4[2-5]|7[0-3])
  • Pattern: ^(8\d{2})(\d{3})(\d{4})$
  • Format: $1 $2 $3


Ranges 80d (including 800) with 10 digits

  • Leading Digits: ^80[08]
  • Pattern: ^(80\d)(\d{3})(\d{4})$
  • Format: $1 $2 $3


Ranges 500, 800 with 9 digits

  • Leading Digits: ^[58]00
  • Pattern: ^([58]00)(\d{6})$
  • Format: $1 $2


Notes

  • Several of the above formatting rules can be combined for more efficient operation.
  • Prefix the above formatted numbers with the +44 country code and a space or with the 0 trunk code.
  • For national format numbers in 01 and 02 area codes and where the subscriber number does not begin with 0 or 1, the area code can be enclosed in parentheses. This signifies the area code is optional when calling from another number within the same area code.
  • Present all other numbers as unformatted.
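The formatting rules above can be sketched in Python as an ordered list that is tried from top to bottom. A rule applies only when both its leading digits and its pattern match. The names FORMAT_RULES and format_gb are hypothetical, and joining the captured groups with spaces stands in for the $1 $2 $3 format strings.

```python
import re

# (leading digits, pattern) in the same order as the rules listed above.
FORMAT_RULES = [
    (r"^(?:2|5[56]|7(?:0|6(?:[013-9]|2[0-35-9])))", r"^(\d{2})(\d{4})(\d{4})$"),
    (r"^(?:1(?:1|\d1)|3[0347]|9[018])", r"^(\d{3})(\d{3})(\d{4})$"),
    (r"^(?:1(?:3873|5(?:242|39[456])|697[347]|768[347]|9467))",
     r"^(\d{5})(\d{4,5})$"),
    (r"^1", r"^(1\d{3})(\d{5,6})$"),
    (r"^7(?:[1-5789]|624)", r"^(7\d{3})(\d{6})$"),
    (r"^8001111", r"^(800)(\d{4})$"),
    (r"^845464", r"^(845)(46)(4\d)$"),
    (r"^8(?:4[2-5]|7[0-3])", r"^(8\d{2})(\d{3})(\d{4})$"),
    (r"^80[08]", r"^(80\d)(\d{3})(\d{4})$"),
    (r"^[58]00", r"^([58]00)(\d{6})$"),
]
COMPILED_RULES = [(re.compile(lead), re.compile(pat)) for lead, pat in FORMAT_RULES]


def format_gb(nsn: str, international: bool = False) -> str:
    """Format a digits-only NSN for display in national or international form."""
    prefix = "+44 " if international else "0"
    for leading, pattern in COMPILED_RULES:
        m = pattern.match(nsn)
        if leading.match(nsn) and m:
            # Every format above is its captured groups separated by spaces.
            return prefix + " ".join(m.groups())
    return prefix + nsn  # no rule applies: present the number unformatted


for nsn in ["2030005555", "1212345678", "169772345", "175062555", "7400123456"]:
    print(format_gb(nsn), "|", format_gb(nsn, international=True))
```

Order matters: the rare five-digit area codes must be tried before the general 1ddd rule, otherwise 169772345 would be split as 1697 72345. This sketch does not add the optional parentheses around 01 and 02 area codes.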

