GRC Database Information

Expertise in Global Data

 

 




Place name/postal code combination data

An ideal resource for standardising and processing your data tables to improve quality, de-duplication rates and postal validation rates.

 


 

This data table, built by GRC Database Information, contains over 30 million place name/postal code combination records in the quarter 2-2010 release 1. The data has been collected by analysing over 90 million addresses, mainly real-world data records, for settlement name/postal code combinations.

 


 

 Features of this file:

 

We believe we cover over 95% of all postal settlement/postal code combinations in these 110 countries and territories:

 

Albania

Holy See 

Oman

Algeria

Hungary

Pakistan

American Samoa (for 5-digit ZIP codes)

Iceland

Palau (for 5-digit ZIP codes)

Andorra

India

Papua New Guinea

Armenia

Indonesia

Paraguay

Australia

Italy

Philippines

Austria

Japan

Poland

Azerbaijan

Jordan

Portugal

Bangladesh 

Kenya

Puerto Rico

Belarus

Kosovo

 Réunion

Belgium 

Kuwait

Romania

Bhutan

Kyrgyzstan

 St Barthélemy 

Bosnia-Hercegovina

Latvia

 St Martin 

Brazil

Liechtenstein

St Pierre & Miquelon

Brunei Darussalam  

Luxembourg

St Vincent & The Grenadines

Bulgaria

Macedonia

San Marino

Canada

Madagascar

Saudi Arabia

Cape Verde Islands

Malaysia  

Senegal

China 

Maldives

Serbia (5-digit codes)

Cocos (Keeling) Islands

Marshall Islands (for 5-digit ZIP codes)

Slovakia

Costa Rica 

Martinique

Slovenia

Croatia

 Mayotte

South Africa

Cyprus

Mexico

Spain

Czech Republic

Micronesia (for 5-digit ZIP codes)

Sri Lanka

Denmark

Moldova

Swaziland

Dominican Republic

Monaco

Sweden

Estonia

Montenegro

Switzerland

Faeroe Islands

Morocco

Taiwan

Finland

Mozambique

Thailand

France

The Netherlands

Tunisia

French Guiana

Nepal

Turkey

French Polynesia

New Caledonia

Ukraine

Germany

New Zealand

United States  (for 5-digit ZIP codes)

Greece

Niger

United States' Virgin Islands (for 5-digit ZIP codes)

Greenland

Northern Mariana Islands

Uruguay

Guadeloupe

Norway

 

Venezuela

Guatemala

       
 

Guam

       

 

 

 

 

Records contain the settlement name and postal code as they appeared in the incoming record.

Many of the records contain settlement names which have been typed incorrectly, are in the wrong language and so on. For 97.4% of the settlement names, this table contains correct upper- and mixed-case versions. This makes the table an ideal source for standardising and correcting settlement name data, allowing improvements in data quality, in de-duplication rates and in postal validation rates.

For 97.4% of all records a corrected postal code has been added to the file, allowing this file to be used as a replacement for, or complement to, postal data files.

We continually add to and update the information in this file, and release updates at least quarterly.

Because this data does not come from postal authority files, there are no royalties or hidden costs. 

This data differs from postal authority data: where postal authority files contain one postal code/settlement combination, this file can contain many, because people write data differently. These different versions all point to a single correct version. Thus, whereas a postal file may contain the combination 1010 WIEN, this file will also contain 1010 VIENNA, 1010 WENEN, 1010 wien, A-1010 Vienna and so on, all pointing to the correct local settlement name for this postal code, WIEN, and the corrected postal code 1010.

Because this file contains the data as people enter it, and not as the postal files contain it, complex matching algorithms are not required to find mis-spelt and incorrectly written town names/postal codes.
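
As a minimal sketch of that idea, correcting an incoming record can be treated as an exact-match lookup on the pair as it was entered. The in-memory dictionary and its layout are assumptions made for illustration (the real field names are given in the full documentation); the sample rows reuse the Vienna example above.

```python
# Hypothetical in-memory extract of the table. Each key is a
# (postal code as entered, settlement name as entered) pair; each value is
# the (corrected postal code, corrected upper-case settlement name) pair.
VARIANTS = {
    ("1010", "WIEN"): ("1010", "WIEN"),
    ("1010", "VIENNA"): ("1010", "WIEN"),
    ("1010", "WENEN"): ("1010", "WIEN"),
    ("1010", "wien"): ("1010", "WIEN"),
    ("A-1010", "Vienna"): ("1010", "WIEN"),
}


def correct(postal_code, settlement):
    """Return the corrected (postal code, settlement) pair, or None if unknown."""
    return VARIANTS.get((postal_code.strip(), settlement.strip()))


print(correct("A-1010", "Vienna"))  # ('1010', 'WIEN')
```

A combination that is not in the table simply returns None, so it can be set aside for manual review.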

Data is in Windows ANSI 1252/ISO 8859-1/MS-DOS 850 code page format. This is because, though our database systems can store Unicode data, we cannot enter it in normal use. We therefore provide fields with the same place name strings with numeric Unicode place holders so that you can translate the data to Unicode should you wish to do so. An additional Microsoft Access file is provided containing those place names which contain characters which cannot be reproduced in Windows code page 1252. Please see the documentation for fuller details.
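
Translating the place holders to Unicode might look like the sketch below. The place holder syntax is not described on this page, so the HTML-style decimal form used here (for example &#337;) is purely an assumption; please check the documentation for the actual syntax before adapting it.

```python
import re

# ASSUMPTION: place holders are HTML-style decimal references such as "&#337;".
# The real place holder syntax is defined in the file documentation.
PLACEHOLDER = re.compile(r"&#(\d+);")


def to_unicode(name):
    """Replace each numeric place holder with the Unicode character it names."""
    return PLACEHOLDER.sub(lambda match: chr(int(match.group(1))), name)


# 337 is the decimal code point of U+0151, which code page 1252 cannot represent.
print(to_unicode("Gy&#337;r"))
```

Names without place holders pass through unchanged.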

A certain level of geocoding is provided for these countries:

 

American Samoa, Andorra, Argentina, Australia, Austria, Bangladesh, Belgium, Bulgaria, Canada, Czech Republic, Denmark, Dominican Republic, Faeroe Islands, Finland, France, French Guiana, Germany, Greenland, Guadeloupe, Guam, Guatemala, Guernsey, Hungary, Iceland, India, Isle of Man, Italy, Jersey, Liechtenstein, Luxembourg, Macedonia, Marshall Islands, Martinique, Mayotte, Mexico, Moldova, Monaco, The Netherlands, New Zealand, Northern Mariana Islands, Norway, Pakistan, Poland, Portugal, Puerto Rico, Russia, Réunion, St Pierre & Miquelon, Slovakia, Slovenia, South Africa, Spain, Sri Lanka, Sweden, Switzerland, Thailand, Turkey, United Kingdom and United States of America.

 

Postal codes referring to non-geographical addresses (large users, post office boxes etc.) will not be geocoded. 

 

To locate a postal code centroid we have (for many countries) used geocoding from several populated places within a postal code area and found a point that lies equidistant between them. This will not always identify the centre of a postal code area, but is more accurate than accepting the geocoding of a single settlement within a postal code area as being the same as for the postal code area itself. No guarantees can be given as to the accuracy of the geocoding provided. The levels of geocoding per country can be seen in the coverage documentation here. Much geocoded data used to assign codes to this file has been provided by Geonames under this licence.
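
The exact calculation used is not documented on this page. One simple way to approximate a central point between several populated places, shown here only as an illustration and not as the method used to build the file, is to average their positions on the sphere. The function name and the sample coordinates are made up.

```python
import math


def approximate_centroid(points):
    """Average (latitude, longitude) pairs, given in degrees, via 3-D unit vectors."""
    if not points:
        raise ValueError("at least one point is required")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    count = len(points)
    x, y, z = x / count, y / count, z / count
    # Convert the mean vector back to latitude and longitude.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Three invented places inside one postal code area.
print(approximate_centroid([(48.20, 16.36), (48.22, 16.38), (48.21, 16.40)]))
```

Averaging unit vectors rather than raw degrees avoids errors near the 180° meridian, at the cost of slightly more arithmetic.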


 

What this file is not:

 

This data comes from (or is derived from) real-world sources, and can contain real-world data errors. Though we mark data which we know is not correct (e.g. settlement fields containing data which is not the settlement, clearly mis-matched settlements/postal codes and so on), there can be no guarantee that a combination is correct on the ground.

Coverage may not be 100%.  Data will come mainly from areas with the greatest population and greatest economic activity (i.e. the most businesses).  Some settlements may never appear in these files.

The table has been designed to aid identification, correction and standardisation of place names. It is not a postal database. For countries with greater coverage, it is possible to extract the corrected postal codes and corrected place names to create a postal file covering most or all of the settlements in a country. No guarantee is, or can be, given that the resultant file would be complete or comply with postal regulations. An explanation of how the file can be used in different ways is available here.
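
Such an extraction could be sketched as follows, assuming each record has been loaded as a dictionary. The key names corrected_code and corrected_name are hypothetical stand-ins for the corrected postal code and corrected settlement name fields; the sample rows reuse the Vienna example.

```python
def extract_postal_pairs(records):
    """Return the sorted, de-duplicated (corrected code, corrected name) pairs."""
    pairs = set()
    for record in records:
        code = record["corrected_code"]
        name = record["corrected_name"]
        if code and name:  # skip records for which no correction is available
            pairs.add((code, name))
    return sorted(pairs)


sample = [
    {"corrected_code": "1010", "corrected_name": "WIEN"},  # from "1010 VIENNA"
    {"corrected_code": "1010", "corrected_name": "WIEN"},  # from "A-1010 Vienna"
    {"corrected_code": "", "corrected_name": ""},          # uncorrected record
]
print(extract_postal_pairs(sample))  # [('1010', 'WIEN')]
```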

 


 

Each record contains this data:

 

URN

Country name

Internal GRC Database Information country code

ISO2 country code

ISO3 country code

ISO Numeric country code

Settlement name as it appeared in the original real-world data file

Postal code as it appeared in the original real-world data file

A corrected postal code, correct in terms of length, format, allowed characters etc.

A corrected upper-case version of the settlement name

A corrected mixed-case version of the settlement name

A string containing the upper-case version of the settlement name with numeric Unicode place holders to replace missing diacritical marks if the name cannot be correctly represented in the Windows 1252 code page

A string containing the mixed-case version of the settlement name with numeric Unicode place holders to replace missing diacritical marks if the name cannot be correctly represented in the Windows 1252 code page

A count showing how often this combination of settlement and postal code has been found in real-world data sets. This is an indication of data accuracy, of settlement size and of economic activity in that area (i.e. number of companies)

A flag indicating whether the place name data contains data other than a place name, such as a state or province code. Please refer to the full documentation for further details.

Province, state and/or region information (for these countries: Algeria, Andorra, Argentina, Armenia, Australia, Austria, Belarus, Belgium, Bhutan, Brazil (partial: not in shared postal code areas), Canada, China, Costa Rica, Croatia, Denmark (partial: not in shared postal code areas), Dominican Republic, Finland (partial: not for shared postal code areas), France, Germany (excluding postal areas which overlap province borders), Guatemala, India, Indonesia, Italy, Jamaica, Madagascar, Malaysia, Marshall Islands, Mexico, Micronesia, Mozambique, Netherlands, Norway (partial: not in shared postal code areas), Oman, Palau, Papua New Guinea, Poland, Portugal, Puerto Rico, Romania, Russia, San Marino, Spain, Sweden (partial: not for shared postal code areas), Switzerland (excluding postal areas which overlap canton borders), Thailand, Tunisia, Turkey, Ukraine, United Kingdom (excluding postal areas which overlap county borders), United States of America, Uruguay (not shared postal areas). For information about what region information is stored, please refer to the full documentation.)

A certain level of geocoding, for the countries listed under the features of this file above. Level of coverage is shown in this spreadsheet.

 


 

    Example:

 

View a sample of records from a previous version of the file, pertaining to Gent (Ghent) in Belgium, here.

 


 

    Coverage:

 

View the coverage of this version here.  

Full file documentation is available here.

An explanation of how the file can be used in different ways is available here.

 


 

    Formats:

 

Data is held in Microsoft Visual FoxPro format, but can also be provided in these formats: FoxPro 2.x (dBase III+), tab delimited text, pipe delimited text, fixed column width text, and Excel (only for small files of fewer than 15 000 records). Other delimiters are also possible, but as the data contains commas and quotation marks, we would advise against comma delimited format.
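
As a rough sketch, a tab delimited delivery could be read as shown below. It assumes the export is encoded in Windows code page 1252 and has no header row, so the field names are supplied by the caller; both points, and the shortened field list, are assumptions to be checked against the documentation. Quoting is switched off so that quotation marks and commas inside place names are kept as literal data.

```python
import csv
import io


def read_records(raw_bytes, fieldnames):
    """Decode a tab delimited export and yield one dictionary per record."""
    text = raw_bytes.decode("cp1252")
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        fieldnames=fieldnames,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # quotation marks in names are literal data
    )
    yield from reader


# Invented two-line sample; real files contain many more fields per record.
sample = b'1010\tWIEN\r\nX-1\tCaf\xe9 "A", B\r\n'
for row in read_records(sample, ["postal_code", "settlement"]):
    print(row["postal_code"], row["settlement"])
```

For a pipe delimited delivery the same sketch applies with delimiter="|".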

 

The file containing place names in Unicode is provided only in Microsoft Access format.

 

Small data sets can be e-mailed; larger sets can be downloaded.

 


 

 Prices:

 

The whole file is available at the price of only EUR 3950. The price for a subset of the file is EUR 395 per country for countries containing 100000 records or more and EUR 195 per country for countries containing fewer than 100000 records. You can see how many records are provided by country on the spreadsheet available here. Quarterly updates for the full file are available for EUR 395 per quarter if you update every quarter. If you do not update every quarter, updates are available after the first purchase for EUR 1495. Prices for subset updates are variable. If you have any questions regarding the file, please contact us.
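
The subset pricing above can be worked through as a small calculation. This is only an illustrative sketch: the record counts are invented, the function name is hypothetical, and it covers the first purchase only, not updates.

```python
FULL_FILE_EUR = 3950
LARGE_COUNTRY_EUR = 395  # countries containing 100000 records or more
SMALL_COUNTRY_EUR = 195  # countries containing fewer than 100000 records


def subset_price(record_counts):
    """Total first-purchase price in EUR for a list of per-country record counts."""
    return sum(
        LARGE_COUNTRY_EUR if count >= 100000 else SMALL_COUNTRY_EUR
        for count in record_counts
    )


# Two large countries and one small country (invented record counts).
print(subset_price([250000, 120000, 40000]))  # 985
```

On these figures, a subset of ten large countries costs the same as the whole file (10 x EUR 395 = EUR 3950).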

 

This data is offered on a royalty-free basis for use in any way you wish, with the important proviso that the data may not be copied or distributed in any way whatsoever when it can, in normal use, be accessed by other users. In other words, if you would like to use this data in your software package, that is allowed provided users cannot get at, or export, the data themselves. You will be asked to agree to our terms and conditions when purchasing. Our terms, conditions and licensing structure can be viewed here.

 


 

 To order:

 

To purchase the full file, or a subset, follow this link to order by credit card.

 

If you are ordering a subset and need to know how many records are contained in each country, refer to the final column in the coverage documentation here.

 

Delivery

 

As part of our efforts to become a carbon-neutral company, we no longer send data on CD-ROM unless you expressly request this. Small subsets are e-mailed. Larger data sets are posted to an online file share server for downloading. If you have an FTP server that you'd like the data uploaded to, please let us know after ordering.

 

If you have any questions, please don't hesitate to contact us.

 


 

Customers

 

Many of our customers prefer to remain nameless for competitive reasons, and we respect this.  Over 40 companies have bought this data.  Amongst them are:

 

benelog AG, Köln, Germany

Case Runner Pty Ltd, Manly, NSW, Australia

Geodis, Clichy, France

Tenaris Dalmine, Dalmine, Bergamo, Italy

 


 

                 

                 

This data table is used by GRC Tools for place name parsing and standardisation. Read about it here.

 




GRC Database Information

AMSTERDAM

The Netherlands