GRC Database Information

Expertise in Global Data

 

 




Address elements data

 

An ideal resource for standardising and processing your data tables to improve quality, de-duplication rates and postal validation rates.

This data resource contains strings found in global address data: street types (thoroughfare types), post office box strings, company types and so on.  It is unique in its coverage and scope.

 


 

This table, built by GRC Database Information, contains in its quarter 1, 2010 release over 436000 records of strings found in address data throughout the World.  These strings include:

 

Thoroughfare types (Street, Road, Avenue etc.)

Building types (House, Cottage, Bungalow etc.)

University strings (Univ., University of etc.)

Sub-building indicators (Room, Floor, Gate etc.)

Zonal indicators (Industrial Estate, Zone etc.)

Settlement indicators (City, Village etc.)

Postbox indicators (P.O. Box, Postfach, Boîte Postale etc.)

Indicators of man-made objects (Park, Garden, Dock etc.)

Indicators of geographic locations (Hill, River, Mountain etc.)

Biological indicators (Oak, Bird etc.)

Company types (Ltd, PLC, Inc. etc.)

Department indicators (Division, Department etc.)

Directionals (North, South, East etc.)

Personal name attention prefix (Attn., c/o etc.)

Forms of address (Mr, Saint, Professor etc.)

Personal names (Alfons, Winston, Charles etc.)

Colours (Red, Blue etc.)

Numbers (Three, Third etc.)

Articles (The, a, an etc.)

Prepositions (de, van, in etc.)

Adjectives (Old, New, Large etc.)

Obscenities and non-valid address entries (n/a, see e-mail etc.)

Other (address components not able to be categorised elsewhere)

 

Each record presents the alternative forms in which the string can be written (such as common misspellings, diacritical differences and casing differences), the correct forms in upper and mixed case, and the standardised forms in upper and mixed case.
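As a rough illustration of how such records might be used for standardisation, the Python sketch below maps every alternative form of a string to its standardised mixed-case form. The field names (alternatives, standard_mixed) and the two sample records are hypothetical and chosen only for this example; the actual field layout is described in the table documentation.

```python
# Hypothetical records: field names and values are illustrative only,
# they are NOT the actual GRC table layout.
RECORDS = [
    {"alternatives": ["Street", "Str", "Str.", "Stret"], "standard_mixed": "St"},
    {"alternatives": ["Avenue", "Ave", "Ave.", "Avenu"], "standard_mixed": "Ave"},
]


def build_lookup(records):
    """Map each alternative form (compared case-insensitively) to its standard form."""
    lookup = {}
    for record in records:
        for form in record["alternatives"]:
            lookup[form.casefold()] = record["standard_mixed"]
    return lookup


def standardise(address, lookup):
    """Replace each whitespace-separated token that has a known standard form."""
    tokens = address.split()
    return " ".join(lookup.get(token.casefold(), token) for token in tokens)


LOOKUP = build_lookup(RECORDS)
print(standardise("12 Oak Stret", LOOKUP))
```

A token-by-token lookup is the simplest possible approach; multi-word strings such as "Industrial Estate" would need phrase matching, which this sketch does not attempt.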

 

This table is ideal as a resource for standardising and processing your data tables to improve quality, de-duplication rates and postal validation rates.

 

This table has been built through intensive analysis of both real-world data sets and data sets which cover addresses for a whole country, or large parts thereof, such as postal tables.  The table is continually updated and improved.  For those countries whose postal tables have been analysed, a full range of statistics is included within the table records.  Full details of the table contents and layout are provided in this documentation.  Furthermore, basic counts for this version are provided in this spreadsheet.  Both documents are in Adobe Acrobat format and can be read with the free Adobe Acrobat reader.  If you do not already have it, you can download it from here.

 

The data is provided in Windows code page 1252, complying with ISO 8859-1 and MS-DOS code page 850.  Diacritical marks (accents) for most Western European languages are reproduced in the table.  For Eastern European and non-European languages, however, diacritical marks have been replaced with equivalent characters.  Though our database systems can store Unicode data, we cannot enter it in normal use.  We therefore provide fields containing the same address strings with numeric Unicode placeholders, so that you can translate the data to Unicode should you wish to do so.
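The conversion to Unicode could look something like the Python sketch below. The placeholder syntax is an assumption made purely for illustration: the code assumes decimal code points written as &#NNN; (for example &#269; for č), which may not match the format actually used in the table, so the pattern should be adjusted to whatever the documentation specifies.

```python
import re

# ASSUMPTION: placeholders are decimal code points written as "&#NNN;".
# The real placeholder format may differ; adapt the pattern accordingly.
PLACEHOLDER = re.compile(r"&#(\d+);")


def read_cp1252(raw_bytes):
    """Decode bytes delivered in Windows code page 1252 into a Python string."""
    return raw_bytes.decode("cp1252")


def placeholders_to_unicode(text):
    """Replace each numeric placeholder with the Unicode character it stands for."""
    return PLACEHOLDER.sub(lambda match: chr(int(match.group(1))), text)


# 0xDF is the cp1252 byte for the letter sharp s; 269 is the code point of c with caron.
sample = b"Stra\xdfe / Vodi&#269;kova"
print(placeholders_to_unicode(read_cp1252(sample)))
```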

 


Coverage

 

The table has data for most of the World's countries, but the information is fuller for some countries than for others.  As a guide, countries have been categorised according to coverage, as follows:

 

A countries: the table contains over 95% of the relevant strings for the country, along with statistical analysis of occurrences in postal data files, OR over 10000 real-world records have been analysed to create the tables for the country:

 

    Argentina, Australia, Austria, Belgium, Brazil, Bulgaria, Canada, China, Colombia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Gabon, Germany, Greece, Haiti, Hong Kong, Hungary, Iceland, India, Indonesia, Iran, Ireland, Israel, Italy, Japan, Luxembourg, Malaysia, Maldives, Mexico, Monaco, Morocco, Netherlands, New Zealand, Norway, Paraguay, Philippines, Poland, Portugal, Puerto Rico, Romania, Russia, Singapore, Slovakia, South Africa, South Korea, Spain, Sweden, Switzerland, Syria, Taiwan, Thailand, Turkey, United Kingdom, United States of America, Vietnam.

 

B countries: data gathered from real-world databases, including localised data and data extrapolated from information known about former colonial powers (where relevant); and between 1000 and 10000 real-world records for this country have been analysed to create these tables:

    Algeria, Bahamas, Bahrain, Bangladesh, Barbados, Benin, Bermuda, British Virgin Islands, Burkina Faso, Cayman Islands, Chile, Congo (Kinshasa), Costa Rica, Croatia, Ecuador, Egypt, Ghana, Ivory Coast, Jamaica, Jordan, Kenya, Kuwait, Latvia, Lebanon, Liechtenstein, Lithuania, Malta, Mauritius, Montenegro, Netherlands Antilles, Nigeria, Oman, Pakistan, Panama, Paraguay, Peru, Réunion, St Kitts & Nevis, Saudi Arabia, Serbia, Slovakia, Slovenia, Sri Lanka, Trinidad & Tobago, Tunisia, Ukraine, United Arab Emirates, Uruguay, Venezuela, Zimbabwe.

     

C countries: data gathered from real-world databases, including localised data and data extrapolated from information known about former colonial powers (where relevant); and fewer than 1000 real-world records for this country have been analysed to create these tables:

    All other countries.


Filters

 

You may choose to purchase data based on any filter which can be carried out using the available fields.  For example, you might want:

 

All records for Switzerland.

United States street type records in English and Spanish.

The 100 most commonly occurring address strings for a given country.

Upper case strings only for a given country.

Strings which do not contain punctuation.

And so on.
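To make the idea of a filter concrete, the Python sketch below expresses three of the examples above as simple functions. The field names (country, string, occurrences) and the sample values are hypothetical; the real fields available for filtering are listed in the table documentation.

```python
import string

# Hypothetical records: field names and occurrence counts are invented
# for illustration and are not taken from the actual table.
RECORDS = [
    {"country": "CH", "string": "Strasse", "occurrences": 9000},
    {"country": "CH", "string": "Rue", "occurrences": 4000},
    {"country": "CH", "string": "P.O. Box", "occurrences": 1200},
    {"country": "US", "string": "Street", "occurrences": 50000},
]


def for_country(records, country):
    """All records for one country."""
    return [r for r in records if r["country"] == country]


def most_common(records, n):
    """The n records with the highest occurrence counts."""
    return sorted(records, key=lambda r: r["occurrences"], reverse=True)[:n]


def without_punctuation(records):
    """Records whose string contains no ASCII punctuation characters."""
    return [
        r for r in records
        if not any(ch in string.punctuation for ch in r["string"])
    ]


swiss = for_country(RECORDS, "CH")
print([r["string"] for r in most_common(swiss, 2)])
print([r["string"] for r in without_punctuation(swiss)])
```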

 

Please contact us with requests for counts or other information.

 


Sample

If you would like to see an example of the data, you can download a sample here, taken from a previous version, containing data for the German-language street type straße.  This is a Microsoft Excel spreadsheet file.  If you are unable to read this format, please send us a message and we can provide the same data in another format.  Please note: the lower-case German letter ß is written SS in upper case, and is not used in Switzerland.
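The casing behaviour of ß matters when matching strings across cases. As a small illustration (not part of the sample file), Python's built-in str.casefold() and str.upper() both expand ß to a double s, so the German, Swiss and upper-case spellings can be reduced to one comparison key:

```python
def match_key(text):
    """Case-insensitive comparison key; str.casefold() maps the sharp s to 'ss'."""
    return text.casefold()


forms = ["Straße", "STRASSE", "Strasse", "strasse"]

# All four spellings collapse to the same key.
print({match_key(form) for form in forms})

# Upper-casing the German spelling yields the SS form.
print("straße".upper())
```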

 


Formats

 

Data is held in Microsoft Visual FoxPro format, but can also be provided in these formats: FoxPro 2.x (dBase III+), pipe-delimited text, tab-delimited text, fixed-column-width text, and Excel (small files of fewer than 15 000 records only).  Other delimiters are also possible but, as the data contains commas and quotation marks, we would advise against comma-delimited format.  Small data sets can be e-mailed; larger sets are provided on CD-ROM.
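For the pipe-delimited format, a reader along the following lines may be a reasonable starting point. It is a sketch only: the three-column sample is invented, and quoting is switched off on the assumption that quotation marks in the data are literal characters rather than field delimiters.

```python
import csv
import io


def read_pipe_delimited(stream):
    """Read pipe-delimited rows, treating quotation marks as ordinary data."""
    reader = csv.reader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


# Invented sample containing commas and quotation marks inside fields.
# For a real file, open it with the documented code page, for example:
#   open(path, encoding="cp1252", newline="")
sample = io.StringIO('Ltd|LTD|"Ltd."\nS.A., Inc|SA|S.A.\n')
rows = read_pipe_delimited(sample)
for row in rows:
    print(row)
```

Because neither the comma nor the quotation mark has any special meaning here, fields containing them pass through unchanged.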

 


Prices

 

Prices are in euros and are charged per record, at EUR 0.04 for each record requested.  The whole file can be purchased for the discount price of EUR 3950.

 

Updates after the initial purchase are priced, per quarter, at 10% of the original price for all updates to the chosen filter, provided the file is updated every quarter.  Otherwise, a full file update costs EUR 1450.
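The arithmetic behind these prices can be sketched in Python as follows. The function names are invented for this example, and the figures are simply those quoted above; any order should be confirmed with a formal count and quotation.

```python
from decimal import Decimal

# Figures quoted in this section, in euros.
PRICE_PER_RECORD = Decimal("0.04")
FULL_FILE_PRICE = Decimal("3950")
FULL_FILE_UPDATE = Decimal("1450")
QUARTERLY_UPDATE_RATE = Decimal("0.10")


def filter_price(record_count):
    """Price of a filtered selection charged per record."""
    return PRICE_PER_RECORD * record_count


def quarterly_update_price(original_price):
    """Price of one quarterly update: 10% of the original purchase price."""
    return original_price * QUARTERLY_UPDATE_RATE


price = filter_price(20000)
print(price)                           # per-record price for 20000 records
print(quarterly_update_price(price))   # one quarterly update for that filter

# Record count at which the per-record price equals the full-file price.
print(FULL_FILE_PRICE / PRICE_PER_RECORD)
```

Decimal is used rather than float so that amounts such as 0.04 are represented exactly.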

 

This data is offered on a royalty-free basis for use in any way you wish, with the important proviso that the data may not be copied or distributed in any way whatsoever where it can, in normal use, be accessed by other users.  In other words, if you would like to use this data in your software package, that is allowed provided users cannot get at, or export, the data themselves.  You will be asked to agree to our terms and conditions when purchasing.  Our terms, conditions and licensing structure can be viewed here.

 

If you would like more information, or would like to request a count for any filter, please send a message.

 

To purchase the full file, follow this link to order by credit card.

 


                         

                         

This data table is used by GRC Tools for address standardization - read about it here.

 


 



GRC Database Information

AMSTERDAM

The Netherlands