Address elements data
An ideal resource for standardising and processing your data
tables to improve quality, de-duplication rates and postal
validation rates.
This data resource contains strings found in global address
data - street types / thoroughfare types, post office box strings,
company types and so on. It is unique in its coverage and scope.
This table, built by GRC Database Information, contains over
497000 records in the quarter 2-2013 release. The records hold
strings which are found in address data throughout the world.
These strings include:
Thoroughfare types (Street, Road, Avenue etc.)
Building types (House, Cottage, Bungalow etc.)
University strings (Univ., University of etc.)
Sub-building indicators (Room, Floor, Gate etc.)
Zonal indicators (Industrial Estate, Zone etc.)
Settlement indicators (City, Village etc.)
Postbox indicators (P.O. Box, Postfach, Boîte Postale etc.)
Indicators of man-made objects (Park, Garden, Dock etc.)
Indicators of geographic locations (Hill, River, Mountain etc.)
Biological indicators (Oak, Bird etc.)
Company types (Ltd, PLC, Inc. etc.)
Department indicators (Division, Department etc.)
Directionals (North, South, East etc.)
Personal name attention prefix (Attn., c/o etc.)
Forms of address (Mr, Saint, Professor etc.)
Personal names (Alfons, Winston, Charles etc.)
Colours (Red, Blue etc.)
Numbers (Three, Third etc.)
Articles (The, a, an etc.)
Prepositions (de, van, in etc.)
Adjectives (Old, New, Large etc.)
Obscenities and non-valid address entries (n/a, see e-mail
etc.)
Other (address components not able to be categorised
elsewhere)
Each record presents the alternative forms in which the string
can be written (such as common misspellings, diacritical
differences and casing differences), the correct forms in upper
and mixed case, and the standardised forms in upper and mixed
case.
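As a rough illustration of how alternative forms might be used for standardisation, the sketch below maps each alternative form to a standardised form and applies it to the last word of an address line. The sample rows and the names `SAMPLE_ROWS`, `build_lookup` and `standardise_last_word` are hypothetical; they do not reflect the table's real layout, which is described in the documentation.

```python
# Hypothetical sample rows: (alternative form, standardised mixed-case form).
# These values are illustrative and are not taken from the actual table.
SAMPLE_ROWS = [
    ("street", "Street"),
    ("str.", "Street"),
    ("stret", "Street"),  # common misspelling
    ("road", "Road"),
    ("rd", "Road"),
    ("rd.", "Road"),
]


def build_lookup(rows):
    """Map each case-folded alternative form to its standardised form."""
    return {alternative.casefold(): standard for alternative, standard in rows}


def standardise_last_word(address_line, lookup):
    """Replace the final word of an address line if it is a known variant."""
    words = address_line.split()
    if not words:
        return address_line
    words[-1] = lookup.get(words[-1].casefold(), words[-1])
    return " ".join(words)


lookup = build_lookup(SAMPLE_ROWS)
print(standardise_last_word("12 High Stret", lookup))  # 12 High Street
```

Real address data would need more than a last-word lookup (street types precede the name in many languages), so this only shows the basic variant-to-standard mapping.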
This table is ideal as a resource for standardising and processing
your data tables to improve quality, de-duplication rates and
postal validation rates.
This table has been built through intensive analysis of both
real-world data sets and data sets which cover addresses for a
whole country, or large parts thereof, such as postal tables. The
table is being continually updated and improved. Full details of
the table contents and layout are provided in this documentation.
Furthermore, basic counts for this version are provided in this
spreadsheet.
The data is provided in Windows code page 1252, complying with
ISO 8859-1 and MS-DOS code page 850. Diacritical marks (accents)
for most Western European languages are reproduced in the
table. For Eastern European and non-European languages,
however, diacritical marks have been replaced with equivalent
characters.
Though our database systems can store Unicode data, we
cannot enter it in normal use. We therefore provide additional
fields holding the same address strings with numeric Unicode
placeholders, so that you can translate the data to Unicode
should you wish to do so.
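The exact syntax of the numeric placeholders is not stated on this page. Purely as an assumption, the sketch below supposes they look like decimal character references such as `&#321;`, and shows how such placeholders could be translated to Unicode characters. The function name `placeholders_to_unicode` is hypothetical.

```python
import re

# ASSUMPTION: placeholders look like "&#321;" (decimal code point). The real
# placeholder syntax is defined in the product documentation, not here.
PLACEHOLDER = re.compile(r"&#(\d+);")


def placeholders_to_unicode(text):
    """Replace each decimal placeholder with the character it stands for."""
    return PLACEHOLDER.sub(lambda match: chr(int(match.group(1))), text)


print(placeholders_to_unicode("&#321;&#243;d&#378;"))  # Łódź
```

If the placeholders use a different notation, only the regular expression and the number base would need to change.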
Coverage
The table has data for most of the world's countries, but the
information is fuller for some countries than for others. As a
guide, countries have been categorised according to coverage,
as follows:
A countries: Over 10000 real-world records have been analysed
to create the tables for these countries:
Algeria, Argentina, Armenia, Australia, Austria, Azerbaijan,
Bahamas, Belgium, Bermuda, Brazil, Brunei, Bulgaria,
Cambodia, Canada, Chile, China, Colombia, Croatia, Cyprus,
Czech Republic, Denmark, Estonia, Finland, France, Gabon,
Germany, Greece, Guyana, Haiti, Hong Kong, Hungary, India,
Indonesia, Iran, Ireland, Israel, Italy, Japan, Luxembourg,
Malaysia, Maldives, Mexico, Morocco, Netherlands, New
Zealand, Norway, Pakistan, Philippines, Poland, Portugal,
Romania, Russia, Singapore, Slovakia, South Africa, South
Korea, Spain, Sweden, Switzerland, Syria, Taiwan, Thailand,
Turkey, Ukraine, United Arab Emirates, United Kingdom, United
States of America, Venezuela, Vietnam.
B countries: Between 1000 and 9999 real-world records for each
of these countries have been analysed to create these tables:
Albania, Andorra, Antigua & Barbuda, Aruba, Bahrain,
Bangladesh, Barbados, Belarus, Benin, Bhutan, Bolivia, British
Virgin Islands, Burkina Faso, Cameroon, Cayman Islands, Congo
(Kinshasa), Costa Rica, Cuba, Ecuador, Egypt, El Salvador, Fiji,
Georgia, Ghana, Guernsey, Iceland, Ivory Coast, Jamaica,
Jersey, Jordan, Kazakhstan, Kenya, Kuwait, Latvia, Lebanon,
Liechtenstein, Lithuania, Malta, Mauritius, Monaco, Montenegro,
Nigeria, Oman, Panama, Papua New Guinea, Paraguay, Peru,
Puerto Rico, Qatar, Réunion, St Kitts & Nevis, Saudi Arabia,
Serbia, Slovenia, Sri Lanka, Trinidad & Tobago, Tunisia,
Uruguay, Venezuela, Western Sahara, Yemen, Zimbabwe.
C countries: Fewer than 1000 real-world records for each of these
countries have been analysed to create these tables:
All other countries
Filters
You may choose to purchase data based on any filter which can
be carried out using the available fields. For example, you might
want:
• All records for Switzerland.
• United States street type records in English and Spanish.
• The 100 most commonly occurring address strings for a
given country.
• Upper case strings only for a given country.
• Strings which do not contain punctuation.
• And so on.
Please contact us with requests for counts or other information.
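To make the idea of a filter concrete, here is a minimal sketch of the "most commonly occurring strings for a given country" example. The field names `country`, `string` and `occurrences`, the sample values and the function `top_strings` are all hypothetical; the actual fields are listed in the documentation.

```python
# Hypothetical records; the field names "country", "string" and "occurrences"
# are placeholders, not the table's documented field names.
records = [
    {"country": "CH", "string": "Strasse", "occurrences": 950},
    {"country": "CH", "string": "Weg", "occurrences": 720},
    {"country": "CH", "string": "Platz", "occurrences": 310},
    {"country": "US", "string": "Street", "occurrences": 990},
]


def top_strings(rows, country, limit):
    """Return the `limit` most frequently occurring strings for one country."""
    selected = [row for row in rows if row["country"] == country]
    selected.sort(key=lambda row: row["occurrences"], reverse=True)
    return [row["string"] for row in selected[:limit]]


print(top_strings(records, "CH", 2))  # ['Strasse', 'Weg']
```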
Sample
If you would like to see an example of the data, you can
download a sample here, pertaining to the German-language
street type straße, from a previous version. This is a
Microsoft Excel spreadsheet file. If you are unable to read this
format, please send us a message and we can provide the same
data in another format. Please note: the German lower-case
letter ß is written SS in upper case and is not used in
Switzerland.
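The ß behaviour noted above can be seen in Python's built-in Unicode case mapping. The helper `same_street_type` is a hypothetical name used only for this illustration; it compares two spellings after case folding, which treats ß and ss as equal.

```python
# Python's built-in case mapping follows the Unicode rules for ß.
print("straße".upper())     # STRASSE
print("straße".casefold())  # strasse


def same_street_type(first, second):
    """Compare two spellings while ignoring case and the ß/ss difference."""
    return first.casefold() == second.casefold()


print(same_street_type("Straße", "STRASSE"))  # True
print(same_street_type("Straße", "Strasse"))  # True (Swiss spelling)
```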
Formats
Data is held in Microsoft Visual FoxPro format, but can also be
provided in these formats: FoxPro 2.x (dBase III+), pipe-delimited
text, tab-delimited text, fixed column width text, and Excel (for
small files of fewer than 15 000 records only). Other delimiters
are also possible, but as the data contains commas and quotation
marks, we would advise against comma-delimited format.
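As a sketch of why a pipe delimiter suits this data, the example below reads pipe-delimited text in Windows code page 1252 with Python's standard `csv` module, leaving commas and quotation marks in the data untouched. The two-field sample and the function name `read_pipe_delimited` are hypothetical and do not represent the delivered file layout.

```python
import csv
import io


def read_pipe_delimited(stream):
    """Yield each row as a list of fields from a pipe-delimited text stream."""
    # QUOTE_NONE keeps quotation marks in the data as literal characters.
    reader = csv.reader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    yield from reader


# Hypothetical two-field sample, encoded as Windows code page 1252.
raw = 'Straße|STRASSE\n"Old" Rd, East|ROAD\n'.encode("cp1252")
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="cp1252", newline="")
for row in read_pipe_delimited(stream):
    print(row)
```

For a file on disk, the same function could be given a handle opened with `open(path, encoding="cp1252", newline="")`.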
Prices
Prices are in euros and are charged per record, at EUR 0.04 per
record requested. The whole file can be purchased for the
discount price of EUR 3950.
Updates after an initial purchase are priced per quarter at 10% of
the original price, covering all updates to the filter chosen,
provided the data is updated each quarter. Otherwise, a full file
update costs EUR 1450.
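The pricing above can be worked through in a few lines. This is a sketch of one reading of the stated prices, not an official calculator; the function names are hypothetical. At EUR 0.04 per record, a selection reaches the full-file price of EUR 3950 at 3950 / 0.04 = 98750 records.

```python
PRICE_PER_RECORD_EUR = 0.04   # per-record price stated above
FULL_FILE_PRICE_EUR = 3950    # discount price for the whole file
QUARTERLY_UPDATE_RATE = 0.10  # share of the original price, per quarter


def filter_price(record_count):
    """Per-record price for a filtered selection, in euros."""
    return round(record_count * PRICE_PER_RECORD_EUR, 2)


def quarterly_update_price(original_price):
    """Price of one quarterly update when updating every quarter."""
    return round(original_price * QUARTERLY_UPDATE_RATE, 2)


print(filter_price(20000))            # 800.0
print(quarterly_update_price(800.0))  # 80.0
```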
This data is offered on a royalty-free basis for use in any way you
wish, with the important proviso that the data may not be copied
or distributed in any way whatsoever where it can, in normal use,
be accessed by other users. In other words, if you would like to
use this data in your software package, that is allowed provided
users cannot get at, or export, the data themselves. You will be
asked to agree to our terms and conditions when purchasing.
Our terms, conditions and licensing structure can be viewed here.
If you would like more information, or would like to request a
count for any filter, please contact us.
To purchase the full file, follow this link to order by credit card.
This data table is used by GRCTools for address standardization.
Read about it here.
If you have any questions about any of our products, please
contact us.
© GRC Database Information 2013
GRC Data Intelligence
Expertise in Global Data