This is the online manual of GRCTools, personal name and address management software.

Place names - parse & standardise

Purpose: to locate, parse and standardize place names. Foreign or minority language forms of a town name, misspellings and typos, transcribed letters, incorrect diacritical marks and similar variants are replaced with the standardized local language equivalent.

Example:

LONDNO  
16 High Street, London  
Londres  
Station House,London  

becomes

                  London
16 High Street    London
                  London
Station House     London

To be found, a place name must either be on its own in a field or be separated from the other data in the field by a comma. Any trailing commas left after parsing are removed. A correctly formatted postal code is required.
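
The field rule above can be pictured with a short Python sketch. It is an illustration only, not GRC Tools' own code: the function name `split_place_name` and the tiny `KNOWN_PLACES` table are hypothetical, only exact matching is shown, and the postal code check is left out.

```python
# Hypothetical lookup table: variant spelling (upper case) -> standardized name.
KNOWN_PLACES = {"LONDON": "London", "LONDRES": "London", "LONDNO": "London"}


def split_place_name(field):
    """Return (remaining_field, place_name) for one field value.

    A place name is only recognised when it is the whole field or one
    comma-separated part of it. place_name is None when nothing matches.
    """
    parts = [p.strip() for p in field.split(",")]
    for i, part in enumerate(parts):
        standard = KNOWN_PLACES.get(part.upper())
        if standard is not None:
            rest = [p for j, p in enumerate(parts) if j != i and p]
            # Re-joining the remaining parts leaves no trailing comma.
            return ", ".join(rest), standard
    return field, None


for value in ["LONDNO", "16 High Street, London", "Londres",
              "Station House,London", "16 High Street London"]:
    print(repr(value), "->", split_place_name(value))
```

In this sketch the last value is returned unchanged, because "London" is neither alone in the field nor set off by a comma.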

Note: as with all GRC Tools™ processes, this process handles fields in alphabetical order. Parsing is more accurate when it starts with the fields most likely to contain the data being searched for, so it may be better to run the process two or more times on selected fields than to parse all fields in one pass.

The lookup tables used by GRC Tools™ for this process are very large. GRC Tools™ may appear to hang for some time at 0% and 100% during processing. This is normal; avoid interrupting the program at these points.

Information required: for each field chosen, GRC Tools™ needs to know the case in which the corrected version should be written and the field to which the place name should be moved.

Place names can be parsed using two search methods: exact string matching and fuzzy matching. Fuzzy matching increases the number of settlements parsed but is less accurate than exact string matching, so it should be used with caution. If you choose to use fuzzy matching, it is always run in addition to, and after, exact string matching, never instead of it.
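
The order of the two methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm GRC Tools uses: the `PLACES` table, the function `match_place`, and the `cutoff` value are all hypothetical, and the standard library's `difflib.get_close_matches` stands in for whatever fuzzy matcher the product applies.

```python
import difflib

# Hypothetical lookup table: variant (upper case) -> standardized name.
PLACES = {"LONDON": "London", "LONDRES": "London", "LEEDS": "Leeds"}


def match_place(candidate, use_fuzzy=False, cutoff=0.8):
    """Return (standard_name, method), or (None, None) if nothing matches."""
    key = candidate.strip().upper()
    # Pass 1: exact string matching always runs first.
    if key in PLACES:
        return PLACES[key], "exact"
    # Pass 2: fuzzy matching runs only if requested, and only on exact misses.
    if use_fuzzy:
        close = difflib.get_close_matches(key, list(PLACES), n=1, cutoff=cutoff)
        if close:
            return PLACES[close[0]], "fuzzy"
    return None, None


print(match_place("Londres"))                 # found by the exact pass
print(match_place("LONDNO"))                  # missed without fuzzy matching
print(match_place("LONDNO", use_fuzzy=True))  # recovered by the fuzzy pass
```

Lowering the assumed `cutoff` would recover more typos but would also accept more strings that merely resemble a place name, which is the trade-off described above.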

You may also choose to attempt to parse place names without postal code validation. This pass is run after exact string matching and, if chosen, fuzzy matching. It is useful for locating place names where postal codes are incorrect or missing from the file. However, it should be used with caution: it may find parts of addresses which are similar to place names within the country. Avoid choosing this option if your field contains a significant amount of non-place name data.

A town is only matched when the record's postal code falls within the postal code area defined for that town in the lookup table. This prevents address components with similar forms to settlement names from being incorrectly parsed. For this reason the process needs to know which field contains the postal code. The postal code should be in its correct format, without punctuation or other codes such as country sorting codes (e.g. GB-).
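
As a rough illustration of the postal code check, the Python sketch below accepts a town only if the postal code's area is listed for that town. Everything here is an assumption made for the example: the `TOWN_AREAS` table, the function names, and the UK-style rule that the area is the leading run of letters. The prefix-stripping helper shows one way to clean data before running the process; it does not describe what GRC Tools does internally.

```python
import re

# Hypothetical table: standardized town -> postal code areas it may appear in.
TOWN_AREAS = {
    "London": ("E", "EC", "N", "NW", "SE", "SW", "W", "WC"),
    "Leeds": ("LS",),
}


def strip_country_prefix(raw):
    """Remove a leading country sorting code such as 'GB-' (assumed form)."""
    return re.sub(r"^[A-Z]{1,3}-", "", raw.strip().upper())


def postcode_area(code):
    """Leading letters of a UK-style postal code, e.g. 'SW1A 1AA' -> 'SW'."""
    m = re.match(r"[A-Z]+", code)
    return m.group(0) if m else ""


def town_allowed(town, raw_postcode):
    """True if the postal code's area is listed for the town."""
    area = postcode_area(strip_country_prefix(raw_postcode))
    return area in TOWN_AREAS.get(town, ())


print(town_allowed("London", "SW1A 1AA"))     # area SW is listed for London
print(town_allowed("London", "LS1 4AP"))      # area LS is not
print(town_allowed("London", "GB-SW1A 1AA"))  # prefix removed before the check
```

With such a check in place, a street called "London Road" in a record with a Leeds postal code would not be parsed as the town London.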
