Last updated 13th February 2003

Automatically Structuring Graphical Data

Shape, Context, and Statistical Language Modelling for recognition and validation of graphical objects

We are a sub-group of the Intelligent and Graphical Systems (IGS) research group in the Department of Computer Science, NUI Maynooth.

People:

Adam Winstanley (academic)
Laura Keyes (research assistant)
Leo Mulhare (research student)
Bashir Salaik (research student)
Aliex Zhou (research student)
Andrea McQuillan (MCS student)

Associates
Diarmuid O'Donoghue (academic)
John McDonald (academic)

 
Applications:

(Re-)Structuring topographic data for GIS

Validation and Quality assessment of Geographic Data

Recognition of complex objects with graphical data

Structuring data for multi-media systems

Projects

Topographic Shape Recognition
Structure Matching

Graphical Language Modelling

Structuring Archaeological Features on Topographic Digital maps

Recognition, Labelling and Retrieval of Features on Technical Drawings

GI Quality Indicators


Topographic Shape Recognition
People Dr. Adam C. Winstanley, Laura Keyes
Description
Automatic structuring (feature coding and object recognition) of cartographic data, such as that derived from air survey or raster scanning large-scale paper maps, requires the classification of objects such as buildings, roads, rivers, fields and railways based on their shape. There is a considerable body of published work on the identification and classification of objects within shapes. However, less progress has been made on automating feature extraction.

Recognition of objects is largely based on the matching of descriptions of shapes. Numerous shape description techniques have been developed in computer vision and image processing, such as boundary chain coding, analysis of scalar features (dimension, area, number of corners, etc.), Fourier descriptors and moment invariants. A comparison is made of the effectiveness of each method for recognising features on large-scale topographic maps and plans.
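As a rough illustration of the scalar features mentioned above, the following sketch computes area, perimeter, number of corners and a compactness ratio for a closed polygon. The function name, the dictionary keys and the choice of compactness measure are assumptions made for this example and are not taken from the project software.

```python
import math


def scalar_descriptors(polygon):
    """Return simple scalar shape descriptors for a closed polygon.

    `polygon` is a list of (x, y) vertices in order; the first vertex
    is not repeated at the end. (Hypothetical helper for illustration.)
    """
    n = len(polygon)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0            # shoelace term for this edge
        perimeter += math.hypot(x1 - x0, y1 - y0)  # edge length
    area = abs(twice_area) / 2.0
    # Compactness is 1.0 for a circle and decreases for elongated shapes.
    compactness = 4.0 * math.pi * area / (perimeter ** 2)
    return {
        "area": area,
        "perimeter": perimeter,
        "corners": n,
        "compactness": compactness,
    }


if __name__ == "__main__":
    building = [(0, 0), (2, 0), (2, 2), (0, 2)]  # a 2 x 2 square footprint
    print(scalar_descriptors(building))
```

A long, thin polygon such as a road section would give a much lower compactness than the square footprint, which is the kind of difference a scalar classifier can exploit.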

The above techniques are evaluated as general classifiers applied to broad classes of topographic shape (buildings, fields, roads, etc.) using sample data provided by Ordnance Survey (OS) Great Britain. One aim of this project is to test each of the methods exhaustively and to perform a statistical analysis of the range of descriptor values obtained both within and between OS feature types. For example, we evaluate how well the techniques distinguish buildings from land parcels, and also how well they distinguish each type of land parcel, i.e. surface land parcel or defined natural land-cover. Another aim is to evaluate the classification performance of each method on all polygons through comparison with the original data and to compare the performance between the three methods.


Each shape description technique produces a set of real descriptor values that are used to describe the shape of topographic objects.

When tested on the more generalised topographic shapes, Fourier descriptors do not appear to be as conclusive or successful as hoped. However, both the moment invariant and scalar techniques proved significantly more successful. Results show that no single shape technique is powerful enough for the task: in different situations, one technique performs better than the others and produces significant results (e.g. the moment invariants technique distinguishing buildings from linear features in built-up areas).
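For readers unfamiliar with Fourier descriptors, the sketch below shows one common way of deriving them from a closed boundary: the boundary points are treated as complex numbers, a discrete Fourier transform is taken, and the coefficient magnitudes are normalised. The function name, the normalisation by the first harmonic and the assumption that the boundary is listed counter-clockwise are choices made for this example; the project's own formulation may differ.

```python
import cmath


def fourier_descriptors(boundary, count=4):
    """Return `count` normalised Fourier descriptor magnitudes.

    `boundary` is a list of (x, y) points sampled in counter-clockwise
    order around a closed shape. (Hypothetical helper for illustration.)
    """
    points = [complex(x, y) for x, y in boundary]
    n = len(points)
    if n < count + 2:
        raise ValueError("boundary needs at least count + 2 points")
    # Plain discrete Fourier transform of the complex boundary signal.
    coeffs = [
        sum(p * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, p in enumerate(points)) / n
        for k in range(n)
    ]
    # Skipping coeffs[0] discards position, taking magnitudes discards
    # rotation and starting point, dividing by |coeffs[1]| discards scale.
    scale = abs(coeffs[1])
    if scale < 1e-12:
        raise ValueError("degenerate boundary or clockwise point order")
    return [abs(coeffs[k]) / scale for k in range(2, 2 + count)]


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    print(fourier_descriptors(square, count=3))
```

Because the descriptors are invariant to translation, rotation and scale, two buildings of the same outline give the same values wherever they sit on the map.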

Publications

Keyes, L and Winstanley, AC: Using Moment Invariants for Classifying Shapes on Large Scale Maps, Computers, Environment and Urban Systems 25(1), 119-130, January 2001.

L. Keyes and A.C. Winstanley: Data Fusion for Topographic Object Classification, IEEE/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas, 8-9 November 2001.

Keyes, L and A.C. Winstanley: Topographic Object Classification Through Shape, GIS Research UK, 383-387, Glamorgan, April 2001 (extended abstract).

L. Keyes and A.C. Winstanley: Using Moment Invariants for Classifying Shapes on Large Scale Maps, IMVIP 2000 Proceedings of the Irish Machine Vision and Image Processing Conference, 149-156, Queen's University Belfast, September 2000.

A.C. Winstanley and L. Keyes: Applying Computer Vision Techniques to Topographic Objects, XIXth International Archives of Photogrammetry and Remote Sensing, 33 (B3), 480-487, July 2000.

Keyes, L and Winstanley, AC: Moment Invariants as a Classification Tool for Cartographic Shapes on Large Scale Maps, 3rd AGILE Conference on Geographic Information Science, Helsinki, May 2000.

Keyes, L and Winstanley, AC: Using Moment Invariants for Classifying Shapes on Large Scale Maps, GIS Research UK, York, April 2000 (extended abstract).

L. Keyes and A.C. Winstanley: Fourier Descriptors as a General Classification Tool for Topographic Shapes, IMVIP 1999 Proceedings of the Irish Machine Vision and Image Processing Conference, Dublin City University, 1999.

L. Keyes and A.C. Winstanley: Topographic object recognition through shape, NUIM Signals and Systems Research Group, Technical Report NUIM/SS/--/2001/06, 2001.

 
Funding
Software Support for the Automatic Structuring of Graphical Data, Enterprise Ireland Strategic Research Grants Scheme ST/1998/021, IR£37,000 (1998-2000)

Parallelisation of algorithms for analysing graphical data, (with Clemson University) Enterprise Ireland International Collaboration Programme IC/2000/082, IR£1300 (2000-1)

Topographic Object Recognition (with D. O’Donoghue and Ordnance Survey), Enterprise Ireland/British Council Research Visits Scheme BC/2002/015, €2500 (2002)

Structuring topographic data through object shape, Ordnance Survey (GB) Research Grant, UK£1000 (2000-1)



Structure Matching
People Dr. Adam C. Winstanley, Diarmuid O'Donoghue, Leo Mulhare
Description
We have developed the Cartographic Structure-Matching (CSM) algorithm, which identifies multi-object structures from topological data. We describe the significant differences between CSM and traditional structure-matching developed for the purposes of cognitive modelling (see background information). CSM was initially developed as a tool for categorising topological data based on the immediate context of an unclassified object. (Traditional techniques examine only an object's content, such as size and shape.) CSM looks at the objects that are adjacent to the unclassified object and uses them as a basis for inferring a classification for that object. CSM is particularly adept at classifying objects whose shape does not uniquely identify them. We have investigated a number of applications of CSM, some well developed and others still in development.

Classification Error Detection
Error detection is central to the quality assurance needs of national ordnance survey offices. Specific classification errors are identified by explicitly defining an illegal context, for example a section of road that does not border another road segment. Detecting specific errors is perhaps of greatest use when there is a known problem with an existing categorisation process.
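The road example above can be expressed as a simple check. The sketch assumes adjacency and class labels are held in plain dictionaries and uses a hypothetical function name; it illustrates one illegal context only and is not the group's implementation.

```python
def find_isolated_roads(adjacency, labels, road_label="road"):
    """Return road objects that do not border any other road object."""
    flagged = []
    for obj, label in labels.items():
        if label != road_label:
            continue
        neighbours = adjacency.get(obj, [])
        # Illegal context: no adjacent object is itself classified as road.
        if not any(labels.get(n) == road_label for n in neighbours):
            flagged.append(obj)
    return sorted(flagged)


if __name__ == "__main__":
    adjacency = {
        "r1": ["r2", "b1"],
        "r2": ["r1"],
        "r3": ["b1"],          # borders only a building
        "b1": ["r1", "r3"],
    }
    labels = {"r1": "road", "r2": "road", "r3": "road", "b1": "building"}
    print(find_isolated_roads(adjacency, labels))
```

Here `r3` is reported as a likely misclassification because none of its neighbours is a road.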
Quality Estimation / Frequency Distribution
Previous results indicate that the frequency with which different contexts occur within a map follows an exponential distribution. However, individual map segments may vary, perhaps using more urban-related contexts. When updating a map (or map segment), we may compare the distribution of contexts before and after the update; any significant discrepancy may indicate an error in the updating process.
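A before-and-after comparison of context frequencies could be sketched as below. The use of total variation distance as the discrepancy measure is an assumption made for this example, as are the function names; the source does not say which statistic is used, and any threshold for "significant" would have to be chosen empirically.

```python
from collections import Counter


def context_distribution(contexts):
    """Turn a list of observed contexts into relative frequencies."""
    if not contexts:
        raise ValueError("no contexts observed")
    counts = Counter(contexts)
    total = sum(counts.values())
    return {ctx: c / total for ctx, c in counts.items()}


def distribution_shift(before, after):
    """Total variation distance between two distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    keys = set(before) | set(after)
    return 0.5 * sum(abs(before.get(k, 0.0) - after.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    # Hypothetical context labels observed in a map segment.
    before = context_distribution(["road+garden", "road+garden", "road", "road"])
    after = context_distribution(["road+garden", "road+garden", "road+garden", "road"])
    print(distribution_shift(before, after))
```

A value near zero suggests the update left the mix of contexts unchanged, while a larger value flags the segment for closer inspection.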
Rejoining Segmented Objects
Topological data is a two-dimensional (2D) representation of three-dimensional (3D) information. Occlusions frequently segment objects, such as a bridge occluding the river beneath it. CSM can identify such contexts and introduce an occluded object segment.
Composite-Object Identification
Topological data is stored as individual land parcels. Hierarchical structure can be introduced based on thematically related collections of objects. Such collections are generally adjacent, and thus CSM is ideally suited to identifying such structures. For example, a road plus adjacent

Publications

Mulhare, O'Donoghue, Winstanley, Context-based classification of objects in cartographic data, GISRUK Geographical Information Science Research Conference, pp 195-198, Sheffield, UK, April 2002.

Mulhare, D. O'Donoghue, A. C. Winstanley, Analogical Structure Matching on Cartographic Data, 12th Artificial Intelligence and Cognitive Science AICS-2001, NUI Maynooth, Ireland, pp 43-53, Sept. 5-7, 2001, ISBN 0-901519-48-0.

O'Donoghue, D., Adam Winstanley, Finding Analogous Structures in Cartographic Data, 4th AGILE Conference on G.I.S. in Europe, Czech Republic, April, 2001.

Adam Winstanley, Diarmuid O'Donoghue, and Laura Keyes, Topographical Object Recognition through Structural Mapping, 1st International Conference on Geographic Information Science (GIScience 2000), Savannah, Georgia, USA, October 28-31, 2000.

Bohan, O'Donoghue, A Model for Geometric Analogies using Attribute Matching, AICS-2000 11th Artificial Intelligence and Cognitive Science Conference, Aug. 23-25, NUI Galway, Ireland, 2000.

 
Funding

Topographic Object Recognition (with D. O’Donoghue and Ordnance Survey), Enterprise Ireland/British Council Research Visits Scheme BC/2002/015, €2500 (2002)

Context sensitive categorisation of topographic data, Ordnance Survey (GB) Research Grant, UK£500 (2000-1)



Graphical Language Modelling
People Dr. Adam C. Winstanley, Bashir Salaik
Description

The success of statistical language models at improving the performance of Natural Language Processing (NLP) applications suggests their possible applicability to the area of automated map reading. This idea stems from the fact that there are similarities between natural language and cartographic language.

There are, of course, also many differences between the two forms of data, notably the one-dimensional sequence of words forming a natural language text compared with the two-dimensional graphical map. Natural language also has a large vocabulary, whereas the number of classes of topographic object is usually small.

This project describes a method that uses Statistical Language Models to characterise the context of different classes of cartographic object. We use these models to measure the frequency of each feature context. This can then be used to help identify unclassified map features in combination with other methods (for example, based on the object’s shape). The data sets being used are of large-scale topographic mapping (usually depicted at a scale of 1:1250).
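To make the idea concrete, the sketch below builds a bigram-style model over object classes: it counts how often each class occurs next to each neighbouring class and then scores candidate classes for an unclassified object from the classes around it. The pair-based training data, the add-one smoothing and all function names are assumptions for illustration and are not taken from the project.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(adjacent_pairs):
    """Count how often each object class appears next to each neighbour class.

    `adjacent_pairs` is an iterable of (neighbour_class, object_class).
    """
    counts = defaultdict(Counter)
    for neighbour_class, object_class in adjacent_pairs:
        counts[neighbour_class][object_class] += 1
    return counts


def log_score(candidate, neighbour_classes, counts, vocabulary):
    """Add-one smoothed log-probability of `candidate` given its neighbours."""
    score = 0.0
    for neighbour in neighbour_classes:
        seen = counts.get(neighbour, Counter())
        total = sum(seen.values()) + len(vocabulary)
        score += math.log((seen[candidate] + 1) / total)
    return score


def most_likely_class(neighbour_classes, counts, vocabulary):
    """Return the class with the highest score for the given neighbours."""
    return max(
        sorted(vocabulary),
        key=lambda c: log_score(c, neighbour_classes, counts, vocabulary),
    )


if __name__ == "__main__":
    # Hypothetical training pairs taken from already classified map data.
    pairs = (
        [("road", "building")] * 3
        + [("road", "road")] * 2
        + [("garden", "building")] * 3
    )
    vocabulary = {"building", "garden", "road"}
    model = train_bigrams(pairs)
    print(most_likely_class(["road", "garden"], model, vocabulary))
```

Such a context score could then be combined with a shape-based score, which is the kind of combination the description above has in mind.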

Publications

Bashir Salaik, Adam Winstanley and Laura Keyes: Statistical Language Models For Topographic Data Recognition (forthcoming)


Structuring Archaeological Features on Topographic Digital maps
People Dr. Adam C. Winstanley, Laura Keyes, John McDonald
Description
Geographic Information Systems (GIS) are increasingly replacing paper maps as the main tool for the analysis and processing of spatial data. For this reason, mapping organisations such as Ordnance Survey provide their data in digital form. However, in order to automate many tasks, this data has to be structured explicitly with information and relationships only implied by the paper version. In an object model, for example, a building is represented as a unique identifiable object containing not only the geometry that depicts it but also attribute information describing non-geometric features (for example, the address). For a particular task, the user might attach specific information to the basic structuring provided by the national mapping organisation.

The work involved in converting large data-sets into an object model is considerable, and structuring the data manually is very labour intensive. Some automation of this process is possible. As an extension to previous work on object recognition in topographic data, this project aims to implement an automatic procedure for the identification, extraction and depiction of archaeological features on large-scale maps. The system searches the data-sets for likely archaeological features and confirms their status as such. It also ascertains their extent by distinguishing between features that do and do not belong to the site. Complications arise because top and bottom of slope features may also represent modern man-made features such as embankments or road cuttings. It is therefore important to distinguish the actual archaeological features from other anthropogenic forms. Our previous work on object recognition is applied to this problem. A bounding polygon is then created using existing geometry, new lines, or both, and a composite model is formed to include the relevant features and geometry. A hierarchical structure to represent the site is then built.


Outline of process involved in developing the tool in ArcView

For many applications, a generalised representation of the archaeological site is required. This is generated using geometric techniques such as the medial axis transform and the crust method of curve reconstruction. This automatic generalisation of the archaeological site produces a simplified version of the site plan suitable for small-scale depiction, morphological analysis and reconstruction of the original configuration of the site.

Publications

Keyes, L and A.C. Winstanley: Automatically Structuring Archaeological Features on Topographic Maps, GIS Research UK, 191-194, University of Sheffield, April 2002 (extended abstract).

Keyes, L and A.C. Winstanley: Automatic Identification of Archaeological Features on Digital Topographic Maps, NUI Maynooth Postgraduate Research Record: 2002

A.C. Winstanley and L. Keyes: Identification, Extraction and Depiction of Heritage Features in OS MasterMap Data, Technical Report, Department of Computer Science, NUIM, 2002-2003.

Funding

Recognition and Structuring of heritage features in topographic data, Ordnance Survey (GB) Research Grant, UK£3000 (2001-2)



Recognition, Labelling and Retrieval of Features on Technical Drawings
People Dr. Adam C. Winstanley, Laura Keyes, Aliex Zhou
Description
Object recognition techniques previously developed and implemented were applied to graphical data derived from technical drawings. Techniques included those based on object shape (scalar, Fourier and moment invariant descriptors), object context (structure matching and analogical reasoning) and data fusion. The performance of all techniques was statistically examined and the most effective ones integrated into a prototype software tool. The tool allows a user to select an example object (simple or composite) and the software finds similar objects in the same or other drawings. The tool generates data structures that can be used to build multimedia linkages between objects, drawings and related information. The criteria for similarity are user-selectable allowing experimentation and performance measures to be made.

The information is accessed through a standard web browser interface, including navigation through hot-links and key-word search facilities. The CAD drawings showing the location of utilities and services also act as browser navigational maps. In operation, the system’s main use concerns day-to-day operation and maintenance tasks.

Project Tasks
The data formats and scripting languages used by AutoCAD were studied, and scripts were produced that analyse and manipulate objects within drawings. This system finds matches for user-selected objects based on shape and context. The matching criteria are selectable for experimentation and demonstration, and a log file is produced that summarises results.


Funding

Automatic recognition and labelling of features on technical drawings (with Entropic Ltd, Maynooth), Enterprise Ireland Innovations Partnership Feasibility Study IP/2002/064, €9000 (2002)



GI Quality Indicators
People Dr. Adam C. Winstanley, Laura Keyes, Andrea McQuillan
Description
Understanding and assessing quality raises different issues for a data provider and a data user. The quality assessment and reporting required for the user model (which also depends on the category of use) will differ from that of the producer model. This raises the question of how we can express quality when performing the transformation from the provider's data model to the user's real-world model. We are extending research on the automatic structuring of objects on topographic maps to include an evaluation of the quality of spatial data. When solving real-world problems, spatial decision systems are heavily affected by the data and models used. Therefore, reliable spatial data quality is an essential property of GIS and related applications. Our group proposes to build a feature quality indicator model to evaluate, measure and quantify data quality errors.

What quality measurements need to be considered and how are they assessed?
The causes of error in spatial data may come from varied sources, such as the understanding and modelling of reality, the source data, and data encoding, conversion, analysis and output. To control the effects of error on spatial data, the following data quality aspects need to be measured and assessed. The actual quality checking required may vary from application to application and with the context of use. An example of the relationship between the quality indicators and the model used to measure them can be seen below (this is not a final model and will be modified to include more detailed quality metrics).


Relationship between the quality components and the quality metrics.

GI Quality measurement, assessment and representation

As a group (IGS) we propose to produce a prototype tool-set that uses a set of defined feature quality indicators to detect and quantify spatial data quality errors on topographic maps and databases. Firstly, methods of measuring quality-related errors, and means of determining appropriate quality indicators for these errors, need to be developed. Shape and context recognition has already proven useful in the detection of misclassified features. It is envisaged that previous work on shape and context recognition, both as individual and combined procedures, can be applied in some form to many of the issues involved in data quality. These include completeness of cover, attribute accuracy, logical consistency and misclassification rates. We plan to introduce new techniques to act as quality metrics, using other statistical and data descriptor methods borrowed from fields such as computer vision and engineering, to address these quality issues. The following steps outline the most significant aims of our research.


Hierarchy showing requirements for handling spatial data quality and the indicator model we wish to produce.

Tasks

  • the identification and measurement of the required quality components;
  • a description of the implementation used to build the feature quality model;
  • an evaluation of the methods used in this implementation;
  • the quality tool-set that the implementation will produce; and
  • results and conclusions detailing the performance of the project.