ERIQ

Welcome to ERIQ

The Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ) conducts research on important problems in entity resolution and information quality. ERIQ is a collaboration between the Donaghey College of Engineering and Information Technology (EIT) at the University of Arkansas at Little Rock (UALR) and the Massachusetts Institute of Technology (MIT) Information Quality Program.

Background:

Data Integration is the process of combining data from different sources for use in a common application. Entity Resolution is a form of data integration in which records determined to represent the same real-world entity are successively located and merged. Whereas a database schema has pre-defined keys that allow entity information to be joined across tables, entity resolution in a non-RDBMS context is often a more complex process because:

  • The process may span many record sources, each with a different format
  • Entity attributes within a record may be incomplete, inconsistent, and, in some cases, incorrect
  • Knowledge management can be difficult because each record source may have a slightly different semantic ontology
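
As a rough illustration of the "locate and merge" idea described above, here is a minimal sketch in Python. It is not ERIQ's method: the record fields (name, email, city), the helper names, and the 0.85 similarity threshold are all assumptions chosen for the example, and the string comparison uses the standard library's difflib as a stand-in for a real matching rule.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace so formatting differences do not block a match."""
    return " ".join(str(value).lower().split())


def similar(a, b, threshold=0.85):
    """Return True when two records' names are close enough to be treated as one entity.

    The threshold is an illustrative assumption, not a recommended value.
    """
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


def merge(a, b):
    """Combine two records, filling attributes that are missing or empty in `a` from `b`."""
    merged = dict(a)
    for key, value in b.items():
        if not merged.get(key):
            merged[key] = value
    return merged


def resolve(records, threshold=0.85):
    """Successively locate and merge records that appear to describe the same entity."""
    entities = []
    for record in records:
        for i, entity in enumerate(entities):
            if similar(entity, record, threshold):
                entities[i] = merge(entity, record)
                break
        else:
            # No existing entity matched, so this record starts a new one.
            entities.append(dict(record))
    return entities


# Hypothetical records from sources with different formatting and missing attributes.
records = [
    {"name": "John  Smith", "email": "", "city": "Little Rock"},
    {"name": "john smith", "email": "jsmith@example.com", "city": None},
    {"name": "Mary Jones", "email": "mjones@example.com", "city": "Boston"},
]
entities = resolve(records)
```

In this sketch the two "John Smith" records collapse into a single entity that carries both the city from the first record and the email from the second, which is the sense in which a merged entity can know more than any one source record.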

Data integration in general and entity resolution in particular are tightly bound to information quality. Just as improving the completeness, consistency, and accuracy of input sources will produce more complete, consistent, and accurate resolution outcomes, it is also true that the aggregate knowledge gained about each entity through the resolution process can be re-applied to improve the quality of the record sources.