Data Matching
Deciding whether two data elements are the "same" (a.k.a. a match) or not
Data matching generally refers to the process of deciding whether two data elements are the "same" (a.k.a. a match) or not, where each data element may belong to a different class, such as a string, a tuple, or a column. Data matching is a key concept in data integration and data preparation, and it covers a wide spectrum of tasks. In this paper, we consider seven common data matching tasks: entity matching, entity linking, entity alignment, string matching, column type annotation, schema matching, and ontology matching.
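As a rough illustration of the match / non-match decision described above, the following sketch compares two strings, and then two tuples, using a character-level similarity score and a fixed threshold. It is not the method of any paper listed below: the function names, the `0.8` threshold, and the use of Python's standard `difflib` similarity are all assumptions chosen to keep the example minimal.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after light normalization."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """String matching: declare a match when similarity reaches the threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    or replaced by a learned model.
    """
    return similarity(a, b) >= threshold


def tuple_is_match(left: dict, right: dict, threshold: float = 0.8) -> bool:
    """Entity matching over tuples: average similarity of shared attributes."""
    shared = left.keys() & right.keys()
    if not shared:
        return False  # nothing to compare, so treat as a non-match
    score = sum(similarity(str(left[k]), str(right[k])) for k in shared) / len(shared)
    return score >= threshold


if __name__ == "__main__":
    print(is_match("Apple Inc.", "apple inc"))
    print(is_match("Apple Inc.", "Microsoft Corp"))
    print(tuple_is_match(
        {"name": "iPhone 13 Pro", "brand": "Apple"},
        {"name": "iphone 13 pro", "brand": "apple"},
    ))
```

The same decision shape (score a pair, then threshold it) carries over to the other element classes; only the way a pair is represented and scored changes.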
References
2023
- Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration. Proc. ACM Manag. Data, 2023
2022
- DADER: Hands-Off Entity Resolution with Domain Adaptation. Proc. VLDB Endow., 2022
- Domain Adaptation for Deep Entity Resolution. In SIGMOD ’22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022
2021
- Deep Learning for Blocking in Entity Matching: A Design Space Exploration. Proc. VLDB Endow., 2021
2018
- Distributed Representations of Tuples for Entity Resolution. Proc. VLDB Endow., 2018
2017
- Synthesizing Entity Matching Rules by Examples. Proc. VLDB Endow., 2017
- Generating Concise Entity Matching Rules. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017