Empirical Evaluation of Constraint Discovery Techniques for Business Logic Consistency Verification in Payroll Data Migration

Authors

  • Hao Cao, Master of Computer Engineering, Stevens Institute of Technology, Hoboken, NJ, USA
  • Muyu Liu, Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China

Keywords:

constraint discovery, business logic verification, payroll data migration, data quality

Abstract

Enterprise payroll system migrations demand rigorous verification that business logic (salary calculations, grade-based rules, and overtime policies) is faithfully preserved in the target environment. Manual testing remains the prevailing practice, yet it scales poorly and is prone to human oversight. Automated constraint discovery offers a promising alternative by extracting latent business rules directly from source data and applying them as verification checks on migrated records. This paper presents an empirical comparative evaluation of three constraint discovery families, namely functional dependencies (FDs), conditional functional dependencies (CFDs), and denial constraints (DCs), for detecting business logic inconsistencies introduced during payroll data migration. Experiments are conducted on two real-world datasets: the NYC Citywide Payroll Data (approximately 600,000 records per fiscal year, 17 attributes) and the IBM HR Analytics dataset (1,470 records, 35 attributes). A controlled error injection methodology simulating three categories of migration inconsistencies at four injection rates (1%, 3%, 5%, 10%) is employed to construct ground truth labels. Results indicate that CFD-based discovery achieves the most favorable precision-recall balance (F1 = 0.778 at a 5% error rate on NYC data), while DC discovery attains the highest recall (0.81) at the cost of reduced precision. FD discovery, though computationally efficient, exhibits limited recall due to its inability to encode context-sensitive rules. These findings provide actionable guidance for practitioners selecting automated verification strategies in enterprise data migration projects.
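To make the distinction between the three constraint families concrete, the following minimal Python sketch learns a rule from toy source rows and applies it to a migrated copy. It is an illustration only, not the discovery algorithms evaluated in the paper: the column names (grade, pay_basis, base_salary, ot_hours, ot_paid), the values, the helper functions, and the hand-written denial constraint are all hypothetical.

```python
# Toy source rows. Column names and values are hypothetical and are not
# taken from either dataset used in the paper.
source = [
    {"id": 1, "grade": "G1", "pay_basis": "per Annum", "base_salary": 50000, "ot_hours": 0, "ot_paid": 0.0},
    {"id": 2, "grade": "G1", "pay_basis": "per Annum", "base_salary": 50000, "ot_hours": 10, "ot_paid": 300.0},
    {"id": 3, "grade": "G2", "pay_basis": "per Annum", "base_salary": 65000, "ot_hours": 0, "ot_paid": 0.0},
    {"id": 4, "grade": "G1", "pay_basis": "per Hour", "base_salary": 22.5, "ot_hours": 5, "ot_paid": 150.0},
    {"id": 5, "grade": "G1", "pay_basis": "per Hour", "base_salary": 24.0, "ot_hours": 0, "ot_paid": 0.0},
]

# Simulated migration: copy the rows, then inject two inconsistencies.
target = [dict(r) for r in source]
target[2]["base_salary"] = 56000   # id 3: salary no longer matches its grade
target[4]["ot_paid"] = 120.0       # id 5: overtime pay with zero overtime hours


def matches(row, condition):
    """True if the row satisfies every attribute=value pair in condition."""
    return all(row[k] == v for k, v in condition.items())


def learn_rule(rows, lhs, rhs, condition=None):
    """Return the lhs -> rhs mapping if it is functional on the rows that
    match condition, else None. An empty condition gives a plain FD;
    a non-empty one gives a simple constant-pattern CFD."""
    condition = condition or {}
    mapping = {}
    for r in rows:
        if not matches(r, condition):
            continue
        key = tuple(r[a] for a in lhs)
        if mapping.setdefault(key, r[rhs]) != r[rhs]:
            return None  # the dependency does not hold on the source
    return mapping


def check_rule(rows, mapping, lhs, rhs, condition=None):
    """Ids of rows that match condition but disagree with the learned mapping."""
    condition = condition or {}
    bad = []
    for r in rows:
        key = tuple(r[a] for a in lhs)
        if matches(r, condition) and key in mapping and mapping[key] != r[rhs]:
            bad.append(r["id"])
    return bad


def dc_violations(rows, predicate):
    """Single-tuple denial constraint: no row may satisfy the predicate."""
    return [r["id"] for r in rows if predicate(r)]


annual = {"pay_basis": "per Annum"}
fd = learn_rule(source, ["grade"], "base_salary")           # fails on the source
cfd = learn_rule(source, ["grade"], "base_salary", annual)  # holds for annual staff
cfd_flags = check_rule(target, cfd, ["grade"], "base_salary", annual)
dc_flags = dc_violations(target, lambda r: r["ot_hours"] == 0 and r["ot_paid"] > 0)
```

In this toy setting the unconditional FD grade -> base_salary does not hold on the source, because hourly staff in the same grade have different rates, so no FD is available to catch the altered salary. The conditional version restricted to annual staff holds and flags the corrupted record, and the denial constraint catches the overtime inconsistency that neither dependency can express. This is consistent with the recall ordering reported above, though the sketch itself says nothing about precision.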

Published

2026-05-13