ICARUS: Interactive Cleaning and Rule System

The Need
Data sets are becoming larger and more complex, and big data techniques require that the data be stored in an organized manner. Data cleaning is the process of detecting corrupted or inaccurate records in a data set and then correcting or removing them. Data validation is the process of confirming that the entered data is accurate. Done by hand, these processes are exhausting and tedious, and the difficulty grows with the size of the data set. Software that reduces this burden would be a significant step forward for big data.

The Technology
Dr. Arnab Nandi and his colleagues at The Ohio State University have developed a platform that works with a user to create a system of rules that automates the data validation process. The proposed solution is a software system named ICARUS: Interactive Cleaning And RUles System. ICARUS shows potentially problematic data in the form of a matrix. After the user makes a change, ICARUS proposes a rule for the user to approve, and the approved rule is then applied to the entire data set. The user can preview the outcome of a rule by hovering the mouse over the field to be manipulated. Once the user is finished, ICARUS stores the set of rules for future use by other users. This software should reduce the time spent preparing data and leave more time for analyzing it and making better business decisions.
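
The edit, propose, preview, apply, and store cycle described above can be illustrated with a minimal Python sketch. This is not the ICARUS implementation: the Rule class, the function names, and the exact-match replacement rule are hypothetical and chosen only to show how a single user correction might be generalized into a reusable rule.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: in `column`, replace the value `old` with `new`."""
    column: str
    old: str
    new: str


def propose_rule(column, old, new):
    """Generalize one user edit into a rule for the whole data set."""
    return Rule(column, old, new)


def preview(rows, rule):
    """Return indices of rows the rule would change, without modifying them."""
    return [i for i, row in enumerate(rows) if row.get(rule.column) == rule.old]


def apply_rule(rows, rule):
    """Return new rows with the rule applied; the input rows are left untouched."""
    return [
        {**row, rule.column: rule.new} if row.get(rule.column) == rule.old else dict(row)
        for row in rows
    ]


def save_rules(rules):
    """Serialize approved rules so that other users can reuse them."""
    return json.dumps([asdict(r) for r in rules])


def load_rules(text):
    """Restore rules previously produced by save_rules."""
    return [Rule(**d) for d in json.loads(text)]


# Assumed sample data: the user corrects "OH" to "Ohio" in one cell.
rows = [
    {"name": "Ann", "state": "OH"},
    {"name": "Bob", "state": "Ohio"},
    {"name": "Cy", "state": "OH"},
]
rule = propose_rule("state", "OH", "Ohio")
affected = preview(rows, rule)    # rows the user would see highlighted
cleaned = apply_rule(rows, rule)  # applied only after the user approves
stored = save_rules([rule])       # kept for future sessions
```

In this sketch the preview step is kept separate from the apply step so that the user can inspect the effect of a rule before approving it, which mirrors the hover preview described above.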

Commercial Applications

  • Big data
  • Customer data entry
  • Database management

Benefits


  • Increase in data entry speed
  • Increase in data integrity/accuracy
  • User-friendly

User-friendly database validation software



Contact Information

TTO Home Page: https://tco.osu.edu/

Name: Andrew Hampton

Email: hampton.309@osu.edu