Data Cleansing

As part of our Data Transformation, Data Consolidation and Excel Consolidation processes, we also carry out Data Cleansing.

Why is data dirty?

Data can be dirty for the following reasons:

·        Incomplete: a required field is empty, for example the completion date for a project has not been filled in.

·        Noisy: a field holds a value of the wrong kind, for example a numeric value where a date is expected.

·        Inconsistent: values contradict each other, for example the calculations do not match the other input data sets.
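
The three kinds of dirty data can be illustrated with a small check on a single record. This is only a sketch: the field names (`completion_date`, `quantity`, `unit_price`, `total`), the `YYYY-MM-DD` date format and the rounding tolerance are assumptions made for the example, not part of any actual data set.

```python
from datetime import datetime

def find_issues(record):
    """Return a list of data-quality issues found in one record.

    The field names used here are hypothetical and chosen for illustration.
    """
    issues = []

    # Incomplete: the completion date has not been filled in.
    if not record.get("completion_date"):
        issues.append("incomplete: completion_date missing")
    else:
        # Noisy: the field holds something that is not a date, e.g. a number.
        try:
            datetime.strptime(str(record["completion_date"]), "%Y-%m-%d")
        except ValueError:
            issues.append("noisy: completion_date is not a date")

    # Inconsistent: the stored total does not match the other inputs.
    expected = record["quantity"] * record["unit_price"]
    if abs(expected - record["total"]) > 0.01:
        issues.append("inconsistent: total != quantity * unit_price")

    return issues
```

A record that passes all three checks returns an empty list, so only records with a non-empty result need attention.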

Why is it important to handle the issue of dirty data?

Handling dirty data is important because poor-quality data can lead senior managers to make wrong decisions.

How do we cleanse the data sets?

We use several methods to cleanse data sets. Some of the approaches are as follows:

·        Handling missing data: to solve this problem, we can adopt one of the following approaches:

o   Ignore the missing values.

o   Put in a default value. This default value could be the mean, the most probable value or a constant.

o   Handle each entry manually.

·        Handling noisy data: we smooth noisy values using data binning, and we use data clustering to identify and remove outliers.
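
Two of the approaches above, filling in a default value and smoothing by binning, can be sketched as follows. This is a minimal illustration, not a description of a production process: the function names are hypothetical, missing entries are assumed to be represented as `None`, the "most probable value" is taken to be the most frequent value, and the bins are assumed to be equal-sized groups of the sorted values.

```python
from statistics import mean, mode

def fill_missing(values, strategy="mean", constant=0):
    """Replace None entries with a default value (hypothetical helper)."""
    # Values that are actually present; missing entries are ignored here.
    present = [v for v in values if v is not None]
    if strategy == "mean":
        default = mean(present)
    elif strategy == "mode":
        default = mode(present)  # most frequent value
    else:
        default = constant       # any other strategy falls back to the constant
    return [default if v is None else v for v in values]

def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and replace
    each value with the mean of its bin (hypothetical helper)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed
```

For example, smoothing the values 4, 8, 15, 21, 21, 24, 25, 28, 34 with a bin size of 3 replaces the three bins with their means 9, 22 and 29. Note that the result is returned in sorted order, so the original row order is not preserved.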