Data Quality concepts
- Data quality dimensions - Data Quality is more than just data accuracy
- Metadata - what does it mean?
- Small incremental improvements can, over time, significantly improve Data Quality levels.
- Cleaning up the data is not sufficient; you also need to minimise any future occurrence of the problem. This may be achieved through system or programming changes, changes to business processes, and training.
- Data quality examples
- Responsibility - Data Quality is everybody’s responsibility
Data quality is more than just data accuracy. Data quality can be divided into several dimensions 1.
- Accuracy: Defines how well the information that is in, or derived from, the data collection reflects the reality it is supposed to represent.
- Comparability: Assesses the extent to which databases are consistent over time and use standard conventions (e.g. data elements and reporting periods), making them similar to other databases.
- Timeliness: Refers primarily to how current or up to date the data is at the time of release. It is measured as the gap between the end of the reference period to which the data pertains and the date on which the data becomes available to users.
- Usability: Reflects the ease with which a data collection may be understood and accessed.
- Relevance: Incorporates the above elements to some degree, but focuses on value and adaptability.
1 based on “Data Quality Framework for the New Zealand Ministry of Health”
What is Metadata?
- Metadata is data about data
- Metadata describes or specifies the characteristics of data
- Metadata facilitates an open centralised repository defining commonly used business terminology and its associated attributes. These definitions should then be available and utilised consistently across corporate systems
- Metadata attributes for the business user - attributes should include business name, business description, the source or heritage of the data, associated business rules and contextual comments that assist in assessing the value of the data
- Metadata attributes for the developer - attributes should include IT name, data type, cardinality, business validation rules and data source
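The two attribute lists above can be pictured as one record per data element. The following is a minimal sketch only: the class name, field names and sample values are hypothetical and are not taken from any actual metadata repository.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataEntry:
    """One data element described for both audiences (hypothetical layout)."""
    # Attributes for the business user
    business_name: str
    business_description: str
    source: str  # the source or heritage of the data
    business_rules: List[str] = field(default_factory=list)
    comments: str = ""
    # Attributes for the developer
    it_name: str = ""
    data_type: str = ""
    cardinality: str = ""
    validation_rules: List[str] = field(default_factory=list)
    data_source: str = ""


def missing_attributes(entry: MetadataEntry) -> List[str]:
    """Return the names of attributes that are still empty."""
    return [name for name, value in vars(entry).items() if not value]


# Illustrative values only.
entry = MetadataEntry(
    business_name="Student status",
    business_description="Fee category that applies to the student",
    source="Student administration system",
    it_name="STUDENT_STATUS_CD",
    data_type="CHAR(3)",
)
print(missing_attributes(entry))
```

Listing the empty attributes is one simple way to see which definitions still need input from the business or from developers.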
Why is metadata important?
- Using metadata helps address the comparability and usability Data Quality dimensions.
- Metadata is important to report developers - it helps them recognise the correct data elements to use when developing a report.
- The key is getting developers to use common and consistent reference data and giving users a clear definition of each data element.
Accuracy
- inaccurate data is data that is incorrectly coded, invalid or missing.
- if a student is enrolled into the wrong program, or the Field of Education of the program is incorrectly entered, the University may not receive the maximum funding for the student completion.
- if an undergraduate student is enrolled into the wrong course, or the Field of Education of the course is incorrectly entered, the University may not receive the maximum funding for the student load.
- if a student is enrolled with an incorrect student status code, the student may be overcharged or undercharged. Students who are overcharged normally receive a refund from the University, which involves additional processing by University staff. When students are undercharged, the University does not normally attempt to recoup the losses.
- if income and expenditure items are entered under the wrong cost centre code, future budgets for the cost centre may overestimate or underestimate financial requirements, and the cost centre may find it difficult to meet corporate targets.
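Two of the three kinds of inaccurate data, missing and invalid values, can be detected automatically against reference data. The sketch below assumes a hypothetical set of valid status codes and hypothetical record fields. Note that a value that is valid but wrong for the student (incorrectly coded) passes this check and can only be found by comparison with source documents.

```python
from typing import Optional, Set

# Hypothetical reference data; real codes would come from the student system.
VALID_STATUS_CODES = {"DOM", "INT", "TRN"}


def check_code(value: Optional[str], valid_codes: Set[str]) -> str:
    """Classify a coded value as 'missing', 'invalid' or 'valid'."""
    if value is None or value.strip() == "":
        return "missing"
    if value.strip().upper() not in valid_codes:
        return "invalid"
    return "valid"


# Illustrative records only.
records = [
    {"student_id": "S1", "status_code": "DOM"},
    {"student_id": "S2", "status_code": ""},
    {"student_id": "S3", "status_code": "XX"},
]
issues = {r["student_id"]: check_code(r["status_code"], VALID_STATUS_CODES) for r in records}
print(issues)  # {'S1': 'valid', 'S2': 'missing', 'S3': 'invalid'}
```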
Comparability
- providing consistency in reporting - when reports are developed they use the same business rules to classify and measure data
- comparing differing reports, knowing that both reports use the same rules to define an attribute e.g. School, Program, EFTSL, Domestic / International / Transnational.
Timeliness
- ensuring that reports are available at the time they are needed for decision making
- providing data in a timely manner e.g. Graduations data being available to the Marketing Unit early enough that parchments, alumni information etc. are ready for the graduation ceremony
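Timeliness can be expressed as a number: the gap between the end of the reference period and the date on which the data becomes available to users. A minimal sketch, with a hypothetical function name and illustrative dates:

```python
from datetime import date


def timeliness_gap_days(reference_period_end: date, available_on: date) -> int:
    """Days between the end of the reference period and the release of the data."""
    return (available_on - reference_period_end).days


# Illustrative dates only: data for a period ending 30 June, released 14 August.
gap = timeliness_gap_days(date(2024, 6, 30), date(2024, 8, 14))
print(gap)  # 45
```

Tracking this figure for each release shows whether data is reaching decision makers sooner or later over time.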
Usability
- ensuring the appropriate supporting data is easily available when performing specific business tasks e.g. program reviews.
- accessibility issues - e.g. specific HR and Finance reports might be available only to relevant HR and Finance staff, but the data is required by a Manager.
Relevance
- ensuring that the data provided in a report is relevant to the task at hand
- as environments change, the need for specific reports might change over time, e.g. reports that haven't been used for the last 12 months should be archived and removed from the system.
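The 12-month rule above could be applied to a usage log along these lines. This is a sketch under assumptions: the report names, the last-used dates and the 365-day threshold are illustrative, and a real system would read the dates from its own usage records.

```python
from datetime import date, timedelta
from typing import Dict, List


def archive_candidates(last_used: Dict[str, date], today: date,
                       max_idle_days: int = 365) -> List[str]:
    """Return the names of reports not used within the last max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, used in last_used.items() if used < cutoff)


# Illustrative usage log only.
last_used = {
    "Program review summary": date(2024, 5, 20),
    "Legacy load report": date(2023, 1, 15),
}
candidates = archive_candidates(last_used, today=date(2024, 7, 1))
print(candidates)  # ['Legacy load report']
```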
Data Quality is everybody’s responsibility. You can make a difference. Systems can't be improved unless we know where the faults are. If you find what you think is a Data Quality issue - please take the time to log it via the web. Your request will be reviewed and can then be tracked.
To raise a Data Quality issue, please go to Raise a Data Quality issue.
