Data Cleansing for Child Health

The data cleansing exercise was based on two approaches. Firstly, in-house developed software was used to compare all 18-and-under clients on RIO with all 18-and-under patients on local GP registers. This exercise, known as the "Data Matching Report", used a download of the Hounslow population from the NSTS. Secondly, the standard "Report of Clients with Unknown GP" was extracted from the RIO system and submitted for tracing against the spine through the Demographics Batch Service, or DBS (note 2).

The diagrams below illustrate how this data is being handled and where various software utilities or data manipulation processes are used, for example to render information suitable for processing by the DBS.

Data Cleansing 1
Data Cleansing 2

The Data Matching Report produced information broadly divided into three categories: "missings", who appear on RIO but not on a local GP list; "adds", who appear on a GP list but not on RIO; and "mismatches", who appear on both but differ in some way, for example having the same NHS number, name and date of birth but apparently being registered with different GPs. These "mismatch" records are manually corrected on RIO to match the GP list.
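The three categories can be illustrated with a minimal Python sketch. It is not the in-house software, whose matching rules are not described here. It assumes that every record carries an NHS number and that records are paired on that number alone. The Person type, its field names and the classify function are hypothetical.

```python
from typing import NamedTuple


class Person(NamedTuple):
    # Hypothetical record layout; the real RIO and GP register fields may differ.
    nhs_number: str
    name: str
    date_of_birth: str  # ISO date string, e.g. "2001-05-17"
    gp_code: str


def classify(rio_clients, gp_patients):
    """Split records into missings, adds and mismatches, paired on NHS number."""
    rio = {p.nhs_number: p for p in rio_clients}
    gp = {p.nhs_number: p for p in gp_patients}

    # On RIO but not on any GP list.
    missings = [rio[n] for n in sorted(rio.keys() - gp.keys())]
    # On a GP list but not on RIO.
    adds = [gp[n] for n in sorted(gp.keys() - rio.keys())]
    # On both, but at least one field differs (for example the GP code).
    mismatches = [
        (rio[n], gp[n])
        for n in sorted(rio.keys() & gp.keys())
        if rio[n] != gp[n]
    ]
    return missings, adds, mismatches
```

Each mismatch is returned as a pair (RIO record, GP record) so that the two versions can be compared side by side before RIO is corrected.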

The "adds" were added to the RIO system, and the "missings" were submitted to the DBS for tracing (note 1). The results from the DBS were then manually input to RIO.

Note 1: Of 12,243 records classified as "missings" and traced through DBS, 7,638 produced matches, 4,555 were not matched, and 50 produced multiple matches on the spine. Of the 7,638 records that were matched on the spine, only 6,568 returned a GP practice code. A proportion of these records return a Hounslow GP code; however, in some of these cases the local GP practice may have informed the PCT that the child is no longer on its list.
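As a check on the figures in Note 1, the short Python sketch below confirms that the three outcomes sum to the number of records submitted and derives the corresponding rates. The variable names and the pct helper are illustrative only; the counts are those given in the note.

```python
# Counts taken from Note 1.
submitted = 12_243
matched = 7_638
not_matched = 4_555
multiple_matches = 50
with_gp_code = 6_568

# The three outcomes should account for every submitted record.
assert matched + not_matched + multiple_matches == submitted


def pct(part, whole):
    """Percentage of whole represented by part, to one decimal place."""
    return round(100 * part / whole, 1)


match_rate = pct(matched, submitted)              # share of submissions matched
gp_code_of_matched = pct(with_gp_code, matched)   # share of matches with a GP code
gp_code_of_all = pct(with_gp_code, submitted)     # share of submissions with a GP code
matched_without_gp_code = matched - with_gp_code  # matched, but no GP code returned

print(match_rate, gp_code_of_matched, gp_code_of_all, matched_without_gp_code)
```

On these figures, roughly 62% of the "missings" were matched, about 86% of the matched records returned a GP practice code, and so only about 54% of the records submitted came back with a GP code.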

Note 2: Unlike the Data Matching Report, tracing from the Unknown GP report was not limited by age. The data was submitted in two passes. In pass 1, 12,992 records were submitted; 8,997 were traced successfully, with their NHS number found to be valid, but only 8,153 returned GP codes. It should be noted that, since it is not age-limited, the data includes a proportion of persons who have died. 71 records produced no result. In pass 2, 3,932 records (those with no NHS number) were submitted for tracing. Only 1,377 were traced, of which 1,154 returned GP practice codes. Where a record does not include an NHS number, it is recommended that no address information be included with the records submitted to DBS.
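The recommendation at the end of Note 2 could be applied when preparing records for submission, along the lines of the Python sketch below. It does not reproduce the DBS file layout. The field names, the choice of which fields count as address information (including the postcode) and the prepare_for_tracing function are all assumptions made for illustration.

```python
# Hypothetical field names; which fields count as "address information",
# and in particular whether the postcode does, is an assumption.
ADDRESS_FIELDS = ("address_line_1", "address_line_2", "postcode")


def prepare_for_tracing(record, address_fields=ADDRESS_FIELDS):
    """Return a copy of the record with address fields blanked if it has no NHS number."""
    prepared = dict(record)  # copy, so the source record is left unchanged
    nhs_number = (prepared.get("nhs_number") or "").strip()
    if not nhs_number:
        for field in address_fields:
            if field in prepared:
                prepared[field] = ""
    return prepared
```

Records that already carry an NHS number pass through unchanged, so the same step can be applied to a whole extract before it is split into passes.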

PT    17 July 2009