Training artificial intelligence (AI) models requires massive amounts of data, especially when these models are used in the healthcare industry. Their output must be very precise, as it directly influences the health of patients.
First and foremost, health data used for training AI models should be anonymized to protect patients’ confidentiality. Anonymization makes it possible to share health data for secondary purposes such as analysis, research, development, training, and quality control of AI algorithms. So how should data anonymization be performed so that it does not compromise patients’ privacy?
Let’s start with the definition of ‘data anonymization’. Data anonymization is the process of removing personally identifiable information from data sets (e.g., imaging such as CTs, MRIs, and X-rays, or videos such as OR or colonoscopy recordings), so that the people whom the data describes, or who appear in the images or videos, remain anonymous.
People are identifiable if the imaging or video data includes any reference to an identifier such as a name, an identification number, a personnel number, account data, a customer number, or any other personal data that can directly or indirectly help identify the person.
Hospitals and clinics must share only anonymized data with third parties such as research organizations or healthcare software development companies. Any sensitive metadata, such as the patient’s name and social security number or the hospital’s name and address, should be erased. Direct identifiers must be removed or replaced with random values.
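Removing or replacing direct identifiers can be sketched roughly as follows. This is a minimal illustration, assuming the metadata has already been read into a plain Python dictionary. The field names (`patient_name`, `ssn`, `patient_id`, and so on) and the `scrub_metadata` helper are hypothetical and do not come from any particular imaging standard or library.

```python
import secrets

# Hypothetical field names; a real data set would use its own schema.
FIELDS_TO_ERASE = {"patient_name", "ssn", "hospital_name", "hospital_address"}
FIELDS_TO_RANDOMIZE = {"patient_id"}


def scrub_metadata(metadata: dict) -> dict:
    """Return a copy with direct identifiers erased or replaced by random values."""
    cleaned = {}
    for key, value in metadata.items():
        if key in FIELDS_TO_ERASE:
            continue  # drop the field entirely
        if key in FIELDS_TO_RANDOMIZE:
            # Random value with no relation to the original identifier.
            cleaned[key] = secrets.token_hex(8)
        else:
            cleaned[key] = value
    return cleaned


record = {
    "patient_name": "Jane Doe",
    "ssn": "000-00-0000",
    "patient_id": "12345",
    "hospital_name": "Example Clinic",
    "modality": "CT",
}
cleaned = scrub_metadata(record)
print(sorted(cleaned))
```

The function returns a new dictionary rather than modifying the input, so the original record stays intact inside the hospital while only the cleaned copy is shared. Identifiers burned into the pixels of an image or video frame are not covered by a metadata pass like this one and need separate handling.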
Data anonymization, data storage, and data transfer are regulated by the GDPR in the EU and by HIPAA in the US. A good example of a concrete anonymization approach is the Safe Harbor method in the HIPAA Privacy Rule. It specifies 18 types of identifiers that need to be removed. If this is done properly, the data is considered anonymized in accordance with HIPAA.
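This list includes, among others, names, geographic subdivisions smaller than a state, all elements of dates (except the year) directly related to an individual, telephone numbers, email addresses, Social Security numbers, medical record numbers, and full-face photographs.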
To comply with GDPR and/or HIPAA, not every field associated with imaging or video data has to be removed. Medical research often focuses on a specific gender, pathology, age group, or geography. This means that some information in the metadata can be left as is, but only if it does not identify the people in the images or videos in any way.
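One way to keep research-relevant fields while dropping everything else is an allowlist combined with generalization of exact values. The sketch below is only an illustration: the field names, the `RESEARCH_FIELDS` allowlist, and both helper functions are hypothetical. Grouping ages of 90 and above into a single band follows the HIPAA Safe Harbor provision for ages over 89. Whether a particular combination of retained fields is actually non-identifying still has to be assessed for each data set.

```python
# Hypothetical allowlist of fields a study might need.
RESEARCH_FIELDS = {"sex", "pathology", "country"}


def age_group(age: int, width: int = 10) -> str:
    """Generalize an exact age into a coarse band, e.g. 47 -> '40-49'."""
    if age >= 90:
        return "90+"  # ages of 90 and above are aggregated into one category
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def keep_research_fields(metadata: dict) -> dict:
    """Keep only allowlisted fields and replace an exact age with an age band."""
    kept = {k: v for k, v in metadata.items() if k in RESEARCH_FIELDS}
    if "age" in metadata:
        kept["age_group"] = age_group(metadata["age"])
    return kept


sample = {"patient_name": "Jane Doe", "sex": "F", "age": 47, "pathology": "polyp"}
result = keep_research_fields(sample)
print(result)
```

An allowlist fails safe: a field that nobody anticipated is dropped by default instead of being shared by accident.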
Anonymized data is no longer considered personal health data, because the people in the images or videos can’t be identified. Thus, if the data is anonymized, no patient consent is required. On the other hand, if any details might reveal the patient’s identity, patient consent is obligatory.