Overview: Maintenance
Maintenance of machine learning models ensures their continued effectiveness, reliability, and relevance over time. This process involves continuously monitoring both model outputs and new incoming data for bias.
Avoid Bias Evolving in the Learning Model Over Time
Bias in machine learning models refers to systematic errors introduced by algorithms or training data that lead to unfair or disproportionate predictions for specific groups or individuals. Such biases can arise due to historical imbalances in the training data, algorithm design, or data collection. To prevent bias from evolving in the ML model over time, use several bias assessment metrics and methodologies, including:
Disparate Impact Analysis: This technique examines the disparate impact of an AI model's decisions on different demographic groups. It compares the rates of favorable outcomes across groups, and large disparities highlight potential bias (see the sketch after this list).
Fairness Metrics: Researchers and practitioners have developed different fairness metrics to measure bias in machine learning models, including Equal Opportunity Difference, Disparate Misclassification Rate, and Treatment Equality.1 These metrics help assess how fairly the model treats different groups.
Post-hoc Analysis: This involves examining an AI system’s decisions after deployment to identify instances of bias and understand its impact on users and society.2
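As a rough illustration of the first two ideas, the sketch below computes a disparate impact ratio and an Equal Opportunity Difference from hypothetical arrays of binary predictions, true labels, and group membership. The data and the interpretation of the values are illustrative assumptions, not figures taken from the sources cited here.

```python
import numpy as np

def disparate_impact(y_pred, group):
    """Ratio of favorable-outcome rates: unprivileged (0) vs. privileged (1).
    Values far below 1.0 suggest the unprivileged group receives favorable
    outcomes less often."""
    return y_pred[group == 0].mean() / y_pred[group == 1].mean()

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates between the unprivileged and
    privileged groups; 0 indicates equal opportunity."""
    tpr_unpriv = y_pred[(group == 0) & (y_true == 1)].mean()
    tpr_priv = y_pred[(group == 1) & (y_true == 1)].mean()
    return tpr_unpriv - tpr_priv

# Hypothetical binary predictions, true labels, and group membership
# (1 = privileged group, 0 = unprivileged group).
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_true = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])
group  = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

print("Disparate impact ratio:", disparate_impact(y_pred, group))
print("Equal opportunity difference:",
      equal_opportunity_difference(y_true, y_pred, group))
```

In practice these statistics would be computed per deployment window and per protected attribute, and tracked over time alongside standard performance metrics.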
For more information, refer to the Model Development section.
Data Retention and Deletion Strategy for Handling Ethical and Privacy Issues
Data retention is the practice of storing information for a defined period; a well-designed retention policy helps businesses reduce costs, legal risks, and security threats to users and organizations. To implement a data retention policy successfully, incoming and existing data must be properly classified and organized based on risk level and intended use. The policy needs to identify the parties and departments responsible for retaining and disposing of each data type in the ecosystem, and it should state what must happen at the end of each retention period.3 Many global privacy regulations now require organizations and institutions to make data deletion a standard practice rather than just a best practice.
Privacy regulations require two types of data deletion processes:
- Request-based deletion responds to individual requests. The individual's status and relationship with the organization determine whether the request must be honored, what type of data can be deleted upon the request, and whether any exceptions to deletion requirements apply. Data deletion exceptions can include retaining data required for service delivery, legal and regulatory reasons, or financial reporting.
- Data retention and purge processes delete data according to an organization’s schedules and internal process guidelines. These processes require a dedicated data retention governance team to oversee retention and purge activities and to address ethical and privacy issues (a minimal sketch of a schedule-driven purge check follows this list). A sustainable data deletion program can help organizations improve compliance with applicable privacy and data protection requirements while strengthening their data governance and data protection posture.4
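To make the schedule-driven side of this concrete, here is a minimal Python sketch of a purge check. The data classes, retention periods, record layout, and delete_fn callback are all hypothetical; a real program would be driven by the organization's documented retention schedule and would record every deletion for audit purposes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per data classification; real schedules come
# from the organization's retention policy and applicable regulations.
RETENTION_PERIODS = {
    "student_interaction_logs": timedelta(days=365),
    "model_training_snapshots": timedelta(days=730),
    "support_tickets": timedelta(days=180),
}

def is_expired(record, now=None):
    """Return True if a record has outlived its class's retention period."""
    now = now or datetime.now(timezone.utc)
    period = RETENTION_PERIODS.get(record["data_class"])
    if period is None:
        return False  # Unclassified data is escalated for review, not silently purged
    return now - record["created_at"] > period

def purge_expired(records, delete_fn):
    """Apply delete_fn to every expired record and return the records kept."""
    kept = []
    for record in records:
        if is_expired(record):
            delete_fn(record)  # e.g., hard delete plus an audit-log entry
        else:
            kept.append(record)
    return kept

# Example usage with an in-memory store and a stub deletion function.
records = [
    {"id": 1, "data_class": "support_tickets",
     "created_at": datetime(2023, 1, 10, tzinfo=timezone.utc)},
    {"id": 2, "data_class": "student_interaction_logs",
     "created_at": datetime.now(timezone.utc) - timedelta(days=30)},
]
remaining = purge_expired(records, delete_fn=lambda r: print("purging record", r["id"]))
print("retained record ids:", [r["id"] for r in remaining])
```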
Continuous Fairness Monitoring of Machine Learning Models
Continuous fairness monitoring is increasingly becoming a key focus in machine learning due to demands for equity and diversity in AI products. ML practitioners generally treat monitoring for drift as an early warning system for performance issues, and fairness metrics as the way to assess bias in a trained model. However, a model that is fair at training time can become unfair after deployment due to the same model drift that causes performance issues.5
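As one common way to implement that early warning system (an approach chosen here for illustration, not one prescribed by the sources cited above), the sketch below compares a reference window of model scores against a recent production window using a two-sample Kolmogorov-Smirnov test from SciPy. The beta-distributed scores are synthetic stand-ins for real prediction scores.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_score_drift(reference_scores, live_scores, alpha=0.05):
    """Two-sample KS test between reference and live score distributions.
    A small p-value signals that the live distribution has drifted."""
    result = ks_2samp(reference_scores, live_scores)
    return {"ks_statistic": result.statistic,
            "p_value": result.pvalue,
            "drift_detected": result.pvalue < alpha}

rng = np.random.default_rng(0)
reference = rng.beta(2, 5, size=5_000)   # scores captured at deployment time
live = rng.beta(2.6, 4, size=5_000)      # scores from a recent production window
print(detect_score_drift(reference, live))
```

The same comparison can be run separately for each demographic subgroup so that drift affecting only one group is not hidden by an aggregate check.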
Quantile Demographic Drift (QDD) is a method to measure and monitor fairness in machine learning models over the model lifecycle.6 It is a model bias quantification metric that uses quantile binning to measure differences in the overall prediction distributions across subgroups. QDD is incorporated into a continuous model monitoring system called FairCanary, which reuses the explanations already computed for each prediction to compute explanations for the QDD bias metric quickly. When unfairness is detected, a bias mitigation strategy replaces each score in the disadvantaged group with the score of the corresponding rank in the advantaged group. Because this mitigation is a post-processing step, it avoids retraining the model to de-bias it and is therefore a computationally inexpensive approach to bias mitigation.
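The following is a simplified sketch loosely inspired by that description; it is not the FairCanary authors' implementation. It bins two groups' prediction scores into shared quantile bins, measures how far apart the resulting frequency profiles are, and applies the rank-based score replacement described above. All scores are synthetic.

```python
import numpy as np

def quantile_demographic_drift(scores_a, scores_b, n_bins=10):
    """Simplified QDD-style comparison: bin each group's scores into quantiles
    of the pooled distribution and return the L1 distance between the two
    groups' bin frequencies (0 means identical prediction distributions)."""
    edges = np.quantile(np.concatenate([scores_a, scores_b]),
                        np.linspace(0, 1, n_bins + 1))
    freq_a = np.histogram(scores_a, bins=edges)[0] / len(scores_a)
    freq_b = np.histogram(scores_b, bins=edges)[0] / len(scores_b)
    return np.abs(freq_a - freq_b).sum()

def rank_preserving_mitigation(disadvantaged_scores, advantaged_scores):
    """Replace each disadvantaged-group score with the advantaged-group score
    holding the corresponding rank, as in the mitigation described above."""
    ranks = np.argsort(np.argsort(disadvantaged_scores))
    donor = np.sort(advantaged_scores)
    idx = np.round(ranks / max(len(disadvantaged_scores) - 1, 1)
                   * (len(donor) - 1)).astype(int)
    return donor[idx]

rng = np.random.default_rng(1)
group_a = rng.beta(3, 3, size=1_000)  # advantaged-group scores (synthetic)
group_b = rng.beta(2, 4, size=1_000)  # disadvantaged-group scores (synthetic)
print("QDD-style bin distance:", quantile_demographic_drift(group_a, group_b))
adjusted_b = rank_preserving_mitigation(group_b, group_a)
print("Distance after mitigation:", quantile_demographic_drift(group_a, adjusted_b))
```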
More information about continuous fairness monitoring can be found in the FairCanary paper by Ghosh, Shanbhag, and Wilson (2021).6
Practical Ways of Mitigating Biases in the Use of the Output
To mitigate biases in the output, AI developers can include a “model facts label” (a one-page summary of relevant and actionable information) for front-line users that indicates how, how not, and when to use the output. The model facts label can include a summary of the AI system, its working mechanism (including the source and baseline characteristics of the data used for AI development), results of validation studies, guidelines for use (including benefits and appropriate decision support), warnings (including potential risks and consequences), and other relevant information about the AI system.7 For example, in a virtual tutoring system designed to recommend learning resources to students, the model facts label could outline the AI's underlying mechanisms, including details on the diverse student populations represented in the training data and any potential biases identified during development. Additionally, it could offer guidelines for educators on when and how to use the system effectively, along with warnings about potential risks, such as overreliance on algorithmic recommendations or unintended reinforcement of educational inequalities.
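One lightweight way to keep such a label consistent across model versions is to store it as structured data and render the one-page summary from it. The sketch below is purely illustrative: the fields mirror the elements listed above, and every value (including the validation figure) is hypothetical rather than drawn from a real system.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelFactsLabel:
    """Illustrative structure for a one-page model facts label."""
    summary: str
    mechanism: str
    validation_results: str
    guidelines_for_use: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

label = ModelFactsLabel(
    summary="Recommends learning resources to students in a virtual tutoring system.",
    mechanism=("Ranking model trained on interaction logs from diverse student "
               "populations; known data gaps documented during development."),
    validation_results=("Hypothetical example: AUC 0.81 on a held-out cohort, "
                        "with subgroup performance reported separately."),
    guidelines_for_use=[
        "Use as a supplement to, not a replacement for, educator judgment.",
        "Review recommendations before assigning them to students.",
    ],
    warnings=[
        "Risk of overreliance on algorithmic recommendations.",
        "Possible reinforcement of existing educational inequalities.",
    ],
)
print(json.dumps(asdict(label), indent=2))
```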
Another practical technical way to mitigate bias is to use equalized odds post-processing. This technique solves a linear program to find probabilities of changing output labels to optimize equalized odds.8 By using this technique, developers can adjust output probabilities to promote fairness across different demographic groups and equitable treatment of all students. For instance, in an AI-powered grading system for essays, equalized odds post-processing could help ensure that students from marginalized backgrounds are not disproportionately penalized due to biases in the algorithm. A Python implementation of this technique is available in the AIF360 library's post-processing module; a brief usage sketch follows.
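A minimal usage sketch of AIF360's EqOddsPostprocessing is shown below. The data is randomly generated, the privileged/unprivileged group definitions are hypothetical, and details of the API can vary between AIF360 versions, so treat this as an outline rather than a drop-in implementation.

```python
import numpy as np
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.algorithms.postprocessing import EqOddsPostprocessing

# Hypothetical data: one feature, a binary protected attribute, and true labels.
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "feature": rng.normal(size=n),
    "group": rng.integers(0, 2, size=n),   # protected attribute (0 = unprivileged)
    "label": rng.integers(0, 2, size=n),   # ground-truth outcomes
})
dataset_true = BinaryLabelDataset(
    df=df, label_names=["label"], protected_attribute_names=["group"],
    favorable_label=1, unfavorable_label=0)

# Stand-in for the classifier's predicted labels on the same instances.
dataset_pred = dataset_true.copy(deepcopy=True)
dataset_pred.labels = rng.integers(0, 2, size=(n, 1)).astype(float)

# Fit the equalized-odds post-processor and adjust the predicted labels.
eq_odds = EqOddsPostprocessing(
    unprivileged_groups=[{"group": 0}],
    privileged_groups=[{"group": 1}],
    seed=42)
eq_odds.fit(dataset_true, dataset_pred)
dataset_fair = eq_odds.predict(dataset_pred)
print(dataset_fair.labels[:10].ravel())
```

Because the labels here are random, the adjusted output only demonstrates the workflow; with a real model, the post-processed predictions would be evaluated with the fairness metrics discussed earlier.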
More information on bias mitigation strategies can be found in the Model Development section.
Developers wishing to dive deeper into the technical aspects of ensuring equity in AI can access our GitHub site.
Reference this resource we created, Maintenance Guiding Questions, to support your discussion at this phase.
1. Tutorial #1: bias and fairness in AI. (2019). Borealis AI. borealisai.com
2. Buhl, N. (2023). Mitigating Model Bias in Machine Learning. Encord. encord.com
3. Data Retention Guide: Benefits, Best Practices, & Examples. (2021). Segment. segment.com
4. How data deletion empowers data protection. (2020). Grant Thornton. grantthornton.com
5. Paka, A. (2022). FairCanary: Rapid Continuous Explainable Fairness. Fiddler AI. fiddler.ai
6. Ghosh, A., Shanbhag, A., & Wilson, C. (2021). FairCanary: Rapid Continuous Explainable Fairness. arXiv. arxiv.org
7. González-Gonzalo, C., Thee, E. F., Klaver, C. C. W., Lee, A. Y., Schlingemann, R. O., Tufail, A., Verbraak, F., & Sánchez, C. I. (2022). Trustworthy AI: Closing the gap between development and integration of AI systems in ophthalmic practice. Progress in Retinal and Eye Research, 90, 101034. doi.org
8. aif360. (n.d.). aif360.algorithms.postprocessing — aif360 0.1.0 documentation. aif360.readthedocs.io