Innovation Background
Artificial intelligence is made up of fields such as machine learning and computer vision, and covers the theories, methods, technologies, and applications used to simulate, extend, and expand human intelligence. The core of machine learning is to learn from data through models and to use that experience to make decisions. Deep learning, a branch of machine learning, is of great significance to the development of artificial intelligence.
Deep learning uses neural network models with multiple hidden layers; through large amounts of vector computation, such models learn high-order representations of the patterns hidden in the data and then use those features to make decisions. Artificial neural networks are models built by abstracting, from an information-processing perspective, how the brain's neural networks operate, with different connection patterns producing different network architectures. An artificial neural network is a computational model made up of a large number of interconnected nodes.
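To make this description concrete, here is a minimal sketch of such a multi-hidden-layer network. The framework (PyTorch) and the layer sizes are illustrative assumptions, not details from the original study; the point is only that a deep model is a stack of interconnected layers whose forward pass is repeated vector computation.

```python
# Minimal sketch (assumed layer sizes) of a network with multiple hidden layers.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(32, 64),   # input features -> first hidden layer
    nn.ReLU(),
    nn.Linear(64, 64),   # second hidden layer of interconnected nodes
    nn.ReLU(),
    nn.Linear(64, 2),    # decision layer: scores for two classes
)

x = torch.randn(8, 32)                 # a batch of 8 inputs with 32 features each
logits = model(x)                      # forward pass: repeated vector/matrix computation
print(logits.shape)                    # torch.Size([8, 2])
print(sum(p.numel() for p in model.parameters()))  # parameter count grows quickly with width and depth
```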
A deep learning model is typically built from many layers containing millions of interconnected nodes. After training on large amounts of data, these nodes can detect or classify inputs, forming the powerful data-processing systems behind artificial intelligence. However, the extreme complexity of deep learning models means that even specialist researchers cannot fully understand how they work: the model operates like a black box, and the outcomes of its predictions carry uncertainty.
Innovation Process
To determine whether an AI system is doing what the research requires, researchers have developed interpretability methods that reveal how deep learning models arrive at their predictions. Current interpretability methods include feature verification, visualization, challenge sets, adversarial examples, and explanation models. Explanation methods fall into two categories: global explanations, which describe a model's overall behavior, and local explanations, which focus on how the model makes a specific prediction. Interpretability methods are often implemented by building a separate, simpler surrogate model, but such surrogates can further obscure how the original model actually works.
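As an illustration of the "separate, simpler model" idea mentioned above, the sketch below fits a small linear surrogate around one prediction of a black-box classifier, in the spirit of LIME-style local explanation. The dataset, black-box model, and perturbation scheme are assumptions made for illustration only; this is not the CSAIL researchers' own method.

```python
# Hedged sketch: explain one prediction of a black-box model with a local
# linear surrogate fitted on perturbed copies of the input.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# Black-box model trained on synthetic data (illustrative only).
X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                                    # the instance to explain
rng = np.random.default_rng(0)
perturbed = x0 + rng.normal(scale=0.5, size=(500, x0.size))  # samples near x0
bb_probs = black_box.predict_proba(perturbed)[:, 1]          # black-box outputs the surrogate must imitate

# Weight perturbed points by proximity to x0, then fit a simple linear surrogate.
weights = np.exp(-np.linalg.norm(perturbed - x0, axis=1) ** 2)
surrogate = Ridge(alpha=1.0).fit(perturbed, bb_probs, sample_weight=weights)

# The surrogate's coefficients serve as a local explanation of the prediction at x0.
for i, coef in enumerate(surrogate.coef_):
    print(f"feature {i}: local weight {coef:+.3f}")
```

The surrogate is easy to read, but, as the passage notes, its faithfulness to the original black box still has to be checked, which is exactly the circularity problem discussed below.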
Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have concentrated on local explanation methods, studying the models, algorithms, and evaluation of interpretable machine learning. The most common types of local explanation are feature attribution, counterfactual explanation, and example-importance explanation: respectively, showing which features matter most when the model makes a particular decision, showing how a given input would have to change for the model's prediction to change, and showing which training samples the model relies on most for a particular prediction. The three approaches are suited, in turn, to checking for spurious correlations, to identifying the input changes that would alter a decision, and to examining the data the model was trained on.
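Of the three families just listed, feature attribution is the simplest to sketch. The snippet below uses gradient-times-input saliency, a generic attribution heuristic assumed here purely for illustration (not the specific methods studied at CSAIL), to rank the input features that one prediction is most sensitive to.

```python
# Hedged sketch of feature attribution: the gradient of the model's output with
# respect to the input indicates which features the prediction is most sensitive to.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))  # illustrative model

x = torch.randn(1, 5, requires_grad=True)  # the input whose prediction we want to explain
score = model(x).sum()                     # scalar output for this input
score.backward()                           # backpropagate to obtain d(score)/d(x)

attribution = (x.grad * x).detach().squeeze()        # gradient-times-input attribution scores
ranking = torch.argsort(attribution.abs(), descending=True)
print("features ranked by influence:", ranking.tolist())
```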
As machine learning is adopted in more and more disciplines, explanation methods are increasingly used to help decision makers better understand a model's predictions, so that they know when to trust the model and follow its guidance in practice. Yet interpretability methods can also lead practitioners across many fields to place too much trust in a particular system's predictions, which can easily end up perpetuating bias.
The researchers stress that scientific work should always remain skeptical; overconfidence can easily lead to worse outcomes. The correctness of an explanation method cannot be judged on its own and has to be compared against the actual model, but since users do not understand how the model itself works, the explanation method falls into a logical circle.
CSAIL is working to improve explanation methods so that they track the actual model's predictions more closely, but the researchers argue that even the best explanation should be treated with caution by its users. Moreover, people often liken models to human decision makers, which leads to overgeneralization. The goal of the work is to make people aware that model explanation methods are imperfect, to encourage careful thinking, and to ensure that the general understanding of a model gained from local explanations is put to the best possible use.
Explanation methods were developed to debug models and assure their quality. For example, with a better understanding of how features influence a model's decisions, people can recognize that a model is working incorrectly and intervene to fix the problem, or discard the model and start over. Alongside the explanation methods themselves, machine learning research also needs to study how the resulting information is conveyed to users, so that users can understand how AI models operate and can formulate sounder regulatory measures, ensuring that machine learning models deliver the greatest benefit in practice.
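The debugging use described above can be illustrated with a small, assumed example: a synthetic dataset contains a "leaky" feature that is almost a copy of the label, and permutation importance (a standard diagnostic used here only as an illustration) exposes the model's reliance on it, so a practitioner can intervene or retrain without that feature.

```python
# Hedged sketch of debugging with an explanation-style diagnostic: a spurious
# feature that leaks the label dominates the importances, signalling that the
# model should be fixed or retrained.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
rng = np.random.default_rng(0)
leak = y + rng.normal(scale=0.05, size=y.size)     # spurious feature: nearly a copy of the label
X = np.column_stack([X, leak])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: importance {imp:.3f}")    # the last (leaky) feature stands out
```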
Key Innovation Point
Examine how accurately machine learning models work from the perspective of interpretability methods, and broaden the precautions to be observed when AI research is put to use.
Innovation Value
Helps clarify how machine learning models make predictions, promotes a theory for matching explanations to specific scenarios, and addresses shortcomings in how explanation methods and machine learning models are used in practice.