LLM4Vis：使用 ChatGPT 的可解释可视化推荐

Lei Wang^♠ Songheng Zhang^♠ Yun Wang^♢ Ee-Peng Lim^♠ Yong Wang^♠
^♠Singapore Management University
^♢Microsoft Research Asia
{lei.wang.2019, shzhang.2021, eplim, yongwang}@smu.edu.sg wangyun@microsoft.com
Corresponding author.

摘要

数据可视化是一种功能强大的工具，用于探索和传达各个领域的见解。为了自动为数据集选择可视化，一项称为可视化推荐的任务被提了出来。针对这一目的，已经开发了各种基于机器学习的方法，但它们通常需要大量数据集-可视化对来进行训练，并且缺乏对其结果的自然解释。为了解决这一研究空白，我们提出了 LLM4Vis，一种新颖的基于 ChatGPT 的提示方法，用于执行可视化推荐并使用极少的演示示例返回类似人类的解释。我们的方法包括特征描述、演示示例选择、解释生成、演示示例构建和推理步骤。为了获得具有高质量解释的演示示例，我们提出了一种新的解释生成自举方法，通过考虑先前的生成和基于模板的提示来迭代地细化生成的解释。在 VizML 数据集上的评估表明，LLM4Vis 在少样本和零样本设置中都优于或与随机森林、决策树和 MLP 等监督学习模型的性能相似。定性评估还表明了 LLM4Vis 生成的解释的有效性。我们在 https://github.com/demoleiwang/LLM4Vis 上公开发布了我们的代码。

1 引言

数据可视化是一种功能强大的工具，用于探索数据、传达见解以及在各个领域做出明智的决策，例如商业、科学研究、社交媒体和新闻报道 Munzner (2014); Ward et al. (2010)。但是，创建有效的可视化需要熟悉数据和可视化工具，这可能需要大量的时间和精力 Dibia and Demiralp (2019a)。一项自动为输入数据集选择可视化的任务，也被称为可视化推荐，已经被提了出来。

到目前为止，可视化推荐工作可以分为基于规则和基于机器学习的方法 Hu et al. (2019b); Li et al. (2021); Zhang et al. (2023)。基于规则的方法 Mackinlay (1986); Vartak 等人 (2015); Demiralp 等人 (2017) 利用数据特征和可视化原则来预测可视化，但存在规则表达能力有限和泛化能力差的缺点。基于机器学习的方法 Hu 等人 (2019b); Wongsuphasawat 等人 (2015); Zhou 等人 (2021) 从数据集-可视化对中学习机器学习 (ML) 或深度学习 (DL) 模型，这些模型可以提供更高的推荐准确性和可扩展性。但是，现有的 ML/DL 模型通常需要大量的训练数据集-可视化对，并且无法为推荐结果提供解释。最近，一项基于机器学习的工作 KG4Vis Li 等人 (2021) 利用知识图来实现可解释的可视化推荐。然而，KG4Vis 仍然需要使用大量数据语料进行监督学习，其解释是基于预定义的模板生成的，这限制了解释的自然性和灵活性。

近年来，大型语言模型 (LLM) 如 ChatGPT OpenAI (2022) 和 GPT-4 OpenAI (2023) 已证明了使用情境学习的强大推理能力 Brown 等人 (2020); Zhang 等人 (2022); Chowdhery 等人 (2022)。这一背后的关键思想是使用类比示例进行学习 Dong 等人 (2022)。通过情境学习，LLM 可以有效地执行复杂的任务，包括但不限于数学推理 Wei 等人 (2022)、视觉问答 Yang 等人 (2022) 和表格分类 Hegselmann 等人 (2023)，无需监督学习。通过提示预训练的 LLM 使用情境学习来执行任务，我们避免了将 LLM 调整到新任务时参数更新的开销。

受 ChatGPT 在自然语言任务中的出色表现的启发 Qin 等人 (2023); Li 等人; Sun 等人 (2023); Gilardi 等人 (2023)，我们探索了利用 ChatGPT 进行可解释的可视化推荐的可能性。具体来说，我们提出了 LLM4Vis，一种新颖的基于 ChatGPT 的情境学习方法，用于 Visualization recommendation，通过从极少量的 dataset-visualization 对中学习，提供自然的人类语言解释。 LLM4Vis 包含几个关键步骤：特征描述、演示示例选择、解释生成引导、提示构建、以及可解释可视化推荐的推理。首先，特征描述用于定量地表示表格数据集的特征，这使得使用 ChatGPT 更容易分析和理解表格数据集。然后，使用演示示例选择来检索 $K$ 最近的标记数据示例，以防止输入长度超过 ChatGPT 的最大长度。接下来，我们提出了一种新的迭代细化策略，根据先前的生成和提示来获得更高质量的推荐解释和每个可视化类型在提示构建之前的分数。最后，使用构建的提示来指导 ChatGPT 为测试表格数据集推荐可视化类型，同时提供推荐分数和人类语言解释。

我们通过将 LLM4Vis 的可视化建议准确性与来自 VisML Hu 等人 (2019a) 的强大基于机器学习的基准（如决策树、随机森林和 MLP）进行比较，评估了 LLM4Vis 的可视化建议。可视化建议结果表明，LLM4Vis 在少样本和全样本训练设置中都优于所有基准。此外， LLM 和人类进行的评估表明，生成测试数据示例的解释与预测得分匹配。我们的贡献总结如下：

•

我们提出了 LLM4Vis，一种新颖的基于 ChatGPT 的提示方法，用于可视化建议，它可以实现准确的可视化建议，并提供类似人类的解释。
•

我们提出了一种新的解释生成自举方法，用于为提示构建生成高质量的推荐解释和分数。
•

实验结果表明 LLM4Vis 的实用性和有效性，鼓励进一步探索 LLM 用于可视化建议。

Refer to caption — 图 1： LLM4Vis 的详细说明。 (a) 将标记的表格数据集转换为最终提示的演示示例的过程，包括特征提取、特征描述和解释生成自举。 (b) 测试表格数据集的可视化类型推荐过程，包括演示示例选择、提示构建和推理。

2 相关工作

关于自动可视化建议方法的先前研究可以分为两组：不可解释的可视化建议方法和可解释的可视化方法 Wang 等人 (2021)。不可解释的可视化建议方法，包括 Data2vis Dibia 和 Demiralp (2019b)、VizML Hu 等人 (2019a) 和 Table2Chart Zhou 等人 (2021)，可以为输入数据集推荐合适的可视化，但无法向用户提供推荐背后的推理，使它们成为黑盒方法。可解释的可视化建议方法为其推荐结果提供解释，从而增强透明度和用户对推荐的信心。大多数依赖于人为定义的规则，例如 Show Me Mackinlay 等人 (2007) 和 Voyager Wongsuphasawat 等人 (2015)。但是，基于规则的方法通常很耗时且资源密集，并且需要可视化专家进行手动规范。为了解决这些局限性，Li 等人 (2021) 提出了一种基于知识图谱的推荐方法 (KG4Vis)，该方法从现有的可视化实例中学习规则。为了提供人性化的解释，本文提出利用 ChatGPT 来推荐合适的可视化方法。

3 LLM4Vis 方法

3.1 概述

本节介绍我们提出的 LLM4Vis 方法。如图 1 所示，LLM4Vis 包含几个关键步骤：特征描述、演示示例选择、解释生成引导、提示构建和推理。为了节省空间，我们在附录中展示了我们在 LLM4Vis 中使用所有提示的精确措辞。

3.2 特征描述

大多数大型语言模型，如 ChatGPT OpenAI (2022)，都是基于文本语料库进行训练的。为了允许 ChatGPT 以表格数据集作为输入，我们可以首先使用预定义规则将其转换为一组数据特征，这些特征定量地表示其特征。随后，这些特征可以被序列化成文本描述。

遵循 VizML Hu 等人 (2019b) 和 KG4Vis Li 等人 (2021)，我们提取了 80 个跨列数据特征，这些特征捕获列之间的关系，以及 120 个单列数据特征，这些特征量化了每列的属性。我们将与列相关的数据特征分类为类型、值和名称。类型对应于列的数据类型，值捕获诸如分布和异常值等统计特征，名称与列的名称相关。

先前的工作 Hegselmann 等人 (2023); Dinh 等人 (2022) 主要通过使用规则、模板或语言模型来执行序列化。在这篇论文中，为了确保语法正确性、灵活性以及丰富性，我们遵循 TabLLM 提出的 LLM 序列化方法 Hegselmann 等人 (2023)。具体来说，我们的方法包括提供一个提示，指示 ChatGPT 为每个表格数据集生成一个全面的文本描述，该描述分析了来自单列和跨列视角的特征值。然后使用特征描述来构建简洁但信息丰富的演示示例。

3.3 演示示例选择

由于最大输入长度限制，ChatGPT 提示只能容纳少量演示示例。因此，从大量标记数据中选择好的演示样本至关重要。而不是随机选择可能与目标测试表格数据集无关的示例 Liu 等人 (2021)，我们首先通过将每个表格数据集的特征转换为向量来表示它。然后，我们使用聚类算法从标记集中选择代表性的示例子集。聚类算法创建 $C$ 个聚类，我们从每个聚类中选择 $R$ 个代表性示例，从而得到大小为 $M=C\times R$ 的子集作为检索集。最后，我们根据其向量表示的余弦相似度分数从检索集中检索与目标数据示例具有最高相似度分数的 $K$ 个训练数据示例。

3.4 解释生成自举

每个标记数据示例 $X_{i}$ 仅带有一个基本真值标签 $Y_{i}$ ，但没有在演示示例中使用的解释。因此，我们建议使用一个提示来利用 ChatGPT 的内置知识，为每个标记数据集推荐合适的可视化效果以及相应的解释。我们的策略包括指示 ChatGPT 以 JSON 格式生成响应，其中键对应于四种可能的可视化类型 $\{Y_{LC},Y_{SP},Y_{BC},Y_{BP}\}$ ( $L C$ : 折线图， $S P$ : 散点图， $B C$ : 柱状图， $B P$ : 箱线图)，而值是推荐分数 $\{S_{LC},S_{SP},S_{BC},S_{BP}\}$ 。此外，我们提示 ChatGPT 在迭代过程中为其对每种可视化类型的预测生成解释 $\{Ex_{LC},Ex_{SP},Ex_{BC},Ex_{BP}\}$ 。

具体来说，我们使用零样本提示和表格数据集的特征描述来要求 ChatGPT 为所有可视化类型生成分数 $\{S^{1}_{LC},S^{1}_{SP},S^{1}_{BC},S^{1}_{BP}\}$ ，并提供解释 $\{Ex^{1}_{LC},Ex^{1}_{SP},Ex^{1}_{BC},Ex^{1}_{BP}\}$ 来支持这些分数分配给每种可视化类型。这些分数的总和必须为 1. 随后，这些分数和解释通过一个迭代细化过程进行修正，该过程在真实可视化类型 $Y_{i}$ 获得最高分数，并且该分数比第二高分数至少高出 $0.1$ 的差值时终止。最终的解释和分数分别用 $\{Ex^{f}_{LC},Ex^{f}_{SP},Ex^{f}_{BC},Ex^{f}_{BP}\}$ 和分数 $\{S^{f}_{LC},S^{f}_{SP},S^{f}_{BC},S^{f}_{BP}\}$ 表示。但是，如果真实可视化类型不满足上述条件，我们将开发一个提示并将其附加到初始的零样本提示中，以指示 ChatGPT 生成更准确的输出。一个示例提示模板如下：“{a} 可能比 {b} 更合适。但是，之前的分数是 {c}”。 {a} 槽用于真实标签，{b} 槽用于得分最高的错误标签，{c} 槽用于先前预测的每个可视化类型的分数。在实验部分，我们比较了两种提示策略，包括使用真实标签 (GT-As) 和随机标签 (Rand-As) 作为提示。结果可以在图 2 中找到。

通过这种迭代细化，我们可以获得更高质量的可视化类型预测，以及相应的评分和解释。请注意，如果标记的数据集在最大迭代步骤内无法满足停止条件，我们将从检索集中删除此数据示例。

3.5 提示构建和推理

从检索集中为测试数据样本检索 $K$ 个最近的标记样本，以及它们的特征描述、细化后的解释和细化后的评分，每个演示示例都使用特征描述、任务说明、推荐的可视化类型及其评分和解释构建。然后，我们将测试数据示例的特征描述合并到预定义的模板中。接下来，将构建的演示示例和测试数据示例的完成模板串联起来，并将其输入 ChatGPT 以执行可视化类型推荐。最后，我们从 ChatGPT 输出中提取推荐的可视化效果和解释。

4 评估

4.1 评估设置

表 1: 我们定量评估的结果，最佳结果以粗体突出显示。 LLM4Vis-随机是指从检索集中随机选择演示示例。相反，LLM4Vis-检索是指从检索集中检索

K

最近的标记数据示例。请注意，使用 5 个演示的 LLM4Vis 表现优于使用完整样本 (5000) 训练的机器学习基线，并提供这些基线无法实现的人类般的解释。

Settings	Methods	Hits@2
		Line	Scatter	Bar	Box	Overall
Full Samples	Decision Tree	57.3	60.0	100	56.0	68.3
	Random Forest	92.0	100	90.7	32.0	78.7
	MLP	97.3	100	93.3	24.0	78.7
Few-Shot (4) Fixed	Decision Tree	42.7	12.0	100	41.3	49.0
	Random Forest	66.7	78.7	38.7	65.3	62.0
	MLP	70.7	85.3	44.0	45.3	61.0
	LLM4Vis	53.3	80.0	84.0	93.3	77.7
Few-Shot Dynamic	LLM-SP-Random	36.0	86.0	96.0	46.0	66.0
	LLM-SP-Retrieval	68.0	94.0	90.0	32.0	71.0
	LLM4Vis-Random	46.7	69.3	84.0	90.7	72.7
	LLM4Vis-Retrieval	62.4	96.0	86.8	97.2	85.7
Zero-Shot	LLM-SP	64.0	84.0	56.0	64.0	65.0
Zero-Shot	LLM4Vis	64.0	88.0	76.0	89.3	79.3

数据集。我们利用 VizML 语料库 Hu 等人 (2019b) 来构建我们的训练集、验证集和测试集。我们从语料库中选择 100 个数据可视化对的子集，以评估我们模型的测试性能。这些对包括 25 个折线图、25 个散点图、25 个条形图和 25 个箱线图。我们在实验中采用两种不同的训练设置。在第一个设置中，我们使用语料库中的 5000 个数据可视化对集来训练所有基线模型。在第二个少样本设置中，我们使用聚类技术 Pedregosa 等人 (2011) 从 5000 个对中提取 $4\times 15$ 个数据可视化对，以构建大小为 ( $M=60$ ) 的检索集。

大语言模型设置。我们使用 GPT-3.5 的 gpt-3.5-turbo-16k 版本进行实验，该版本广为人知为 ChatGPT。我们选择 ChatGPT 是因为它是一个公开可用的模型，通常用于评估大型语言模型在下游任务中的性能 Sun 等人 (2023)；Qin 等人 (2023)；Li 等人 . 为了进行我们的实验，我们利用 OpenAI API，它提供了对 ChatGPT 的访问权限。我们的实验是在 2023 年 6 月至 2022 年 7 月之间进行的，生成允许的最大令牌数设置为 1024。为了增强我们生成的输出的确定性，我们将温度设置为 0。由于 ChatGPT 的输入长度限制（即 16,384 个令牌），我们将上下文演示 $K$ 的数量限制为 8。

基线。我们将与来自 VizML Hu et al. (2019a) 的强大的可视化类型推荐基线进行比较。具体来说，我们将我们的方法与决策树、随机森林和 MLP 基线进行比较，这些基线使用 scikit-learn 以及默认设置实现 Pedregosa et al. (2011)。通过完整数据训练，这些强大的基线有望胜过少样本方法。我们还将我们的方法与一种名为 LLM-SP 的简单提示技术进行比较。在零样本设置中，提示中的指令是要求 ChatGPT 根据给定表格数据集的提取特征来推荐可视化类型。在少样本设置中，提示中的每个演示示例都包含一个指令、给定表格数据集的提取特征以及相应的标记可视化类型。

指标。我们提出的方法直接根据大型语言模型做出两个可视化设计选择。参考 KG4Vis Li et al. (2021)，我们采用了一个常用的指标来评估我们方法的有效性：Hits@2，它表示前两个选项中正确可视化设计选择的比例。

4.2 主要结果

表 1 显示，在完整样本训练设置中，我们的少样本 LLM4Vis 优于所有基线，包括决策树、随机森林和 MLP，这表明 LLM 可以通过从有限的演示示例中学习并利用内置的关于可视化的背景知识，有效地推荐适当的可视化类型。请注意，即使是零样本 LLM4Vis 也能优于这些强大的基线。少样本设置的两种类别为：固定和动态。在固定设置中，为所有测试示例选择固定的演示示例，LLM4Vis 优于所有基线。在动态设置中，我们为每个测试示例选择相关的演示示例。具有动态少样本设置的 LLM4Vis 优于随机选择的演示。这表明相关的演示示例可以提供有用的信息，以指导 LLM 为测试表格数据集推荐合适的可视化类型。

4.3 深入分析

LLM4Vis 各个组件的影响。

图 2 展示了 LLM4Vis 变体的比较结果，其中一个组件被移除或替换。发现，在提示中缺少解释、特征描述和推荐评分会导致零样本和少样本设置中的性能下降。随着解释细化迭代次数的增加，性能会提高。将提出的提示替换为真实标签或随机标签会导致性能大幅下降。同样，使用最近的演示示例的预测作为测试示例的预测也会导致性能显著下降，这表明 LLM 有效地从给定的演示示例中学习，而不是仅仅复制它们。总体而言，提出的 LLM4Vis 的所有组件都有助于提高推荐准确性。

上下文示例数量的影响。

我们评估了演示示例数量对 LLM4Vis 性能的影响。具体来说，我们研究了 LLM4Vis，使用不同的最近演示示例集，范围从 1 到 7 个实例。结果如图 3(a) 所示，表明更多的演示示例会导致更好的性能，尽管当演示示例的数量从 3 增加到 4 时性能下降。

检索集大小的影响。

我们量化了检索集大小的影响。我们在不同大小的检索集上测试 LLM4Vis，范围从 $10$ 到 $60$ 个示例。图 3(b) 显示，随着检索集大小的增加，LLM4Vis 的性能得到提高。这可能是因为更大的检索集可以找到更多相关的最近邻居。这表明 LLM4Vis 通过扩展检索集可以取得更好的结果。随着检索集大小从 50 增加到 60，我们观察到性能提升程度下降。这表明 k 近邻演示示例中与测试数据相关的的信息可能没有成比例地增加。

基础大型语言模型的影响

我们还使用各种 LLM 评估 LLM4Vis，包括不同版本的 GPT-3.5。根据官方指南，ChatGPT 具有最高的能力，而 text-davinci-002 是三个 LLM 中能力最弱的模型。正如预期的那样，图 3(c) 说明，随着模型能力从 text-davinci-002 提高到 ChatGPT，模型性能得到改善。总体而言，这些结果表明，能力更强的 LLM 通常会提供更好的推荐准确性。

上下文示例顺序的影响。

我们比较了三种演示顺序：随机（将 $K$ 最近的邻居随机排列）、最远（首先选择相似度最低的样本）和最近（首先选择相似度最高的样本）。图 3(d) 中的结果表明，LLM4Vis 对 $K$ 所选演示的顺序敏感。具体来说，在 LLM4Vis 框架内使用“最远”排序会导致最低的结果，而“最近”排序会导致最强的性能。这表明相关的演示可以稳定 LLM 的上下文学习。

解释评估。

在本节中，我们评估了在一个测试表格数据集中生成的解释与可视化类型推荐的预测分数之间的一致性。采用两种评估指标：基于 LLM 的评估和人工评估。

基于 LLM 的评估测量了由 LLM4Vis 生成的预测分数与 ChatGPT 基于 LLM4Vis 生成的解释预测的分数之间的皮尔逊相关性。更高的皮尔逊相关性意味着预测分数和解释之间的一致性更强。我们获得了零样本 LLM4Vis 的皮尔逊相关性为 0.78，少样本 LLM4Vis 的皮尔逊相关性为 0.92。这些发现表明，与零样本 LLM4Vis 相比，少样本 LLM4Vis 在其预测分数和生成的解释之间表现出更高的一致性。

除了基于 LLM 的评估之外，我们还手动检查了十个正确的推荐，以进一步验证生成的解释与预测分数之间的一致性。我们的检查表明，十个示例中有九个示例展示了它们的解释与预测分数之间的一致性。一个特定实例的生成解释和预测分数不一致。这可能是因为基本事实标签的预测分数很低，并且是第二高的。

5 结论

本文提出了一种基于 ChatGPT 的新型上下文学习方法 LLM4Vis，用于可视化推荐，该方法可以通过仅学习少量数据集-可视化对来生成具有类似人类解释的准确可视化推荐。我们的方法包括几个关键步骤，包括特征提取、特征描述、解释生成、演示示例选择、提示生成和推理。我们对推荐结果和解释的评估证明了 LLM4Vis 的有效性和可解释性，这鼓励了人们进一步探索大型语言模型以完成此任务。

基于 LLM 的可视化推荐可以使许多初创公司和基于 LLM 的应用程序能够推进数据分析，增强洞察力的沟通，并帮助决策。在未来的工作中，我们计划探索将 LLM4Vis 部署到现实世界的数据分析和可视化应用程序中的可能性，并通过数据分析师和普通可视化用户进一步证明其有效性和可用性。此外，研究使用其他具有多模态能力的大型语言模型（例如 GPT-4）进行可视化推荐也很有意思。

6 致谢

该项目得到新加坡教育部学术研究基金第 2 层（提案 ID：T2EP20222-0049）的支持。本材料中表达的任何意见、发现和结论或建议均为作者的观点，并不反映新加坡教育部的观点。

参考文献

Brown et al. (2020) Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.
Chowdhery et al. (2022) Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. 2022. Palm: Scaling language modeling with pathways. ArXiv preprint, abs/2204.02311.
Demiralp et al. (2017) Çağatay Demiralp, Peter J Haas, Srinivasan Parthasarathy, and Tejaswini Pedapati. 2017. Foresight: Recommending visual insights. arXiv preprint arXiv:1707.03877.
Dibia and Demiralp (2019a) Victor Dibia and Çagatay Demiralp. 2019a. Data2vis: Automatic generation of data visualizations using sequence-to-sequence recurrent neural networks. IEEE Computer Graphics and Applications, 39(5):33–46.
Dibia and Demiralp (2019b) Victor Dibia and Çağatay Demiralp. 2019b. Data2vis: Automatic generation of data visualizations using sequence-to-sequence recurrent neural networks. IEEE computer graphics and applications, 39(5):33–46.
Dinh et al. (2022) Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, and Kangwook Lee. 2022. Lift: Language-interfaced fine-tuning for non-language machine learning tasks. Advances in Neural Information Processing Systems, 35:11763–11784.
Dong et al. (2022) Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, and Zhifang Sui. 2022. A survey for in-context learning. arXiv preprint arXiv:2301.00234.
Gilardi et al. (2023) Fabrizio Gilardi, Meysam Alizadeh, and Maël Kubli. 2023. ChatGPT outperforms Crowd-Workers for Text-Annotation tasks.
Hegselmann et al. (2023) Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. 2023. Tabllm: few-shot classification of tabular data with large language models. In International Conference on Artificial Intelligence and Statistics, pages 5549–5581. PMLR.
Hu et al. (2019a) Kevin Hu, Michiel A Bakker, Stephen Li, Tim Kraska, and César Hidalgo. 2019a. Vizml: A machine learning approach to visualization recommendation. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pages 1–12.
Hu et al. (2019b) Kevin Zeng Hu, Michiel A. Bakker, Stephen Li, Tim Kraska, and César A. Hidalgo. 2019b. Vizml: A machine learning approach to visualization recommendation. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pages 1–12.
(12) Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, and Shikun Zhang. ChatGPT_for_IE: Evaluating ChatGPT’s information extraction capabilities: An assessment of performance, explainability, calibration, and faithfulness.
Li et al. (2021) Haotian Li, Yong Wang, Songheng Zhang, Yangqiu Song, and Huamin Qu. 2021. Kg4vis: A knowledge graph-based approach for visualization recommendation. IEEE Transactions on Visualization and Computer Graphics, 28(1):195–205.
Liu et al. (2021) Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. 2021. What makes good in-context examples for gpt- $3$ ? arXiv preprint arXiv:2101.06804.
Mackinlay et al. (2007) Jock Mackinlay, Pat Hanrahan, and Chris Stolte. 2007. Show me: Automatic presentation for visual analysis. IEEE transactions on visualization and computer graphics, 13(6):1137–1144.
Mackinlay (1986) Jock D. Mackinlay. 1986. Automating the design of graphical presentations of relational information. ACM Transactions on Graphics, 5(2):110–141.
Munzner (2014) Tamara Munzner. 2014. Visualization analysis and design. CRC press.
OpenAI (2022) OpenAI. 2022. Introducing chatgpt. https://openai.com/blog/chatgpt.
OpenAI (2023) OpenAI. 2023. GPT-4 technical report. CoRR, abs/2303.08774.
Pedregosa et al. (2011) Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Ron J. Weiss, J. Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine learning in python. ArXiv, abs/1201.0490.
Qin et al. (2023) Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, and Diyi Yang. 2023. Is ChatGPT a General-Purpose natural language processing task solver?
Sun et al. (2023) Weiwei Sun, Lingyong Yan, Xinyu Ma, Pengjie Ren, Dawei Yin, and Zhaochun Ren. 2023. Is ChatGPT good at search? investigating large language models as Re-Ranking agent.
Vartak et al. (2015) Manasi Vartak, Sajjadur Rahman, Samuel Madden, Aditya Parameswaran, and Neoklis Polyzotis. 2015. Seedb: Efficient data-driven visualization recommendations to support visual analytics. In Proceedings of the VLDB Endowment, volume 8, page 2182–2193.
Wang et al. (2021) Qianwen Wang, Zhutian Chen, Yong Wang, and Huamin Qu. 2021. A survey on ml4vis: Applying machine learning advances to data visualization. IEEE Transactions on Visualization and Computer Graphics, 28(12):5134–5153.
Ward et al. (2010) Matthew O Ward, Georges Grinstein, and Daniel Keim. 2010. Interactive data visualization: foundations, techniques, and applications. CRC Press.
Wei et al. (2022) Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, and Denny Zhou. 2022. Chain of thought prompting elicits reasoning in large language models. In Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022).
Wongsuphasawat et al. (2015) Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, Jock Mackinlay, Bill Howe, and Jeffrey Heer. 2015. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE transactions on visualization and computer graphics, 22(1):649–658.
Wu et al. (2021) Aoyu Wu, Yun Wang, Xinhuan Shu, Dominik Moritz, Weiwei Cui, Haidong Zhang, Dongmei Zhang, and Huamin Qu. 2021. Ai4vis: Survey on artificial intelligence approaches for data visualization. IEEE Transactions on Visualization and Computer Graphics.
Yang et al. (2022) Zhengyuan Yang, Zhe Gan, Jianfeng Wang, Xiaowei Hu, Yumao Lu, Zicheng Liu, and Lijuan Wang. 2022. An empirical study of gpt-3 for few-shot knowledge-based vqa. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 3081–3089.
Zhang et al. (2023) Songheng Zhang, Haotian Li, Huamin Qu, and Yong Wang. 2023. Adavis: Adaptive and explainable visualization recommendation for tabular data. IEEE Transactions on Visualization and Computer Graphics.
Zhang et al. (2022) Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona T. Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, and Luke Zettlemoyer. 2022. OPT: open pre-trained transformer language models. CoRR, abs/2205.01068.
Zhou et al. (2021) Mengyu Zhou, Qingtao Li, Xinyi He, Yuejiang Li, Yibo Liu, Wei Ji, Shi Han, Yining Chen, Daxin Jiang, and Dongmei Zhang. 2021. Table2charts: recommending charts by learning shared table representations. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 2389–2399.

附录 A 附录

A.1 提示和示例

本节包括三个部分：提出的 LLM4Vis 中使用的提示措辞（表 2）、可视化类型推荐示例（表 A.2 到表 A.2）和解释迭代细化的示例（表 7 到表 10）。

A.2 相关工作

自动可视化推荐方法的先前研究可以分为两组：不可解释的可视化推荐方法和可解释的可视化方法 Wang et al. (2021)。

不可解释的可视化推荐方法可以为输入数据集推荐合适的可视化，但无法向用户提供推荐背后的推理，使其成为黑盒方法。这些方法中的一种例子是 Data2vis Dibia 和 Demiralp (2019b)，它采用神经翻译模型（Bi-LSTM）以端到端的方式生成可视化规范，无需人工干预。然而，该方法无法很好地模拟数据集特征与可视化（例如可视化类型）之间的映射 Wu 等人 (2021)。为了解决这一限制，Hu 等人提出了 VizML Hu 等人 (2019a)，它执行特征工程来量化输入数据集的特征，并应用神经网络来推荐适合数据集特征的可视化类型。除了这些方法之外，Table2Chart Zhou 等人 (2021) 不仅推荐适合输入数据集的可视化，还推荐特定用户指示的特定可视化类型的视觉编码。与这些方法相比，Table2Chart 提供了一种更个性化的推荐方法，满足用户的特定需求和偏好。尽管这些方法有效，但仍然需要一种可视化推荐方法，能够以准确且可解释的方式推荐可视化。

可解释的可视化推荐方法为其推荐结果提供解释，提高了推荐的透明度和用户信心。大多数可解释的可视化推荐方法依赖于人工定义的规则，这些规则指定了数据集特征与可视化类型之间的映射。例如，Show Me Mackinlay 等人 (2007) 如果数据集特征与预定义规则一致，则会自动推荐可视化类型。 Wongsuphasawat 等人 (2015) 介绍了 Voyager，它通过根据预定义规则详尽地探索数据集列来生成潜在的可视化，并根据数据集属性和可视化原则对它们进行排名。虽然这些基于规则的方法可以解释其推荐，但规则开发非常耗时，资源密集，并且需要可视化专家。

为了解决这一限制，Li 等人提出了一个基于知识图谱的推荐方法 (KG4Vis)，该方法从现有的可视化实例中学习规则。然而，KG4Vis 中的规则可能包含复杂的术语，对于没有领域知识的用户来说可能难以理解。为了应对这一挑战，我们提出了一种新的可视化推荐方法，该方法利用 ChatGPT 为其推荐结果提供类似人类的解释。我们方法生成的解释更容易被只有几个实例的外行人理解。

表 2： LLM4Vis 中使用的提示措辞。

Wording of Feature Description Prompt:

The features of a given tabular dataset are provided in the following delimited by triple backticks. Your task is to generate a detailed text description, in 1000 characters, that focus on features that are important for visualization type selection and comprehensively analyzes this tabuar dataset based on its feature values from both single-column and cross-column perspectives. Note that the response must exclude words such as line chart, scatter plot, bar chart, and box plot, since these words will mislead further visualization recommendation. The response format can be as “Single-column perspective: […]

Cross-column perspective: […].” Ensure that the summary maintains strong generalization ability and includes all vital information. Features for a tabular dataset: ```{ }```

Wording of Visualization Recommendation Prompt:

Determine whether each visualization type in the following list of visualization types is a suitable visualization type in the text description for a tabular dataset below, which is delimited with triple backticks. Give your explanation and your answer at the end as json (Explanation is as below: .

The final answer in JSON format would be:), where each element consists of a visualization type and a score ranging from 0 to 1 (1 means the most suitable). The scores should sum to be 1 (line + scatter + bar + box = 1.0). List of visualization types: [line chart, scatter plot, bar chart, and box plot]. Text description for a tabular dataset:```{ }```

Wording of Hint Guided Visualization Recommendation Prompt:

Determine whether each visualization type in the following list of visualization types is a suitable visualization type in the text description for a tabular dataset below, which is delimited with triple backticks. Hint: { } may be more suitable than { }, however, previous score is { }. With the given hint, editing your explanation and improve your answer at the end as json (Explanation is as below: .

表 3：线形图推荐示例。提示模板以浅灰色突出显示。测试表格数据集的输入特征描述以青柠色突出显示。输出以黄色突出显示。

Prompt:

Demonstration Examples:

…

Test Instance:

``` Single-column perspective: The dataset contains information about two columns, labeled as ‘x’ and ‘y’. The ‘x’ column represents time values, while the ‘y’ column contains numerical decimal values. The ‘x’ column is of the time data type, and the ‘y’ column is of the numerical data type. The ‘x’ column is sorted and monotonic, indicating a continuous progression of time values. The ‘y’ column has outliers present, as indicated by the flags for outliers based on different criteria. The ‘y’ column shows a range from -3.0 to 11.0, with a mean of 0.0 and a standard deviation of 16.17. The distribution of the ‘y’ column appears to be slightly positively skewed, with a skewness value of 0.21. The ‘y’ column has a gini coefficient of 0.36, indicating some inequality in the distribution. The ‘y’ column has a non-zero variance of 261.57, indicating variability in the data. The ‘y’ column has a high percentage of outliers based on different criteria, ranging from 97.78% to 99.10%. The ‘y’ column does not follow a normal distribution, as indicated by the normality statistic and p-value. Cross-column perspective: The dataset consists of two columns, ‘x’ and ‘y’, with ‘x’ representing time values and ‘y’ representing numerical decimal values. There is no significant correlation or statistical relationship between the ‘x’ and ‘y’ columns, as indicated by the absence of significant correlation, chi-square, Kolmogorov-Smirnov, and linear regression tests. The ‘x’ and ‘y’ columns do not share any elements, unique elements, or words. The ‘x’ and ‘y’ columns do not have any categorical or time-based relationship. The ‘x’ column has a length of 1000, indicating a consistent number of time values. The ‘y’ column has a length of 1000 as well. The ‘x’ column has a single-word field name, while the ‘y’ column also has a single-word field name. The ‘x’ column does not have any uppercase characters or special currency symbols in its field name. The ‘y’ column has outliers based on different criteria, while the ‘x’ column does not have any outliers.```

Test Output:

Based on the given text description, we can evaluate the suitability of each visualization type for the tabular dataset. Here is an explanation for each visualization type: 1. Line Chart: A line chart is suitable for visualizing the relationship between two variables over time. In this dataset, the ‘x’ column represents time values, and the ‘y’ column contains numerical decimal values. Since the ‘x’ column represents time values and is sorted and monotonic, a line chart would be a suitable visualization type to show the trend or pattern of the ‘y’ values over time. Therefore, we can assign a score of 0.6 to the line chart. 2. Scatter Plot: A scatter plot is useful for visualizing the relationship between two continuous variables. However, in this dataset, there is no significant statistical relationship between the ‘x’ and ‘y’ columns. Therefore, a scatter plot may not be the most suitable visualization type. We can assign a score of 0.1 to the scatter plot. 3. Bar Chart: A bar chart is commonly used to compare categorical or discrete data. Since the dataset does not contain any categorical variables, a bar chart may not be the most suitable visualization type. We can assign a score of 0.1 to the bar chart. 4. Box Plot: A box plot is useful for visualizing the distribution of a continuous variable and identifying outliers. In this dataset, the ‘y’ column has outliers present, and the text provides information about the range, mean, standard deviation, skewness, and percentage of outliers in the ‘y’ column. These statistics indicate the presence of a distribution and outliers, making a box plot a suitable visualization type to show the distribution and identify outliers. Therefore, we can assign a score of 0.2 to the box plot. Based on the above analysis, the scores for each visualization type can be assigned as follows:

Ground Truth: Line Chart.

表 4：散点图推荐示例。提示模板以浅灰色突出显示。测试表格数据集的输入特征描述以青柠色突出显示。输出以黄色突出显示。

Prompt:

Demonstration Examples:

…

Test Instance:

``` Single-column perspective: Based on the provided features, the dataset contains two columns, referred to as x and y. Both columns have decimal data types (data_type_is_decimal_x and data_type_is_decimal_y). The x column has a length of 800 (length_x), and the y column also has a length of 800 (length_y). The x column is sorted (is_sorted_x) and is monotonic (is_monotonic_x), while the y column is also sorted (is_sorted_y) and monotonic (is_monotonic_y). The x column has a mean value of 0.0 (mean_x), a median of 0.0 (median_x), a variance of 0.0 (var_x), and a standard deviation of 0.0 (std_x). It has a minimum value of 0.0 (min_x), a maximum value of 0.0 (max_x), and a range of 0.0 (range_x). The distribution of the x column is symmetric, with a skewness of 0.0 (skewness_x) and a kurtosis of -3.0 (kurtosis_x). The x column follows a normal distribution, as indicated by the normality statistic of 2210.514 (normality_statistic_x) and a p-value of 0.0 (normality_p_x). … ```

Test Output:

Based on the provided text description for the tabular dataset, we can evaluate the suitability of each visualization type. Here is an explanation for each visualization type: 1. Line chart: A line chart is suitable for visualizing trends and changes over time or a continuous variable. In this dataset, there is no explicit mention of time or a continuous variable. Therefore, a line chart may not be the most suitable visualization type. 2. Scatter plot: A scatter plot is useful for visualizing the relationship between two continuous variables. Since the dataset contains two columns (x and y) with decimal values, a scatter plot can effectively show the relationship between these variables. It can help identify any patterns, clusters, or correlations between the x and y values. 3. Bar chart: A bar chart is commonly used to compare categorical or discrete data. In this dataset, there is no mention of categorical or discrete variables. Therefore, a bar chart may not be the most suitable visualization type. 4. Box plot: A box plot is useful for visualizing the distribution of a continuous variable, including measures such as median, quartiles, and outliers. Since the dataset provides statistical properties for both the x and y columns (mean, median, variance, standard deviation), a box plot can effectively represent the distribution of these variables and provide insights into their spread and central tendency. Based on the above analysis, we can assign scores to each visualization type: - Line chart: 0.1

Ground Truth: Scatter Plot.

表 5：柱状图推荐示例。提示模板以浅灰色突出显示。测试表格数据集的输入特征描述以青柠色突出显示。输出以黄色突出显示。

Prompt:

Demonstration Examples:

…

Test Instance:

``` Single-column perspective: Based on the provided features, the dataset contains two columns. The first column (x-axis) is a categorical variable, as indicated by the features ‘data_type_is_string_x’ and ‘general_type_is_c_x’. It has six unique elements, with a length of 6. The values in this column are sorted and have no missing values or outliers. The second column (y-axis) is a quantitative variable, as indicated by the features ‘data_type_is_decimal_y’ and ‘general_type_is_q_y’. It has a range of 0.0 to 0.0, with a mean, median, and mode of 0.0. The values in this column are sorted, have no missing values, and no outliers. The column has a negative kurtosis value of -3.0, indicating a platykurtic distribution. Cross-column perspective: There is no significant correlation or statistical relationship between the two columns, as indicated by the features ‘correlation_significant_005’, ‘linregress_significant_005’, ‘ks_significant_005’, and ‘one_way_anova_significant_005’. The columns do not share any elements, unique elements, or words, as indicated by the features ‘has_shared_elements’, ‘has_shared_unique_elements’, and ‘has_shared_words’. The chi-square test is not significant, as indicated by the feature ‘chi2_significant_005’. The columns do not have any range overlap, as indicated by the feature ‘has_range_overlap’. The edit distance between the column values is 1.0, and the normalized edit distance is also 1.0, indicating a low similarity between the values. The nestedness feature is not available for analysis.```

Test Output:

To determine the suitability of each visualization type for the given tabular dataset, we need to analyze the characteristics of the dataset and the visualization types. Based on the provided text description, we can make the following assessments: 1. Line Chart: A line chart is suitable for visualizing trends and changes over time or a continuous variable. However, in this dataset, there is no mention of time-related information or a continuous variable that would benefit from a line chart. Additionally, the second column (y-axis) has a constant value of 0.0, indicating no variation. Therefore, a line chart is not suitable for this dataset. 2. Scatter Plot: A scatter plot is useful for visualizing the relationship between two continuous variables. In this dataset, the first column (x-axis) is a categorical variable, and the second column (y-axis) is a quantitative variable. There is no significant correlation or statistical relationship mentioned between the two columns. Therefore, a scatter plot may not be the most suitable visualization type for this dataset. 3. Bar Chart: A bar chart is commonly used to compare categorical or discrete data. In this dataset, the first column (x-axis) is a categorical variable with six unique elements. A bar chart can effectively represent the distribution and comparison of these categories. Therefore, a bar chart is suitable for this dataset. 4. Box Plot: A box plot is effective for visualizing the distribution of numerical data and identifying outliers. In this dataset, the second column (y-axis) is a quantitative variable with a constant value of 0.0, indicating no variation. Therefore, a box plot is not suitable for this dataset. Based on the assessments above, we can assign scores to each visualization type: - Line Chart: 0.0

Ground Truth: Bar Chart.

表格 6: 箱线图推荐示例。提示模板以浅灰色突出显示。测试表格数据集的输入特征描述以青绿色突出显示。输出以黄色突出显示。

Prompt:

Demonstration Examples:

…

Test Instance:

``` Single-column perspective: Based on the provided features, the dataset contains two numerical columns, labeled as ‘x’ and ‘y’. The ‘x’ column has a length of 11 and ranges from 233.5 to 328.3, with a mean of 278.6 and a standard deviation of 27.3. The distribution of ‘x’ is slightly positively skewed (skewness = 0.088) and exhibits a platykurtic shape (kurtosis = -0.849). The ‘y’ column has a length of 14 and ranges from 217.8 to 262.0, with a mean of 244.8 and a standard deviation of 12.5. The distribution of ‘y’ is slightly negatively skewed (skewness = -0.454) and also exhibits a platykurtic shape (kurtosis = -0.722). Both columns have outliers beyond 1.5 times the interquartile range (IQR) and 99th percentile. The ‘x’ column has a higher percentage of outliers (90.9% and 100% for 1.5 IQR and 1-99 percentile, respectively) compared to the ‘y’ column (92.9% and 100% for 1.5 IQR and 1-99 percentile, respectively). The ‘x’ column has a higher range (94.8) compared to the ‘y’ column (44.2). The entropy of the ‘x’ column is 2.393, indicating moderate diversity, while the entropy of the ‘y’ column is slightly higher at 2.638. Both columns show a low Gini coefficient, indicating a relatively even distribution. The ‘x’ column has a higher normalized mean (0.849) and median (0.857) compared to the ‘y’ column (0.934 and 0.947, respectively). Cross-column perspective: The correlation between the ‘x’ and ‘y’ columns is 0.160, indicating a weak positive linear relationship. The p-value for the correlation test is 0.639, suggesting that the correlation is not statistically significant. The Kolmogorov-Smirnov (KS) test indicates a significant difference between the distributions of ‘x’ and ‘y’ (KS statistic = 0.727, p-value = 0.003). The edit distance between the ‘x’ and ‘y’ column names is 1, indicating a slight difference. The normalized edit distance is 0.25, suggesting a moderate dissimilarity. The ‘x’ and ‘y’ column names have the same length (4 characters) and share some common characteristics, such as having the first character in uppercase and containing digits. However, they do not share any words. The ‘x’ and ‘y’ columns do not have significant categorical or time-related characteristics. In summary, the dataset consists of two numerical columns with slightly different distributions and a weak positive linear relationship. Both columns have outliers and exhibit platykurtic shapes. The ‘x’ column has a wider range and higher percentage of outliers compared to the ‘y’ column. The ‘x’ column shows slightly higher diversity and normalized mean/median values compared to the ‘y’ column. The correlation between the columns is not statistically significant, but there is a significant difference in their distributions according to the KS test. The column names have some similarities but differ in terms of edit distance and shared words.```

Test Output:

Based on the given text description for the tabular dataset, we can determine the suitability of each visualization type as follows: - Line chart: The dataset does not provide any time-based or sequential data, and there is no specific trend or relationship mentioned that would require a line chart. Therefore, a line chart is not suitable in this case. Score: 0. - Scatter plot: The dataset consists of two numerical columns with a weak positive linear relationship. Although the correlation is not statistically significant, a scatter plot can still be used to visualize the relationship between the ‘x’ and ‘y’ columns. Score: 0.4. - Bar chart: A bar chart is typically used to compare discrete categories or groups. Since the dataset consists of numerical columns without any categorical or group information, a bar chart is not suitable in this case. Score: 0. - Box plot: The dataset mentions that both columns have outliers and exhibit slightly different distributions. A box plot can effectively represent the distribution of values, including the median, quartiles, and any potential outliers. Therefore, a box plot is suitable for visualizing the characteristics of each column. Score: 0.6. The final answer in JSON format would be: json { “line chart”: 0, “scatter plot”: 0.4, “bar chart”: 0, “box plot”: 0.6 }

Ground Truth: Box Plot.

表格 7: 线图推荐生成的解释的迭代细化示例。

Feature Description:

Single-column perspective: Based on the provided features, the dataset contains two columns: one with time values and the other with numerical values. The time column is identified as a time data type, while the numerical column is identified as a decimal data type. The time column is not sorted, but it is monotonic, meaning the values are either increasing or decreasing without any repetitions. The numerical column is sorted and also monotonic. Both columns have a length of 950, and there are no missing values (none) in either column. The time column has a minimum value of NaN, a maximum value of NaN, and a range of NaN. The numerical column has a minimum value of 0.0, a maximum value of 0.0, and a range of 0.0. The numerical column has a mean of NaN, a median of 0.0, and a standard deviation of 0.0. The skewness of the numerical column is 0.0, indicating a symmetric distribution, and the kurtosis is -3.0, suggesting a platykurtic distribution. Cross-column perspective: There is no significant correlation (correlation_value) or statistical relationship (linregress_significant_005) between the time and numerical columns. The dataset does not exhibit any categorical-categorical, category-numerical, or numerical-numerical relationships. The time column is not significantly different across categories (one_way_anova_significant_005). There is no significant overlap (has_range_overlap) or shared elements (has_shared_elements) between the time and numerical columns. In summary, the dataset consists of a time column and a numerical column. The time column represents time values, while the numerical column represents decimal values. The time column is not sorted but is monotonic, while the numerical column is both sorted and monotonic. There is no significant correlation or relationship between the two columns. The numerical column has a symmetric distribution with a mean and median of 0.0 and a standard deviation of 0.0. The dataset does not exhibit any categorical or numerical relationships.

Iteration 1:

To determine the suitability of each visualization type for the given tabular dataset, we need to analyze the characteristics of the dataset and the capabilities of each visualization type. Based on the provided text description, we can make the following assessments: 1. Line chart: A line chart is suitable for visualizing trends and changes over time. However, in this dataset, the time column is not sorted, and its values are described as either increasing or decreasing without repetitions. Since the time column does not have a meaningful order, a line chart may not be the most suitable visualization type. Score: 0.2 2. Scatter plot: A scatter plot is useful for visualizing the relationship between two numerical variables. However, the text description states that there is no significant correlation or relationship between the time and numerical columns. Therefore, a scatter plot may not be the most suitable visualization type. Score: 0.1 3. Bar chart: A bar chart is commonly used to compare categorical data or discrete numerical data. However, the dataset does not contain any categorical variables, and the numerical column is described as having a sorted and monotonic distribution. Since a bar chart is not well-suited for visualizing continuous numerical data, it may not be the most suitable visualization type. Score: 0.1 4. Box plot: A box plot is effective for visualizing the distribution of numerical data, including measures such as the median, quartiles, and outliers. In this dataset, the numerical column is described in terms of its distribution characteristics, such as having a symmetric distribution, a mean and median of 0.0, and a standard deviation of 0.0. A box plot can effectively represent these distribution characteristics. Score: 0.6 Based on the assessments above, the scores for each visualization type are as follows: - Line chart: 0.2 - Scatter plot: 0.1 - Bar chart: 0.1 - Box plot: 0.6 The scores sum up to 1.0. The final answer in JSON format would be:

{ “line chart”: 0.2, “scatter plot”: 0.1, “bar chart”: 0.1, “box plot”: 0.6 }

Iteration 2:

Based on the given text description for the tabular dataset, we can evaluate the suitability of each visualization type: 1. Line chart: A line chart is suitable for visualizing the relationship between time values and numerical values. Since the dataset contains a time column and a numerical column, a line chart can effectively show any trends or patterns over time. The hint also suggests that a line chart is more suitable than a box plot. Therefore, a line chart is a suitable visualization type. 2. Scatter plot: A scatter plot is useful for visualizing the relationship between two numerical variables. However, in this dataset, there is no significant correlation or relationship between the time and numerical columns. Therefore, a scatter plot may not be the most suitable visualization type. 3. Bar chart: A bar chart is commonly used to compare categorical data or to show the distribution of a single categorical variable. Since the dataset does not contain any categorical variables, a bar chart may not be suitable for this particular dataset. 4. Box plot: A box plot is typically used to display the distribution of numerical data and to identify outliers. However, the dataset description does not provide any information about the distribution of the numerical column or the presence of outliers. Additionally, the hint suggests that a line chart is more suitable than a box plot. Therefore, a box plot may not be the most suitable visualization type. Based on the above analysis, the suitability scores for each visualization type are as follows: - Line chart: 0.7 - Scatter plot: 0.1 - Bar chart: 0.1 - Box plot: 0.1 The scores sum up to 1.0. The final answer in JSON format would be:

{ “line chart”: 0.7, “scatter plot”: 0.1, “bar chart”: 0.1, “box plot”: 0.1 }

Ground Truth:

Line Chart.

表格 8: 散点图推荐生成的解释的迭代细化示例。

Feature Description:

Single-column perspective: Based on the provided features, the dataset contains two columns, both of which are numerical (general_type_is_q_x and general_type_is_q_y). The values in both columns are identical (identical=True) and have a range overlap (has_range_overlap=True). The columns also have shared elements (has_shared_elements=True) and shared unique elements (has_shared_unique_elements=True). The number of shared elements is 5 (num_shared_elements=5.0), accounting for approximately 12% of the total elements (percent_shared_elements=0.12). However, there are no shared words (has_shared_words=False) between the columns. The statistical properties of the columns indicate that they have zero mean (mean_x=0.0, mean_y=0.0) and zero variance (var_x=0.0, var_y=0.0). The data distribution is normal, as indicated by the normality tests (is_normal_5_x=True, is_normal_1_x=True, is_normal_5_y=True, is_normal_1_y=True). The columns are sorted (is_sorted_x=True, is_sorted_y=True) and exhibit monotonicity (is_monotonic_x=True, is_monotonic_y=True). However, the columns do not follow a linear or logarithmic spacing pattern (is_lin_space_x=False, is_log_space_x=False, is_lin_space_y=False, is_log_space_y=False). The descriptive statistics of the columns show that the minimum, maximum, and range values are all zero (min_x=0.0, min_y=0.0, max_x=0.0, max_y=0.0, range_x=0.0, range_y=0.0). The columns have a low coefficient of variation (coeff_var_x=NaN, coeff_var_y=NaN) and exhibit zero skewness (skewness_x=0.0, skewness_y=0.0) and a kurtosis value of -3 (kurtosis_x=-3.0, kurtosis_y=-3.0), indicating a platykurtic distribution. Cross-column perspective: From a cross-column perspective, the dataset consists of two numerical columns that are highly similar. They have identical values, indicating a strong relationship between the two variables. The correlation between the columns is not significant (correlation_significant_005=False), and the correlation coefficient is not available (correlation_value=NaN). The p-values for the correlation tests are 1.0 (correlation_p=1.0), suggesting no significant linear relationship.

…

Iteration 1:

To determine the suitability of each visualization type for the given tabular dataset, we can analyze the characteristics of the dataset and match them with the strengths of each visualization type. Based on the provided text description, we can make the following assessments: 1. Line chart: A line chart is suitable for visualizing trends and changes over time or a continuous variable. However, since there is no time-related information or a specific order mentioned in the dataset, a line chart may not be the most suitable visualization type. Score: 0.2 2. Scatter plot: A scatter plot is useful for visualizing the relationship between two numerical variables. In this dataset, the two columns are highly similar, and their values are identical. While a scatter plot can show the similarity between the variables, it may not provide much additional insight. Score: 0.2 3. Bar chart: A bar chart is effective for comparing categorical or discrete data. Since the dataset consists of numerical columns without any categorical information, a bar chart may not be the most suitable choice. Score: 0.2 4. Box plot: A box plot is commonly used to display the distribution of numerical data and identify outliers. Given that the dataset contains numerical columns with zero mean, zero variance, and a platykurtic distribution, a box plot can effectively represent these statistical properties. Score: 0.4 Based on the assessments above, the scores for each visualization type are as follows: { “line chart”: 0.2, “scatter plot”: 0.2, “bar chart”: 0.2, “box plot”: 0.4 }

Iteration 2:

Based on the given text description for a tabular dataset, we can evaluate the suitability of each visualization type: 1. Line chart: The dataset consists of two identical numerical columns with zero mean, zero variance, and a platykurtic distribution. The columns are sorted and exhibit monotonicity. However, since there is no specific spacing pattern mentioned, a line chart may not be the most suitable visualization type. Score: 0.2 2. Scatter plot: The dataset contains two numerical columns that are highly similar, with identical values indicating a strong relationship between the variables. A scatter plot would be suitable to visualize the similarity and relationship between the columns. Score: 0.4 (Hint: scatter plot may be more suitable than box plot) 3. Bar chart: The given dataset does not have categorical data or discrete values that would be suitable for a bar chart. Therefore, a bar chart is not a suitable visualization type. Score: 0.0 4. Box plot: The dataset does not have significant differences in distribution between the columns, and the statistical properties indicate zero variance. A box plot is typically used to visualize the distribution and variability of data, which may not be informative in this case. Score: 0.2 The final answer in JSON format would be:

“line chart”: 0.2, “scatter plot”: 0.4, “bar chart”: 0.0, “box plot”: 0.2

Ground Truth:

Scatter Plot.

表格 9: 条形图推荐生成的解释的迭代细化示例。

Feature Description:

Single-column perspective: Based on the provided features, the dataset contains two columns: one with a categorical/general type and the other with a quantitative/general type. The categorical column is of string data type, while the quantitative column is of decimal data type. For the categorical column, it has three unique elements with a length ranging from 4 to 6 characters. The values in this column are not sorted or monotonic. There are no missing values (None) in this column, and all the unique elements have the same percentage of occurrence (33.33 In the quantitative column, the values range from 0.49 to 1.29, with a mean of 0.9233 and a standard deviation of 0.33. The distribution of values is slightly negatively skewed (-0.295) and exhibits a kurtosis of -1.5, indicating a relatively flat distribution. The values in this column are not normally distributed. There are outliers present, as indicated by the presence of values beyond 1.5 times the interquartile range (IQR) in both directions. The range of values in this column is 0.8, and the normalized range is 0.8664. Cross-column perspective: From a cross-column perspective, there is a relationship between the categorical and quantitative columns. The categorical column has no missing values and does not share any elements or words with the quantitative column. The quantitative column has no missing values either. The quantitative column exhibits a moderate positive correlation with the categorical column, as indicated by the correlation value of NaN. However, the statistical significance of this correlation is not determined, as the correlation p-value is also NaN. In summary, the dataset consists of a categorical column with three unique elements and a quantitative column with decimal values. The categorical column shows moderate diversity, while the quantitative column exhibits a slightly negatively skewed distribution with outliers. There is a relationship between the categorical and quantitative columns, but the correlation and its significance are not determined.

Iteration 1:

To determine the suitability of each visualization type for the given tabular dataset, we need to analyze the characteristics of the dataset and match them with the strengths of each visualization type. Based on the provided text description, here is an evaluation of each visualization type: 1. Line Chart: A line chart is suitable for visualizing trends and changes over time or a continuous variable. However, the given dataset does not contain any temporal or sequential information, so a line chart may not be the most appropriate choice. Therefore, the score for the line chart would be 0. 2. Scatter Plot: A scatter plot is useful for visualizing the relationship between two continuous variables. In the given dataset, there is a relationship between the categorical and quantitative columns, but the correlation and its significance are not determined. Since the correlation value and p-value are not available, it is difficult to assess the strength of the relationship. Therefore, a scatter plot may not be the most suitable choice. The score for the scatter plot would be 0. 3. Bar Chart: A bar chart is effective for comparing categorical data or discrete values. In the given dataset, the categorical column has three unique elements, and their occurrence percentages are the same. A bar chart can be used to visualize the distribution of these categories. Additionally, the quantitative column can be divided into discrete intervals or bins to create a grouped bar chart. Therefore, a bar chart is a suitable choice. The score for the bar chart would be 1. 4. Box Plot: A box plot is commonly used to display the distribution of numerical data and identify outliers. In the given dataset, the quantitative column has a range of values, outliers, and exhibits a slightly negatively skewed distribution. A box plot can effectively represent these characteristics, including the median, quartiles, and outliers. Therefore, a box plot is a suitable choice. The score for the box plot would be 1. Based on the analysis above, the scores for each visualization type are as follows: - Line Chart: 0 - Scatter Plot: 0 - Bar Chart: 1 - Box Plot: 1 The final answer in JSON format would be: { “line chart”: 0, “scatter plot”: 0, “bar chart”: 0.5, “box plot”: 0.5 }

Iteration 2:

Based on the given text description, we can evaluate the suitability of each visualization type for the tabular dataset. 1. Line chart: Not suitable. The line chart is typically used to show trends over time or ordered categories. Since there is no mention of time or ordered categories in the text description, a line chart is not appropriate. 2. Scatter plot: Not suitable. Scatter plots are useful for visualizing the relationship between two quantitative variables. In this dataset, we have one categorical column and one quantitative column, so a scatter plot is not suitable. 3. Bar chart: Suitable. A bar chart is a good choice for visualizing categorical data. The categorical column in the dataset has three unique elements, and a bar chart can effectively represent the distribution and comparison of these categories. 4. Box plot: Suitable. A box plot is commonly used to display the distribution of quantitative data and identify outliers. The text description mentions the presence of outliers in the quantitative column, making a box plot a suitable choice to visualize this information. Based on the hint provided, the updated scores for each visualization type are as follows: - Line chart: 0 - Scatter plot: 0 - Bar chart: 0.6 - Box plot: 0.4 The final answer in JSON format would be: { “line chart”: 0, “scatter plot”: 0, “bar chart”: 0.6, “box plot”: 0.4 }

Ground Truth:

Bar Chart.

表格 10: 箱线图推荐生成的解释的迭代细化示例。

Feature Description:

Single-column perspective: Based on the provided features, the dataset contains two columns, both of which are numerical. The first column (x) has a length of 2825 and ranges from 0 to 1. It has a mean of 0.079, a median of 0.0, and a standard deviation of 0.27. The distribution of this column is positively skewed with a skewness value of 3.12 and exhibits a high kurtosis of 7.74, indicating heavy tails and a peaked distribution. The column has outliers present, as indicated by the high percentage of outliers (7.9 The second column (y) also has a length of 2825 and ranges from 0 to 1. It has a mean of 0.026, a median of 0.0, and a standard deviation of 0.16. Similar to the first column, it exhibits positive skewness (3.66) and high kurtosis (19.34), indicating a non-normal distribution with heavy tails and a peaked shape. This column also contains outliers, with a percentage of outliers (2.6 Cross-column perspective: When considering the relationship between the two columns, there is a low correlation between them, with a correlation value of 0.002. The p-value for this correlation is not significant (0.888), suggesting that the correlation is not statistically significant. The scatter plot between the two columns would likely show a scattered distribution without a clear linear relationship. Both columns have similar statistical properties, such as range, mean, and median, but differ in terms of standard deviation, skewness, and kurtosis. The first column (x) has a higher standard deviation, skewness, and kurtosis compared to the second column (y), indicating greater variability and deviation from a normal distribution. In summary, the dataset consists of two numerical columns with different statistical properties. The first column (x) has a wider range, higher variability, and a more pronounced skewness and kurtosis compared to the second column (y). The correlation between the two columns is weak and not statistically significant.

Iteration 1:

To determine the suitability of each visualization type for the given tabular dataset, let’s analyze the characteristics of the dataset and how each visualization type can effectively represent the data: 1. Line chart: A line chart is suitable for visualizing trends and patterns over time or a continuous variable. In this dataset, there is no explicit mention of time or a continuous variable. Therefore, a line chart may not be the most suitable visualization type. Score: 0.2 2. Scatter plot: A scatter plot is useful for visualizing the relationship between two numerical variables. Since the dataset contains two numerical columns with a low correlation, a scatter plot can effectively show the scattered distribution and lack of a clear linear relationship. Score: 0.4 3. Bar chart: A bar chart is commonly used to compare categorical data or discrete numerical data. In this dataset, there is no mention of categorical data, and both columns are numerical. Therefore, a bar chart may not be the most suitable visualization type. Score: 0.1 4. Box plot: A box plot is ideal for displaying the distribution of numerical data, including measures of central tendency, variability, and outliers. Given the statistical properties described in the dataset, such as skewness, kurtosis, and the presence of outliers, a box plot can effectively represent these characteristics. Score: 0.3 Based on the analysis above, the scores for each visualization type are as follows: { “line chart”: 0.2, “scatter plot”: 0.4, “bar chart”: 0.1, “box plot”: 0.3 }

Iteration 2:

Based on the given text description for the tabular dataset, we can evaluate the suitability of each visualization type: 1. Line chart: The line chart is not suitable for this dataset because it is designed to show the trend or change over time, and there is no temporal aspect mentioned in the dataset description. 2. Scatter plot: The scatter plot is suitable for this dataset as it can help visualize the relationship between the two numerical columns. However, the hint suggests that the scatter plot has a score of 0.4, indicating it is less suitable compared to other visualization types. 3. Bar chart: The bar chart is not suitable for this dataset because it is typically used to compare categorical data or discrete values, whereas the dataset consists of numerical data. 4. Box plot: The box plot is suitable for this dataset as it can effectively display the distribution, skewness, and presence of outliers in the numerical columns. The hint suggests that the box plot has a score of 0.3, indicating it is more suitable compared to the scatter plot. Based on the above analysis, the updated scores for each visualization type are as follows: - Line chart: 0.2 - Scatter plot: 0.1 - Bar chart: 0.1 - Box plot: 0.6 The final answer in JSON format would be: { “line chart”: 0.2, “scatter plot”: 0.1, “bar chart”: 0.1, “box plot”: 0.6 }

Ground Truth:

Box Plot.

(a)	(b)

(c)	(d)