Epigraphics: Message-Driven Infographics Authoring

Tongyu Zhou (Brown University, Providence, Rhode Island, USA; tongyu_zhou@brown.edu), Jeff Huang (Brown University, Providence, Rhode Island, USA; jeff_huang@brown.edu), and Gromit Yeuk-Yin Chan (Adobe Research, San Jose, California, USA; ychan@adobe.com)
(2024)
Abstract.

The message a designer wants to convey plays a pivotal role in directing the design of an infographic, yet most authoring workflows start with creating the visualizations or graphics first without gauging whether they fit the message. To address this gap, we propose Epigraphics, a web-based authoring system that treats an “epigraph” as the first-class object, and uses it to guide infographic asset creation, editing, and syncing. The system uses the text-based message to recommend visualizations, graphics, data filters, color palettes, and animations. It further supports between-asset interactions and fine-tuning such as recoloring, highlighting, and animation syncing that enhance the aesthetic cohesiveness of the assets. A gallery and case studies show that our system can produce infographics inspired by existing popular ones, and a task-based usability study with 10 designers shows that a text-sourced workflow can standardize content, empower users to think more about the big picture, and facilitate rapid prototyping.

infographics authoring, visual storytelling, data visualization
copyright: acmcopyright
journalyear: 2024
copyright: rightsretained
conference: Proceedings of the CHI Conference on Human Factors in Computing Systems; May 11–16, 2024; Honolulu, HI, USA
booktitle: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA
doi: 10.1145/3613904.3642172
isbn: 979-8-4007-0330-0/24/05
ccs: Human-centered computing User interface toolkits
ccs: Human-centered computing Natural language interfaces
Figure 1. To create an infographic with Epigraphics, the user types a key message they wish to convey (A), automatically receives asset recommendations based on brushed text chunks (B), adds desired assets to the canvas to manipulate and merge (C), and finally fine-tunes the assets until they are satisfied (D).

Teaser figure showcasing the user interface of Epigraphics. The left-hand panel displays a text input box and a list of generated recommendations. The middle is a canvas containing a completed infographic of a canary’s wingspans through flight. The right-hand panel is a list of existing assets on the canvas with toggles, dropdowns, and sliders to change the configurations of each asset. From this interface, there are 4 popups further explaining what the text input box, recommendations list, canvas, and configuration panels do.

1. Introduction

The infographic is an evocative, visual vignette of data. By synthesizing visualizations, illustrations, images, color, and text, it condenses datasets into digestible takeaways and stories that can be readily consumed by viewers. Unlike related media such as data videos (Amini et al., 2017), infographics usually deliver a singular message to an audience who may not have the time or background to analyze and draw their own conclusions about data. By minimizing the cognitive load required to interpret information, they motivate viewers to engage in deeper reflections about information (Clark and Lyons, 2010) and have seen usage beyond data science and design in tangential fields such as contemporary journalism (Fu and Stasko, 2023) and education (Chudá, 2007; Firat and Laramee, 2018).

However, designing infographics can be difficult, time-consuming, and unintuitive. Current workflows involve creating, arranging, and editing visualizations, graphics, and text components based on the message the designer wants to convey. Although this message is the main mechanism that drives the authoring process, most workflows start with the drawing (Xia et al., 2018) or visualization (Wang et al., 2018) first. Then, designers may fill in complementary titles, captions, or annotations afterwards to draw relationships between visual elements. This is oftentimes an iterative process where elements such as charts, illustrations, text, and overall design are created, passed on for peer feedback (Cambre et al., 2018; Kulkarni et al., 2015; Bawabe et al., 2021), then recreated multiple times to complement the message and strike a balance between information and aesthetics.

But what if the message starts and remains as a constant focus in the entire infographic design process? Authoring can then be more akin to storytelling, where the design is directly driven by the message the designer intends to convey. Given the increasing popularity of authoring systems for automated design in both research (Guo et al., 2021; Tyagi et al., 2022) and commercial products (e.g., Adobe Express (https://www.adobe.com/express/), Figma (https://www.figma.com/), Canva (https://www.canva.com/)), we believe a text-based message can be a powerful, interactive medium to automate the creation of infographic components. It can act as an anchor to retrieve appropriate features and visual representations of the data, as well as induce design themes, graphics, and highlights in the infographic. When thinking about the content and placement of the text first, the designer can transform their implicit design thinking and discovery into explicit design edits. We note that in this context, we specifically mean a message that directly conveys meaning or themes of the infographic, rather than just a text-based prompt. Focusing on this message is analogous to writing out “alt text” first; by conceding some creative liberty to generative models and allowing them to fill in the gaps of the assets, the designer can create a more cohesive data story.

While there are existing tools that offer text as a starting point for crafting infographics (Cui et al., 2020; Qian et al., 2021; Tyagi et al., 2022) and strategies for constructing natural language and visualization couplings (Srinivasan et al., 2021), they all rely on existing infographic or visualization exemplars created by a third party. Visualization creation is thus a retrieve-then-modify task rather than a generative process. The resulting infographics are derivative, meaning the designer has reduced flexibility over the originality of their creations. Thus, we propose Epigraphics, a web application where users can craft text-based key messages and brush over their constituents as sources to generate infographic assets such as chart primitives, graphics, color themes, data filters, and animations. These assets serve as recommendations for users to add to and rearrange on a canvas. Unlike data-agnostic design tools such as PowerPoint or Figma, Epigraphics always processes the text in the context of an imported dataset. The generated assets are modular; this is a deliberate design decision to 1) provide flexibility to merge them pairwise and 2) preserve creative autonomy by not providing the entire design. The components can be merged via between-asset interactions such as recoloring, highlighting, animation syncing, and injecting graphics into visualizations as glyphs. Using the system, the user can quickly produce visual elements aligned with both the message and the data, and thus focus on the composition and symbolism of their design rather than on individual element aesthetics. We assess the efficacy of the system through a gallery with case studies and a task-based usability study. The demonstrations illustrate the potential expressiveness and capabilities of the system in the hands of an experienced user, while the study reveals insights into workflow and first-time usage outcomes of novice users. Together, they reveal that a message-first authoring workflow for infographics is effective at standardizing content, does not provide a cohesive layout but promotes holistic thinking, and empowers rapid prototyping. The key contributions of this paper include:

  (1) A system that generates infographic assets based on a key message whenever the user interacts with text and supports within and between-component interactions.

  (2) Design lessons extrapolated from case studies and a user study about how a message-sourced workflow can influence design processes and outcomes.

2. Related Work

2.1. Data to Beautiful Graphics

Many systems that aid users in creating the beautiful graphics commonly found in infographics focus on binding raw data to visual representations. For example, Data-Driven Guides (Kim et al., 2017) enable designers to create guidelines to which custom shapes can be linked. These guides can then encode data into each shape’s length, area, and position and dynamically deform the drawing based on changing data. DataInk (Xia et al., 2018) adopts a similar approach of binding raw data to shapes, but instead of properties, the entire shape is treated as a singular glyph. Users can use direct manipulation to link free-form sketched glyphs to data points, which can then be used to author creative visualizations. DataQuilt (Zhang et al., 2020) further expands upon this by allowing users to create these bindings for real images such as paintings and photographs so that visualizations can adapt the appearance of collages. However, one limitation of such strategies is that the mappings themselves need to be recreated each time for new datasets. To resolve this, Charticulator (Ren et al., 2019) additionally enables the export of mapped visualizations to be reused as templates with other data.

In instances where the dataset is convoluted, animations can effectively guide the viewer’s gaze from one piece of information to another. Gemini (Kim and Heer, 2021) is one such recommendation system that allows users to write declarative grammar to produce animated transitions between related statistical graphs. Data Animator (Thompson et al., 2021) removes the need for coding altogether by automatically generating transitions between two static visualizations by matching objects and supports a visual interface for fine-tuning. Animated Vega-Lite (Zong et al., 2023) alternatively reframes animated visualizations as time-varying data queries, where the time encodings map data fields to key frames. Outside of transitions, animations in graphs and visualizations can also be decorative to draw the viewer’s attention towards them. Canis (Ge et al., 2020) is a high-level domain-specific language that allows users to select marks from charts and apply custom animations—e.g., a “wheel” effect on a donut chart—to mark units.

These prior works that tether data to either static or animated elements consider the graphic itself as a first-class object and what the user generates first in the workflow. In contrast, our work considers text as the users’ first point of entry and recommends all subsequent bindings with respect to the context of this text. We then establish text-chart primitive-graphic bindings to support a greater breadth of data expression.

2.2. Text-powered Data Stories

When coupled with existing charts in a document, text can provide invaluable context for how a data story may be directed. Systems have thus been created to utilize this text to 1) highlight, 2) reformat, or 3) generate data visualizations. To guide users to corresponding charts when they are reading through a document, Elastic Documents (Badam et al., 2019) couples data tables with text that the user focuses on using a keyword-based matching algorithm to produce on-demand visualizations for whatever the user is reading. Automatic Annotation Synchronizing (Lai et al., 2020) additionally extracts visual elements from graphs using Mask R-CNN to automatically sync these visualizations to accompanying textual descriptions, which can be focused to highlight the graph. ChartText (Pinheiro and Poco, 2022) achieves the same automatic binding using a two-stage encoding method and can be used to add interactivity to documents. Kori (Latif et al., 2022) both automatically suggests and allows users to manually generate references between text and existing graphs from a database as they type.

Alternatively, another strategy to redirect viewer attention more conspicuously is to rearrange the order of text and charts entirely. To improve the flow of data articles across the dynamic page layouts of different devices, VizFlow (Sultanum et al., 2021) establishes text-chart links and reorganizes the text and charts based on each layout. ToonNote (Kang et al., 2021) similarly reminds users of the bigger picture during data analysis by providing a toggle-able “Comic View” for computational notebooks.

In instances where visualizations may not exist to convey the exact message delivered by the text, systems may generate them instead, either with or without the presence of data. CrossData (Chen and Xia, 2022) establishes text-data connections to help users retrieve, compute, and explore tables and charts during their document writing process. Similarly, DataParticles (Cao et al., 2023) links text and data, but specifically utilizes latent connections to help users iterate on animated unit visualizations that accompany the narrative.

Generative systems such as CrossData and DataParticles are most similar to our work. However, unlike current systems that focus on the retrieval of existing charts and then modify them to fit the user’s data, we generate other data story-relevant assets such as static images and animations, and support interactions to combine them to better complement the message of the data story.

2.3. Recommendations for Infographic Creation

Since infographic authoring is often a multi-step and multi-platform process, recent recommendation systems have tried to automate or expedite various aspects of this process. For example, it is often challenging for novices to determine proper layouts and configurations for infographic assets, as different underlying semantic structures linking visual elements can lead to different stories conveyed to the user (Lu et al., 2020). To ameliorate this challenge, Infographics Wizard (Tyagi et al., 2022) relies on a semi-automatic framework to recommend visual information flow layouts, visual groups, and connecting elements between assets. Similarly, Zheng et al. (2019) proposed a fully automated approach that uses input images and keyword-based summaries of input text to suggest magazine layouts. De-Stijl (Shi et al., 2023) and InfoColorizer (Yuan et al., 2021) recommend harmonic color palettes to help novice users quickly craft design iterations.

Beyond layouts, other systems focus more on providing recommendations for the content of infographics directly. InfoNice (Wang et al., 2018) associates custom graphics with summarized data to transform unembellished charts into infographics with customized marks. ChartSpark (Xiao et al., 2023) embeds semantic context into existing charts using a text-to-image generative model. InfoMotion (Wang et al., 2021) converts static infographics into animated ones by producing a logical breakdown of components within the visualization. DataShot (Wang et al., 2019) automatically generates fact sheets and TypeDance (Xiao et al., 2024) generates typographic logos based on design priors from existing templates. Text-to-Viz (Cui et al., 2020) generates infographic content based on proportion-related statistics from statements by retrieving vector graphics from a database and arranging them according to a predefined list of 20 templates. Similarly, given a text-based prompt, Retrieve-Then-Adapt (Qian et al., 2021) retrieves existing infographics from a database and transforms the content to match user-inputted data.

While our work also emphasizes infographic content generation, we do not rely on templates and source all assets from a centralized message. This workflow can thus provide greater creative autonomy for users while still ensuring that the main idea conveyed by the resulting design is maintained.

| Infographics Category | Definition | Authoring Method | Artifact Examples | System Examples | Components |
|---|---|---|---|---|---|
| Statistical | Relies on formal or stylized diagrams, charts, and graphs to summarize or highlight data | Create either standardized, annotated, or stylized visualizations | Posters, Maps | (Kim et al., 2017; Xia et al., 2018; Yuan et al., 2021; Ren et al., 2019; Zhang et al., 2020; Wang et al., 2019; Cao et al., 2023; Pinheiro and Poco, 2022; Lai et al., 2020; Xiao et al., 2023) | Text, Visualizations, Graphics |
| Mnemonic | Relies on patterns, composition, and structure to highlight specific features and characteristics | Arrange text, symbols, or images in thematic layouts | Banners, Brochures | (Cui et al., 2020; Yuan et al., 2021; Shi et al., 2023; Tyagi et al., 2022; Qian et al., 2021) | Text, Graphics, Symbols, Colors, Layout |
| Transitive | Relies on either interactive or automatic transitions to convey sequences of events or operations | Design transitions within or between visualizations, images, and text | Slideshows | (Cao et al., 2023; Wang et al., 2021) | Text, Visualizations, Graphics |
| Directive | Relies on spatial ordering to establish logical flow or direction | Sequentially arrange text, visualizations, and images | Data comics, Instructions, How-to’s | (Kim et al., 2019; Chen and Xia, 2022; Kang et al., 2021; Sultanum et al., 2021; Latif et al., 2022; Pinheiro and Poco, 2022; Lai et al., 2020) | Text, Visualizations, Graphics, Symbols, Layout |
Table 1. A summary of the main types of infographics, how and through which system they are typically authored, and a breakdown of their components.

A table with six columns and four rows depicting the infographic category, the definition of that category, the authoring method for that category, example infographics, example authoring systems, and components that make up that category.

3. Design Space

To contextualize our proposed message-based infographic authoring paradigm, we first need to understand its existing design space. This process of distilling design goals has previously been followed for other tools that propose new interactions to create visualizations (Chen et al., 2020, 2019; Tong et al., 2023). Thus, we surveyed prior work on the authoring of visual data stories. To probe real-world usage scenarios of this design space, we also interviewed 2 design experts, one within academia and another in industry. Our derived insights are synthesized in the subsections below.

3.1. What are the different types of infographics and what do they contain?

Based on prior work on infographics usage (Siricharoen, 2013; Tarkhova et al., 2020), a breakdown of how different infographics are authored and what components they are composed of is located in Table 1. Formally, an infographic is defined as “a collection of graphic organizers that integrates different media in simple diagrams: text, images, symbols and schemas” (Siricharoen, 2013). Images can be further broken down into visualizations that summarize data or data-agnostic graphics that convey information through visual metaphors. Similarly, a schema can be decomposed into the color palette that governs the media and the layout, or visual groups (Lu et al., 2020), they are arranged in. Thus, an infographic can be considered a composition of 6 main primitive components: text, (data) visualizations, (non-data) graphics, symbols, color palettes, and layout. We further dissect the primitive components to reveal the functional components that could be provided by an authoring system. First, graphics and symbols can be derived from both (C1) raster and vector graphical formats. Then, visualizations as data charts and annotations could be derived from (C2) declarative grammars and (C3) queries on the chart data. Layouts are attributed to different (C4) templates of visual groups (Lu et al., 2020). Finally, (C5) color palettes and (C6) text can be self-contained collections of hues or fonts, respectively. An authoring system could provide these items as primitives for further combinations and editing to construct an infographic.

3.2. How can a tangible key message support the authoring of infographic components?

Given the breadth of infographic components, supporting users to create them all from scratch is not an easy task, especially when an effective infographic requires the author to be clear and succinct in what they wish to convey (Murray et al., 2017), while juggling multiple visual elements on the page. We identify an opportunity here where many components originate from the same concept: a “key message” the designers want to convey. While there are scenarios where analysts perform an initial exploration of data first, they commonly use such key messages to pass on what they want to designers. Thus, we propose a workflow that generates these components by employing an explicit singular text-based key message: an “epigraph” that alludes to the contents to come. Such an epigraph would contain rich semantic information that can be broken down into themes, sub-phrases, and individual words that could be treated separately as prompts for component primitives. Themes could suggest potential color palettes by extracting discrete words and combining them harmoniously. Sub-phrases can provide abstractions for visualizations. Individual words can be used to retrieve graphics and symbols. The key message itself or its constituents can be added as text. Note that while layout is a primitive component that has been generated in prior systems (Tyagi et al., 2022; Qian et al., 2021; Cui et al., 2020), we deliberately exclude layout and font recommendations as they usually require extrinsic input such as user style or templates that could not be systematically extrapolated from a message. Thus, we expect that a self-contained key message would provide the following components: graphics (C1), visualizations (C2, C3), color palettes (C5), and the text content (C6).

Figure 2. The complete pipeline from a text-based key message to infographic elements. It involves selecting text chunks from a key message (A), using these chunks to recommend different types of assets (B) such as visualizations, data filters, graphics, and color palettes, merging different combinations of the generated assets (C), and fine-tuning the configurations on a canvas (D).

A flowchart showing how a key message is converted to an infographic. The first node in the flowchart showcases a key message with different types of text (individual words, sub phrases, or the entire message) highlighted. Arrows from these text chunks feed into the next node, a list of icons showing different asset types. An arrow is drawn from them to a list of assets generated. From this list, two arrows are drawn–one straight to the final infographic and another to a list showing icons of the asset types added to each other, representing how assets can be combined. From this list, another arrow is drawn to the outcomes of these asset combinations. Finally, two arrows are drawn from these combinations–one back to the list of assets generated and another to the final infographic.

3.3. How can components be further composed to form varied types of infographics?

We additionally want to support ways to craft each type (Table 1) of infographic, which is determined by the purpose assigned to each component and how they interact with each other. For example, a statistical infographic may necessitate annotated or stylized visualizations; the former requires integrating text into the visualization as annotations, while the latter embeds graphics as glyphs within the visualization. Conversely, mnemonic infographics are not data-based, but rather rely on the layout of text, graphics, symbols, and color to convey their message. Transitive infographics add animations to visualizations or graphics to illustrate sequential events, while directive infographics convey the change through spatial ordering, organized either through instructional symbols or via the layout of media. In authoring these different types of infographics, the same component can serve multiple functions: text can be treated as an annotation to a visualization or as a title summarizing it. The message conveyed can also be represented as different components: it could be explicitly displayed as a titular banner or left latent as an implication from a graphic. Thus, allowing the recommended components to be combined in a unified UI allows the re-purposing of components into the diverse roles necessary for each infographic type. This UI should also allow users to manually control layout, which cannot be automated solely from a key message.

3.4. Design Goals

Our tool is motivated by the existing design space of infographics authoring and introduces a top-down approach based on the assumption that the author has a key message in mind. Specifically, it intends to achieve the following:

  • G1: Support natural ways to interact with and extract all relevant information from text towards component creation (message to component recommendations).

  • G2: Accommodate higher-level abstractions to compose components so that different categories of infographics are supported (component re-purposing and combination).

  • G3: All interactions should occur in and can be fine-tuned within a unified system to reduce context-switching (human-in-the-loop to handle components and modalities that cannot be achieved via automation in G1 + G2).

4. Epigraphics System

Our proposed tool, Epigraphics, is situated in the larger space of storytelling tools (Li et al., 2023a; Lee et al., 2015), specifically within the planning and implementation stages of the workflow. It also sits at a niche between systems that provide complete low-level fine-grain control desired by domain experts (Xia et al., 2018; Cao et al., 2023) and generative systems that completely automate the text-to-design process (Cui et al., 2020) with its AI creator and human optimizer model (Li et al., 2023a, b). It aims to empower infographic authoring through a key message-sourced approach to compose and combine component primitives. For clarity, we will use the term component to refer to the primitives that make up a general infographic and the term asset to refer to the primitives that can be extracted from the key message. The two sets mostly overlap, with some differences we will discuss in Section 4.3.

4.1. Pipeline Overview

In our system, text is the primary element users interact with to generate different assets for infographic design. The pipeline that describes this workflow is depicted in Figure 2, which uses a toy example of a user-inputted sentence about a canary’s wingspan to illustrate how assets may be recommended. Assuming that the relevant dataset is already imported, the user begins by brushing over the text with their cursor to indicate which parts of the key message they wish to use as source input. They then specify the asset type they want to see, and the system generates a ranked list of assets of that type, ordered by relevance to the source input (G1). From the list, the user can click on each asset to add it directly onto the canvas, or modify it further by combining it with other generated assets (for example, using the color palette to recolor the SVG results of the canary) (G2). Finally, they can rearrange all the assets on the same canvas, making more modifications if necessary, to produce a finalized infographic (G3).

4.2. Data Preparation and Generative Models

Recently, large language models (LLMs) have attained high levels of generality over a wide range of tasks due to their scale and attention-based architectures (Kaplan et al., 2020). This makes them ideal candidates for text-to-asset and text-to-design generation, especially if the LLM is used to produce multiple iterable segments within a master draft (Sultanum and Srinivasan, 2023). However, since these models do not actually understand the prompts they are provided with like a human would (Webson and Pavlick, 2021), prompts can be engineered to better assist the model in converting natural language instructions into desired outputs (Liu and Chilton, 2022; White et al., 2023). From CSV files containing the dataset, we extract meta-information such as column names, a high-level summary of the data, and unique categorical values to provide additional context to LLMs. For the graphics, we extract captions for each image using a Visual Question Answering (VQA) model (Antol et al., 2015) by asking “what does the image show?” We then use a family of generative models suitable for each type of resource. Specifically, GPT-3.5 (OpenAI, 2023) is utilized for text completion, Sentence-BERT (Reimers and Gurevych, 2019) for embedding extraction, BLIP (Li et al., 2022) for image caption generation, and Adobe Firefly (https://www.adobe.com/sensei/generative-ai/firefly.html) for text-to-image generation to extract color schemes.
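The metadata extraction step can be pictured with a minimal sketch. The paper does not specify how the summary is computed, so the function name `summarize_csv`, the `max_unique` cutoff for treating a column as categorical, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_csv(csv_text, max_unique=10):
    """Hypothetical helper: collect column names, the row count, and the
    unique values of non-numeric columns with few distinct values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    summary = {"columns": columns, "num_rows": len(rows), "categorical": {}}
    for col in columns:
        values = [row[col] for row in rows]
        if all(_is_number(v) for v in values):
            continue  # numeric columns are not listed as categorical
        unique = sorted(set(values))
        if len(unique) <= max_unique:
            summary["categorical"][col] = unique
    return summary


# Made-up sample data, loosely following the canary example of Figure 2.
SAMPLE = (
    "species,wingspan_cm,year\n"
    "canary,21.5,2020\n"
    "finch,24.0,2020\n"
    "canary,22.1,2021\n"
)
summary = summarize_csv(SAMPLE)
```

A dictionary like `summary` could then be serialized into the prompt so that the model sees column names and category values without receiving the full dataset.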

4.3. Recommending Infographic Components (Text-to-Asset)

Recall from Section 3.1 that a generic infographic may contain 6 primitive components: text, data visualizations, graphics, symbols, color palettes, and layout. Our system uses one of these components, text, to recommend all other components except for layout. We note that symbols are a subset of graphics (Siricharoen, 2013) that can be generated based on specific text. We further introduce an additional asset, data filters, that can be extracted from the text and used to refine and re-purpose other components such as visualizations and text. Thus, the full set of primitive assets Epigraphics supports includes (static) visualizations, data filters, static and animated graphics (including symbols), and color palettes.

4.3.1. Static Visualizations

Text could be a strong signal for what summary of the data should be displayed, and how, to convey the desired message. Assuming an appropriate dataset has been imported and the user has indicated their text of choice, we use GPT-3.5 to extract at most 5 column names that are most relevant to the user-provided text. These columns are then converted into intent grammar using Lux (Lee et al., 2021). Lux’s intent language allows partial specification based on CompassQL (Wongsuphasawat et al., 2016). Specifically, it only requires specifications for data aspects of interest (i.e., column names or data filters) and does not need inputs for visualization encodings. This reduction of columns, a common task in data exploration (Castro Fernandez et al., 2018), contracts the search space of visualization specifications so that the recommendation is less likely to depend on the chart recommendation engine. The 5 (or fewer) columns are further broken down into subsets of 2 columns (scatter plots, line charts, bar charts), 3 columns (charts with an additional colored legend), or aggregated/binned (histogram, heatmap) to generate the most common chart types. These output charts are ranked based on the number of relevant columns involved, then converted to Vega-Lite specifications (Satyanarayan et al., 2017) and rendered as SVGs for the user to choose from.
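The subset enumeration and ranking described above might look like the following sketch. It is not the system's implementation: the function name is hypothetical, the aggregated/binned case is approximated as single-column candidates, and ranking larger subsets first is one reading of "ranked based on the number of relevant columns involved".

```python
from itertools import combinations


def candidate_column_subsets(relevant_columns, max_columns=5):
    """Enumerate 1-, 2-, and 3-column subsets of the relevant columns and
    order them so that subsets using more columns come first (assumption)."""
    cols = list(relevant_columns)[:max_columns]
    candidates = []
    for size in (1, 2, 3):
        candidates.extend(combinations(cols, size))
    # list.sort is stable, so subsets of equal size keep enumeration order.
    candidates.sort(key=len, reverse=True)
    return candidates


# Hypothetical column names for illustration only.
subsets = candidate_column_subsets(["species", "wingspan_cm", "year"])
```

Each subset would then be handed to the chart recommender as a partial specification, which is why capping the input at 5 columns keeps the candidate list small (at most 25 subsets under these assumptions).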

4.3.2. Data Filters

Text could be used to indicate that the user only wants to operate on a subset of the data. In these cases, the prompts are fed into GPT-3.5 to output SQL queries for the dataset. For example, a dummy sentence such as “The Lakers vs Detroit finals in 2004 was particularly exciting” will be converted to

    SELECT * FROM df
    WHERE team_name = 'Los Angeles Lakers'
    AND opponent = 'DET' AND season = '2003-04'
    AND period = 2 AND playoffs = 1
    ORDER BY date LIMIT 10

The query is applied to the dataset and the filtered data is returned as a table for the user to interact with. The table can either be used independently to generate new visualizations, be used to highlight existing ones as an overlay, or used to generate annotations for specific data points, which we will discuss in Section 4.4.
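Applying a generated query to the dataset can be sketched with Python's built-in `sqlite3` module. The paper does not state which SQL engine is used, so the in-memory SQLite table, the helper name `apply_filter`, and the sample rows are assumptions; only the table name `df` comes from the example query.

```python
import sqlite3


def apply_filter(rows, columns, query):
    """Load rows into an in-memory table named df, run the query, and
    return the result header and rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        col_defs = ", ".join(f'"{c}"' for c in columns)
        conn.execute(f"CREATE TABLE df ({col_defs})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f"INSERT INTO df VALUES ({placeholders})", rows)
        cursor = conn.execute(query)
        header = [d[0] for d in cursor.description]
        return header, cursor.fetchall()
    finally:
        conn.close()


# Made-up rows for illustration; not taken from the paper's dataset.
COLUMNS = ["team_name", "opponent", "season", "playoffs", "date"]
ROWS = [
    ("Los Angeles Lakers", "DET", "2003-04", 1, "2004-06-06"),
    ("Los Angeles Lakers", "SAS", "2003-04", 1, "2004-05-04"),
    ("Los Angeles Lakers", "DET", "2003-04", 0, "2003-11-14"),
]
QUERY = (
    "SELECT * FROM df "
    "WHERE team_name = 'Los Angeles Lakers' "
    "AND opponent = 'DET' AND season = '2003-04' AND playoffs = 1 "
    "ORDER BY date LIMIT 10"
)
header, filtered = apply_filter(ROWS, COLUMNS, QUERY)
```

Because the query text originates from a language model, a deployment would likely want to restrict it to read-only `SELECT` statements before execution.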

4.3.3. Static & Animated Graphics

While designers traditionally look up relevant images to import into an infographic design, sourcing these assets directly from the text can streamline the process by keeping it centralized within the authoring tool. Our system relies on SVGRepo (https://www.svgrepo.com/), an open-licensed database for SVGs, as the source for its static graphics. For each image, we generate captions and extract sentence embeddings from them. Similarly, we also obtain embeddings from the user-provided text. After computing cosine similarities between the embeddings for the captions and the user input, we rank the scores and return the top 20 SVGs as recommendations for the user. Recommendations for animated graphics are returned as GIFs. GIFs are scored similarly, by averaging the pairwise cosine similarities between the user input and each frame in the animation.
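The ranking step can be sketched in a few lines, assuming the embeddings are already available as plain lists of floats. The function names and the toy two-dimensional vectors are illustrative; the embedding model itself is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_static(query_vec, caption_vecs, k=20):
    """Return the names of the k captions most similar to the query."""
    ranked = sorted(
        caption_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

def score_animation(query_vec, frame_vecs):
    """Average the per-frame similarities to score one animation."""
    return sum(cosine(query_vec, f) for f in frame_vecs) / len(frame_vecs)

captions = {"sun": [1.0, 0.0], "cloud": [0.0, 1.0], "bird": [1.0, 1.0]}
top_two = rank_static([1.0, 0.0], captions, k=2)
gif_score = score_animation([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Averaging over frames means an animation scores highly only if most of its frames match the text, rather than a single frame.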

4.3.4. Color Palettes

Finally, while the text structure itself cannot denote a comprehensive color palette, specific keywords can suggest potential colors that make up one. The user’s text of choice is broken down into such keyword fragments, each of which is fed into the text-to-image module of Adobe Firefly to produce multiple relevant images. From each image, we extract the color profile by computing a 5-bin color histogram. The histograms are then compiled into color palettes with colors sorted by luminosity.
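One possible reading of this step is sketched below. The text does not specify how the histogram bins are defined, so the coarse RGB quantization, the choice to keep the 5 most populated bins, and the weighted-sum luminance proxy are all assumptions made for illustration.

```python
from collections import Counter

def luminance(rgb):
    """Weighted-sum brightness proxy (Rec. 709 weights, an assumption)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def extract_palette(pixels, bins=5, levels=4):
    """Quantize pixels, keep the `bins` most frequent colors, sort by luminance.

    `levels` is the number of quantization steps per channel; both
    parameters are hypothetical and not taken from the paper.
    """
    step = 256 // levels

    def quantize(rgb):
        # Snap each channel to the center of its quantization cell.
        return tuple((c // step) * step + step // 2 for c in rgb)

    counts = Counter(quantize(p) for p in pixels)
    top = [color for color, _ in counts.most_common(bins)]
    return sorted(top, key=luminance)

pixels = [(250, 250, 250)] * 3 + [(10, 10, 10)] * 2 + [(200, 30, 30)]
palette = extract_palette(pixels)
```

Sorting by luminance gives a dark-to-light ordering, which is one way to make palettes from different images comparable.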

Figure 3. When the user brushes over a chunk of text, a pop-up with icons representing potential types of asset recommendations appears. After the user clicks on an icon, the asset is generated and automatically linked to the text chunk.

Two text boxes containing the same key message next to a box containing three visualizations. In the first text box, a cursor brushes over a block of text and a popup with 5 icons indicating visualization, data filter, static graphic, animated graphic, and color palette shows up. In the second text box, the cursor clicks on the visualization icon and an interactive box is formed around the block of text. The box with visualizations contains a connected scatterplot and a scatterplot, both linked to the interactive text box via a chain icon.

Configuration | Supported Asset Types
Hide/show axes & legends | static visualizations, animated visualizations, data-oriented drawings
Add animation | static visualizations
Manual recolor | visualizations, graphics, text
Change opacity | visualizations, graphics, data-oriented drawings, highlights, text
Change thickness/size | visualizations, data-oriented drawings, annotation lines of highlights, text
Change style/pattern | annotation lines of highlights, text
Change frame delay | animated visualizations, animated graphics
Table 2. A summary of all possible configurations for each asset type.

A table with two columns showing the asset configurations possible and what asset types support them.

4.4. Component Re-purposing and Combination (Between-Asset Interactions)

Once the text-based assets are generated and added to the canvas, the system provides further options to combine or refine them to accommodate the intentions each component serves in different types of infographics (Table 1). While this is not a comprehensive list of all possible asset-asset interactions, most extended combinations could be achieved based on the following pairwise combinations from the core assets. Note that these interactions are commutative operations where the order of the assets does not matter, and the outputs of each combination can additionally be combined with another (e.g. a highlighted visualization can be recolored based on a color palette or synced with an animated graphic).

4.4.1. Static Visualizations → Animated Visualizations

Certain datasets may also contain temporal attributes that can be animated to demonstrate a change over time. Based on a visualization of interest with an associated dataset, we prompt GPT-3.5 with “output the columns with time-oriented words” to extract these columns, which are then presented in a drop-down menu for the user. Once the user selects a column, we convert the unique values of that column into a set of ordered keys that define each frame in the animation. The dataset of the visualization is then filtered for each key, resulting in a GIF that loops indefinitely over each unique time-oriented column value. The user can also freely control the frame delay and thus the speed of the animation.
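The frame construction described above can be sketched as follows, assuming the dataset is a list of row dictionaries. The helper names are hypothetical, and sorting the unique values is an assumed way of obtaining the "ordered keys".

```python
def build_frames(rows, time_column):
    """Group rows into one frame per unique value of `time_column`."""
    keys = sorted({row[time_column] for row in rows})
    return [(key, [r for r in rows if r[time_column] == key]) for key in keys]

def frame_at(frames, tick):
    """Frame shown at a given tick; the modulo makes the animation loop."""
    return frames[tick % len(frames)]

rows = [
    {"t": 2, "x": 1},
    {"t": 1, "x": 5},
    {"t": 2, "x": 3},
]
frames = build_frames(rows, "t")
looped = frame_at(frames, 2)  # wraps around to the first frame
```

Each frame's rows would be rendered with the same chart specification, and the frame delay chosen by the user controls how long each rendered frame is displayed.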

4.4.2. Color Palettes + Visualizations, Graphics → Recolor

Once a desired color palette is selected from the list of recommendations, the user can click on any SVG, GIF, or visualization to map the palette onto the colors of that asset. The color palette of the visualization or graphic is extracted via color histograms in a similar way as in Section 4.3.4. Given two histograms, the system then transfers the colors by mapping them to minimize the Earth Mover’s Distance. Note that regardless of whether a visualization has a categorical, diverging, or linear color scheme, the resulting colors will preserve these properties.
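A simplified sketch of the color transfer is given below. It covers only the special case of two palettes with the same number of equally weighted colors, where minimizing the Earth Mover’s Distance reduces to finding the cheapest one-to-one assignment. The Euclidean RGB distance and the brute-force search are assumptions that are practical only for small palettes, not the system's actual procedure.

```python
import math
from itertools import permutations

def match_palettes(source, target):
    """Map each source color to a target color with minimal total distance.

    Assumes equal-length palettes of RGB tuples with uniform weights.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal-size palettes")

    def total_cost(perm):
        return sum(math.dist(source[i], target[j]) for i, j in enumerate(perm))

    best = min(permutations(range(len(target))), key=total_cost)
    return {source[i]: target[j] for i, j in enumerate(best)}

asset_colors = [(0, 0, 0), (255, 255, 255)]
palette = [(240, 240, 200), (20, 40, 20)]
mapping = match_palettes(asset_colors, palette)
```

Because dark colors are matched to dark colors and light to light, the relative ordering of a color scheme tends to survive the transfer.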

4.4.3. Graphics + Visualizations → Data-oriented Drawings

We define a data-oriented drawing (DOD) as a stylized visualization that incorporates custom imagery as glyphs, similar to the outputs of Charticulator (Ren et al., 2019) or DataQuilt (Zhang et al., 2020). To create a DOD, the user can select any existing visualization on the canvas with a categorical colored legend. Then, after the user selects which images should replace each legend value from the list of recommended graphics, the marks of the visualization are automatically replaced with the chosen glyphs. This combination works on scatterplots, bar charts, and line charts with markers.

4.4.4. Data Filters + Visualizations → Highlighted Visualizations + Annotations

Given a data filter, there are two ways to modify a visualization. If the visualization is a result of aggregated data, the result is the same visualization recomputed from the filtered subset of the data. If the data has not been aggregated, the result is a selection of the current encoding presented as an overlay. We additionally provide annotation-like lines that contain the initial text chunk that prompted the highlight and point to the filtered data in the visualization.

4.4.5. Animated Visualizations + Animated Graphics → Sync

When animated visualizations are added to the canvas in conjunction with animated graphics, the user may wish to sync the animations to create greater unity within the infographic. Syncing is direct when both have the same number of frames. If they do not, we trim either the animated graphic or the animated visualization so that the frame count of one is a multiple of the other. Then, we sync their timings by mapping the frame delays of the animated visualization to the graphic and resetting both animations to start simultaneously.
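The trimming and timing step can be illustrated with a small sketch. The text leaves the exact rule open, so two assumptions are made here: the longer animation is trimmed down to the largest multiple of the shorter one, and the graphic's frame delay is scaled so that both loops last equally long. The function names are hypothetical.

```python
def sync_frame_counts(n_vis, n_gfx):
    """Trim the longer animation to a multiple of the shorter one.

    Returns the (visualization, graphic) frame counts after trimming.
    Both inputs must be positive.
    """
    shorter, longer = sorted((n_vis, n_gfx))
    trimmed = (longer // shorter) * shorter
    return (n_vis, trimmed) if n_vis <= n_gfx else (trimmed, n_gfx)

def graphic_delay(vis_delay_ms, n_vis, n_gfx):
    """Delay per graphic frame so both loops have the same duration."""
    return vis_delay_ms * n_vis / n_gfx

vis_frames, gfx_frames = sync_frame_counts(24, 50)
delay = graphic_delay(100, vis_frames, gfx_frames)
```

With 24 visualization frames at 100 ms each and a 50-frame graphic, the graphic is trimmed to 48 frames and plays at 50 ms per frame, so two graphic frames elapse for every visualization frame.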

Figure 4. A gallery of infographics created using Epigraphics with the corresponding epigraphs used to generate them. A (McCrorie et al., 2016), B (Fox et al., 2016), C (Quealy and Sanger-katz, 2016), and D (Lutz, 2014) showcase recreations inspired by existing infographics, while E and F are originals based on open-source datasets. G (Zuñiga, 2017) is also a recreation comparing what can be produced using our system (top) versus a traditional approach (bottom) with labels explaining their workflows for each asset type.

A gallery of infographics arranged in 2 rows of 4 with the epigraphs used to create them on top. In order: A) A stylized bar graph with cities on top of each bar and a gradient of blue in the background from light (top) to dark (bottom). B) A scatterplot overlaid on top of a basketball court, with three points annotated. C) Icons of food items representing each data point of a scatterplot. Icons representing quinoa and granola bars are highlighted. D) A connected scatterplot overlaid on top of a graphic of a canary. E) A mirrored bar chart with icons representing each genre at the tips of the bars. Graphics of a record player and Spotify icon are above this chart. F) Icons representing weather conditions are highlighted and match the position of the sun in the background. G) Two similar infographics of four pie charts, one of which is contained within an avocado.

4.5. Authoring Interface

The authoring interface, depicted in Figure 1, is a web application built using Node.js and Next.js. API calls for text-sourced recommendations are sent to a Flask backend. Before interacting with the interface, the user can either select a preset dataset provided using the dropdown or upload their own CSV to explore that data.

4.5.1. Text Input Panel

The text input panel (Figure 1A) is a rich editor where the user can enter the key message of the infographic or any other text from which they wish to source asset recommendations. Initiating an infographic asset recommendation involves using the cursor to brush over a phrase of interest to select it (Figure 3). This brings up a panel containing icons for the asset types available for recommendation (visualization, data filter, static graphic, animated graphic, and color palette, from Section 4.3). Clicking on an icon sends an API call to fetch the corresponding assets, which are then appended to the bottom of the recommendations list. The selected text chunk is automatically linked to the generated list of assets, represented as an interactive box around that chunk in the text editor. Users can then click on the box to filter and find the corresponding assets easily when the recommendation list grows longer.

4.5.2. Recommendations List

The recommendations list (Figure 1B) contains a scrollable history of asset recommendations requested by the user, labelled with the text they are sourced from. For visualization and graphic recommendations, clicking on any element will generate an SVG (or GIF if animated) instance of that asset on the canvas. Data filters are represented as selectable tables and toggling a table row while a visualization on the canvas is selected will create annotated highlights over that visualization. Clicking on a color palette with a visualization or graphic on the canvas selected will recolor that asset based on the palette color schemes. There is also a list of tabs above the recommendations list, which the users can use to filter the list based on their desired asset category.

4.5.3. Canvas

The canvas (Figure 1C) is a space for users to freely manipulate the assets, make final adjustments, and author aesthetic layouts if desired. It supports basic editing functionalities like the ability to add text, change text font, change color, undo/redo, change asset opacity, lock assets, move the depth of assets forward/backward, and download. Users can also move, rotate, and rescale added assets using direct manipulation. Since animations start as soon as they are added to the canvas, we also include an additional function to reset all the animation timings so that they can start at the same time if desired.

4.5.4. Layers System + Configurations

Additional fine-tuning of the added assets is possible in the layers system (Figure 1D). A summary of all possible configurations can be found in Table 2. For example, users can toggle the visibility of visualization properties such as axes and legends, as well as select the time-oriented column to animate over for animated visualizations. All assets can be recolored by manually mapping one color in the asset to another. Annotation lines can be adjusted for thickness, color, scale, and start/end head patterns. Text anchor positions to the annotation lines can be adjusted. Frame rates for animations can also be adjusted.

5. Gallery and Case Studies

Epigraphics aims to facilitate the rapid, focused creation of message-based infographics. Thus, it should support users in both creating an infographic from scratch and recreating the ideas of existing ones given a meaningful message. We created a gallery to demonstrate the expressiveness of the system in accommodating these goals (Figure 4). Five of these examples are inspired replications (A (McCrorie et al., 2016), B (Fox et al., 2016), C (Quealy and Sanger-katz, 2016), D (Lutz, 2014), G (Zuñiga, 2017)) that come from news articles, posters, online blogs, and other storytelling media that contain rich graphics, visualizations, and textual descriptions. Two of these (E, F) are original infographics where we came up with our own messages after exploring public datasets.

To illustrate the workflow and highlight the capabilities, strengths, and limitations of our system, we also present two case studies. Case Study 1 is a walk-through for producing infographic D, showing the mechanism that a bird uses to fly, while Case Study 2 directly compares workflows with and without Epigraphics for infographic G. The workflow without Epigraphics was completed using Vega-Lite to manually compose the visualization, SVGRepo to source graphics, and Illustrator to combine them. This ensures that the assets in both workflows are the same to minimize confounding factors in the comparison.

5.1. Deconstructing the Flight of a Canary (D)

After importing the dataset into the system, we entered the key message (Figure 4D Top) into the text panel. Inspired by the original infographic (Lutz, 2014), we noted that we needed an animated graphic of a canary and a visualization of the wing positions animated over each time frame overlaid on top, and looked for opportunities to extract this information from the text (G1). For the graphic, we brushed over the text “canary flapping its wings” to generate potential GIF candidates for the animated canary, and selected a yellow bird flapping its wings to add to the canvas. Next, we brushed over the text “wings based on traced body positions” to view potential visualization candidates. The returned 20 options included a mixture of scatter plots, bar charts, and line charts for different data column values such as x position, y position, time frame, wing type, and wing stroke direction. We tried to narrow down these options by adding “as an animated line graph,” but this did not work. Instead, we found that adding semantically meaningful keywords such as “over time” was more effective in trimming the recommendations by reducing the number of columns returned. That is, the tool made it particularly easy to generate a breadth of exploratory assets that can be filtered through a more refined qualitative message, but proved less capable of executing the explicit, technical commands that an AI chatbot would normally accept.

After scrolling through the options, a connected scatter plot that plotted x against y position with time frame as the legend was deemed the most appropriate. To really illustrate how the wingspans evolved as the bird flapped its wings (G2), we added animation in the configurations panel, which brings up a drop-down of time-related columns in the visualization. In this case, there was only one option, the time frame. By selecting this, the static visualization was automatically converted to an animation overlay where the connected line of the scatter plot cycles through the x and y positions at each time frame. Next, we wanted to modify the scatter point colors to better match that of the bird. To do this, we brushed over “canary,” which returned an array of palettes, one of which was a collection of yellows that we applied to the visualization. We then synced the animations of both the canary and the visualization. For final touches (G3), we added the original key message as a header, a title, and modified their fonts to complete the infographic. Here, we note that although the visualizations and graphics could be modified for color and opacity, it was not possible to change their overall style (i.e. make it painterly, like a charcoal sketch, or holographic-like as in the original infographic (Lutz, 2014)). Future work that expands upon canvas capabilities or incorporates Epigraphics as a plugin into existing graphical editors can expand the diversity of its visual outcomes.

5.2. Unraveling Statistics for Vegetarianism (G)

In both workflows, the key message to be conveyed about what one should know about vegetarianism was made prominent: in Epigraphics, it was always displayed in the text panel, while in the traditional workflow, it was explicitly written as a large banner on the working canvas of Illustrator. To generate the desired visualizations, Epigraphics adopts an exploratory approach similar to Case Study 1, where relevant pie charts were picked from a collection of other chart types that combined the most relevant dataset columns after brushing over text chunks. For graphics, specifically the image of the avocado which was inspired by the original infographic (Zuñiga, 2017), Epigraphics generated options for vegetables after we brushed over “vegetarianism.” However, an avocado is not a vegetable, so we had to modify the text panel and type “avocado” directly. To color the pie charts and avocado, we brushed over “vegetarianism” and applied an earthy green color palette to all the assets.

Conversely, for the traditional approach, we first inspected the dataset to see how it could be plotted. Then, we decided to create pie charts and wrote scripts to sum the data rows and create Vega-Lite grammars. The visualization was exported as an SVG and imported into Illustrator. For the graphic, we directly looked up “avocado” on SVGRepo, and selected an avocado image we desired, downloaded it, and added it to Illustrator. However, recoloring each element on the canvas was more tedious. First, we thought about what a “vegetarian color palette” meant. Then, we selected each asset, which had its own color scheme, and manually remapped it using the ‘recolor’ functionality on Illustrator. Some assets were recolored more than once to balance the overall visual coherence.

Given the replication task, the components of Epigraphics did not necessarily influence what was produced in the end, but rather improved the efficiency of the workflow by reducing context-switching, as all operations could be performed in one system. Epigraphics also helped reason about what potential chart types, images, and color palettes were possible by providing options sourced from the message the user intends to convey, whereas that decision-making process was left entirely to the user in the traditional workflow. However, this can backfire when the system’s reasoning leads to undesired results, as in the “vegetarianism” versus “avocado” instance, although rewriting the text can quickly resolve this issue.

6. Usability Study

To investigate how Epigraphics’s text-first approach may direct, or redirect, designers’ mental models, we also analyzed 1) the intermediate and final visual artifacts created by the users and 2) their interaction patterns and verbalized thought processes during the authoring process. We first ran pilot studies with two participants. They were given a dataset and asked to create infographics with the system using their own text-based messages or sketches. During this process, they expressed that it was difficult to come up with either one after just looking at a table because it was hard to gain insights from the data or come up with a story to tell. In fact, they treated the system as a data analytics system and spent most of their time exploring the data using the recommended visualizations. To avoid this deviation from the intended purpose, we provided fixed messages for all participants in the final user study so that they could focus on the authoring experience instead of learning the datasets.

6.1. Participants

Ten users (6 female, 4 male), recruited via snowball sampling, participated in the usability study. They ranged from 19 to 56 years old (μ=28.6, σ=10.3) and had varying levels of design (2 beginner, 2 novice, 3 intermediate, 2 advanced, 1 expert) and programming (1 novice, 2 intermediate, 5 advanced, 2 expert) expertise. Half of them had never read or written articles that discuss data, while the other half interacted with data articles regularly. All but one had created visualizations before. However, only two participants frequently created infographics, while the rest reported designing them rarely. When they did, they cited using software such as Microsoft Excel, Figma, Adobe Illustrator, Adobe Express, and Canva to create them.

6.2. Study Protocol

Participants were randomly divided into two groups that were given two different public domain datasets pulled from Kaggle. The first dataset contains weather conditions over a day in Leeds, England (https://www.kaggle.com/datasets/muthuj7/weather-dataset), while the second dataset contains the 2021 top 50 Spotify songs and their acoustic properties (https://www.kaggle.com/datasets/equinxx/spotify-top-50-songs-in-2021). All other conditions between the groups were kept identical. The study lasted a total of 60 minutes, and can be broken down into the following components:

6.2.1. Introduction (10 minutes)

Participants filled out a preliminary questionnaire about their prior expertise in design and programming, how often they read/write articles with data, create visualizations, and create infographics. They were then instructed to pretend to be a visualization graphic designer who was tasked with creating an infographic based on a single text-based message from a client. For inspiration, the participant was also shown three example infographics in case they did not have much prior experience viewing or creating them.

6.2.2. Sketching Task (10 minutes)

Based on the text-based message, the participants were asked to sketch out their ideas for their infographic on a digital canvas. The purpose of this task was to understand their immediate first impressions of what they wanted the infographic to look like after reading the message. In this process, they were reminded that an infographic can include text, images, animations, and visualizations, but that they were free to use (or not use) these elements in any combination they desired, and they were encouraged to refer back to the message at any point during this process.

6.2.3. Infographic Creation Task (25 minutes)

Participants were walked through the Epigraphics system via two demo videos that showcased how to create an infographic from text with two different datasets. They also demonstrated the capabilities of the system, including the types of assets and the between-asset interactions that could be generated. The participants were encouraged to ask any questions they may have at this time. After the walk-through, participants were instructed to navigate to the Epigraphics interface and create their own infographic using the tool. The facilitator reminded them that they could either copy and paste the message the client gave them earlier directly or enter their own into the text panel. They were also encouraged to modify the message freely or use their initial sketch (or not) to achieve the vision they wanted. During this process, they were instructed to think aloud and ask if they had any questions. This task was considered finished whenever the participant felt that the infographic fulfilled the message or until 25 minutes were up.

A stacked bar chart showing the proportion of asset generation vs canvas interaction clicks for each participant. Asset generation accounts for far fewer clicks than canvas interaction for all participants.

Figure 5. The distribution of the number of mouse clicks spent on asset generation and interacting with the canvas for each participant. All participants spent the bulk of their clicks on manipulating assets on the canvas.

Figure 6. A mapping of asset types that participants initiated recommendations for over time. Each dot is an instance where a participant requested a specific asset type using the system. Each light green rectangle spans half of the allotted time, 12.5 minutes, and is used to visually highlight when a majority of recommendations for that asset type occurred. Most participants focused on generating visualizations during the first half (50% of asset interactions), and graphics in the later half (70% of asset interactions).

A scatter of what asset types were recommended at each minute. Points are color coded by the participant. There are shaded regions behind the points from minute 0-12.5 for visualization and from minute 12.5-25 for static and animated graphics.

Figure 7. System usability scores for individual features and creativity support index scores obtained from 10 participants. Participants found the visualization generation and graphic generation features to be the most useful. Visualization generation, graphic generation, and color palette / recolor were the easiest to learn and use. Overall, all participants agreed that the end result was worth the effort.

Stacked bar charts showing the 5 point Likert scale ratings participants made for creativity support as well as the usefulness and easiness to learn / use for each asset type.

6.2.4. Post-Survey & Interview (25 minutes)

At the conclusion of the tasks, participants filled out a post-survey about the usability of the individual features based on a subset of the SUS scale (Brooke, 1996), perceived similarity between their sketch and final infographic, overall satisfaction, and feelings of creativity support based on the Creativity Support Index (Cherry and Latulipe, 2014). They also participated in two semi-structured interviews. The first probed their thoughts on their authoring process, impressions of the system compared to other ones for infographic creation, feelings on the level of automation, use cases for the system, and the natural-language centered workflow. The second asked them to retroactively reflect on their asset generation choices with respect to their sketch and how their mental models evolved from beginning to end.

7. Results

7.1. Personas Induced by Workflow Patterns

The distribution of how participants used mouse clicks is depicted in Figure 5. While participants ranged quite widely in the number of total clicks they made (μ=242.7, σ=56.5), all of them spent less than a quarter of these clicks on generating assets (μ=13.2%, σ=5.7%), and the rest on arranging assets on the canvas. One participant spent as few as 9 clicks total (4.7%) producing the visual components they desired and devoted the rest of their interactions to fine-tuning the layout, sizing, and captions. The comparatively fewer clicks required to obtain assets surprised some participants, who compared this to the tediousness of their prior workflows having to collect assets from different sources, which “really impedes creativity because when you’re doing something creative, you get into the zone, but then it’s like, oh wait, I need a cloud and then you spend the next 5–10 minutes elsewhere trying to find the cloud” (P9). The mental workload of crafting the infographic is shifted to the design instead. From the asset generation clicks, we also broke down which types of assets participants wanted over time, depicted in Figure 6. Many participants spent the majority of their clicks on visualizations (50%) in the first 12.5 minutes and wanted static and animated graphics (70%) in the latter half. In contrast, there were two small clusters where color palettes were more desirable, namely between the 5 and 10 minute marks, immediately after visualizations, and after the 20 minute mark, once most of the other visual elements had been added. This makes sense in context, as participants would want to change the color scheme of assets after they have been imported. These patterns align with those found in traditional editing experiences, indicating the presence of legacy bias (Morris et al., 2014) and suggesting that no steeper learning curve is introduced by disrupting old editing orders.

From our interviews, we also summarized the participants’ verbalized mental models on these interaction patterns into two personas: 1) “confident users” who, either from their prior sketch or background knowledge, knew exactly what they wanted to depict and 2) “exploratory users” who did not and used the recommendations for ideation. The former group approached the workflow from a functional perspective, stating “I tackled the largest and most important element first, which would be the visualization” (P7), while the latter felt that there were more options to explore for visualizations and the graphics would be dependent on them. The “confident” group also devoted their time to looking for the exact assets that matched their sketch; some succeeded easily (P5, P7), while others had to re-calibrate and find alternative options (P9, P10). Conversely, the “exploratory” group used the key message more as a guideline for their exploration, specifically using the message to group assets into visual sets. For example, visualizations and graphics relating to the same sentence in the message would be arranged spatially together on the canvas.

Figure 8. Overview of the participant authoring process. (A) A heat map overlaid on the messages provided indicates that there are trends in how users brush over text for different types of asset recommendations. (B) Sketches from the participants before using our system show divergent content and layouts. (C) Participants selected similar assets for their infographic. (D) The final infographic shows convergent content but divergent layouts.

A 4x2 table. The 4 columns show the key message, the participant sketch, selected assets, and the final infographic. The message column contains the message itself and heatmap underlines showing what the participants brushed over, color coded by asset types. The sketch column contains different participant sketches evenly spaced out. The selected assets column contains individual assets pulled from the final infographics arranged by similar type. The final infographic column shows the final infographics evenly spaced out. The 2 rows show each different message provided to the participants with the corresponding sketches, selected assets, and final infographics.

7.2. Effectiveness of Asset Recommendations and Interactions

A summary of the usability scores for each system feature and overall feelings of creativity can be found in Figure 7. Participants agreed that the visualization and graphic recommendations are the most useful and easiest to use because they are the core components that make up the infographic. They found the data filter and color palette recommendations were comparatively less useful because the former is circumstantial and the latter only improves aesthetics and style, which reinforces the message but is not necessarily central to the message. Animation creation/sync and visualization-graphic merge functions are similarly useful, but harder to learn and use. Their use cases are more niche. Animation requires that the specific dataset has a time-oriented column to be animated over and visualization-graphic merge requires a visualization that has a legend with labels replaceable by visual substitutions. However, in the situations where they can be used, participants recognized that they could make the infographic more engaging. For example, P9 initially felt like they were “trying to find a use for animation,” but upon more introspection about the dataset and the message, they “could imagine I could do something with animation sync. Like moving through the 24 hours of the GIF and matching that to the day night cycle of the visualization. And I feel like if I had the time to do that, that would actually be really cool and really unique.”

Both the novices and experts agreed that the system allowed them to create designs without tedious interactions, was engaging, and allowed them to be expressive. All participants (5 agree, 5 strongly agree) felt that the resulting design they were able to produce was worth the effort it took to produce it. Comparing this workflow against existing workflows they would have used to create infographics, all participants stated that this system was faster. Specifically, P8 said, “Usually, I would probably use either Figma or Photoshop to lay down the layout. But I would have to leave the program to find graphics or go to Illustrator to make my own graphics. Here, I feel like I can make one that’s decent without having to leave the program.” In addition to the convenience of a centralized tool, the automatic recommendation of assets and suggestions to integrate them removes some of the barrier of “data science knowledge” (P5) it takes to manually decide which bits of information to prioritize. Instead of the designer tunnel-visioning with existing workflows because “you have to have something very specific in mind before you make it” (P4), Epigraphics allowed participants to think about infographic composition more holistically (P9).

Figure 9. Before and after of two redesigns of participant (P1, P2) infographics by maintaining content and re-organizing layout. Modifying the background to enhance contrast, enlarging the title, rotating the logos, changing fonts, and grouping the plots more tightly can enhance visual cohesion.

Four infographics with arrows pointed from the first one to the second one and from the third one to the fourth one. The first infographic features two overlapping scatterplots on the top half and a small title. The second infographic contains the same overlapping scatterplots now enlarged to span the entire page, a larger title, and modified fonts. The third infographic features a small title on top, a green bar chart on the middle top, an annotated scatterplot on the bottom left, two images stacked on top of each other on the bottom right, and a background of the Spotify logo. The fourth infographic features a large title, the bar chart on the top left, the annotated scatterplot on the bottom right with the two images cropped as a circle next to each annotation, and a dark background with the Spotify logo rotated 45 degrees.

7.3. Comparing Fidelity to Message between Sketch and Infographic

All participants felt that their final infographics aligned with the client-provided message. Figure 8 depicts an overview of this message (A), their intermediate sketch (B), final infographic (D), and commonalities shared between the infographics (C) for the two datasets. Unprompted, the participants still displayed similar patterns in what text they brushed for what types of assets. For example, qualitative, more general statements such as “change in weather and sky conditions throughout the day” and “trends in length and genres” were used for visualizations, while specific sentence fragments such as “temperature goes above 15 degrees C” and “how songs by Coldplay or Green Day performed” were used for data filters. Graphics and color palettes concentrated on one or two keywords such as “sky conditions,” “temperature,” and “Spotify” for color palettes and “sky,” “Coldplay,” and “Green Day” for graphics. This resulted in similarities in the recommended and ultimately used assets in the infographic (Figure 8C). For the weather dataset, all participants used scatterplots of temperature versus hour or overlaid that on top of a humidity versus hour plot to depict the message “throughout the day.” To indicate specific temperatures, two participants shared the idea of using arrows as the symbol. Similarly, for the Spotify dataset, four participants added the Spotify logo and all of them appropriately highlighted data points referring to Coldplay and Green Day songs. In contrast, since the sketches (Figure 8B) were free-form, they displayed visually more disparate contents and layouts. In addition to more variety in chart types, each sketch also had different focuses; for example, some wanted to emphasize individual data points (P1, P3), others wanted to showcase general trends (P2, P5, P6, P8, P9), while the rest wanted to achieve a combination of both (P4, P7, P10).

When asked to reflect upon the content differences, disregarding aesthetic polish, between their sketches and final infographics, eight of the ten preferred the infographic, and stated that the latter was more comprehensive in presenting the information requested from the message. For example, P1 felt that in retrospect, their sketch did not sufficiently convey the humidity and temperature changes through the day since they just drew three boxes of humidity and temperature at three specific time stamps. Similarly, two of the five Spotify sketches failed to mention Coldplay or Green Day at all, despite their emphasis in the key message. Multiple participants mentioned that the physical action of brushing repeatedly over text encouraged them to “cover all the bases” (P2), whereas this was less reinforced by just looking at the key message in the sketch task. For visualizations, since the brushed text is used under the hood as a query to reduce the number of columns into similar subsets for different users, all the participants ended up choosing from a “standardized” set of charts that shared similar axes and annotations. Similarly, graphics and color palettes were also standardized because the text-to-graphic and text-to-color recommendations provided unified representations for user intent and translated them to assets. For example, participants understood they had to reference Spotify somewhere in their infographic; two wrote the word “Spotify” on their sketches. This intent was physically translated to variations of the Spotify logo across all the infographics and reflected in the green/black color palettes used.
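The column-reduction step mentioned above can be illustrated with a minimal sketch. This is not the system's actual implementation, which is not detailed in this section. The sketch assumes a plain token-overlap score as a stand-in for whatever semantic matching the system performs. The function names (`tokenize`, `rank_columns`), the `top_k` parameter, and the example column names are all hypothetical.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_columns(brushed_text: str, columns: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k columns that share at least one token with the brushed text.

    Token overlap is an assumed stand-in for semantic similarity.
    Ties are broken by the original column order.
    """
    query = tokenize(brushed_text)
    scored = []
    for position, name in enumerate(columns):
        overlap = len(query & tokenize(name))
        if overlap > 0:
            scored.append((-overlap, position, name))
    scored.sort()
    return [name for _, _, name in scored[:top_k]]


# Hypothetical weather columns; the real dataset schema is not given in the text.
weather_columns = ["hour", "temperature_c", "humidity", "sky_condition", "wind_speed"]
subset = rank_columns("temperature goes above 15 degrees C", weather_columns)
print(subset)
```

Because every user who brushes the same phrase issues the same query, they all receive the same column subset, which is one plausible reading of why the resulting charts shared similar axes.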

Overall, participants felt that using brushing over natural language to explore intent was a “cool [way] to automatically generate assets” (P7). They pointed out that this was especially true for people “who are not good at math and statistics” (P4) because they could verbally describe how they want the visualizations to combine with images, colors, or highlights, but may not necessarily have the prior knowledge to manually manipulate assets. However, some participants (P2, P5, P7, P9, P10, etc.) did want more extensive customization capabilities and alternate style recommendations for the assets after they were added to the canvas, such as deviating away from the flatness of the visualizations and static graphics to a “watercolour style” (P9) or expanding the visualization to adopt more unconventional compositions (P1, P10). They expressed desires to achieve these results via text, as “I have to write everything down when I’m brainstorming, I write things. I don’t draw things” (P9).

7.4. Reflection on Final Infographic Quality

The final infographics created by the participants are located in Figure 8D. In comparison to the example ones in the gallery of Figure 4, we note that they are less visually appealing. This disparity may be partially attributed to the differences in key message intent. For example, specifically comparing the user infographics for the Spotify dataset against Figure 4E, which was constructed from the same dataset, we note that the key messages in the former consisted of more exploratory tasks. The participant key message wanted 1) trends in length and genres, 2) whether more “dance-able” songs are shorter, and 3) songs by Green Day or Coldplay, whereas the gallery key message wanted to showcase 1) indie songs have the highest energy and 2) hip hop songs are the most dance-able. This effect was intended as we wanted the participants to fully explore the system functionalities. As a side-effect, participants were more focused on using the recommendations to effectively identify trends or to answer the dance-ability question, and less focused on asset layout. However, the system was ultimately able to support users in adding the appropriate infographic components that addressed the message onto the canvas. From this point, we then argue that improving the appearance of the final infographic through asset layout once all the components are on the canvas does not take many extra steps. Figure 9 demonstrates how assets within two of the participant infographics can be re-arranged to generate a more visually effective infographic with the same content.

But why didn’t the participants perform these steps? While Epigraphics generates infographic components, it does not necessarily provide message-sourced visual groupings or font recommendations for components. Thus, to author more visually appealing final infographics, participants may require more background in layout design. Supporting this “visual impressiveness” is a trade-off between automation and freedom of interaction; while novices would rely on automation more to generate conventionally appealing designs, experts would prefer more flexibility of expression. Our work does not aim to replace designers’ expertise, but rather to provide them with tools to extend their expertise. Future work could balance automation and interaction more during this final curation step to minimize the visual disparities of infographic outcomes.

8. Discussion

Although the notion of a text-based workflow took some participants (P1, P6) time to adapt to, all users agreed that they would use this workflow in the future. P5 specifically noted that the system reminded them that “language allows users to do stuff that’s uncommon and UI allows users to do stuff that’s common.” As Epigraphics is not necessarily the final form of text-powered authoring tools for data storytelling, we further summarize our findings about how natural-language sourced recommendations can support infographic content creation as design lessons and derive takeaways on core components of an effective key message below.

8.1. Text as a First-Class Object Effectively Standardizes Content and Mental Models

When participants were provided the same text-based message, their output sketches were visually divergent. In contrast, when they used text brushing in Epigraphics, commonly brushing over the same chunks, their final infographics were more visually cohesive. While the contents were not identical due to personal customization afterwards, the semantic information conveyed was similar. Participants overlapped in the chart axes used, color families (more noticeable for the Spotify infographics), and images selected. This indicates that a ‘text as first-class object’ paradigm has a standardizing effect over content authoring as keywords in the text 1) reduced dataset columns into semantically related subsets for visualizations and data filters and 2) attributed physical representations to implicit intent for color palettes and graphics. Because the physical action of text brushing also incited users to critically think about and identify what words could be best used for what asset types, it also provided some standardization over their mental models as they started to form mental mappings between text and asset type. From these interactions, the assets added are also guaranteed to be relevant to the source message. This adherence to a focused message implies that the resultant infographic is necessarily a comprehensive one (Dunlap and Lowenthal, 2016; Hernandez-Sanchez et al., 2021; Martin et al., 2019; Murray et al., 2017).

One question that naturally arises, then, is whether standardization restrains creativity. From the creativity support index scores in Figure 7, we see that most of the participants still agreed that the system allowed them to be very expressive. One participant specifically said, “Although it was a bit constraining, because there was such a variety with what the tool gave you within those constraints, I feel like it gave me things I would have never thought of.” Other participants (P2, P5, P9), when reflecting on each type of asset generation during the post-survey, continued to talk unprompted about new designs they wanted to test by combining the assets they already had. The irony is that constraints within standardization made users more creative because they encouraged users to think outside the box, and this resulted in more personally interesting outcomes. These sentiments align with prior studies that found that design constraints (Stokes, 2001), a set of boundaries set to a creative task, can stimulate creativity rather than suppress it (Caniëls and Rietzschel, 2015; Rietzschel et al., 2014). Thus, we argue that text-sourced standardization of creative content is such a design constraint that can be incorporated into interactive tools beyond those for infographic authoring and can have a positive effect on practiced creativity.

8.2. Message-based Content Recommendations Empower Big Picture Thinking

Despite the message-based approach being conducive to standardizing content, the layouts of the final participant infographics remained varied. Although some (P2) grouped assets based on similar semantics of the message they were sourced from, how this grouping occurred was unstructured. For example, participants may place one visualization in the center of the page and surround it with images or stack two visualizations side-by-side either horizontally or vertically with the title either on the top left, top middle, or top right. Some used the recommended highlight functionality to emphasize data points while others manually added an arrow symbol to achieve the same effect. Depending on how they were placed, the same graphics served different functions: as backgrounds, decorative accents for the title, attention drawers to highlighted data points, etc. The amount of deliberation in these different decisions is reflected in the large proportion of clicks during the authoring task that was devoted to canvas interactions. But what does flexibility in layout combined with constraint in content mean? According to P9, this dichotomy indicates, “I was more focused on design. I think it meshes very naturally into the creative process because you’re just focused on the big picture. Things like composition, like symbolism in the visuals.” Attention is instead shifted away from the specificity of individual components to the design as a holistic view. This helped some participants avoid tunnel visioning, which occurred during the sketching task, and helped others re-calibrate their expectations of how they wanted to present the message.

While systems that can effectively automate layout exist (Schrier et al., 2008; Tabata et al., 2019; Guo et al., 2021), we argue that maintaining complete user autonomy over design layout has benefits, especially if the asset generation process is already automated. Particularly for personalized designs, active thinking about design alternatives allows users to avoid a linear design process (Swearngin et al., 2020) and create non-derivative motifs. More importantly, combining the standardization of assets with flexibility in layout supports concurrent convergent and divergent thinking, both of which necessarily occur “cycling repeatedly” (Vidal, 2010) in the creative process (Goldschmidt, 2016; Perkins, 1992; Fricke, 1996). Perkins (Perkins, 1992) makes this even more explicit, stating that “inventive people are mode shifters” between convergent and divergent thinking; tools that incorporate both into the workflow can thus more effectively facilitate innovation.

8.3. Accelerating Asset Generation Facilitates Rapid Iteration across Mental Models

We previously identified two personas for patterns of asset generation: 1) the “confident” user who looks for specific assets that match their preconceived beliefs and 2) the “exploratory” user who generates assets to understand and brainstorm what final component they want. While the divide was not completely clear-cut, with two exceptions, most of our novice-leaning users were “exploratory,” while the expert-leaning users were “confident.” In both instances, the message-based approach offloaded click counts from the asset generation process so users focused more on canvas interactions. This means that the “confident” user had more time and space to either recreate their visions or re-align them with either the message or tool capabilities. Conversely, the “exploratory” user was able to see a plethora of feasible visual stimuli from the inventory to explore potential representations of what they want to convey. P4 reinforces this, stating, “It would be really useful for not even sketching or brainstorming but like pre-brainstorming. I feel that usually this is hard, but it can give you just a wide scattershot of random graphs, images, or points to emphasize. And it’s that relationship in this system that is actually really interesting.” The centralized authoring environment and the reduction of search space due to the querying nature of message-based interactions allowed both groups to iterate through ideas and assets quickly.

It is often difficult to design a system for both novices and experts due to the differences in how they complete a specific task within a specific tool. The novice is often “depth-first” and considers many sub-solutions in depth before making decisions, whereas experts can skip that step by mentally removing themselves from specific examples to envision more general or abstract concepts (Cross, 2004). We found that the few-clicks paradigm for assets afforded by messages empowers both, supporting exploration more akin to pre-brainstorming or prototyping for new users and rapid assembly of complete infographics for experienced ones. Both types of users are thus able to rapidly iterate on their personal goals. Similarly, other existing systems that intentionally minimize click fatigue for performance enhancements have reported benefits such as reducing frustration and increasing engagement (Bao et al., 2006), both of which support the rapid iteration process. The novice can thus quickly and enjoyably gain greater familiarity with visualization, design concepts, and the system itself until they become an expert and can rapidly create complete infographics of their own.

8.4. Strategies for an Effective Key Message and How It Might Fail

Based on our case studies and user study, we further reflect on the core properties of our message-driven approach, the message itself. Given the assets Epigraphics can recommend, an effective key message can be deconstructed into constituents of keywords or phrases, yet should also be a semantically correct, stand-alone summary of the envisioned infographic. Although there are no restrictions on what a user could brush over for each asset type, certain phrases can lead to greater success in recommendation quality. Consider the key message used in the first case study (Section 5.1), “A canary flapping its wings based on traced body positions taken from slow-motion video captures of the bird, highlighting its upstrokes.” Visually descriptive noun phrases such as “canary,” “canary flapping its wings,” “video captures,” and “bird” can either allude to potential sources for static and animated graphics or for thematic color families. Qualitative or quantitative verb phrases that modify these noun phrases such as “flapping its wings based on traced body positions” or “taken from slow-motion video captures” are good candidates for visualizations. Specific adjective prepositional descriptors that modify the verb phrases such as “highlighting its upstrokes” can be used for data filters. Thus, depending on what components the user wishes to include, a standard key message could consist of a mixture of 1) noun phrases, 2) verb phrases, and 3) adjective prepositional descriptors.
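The phrase-to-asset guidance above can be summarized as a small lookup structure. The following is only an illustrative sketch: the class `BrushedChunk`, the dictionary `PHRASE_TO_ASSETS`, and the string labels are hypothetical names introduced here, and the phrase kinds are hand-labeled rather than detected automatically.

```python
from __future__ import annotations

from dataclasses import dataclass

# Phrase kinds and the asset types they tend to suit, following the text above.
PHRASE_TO_ASSETS = {
    "noun_phrase": ["graphic", "color_palette"],
    "verb_phrase": ["visualization"],
    "adjective_prepositional_descriptor": ["data_filter"],
}


@dataclass(frozen=True)
class BrushedChunk:
    """A span of the key message plus its (hand-assigned) phrase kind."""

    text: str
    phrase_kind: str  # must be a key of PHRASE_TO_ASSETS

    def candidate_assets(self) -> list[str]:
        """Return a copy of the asset types suggested for this phrase kind."""
        return list(PHRASE_TO_ASSETS[self.phrase_kind])


# Chunks taken from the canary key message of the first case study.
chunks = [
    BrushedChunk("canary flapping its wings", "noun_phrase"),
    BrushedChunk("taken from slow-motion video captures", "verb_phrase"),
    BrushedChunk("highlighting its upstrokes", "adjective_prepositional_descriptor"),
]

for chunk in chunks:
    print(f"{chunk.text!r} -> {chunk.candidate_assets()}")
```

The mapping is a heuristic rather than a rule: as noted above, the system places no restrictions on what a user may brush for each asset type.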

However, undesired assets may be generated when message phrases are brushed over incorrectly. Some obvious failure cases from the user study can be seen in Figure 8A, where the brushed text provided either too much or too little information. For example, some users brushed the whole first sentence in the Spotify message, which consisted of multiple phrases that could allude to different asset types, to request a graphic. The subsequent recommendations were too ambiguous and varied to be useful. Conversely, another user mistakenly brushed the word “conditions” to request a color palette, which was too little context and led to no recommendations. Thus, while the key message itself should be comprehensive, the choice of which words or phrases to select also requires precision and thought. In such a heavily text-dependent workflow, we therefore emphasize the importance of the brushing interaction; since trial-and-error is required in iterating through desired assets, streamlining this interaction could increase efficiency.

8.5. Limitations and Future Work

Participants reported that the main limitation of the system was that it didn’t have customization capabilities comparable to other design software like Illustrator or Figma. They also wanted to create stylized visualizations in painterly or sketch-like fashions beyond the flat vector appearance. The system also did not support the entire breadth of visualization chart types. Thus, future work could expand on the expressiveness of the graph styles via application of SVG filters (Zhou et al., 2023) and the addition of more graph types. Furthermore, the current retrieval process for graphic generation could be substituted with high-fidelity text-to-image and text-to-animation generation, which would further expand the variety of infographics that can be rapidly created. Since it is modular, the final form of Epigraphics could also easily serve as a plugin for existing design tools. A thorough comparison with other semi-automatic infographic authoring systems could also reveal more nuances about the trade-offs of a message-sourced approach.

9. Conclusion

Graphical authoring tools have historically started with the canvas, prescribing a workflow that focuses on visuals first. By applying generative models to translate a message into components of an infographic, Epigraphics explores an approach to rapidly prototyping infographics starting with the author’s intent. This workflow automatically generates data visualizations, graphics, colors, highlights, and animations to help designers assemble a complete infographic that conveys a cohesive theme. Participants noticed the integrated workflow allowed them to switch back and forth between text and canvas within the same application as they combined assets together while still thinking about the core message. It induced greater infographic fidelity to the message through the physical action of brushing, as well as affected what was ultimately produced. It also led to a convergence of both content and user mental models, while maintaining a diversity of styles in the final infographics, and helped users ideate more holistically despite varying levels of expertise. Both text and canvas have their advantages: the text can support semantic nuances in its input and the canvas precision in its output; while text is linear, the canvas is multidimensional in layout and theme. By combining the two, Epigraphics adds that second dimension to infographics authoring that empowers rapid iterations to produce a first draft that comprehensively conveys a coherent visual message.

Acknowledgements.
We would like to thank Eunyee Koh, Haijun Xia, Ji Won Chung, and the AEL group at Adobe Research for their feedback that helped shape this manuscript.

References

  • Amini et al. (2017) Fereshteh Amini, Nathalie Henry Riche, Bongshin Lee, Andres Monroy-Hernandez, and Pourang Irani. 2017. Authoring Data-Driven Videos with DataClips. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 501–510. https://doi.org/10.1109/TVCG.2016.2598647
  • Antol et al. (2015) Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C Lawrence Zitnick, and Devi Parikh. 2015. VQA: Visual Question Answering. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2425–2433.
  • Badam et al. (2019) Sriram Karthik Badam, Zhicheng Liu, and Niklas Elmqvist. 2019. Elastic Documents: Coupling Text and Tables through Contextual Visualizations for Enhanced Document Reading. IEEE Transactions on Visualization and Computer Graphics 25, 1 (2019), 661–671. https://doi.org/10.1109/TVCG.2018.2865119
  • Bao et al. (2006) Xinlong Bao, Jonathan L. Herlocker, and Thomas G. Dietterich. 2006. Fewer Clicks and Less Frustration: Reducing the Cost of Reaching the Right Folder. In Proceedings of the 11th International Conference on Intelligent User Interfaces (Sydney, Australia) (IUI ’06). Association for Computing Machinery, New York, NY, USA, 178–185. https://doi.org/10.1145/1111449.1111490
  • Bawabe et al. (2021) Sarah Bawabe, Laura Wilson, Tongyu Zhou, Ezra Marks, and Jeff Huang. 2021. The UX Factor: Using Comparative Peer Review to Evaluate Designs through User Preferences. Proc. ACM Hum.-Comput. Interact. 5, CSCW2, Article 476 (oct 2021), 23 pages. https://doi.org/10.1145/3479863
  • Brooke (1996) John Brooke. 1996. SUS: a “quick and dirty” usability scale. Usability evaluation in industry 189, 3 (1996), 189–194.
  • Cambre et al. (2018) Julia Cambre, Scott Klemmer, and Chinmay Kulkarni. 2018. Juxtapeer: Comparative Peer Review Yields Higher Quality Feedback and Promotes Deeper Reflection. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal, QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3173868
  • Caniëls and Rietzschel (2015) Marjolein CJ Caniëls and Eric F Rietzschel. 2015. Organizing creativity: Creativity and innovation under constraints. Creativity and Innovation Management 24, 2 (2015), 184–196.
  • Cao et al. (2023) Yining Cao, Jane L E, Zhutian Chen, and Haijun Xia. 2023. DataParticles: Block-Based and Language-Oriented Authoring of Animated Unit Visualizations. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 808, 15 pages. https://doi.org/10.1145/3544548.3581472
  • Castro Fernandez et al. (2018) Raul Castro Fernandez, Ziawasch Abedjan, Famien Koko, Gina Yuan, Samuel Madden, and Michael Stonebraker. 2018. Aurum: A Data Discovery System. In 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE, Paris, France, 1001–1012. https://doi.org/10.1109/ICDE.2018.00094
  • Chen et al. (2019) Zhutian Chen, Yijia Su, Yifang Wang, Qianwen Wang, Huamin Qu, and Yingcai Wu. 2019. Marvist: Authoring glyph-based visualization in mobile augmented reality. IEEE transactions on visualization and computer graphics 26, 8 (2019), 2645–2658.
  • Chen et al. (2020) Zhutian Chen, Wai Tong, Qianwen Wang, Benjamin Bach, and Huamin Qu. 2020. Augmenting Static Visualizations with PapARVis Designer. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376436
  • Chen and Xia (2022) Zhutian Chen and Haijun Xia. 2022. CrossData: Leveraging Text-Data Connections for Authoring Data Documents. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 95, 15 pages. https://doi.org/10.1145/3491102.3517485
  • Cherry and Latulipe (2014) Erin Cherry and Celine Latulipe. 2014. Quantifying the creativity support of digital tools through the creativity support index. ACM Transactions on Computer-Human Interaction (TOCHI) 21, 4 (2014), 1–25.
  • Chudá (2007) Daniela Chudá. 2007. Visualization in Education of Theoretical Computer Science. In Proceedings of the 2007 International Conference on Computer Systems and Technologies (Bulgaria) (CompSysTech ’07). Association for Computing Machinery, New York, NY, USA, Article 84, 6 pages. https://doi.org/10.1145/1330598.1330687
  • Clark and Lyons (2010) Ruth C. Clark and Chopeta Lyons. 2010. Graphics for Learning: Proven Guidelines for Planning, Designing, and Evaluating Visuals in Training Materials (2nd ed.). Pfeiffer & Company, San Francisco, CA.
  • Cross (2004) Nigel Cross. 2004. Expertise in design: an overview. Design studies 25, 5 (2004), 427–441.
  • Cui et al. (2020) Weiwei Cui, Xiaoyu Zhang, Yun Wang, He Huang, Bei Chen, Lei Fang, Haidong Zhang, Jian-Guan Lou, and Dongmei Zhang. 2020. Text-to-Viz: Automatic Generation of Infographics from Proportion-Related Natural Language Statements. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2020), 906–916. https://doi.org/10.1109/TVCG.2019.2934785
  • Dunlap and Lowenthal (2016) Joanna C Dunlap and Patrick R Lowenthal. 2016. Getting graphic about infographics: design lessons learned from popular infographics. Journal of Visual Literacy 35, 1 (2016), 42–59.
  • Firat and Laramee (2018) Elif E. Firat and Robert S. Laramee. 2018. Towards a Survey of Interactive Visualization for Education. In Proceedings of the Conference on Computer Graphics & Visual Computing (United Kingdom) (CGVC ’18). Eurographics Association, Goslar, DEU, 91–101. https://doi.org/10.2312/cgvc.20181211
  • Fox et al. (2016) Joe Fox, Ryan Menezes, and Armand Emamdjomeh. 2016. Every shot Kobe Bryant ever took. all 30,699 of them. https://graphics.latimes.com/kobe-every-shot-ever/
  • Fricke (1996) Gerd Fricke. 1996. Successful individual approaches in engineering design. Research in engineering design 8 (1996), 151–165.
  • Fu and Stasko (2023) Yu Fu and John Stasko. 2023. More Than Data Stories: Broadening the Role of Visualization in Contemporary Journalism. IEEE Transactions on Visualization and Computer Graphics 14, 8 (2023), 1–20. https://doi.org/10.1109/TVCG.2023.3287585
  • Ge et al. (2020) Tong Ge, Yue Zhao, Bongshin Lee, Donghao Ren, Baoquan Chen, and Yunhai Wang. 2020. Canis: A High-Level Language for Data-Driven Chart Animations. Computer Graphics Forum 39, 3 (2020), 607–617. https://doi.org/10.1111/cgf.14005
  • Goldschmidt (2016) Gabriela Goldschmidt. 2016. Linkographic evidence for concurrent divergent and convergent thinking in creative design. Creativity research journal 28, 2 (2016), 115–122.
  • Guo et al. (2021) Shunan Guo, Zhuochen Jin, Fuling Sun, Jingwen Li, Zhaorui Li, Yang Shi, and Nan Cao. 2021. Vinci: An Intelligent Graphic Design System for Generating Advertising Posters. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 577, 17 pages. https://doi.org/10.1145/3411764.3445117
  • Hernandez-Sanchez et al. (2021) Sergio Hernandez-Sanchez, Victor Moreno-Perez, Jonatan Garcia-Campos, Javier Marco-Lledó, Eva Maria Navarrete-Muñoz, and Carlos Lozano-Quijada. 2021. Twelve tips to make successful medical infographics. Medical Teacher 43, 12 (2021), 1353–1359.
  • Kang et al. (2021) DaYe Kang, Tony Ho, Nicolai Marquardt, Bilge Mutlu, and Andrea Bianchi. 2021. ToonNote: Improving Communication in Computational Notebooks Using Interactive Data Comics. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 727, 14 pages. https://doi.org/10.1145/3411764.3445434
  • Kaplan et al. (2020) Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. 2020. Scaling laws for neural language models. arXiv:2001.08361 [cs.LG]
  • Kim et al. (2019) Nam Wook Kim, Nathalie Henry Riche, Benjamin Bach, Guanpeng Xu, Matthew Brehmer, Ken Hinckley, Michel Pahud, Haijun Xia, Michael J. McGuffin, and Hanspeter Pfister. 2019. DataToon: Drawing Dynamic Network Comics With Pen + Touch Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300335
  • Kim et al. (2017) Nam Wook Kim, Eston Schweickart, Zhicheng Liu, Mira Dontcheva, Wilmot Li, Jovan Popovic, and Hanspeter Pfister. 2017. Data-Driven Guides: Supporting Expressive Design for Information Graphics. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2017), 491–500. https://doi.org/10.1109/TVCG.2016.2598620
  • Kim and Heer (2021) Younghoon Kim and Jeffrey Heer. 2021. Gemini: A Grammar and Recommender System for Animated Transitions in Statistical Graphics. IEEE Transactions on Visualization and Computer Graphics 27, 2 (2021), 485–494. https://doi.org/10.1109/TVCG.2020.3030360
  • Kulkarni et al. (2015) Chinmay E. Kulkarni, Michael S. Bernstein, and Scott R. Klemmer. 2015. PeerStudio: Rapid Peer Feedback Emphasizes Revision and Improves Performance. In Proceedings of the Second (2015) ACM Conference on Learning @ Scale (Vancouver, BC, Canada) (L@S ’15). Association for Computing Machinery, New York, NY, USA, 75–84. https://doi.org/10.1145/2724660.2724670
  • Lai et al. (2020) Chufan Lai, Zhixian Lin, Ruike Jiang, Yun Han, Can Liu, and Xiaoru Yuan. 2020. Automatic Annotation Synchronizing with Textual Description for Visualization. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376443
  • Latif et al. (2022) Shahid Latif, Zheng Zhou, Yoon Kim, Fabian Beck, and Nam Wook Kim. 2022. Kori: Interactive Synthesis of Text and Charts in Data Documents. IEEE Transactions on Visualization and Computer Graphics 28, 1 (2022), 184–194. https://doi.org/10.1109/TVCG.2021.3114802
  • Lee et al. (2015) Bongshin Lee, Nathalie Henry Riche, Petra Isenberg, and Sheelagh Carpendale. 2015. More Than Telling a Story: Transforming Data into Visually Shared Stories. IEEE Computer Graphics and Applications 35, 5 (2015), 84–90. https://doi.org/10.1109/MCG.2015.99
  • Lee et al. (2021) Doris Jung-Lin Lee, Dixin Tang, Kunal Agarwal, Thyne Boonmark, Caitlyn Chen, Jake Kang, Ujjaini Mukhopadhyay, Jerry Song, Micah Yong, Marti A. Hearst, and Aditya G. Parameswaran. 2021. Lux: Always-on Visualization Recommendations for Exploratory Data Science. arXiv:2105.00121 [cs.DB]
  • Li et al. (2023b) Haotian Li, Yun Wang, Q. Vera Liao, and Huamin Qu. 2023b. Why is AI not a Panacea for Data Workers? An Interview Study on Human-AI Collaboration in Data Storytelling. arXiv:2304.08366 [cs.HC]
  • Li et al. (2023a) Haotian Li, Yun Wang, and Huamin Qu. 2023a. Where Are We So Far? Understanding Data Storytelling Tools from the Perspective of Human-AI Collaboration. Technical Report MSR-TR-2023-38. Microsoft. https://www.microsoft.com/en-us/research/publication/where-are-we-so-far-understanding-data-storytelling-tools-from-the-perspective-of-human-ai-collaboration/
  • Li et al. (2022) Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. https://doi.org/10.48550/ARXIV.2201.12086
  • Liu and Chilton (2022) Vivian Liu and Lydia B Chilton. 2022. Design Guidelines for Prompt Engineering Text-to-Image Generative Models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 384, 23 pages. https://doi.org/10.1145/3491102.3501825
  • Lu et al. (2020) Min Lu, Chufeng Wang, Joel Lanir, Nanxuan Zhao, Hanspeter Pfister, Daniel Cohen-Or, and Hui Huang. 2020. Exploring Visual Information Flows in Infographics. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376263
  • Lutz (2014) Eleanor Lutz. 2014. Flight videos deconstructed. http://tabletopwhale.com/2014/09/29/flight-videos-deconstructed.html
  • Martin et al. (2019) Lynsey J Martin, Alison Turnquist, Brianna Groot, Simon YM Huang, Ellen Kok, Brent Thoma, and Jeroen JG van Merriënboer. 2019. Exploring the role of infographics for summarizing medical literature. Health Professions Education 5, 1 (2019), 48–57.
  • McCrorie et al. (2016) Alan David McCrorie, Conan Donnelly, and Kieran J McGlade. 2016. Infographics: healthcare communication for the digital age. The Ulster Medical Journal 85, 2 (2016), 71.
  • Morris et al. (2014) Meredith Ringel Morris, Andreea Danielescu, Steven Drucker, Danyel Fisher, Bongshin Lee, m. c. schraefel, and Jacob O. Wobbrock. 2014. Reducing Legacy Bias in Gesture Elicitation Studies. Interactions 21, 3 (May 2014), 40–45. https://doi.org/10.1145/2591689
  • Murray et al. (2017) Iain R Murray, AD Murray, Sarah J Wordie, Chris W Oliver, AW Murray, and AHRW Simpson. 2017. Maximising the impact of your work using infographics. Bone & Joint Research 6, 11 (2017), 619–620.
  • OpenAI (2023) OpenAI. 2023. https://platform.openai.com/docs/models/gpt-3-5
  • Perkins (1992) David N Perkins. 1992. Topography of Invention. Inventive Minds: Creativity in Technology 10 (1992), 238.
  • Pinheiro and Poco (2022) Joao Pinheiro and Jorge Poco. 2022. ChartText: Linking Text with Charts in Documents. arXiv:2201.05043 [cs.HC]
  • Qian et al. (2021) Chunyao Qian, Shizhao Sun, Weiwei Cui, Jian-Guang Lou, Haidong Zhang, and Dongmei Zhang. 2021. Retrieve-Then-Adapt: Example-based Automatic Generation for Proportion-related Infographics. IEEE Transactions on Visualization and Computer Graphics 27, 2 (2021), 443–452. https://doi.org/10.1109/TVCG.2020.3030448
  • Quealy and Sanger-Katz (2016) Kevin Quealy and Margot Sanger-Katz. 2016. Is Sushi ‘Healthy’? What About Granola? Where Americans and Nutritionists Disagree. https://www.nytimes.com/interactive/2016/07/05/upshot/is-sushi-healthy-what-about-granola-where-americans-and-nutritionists-disagree.html
  • Reimers and Gurevych (2019) Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. https://arxiv.org/abs/1908.10084
  • Ren et al. (2019) Donghao Ren, Bongshin Lee, and Matthew Brehmer. 2019. Charticulator: Interactive Construction of Bespoke Chart Layouts. IEEE Transactions on Visualization and Computer Graphics 25, 1 (2019), 789–799. https://doi.org/10.1109/TVCG.2018.2865158
  • Rietzschel et al. (2014) Eric F. Rietzschel, Bernard A. Nijstad, and Wolfgang Stroebe. 2014. Effects of Problem Scope and Creativity Instructions on Idea Generation and Selection. Creativity Research Journal 26, 2 (2014), 185–191. https://doi.org/10.1080/10400419.2014.901084
  • Satyanarayan et al. (2017) Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer. 2017. Vega-Lite: A Grammar of Interactive Graphics. IEEE Transactions on Visualization & Computer Graphics (Proc. InfoVis) 23, 1 (2017), 341–350. https://doi.org/10.1109/tvcg.2016.2599030
  • Schrier et al. (2008) Evan Schrier, Mira Dontcheva, Charles Jacobs, Geraldine Wade, and David Salesin. 2008. Adaptive Layout for Dynamically Aggregated Documents. In Proceedings of the 13th International Conference on Intelligent User Interfaces (Gran Canaria, Spain) (IUI ’08). Association for Computing Machinery, New York, NY, USA, 99–108. https://doi.org/10.1145/1378773.1378787
  • Shi et al. (2023) Xinyu Shi, Ziqi Zhou, Jing Wen Zhang, Ali Neshati, Anjul Kumar Tyagi, Ryan Rossi, Shunan Guo, Fan Du, and Jian Zhao. 2023. De-Stijl: Facilitating Graphics Design with Interactive 2D Color Palette Recommendation. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 122, 19 pages. https://doi.org/10.1145/3544548.3581070
  • Siricharoen (2013) Waralak V Siricharoen. 2013. Infographics: the new communication tools in digital age.
  • Srinivasan et al. (2021) Arjun Srinivasan, Nikhila Nyapathy, Bongshin Lee, Steven M. Drucker, and John Stasko. 2021. Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 464, 10 pages. https://doi.org/10.1145/3411764.3445400
  • Stokes (2001) Patricia D Stokes. 2001. Variability, constraints, and creativity: Shedding light on Claude Monet. American Psychologist 56, 4 (2001), 355.
  • Sultanum et al. (2021) Nicole Sultanum, Fanny Chevalier, Zoya Bylinskii, and Zhicheng Liu. 2021. Leveraging Text-Chart Links to Support Authoring of Data-Driven Articles with VizFlow. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 16, 17 pages. https://doi.org/10.1145/3411764.3445354
  • Sultanum and Srinivasan (2023) Nicole Sultanum and Arjun Srinivasan. 2023. DataTales: Investigating the use of Large Language Models for Authoring Data-Driven Articles. arXiv:2308.04076 [cs.HC]
  • Swearngin et al. (2020) Amanda Swearngin, Chenglong Wang, Alannah Oleson, James Fogarty, and Amy J. Ko. 2020. Scout: Rapid Exploration of Interface Layout Alternatives through High-Level Design Constraints. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376593
  • Tabata et al. (2019) Sou Tabata, Hiroki Yoshihara, Haruka Maeda, and Kei Yokoyama. 2019. Automatic Layout Generation for Graphical Design Magazines. In ACM SIGGRAPH 2019 Posters (Los Angeles, California) (SIGGRAPH ’19). Association for Computing Machinery, New York, NY, USA, Article 9, 2 pages. https://doi.org/10.1145/3306214.3338574
  • Tarkhova et al. (2020) Lyaylya Tarkhova, Sergey Tarkhov, Marat Nafikov, Ilshat Akhmetyanov, Dmitry Gusev, and Ramzid Akhmarov. 2020. Infographics and their application in the educational process. International Journal of Emerging Technologies in Learning (IJET) 15, 13 (2020), 63–80.
  • Thompson et al. (2021) John R Thompson, Zhicheng Liu, and John Stasko. 2021. Data Animator: Authoring Expressive Animated Data Graphics. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 15, 18 pages. https://doi.org/10.1145/3411764.3445747
  • Tong et al. (2023) Wai Tong, Zhutian Chen, Meng Xia, Leo Yu-Ho Lo, Linping Yuan, Benjamin Bach, and Huamin Qu. 2023. Exploring Interactions with Printed Data Visualizations in Augmented Reality. IEEE Transactions on Visualization and Computer Graphics 29, 1 (2023), 418–428. https://doi.org/10.1109/TVCG.2022.3209386
  • Tyagi et al. (2022) Anjul Tyagi, Jian Zhao, Pushkar Patel, Swasti Khurana, and Klaus Mueller. 2022. Infographics Wizard: Flexible Infographics Authoring and Design Exploration. Computer Graphics Forum 41, 3 (2022), 121–132. https://doi.org/10.1111/cgf.14527
  • Vidal (2010) René Victor Valqui Vidal. 2010. Creative problem solving: An applied university course. Pesquisa Operacional 30 (2010), 405–426.
  • Wang et al. (2021) Yun Wang, Yi Gao, Ray Huang, Weiwei Cui, Haidong Zhang, and Dongmei Zhang. 2021. Animated Presentation of Static Infographics with InfoMotion. Computer Graphics Forum 40, 3 (2021), 507–518. https://doi.org/10.1111/cgf.14325
  • Wang et al. (2019) Yun Wang, Zhida Sun, Haidong Zhang, Weiwei Cui, Ke Xu, Xiaojuan Ma, and Dongmei Zhang. 2019. DataShot: Automatic Generation of Fact Sheets from Tabular Data. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 895–905.
  • Wang et al. (2018) Yun Wang, Haidong Zhang, He Huang, Xi Chen, Qiufeng Yin, Zhitao Hou, Dongmei Zhang, Qiong Luo, and Huamin Qu. 2018. InfoNice: Easy Creation of Information Graphics. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173909
  • Webson and Pavlick (2021) Albert Webson and Ellie Pavlick. 2021. Do prompt-based models really understand the meaning of their prompts?
  • White et al. (2023) Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C Schmidt. 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT.
  • Wongsuphasawat et al. (2016) Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, Jock Mackinlay, Bill Howe, and Jeffrey Heer. 2016. Towards a General-Purpose Query Language for Visualization Recommendation. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics (San Francisco, California) (HILDA ’16). Association for Computing Machinery, New York, NY, USA, Article 4, 6 pages. https://doi.org/10.1145/2939502.2939506
  • Xia et al. (2018) Haijun Xia, Nathalie Henry Riche, Fanny Chevalier, Bruno De Araujo, and Daniel Wigdor. 2018. DataInk: Direct and Creative Data-Oriented Drawing. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3173797
  • Xiao et al. (2023) Shishi Xiao, Suizi Huang, Yue Lin, Yilin Ye, and Wei Zeng. 2023. Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model. arXiv:2304.14630 [cs.AI]
  • Xiao et al. (2024) Shishi Xiao, Liangwei Wang, Xiaojuan Ma, and Wei Zeng. 2024. TypeDance: Creating Semantic Typographic Logos from Image through Personalized Generation.
  • Yuan et al. (2021) Lin-Ping Yuan, Ziqi Zhou, Jian Zhao, Yiqiu Guo, Fan Du, and Huamin Qu. 2021. InfoColorizer: Interactive Recommendation of Color Palettes for Infographics. arXiv:2102.02041 [cs.HC]
  • Zhang et al. (2020) Jiayi Eris Zhang, Nicole Sultanum, Anastasia Bezerianos, and Fanny Chevalier. 2020. DataQuilt: Extracting Visual Elements from Images to Craft Pictorial Visualizations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376172
  • Zheng et al. (2019) Xinru Zheng, Xiaotian Qiao, Ying Cao, and Rynson W. H. Lau. 2019. Content-Aware Generative Modeling of Graphic Design Layouts. ACM Trans. Graph. 38, 4, Article 133 (July 2019), 15 pages. https://doi.org/10.1145/3306346.3322971
  • Zhou et al. (2023) Tongyu Zhou, Connie Liu, Joshua Kong Yang, and Jeff Huang. 2023. Filtered.Ink: Creating Dynamic Illustrations with SVG Filters. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 129, 15 pages. https://doi.org/10.1145/3544548.3581051
  • Zong et al. (2023) Jonathan Zong, Josh Pollock, Dylan Wootton, and Arvind Satyanarayan. 2023. Animated Vega-Lite: Unifying Animation with a Grammar of Interactive Graphics. IEEE Transactions on Visualization and Computer Graphics 29, 1 (2023), 149–159. https://doi.org/10.1109/TVCG.2022.3209369
  • Zuñiga (2017) Amber Zuñiga. 2017. What You Should Know About Vegetarianism (Infographic). https://www.behance.net/gallery/59893615/What-You-Should-Know-About-Vegetarianism-(Infographic)