Economy Profiles

for the Global Innovation Index

Jack Gregory

29 November 2022

Overview

Project Summary

Purpose

  • Populate the economy profile pages

Uses

  • GII report
  • GII briefs

Operation

  • Adapt constituent xml files within an Adobe InDesign idml template.
  • Iteratively populate the template with GII data using R.
  • Successful operation is highly dependent on close coordination between the template and the R code.

Outline

  1. Project structure
  2. InDesign Markup Language
  3. InDesign template
  4. Project operation
  5. Tutorial

1. Project structure

How does the profiles project fit within the larger gii repository? How are the profiles organized?

GII workflow

Profiles repository

  • The gii repository contains the GII workflow from data to analysis to outputs.
  • The profiles are an output and can only be run after the modelling is complete.
  • The profiles are stored in a separate repository and added to the gii as a subrepository.
  • The profiles are one vehicle for presenting the model; their workflow does not manipulate the underlying data.

Repository structure

The profiles repository contains the following files and folders.

File/Folder | Description
README.md | Provides high-level summary of project operation.
notes.md | Provides notes related to idml, xml and R.
build | Contains the main <profiles.R> script.
src | Contains all source scripts, where the names are directly related to the constituent functions.
tmpl | Contains an unzipped, unpopulated version of the idml templates.
out | Not committed to the repository, but collects all outputs when cloned locally.

2. InDesign Markup Language

What is IDML and its associated file format? How are idml archives structured?

Description

  • InDesign Markup Language (IDML) is an XML-based language developed by Adobe, with a publicly documented specification.
  • Similar to other markup languages (e.g., HTML, XML, etc.), it combines text and tags to typeset content and facilitate automated processing.
  • An idml file is essentially a set of xml files compressed in a zip archive.

Purpose

  • IDML functions as a conduit between InDesign and XML-based formats and technologies.
  • Documents can be created and manipulated in XML and opened using InDesign.
  • It was designed for automated workflows, including:
    • Programmatic assembly \(\rightarrow\) generate/modify InDesign documents using structured data;
    • Programmatic disassembly \(\rightarrow\) reuse parts of InDesign documents; and,
    • Searching \(\rightarrow\) find data in InDesign documents using XPath.
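
As an illustration of the XPath searching mentioned above, the following is a minimal sketch using the xml2 package. The story fragment is hypothetical and heavily simplified; it is not copied from the template, and real story files carry many more elements and attributes.

```r
library(xml2)

# Hypothetical, simplified story fragment (an assumption for illustration)
story <- read_xml('<Story Self="main">
  <ParagraphStyleRange>
    <Content>Institutions</Content>
  </ParagraphStyleRange>
  <ParagraphStyleRange>
    <Content>41.9</Content>
  </ParagraphStyleRange>
</Story>')

# XPath: select every Content node anywhere in the document
content_nodes <- xml_find_all(story, "//Content")

# Extract the text held by each node
content_text <- xml_text(content_nodes)
print(content_text)
```

The same pattern applies to any unzipped idml file, since each one is plain xml.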

Archive structure

The automated idml elements for the economy profiles are in red:

File/Folder | Description
mimetype | Identifies the “media type” of the file format.
designmap.xml | Contains the publishing architecture of the document, including where spreads and stories are located.
MasterSpreads | Contains all MasterSpreads, which outline all standard page elements.
META-INF | Contains metadata and describes the file encoding used in the document.
Resources | Contains elements and preferences that are commonly used throughout the document, e.g., colors, fonts and paragraph styles.
Spreads | Contains all Spreads, which outline page-specific elements and are composed of stories.
Stories | Contains all Stories, which organize and format the document text.
XML | Contains XML elements and settings used in the document.

3. InDesign template

How does the template function? What are script labels and how are they maintained?

Why use a template?

  • Automate the population of the economy profiles.
  • This drives the following benefits:
    • Decreases build time;
    • Ensures consistency of output;
    • Eliminates redundancy; and,
    • Reduces errors.

Update the template

  1. GII: Finalize model, including index structure.
  2. Design: Design current template, using updated index structure and previous template.
  3. GII: Update script labels.

Script labels

Warning

By design, it is impossible to add arbitrary elements and attributes to idml files.

Unrecognized content is discarded on idml import.

Solution

  • Script labels round-trip (meta)data through idml import and export.
  • Essentially, they attach arbitrary (meta)data to a scriptable object.
  • They are more powerful than the functionality offered within the InDesign UI, which limits each scriptable object to one string.

Script label types

There are two forms of labeling within an idml file:

  1. KeyValuePair — used within structural elements.
<Element>
  <Properties>
    <Label>
      <KeyValuePair Key="Label" Value="Value"/>
    </Label>
  </Properties>
</Element>
  2. ListItem — used within text elements.
<Element>
  <Properties>
    <Label type="list">
      <ListItem type="list">
        <ListItem type="string">Label</ListItem>
        <ListItem type="string">Value</ListItem>
      </ListItem>
    </Label>
  </Properties>
</Element>
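
To show how a KeyValuePair label might be read back in R, here is a minimal sketch using xml2. The helper name read_labels() is hypothetical and is not part of the profiles source files; the fragment mirrors the generic KeyValuePair form above.

```r
library(xml2)

# Hypothetical helper: return an element's KeyValuePair labels
# as a named character vector (names = Key, values = Value)
read_labels <- function(node) {
  pairs <- xml_find_all(node, "./Properties/Label/KeyValuePair")
  stats::setNames(xml_attr(pairs, "Value"), xml_attr(pairs, "Key"))
}

# Generic fragment wrapped in a root node so the element can be selected
doc <- read_xml('<Root>
  <Element>
    <Properties>
      <Label>
        <KeyValuePair Key="Label" Value="Value"/>
      </Label>
    </Properties>
  </Element>
</Root>')

element <- xml_find_first(doc, "//Element")
labels <- read_labels(element)
print(labels)
```

Returning a named vector keeps lookups such as labels[["Label"]] short when an element carries several pairs.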

Script labels in the template

For the template, script labels exist within the following elements/nodes:

Element | File | Type | Purpose | Function
Document | designmap.xml | KeyValuePair | To identify the file type. | To import the template.
Spread | Spread_*.xml | KeyValuePair | To identify the file type. | To import the template and replace Self attributes.
Page | Spread_*.xml | KeyValuePair | To identify pages. | To replace Name attributes.
TextFrame | Spread_*.xml | KeyValuePair | To identify page elements. | To replace Self, ParentStory, PreviousTextFrame and NextTextFrame attributes.
Story | Story_*.xml | KeyValuePair | To identify the file type. | To import the template and replace Self attributes.
Cell | Story_*.xml | ListItem | To identify cells. | To populate the template.

Update template script labels

Important

Script labels associated with text elements are fragile and can easily be damaged by, e.g., insert, remove and copy-paste actions.

  • The following script labels are susceptible during the template design phase:
    • Cell labels; and,
    • TextFrame labels.
  • It is possible to update/repair script labels using:
    • InDesign; or,
    • Any text editor, including Notepad and RStudio.
  • Note that InDesign can only access cell labels, whereas text editors can access all template script labels.

Update using InDesign

  • Cell label errors can typically be corrected within InDesign.
  • The following two cases present common problems and their solutions.

Problem — “Strengths & Weaknesses” label for Indicator 6.3.2 is incorrect.

Solution — Edit the script label.


Problem — Added labels in empty row.

Solution — Merge all columns in the affected row.


  • TextFrame labels are invisible to the designer within InDesign.
  • However, they are crucial for importing the idml.
  • Most TextFrames within a spread are assigned a label.
  • As a result, any action that affects a TextFrame within InDesign should be executed with caution.

Update using a text editor

  • Within a story xml, Cell elements use ListItem labels.
  • Cells are identified using the following id structure: “Column:Code”.
<Cell ...>
  <Properties>
    <Label type="list">
      <ListItem type="list">
        <ListItem type="string">Label</ListItem>
        <ListItem type="string">Score:P1</ListItem>
      </ListItem>
    </Label>
  </Properties>
  ...
</Cell>
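
The “Column:Code” id makes it possible to locate a cell with a single XPath expression. The sketch below assumes a simplified cell whose text sits in a Content node directly under the Cell; in the actual template the text is likely nested more deeply, and the helper find_cell() is hypothetical rather than a function from the profiles source files.

```r
library(xml2)

# Hypothetical, simplified story with one labelled cell (an assumption)
story <- read_xml('<Story Self="main">
  <Cell Name="0:0">
    <Properties>
      <Label type="list">
        <ListItem type="list">
          <ListItem type="string">Label</ListItem>
          <ListItem type="string">Score:P1</ListItem>
        </ListItem>
      </Label>
    </Properties>
    <Content>placeholder</Content>
  </Cell>
</Story>')

# Hypothetical helper: find the first Cell whose label value equals `id`
find_cell <- function(doc, id) {
  xpath <- sprintf(
    "//Cell[Properties/Label/ListItem/ListItem[@type='string'][2] = '%s']",
    id
  )
  xml_find_first(doc, xpath)
}

# Locate the cell for pillar P1 scores and overwrite its text
cell <- find_cell(story, "Score:P1")
xml_set_text(xml_find_first(cell, ".//Content"), "41.9")

cell_text <- xml_text(xml_find_first(story, "//Content"))
print(cell_text)
```
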
  • Within a story xml, Story elements use KeyValuePair labels.
  • type \(\in\) [cntry, cntxt, main, foot].
<Story Self="main" ...>
  <Properties>
    <Label>
      <KeyValuePair Key="type" Value="main"/>
    </Label>
  </Properties>
  ...
</Story>
  • Within a spread xml, TextFrame elements use KeyValuePair labels.
  • type \(\in\) [cntry, cntxt, main, foot].
  • page \(\in\) [odd, evn].
  • side \(\in\) [lft, rgt].
<TextFrame Self="main_odd_rgt" ParentStory="main" PreviousTextFrame="main_odd_lft" NextTextFrame="n" ...>
  <Properties>
    ...
    <Label>
      <KeyValuePair Key="type" Value="main"/>
      <KeyValuePair Key="page" Value="odd"/>
      <KeyValuePair Key="side" Value="rgt"/>
    </Label>
  </Properties>
  ...
</TextFrame>
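
In the example above, the Self attribute appears to follow a type_page_side pattern built from the three label values. Assuming that convention holds, the sketch below rebuilds Self from the labels with xml2. It is an illustration only; the fragment is simplified and the logic is not taken from the profiles source files.

```r
library(xml2)

# Hypothetical, simplified spread with one labelled TextFrame (an assumption)
spread <- read_xml('<Spread Self="spread">
  <TextFrame Self="u1a2" ParentStory="main">
    <Properties>
      <Label>
        <KeyValuePair Key="type" Value="main"/>
        <KeyValuePair Key="page" Value="odd"/>
        <KeyValuePair Key="side" Value="rgt"/>
      </Label>
    </Properties>
  </TextFrame>
</Spread>')

frame <- xml_find_first(spread, "//TextFrame")

# Read the labels into a named vector (names = Key, values = Value)
pairs <- xml_find_all(frame, "./Properties/Label/KeyValuePair")
labels <- stats::setNames(xml_attr(pairs, "Value"), xml_attr(pairs, "Key"))

# Compose the new id in the assumed type_page_side order and write it back
new_self <- paste(labels[c("type", "page", "side")], collapse = "_")
xml_set_attr(frame, "Self", new_self)

print(xml_attr(frame, "Self"))
```
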
  • Within a spread xml, Page elements use KeyValuePair labels.
  • type \(\in\) [page].
  • page \(\in\) [odd, evn].
<Page ... Name="1" ...>
  <Properties>
    ...
    <Label>
      <KeyValuePair Key="type" Value="page"/>
      <KeyValuePair Key="page" Value="odd"/>
    </Label>
  </Properties>
  ...
</Page>
  • Within a spread xml, Spread elements use KeyValuePair labels.
  • type \(\in\) [spread].
<Spread Self="spread" ...>
  <Properties>
    <Label>
      <KeyValuePair Key="type" Value="spread"/>
    </Label>
  </Properties>
  ...
</Spread>
  • Within the designmap xml, the Document element uses a KeyValuePair label.
  • type \(\in\) [designmap].
<Document ...>
  <Properties>
    <Label>
      <KeyValuePair Key="type" Value="designmap"/>
      ...
    </Label>
  </Properties>
...
</Document>

4. Project operation

How is the template imported and organized within the profiles repository? How are the economy profiles populated?

Operation

  • The profiles repository has two main functions:
    1. Template ingestion; and,
    2. Profile population.
  • Its source files are highly dependent on the following R packages:
    • xml2 \(\rightarrow\) an R binding to the libxml2 C library for parsing and manipulating XML.
    • fs \(\rightarrow\) a cross-platform, uniform interface to file system operations.
    • tidyverse \(\rightarrow\) a collection of R packages designed for data science.
  • It is only compatible with Windows.
  • It also requires the installation of 7-zip.

Template ingestion

  • The import_idml() function:
    • Unpacks an idml template archive; and,
    • Organizes it according to the profiles population methodology.
  • Its main tasks are to:
    • Unzip an idml;
    • Identify the requisite xml files, including the designmap, spread, and stories; and,
    • Clean the xml files so that they are ready for population.
  • The final unzipped and cleaned versions of the templates are typically stored in the <../06_profiles/tmpl/> folder.

Template ingestion code

Code

# PREAMBLE ------------------------------------------------

## Initiate
## ... packages
source(here::here("06_profiles/src/preamble.R"))

## ... source files
source(here::here("06_profiles/src/import_xml.R"))
source(here::here("06_profiles/src/convert_idml.R"))

## ... definitions
idml <- here::here("06_profiles/tmpl/idml_tmpl_2022.idml")
dir <- here::here("06_profiles/tmpl/test")
zip <- "C:/Program Files/7-Zip/7z.exe"

# IMPORT --------------------------------------------------
import_idml(idml, dir, zip)

Result

> 7-Zip 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
> 
> Scanning the drive for archives:
> 1 file, 571658 bytes (559 KiB)
> 
> Extracting archive: idml_tmpl_2022.idml
> --
> Path = idml_tmpl_2022.idml
> Type = zip
> Physical Size = 571658
> 
> Everything is Ok
> 
> Folders: 6
> Files: 40
> Size:       7063772
> Compressed: 571658

  • The template is now ingested and a <../06_profiles/tmpl/test> folder now exists.

  • <../06_profiles/tmpl/test> contains the ingested folders and files, where they have been prepared for immediate population.
  • If anything is missing, there is likely an issue with the template.

  • To finalize the import, copy the following from a previous template to the current ingestion directory:
    • <icons>
    • <clock.xml>
    • <syms.txt>

Profile population

  • The <../06_profiles/build/profiles.R> script performs the profile population.
  • It performs the following tasks:
    • Queries the necessary data from the GIIDB;
    • Wrangles the data into the necessary formats;
    • Builds an iteration dataframe;
    • Populates the idml;
    • Compresses and exports the idml; and,
    • Cleans the template folder.
  • The final zipped version of the populated profiles idml is saved to the <../06_profiles/out/> folder.

Profile population code

Preamble

  • The “Preamble” section initializes all dependencies and definitions.
# PREAMBLE ------------------------------------------------

## Initiate -------------------------------------
## ... packages
source(here::here("06_profiles/src/preamble.R"))

## ... source files
fs::dir_ls(fs::path(here::here(), "06_profiles/src")) %>%
  as.list() %>%
  purrr::keep(!grepl("preamble", .)) %>%
  purrr::walk(source)

## ... definitions
date <- format(Sys.Date(), "%Y%m%d")
idml <- glue::glue("ep_{date}")
dir <- here::here("06_profiles/tmpl/2022")
zip <- "C:/Program Files/7-Zip/7z.exe"
giiyr <- 2022

## Connect to AWS -------------------------------
con <- DBI::dbConnect(odbc::odbc(), "gii_admin")

Profile population code

Import GII Data

  • The “Import GII Data” section queries all necessary data from the GIIDB.

Code

# IMPORT GII DATA -----------------------------------------

## Query economy data ---------------------------
sql.econ <- DBI::SQL(glue::glue("
  SELECT     `econ`.`GIIYR`,
             `id`.`ISO3`,
             `id`.`ECONOMY_NAME`,
             `mod`.`RANK`,
             `in`.`INPUT`,
             `out`.`OUTPUT`,
             `econ`.`INCOME`,
             `id`.`REG_UN_CODE`,
             `econ`.`POP`,
             `econ`.`PPPGDP`,
             `econ`.`PPPPC`
  FROM       `gii`.`economy_id` AS `id`
  LEFT JOIN  `gii`.`economy` AS `econ` USING (`ISO3`)
  RIGHT JOIN (SELECT `GIIYR`,
                     `ISO3`,
                     `RANK`
              FROM   `gii`.`model`
              WHERE  `CODE`='Index')
             AS `mod`
  ON         `econ`.`GIIYR`=`mod`.`GIIYR` AND `id`.`ISO3`=`mod`.`ISO3`
  LEFT JOIN  (SELECT `GIIYR`,
                     `ISO3`,
                     `RANK` AS `INPUT`
              FROM   `gii`.`model`
              WHERE  `CODE`='Inputs')
             AS `in`
  ON         `econ`.`GIIYR`=`in`.`GIIYR` AND `id`.`ISO3`=`in`.`ISO3`
  LEFT JOIN  (SELECT `GIIYR`,
                     `ISO3`,
                     `RANK` AS `OUTPUT`
              FROM   `gii`.`model`
              WHERE  `CODE`='Outputs')
             AS `out`
  ON         `econ`.`GIIYR`=`out`.`GIIYR` AND `id`.`ISO3`=`out`.`ISO3`
  WHERE      `econ`.`GIIYR`={giiyr}
;"))

df.econ <- DBI::dbGetQuery(con, sql.econ)

Result

> # A tibble: 132 × 11
>    GIIYR ISO3  ECONOMY_NAME          RANK INPUT OUTPUT INCOME REG_UN_CODE    POP  PPPGDP  PPPPC
>    <int> <chr> <chr>                <int> <int>  <int> <chr>  <chr>        <dbl>   <dbl>  <dbl>
>  1  2022 AGO   Angola                 127   129    117 LM     SSA         33934.  218.    6820.
>  2  2022 ALB   Albania                 84    80     89 UM     EUR          2873.   44.5  15487 
>  3  2022 ARE   United Arab Emirates    31    18     52 HI     NAWA         9991.  699.   74245.
>  4  2022 ARG   Argentina               69    77     62 UM     LCN         45606. 1049.   22892.
>  5  2022 ARM   Armenia                 80    82     73 UM     NAWA         2968.   43.6  14701.
>  6  2022 AUS   Australia               25    19     32 HI     SEAO        25788. 1427.   55492.
>  7  2022 AUT   Austria                 17    17     21 HI     EUR          9043.  531.   59406.
>  8  2022 AZE   Azerbaijan              93    79    110 UM     NAWA        10223.  156.   15299.
>  9  2022 BDI   Burundi                130   127    130 LI     SSA         12255.    9.53   779.
> 10  2022 BEL   Belgium                 26    26     24 HI     EUR         11632.  645.   55919.
> # … with 122 more rows

Code

# IMPORT GII DATA -----------------------------------------

## Query model data -----------------------------
sql.mod <- DBI::SQL(glue::glue("
  SELECT    `mod`.`GIIYR`,
            `mod`.`CODE`,
            `id`.`NAME`,
            `id`.`LEVEL`,
            `id`.`NUM`,
            `mod`.`ISO3`,
            `id`.`PROFILE`,
            `mod`.`VALUE_SCREEN`,
            `mod`.`SCORE`,
            `mod`.`RANK`,
            `mod`.`SW_OVERALL`,
            `mod`.`SW_INCGRP`,
            `mod`.`OUTDATED`,
            `mod`.`DMC`
  FROM      `gii`.`model` AS `mod`
  LEFT JOIN `gii`.`index_id` AS `id` USING (`GIIYR`,`CODE`)
  WHERE     `mod`.`GIIYR`={giiyr}
            AND `id`.`LEVEL` IN ('Pillar','SubPillar','Indicator')
  ORDER BY  `id`.`NUM`, `mod`.`ISO3`
;"))

df.gii <- DBI::dbGetQuery(con, sql.mod)

Result

> # A tibble: 14,388 × 14
>    GIIYR CODE  NAME         LEVEL  NUM   ISO3  PROFILE VALUE_SCREEN SCORE  RANK SW_OVERALL SW_INCGRP OUTDATED   DMC
>    <int> <chr> <chr>        <chr>  <chr> <chr> <chr>          <dbl> <dbl> <int> <chr>      <chr>        <int> <int>
>  1  2022 P1    Institutions Pillar IN.1  AGO   Score             NA  41.9   116 NA         NA              NA     0
>  2  2022 P1    Institutions Pillar IN.1  ALB   Score             NA  51.4    84 NA         NA              NA     0
>  3  2022 P1    Institutions Pillar IN.1  ARE   Score             NA  83.5     6 S          S               NA     0
>  4  2022 P1    Institutions Pillar IN.1  ARG   Score             NA  47.6    96 NA         NA              NA     0
>  5  2022 P1    Institutions Pillar IN.1  ARM   Score             NA  59.7    55 NA         NA              NA     0
>  6  2022 P1    Institutions Pillar IN.1  AUS   Score             NA  77.2    17 NA         NA              NA     0
>  7  2022 P1    Institutions Pillar IN.1  AUT   Score             NA  82.8     8 S          NA              NA     0
>  8  2022 P1    Institutions Pillar IN.1  AZE   Score             NA  62.9    46 S          NA              NA     0
>  9  2022 P1    Institutions Pillar IN.1  BDI   Score             NA  45.3   106 NA         NA              NA     0
> 10  2022 P1    Institutions Pillar IN.1  BEL   Score             NA  71.5    29 NA         NA              NA     0
> # … with 14,378 more rows

Profile population code

Build Insertions

  • The “Build Insertions” section wrangles the data into the correct population format.

Code

# BUILD INSERTIONS ----------------------------------------

## Country df -----------------------------------
df.cntry <- df.econ |>
  dplyr::select(ISO3, ECONOMY_NAME, RANK) |>
  dplyr::mutate(RANK = as.character(RANK)) |>
  tidyr::pivot_longer(cols=-ISO3, names_to="VAR", values_to="VAL") |>
  dplyr::mutate(VAR = dplyr::case_when(VAR=="ECONOMY_NAME" ~ "Name",
                                       VAR=="RANK" ~ "Rank",
                                       TRUE ~ as.character(NA)))

Result

  • The output is a long-format dataframe with columns ISO3, VAR and VAL, containing two rows per economy: one for Name and one for Rank.

Code

# BUILD INSERTIONS ----------------------------------------

## Context df -----------------------------------
df.cntxt <- df.econ |>
  dplyr::select(ISO3, INPUT:PPPPC) |>
  dplyr::mutate(INCOME = dplyr::case_when(INCOME=="HI" ~ "High",
                                          INCOME=="UM" ~ "Upper middle",
                                          INCOME=="LM" ~ "Lower middle",
                                          INCOME=="LI" ~ "Low"),
                POP = formatC(POP/10^3, format="f", digits=1, big.mark=","),
                PPPGDP = formatC(PPPGDP, format="f", digits=1, big.mark=","),
                PPPPC = formatC(PPPPC, format="f", digits=0, big.mark=",")) |>
  dplyr::mutate(dplyr::across(.cols=dplyr::everything(), .fns=as.character)) |>
  tidyr::pivot_longer(cols=-ISO3, names_to="VAR", values_to="VAL") |>
  dplyr::mutate(VAR = dplyr::case_when(VAR=="INPUT" ~ "Input_Rank",
                                       VAR=="OUTPUT" ~ "Output_Rank",
                                       VAR=="INCOME" ~ "Income",
                                       VAR=="REG_UN_CODE" ~ "Region",
                                       VAR=="POP" ~ "Pop",
                                       VAR=="PPPGDP" ~ "GDP",
                                       VAR=="PPPPC" ~ "GDPPC",
                                       TRUE ~ as.character(NA)))

Result

> # A tibble: 924 × 3
>    ISO3  VAR         VAL         
>    <chr> <chr>       <chr>       
>  1 AGO   Input_Rank  129         
>  2 AGO   Output_Rank 117         
>  3 AGO   Income      Lower middle
>  4 AGO   Region      SSA         
>  5 AGO   Pop         33.9        
>  6 AGO   GDP         218.0       
>  7 AGO   GDPPC       6,820       
>  8 ALB   Input_Rank  80          
>  9 ALB   Output_Rank 89          
> 10 ALB   Income      Upper middle
> # … with 914 more rows

Code

# BUILD INSERTIONS ----------------------------------------

## Main df --------------------------------------
df.main <- df.gii |>
  dplyr::mutate(SCORE = ifelse(PROFILE=="Score", SCORE, VALUE_SCREEN) |>
                  formatC(digits=1, format="f", big.mark=",") |>
                  stringr::str_replace("-", "‒")) |>
  dplyr::mutate(dplyr::across(c(SCORE, RANK), ~ifelse(is.na(.x) | .x=="NA", "n/a", .x))) |>
  dplyr::mutate(RANK = dplyr::case_when(DMC==TRUE ~ glue::glue("[{RANK}]"),
                                        DMC==FALSE | is.na(DMC) ~ as.character(RANK)),
                SW_OVERALL = dplyr::case_when(SW_OVERALL=="S" ~ "●",
                                              SW_OVERALL=="W" ~ "○"),
                SW_INCGRP = dplyr::case_when(SW_INCGRP=="S" ~ "◆",
                                             SW_INCGRP=="W" ~ "◇")) |>
  dplyr::select(ISO3, CODE, SCORE:SW_INCGRP) |>
  tidyr::pivot_longer(cols=c(-ISO3, -CODE), names_to="VAR", values_to="VAL") |>
  dplyr::mutate(VAR = dplyr::case_when(VAR=="SCORE" ~ "Score",
                                       VAR=="RANK" ~ "Rank",
                                       VAR=="SW_OVERALL" ~ "SW_Overall",
                                       VAR=="SW_INCGRP" ~ "SW_IncGrp",
                                       TRUE ~ as.character(NA)))

Result

> # A tibble: 57,552 × 4
>    ISO3  CODE  VAR        VAL   
>    <chr> <chr> <chr>      <glue>
>  1 AGO   P1    Score      41.9  
>  2 AGO   P1    Rank       116   
>  3 AGO   P1    SW_Overall NA    
>  4 AGO   P1    SW_IncGrp  NA    
>  5 ALB   P1    Score      51.4  
>  6 ALB   P1    Rank       84    
>  7 ALB   P1    SW_Overall NA    
>  8 ALB   P1    SW_IncGrp  NA    
>  9 ARE   P1    Score      83.5  
> 10 ARE   P1    Rank       6     
> # … with 57,542 more rows

Code

# BUILD INSERTIONS ----------------------------------------

## Clock df -------------------------------------
df.clock <- df.gii |>
  dplyr::filter(LEVEL=="Indicator") |>
  dplyr::select(ISO3, CODE, OUTDATED) |>
  dplyr::arrange(ISO3, CODE) |>
  tibble::as_tibble()

Result

> # A tibble: 10,692 × 3
>    ISO3  CODE          OUTDATED
>    <chr> <chr>            <int>
>  1 AGO   AppCrea              0
>  2 AGO   AppTarriff           0
>  3 AGO   BrandVal            NA
>  4 AGO   CCTLD                0
>  5 AGO   CitDoc               0
>  6 AGO   CompSoftSpend       NA
>  7 AGO   CorpIAs             NA
>  8 AGO   CostRedu             0
>  9 AGO   CreaGoodExp          1
> 10 AGO   CultServExp         NA
> # … with 10,682 more rows

Profile population code

Build Iterations

  • The “Build Iterations” section first checks that economies match across insertion dataframes.

Code

# BUILD ITERATIONS ----------------------------------------

## Check ISO3 -----------------------------------
## Check countries match across insertion tables

## Create list of iso3 vectors
l.iso3 <- list(df.econ, df.gii, df.cntry, df.cntxt, df.main, df.clock) |>
  purrr::map(~sort(unique(.[["ISO3"]]))) |>
  purrr::set_names(nm=c("econ","gii","cntry","cntxt","main","clock"))

## Perform equivalence check
stopifnot(length(setdiff(l.iso3$cntry, l.iso3$econ))==0,
          length(setdiff(l.iso3$econ, l.iso3$cntry))==0,
          length(setdiff(l.iso3$cntry, l.iso3$gii))==0,
          length(setdiff(l.iso3$gii, l.iso3$cntry))==0,
          length(setdiff(l.iso3$cntry, l.iso3$cntxt))==0,
          length(setdiff(l.iso3$cntxt, l.iso3$cntry))==0,
          length(setdiff(l.iso3$cntry, l.iso3$main))==0,
          length(setdiff(l.iso3$main, l.iso3$cntry))==0,
          length(setdiff(l.iso3$cntry, l.iso3$clock))==0,
          length(setdiff(l.iso3$clock, l.iso3$cntry))==0)
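
The pairwise setdiff() checks above test set equality in both directions. A more compact way to express the same idea is sketched below with base R's setequal(); the ISO3 vectors are toy data, and this is an alternative formulation rather than the code used in <profiles.R>.

```r
# Toy ISO3 vectors standing in for the insertion tables (assumed data)
iso3_by_table <- list(
  econ  = c("AGO", "ALB", "ARE"),
  cntry = c("ARE", "AGO", "ALB"),
  main  = c("AGO", "ALB", "ARE")
)

# Compare every table against one reference set, ignoring order
reference <- iso3_by_table[["cntry"]]
matches <- vapply(iso3_by_table, function(x) setequal(x, reference), logical(1))

# Halt if any table contains a different set of economies
stopifnot(all(matches))
print(matches)
```

The named logical vector also shows which table is out of step when the check fails.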

Profile population code

Build Iterations

  • The “Build Iterations” section then constructs a dataframe defining the iteration process over which the profiles are populated. It is possible to populate all countries, or a small sample of countries selected by ISO3 code or row number.

Code

# BUILD ITERATIONS ----------------------------------------

## Iteration df ---------------------------------

## Build iteration dataframe
df.itr <- df.econ |>
  dplyr::select(ISO3, ECONOMY_NAME) |>
  dplyr::arrange(ECONOMY_NAME) |>
  tibble::rowid_to_column(var="ORDER") |>
  dplyr::mutate(PAIR = ifelse(numbers::mod(ORDER, 2)==0, ORDER/2, NA),
                ODD = numbers::mod(ORDER, 2)) |>
  tidyr::fill(PAIR, .direction="up") |>
  dplyr::mutate(PAIR = ifelse(is.na(PAIR), max(PAIR, na.rm=TRUE) + 1, PAIR)) |>
  tidyr::nest(order = ORDER, economy = ECONOMY_NAME, iso3 = ISO3, odd = ODD) |>
  dplyr::mutate(dplyr::across(-PAIR, ~purrr::map(., ~as.list(.x) |> purrr::flatten())))

## Optional: select particular rows
df.itr <- dplyr::slice(df.itr, 15:16)

## Optionally, filter for pairs containing specific economies by ISO3 code
df.itr <- dplyr::filter(df.itr, purrr::map_lgl(iso3, ~any(stringr::str_detect(.x, "USA"))))
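The pairing logic can be hard to follow inside the pipeline, so here is a minimal base R sketch of the same idea on a toy vector. It is not the project's code: the five ISO3 codes are a hypothetical input, and `ceiling(order / 2)` is a shortcut that yields the same pair numbers as the mutate and fill steps, including a final unpaired economy.

```r
# Hypothetical toy input: five ISO3 codes, already sorted by economy name
iso3 <- c("ALB", "DZA", "AGO", "ARG", "ARM")

# Position of each economy in the alphabetical ordering
order <- seq_along(iso3)

# Rows 1-2 form pair 1, rows 3-4 form pair 2, and a trailing odd row
# ends up alone in the last pair
pair <- ceiling(order / 2)

# One list element per pair, analogous to one row of the iteration dataframe
pairs <- split(iso3, pair)

# Select pairs by position (comparable to slicing rows)
by_position <- pairs[2:3]

# Select pairs containing a given ISO3 code (comparable to filtering)
by_code <- pairs[vapply(pairs, function(p) "ARG" %in% p, logical(1))]
```

Selecting by position or by code returns whole pairs, which is why a filter on one economy still brings its partner along.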

Populate IDML

  • The “Populate IDML” section migrates the data to each xml and sequentially builds the idml package. The code for the functions within the section has been omitted below.

Code

# POPULATE IDML -------------------------------------------

## Create lists of xml templates
l.tmpl <- list(
    cntry = "Story_cntry_tmpl.xml",
    cntxt = "Story_cntxt_tmpl.xml",
    main = "Story_main_tmpl.xml",
    foot = "Story_foot_tmpl.xml",
    spread = "Spread_tmpl.xml",
    designmap = "designmap_tmpl.xml"
  ) |>
  purrr::map(~here::here("06_profiles/tmpl/2022", .x))

## Full idml ------------------------------------
itr_pair(df.itr$iso3, df.itr$order, l.tmpl, dir)

Result

> Populate country profiles
> 1 -- ALB & DZA 
> 2 -- AGO & ARG 
> 3 -- ARM & AUS 
> 4 -- AUT & AZE 
> 5 -- BHR & BGD 
> 6 -- BLR & BEL 
> 7 -- BEN & BIH 
> 8 -- BWA & BRA 
> 9 -- BRN & BGR 
> 10 -- BFA & BDI 
...
Ready for export
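Since the function code is omitted, the following base R sketch only illustrates how a loop over the nested pairs could produce progress lines like those in the result above. `report_pairs()` is a hypothetical stand-in, not `itr_pair()` itself, and it does no xml population at all.

```r
# Hypothetical stand-in for the progress reporting inside itr_pair();
# the real function also populates the xml templates, which is not shown here.
report_pairs <- function(iso3) {
  labels <- vapply(
    seq_along(iso3),
    function(i) paste0(i, " -- ", paste(unlist(iso3[[i]]), collapse = " & ")),
    character(1)
  )
  message("Populate country profiles")
  for (label in labels) message(label)
  invisible(labels)
}

# Toy input mirroring the nested list-column df.itr$iso3
iso3 <- list(list("ALB", "DZA"), list("AGO", "ARG"), list("ARM"))
labels <- report_pairs(iso3)
```

A pair with a single economy, as in the last element, simply prints one code without the separator.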

Export IDML

  • The “Export IDML” section calls the export_idml() function, which compresses the idml and saves it to the <../06_profiles/out/> folder. It then calls the clean_idml() function, which reverts the idml template folder to its “pre-population” state.

Code

# EXPORT IDML ---------------
export_idml(idml, dir, zip)

Result

> 7-Zip 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
> 
> Scanning the drive:
> 6 folders, 140 files, 47585466 bytes (46 MiB)
> 
> Creating archive: C:/Users/ajg/Documents/WIPO/GII_Data/06_profiles/out/ep_20221129.idml
> 
> Add new data to archive: 6 folders, 140 files, 47585466 bytes (46 MiB)
> 
> Files read from disk: 140
> Archive size: 3597119 bytes (3513 KiB)
> Everything is Ok
> 
> 7-Zip 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
> 
> Open archive: C:/Users/ajg/Documents/WIPO/GII_Data/06_profiles/out/ep_20221129.idml
> --
> Path = C:/Users/ajg/Documents/WIPO/GII_Data/06_profiles/out/ep_20221129.idml
> Type = zip
> Physical Size = 3597119
> 
> Scanning the drive:
> 1 file, 43 bytes (1 KiB)
> 
> Updating archive: C:/Users/ajg/Documents/WIPO/GII_Data/06_profiles/out/ep_20221129.idml
> 
> Keep old data in archive: 6 folders, 140 files, 47585466 bytes (46 MiB)
> Add new data to archive: 1 file, 43 bytes (1 KiB)
> 
> Files read from disk: 1
> Archive size: 3597290 bytes (3513 KiB)
> Everything is Ok
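The internals of export_idml() are not shown, so the sketch below only illustrates how the dated archive name and a 7-Zip call might be assembled from R. Both helpers and the folder arguments are hypothetical; the only facts taken from the log are the `ep_YYYYMMDD.idml` naming pattern and the zip archive type. The second 7-Zip pass in the log adds a single 43-byte file, which matches the length of the standard IDML mimetype string, though the source does not confirm that this is what is added.

```r
# Hypothetical helper: build the dated output path seen in the log
idml_path <- function(out_dir, date = Sys.Date()) {
  file.path(out_dir, paste0("ep_", format(date, "%Y%m%d"), ".idml"))
}

# Hypothetical helper: assemble 7-Zip arguments without running them
# ("a" adds files to an archive, "-tzip" sets the archive type to zip)
zip_args <- function(archive, src_dir) {
  c("a", "-tzip", archive, file.path(src_dir, "*"))
}

path <- idml_path("06_profiles/out", as.Date("2022-11-29"))
args <- zip_args(path, "06_profiles/tmpl/2022")

# To actually run it (assumes a 7z executable on the PATH):
# system2("7z", args)
```

Keeping the command assembly separate from the system call makes it possible to inspect the arguments before anything is written to disk.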

Code

clean_idml(dir)

Result

> Message: designmap deleted
> Message: all spreads deleted
> Message: all stories deleted
> Cleaning complete
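The messages above suggest that cleaning removes the populated designmap, spreads and stories while leaving the templates in place. A minimal base R sketch of that idea follows. `clean_sketch()` is hypothetical, and so is the folder layout it assumes (a `designmap.xml` at the root plus `Spreads` and `Stories` subfolders, with the `_tmpl.xml` templates kept at the root); the populated file names in the demo are invented for illustration.

```r
# Hypothetical stand-in for clean_idml(): delete populated files, keep templates
clean_sketch <- function(dir) {
  targets <- c(
    file.path(dir, "designmap.xml"),
    list.files(file.path(dir, "Spreads"), full.names = TRUE),
    list.files(file.path(dir, "Stories"), full.names = TRUE)
  )
  targets <- targets[file.exists(targets)]
  file.remove(targets)
  invisible(basename(targets))
}

# Demo in a temporary folder with invented file names
demo <- file.path(tempdir(), "idml_demo")
unlink(demo, recursive = TRUE)
dir.create(file.path(demo, "Stories"), recursive = TRUE)
dir.create(file.path(demo, "Spreads"))
file.create(file.path(demo, c("designmap.xml", "Story_main_tmpl.xml",
                              "Stories/Story_ALB.xml", "Spreads/Spread_1.xml")))

removed <- clean_sketch(demo)
remaining <- list.files(demo, recursive = TRUE)
```

After the call only the template file is left in the demo folder, which is the "pre-population" state in miniature.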

Finalize populated profiles

  • Open the profiles idml in InDesign.
  • For the file to render properly, you must:
    • Install the necessary fonts; and
    • Repair the broken image links.
  • Export the idml to pdf.
  • Check the accuracy of the population by:
    • Running the <../06_profiles/build/profiles.R> script to produce a structured xlsx using the GIIDB and the raw model data; and
    • Comparing the xlsx and idml outputs for a subset of economies.
  • Congratulations, you’ve just completed the GII profiles!
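The accuracy check in the list above can be partly automated once values from both outputs are in R. The sketch below is an assumption-laden illustration: the two data frames hold made-up values standing in for the xlsx and idml extracts, and the column names are hypothetical.

```r
# Hypothetical toy values standing in for the xlsx and idml extracts
xlsx <- data.frame(iso3 = c("ALB", "DZA", "AGO"), score = c(1.5, 2.5, 3.5))
idml <- data.frame(iso3 = c("ALB", "DZA", "AGO"), score = c(1.5, 2.0, 3.5))

# Align the two sources on the economy code
check <- merge(xlsx, idml, by = "iso3", suffixes = c(".xlsx", ".idml"))

# Keep only the economies whose values differ beyond a small tolerance
mismatch <- check[abs(check$score.xlsx - check$score.idml) > 1e-8, ]
```

An empty `mismatch` would indicate that the sampled economies agree across the two outputs; any rows that remain point to values worth inspecting in InDesign.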

5. Tutorial

Let’s run some code.

Thank You

jackgregory@gmail.com