library(bib2df)
library(stevemisc)
library(stringi)

# load a bib file to data frame
bib_df <- bib2df(file="Crump.bib")

# clean entries
bib_df$TITLE <- stri_replace_all_regex(bib_df$TITLE, "[\\{\\}]", "")
bib_df$JOURNAL <- stri_replace_all_regex(bib_df$JOURNAL, "[\\{\\}]", "")
bib_df$BOOKTITLE <- stri_replace_all_regex(bib_df$BOOKTITLE, "[\\{\\}]", "")

# convert a single row back to .bib entry
bib_entry <- paste0(capture.output(df2bib(bib_df[1,])), collapse="")
bib_entry

# print out the citation
stevemisc::print_refs(bib_entry,
                      csl = "apa.csl",
                      spit_out = TRUE,
                      delete_after = FALSE)
Customizing a publication list with R Markdown
NOTE: Some of the code here stopped working with an upgrade in R.
I’ve been using R Markdown to generate my lab website for years. I recently switched from the generic R Markdown website to a website generated by pkgdown. I’m happy with the result. As part of the migration, I’m revisiting individual pages like my publications page.
Over the years I’ve tried different ways to list publications. I like any process that takes a .bib file containing my publications, and then auto-generates everything I want to have.
bibbase
I was previously using bibbase, which takes a .bib file as input and embeds a list of publications into a webpage. For example, I used to generate a publication list by inserting a script into the .Rmd for my publications page.
<script src="https://bibbase.org/show?bib=https://crumplab.github.io/Crump.bib&jsonp=1&nocache=1&theme=side&authorFirst=1"></script>
It was quick, easy, and pretty good overall.
bibbase issues
But, there were nuisances. I couldn’t get the formatting exactly right. I don’t think bibbase supports different .csl formats, so it doesn’t display citations in APA format.
Bibbase recognizes extra tags in the .bib file that define arbitrary links, and prints those links with each citation. For example, a citation might have a pdf, a website, and data associated with it. That was nice.
However, the links double-clicked themselves. I’m not sure why this happened to me, but clicking a link to download a .pdf would cause the file to be downloaded twice. That was annoying.
What I wanted
Here is the workflow that I wanted to achieve:
- Maintain my list of publications in a Zotero folder. Then, export the folder as a BibLaTeX repository (with .pdfs).
- Have an .Rmd file that reads in the .bib file, and then outputs the list of publications.
- The list ideally could be formatted by any .csl file, which would make it easy to output in APA format.
- The list should automatically add any extra links and stuff that I want (provided those things can be extracted from the .bib file).
R Markdown issues and solutions
R Markdown is generally great for citing things. For example, I could cite a paper (Vuorre and Crump 2021), the citation would appear in the text, and a full citation would be printed in a reference section at the end of the document.
However, it’s not so easy to print a full citation in the middle of an R Markdown document, in a style that you want defined by .csl, and with additional stuff you might want like extra links.
At least, I couldn’t find a way to do that until this morning, when I came across a life-saver function from stevemisc called print_refs().
There are at least a handful of ways to input a .bib file into R, and then print out a single entry. For example, RefManageR can do something like this, but it doesn’t support .csl, so the output may not be in the style you want (and it doesn’t output to APA).
Here’s a quick example of print_refs() in action.
I’m so glad this function exists. It turns the .bib file into markdown that can be printed directly inside an .Rmd. And, this can be done programmatically using knitr chunks. For example, using results="asis" in the knitr chunk options allows the citation to be printed to the .Rmd document.
```{r, results="asis", echo=FALSE}
print_me <- paste0(stevemisc::print_refs(bib_entry, csl = "apa.csl",
                                         spit_out = FALSE,
                                         delete_after = FALSE), collapse=" ")
cat(print_me)
```
And, this means the citation should show up nicely on the webpage, like this:
Adding links to the citation
A next step was to add any other links to a given citation. I add extra tags to a .bib file in the extra field for citations in Zotero. For example, this line is in the extra field for the Vuorre and Crump (2021) paper.
tex.url_website: https://crumplab.github.io/vertical/
As a result, when the .bib file is loaded into R as a data.frame, it will contain a column called URL_WEBSITE. I can then retrieve that info and write some custom code to smash together the markdown for a citation, along with any html I want to add to it. The script below auto-generates a list of the first five publications in the .bib file (after sorting by year, so the most recent are first).
# sort bib_df by year
bib_df <- bib_df[order(bib_df$DATE, decreasing=TRUE),]

# print individual entries to page
for (i in 1:5 ){
  # convert row i back to a .bib entry, then to markdown in APA style
  t_bib_entry <- paste0(capture.output(df2bib(bib_df[i,])), collapse="")
  t_md_citation <- paste0(stevemisc::print_refs(t_bib_entry, csl = "apa.csl",
                                                spit_out = FALSE,
                                                delete_after = FALSE), collapse=" ")
  cat(t_md_citation)
  cat("<span class = 'publinks'>")
  # pdf link
  if(any(names(bib_df)=="FILE")){
    if( !is.na(bib_df[i,"FILE"]) ){
      pdf_url <- paste0("../Crump/",bib_df[i,"FILE"], collapse = "")
      cat(c(" ",'<a href="',pdf_url,'"> <i class="fas fa-file-pdf"> pdf </i></a>'),
          sep="")
    }
  }
  # website link
  if(any(names(bib_df)=="URL_WEBSITE")){
    if( !is.na(bib_df[i,"URL_WEBSITE"]) ){
      pdf_url <- as.character(bib_df[i,"URL_WEBSITE"])
      cat(c(" ",'<a href="',pdf_url,'"> <i class="fas fa-globe"> website </i></a>'),
          sep="")
    }
  }
  # data link
  if(any(names(bib_df)=="URL_DATA")){
    if( !is.na(bib_df[i,"URL_DATA"]) ){
      pdf_url <- as.character(bib_df[i,"URL_DATA"])
      cat(c(" ",'<a href="',pdf_url,'"> <i class="fas fa-database"> data </i></a>'),
          sep="")
    }
  }
  cat("</span>")
  cat("\n\n")
}
NOTE: the pdf links weren’t working…oops, will fix that below.
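The three link blocks in the loop share one pattern: check that the column exists, check that the entry is not NA, then print an anchor tag with an icon. As a rough sketch only, that pattern could be collapsed into a single helper. The function name link_html() and the toy data frame are hypothetical, made up for illustration; only the column names and icon classes come from the script.

```r
# Hypothetical helper (not part of any package): build the html for one link,
# or return "" when the column is missing or the entry is NA.
link_html <- function(df, i, column, label, icon_class, prefix = "") {
  if (!(column %in% names(df))) return("")
  value <- as.character(df[[column]][i])
  if (is.na(value)) return("")
  paste0(' <a href="', prefix, value, '"> <i class="', icon_class, '"> ',
         label, ' </i></a>')
}

# Toy stand-in for bib_df: row 1 has a website, row 2 does not,
# and there is no URL_DATA column at all.
toy_df <- data.frame(
  URL_WEBSITE = c("https://crumplab.github.io/vertical/", NA),
  stringsAsFactors = FALSE
)

website_1 <- link_html(toy_df, 1, "URL_WEBSITE", "website", "fas fa-globe")
website_2 <- link_html(toy_df, 2, "URL_WEBSITE", "website", "fas fa-globe")
data_1    <- link_html(toy_df, 1, "URL_DATA", "data", "fas fa-database")

cat(website_1, "\n")
```

Returning a string instead of calling cat() inside the helper makes the pieces easier to check before they are printed from a results="asis" chunk.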
That’s all
I now have a working pipeline that inputs a .bib file, and outputs a list of publications in APA format, with a few customizable bells and whistles.
I would feel like this excursion was wrapped up if I refactored the script into a set of functions. But, I’ll leave that for another day.
Functionalizing
Ideally I would like to run a single function like this, and have a whole publication list generated, complete with extra links and icons added to each entry.
bib_2_pub_list("mybib.bib")
I don’t have that solution yet, but may update this post when I have time to make progress in that direction.
For the above to work, it would be necessary to include any metadata for the links in the .bib file. This could be done using the extra field in Zotero. I’m already using this approach to export urls. I ran into a few roadblocks attempting to generalize this approach.
Alternatively, two inputs might be better. For example, a .yml file could be used to define metadata for links.
bib_2_pub_list("mybib.bib","mybib.yml")
Hmmm, need to brainstorm a .yml structure. This should work: a citation key, followed by numbered links, each containing a name, url, and Font Awesome icon.
vuorreSharingOrganizingResearch2021:
  link1:
    name: 'website'
    url: 'https://www.crumplab.com/vertical'
    icon: 'fas fa-globe'
  link2:
    name: 'github'
    url: 'https://github.com/CrumpLab/vertical'
    icon: 'fas fa-github'
behmerCrunchingBigData2017:
  link1:
    name: 'data'
    url: 'https://github.com/CrumpLab/BehmerCrump2017_BigData'
    icon: 'fas fa-database'
I can read in the .yml like this, which turns everything into a list.
yml_links <- yaml::read_yaml("Crump.yml")
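For reference, the list that comes back should be nested the same way as the .yml: citation key, then link, then fields. The snippet below is only a sketch that builds an equivalent list by hand in base R (so it runs without the yaml package or the Crump.yml file) and loops over it the way a link-printing function might. The name yml_links_demo is made up for illustration.

```r
# Hand-built stand-in for the nested list that yaml::read_yaml()
# is expected to return for one citation key.
yml_links_demo <- list(
  vuorreSharingOrganizingResearch2021 = list(
    link1 = list(name = "website",
                 url  = "https://www.crumplab.com/vertical",
                 icon = "fas fa-globe"),
    link2 = list(name = "github",
                 url  = "https://github.com/CrumpLab/vertical",
                 icon = "fas fa-github")
  )
)

key <- "vuorreSharingOrganizingResearch2021"

# Look up the links for one citation key, then visit each link in turn
link_names <- character(0)
if (key %in% names(yml_links_demo)) {
  for (l in yml_links_demo[[key]]) {
    cat(l$name, "->", l$url, "(", l$icon, ")\n")
    link_names <- c(link_names, l$name)
  }
}
```

Because each link is itself a small list, the fields can be pulled out with `$`, which is all a link-printing function needs.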
Then, need to write some functions…
add_link_icon <- function(url_path, url_text, icon_class){
  # build the html for one icon link and print it
  html <- glue::glue('<a href = "{url_path}"> <i class="{icon_class}"> {url_text} </i></a>')
  cat(" ",html, sep="")
}
bib_2_pub_list <- function(bib,yml,pdf_dir,base_url_to_pdfs){

  # load bib file to df
  bib_df <- bib2df::bib2df(bib)

  # clean {{}} from entries
  # to do: improve this part
  bib_df$TITLE <- stringi::stri_replace_all_regex(bib_df$TITLE, "[\\{\\}]", "")
  bib_df$JOURNAL <- stringi::stri_replace_all_regex(bib_df$JOURNAL, "[\\{\\}]", "")
  bib_df$BOOKTITLE <- stringi::stri_replace_all_regex(bib_df$BOOKTITLE, "[\\{\\}]", "")

  # sort bib_df by year
  # to do: add sort options
  bib_df <- bib_df[order(bib_df$DATE, decreasing=TRUE),]

  # read yml with links for bib entries
  yml_links <- yaml::read_yaml(yml)

  # print entries
  for (i in 1:dim(bib_df)[1] ){

    # convert row to .bib entry
    # to do: make row to bib entry a function
    t_bib_entry <- paste0(capture.output(bib2df::df2bib(bib_df[i,])), collapse="")

    # generate markdown text for citation
    t_md_citation <- paste0(stevemisc::print_refs(t_bib_entry, csl = "apa.csl",
                                                  spit_out = FALSE,
                                                  delete_after = FALSE), collapse=" ")
    cat(t_md_citation)
    cat("<span class = 'publinks'>")

    ### add pdf links
    if( !is.na(bib_df$FILE[i]) ) { # check pdf exists
      pdf_name <- basename(bib_df$FILE[i])
      rel_path_to_pdf <- list.files(here::here(pdf_dir),
                                    basename(bib_df$FILE[i]),
                                    recursive=TRUE)
      build_url <- paste0(base_url_to_pdfs,"/",rel_path_to_pdf,collapse="")
      crumplab::add_link_icon(build_url,"pdf","fas fa-file-pdf")
    }

    ## add all other links
    if( exists(bib_df$BIBTEXKEY[i],yml_links) ) { # check yml bib entry exists
      link_list <- yml_links[[bib_df$BIBTEXKEY[i]]]
      for(l in link_list){
        crumplab::add_link_icon(l$url,l$name,l$icon)
      }
    }

    cat("</span>")
    cat("\n\n")
  }
}
Does it blend?
crumplabr::bib_2_pub_list("Crump.bib",
                          "Crump.yml",
                          "pkgdown/assets/Crump/files",
                          "https://www.crumplab.com/Crump/files")
That works pretty well.
Next step is to include this function in my crumplab package that is part of this webpage, and make it work for real.
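One of the to-do notes in bib_2_pub_list() is to improve the clean-up of curly braces. A possible direction, sketched here in base R under the assumption that the columns to clean are plain character vectors, is to loop over only those columns that are actually present, so a .bib file without any BOOKTITLE entries would not be a problem. The helper name strip_braces() and the toy data are hypothetical.

```r
# Hypothetical helper: remove { and } from whichever of the requested
# columns are actually present in the data frame.
strip_braces <- function(df, cols = c("TITLE", "JOURNAL", "BOOKTITLE")) {
  for (col in intersect(cols, names(df))) {
    df[[col]] <- gsub("[{}]", "", df[[col]])  # NA entries stay NA
  }
  df
}

# Toy stand-in for bib_df; note that there is no BOOKTITLE column
toy_df <- data.frame(
  TITLE   = c("{{A title with braces}}", "Plain title"),
  JOURNAL = c("{Some Journal}", NA),
  stringsAsFactors = FALSE
)

cleaned <- strip_braces(toy_df)
print(cleaned)
```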