manage-bibliography

pjt222
Updated 2 days ago
About

This skill helps developers manage BibTeX bibliography files using R packages like RefManageR and bibtex. It can parse, merge, deduplicate, and generate .bib entries from identifiers like DOIs, then export clean files. Use it when creating or cleaning bibliographies for R Markdown/Quarto projects or merging collaborator files.

Quick Install

Claude Code

Recommended (primary):
npx skills add pjt222/agent-almanac -a claude-code

Plugin command (alternative):
/plugin add https://github.com/pjt222/agent-almanac

Git clone (alternative):
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/manage-bibliography

Documentation

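Manage Bibliography

Create, merge, and deduplicate BibTeX bibliography files with R. This skill covers the full bibliography management lifecycle: parsing existing .bib files into structured R objects, generating new entries from identifiers (DOI, ISBN, arXiv ID), merging multiple bibliographies with intelligent deduplication, and exporting clean, consistently formatted .bib files.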

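When to Use

  • Creating a new .bib file for an R Markdown or Quarto project
  • Merging bibliographies from multiple collaborators or sources
  • Deduplicating a .bib file that has accumulated entries through copy and paste
  • Generating BibTeX entries programmatically from DOIs or other identifiers
  • Cleaning and standardizing an existing .bib file (consistent keys, sorted fields)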

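Inputs

  • Required: paths to one or more .bib files, or a list of DOIs/ISBNs/arXiv IDs
  • Optional: output .bib file path (default: references.bib)
  • Optional: deduplication strategy (doi, title, or both; default: both)
  • Optional: sort order (author, year, or key; default: key)
  • Optional: key generation pattern (default: AuthorYear)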
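
The AuthorYear key pattern is named among the inputs but not spelled out. One plausible reading, sketched in base R, joins the first author's surname to the year and adds a letter suffix when two entries would otherwise share a key. The function name and its two arguments are hypothetical, and the suffix scheme is an assumption rather than something this skill prescribes.

```r
# Hypothetical helper: build AuthorYear keys from first-author surnames and years.
make_author_year_keys <- function(surnames, years) {
  stopifnot(length(surnames) == length(years))
  # Keep ASCII letters only so the keys stay safe for BibTeX
  base <- paste0(gsub("[^A-Za-z]", "", surnames), years)
  # Position of each key within its group of identical keys, and the group size
  dup_rank <- ave(seq_along(base), base, FUN = seq_along)
  dup_size <- ave(seq_along(base), base, FUN = length)
  # Assumption: disambiguate repeats with a, b, c, ... (covers up to 26 repeats)
  ifelse(dup_size > 1, paste0(base, letters[dup_rank]), base)
}

keys <- make_author_year_keys(c("Smith", "O'Neil", "Smith"), c(2020, 2019, 2020))
print(keys)
```

Surnames with non-ASCII letters lose those letters under this rule, so a real project may prefer to transliterate them first.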

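Step 1: Install and Load Required Packages

required_packages <- c("RefManageR", "bibtex", "stringdist")
missing <- required_packages[!vapply(required_packages, requireNamespace,
                                     logical(1), quietly = TRUE)]
if (length(missing) > 0) install.packages(missing)

library(RefManageR)

# `%||%` is used in later steps; it is built into base R from 4.4.0,
# so define a fallback for older versions
if (!exists("%||%", mode = "function")) {
  `%||%` <- function(x, y) if (is.null(x)) y else x
}

**Expected:** All packages load without errors.

**On failure:** If RefManageR fails to install, check that the curl and xml2 system libraries are available. On Ubuntu: sudo apt install libcurl4-openssl-dev libxml2-dev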

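Step 2: Parse an Existing .bib File

bib <- RefManageR::ReadBib("references.bib", check = FALSE)
message(sprintf("Parsed %d entries from references.bib", length(bib)))

# Inspect structure (at most the first three entries, so short files also work)
print(bib[seq_len(min(3, length(bib)))])

# Access fields programmatically
keys <- names(bib)
years <- vapply(bib, function(x) x$year %||% NA_character_, character(1))

**Expected:** A BibEntry object containing every entry in the file. The entry count matches the number of @article{, @book{, and similar blocks in the file.

**On failure:** If parsing fails, check for unmatched braces or invalid UTF-8 in the .bib file. Fall back to the stricter parser: bibtex::read.bib()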
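
To narrow down an invalid-UTF-8 failure before re-parsing, it can help to find the offending lines first. The sketch below uses only base R (validUTF8 and iconv); the helper name find_invalid_utf8 and the temporary demo file are illustrative, and the conversion step assumes the file is actually Latin-1.

```r
# Hypothetical helper: report which lines of a .bib file are not valid UTF-8.
find_invalid_utf8 <- function(path) {
  lines <- readLines(path, warn = FALSE)  # bytes are read as-is, no re-encoding
  which(!validUTF8(lines))
}

# Self-contained demo: write one Latin-1 encoded line into a temporary file.
tmp <- tempfile(fileext = ".bib")
latin1_line <- iconv("  author = {M\u00fcller, Anna},", from = "UTF-8", to = "latin1")
writeLines(c("@article{Muller2020,", latin1_line, "}"), tmp, useBytes = TRUE)

bad <- find_invalid_utf8(tmp)
print(bad)

# Assumption: the file is known to be Latin-1, so convert it before parsing.
fixed <- iconv(readLines(tmp, warn = FALSE), from = "latin1", to = "UTF-8")
print(all(validUTF8(fixed)))
```

If the source encoding is unknown, converting blindly from Latin-1 can corrupt text that was already valid UTF-8, so it is safer to convert only files where the check reports problems.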

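Step 3: Generate Entries from Identifiers

# From DOI
entry_doi <- RefManageR::GetBibEntryWithDOI("10.1093/bioinformatics/btz848")

# From a vector of DOIs
dois <- c("10.1093/bioinformatics/btz848", "10.1038/s41586-020-2649-2")
fetched <- lapply(dois, function(d) {
  tryCatch(
    RefManageR::GetBibEntryWithDOI(d),
    error = function(e) {
      warning(sprintf("Failed to fetch DOI %s: %s", d, e$message))
      NULL
    }
  )
})
# Drop failed lookups first, then combine what is left into one BibEntry object
fetched <- Filter(Negate(is.null), fetched)
entries <- do.call(c, fetched)

**Expected:** A BibEntry object with full metadata (title, authors, journal, year, DOI) for each successfully resolved identifier.

**On failure:** DOI resolution depends on the CrossRef API. If a request fails, check network connectivity and confirm that the DOI is valid. Large batches may be rate-limited; add Sys.sleep(1) between requests.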
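
The advice to pause between requests can be wrapped in a small loop. This is a minimal sketch in base R: fetch_with_delay and its arguments are hypothetical names, and fetch_one stands in for RefManageR::GetBibEntryWithDOI or any other lookup function. The demo uses a fake fetcher so it runs without network access.

```r
# Hypothetical helper: call `fetch_one` for each id, pausing between requests.
fetch_with_delay <- function(ids, fetch_one, delay = 1) {
  results <- vector("list", length(ids))
  names(results) <- ids
  for (k in seq_along(ids)) {
    if (k > 1) Sys.sleep(delay)  # pause between requests, not before the first
    # Assigning list(...) keeps the slot even when the lookup returns NULL
    results[k] <- list(tryCatch(
      fetch_one(ids[[k]]),
      error = function(e) {
        warning(sprintf("Failed to fetch %s: %s", ids[[k]], conditionMessage(e)))
        NULL
      }
    ))
  }
  Filter(Negate(is.null), results)  # keep only successful lookups
}

# Offline stand-in for a real lookup: rejects anything not starting with "10."
fake_fetch <- function(id) {
  if (!grepl("^10\\.", id)) stop("not a DOI")
  list(doi = id)
}

res <- suppressWarnings(
  fetch_with_delay(c("10.1000/a", "bogus", "10.1000/b"), fake_fetch, delay = 0)
)
print(names(res))
```

With a real fetcher, the returned list could then be combined with do.call(c, res), as in the DOI loop above.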

Step 4: Merge Multiple Bibliographies

bib1 <- RefManageR::ReadBib("project_a.bib", check = FALSE)
bib2 <- RefManageR::ReadBib("project_b.bib", check = FALSE)

# Simple merge
merged <- c(bib1, bib2)
message(sprintf("Merged: %d + %d = %d entries (before dedup)",
                length(bib1), length(bib2), length(merged)))

**Expected:** A single merged BibEntry object containing the entries from both files.
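
Before or after concatenating, it may be worth checking whether the two files reuse the same citation keys, since a shared key can point to the same paper or to two different ones. The sketch below works on plain character vectors of keys (for BibEntry objects these would come from names(bib1) and names(bib2)); find_key_collisions is a hypothetical helper, and the sample keys are made up for illustration.

```r
# Hypothetical helper: list citation keys that appear in more than one source.
find_key_collisions <- function(...) {
  key_sets <- list(...)
  # De-duplicate within each source so only cross-source repeats are reported
  all_keys <- unlist(lapply(key_sets, unique), use.names = FALSE)
  sort(unique(all_keys[duplicated(all_keys)]))
}

keys_a <- c("Smith2020", "Jones2018", "Lee2021")
keys_b <- c("Smith2020", "Chen2019", "Lee2021")

collisions <- find_key_collisions(keys_a, keys_b)
print(collisions)

# One simple way to make repeated keys distinct; the "_1" suffix style is
# only an example, not a convention required by BibTeX.
unique_keys <- make.unique(c(keys_a, keys_b), sep = "_")
print(unique_keys)
```

Renaming keys only makes sense once true duplicates have been removed; otherwise the same paper ends up in the bibliography under two keys.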

Step 5: Deduplicate

deduplicate_bib <- function(bib, method = "both") {
  n_before <- length(bib)
  # Track removals by position so entries that share a key are handled safely
  is_removed <- logical(n_before)

  # Normalise a title: lowercase, then keep only letters, digits, and spaces
  norm_title <- function(entry) {
    gsub("[^a-z0-9 ]", "", tolower(entry$title %||% ""))
  }

  for (i in seq_len(n_before)) {
    if (is_removed[i]) next
    # i + seq_len(...) is empty for the last entry, unlike seq(i + 1, n)
    for (j in i + seq_len(n_before - i)) {
      if (is_removed[j]) next

      is_dup <- FALSE
      if (method %in% c("doi", "both")) {
        doi_i <- bib[[i]]$doi %||% ""
        doi_j <- bib[[j]]$doi %||% ""
        if (nzchar(doi_i) && nzchar(doi_j) && tolower(doi_i) == tolower(doi_j)) {
          is_dup <- TRUE
        }
      }
      if (!is_dup && method %in% c("title", "both")) {
        title_i <- norm_title(bib[[i]])
        title_j <- norm_title(bib[[j]])
        if (nzchar(title_i) && nzchar(title_j)) {
          # stringdist's "jw" method returns a distance in [0, 1]; 0 means identical
          sim <- 1 - stringdist::stringdist(title_i, title_j, method = "jw")
          if (sim > 0.95) is_dup <- TRUE
        }
      }
      if (is_dup) is_removed[j] <- TRUE
    }
  }

  if (any(is_removed)) {
    bib <- bib[which(!is_removed)]
  }
  message(sprintf("Deduplication: %d -> %d entries (%d duplicates removed)",
                  n_before, length(bib), n_before - length(bib)))
  bib
}

merged <- deduplicate_bib(merged, method = "both")

**Expected:** Duplicate entries are removed, and the number of duplicates removed is printed.

**On failure:** If title matching is too aggressive (it removes entries that are not duplicates), raise the similarity threshold above 0.95, or use only method = "doi".
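
When tuning the threshold, it can help to look at candidate pairs and their scores before removing anything. The sketch below lists every pair of titles with a similarity score, most similar first. To stay dependency-free it swaps in base R's utils::adist (Levenshtein edit distance) for the stringdist measure used in the function above, so its scores are not directly comparable to the 0.95 cutoff. The function names are hypothetical.

```r
# Same normalisation idea as in the deduplication step: lowercase, strip punctuation
normalize_title <- function(x) gsub("[^a-z0-9 ]", "", tolower(x))

# Hypothetical helper: all unordered title pairs with a Levenshtein-based similarity
title_pairs <- function(titles) {
  norm <- normalize_title(titles)
  d <- utils::adist(norm)                           # pairwise edit distances
  longest <- outer(nchar(norm), nchar(norm), pmax)  # longer title of each pair
  sim <- 1 - d / pmax(longest, 1)                   # 1 = identical after normalising
  idx <- which(upper.tri(sim), arr.ind = TRUE)      # each unordered pair once
  pairs <- data.frame(i = idx[, "row"], j = idx[, "col"], similarity = sim[idx])
  pairs[order(-pairs$similarity), ]
}

titles <- c("Array programming with NumPy",
            "Array Programming with {NumPy}.",
            "A completely different paper")
pairs <- title_pairs(titles)
print(pairs)
```

Reading down such a table shows where genuine duplicates stop and merely similar titles begin, which is a more reliable guide than picking a cutoff in the abstract.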

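Step 6: Sort and Export

# Sort by name, then year, then title ("nyt")
sorted_bib <- sort(merged, sorting = "nyt")

# Export to .bib file
RefManageR::WriteBib(sorted_bib, file = "references.bib", biblatex = FALSE)
message(sprintf("Wrote %d entries to references.bib", length(sorted_bib)))

**Expected:** A clean .bib file is written to disk with consistent formatting, one entry per block, ordered by author name, year, and title (the "nyt" scheme).

**On failure:** If WriteBib produces encoding problems, make sure the R session locale supports UTF-8: Sys.setlocale("LC_ALL", "en_US.UTF-8")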

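Validation

  • The output .bib file parses without errors: RefManageR::ReadBib("references.bib")
  • The entry count matches expectations (input count minus duplicates)
  • No duplicate DOIs remain: every DOI in the output is unique
  • Every entry has a citation key
  • Each entry type has its required fields (at minimum author, title, and year)
  • The file is valid BibTeX (test with bibtex::read.bib())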
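
Two of these checks, required fields and DOI uniqueness, are easy to script. The sketch below assumes the entries are available as a data frame with one row per entry, citation keys as row names, and character columns such as author, title, year, and doi (RefManageR offers as.data.frame() for BibEntry objects, which is one way to get such a table). The function name and the demo data are hypothetical.

```r
# Hypothetical checker: returns a character vector of problems (empty if none).
check_bibliography <- function(entries, required = c("author", "title", "year")) {
  problems <- character(0)

  # Required fields: the column must exist and be non-missing, non-empty per row
  for (field in required) {
    values <- if (field %in% names(entries)) {
      as.character(entries[[field]])
    } else {
      rep(NA_character_, nrow(entries))
    }
    missing_rows <- which(is.na(values) | !nzchar(values))
    if (length(missing_rows) > 0) {
      problems <- c(problems, sprintf(
        "%s missing in: %s", field,
        paste(rownames(entries)[missing_rows], collapse = ", ")
      ))
    }
  }

  # DOIs are compared case-insensitively; entries without a DOI are skipped
  if ("doi" %in% names(entries)) {
    dois <- as.character(entries$doi)
    dois <- tolower(dois[!is.na(dois) & nzchar(dois)])
    repeated <- unique(dois[duplicated(dois)])
    if (length(repeated) > 0) {
      problems <- c(problems, sprintf("duplicate DOI: %s", repeated))
    }
  }
  problems
}

# Made-up entries for illustration only
demo <- data.frame(
  author = c("Smith, J.", "Lee, K.", NA),
  title  = c("Paper one", "Paper two", "Paper three"),
  year   = c("2020", "2021", "2019"),
  doi    = c("10.1000/ABC", "10.1000/abc", NA),
  row.names = c("Smith2020", "Lee2021", "Chen2019"),
  stringsAsFactors = FALSE
)
problems <- check_bibliography(demo)
print(problems)
```

An empty result does not prove the file is valid BibTeX; re-parsing the exported file remains the real test.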

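Common Pitfalls

  • Encoding problems: .bib files with Latin-1 accented characters break UTF-8 parsing. Convert the encoding first: iconv -f ISO-8859-1 -t UTF-8 old.bib > new.bib
  • Unmatched braces: a single missing } silently drops entries. Verify brace balance before parsing large files.
  • DOI rate limits: CrossRef throttles unauthenticated requests. Note that RefManageR::BibOptions(check.entries = FALSE) only relaxes field validation on the entries it reads; it does not set a contact email. Pace requests (for example with Sys.sleep(1)) and fetch in batches.
  • Key collisions: merging files that share a key (both contain Smith2020) silently overwrites one entry. Regenerate keys after merging.
  • LaTeX in titles: titles containing {DNA} or $\alpha$ need careful handling; RefManageR preserves them, but downstream tools may strip them.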
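
The brace-balance check mentioned in the pitfalls can be approximated in a few lines of base R by tracking the running depth of { and } across the file. This is a rough sketch: check_brace_balance is a hypothetical helper, it skips escaped braces (\{ and \}), and it does not understand BibTeX comments or quoted strings.

```r
# Hypothetical helper: running brace depth over the lines of a .bib file.
check_brace_balance <- function(lines) {
  text <- paste(lines, collapse = "\n")
  text <- gsub("\\\\[{}]", "", text)  # ignore escaped braces: \{ and \}
  chars <- strsplit(text, "", fixed = TRUE)[[1]]
  depth <- cumsum((chars == "{") - (chars == "}"))
  final_depth <- if (length(depth) > 0) depth[length(depth)] else 0
  list(
    # Balanced means the depth never goes negative and ends at zero
    balanced = all(depth >= 0) && final_depth == 0,
    final_depth = final_depth
  )
}

good <- c("@article{Smith2020,", "  title = {A {DNA} study},", "  year = {2020}", "}")
bad  <- c("@article{Smith2020,", "  title = {A {DNA} study,",  "  year = {2020}", "}")

good_result <- check_brace_balance(good)
bad_result  <- check_brace_balance(bad)
print(good_result$balanced)
print(bad_result$final_depth)
```

For a file on disk, the lines would come from readLines(). A positive final depth points to a missing closing brace, and a negative running depth points to a stray one.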

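Related Skills

  • format-citations: format bibliography entries as styled citations
  • validate-references: verify that .bib entries are complete and that their DOIs resolve
  • ../reporting/format-apa-report: generate APA-formatted reports from a bibliography
  • ../r-packages/write-vignette: build package vignettes that cite references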

GitHub Repository

pjt222/agent-almanac
Path: i18n/wenyan/skills/manage-bibliography