paper-lookup
About
The paper-lookup skill searches 10 academic databases via REST APIs for scholarly content including papers, preprints, and citations. It handles DOI/PMID lookups, full-text retrieval, open access checks, author searches, and literature queries across sources like PubMed, arXiv, and Semantic Scholar. Use this skill when users request paper searches, citation lookups, or need academic literature from supported databases.
Quick Install
Claude Code
Recommended: npx skills add K-Dense-AI/claude-scientific-skills -a claude-code
Alternatives:
/plugin add https://github.com/K-Dense-AI/claude-scientific-skills
git clone https://github.com/K-Dense-AI/claude-scientific-skills.git ~/.claude/skills/paper-lookup
Copy and paste the command into Claude Code to install the skill.
Skill Documentation
Paper Lookup
You have access to 10 academic paper databases through their REST APIs. Your job is to figure out which database(s) best serve the user's query, call them, and return the results.
Core Workflow
1. Understand the query -- What is the user looking for? A specific paper by DOI? Papers on a topic? An author's publications? Open access PDFs? Full text? This determines which database(s) to hit.
2. Select database(s) -- Use the database selection guide below. Many queries benefit from hitting multiple databases -- for example, searching PubMed for papers and then checking Unpaywall for open access copies.
3. Read the reference file -- Each database has a reference file in references/ with endpoint details, query formats, and example calls. Read the relevant file(s) before making API calls.
4. Make the API call(s) -- See the Making API Calls section below for which HTTP fetch tool to use on your platform.
5. Return results -- Always return:
   - The raw JSON (or parsed XML for arXiv) response from each database
   - A list of databases queried with the specific endpoints used
   - If a query returned no results, say so explicitly rather than omitting it
Database Selection Guide
Match the user's intent to the right database(s).
By Use Case
| User is asking about... | Primary database(s) | Also consider |
|---|---|---|
| Papers on a biomedical topic | PubMed | Semantic Scholar, OpenAlex |
| Full text of a biomedical article | PMC | CORE |
| Biology preprints | bioRxiv | Semantic Scholar, OpenAlex |
| Health/medical preprints | medRxiv | Semantic Scholar, OpenAlex |
| Physics, math, or CS preprints | arXiv | Semantic Scholar, OpenAlex |
| Papers across all fields | OpenAlex | Semantic Scholar, Crossref |
| A specific paper by DOI | Crossref | Unpaywall, Semantic Scholar |
| Open access PDF for a paper | Unpaywall | CORE, PMC |
| Citation graph (who cites whom) | Semantic Scholar | OpenAlex |
| Author's publications | Semantic Scholar | OpenAlex |
| Paper recommendations | Semantic Scholar | -- |
| Full text (any field) | CORE | PMC (biomedical only) |
| Journal/publisher metadata | Crossref | OpenAlex |
| Funder information | Crossref | OpenAlex |
| Convert between PMID/PMCID/DOI | PMC (ID Converter) | Crossref |
| Recent preprints by date | bioRxiv, medRxiv | arXiv |
Cross-Database Queries
| User is asking about... | Databases to query |
|---|---|
| Everything about a paper (metadata + citations + OA) | Crossref + Semantic Scholar + Unpaywall |
| Comprehensive literature search | PubMed + OpenAlex + Semantic Scholar |
| Find and read a paper | PubMed (find) + Unpaywall (OA link) + PMC or CORE (full text) |
| Preprint and its published version | bioRxiv/medRxiv + Crossref |
| Author overview with citation metrics | Semantic Scholar + OpenAlex |
When a query spans multiple needs (e.g., "find papers about CRISPR and get me the PDFs"), query the relevant databases in parallel.
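The parallel fan-out described above could look roughly like the sketch below. The helper name `query_in_parallel` and the idea of passing one zero-argument callable per database are illustrative assumptions, not part of the skill. The rate-limited services (NCBI, arXiv) should still be called sequentially, as the Request guidelines section requires.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def query_in_parallel(fetchers: Dict[str, Callable[[], object]]) -> Dict[str, dict]:
    """Run one zero-argument fetcher per database and collect every outcome.

    `fetchers` maps a database name to a hypothetical callable that performs
    the actual HTTP request. Failures are kept so they can be reported.
    """
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "data": future.result()}
            except Exception as exc:  # record the error instead of dropping it
                results[name] = {"ok": False, "error": str(exc)}
    return results
```

Keeping failed databases in the result makes it straightforward to tell the user which source failed rather than silently omitting it.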
Common Identifier Formats
Different databases use different identifier systems. If a query fails, the identifier format may be wrong.
| Identifier | Format | Example | Used by |
|---|---|---|---|
| DOI | 10.xxxx/xxxxx | 10.1038/nature12373 | All databases |
| PMID | Integer | 34567890 | PubMed, PMC, Semantic Scholar |
| PMCID | PMC + digits | PMC7029759 | PMC, Europe PMC |
| arXiv ID | YYMM.NNNNN | 2103.15348 | arXiv, Semantic Scholar |
| OpenAlex ID | W + digits | W2741809807 | OpenAlex |
| Semantic Scholar ID | 40-char hex | 649def34f8be... | Semantic Scholar |
| ORCID | 0000-XXXX-XXXX-XXXX | 0000-0001-6187-6610 | OpenAlex, Crossref |
| ISSN | XXXX-XXXX | 0028-0836 | Crossref, OpenAlex |
Cross-referencing IDs: Semantic Scholar accepts DOI, PMID, PMCID, and arXiv ID via prefixes (e.g., DOI:10.1038/nature12373, PMID:34567890, ARXIV:2103.15348). OpenAlex accepts DOI and PMID via prefixes (doi:10.1038/..., pmid:34567890). Use the PMC ID Converter to translate between PMID, PMCID, and DOI.
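As a rough illustration of the table and prefix rules above, the following sketch guesses an identifier's type from its shape and builds the Semantic Scholar prefixed form. The regular expressions are assumptions derived only from the formats in the table, and the function names are hypothetical. Real identifiers have more variants (for example, older arXiv IDs), so treat a miss as "unknown", not "invalid".

```python
import re
from typing import Optional

# Order matters: the bare-integer PMID pattern is checked last.
_PATTERNS = [
    ("DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("PMCID", re.compile(r"^PMC\d+$", re.IGNORECASE)),
    ("ARXIV", re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")),
    ("ORCID", re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")),
    ("ISSN", re.compile(r"^\d{4}-\d{3}[\dX]$")),
    ("OPENALEX", re.compile(r"^W\d+$")),
    ("S2", re.compile(r"^[0-9a-f]{40}$")),
    ("PMID", re.compile(r"^\d+$")),
]


def classify_identifier(raw: str) -> Optional[str]:
    """Return a best-guess identifier type, or None if nothing matches."""
    value = raw.strip()
    for kind, pattern in _PATTERNS:
        if pattern.match(value):
            return kind
    return None


def semantic_scholar_id(raw: str) -> Optional[str]:
    """Build the prefixed form shown above (DOI:, PMID:, ARXIV:)."""
    value = raw.strip()
    kind = classify_identifier(value)
    if kind in {"DOI", "PMID", "ARXIV"}:
        return f"{kind}:{value}"
    if kind == "S2":
        return value  # native Semantic Scholar IDs need no prefix
    return None  # other types: consult references/semantic-scholar.md
```

For example, `semantic_scholar_id("2103.15348")` yields `ARXIV:2103.15348`, matching the prefix form given in the text.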
API Keys and Access
Most of these databases are fully open. A few benefit from API keys for higher rate limits.
Databases requiring or benefiting from API keys
| Database | Env Variable | Required? | Registration |
|---|---|---|---|
| NCBI (PubMed, PMC) | NCBI_API_KEY | No (3 req/s without, 10 with) | https://www.ncbi.nlm.nih.gov/account/settings/ |
| CORE | CORE_API_KEY | Yes for full text | https://core.ac.uk/services/api |
| Semantic Scholar | S2_API_KEY | No (shared pool without) | https://www.semanticscholar.org/product/api#api-key-form |
| OpenAlex | OPENALEX_API_KEY | Recommended | https://openalex.org/settings/api |
Fully open databases (no key needed)
| Database | Notes |
|---|---|
| bioRxiv / medRxiv | No auth, no documented rate limits |
| arXiv | No auth, max 1 request per 3 seconds |
| Crossref | No auth; add mailto param for polite pool (2x rate limit) |
| Unpaywall | No auth; requires email parameter |
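A minimal sketch of attaching the contact address to Crossref and Unpaywall requests follows. The base URLs (`https://api.crossref.org/works/` and `https://api.unpaywall.org/v2/`) are assumptions that do not appear in this document, and the helper names are hypothetical. Confirm the exact endpoints in references/crossref.md and references/unpaywall.md before relying on them.

```python
from urllib.parse import quote, urlencode


def crossref_work_url(doi: str, mailto: str) -> str:
    """Build a Crossref DOI lookup URL carrying the `mailto` parameter.

    The base URL is an assumption; check references/crossref.md.
    """
    query = urlencode({"mailto": mailto})
    return f"https://api.crossref.org/works/{quote(doi, safe='/')}?{query}"


def unpaywall_url(doi: str, email: str) -> str:
    """Build an Unpaywall DOI lookup URL carrying the required `email` parameter.

    The base URL is an assumption; check references/unpaywall.md.
    """
    query = urlencode({"email": email})
    return f"https://api.unpaywall.org/v2/{quote(doi, safe='/')}?{query}"
```

Percent-encoding the DOI and the address avoids malformed URLs when either contains reserved characters.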
Loading API keys
- Check the environment first -- the key may already be exported (e.g., $NCBI_API_KEY).
- Fall back to .env -- check .env in the current working directory.
- Proceed without -- most APIs still work at lower rate limits. Tell the user which key is missing and how to get one.
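The lookup order above can be sketched as follows. The function name `load_api_key` and the simple `KEY=value` parsing are assumptions for illustration. A real `.env` file may use quoting or interpolation rules this sketch does not handle.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(name: str, env_file: str = ".env") -> Optional[str]:
    """Return the key from the environment, else from a .env file, else None."""
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, val = line.partition("=")
            key = key.strip()
            if key.startswith("export "):
                key = key[len("export "):].strip()
            if key == name:
                return val.strip().strip("'\"") or None
    return None  # caller proceeds without a key and tells the user
```

A `None` result is the cue to continue at the lower rate limit and report which key is missing.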
Making API Calls
Use your environment's HTTP fetch tool to call REST endpoints:
| Platform | HTTP Fetch Tool | Fallback |
|---|---|---|
| Claude Code | WebFetch | curl via Bash |
| Gemini CLI | web_fetch | curl via shell |
| Windsurf | read_url_content | curl via terminal |
| Cursor | No dedicated fetch tool | curl via run_terminal_cmd |
| Codex CLI | No dedicated fetch tool | curl via shell |
| Cline | No dedicated fetch tool | curl via execute_command |
If the fetch tool fails, fall back to curl via whatever shell tool is available.
Special cases
- arXiv returns Atom XML, not JSON. Parse it or use curl and extract the relevant fields. Consider piping through a simple parser if available.
- PMC eFetch returns JATS XML for full text. This is expected -- full text articles are in XML format.
- Crossref and Unpaywall benefit from including a mailto parameter or email for the polite/fast pool.
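For the arXiv case, a small parser built on the Python standard library may be enough. The sketch below assumes the response is a standard Atom feed whose entries carry `id`, `title`, `summary`, and `author/name` elements. The exact fields arXiv returns are documented in references/arxiv.md, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Standard Atom namespace (an assumption about the feed, see lead-in).
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def _text(node: ET.Element, path: str) -> str:
    """Return the element text with runs of whitespace collapsed."""
    return " ".join(node.findtext(path, default="", namespaces=ATOM).split())


def parse_arxiv_feed(xml_text: str) -> List[Dict[str, object]]:
    """Extract a few common fields from each entry of an Atom feed."""
    root = ET.fromstring(xml_text)
    papers: List[Dict[str, object]] = []
    for entry in root.findall("atom:entry", ATOM):
        papers.append({
            "id": _text(entry, "atom:id"),
            "title": _text(entry, "atom:title"),
            "summary": _text(entry, "atom:summary"),
            "authors": [_text(a, "atom:name")
                        for a in entry.findall("atom:author", ATOM)],
        })
    return papers
```

Collapsing whitespace matters because titles and abstracts in Atom feeds are often wrapped across lines.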
Request guidelines
- For NCBI APIs (PubMed, PMC): max 3 req/sec without key, 10 with key. Make requests sequentially.
- For arXiv: max 1 request every 3 seconds. Be patient.
- For Crossref: 5 req/sec (public), 10 req/sec (polite pool with mailto).
- For other APIs with no strict limits, you can query multiple databases in parallel.
- If you get HTTP 429 (rate limit), wait briefly and retry once.
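One way to respect these limits is a per-service throttle plus a single retry on HTTP 429, sketched below. The class and function names, the `send` callable returning `(status, body)`, and the 2-second backoff are all illustrative assumptions. The minimum intervals are simply the reciprocals of the limits listed above.

```python
import time
from typing import Callable, Optional, Tuple

# Seconds between requests, derived from the limits listed above.
MIN_INTERVAL = {
    "ncbi_no_key": 1 / 3,
    "ncbi_with_key": 1 / 10,
    "arxiv": 3.0,
    "crossref_public": 1 / 5,
    "crossref_polite": 1 / 10,
}


class Throttle:
    """Enforce a minimum gap between consecutive requests to one service."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_with_retry(send: Callable[[], Tuple[int, str]], throttle: Throttle,
                     backoff: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `send` once; on HTTP 429 wait briefly and retry exactly once."""
    throttle.wait()
    status, body = send()
    if status == 429:
        sleep(backoff)
        throttle.wait()
        status, body = send()
    return status, body
```

Injecting `clock` and `sleep` keeps the sketch testable without real delays. In normal use the defaults apply, for example `Throttle(MIN_INTERVAL["arxiv"])`.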
Error recovery
- Check the identifier format -- use the Common Identifier Formats table. A PMID won't work in arXiv, and an arXiv ID won't work in PubMed directly.
- Try alternative identifiers -- if a DOI fails in one database, try the title or PMID instead.
- Try a different database -- if PubMed returns nothing for a CS paper, try Semantic Scholar or OpenAlex.
- Report the failure -- tell the user which database failed, the error, and what you tried instead.
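The recovery steps above amount to trying alternatives in order while remembering what failed. A minimal sketch, assuming each attempt is a labelled zero-argument callable (the name `first_successful` and this shape are hypothetical):

```python
from typing import Callable, Dict, List, Tuple


def first_successful(attempts: List[Tuple[str, Callable[[], object]]]) -> Dict[str, object]:
    """Try each (label, call) pair in order and stop at the first non-empty result.

    Every failure or empty result is recorded so it can be reported to the user.
    """
    failures: List[Tuple[str, str]] = []
    for label, call in attempts:
        try:
            result = call()
        except Exception as exc:
            failures.append((label, str(exc)))
            continue
        if result:
            return {"source": label, "result": result, "failures": failures}
        failures.append((label, "no results"))
    return {"source": None, "result": None, "failures": failures}
```

A typical ordering would be the original identifier in the primary database, then an alternative identifier, then a different database. The `failures` list supplies the content for the final report.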
Output Format
Structure your response like this:
## Databases Queried
- **PubMed** -- esearch + esummary for "CRISPR gene therapy"
- **Unpaywall** -- DOI lookup for 10.1038/...
## Results
### PubMed
[raw JSON response or formatted results]
### Unpaywall
[raw JSON response]
If results are very large, present the most relevant portion and note that more data is available. But default to showing the full raw JSON -- the user asked for it.
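If the responses are collected programmatically, the layout above could be produced with a helper like the one below. The record shape (`database`, `endpoint`, `response`) and the function name are assumptions made for this sketch.

```python
import json
from typing import Any, Dict, List


def render_report(queries: List[Dict[str, Any]]) -> str:
    """Render the 'Databases Queried' and 'Results' sections as text.

    Each record is assumed to hold `database`, `endpoint`, and `response`
    (parsed JSON, or None/empty when nothing came back).
    """
    lines = ["## Databases Queried"]
    for q in queries:
        lines.append(f"- **{q['database']}** -- {q['endpoint']}")
    lines.append("## Results")
    for q in queries:
        lines.append(f"### {q['database']}")
        if q.get("response"):
            lines.append(json.dumps(q["response"], indent=2))
        else:
            # State the empty result explicitly instead of omitting the section.
            lines.append("No results returned.")
    return "\n".join(lines)
```

Databases with empty responses still get their own section, so a query that found nothing is reported rather than dropped.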
Available Databases
Read the relevant reference file before making any API call.
Biomedical Literature
| Database | Reference File | What it covers |
|---|---|---|
| PubMed | references/pubmed.md | 37M+ biomedical citations, abstracts, MeSH terms |
| PMC | references/pmc.md | 10M+ full-text biomedical articles (JATS XML), ID conversion |
Preprint Servers
| Database | Reference File | What it covers |
|---|---|---|
| bioRxiv | references/biorxiv.md | Biology preprints (browse by date/DOI, no keyword search) |
| medRxiv | references/medrxiv.md | Health sciences preprints (browse by date/DOI, no keyword search) |
| arXiv | references/arxiv.md | Physics, math, CS, biology, economics preprints (keyword search, Atom XML) |
Multidisciplinary Indexes
| Database | Reference File | What it covers |
|---|---|---|
| OpenAlex | references/openalex.md | 250M+ works, authors, institutions, topics, citation data |
| Crossref | references/crossref.md | 150M+ DOI metadata, journals, funders, references |
| Semantic Scholar | references/semantic-scholar.md | 200M+ papers, citation graphs, AI-generated TLDRs, recommendations |
Open Access & Full Text
| Database | Reference File | What it covers |
|---|---|---|
| CORE | references/core.md | 37M+ full texts from OA repositories worldwide |
| Unpaywall | references/unpaywall.md | OA status and PDF links for any DOI |
