Projects
Open models, datasets, benchmarks, and developer tools that I share through GitHub and HuggingFace. GitHub entries below exclude forked repositories.
Featured Releases
AIME 1983-2024
A curated AIME benchmark collection for olympiad-style mathematical reasoning and test-time search evaluation.
ChemVLM
Multimodal large language models and instruction data for chemistry reasoning across text, molecular, and visual inputs.
ChemLLM
Chemistry-focused large language models for question answering, molecular reasoning, and scientific language tasks.
MathBlackBox
An open-source project for mathematical reasoning workflows and black-box problem solving.
HuggingFace Resources
Reasoning Data
- AIME 1983-2024: 8,237 downloads last month
- MATH500: 223 downloads last month
- OpenLongCoT-Pretrain: 112 downloads last month
- R1 Vision Reasoning Instructions: 83 downloads last month
- OpenLongCoT-150K: 69 downloads last month
Science Data
- ChemVLM SFT Datasets: 1,001 downloads last month
- ChemData700K SMILES-only: 120 downloads last month
- OrderlyQA: 74 downloads last month
- OrderlyQA IUPAC: 14 downloads last month
- MMSciBenchmark: 7 downloads last month
Models
- ChemVLM-8B: 1,742 downloads last month
- ChemLLM-7B-Chat: 1,658 downloads last month
- ChemLLM-20B-Chat-SFT: 285 downloads last month
- ChemVLM-26B: 167 downloads last month
- Qwen2.5-VL-7B-R1-Distillation: 5 downloads last month
GitHub Projects
MathBlackBox
Mathematical reasoning project for black-box problem-solving workflows.
Fast-Web-Fetch
Fast fetching of web pages into LLM-ready contexts.
R1-Distillation-toolkit
Toolkit for distilling and annotating R1-style reasoning data.
citeclaw
A Citoid-style citation pipeline, migrated to Bun, for reference-first agent workflows.
CodeMem
Code memory experiments for agentic software development workflows.
EZTinker
A compact RL-as-a-service demo inspired by Tinker-style post-training workflows.
PaddleOCR-MCP
Fast PaddleOCR MCP server for extracting text from images.
ra
Rust-native agent work with ACP, A2A, shell tools, file editing, search, MCP, and remote-agent integration.
pi-lsp-extension
Language Server Protocol integration for pi-coding-agent.
openlsp
OpenLSP CLI for coding-agent language intelligence.