DeepSeek OCR – Apple Metal (MPS) & CPU support

An updated version of the DeepSeek-OCR model that supports the non-CUDA backends MPS and CPU. 200+ downloads on Hugging Face.



This repository uses the weights from the original DeepSeek-OCR and modifies the model code to support inference on MPS and CPU.

Usage

Run inference with Hugging Face transformers on Metal Performance Shaders (MPS) or CPU. The requirements were tested on Python 3.12.9:

git clone [email protected]:Dogacel/DeepSeek-OCR-Metal-MPS
cd DeepSeek-OCR-Metal-MPS/demo

# Use mamba or conda

mamba create -n deepseek-ocr python=3.12.9 -y
mamba activate deepseek-ocr
pip install -r requirements.txt

python run_dpsk_ocr.py

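
Before running the demo, it can help to confirm which backend PyTorch actually sees on the machine. The snippet below is a minimal sketch, not part of this repository: `detect_mps` and `choose_backend` are hypothetical helper names, and pairing CPU with float32 is a conservative assumption rather than something this README documents. Only the MPS and float16 pairing comes from the example in this README.

```python
def detect_mps() -> bool:
    """Return True if PyTorch is installed and reports a usable MPS backend."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so no backend is usable yet.
        return False
    return bool(torch.backends.mps.is_available())


def choose_backend(mps_available: bool) -> tuple[str, str]:
    """Map MPS availability to a (device name, dtype name) pair.

    Assumption: float16 on MPS mirrors this README's example;
    float32 on CPU is a cautious guess, not a documented setting.
    """
    if mps_available:
        return "mps", "float16"
    return "cpu", "float32"


device_name, dtype_name = choose_backend(detect_mps())
print(f"device={device_name}, dtype={dtype_name}")
```

The two names can then be turned into real objects with `torch.device(device_name)` and `getattr(torch, dtype_name)` in place of hard-coded values.
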
from transformers import AutoModel, AutoTokenizer
import torch

model_name = 'Dogacel/DeepSeek-OCR-Metal-MPS'

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModel.from_pretrained(
    model_name,
    _attn_implementation='eager',
    trust_remote_code=True,
    use_safetensors=True,
)

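# Use torch.device("cpu") to run on CPU instead of Apple Metal (MPS)
device = torch.device("mps")
dtype = torch.float16

# Switch to evaluation mode, then move the model to the target device and dtype
model = model.eval().to(device).to(dtype)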

prompt = "<image>\n<|grounding|>Convert the document to markdown. "
image_file = 'image.png'
output_path = 'results4'

res = model.infer(
    tokenizer,
    device=device,
    dtype=dtype,
    prompt=prompt,
    image_file=image_file,
    output_path=output_path,
    base_size=1024,
    image_size=640,
    crop_mode=False,
    save_results=True,
    test_compress=True,
)
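
After a run, one way to see what was produced is to list the files under the output directory. This is a minimal sketch that assumes `infer` writes its results under `output_path` when `save_results=True`, as the parameter names suggest. The file names and layout are not documented here, so the hypothetical helper `list_outputs` simply reports whatever it finds.

```python
from pathlib import Path


def list_outputs(output_dir: str) -> list[str]:
    """Return the relative paths of all files under output_dir, sorted."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing was written, or the directory name is wrong.
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# 'results4' is the output_path used in the example above.
for name in list_outputs("results4"):
    print(name)
```

An empty listing would indicate either that nothing was saved or that the directory differs from the one passed as `output_path`.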