
How to convert PDF pages to images: a practical guide

koboshi, Co-founder · 9 min read
Summary

PDF pages don't always fit where you need them. This guide covers when to convert PDF to images, the trade-offs of each approach, native OS methods, and code examples in five languages.

A frontend developer gets a 40-page brand guidelines PDF and needs to drop three specific pages into a Figma board. A support engineer wants to paste a diagram from a datasheet into a Slack thread. A lawyer needs to attach a signed contract page to an email without sending the whole file.

PDF is built for fixed-layout documents. The web, image editors, and chat apps are built for pixels. Converting a PDF page to an image bridges that gap, but the method you pick changes the output quality, file size, and how much control you keep.

Why PDF is both great and annoying for this

PDF stores pages as a stream of drawing commands: place this glyph, draw this vector, render this image at this size. That makes it resolution-independent and visually consistent. It also means a PDF is not an image. To turn it into one, something has to render those commands onto a raster canvas.
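
To make the "render commands onto a raster canvas" idea concrete, here is a toy sketch. It assumes a made-up page model (a list of filled rectangles in points) and a hypothetical rasterize function; real renderers also handle text, curves, antialiasing, and PDF's bottom-up y-axis, all of which this ignores.

```python
from typing import List, Tuple

# A toy "page": filled rectangles as (x0, y0, x1, y1) in points, 72 per inch.
Rect = Tuple[float, float, float, float]


def rasterize(rects: List[Rect], width_pt: float, height_pt: float, dpi: float) -> List[List[int]]:
    """Paint the rectangles onto a pixel grid sized for the requested DPI."""
    scale = dpi / 72
    width_px = round(width_pt * scale)
    height_px = round(height_pt * scale)
    canvas = [[0] * width_px for _ in range(height_px)]
    for x0, y0, x1, y1 in rects:
        # Clamp each rectangle to the canvas before filling it.
        for y in range(max(0, round(y0 * scale)), min(height_px, round(y1 * scale))):
            for x in range(max(0, round(x0 * scale)), min(width_px, round(x1 * scale))):
                canvas[y][x] = 1
    return canvas


page = [(2, 2, 6, 6)]                    # one 4 x 4 point square
low = rasterize(page, 8, 8, dpi=72)      # 8 x 8 pixel canvas
high = rasterize(page, 8, 8, dpi=144)    # 16 x 16 pixel canvas
```

The same drawing command covers 16 pixels at 72 DPI and 64 pixels at 144 DPI. The description did not change; only the canvas it was rendered onto did.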

The good parts:

  • Text and vectors stay sharp at any zoom level because they are described mathematically, not stored as pixels.
  • A single PDF can hold hundreds of pages in one file.
  • Fonts, color profiles, and annotations travel with the document.

The hard parts:

  • PDF readers disagree on rendering. The same page can look slightly different in Adobe Acrobat, Preview, Chrome, or a headless library.
  • Scanned PDFs are just images wrapped in PDF containers, so "converting" them means re-encoding, which can add artifacts or bloat.
  • Complex PDFs with transparency, layers, or interactive forms may flatten unpredictably.
  • Page dimensions vary. A US Letter page at 72 DPI is 612 by 792 pixels. At 300 DPI it is 2550 by 3300. If you do not specify the resolution, you may get something unusable.
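
The arithmetic behind those pixel sizes can be sketched in a few lines. The function name page_pixels is ours, and the rounding rule is an assumption: individual renderers may round up or truncate instead.

```python
from typing import Tuple

POINTS_PER_INCH = 72  # PDF page sizes are expressed in points, 1/72 inch each


def page_pixels(width_pt: float, height_pt: float, dpi: float) -> Tuple[int, int]:
    """Return the (width, height) in pixels of a page rendered at the given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    # Multiply before dividing to keep common sizes exact in floating point.
    return (
        round(width_pt * dpi / POINTS_PER_INCH),
        round(height_pt * dpi / POINTS_PER_INCH),
    )


# US Letter is 612 x 792 points (8.5 x 11 inches).
letter_72 = page_pixels(612, 792, 72)    # (612, 792)
letter_300 = page_pixels(612, 792, 300)  # (2550, 3300)
```

Running this before a batch job is a cheap way to check that the DPI you picked produces images of the size you expect.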

What people actually convert PDFs into

Most conversion tasks fall into one of three output formats. Each has a different job.

Format  Best for                                              Trade-off
JPG     Photos, previews, email attachments, web galleries    Lossy, but small file sizes
PNG     Screenshots, diagrams, anything needing transparency  Lossless, larger than JPG for photos
WebP    Modern web, apps, anywhere bandwidth matters          Smaller than JPG/PNG, slightly less universal

Then there are the practical dimensions:

  • Single page vs. batch. One page is easy. A folder of invoices needs automation.
  • DPI. 150 DPI is fine for thumbnails. 300 DPI is standard for print and OCR. 600 DPI is overkill unless you are zooming into fine details.
  • Color space. RGB is safe for screens. CMYK PDFs converted to RGB can shift colors if the conversion is naive.
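
To show what "naive" means for color conversion, here is the profile-free formula that simple converters tend to fall back on. The function name is ours, and this is an illustration of the problem, not a recommended conversion: a color-managed pipeline would map through ICC profiles instead.

```python
from typing import Tuple


def naive_cmyk_to_rgb(c: float, m: float, y: float, k: float) -> Tuple[int, int, int]:
    """Convert CMYK components in the range 0..1 to 8-bit RGB without any color profile."""
    for value in (c, m, y, k):
        if not 0.0 <= value <= 1.0:
            raise ValueError("CMYK components must be between 0 and 1")
    r = round(255 * (1 - c) * (1 - k))
    g = round(255 * (1 - m) * (1 - k))
    b = round(255 * (1 - y) * (1 - k))
    return r, g, b


pure_cyan = naive_cmyk_to_rgb(1, 0, 0, 0)  # (0, 255, 255)
```

Pure cyan ink comes out as the fully saturated screen color (0, 255, 255), which is brighter than anything a press can print. That mismatch is the kind of color shift you see when a CMYK PDF is converted without a profile.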

Picking the right approach

The right tool depends on what you are optimizing for.

Fast, browser-based conversion

If you just need a page as a JPG, PNG, or WebP and do not want to install anything, a browser-based converter is the quickest path. Our PDF to JPG, PDF to PNG, and PDF to WebP tools render the PDF locally. The file never leaves your device, which matters for contracts, IDs, and medical records.

Command-line batch work

For folders of PDFs or CI pipelines, a command-line tool wins. You get repeatable output, DPI control, and scripting.

In-application conversion

When conversion is part of a product, calling a library is usually cleaner than shelling out to a CLI tool. It removes an external dependency and gives you error handling that matches the rest of your codebase.

Converting on Windows

Adobe Acrobat

  1. Open the PDF and go to the page you need.
  2. Choose File > Export to > Image > JPEG/PNG/TIFF.
  3. Set the output resolution in the export dialog.
  4. Save.

Acrobat gives reliable output, but it is not free and not scriptable without the paid SDK.

PDF-XChange Editor

A lighter alternative with a free tier. File > Export > Export Pages as Images lets you pick format, DPI, and page range.

PowerShell with pdftoppm

Install Poppler for Windows, then use pdftoppm from PowerShell:

pdftoppm -jpeg -r 300 input.pdf output

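This produces output-1.jpg, output-2.jpg, and so on, one per page, at 300 DPI. For a PDF with ten or more pages, pdftoppm zero-pads the number (output-01.jpg, output-02.jpg) so the files sort in page order.

For PNG output (pdftoppm renders onto an opaque white page, so the result has no transparency; Poppler's pdftocairo has a -transp option if you need a transparent background):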

pdftoppm -png -r 300 input.pdf output

For a single page:

pdftoppm -jpeg -r 300 -f 1 -l 1 input.pdf output

The -f and -l flags set first and last page.
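
If you drive pdftoppm from a script, it helps to build the argument list in one place and to predict the output filename. The sketch below uses only the flags shown above; both function names are ours, and the zero-padding rule reflects how current Poppler releases name files, so verify it against your installed version.

```python
from typing import List, Optional


def pdftoppm_args(pdf: str, prefix: str, fmt: str = "jpeg", dpi: int = 300,
                  first: Optional[int] = None, last: Optional[int] = None) -> List[str]:
    """Build an argument list for pdftoppm (page numbers are one-based)."""
    if fmt not in ("jpeg", "png"):
        raise ValueError("this sketch only handles -jpeg and -png")
    args = ["pdftoppm", f"-{fmt}", "-r", str(dpi)]
    if first is not None:
        args += ["-f", str(first)]
    if last is not None:
        args += ["-l", str(last)]
    return args + [pdf, prefix]


def expected_output_name(prefix: str, page: int, total_pages: int, fmt: str = "jpeg") -> str:
    """Guess the file pdftoppm writes: the page number is padded to the width of the page count."""
    ext = "jpg" if fmt == "jpeg" else fmt
    digits = len(str(total_pages))
    return f"{prefix}-{page:0{digits}d}.{ext}"


single_page = pdftoppm_args("input.pdf", "output", first=1, last=1)
```

Passing the list to subprocess.run(single_page, check=True) avoids shell quoting problems with filenames that contain spaces.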

PowerShell with ImageMagick

ImageMagick can render PDFs, but on Windows it usually delegates to Ghostscript under the hood:

magick -density 300 input.pdf[0] output.jpg

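The [0] selects the first page (ImageMagick page indexes are zero-based). Without it, ImageMagick renders every page and, for single-image formats like JPG, writes numbered files such as output-0.jpg and output-1.jpg.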

Converting on macOS

Preview

  1. Open the PDF in Preview.
  2. Select the page thumbnail you want.
  3. Choose File > Export, pick the format, and set the resolution.

Preview is fast and private, but it handles one page at a time.

Terminal with sips

macOS includes sips, but it does not render PDF text well. Use it only for PDFs that are already bitmaps:

sips -s format jpeg input.pdf --out output.jpg

For real PDF rendering, install Poppler via Homebrew:

brew install poppler
pdftoppm -jpeg -r 300 input.pdf output

Automator quick action

You can build a right-click service in Automator that runs pdftoppm on any selected PDF. This is useful if you convert pages regularly and do not want to remember flags.

Converting on Linux

Poppler is usually the best choice on Linux.

sudo apt install poppler-utils

pdftoppm -jpeg -r 300 input.pdf output
pdftoppm -png -r 300 input.pdf output

For WebP output, convert to PNG first and then use cwebp (part of the webp package on Debian and Ubuntu):

pdftoppm -png -r 300 input.pdf temp
cwebp temp-1.png -o output.webp

Batch conversion

If you have a folder of PDFs and want one image per first page:

for f in *.pdf; do
    pdftoppm -jpeg -r 300 -f 1 -l 1 "$f" "${f%.pdf}"
done
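
The same batch job can be sketched in Python for cases where you want error handling or logging around each file. The function names are ours. The sketch adds pdftoppm's -singlefile flag, which skips the page-number suffix so report.pdf becomes report.jpg; confirm that your Poppler build supports it.

```python
import subprocess
from pathlib import Path
from typing import List


def first_page_command(pdf_path: str, dpi: int = 300) -> List[str]:
    """Build the pdftoppm call that renders only page 1 of one PDF."""
    prefix = str(Path(pdf_path).with_suffix(""))  # report.pdf -> report
    return ["pdftoppm", "-jpeg", "-r", str(dpi), "-f", "1", "-l", "1",
            "-singlefile", pdf_path, prefix]


def convert_folder(folder: str, dpi: int = 300) -> None:
    """Render the first page of every PDF in a folder, stopping on the first failure."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True raises CalledProcessError if pdftoppm exits with an error.
        subprocess.run(first_page_command(str(pdf), dpi), check=True)


example = first_page_command("report.pdf")
```

Calling convert_folder(".") mirrors the shell loop above, with the difference that a broken PDF stops the run instead of being silently skipped.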

ImageMagick

ImageMagick works on Linux but is often slower for PDFs because it rasterizes through Ghostscript:

magick -density 300 "input.pdf[0]" -quality 90 output.jpg

Set -density before reading the PDF. Putting it after will not help. The quotes stop shells such as zsh from treating [0] as a glob pattern.

Converting with code

TypeScript / Node.js

Use pdfjs-dist to render pages to a canvas, then export to image data.

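// pdf.js recommends its legacy build for Node.js. The path below matches 4.x;
// older releases ship it as "pdfjs-dist/legacy/build/pdf.js".
import * as pdfjsLib from "pdfjs-dist/legacy/build/pdf.mjs"
import { createCanvas } from "canvas"
import fs from "fs"

async function pdfPageToPng(
  pdfPath: string,
  pageNumber: number, // one-based, as pdf.js expects
  outputPath: string,
  scale: number = 2
) {
  const data = new Uint8Array(fs.readFileSync(pdfPath))
  const pdf = await pdfjsLib.getDocument({ data }).promise
  const page = await pdf.getPage(pageNumber)

  // At scale 1, one PDF point (1/72 inch) maps to one pixel.
  const viewport = page.getViewport({ scale })
  const canvas = createCanvas(Math.ceil(viewport.width), Math.ceil(viewport.height))
  const context = canvas.getContext("2d")

  // node-canvas's context is not typed as a DOM context, so cast it for pdf.js.
  // The render parameters have changed across pdf.js major versions; check yours.
  await page.render({
    canvasContext: context as unknown as CanvasRenderingContext2D,
    viewport,
  }).promise

  const buffer = canvas.toBuffer("image/png")
  fs.writeFileSync(outputPath, buffer)

  await pdf.destroy()
}

// Top-level await needs an ES module (for example a .mts file).
await pdfPageToPng("input.pdf", 1, "output.png", 2)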

The scale value maps roughly to DPI: a scale of 2 at 72 DPI gives 144 DPI output. For 300 DPI, use a scale of about 4.17.
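
The scale-to-DPI relationship is simple enough to put in a helper. This is a minimal sketch with names of our own choosing; it assumes the renderer treats scale 1 as 72 DPI, which is how pdf.js viewports and PyMuPDF matrices behave.

```python
def scale_for_dpi(target_dpi: float, base_dpi: float = 72.0) -> float:
    """Return the scale factor that yields target_dpi when scale 1 equals base_dpi."""
    if target_dpi <= 0 or base_dpi <= 0:
        raise ValueError("DPI values must be positive")
    return target_dpi / base_dpi


def dpi_for_scale(scale: float, base_dpi: float = 72.0) -> float:
    """Return the effective DPI produced by a given scale factor."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return scale * base_dpi


print_scale = scale_for_dpi(300)   # about 4.17
retina_dpi = dpi_for_scale(2)      # 144.0
```

Storing the target DPI in configuration and deriving the scale from it keeps the intent readable, since "300" says more than "4.1667".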

In the browser, our PDF to PNG converter does the same render-to-canvas step without sending the file anywhere.

PHP

PHP can shell out to pdftoppm or use libraries like spatie/pdf-to-image, which wraps ImageMagick:

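<?php

require 'vendor/autoload.php'; // Composer autoloader

use Spatie\PdfToImage\Pdf;

// These method names follow the 2.x API of spatie/pdf-to-image. Later major
// versions renamed them, so check the README for the version you install.
$pdf = new Pdf('input.pdf');
$pdf->setPage(1)
    ->setOutputFormat('png')
    ->saveImage('output.png');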

If you prefer not to add a dependency, call Poppler directly:

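<?php

function pdfPageToPng(string $input, int $page, string $output, int $dpi = 300): void {
    // escapeshellarg() protects the two path arguments; 2>&1 captures error text.
    $cmd = sprintf(
        'pdftoppm -png -r %d -f %d -l %d %s %s 2>&1',
        $dpi,
        $page,
        $page,
        escapeshellarg($input),
        escapeshellarg($output)
    );
    exec($cmd, $lines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdftoppm failed ($exitCode): " . implode("\n", $lines));
    }
}

pdfPageToPng('input.pdf', 1, 'output', 300);

This writes output-1.png, or a zero-padded name such as output-01.png when the PDF has ten or more pages. The function throws if pdftoppm exits with a non-zero status; add logging and input validation to suit your application before using it in production.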

Go

Go has no built-in PDF renderer, but github.com/gen2brain/go-fitz wraps MuPDF:

package main

import (
	"image/jpeg"
	"os"

	"github.com/gen2brain/go-fitz"
)

func main() {
	doc, err := fitz.New("input.pdf")
	if err != nil {
		panic(err)
	}
	defer doc.Close()

	img, err := doc.Image(0)
	if err != nil {
		panic(err)
	}

	out, err := os.Create("output.jpg")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	if err := jpeg.Encode(out, img, &jpeg.Options{Quality: 90}); err != nil {
		panic(err)
	}
}

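Image takes a zero-based page index and renders at the library's default resolution, not at a value stored in the PDF. go-fitz also provides an ImageDPI method for choosing the resolution yourself; check the library docs for the exact signature in your version.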

Java

Apache PDFBox is the standard choice for Java PDF work.

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class PdfToImage {
    public static void main(String[] args) throws IOException {
        // PDFBox 3.x loads documents through Loader; PDDocument.load() was removed.
        try (PDDocument document = Loader.loadPDF(new File("input.pdf"))) {
            PDFRenderer renderer = new PDFRenderer(document);
            // Page index 0 is the first page; 300 is the output DPI.
            BufferedImage image = renderer.renderImageWithDPI(0, 300);
            ImageIO.write(image, "png", new File("output.png"));
        }
    }
}

Maven dependency:

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>3.0.2</version>
</dependency>

renderImageWithDPI takes a zero-based page index and a DPI value.

Python

Python makes this easy with pymupdf or pdf2image.

With pymupdf:

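import fitz  # PyMuPDF; recent releases also accept "import pymupdf"

doc = fitz.open("input.pdf")
page = doc[0]  # zero-based page index

# Scale both axes by 2, which gives 144 DPI (2 x 72).
mat = fitz.Matrix(2, 2)
pix = page.get_pixmap(matrix=mat)
pix.save("output.png")

doc.close()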

The Matrix controls scale. PDF page coordinates are in points, 72 to the inch, so for roughly 300 DPI use fitz.Matrix(300/72, 300/72).

With pdf2image, which wraps Poppler:

from pdf2image import convert_from_path

images = convert_from_path("input.pdf", dpi=300, first_page=1, last_page=1)
images[0].save("output.jpg", "JPEG", quality=90)

pdf2image is convenient but requires Poppler installed on the system.

Common pitfalls

  • Forgetting DPI. Default PDF rendering is often 72 or 96 DPI. At that resolution, text looks fuzzy. Always specify the output DPI if quality matters.
  • Ignoring color space. A CMYK PDF converted to RGB without a profile can look washed out or oversaturated.
  • Re-encoding scanned PDFs. If the PDF is already a JPEG scan, converting it to PNG will not recover lost detail. It will just make the file larger.
  • Font substitution. If a PDF references fonts without embedding them, a headless server often will not have those fonts installed. The renderer substitutes, and layout breaks. Embedding fonts when creating the PDF prevents this.
  • Page numbering off by one. Code APIs usually use zero-based page indexes. Command-line tools usually use one-based. There are exceptions: pdf.js getPage and pdf2image's first_page are one-based.
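
For the off-by-one pitfall, one approach is to convert at a single boundary and validate there. The helper names below are ours, and the sketch assumes you already know the document's page count.

```python
def to_zero_based(page_number: int, page_count: int) -> int:
    """Convert a human or CLI page number (1..page_count) to a zero-based index."""
    if not 1 <= page_number <= page_count:
        raise ValueError(f"page {page_number} is outside 1..{page_count}")
    return page_number - 1


def to_one_based(index: int, page_count: int) -> int:
    """Convert a zero-based index (0..page_count-1) to a one-based page number."""
    if not 0 <= index < page_count:
        raise ValueError(f"index {index} is outside 0..{page_count - 1}")
    return index + 1


first_index = to_zero_based(1, page_count=40)    # 0, for doc[0] style APIs
last_number = to_one_based(39, page_count=40)    # 40, for -f/-l style flags
```

Keeping user-facing numbers one-based and converting only where you call the library means a bad page number fails with a clear message instead of rendering the wrong page.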

What to use where

  • One-off personal use: Preview on macOS, a PDF reader on Windows, or a browser converter.
  • Batch processing on a server: pdftoppm or pdf2image.
  • Inside a product: Apache PDFBox for Java, pymupdf for Python, pdfjs-dist for TypeScript.
  • Privacy-sensitive files: Use a browser-based tool so the PDF never leaves the device.

If you want the fastest path without installing anything, our PDF to JPG, PDF to PNG, and PDF to WebP converters run entirely in the browser.
