Split PDF Script

This script splits a PDF file into a specified number of parts. It takes an input PDF file and the desired number of parts, calculates the page range for each part, and writes each part to a separate output file.


Installation Requirements

Before running the script, make sure to install pdftk on Ubuntu by running the following commands:

sudo apt-get update
sudo apt-get install -y pdftk
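
To confirm the installation worked, you can check that the pdftk command is on your PATH. The snippet below is a minimal sketch; the helper name require_cmd is hypothetical and is not part of split-pdf.sh.

```shell
# require_cmd is a hypothetical helper, not part of split-pdf.sh.
# It succeeds if the named command is found on PATH and fails otherwise.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd pdftk; then
  echo "pdftk is installed"
else
  echo "pdftk is missing; install it before running split-pdf.sh"
fi
```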

Usage

To run the script, make it executable (chmod +x split-pdf.sh), then use the following command:

./split-pdf.sh input_file number_of_parts

Replace input_file with the path to your PDF file, and number_of_parts with the number of parts you want to split the file into. For example:

./split-pdf.sh document.pdf 3

This command splits document.pdf into three files, document-1.pdf through document-3.pdf, and writes them to the current directory.
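
To see which files a given invocation would create, the naming rule can be sketched without calling pdftk. The helper expected_outputs and the example path are assumptions made for illustration only.

```shell
# expected_outputs is a hypothetical helper used only for illustration.
# It prints the output names split-pdf.sh would use, without touching any PDF.
expected_outputs() (
  base_name=$(basename "$1" .pdf)   # strip the directory and the .pdf extension
  parts=$2
  i=1
  while [ "$i" -le "$parts" ]; do
    echo "${base_name}-${i}.pdf"
    i=$((i + 1))
  done
)

# Example path is made up; any path ending in .pdf behaves the same way.
expected_outputs /home/user/docs/document.pdf 3
```

The names carry no directory component, so the parts land in the directory you run the script from, not next to the input file.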

How the Script Works

The script performs the following steps:

  1. Extracts the base name from the input file (removing the path and the .pdf extension).
  2. Uses pdftk to determine the total number of pages in the PDF.
  3. Calculates the number of pages per part by integer division of the total page count by the number of parts. Every part except the last gets this many pages; the last part also receives any leftover pages.
  4. Splits the PDF into the specified number of parts, creating output files in the current directory named with the base name followed by a part number (e.g., document-1.pdf, document-2.pdf).
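
The page-range arithmetic from steps 3 and 4 can be tried on its own. The following is a minimal sketch that mirrors the script's calculation; page_ranges is a hypothetical helper that only prints ranges and never opens a PDF.

```shell
# page_ranges is a hypothetical helper that mirrors the script's arithmetic.
# It prints one "start-end" range per part.
page_ranges() (
  total_pages=$1
  parts=$2
  pages_per_part=$((total_pages / parts))   # integer division
  start_page=1
  i=1
  while [ "$i" -le "$parts" ]; do
    if [ "$i" -eq "$parts" ]; then
      end_page=$total_pages                 # last part takes the leftover pages
    else
      end_page=$((start_page + pages_per_part - 1))
    fi
    echo "${start_page}-${end_page}"
    start_page=$((end_page + 1))
    i=$((i + 1))
  done
)

page_ranges 10 3
```

For a 10-page document split into 3 parts, this prints 1-3, 4-6 and 7-10, so the last part is one page longer than the others.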

split-pdf.sh

#!/bin/bash

# Installation Requirements on Ubuntu:
# sudo apt-get update
# sudo apt-get install -y pdftk

# Usage: ./split-pdf.sh input_file number_of_parts

# Input Arguments
input_file="$1"
number_of_parts="$2"

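# Check that both arguments are present and valid
if [[ -z "$input_file" || -z "$number_of_parts" ]]; then
  echo "Usage: $0 input_file number_of_parts" >&2
  exit 1
fi

if [[ ! -f "$input_file" ]]; then
  echo "Error: Input file not found: $input_file" >&2
  exit 1
fi

# number_of_parts must be a positive integer (this also prevents division by zero)
if [[ ! "$number_of_parts" =~ ^[1-9][0-9]*$ ]]; then
  echo "Error: number_of_parts must be a positive integer." >&2
  exit 1
fi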

# Extract base name from input file (remove path and extension)
base_name=$(basename "$input_file" .pdf)

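# Find the number of pages in the PDF
total_pages=$(pdftk "$input_file" dump_data | awk '/^NumberOfPages:/ {print $2}')

if [[ -z "$total_pages" ]]; then
  echo "Error: Could not determine the number of pages in the PDF." >&2
  exit 1
fi

# Each part needs at least one page
if [[ "$number_of_parts" -gt "$total_pages" ]]; then
  echo "Error: Cannot split $total_pages pages into $number_of_parts parts." >&2
  exit 1
fi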

# Calculate the number of pages per part (integer division);
# the last part also takes any leftover pages
pages_per_part=$((total_pages / number_of_parts))

start_page=1

# Split the PDF file into the specified number of parts
for ((i=1; i<=number_of_parts; i++)); do
  if [[ $i -eq $number_of_parts ]]; then
    end_page=$total_pages
  else
    end_page=$((start_page + pages_per_part - 1))
  fi

  output_file="${base_name}-${i}.pdf"

  # Test pdftk's exit status directly and report failures on stderr
  if pdftk "$input_file" cat "${start_page}-${end_page}" output "$output_file"; then
    echo "Generated: $output_file"
  else
    echo "Error creating: $output_file" >&2
    exit 1
  fi

  start_page=$((end_page + 1))

done
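
The loop above gives every leftover page to the last part, so 11 pages in 3 parts become parts of 3, 3 and 5 pages. One alternative, which is not what split-pdf.sh does, is to give one extra page to each of the first parts until the remainder is used up. The sketch below only prints the resulting ranges; balanced_ranges is a hypothetical helper.

```shell
# balanced_ranges is a hypothetical alternative to the script's loop.
# It spreads the leftover pages over the first parts instead of the last one.
balanced_ranges() (
  total_pages=$1
  parts=$2
  base=$((total_pages / parts))    # minimum pages per part
  extra=$((total_pages % parts))   # number of parts that get one more page
  start_page=1
  i=1
  while [ "$i" -le "$parts" ]; do
    size=$base
    if [ "$i" -le "$extra" ]; then
      size=$((size + 1))
    fi
    end_page=$((start_page + size - 1))
    echo "${start_page}-${end_page}"
    start_page=$((end_page + 1))
    i=$((i + 1))
  done
)

balanced_ranges 11 3
```

With this approach, 11 pages in 3 parts become 1-4, 5-8 and 9-11, so no two parts differ by more than one page.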