When working with large language models (LLMs), it’s often useful to feed them the source code of a project for analysis, refactoring, or documentation assistance.

However, simply copying folders is messy. You usually want only the text files that are tracked by Git, displayed cleanly, without binary files or other noise.

To streamline this, I wrote a Bash script that:

  • Ensures you are inside a Git repository.
  • Lists all Git-tracked files.
  • Filters out non-text files.
  • Skips empty or unreadable files.
  • Prints the content of each file, with a clean header showing the relative path.

Perfect for quickly preparing a project snapshot to paste into an LLM.

Here’s the script:

#!/bin/bash

# Exit immediately if a command exits with a non-zero status.
# Treat unset variables as an error.
# Pipelines return status of the last command to exit with non-zero status,
# or zero if all commands exit successfully.
set -euo pipefail

# Requires GNU realpath for --relative-to. On macOS, "brew install coreutils"
# installs it as grealpath unless the coreutils gnubin directory is on PATH.

# Check if the current directory is inside a Git work tree
if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
  echo "Error: This script must be run from within a Git repository." >&2
  exit 1
fi

# Function to process each file
process_file() {
    local file="$1"
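    # Skip files that are missing, unreadable, or empty. This check runs first
    # because Git also lists tracked files that were deleted from the working
    # tree, and a failing realpath would abort the script under 'set -e'.
    if [ ! -r "$file" ] || [ ! -s "$file" ]; then
        return
    fi

    # Use realpath to ensure consistent relative paths
    local relative_path
    relative_path="$(realpath --relative-to="." "$file")"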

    # Use 'file' command to determine the mime type.
    # -b (--brief): Do not prepend filename to output lines.
    # Check if the mime type starts with "text/". If it DOES NOT, skip the file.
    if ! file -b --mime-type "$file" | grep -q '^text/'; then
        return
    fi

    printf "=== File: %s ===\n\n" "$relative_path"
    cat "$file"
    printf "\n\n"
}

# Main script logic
main() {
    while IFS= read -r -d '' file; do
        process_file "$file"
    done < <(git ls-files -z)
}

# Run the main function
main

exit 0
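
The MIME check in process_file keeps only types that start with text/. Depending on the version of file, some plain-text formats such as JSON may be reported under application/ instead and would be skipped. Here is a minimal sketch of a wider check. The function name is_text_mime and the extra allow-list entries are my own assumptions, not part of the script above, so adjust them to what file reports on your system.

```shell
#!/bin/bash
# Sketch: decide whether a MIME type string should be treated as text.
# is_text_mime and the application/* entries are hypothetical additions.
is_text_mime() {
    local mime="$1"
    case "$mime" in
        text/*) return 0 ;;                      # same rule as the original check
        application/json|application/xml|application/javascript)
            return 0 ;;                          # assumed extra text-like types
        *) return 1 ;;                           # everything else is skipped
    esac
}

# Try the function on a few literal MIME strings.
for mime in text/x-shellscript application/json image/png; do
    if is_text_mime "$mime"; then
        echo "$mime: include"
    else
        echo "$mime: skip"
    fi
done
```

Inside process_file, the existing grep test could then be swapped for something like: if ! is_text_mime "$(file -b --mime-type "$file")"; then return; fi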

Notes

  • realpath is used for clean relative paths. Its --relative-to option comes from GNU coreutils, so you may need to install coreutils on some systems.
  • File type detection is based on the MIME type reported by the file command, not on extensions. Only types starting with text/ are kept.
  • The script uses strict error handling (set -euo pipefail), so it stops at the first failing command instead of continuing silently.
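
To make the pipefail part of that strict mode concrete, here is a small standalone sketch. It is separate from the script above, and the helper name pipeline_status is made up for the demonstration. It runs the same pipeline with and without pipefail and records the exit status each time.

```shell
#!/bin/bash
# Sketch: how 'pipefail' changes the exit status of a pipeline.
# pipeline_status is a hypothetical helper used only for this demo.
pipeline_status() {
    # The first command fails, the last one succeeds.
    if false | true; then
        echo 0
    else
        echo "$?"
    fi
}

# Without pipefail, only the last command's status counts.
set +o pipefail
status_default="$(pipeline_status)"

# With pipefail, a failure anywhere in the pipeline is reported.
set -o pipefail
status_pipefail="$(pipeline_status)"

echo "default: $status_default, pipefail: $status_pipefail"
```

Combined with set -e, this is why a failing command in the middle of a pipeline stops the script rather than going unnoticed.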

This tool has made working with LLMs much smoother when I need to expose full project contexts without manual cleanup.

Feel free to adapt it to your own workflow.