Copy a File/Directory in Python: shutil.copy, shutil.copytree
In Python, you can copy a file with shutil.copy()
or shutil.copy2()
, and a directory (folder) with shutil.copytree()
.
If you want to move or delete files and directories, refer to the following articles.
- Move a file/directory in Python (shutil.move)
- Delete a file/directory in Python (os.remove, shutil.rmtree)
This article does not cover detailed specifications, like handling symbolic links. For comprehensive information, please check the official Python documentation.
The sample code in this article imports the shutil
module as shown below. It is part of the standard library, so no additional installation is necessary.
import shutil
Copy a file with shutil.copy()
and shutil.copy2()
To copy a file, use shutil.copy()
or shutil.copy2()
.
Although both functions have identical usage, shutil.copy2()
attempts to copy metadata such as creation and modification times, which shutil.copy()
does not.
Basic usage
The first argument should be the path of the source file, and the second should be the path of the destination directory or file. The function returns the path of the newly copied file.
Paths can be represented as either strings or path-like objects like pathlib.Path
. When specifying a directory as a string, the trailing delimiter (/
) is optional.
Consider the following files and directories:
temp/
├── dir1/
├── dir2/
│ ├── file.txt
│ └── file2.txt
└── file.txt
If the second argument is an existing directory, the source file will be copied into it. If a file with the same name already exists in this directory, the source file will overwrite it.
new_path = shutil.copy('temp/file.txt', 'temp/dir1')
print(new_path)
# temp/dir1/file.txt
new_path = shutil.copy('temp/file.txt', 'temp/dir2')
print(new_path)
# temp/dir2/file.txt
If you provide a new path as the second argument, the file will be copied with that filename. Ensure all directories in the path exist before copying to avoid errors from non-existent directories.
new_path = shutil.copy('temp/file.txt', 'temp/dir1/file2.txt')
print(new_path)
# temp/dir1/file2.txt
# new_path = shutil.copy('temp/file.txt', 'temp/dir2/new_dir/file2.txt')
# FileNotFoundError: [Errno 2] No such file or directory: 'temp/dir2/new_dir/file2.txt'
Be aware that if you attempt to specify a new directory as the second argument, for example, temp/dir3
, it will be interpreted as a filename and copied as a file named dir3
. Specifying temp/dir3/
will result in an error. You must first create a directory.
If the second argument is an existing file, it will be overwritten by the source file.
new_path = shutil.copy('temp/file.txt', 'temp/dir2/file2.txt')
print(new_path)
# temp/dir2/file2.txt
The results are as follows. The content of temp/file.txt
, which was provided as the source in the first argument, is copied and replaces the content of the destination files.
temp/
├── dir1/
│ ├── file.txt
│ └── file2.txt
├── dir2/
│ ├── file.txt
│ └── file2.txt
└── file.txt
The difference between shutil.copy()
and shutil.copy2()
The primary difference between shutil.copy()
and shutil.copy2()
is their handling of metadata (creation time, modification time, etc.). While shutil.copy()
does not copy metadata, shutil.copy2()
attempts to do so.
In the case of shutil.copy()
, the copied file's modification time is the time when the copy was made. You can verify this by comparing the modification times of the source and destination files using os.path.getmtime()
.
import os
new_path = shutil.copy('temp/file.txt', 'temp/file_copy.txt')
print(new_path)
# temp/file_copy.txt
print(os.path.getmtime('temp/file.txt') == os.path.getmtime('temp/file_copy.txt'))
# False
In contrast, shutil.copy2()
attempts to copy as much metadata as possible. For example, the modification times of the source and destination files are the same.
new_path = shutil.copy2('temp/file.txt', 'temp/file_copy2.txt')
print(new_path)
# temp/file_copy2.txt
print(os.path.getmtime('temp/file.txt') == os.path.getmtime('temp/file_copy2.txt'))
# True
It is important to remember that shutil.copy2()
cannot copy all metadata.
Warning: Even the higher-level file copying functions (
shutil.copy()
,shutil.copy2()
) cannot copy all file metadata.On POSIX platforms, this means that file owner and group are lost as well as ACLs. On Mac OS, the resource fork and other metadata are not used. This means that resources will be lost and file type and creator codes will not be correct. On Windows, file owners, ACLs and alternate data streams are not copied. shutil — High-level file operations — Python 3.11.4 documentation
Copy a directory (folder) with shutil.copytree()
Basic usage
Use shutil.copytree()
to recursively copy a directory along with all its files and subdirectories.
Specify the path of the source directory as the first argument, and the path of the destination directory as the second. The function returns the path of the destination directory.
Consider the following files and directories:
temp/
└── dir1/
├── file.txt
└── subdir/
└── file2.txt
If you specify a new path (non-existent path) as the second argument, all directories, including intermediate ones, are created, and the contents are copied.
new_path = shutil.copytree('temp/dir1', 'temp/dir2/new_dir')
print(new_path)
# temp/dir2/new_dir
temp/
├── dir1/
│ ├── file.txt
│ └── subdir/
│ └── file2.txt
└── dir2/
└── new_dir/
├── file.txt
└── subdir/
└── file2.txt
By default, if you specify an existing directory as the destination, an error will occur. To avoid this, use the dirs_exist_ok
argument described next.
# new_path = shutil.copytree('temp/dir1', 'temp/dir2/new_dir')
# FileExistsError: [Errno 17] File exists: 'temp/dir2/new_dir'
Copy to an existing directory: dirs_exist_ok
As mentioned above, by default, specifying an existing directory as the second argument raises an error.
However, by setting the dirs_exist_ok
argument to True
, this error will be prevented. If the destination directory contains a file with the same name as one in the source directory, the destination file will be replaced.
new_path = shutil.copytree('temp/dir1', 'temp/dir2/new_dir', dirs_exist_ok=True)
print(new_path)
# temp/dir2/new_dir
Specify the copy function: copy_function
You can specify the function used for copying with the copy_function
argument.
The default is shutil.copy2()
, which tries to preserve as much metadata as possible. If you prefer not to duplicate the metadata, you can specify shutil.copy()
instead.
import os
new_path = shutil.copytree('temp/dir1', 'temp/dir_copy2')
print(new_path)
# temp/dir_copy2
print(os.path.getmtime('temp/dir1/file.txt') == os.path.getmtime('temp/dir_copy2/file.txt'))
# True
new_path = shutil.copytree('temp/dir1', 'temp/dir_copy', copy_function=shutil.copy)
print(new_path)
# temp/dir_copy
print(os.path.getmtime('temp/dir1/file.txt') == os.path.getmtime('temp/dir_copy/file.txt'))
# False
Specify files and directories to ignore: ignore
By specifying shutil.ignore_patterns()
to the ignore
argument, you can exclude certain files or directories from the copying process.
- shutil.ignore_patterns() — High-level file operations — Python 3.11.4 documentation
- shutil - copytree example — High-level file operations — Python 3.11.4 documentation
You can specify multiple glob-style patterns in shutil.ignore_patterns()
. You can use wildcards such as *
. Files and directories whose names match the provided patterns will not be copied.
Consider the following files and directories:
temp/
└── dir1/
├── .config
├── .dir/
├── file.txt
└── subdir/
├── file.jpg
└── file.txt
Here, files and directories starting with .
and files with the jpg
extension are ignored.
new_path = shutil.copytree(
'temp/dir1', 'temp/dir2', ignore=shutil.ignore_patterns('.*', '*.jpg')
)
print(new_path)
# temp/dir2
temp/
├── dir1/
│ ├── .config
│ ├── .dir/
│ ├── file.txt
│ └── subdir/
│ ├── file.jpg
│ └── file.txt
└── dir2/
├── file.txt
└── subdir/
└── file.txt
Copy multiple files based on certain conditions with wildcards and regex
For a practical example, let's examine how to copy multiple files based on certain conditions.
Please note that using **
in glob.glob()
can be time-consuming with a large number of files and directories. For improved efficiency, consider specifying conditions with other special characters when feasible.
If possible, using the ignore
argument of shutil.copytree()
to specify conditions can simplify the process.
Use wildcards to specify conditions
The glob
module allows you to use wildcard characters like *
to generate a list of file and directory names. For detailed usage, refer to the following article.
Consider the following files and directories:
temp/
└── dir1/
├── 123.jpg
├── 456.txt
├── abc.txt
└── subdir/
├── 000.csv
├── 789.txt
└── xyz.jpg
Without preserving the original directory structure
For example, you can extract and copy files with the txt
extension, including those in subdirectories. The destination directory must be created beforehand.
import shutil
import glob
import os
os.makedirs('temp/dir2')
for p in glob.glob('temp/dir1/**/*.txt', recursive=True):
if os.path.isfile(p):
shutil.copy(p, 'temp/dir2')
temp/
├── dir1/
│ ├── 123.jpg
│ ├── 456.txt
│ ├── abc.txt
│ └── subdir/
│ ├── 000.csv
│ ├── 789.txt
│ └── xyz.jpg
└── dir2/
├── 456.txt
├── 789.txt
└── abc.txt
In the example above, there's no problem. However, under different conditions, directories might be included in the results. Therefore, os.path.isfile()
is used to target only files.
If you want to copy directories, use os.path.isdir()
and shutil.copytree()
instead of os.path.isfile()
and shutil.copy()
.
With preserving the original directory structure
If you want to maintain the directory structure of the source, you can do as follows.
Specify the root_dir
argument in glob.glob()
, use os.path.dirname()
to extract the directory name, and use os.path.join()
to concatenate paths. You don't need to create the destination directory in advance because it's generated with os.makedirs()
.
- Get the filename, directory, extension from a path string in Python
- Create a directory with mkdir(), makedirs() in Python
src_dir = 'temp/dir1'
dst_dir = 'temp/dir3'
for p in glob.glob('**/*.txt', recursive=True, root_dir=src_dir):
if os.path.isfile(os.path.join(src_dir, p)):
os.makedirs(os.path.join(dst_dir, os.path.dirname(p)), exist_ok=True)
shutil.copy(os.path.join(src_dir, p), os.path.join(dst_dir, p))
temp/
├── dir1/
│ ├── 123.jpg
│ ├── 456.txt
│ ├── abc.txt
│ └── subdir/
│ ├── 000.csv
│ ├── 789.txt
│ └── xyz.jpg
├── dir2/
│ ├── 456.txt
│ ├── 789.txt
│ └── abc.txt
└── dir3/
├── 456.txt
├── abc.txt
└── subdir/
└── 789.txt
Use regex to specify conditions
For complex conditions that can't be handled by glob alone, use the re
module to specify conditions with regular expressions.
The basic approach remains the same as when using glob.glob()
alone.
Consider the following files and directories:
temp/
└── dir1/
├── 123.jpg
├── 456.txt
├── abc.txt
└── subdir/
├── 000.csv
├── 789.txt
└── xyz.jpg
Without preserving the original directory structure
Use glob.glob()
with **
and recursive=True
to recursively list all files and directories, and then filter them by regex with re.search()
.
For example, to extract and copy files with names consisting only of digits and either a txt
or csv
extension, including files in subdirectories, use \d
to match digits, +
to match one or more repetitions, and (a|b)
to match either a
or b
.
import shutil
import glob
import os
import re
os.makedirs('temp/dir2')
for p in glob.glob('temp/dir1/**', recursive=True):
if os.path.isfile(p) and re.search('\d+\.(txt|csv)', p):
shutil.copy(p, 'temp/dir2')
temp/
├── dir1/
│ ├── 123.jpg
│ ├── 456.txt
│ ├── abc.txt
│ └── subdir/
│ ├── 000.csv
│ ├── 789.txt
│ └── xyz.jpg
└── dir2/
├── 000.csv
├── 456.txt
└── 789.txt
With preserving the original directory structure
If you want to retain the directory structure of the source, do as follows.
src_dir = 'temp/dir1'
dst_dir = 'temp/dir3'
for p in glob.glob('**', recursive=True, root_dir=src_dir):
if os.path.isfile(os.path.join(src_dir, p)) and re.search('\d+\.(txt|csv)', p):
os.makedirs(os.path.join(dst_dir, os.path.dirname(p)), exist_ok=True)
shutil.copy(os.path.join(src_dir, p), os.path.join(dst_dir, p))
temp/
├── dir1/
│ ├── 123.jpg
│ ├── 456.txt
│ ├── abc.txt
│ └── subdir/
│ ├── 000.csv
│ ├── 789.txt
│ └── xyz.jpg
├── dir2/
│ ├── 000.csv
│ ├── 456.txt
│ └── 789.txt
└── dir3/
├── 456.txt
└── subdir/
├── 000.csv
└── 789.txt