Run the following code from a directory that contains a directory named bar (containing one or more files) and a directory named baz (also containing one or more files). Make sure there is not a directory named foo.
import shutil
shutil.copytree('bar', 'foo')
shutil.copytree('baz', 'foo')
It will fail with:
$ python copytree_test.py
Traceback (most recent call last):
File "copytree_test.py", line 5, in <module>
shutil.copytree('baz', 'foo')
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/shutil.py", line 110, in copytree
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/os.py", line 172, in makedirs
OSError: [Errno 17] File exists: 'foo'
I want this to work the same way as if I had typed:
$ mkdir foo
$ cp bar/* foo/
$ cp baz/* foo/
Do I need to use shutil.copy() to copy each file in baz into foo? (After I've already copied the contents of 'bar' into 'foo' with shutil.copytree()?) Or is there an easier/better way?
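For reference, the shell commands above copy only the regular files at the top level of each directory. A minimal Python sketch of that behaviour might look like the following. `copy_top_level_files` is a hypothetical helper name, not a standard library function; it skips subdirectories just as `cp` without `-r` does, but unlike the shell glob `*` it also copies dotfiles.

```python
import os
import shutil

def copy_top_level_files(src, dst):
    """Copy the regular files directly inside src into dst (no recursion)."""
    os.makedirs(dst, exist_ok=True)  # like `mkdir foo`, but tolerant of an existing foo
    copied = []
    for entry in os.scandir(src):
        if entry.is_file():
            # shutil.copy accepts a directory as destination and keeps the file name
            copied.append(shutil.copy(entry.path, dst))
    return sorted(copied)
```

Calling it once for bar and once for baz with the same destination leaves the files of both directories side by side in foo.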
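Here's a solution that's part of the standard library (note that distutils was deprecated in Python 3.10 and removed in Python 3.12, so this only works on older versions):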
from distutils.dir_util import copy_tree
copy_tree("/a/b/c", "/x/y/z")
See this similar question: Copy directory contents into a directory with python
Reference - https://docs.python.org/3/distutils/apiref.html#distutils.dir_util.copy_tree
This limitation of the standard shutil.copytree seems arbitrary and annoying. Workaround:
import os, shutil

def copytree(src, dst, symlinks=False, ignore=None):
    for item in os.listdir(src):
        s = os.path.join(src, item)
        d = os.path.join(dst, item)
        if os.path.isdir(s):
            shutil.copytree(s, d, symlinks, ignore)
        else:
            shutil.copy2(s, d)
Note that it's not entirely consistent with the standard copytree:
it doesn't honor the symlinks and ignore parameters for the root directory of the src tree;
it doesn't raise shutil.Error for errors at the root level of src;
in case of errors during copying of a subtree, it will raise shutil.Error for that subtree instead of trying to copy other subtrees and raising a single combined shutil.Error.
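The differences listed above could be narrowed with a variant that applies ignore at the root and collects errors into one combined shutil.Error. The following is only a sketch under those assumptions: `merge_tree` is a hypothetical name, and it requires Python 3.3+ for the `follow_symlinks` argument of `shutil.copy2`.

```python
import os
import shutil

def merge_tree(src, dst, symlinks=False, ignore=None):
    """Copy src into dst (which may already exist), collecting all errors."""
    names = os.listdir(src)
    ignored = ignore(src, names) if ignore is not None else set()
    os.makedirs(dst, exist_ok=True)
    errors = []
    for name in names:
        if name in ignored:
            continue
        s = os.path.join(src, name)
        d = os.path.join(dst, name)
        try:
            if os.path.isdir(s) and not (symlinks and os.path.islink(s)):
                merge_tree(s, d, symlinks, ignore)
            else:
                # with symlinks=True a link is recreated instead of followed
                shutil.copy2(s, d, follow_symlinks=not symlinks)
        except shutil.Error as err:
            # a recursive call reports a list of (src, dst, reason) tuples
            detail = err.args[0]
            errors.extend(detail if isinstance(detail, list) else [(s, d, str(err))])
        except OSError as why:
            errors.append((s, d, str(why)))
    if errors:
        raise shutil.Error(errors)
    return dst
```

Because shutil.Error is a subclass of OSError, it has to be caught first so that errors from subtrees are merged rather than wrapped.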
Python 3.8 introduced the dirs_exist_ok argument to shutil.copytree:
Recursively copy an entire directory tree rooted at src to a directory named dst and return the destination directory. dirs_exist_ok dictates whether to raise an exception in case dst or any missing parent directory already exists.
Therefore, with Python 3.8+ this should work:
import shutil
shutil.copytree('bar', 'foo') # Will fail if `foo` exists
shutil.copytree('baz', 'foo', dirs_exist_ok=True) # Fine
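To check this behaviour without touching real data, the scenario from the question can be rebuilt in a scratch directory. This is a small sketch (the `merge_demo` name and the file names are made up for illustration) and needs Python 3.8 or later:

```python
import os
import shutil
import tempfile

def merge_demo():
    """Build bar/ and baz/ in a scratch directory and merge both into foo/."""
    root = tempfile.mkdtemp()
    bar, baz, foo = (os.path.join(root, n) for n in ("bar", "baz", "foo"))
    for directory, filename in ((bar, "a.txt"), (baz, "b.txt")):
        os.makedirs(directory)
        with open(os.path.join(directory, filename), "w") as fh:
            fh.write(filename)
    shutil.copytree(bar, foo)                      # creates foo
    shutil.copytree(baz, foo, dirs_exist_ok=True)  # merges into the existing foo
    result = sorted(os.listdir(foo))
    shutil.rmtree(root)                            # clean up the scratch directory
    return result
```

Keep in mind that with dirs_exist_ok=True, files already present in the destination are overwritten by files of the same name from the source.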
A slight improvement on atzz's answer, in which the function above always tries to copy every file from source to destination:
import os
import shutil

def copytree(src, dst, symlinks=False, ignore=None):
    if not os.path.exists(dst):
        os.makedirs(dst)
    for item in os.listdir(src):
        s = os.path.join(src, item)
        d = os.path.join(dst, item)
        if os.path.isdir(s):
            copytree(s, d, symlinks, ignore)
        else:
            # copy only if the destination is missing or more than 1 second older
            if not os.path.exists(d) or os.stat(s).st_mtime - os.stat(d).st_mtime > 1:
                shutil.copy2(s, d)
In the above implementation:
The output directory is created if it does not already exist.
Directories are copied by recursively calling my own function.
When it comes to actually copying a file, I check whether the file has been modified, and copy it only then.
I am using the above function with an SCons build. It helps a lot, because each time I compile I don't need to copy the entire set of files, only the files that were modified.
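The modification check can also be pulled out into a small helper, which makes it easy to test on its own. This is a sketch only: `needs_copy` and its `tolerance` parameter are hypothetical names, and the one-second default simply mirrors the comparison used above (presumably there to absorb timestamp rounding on filesystems with coarse resolution).

```python
import os

def needs_copy(src_file, dst_file, tolerance=1.0):
    """Return True if dst_file is missing or older than src_file by more than tolerance seconds."""
    if not os.path.exists(dst_file):
        return True
    return os.stat(src_file).st_mtime - os.stat(dst_file).st_mtime > tolerance
```

Since shutil.copy2 preserves the modification time, a file copied this way is not copied again on the next run unless the source changes.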
import os
import shutil
import stat

def copytree(src, dst, symlinks=False, ignore=None):
    if not os.path.exists(dst):
        os.makedirs(dst)
        shutil.copystat(src, dst)
    lst = os.listdir(src)
    if ignore:
        excl = ignore(src, lst)
        lst = [x for x in lst if x not in excl]
    for item in lst:
        s = os.path.join(src, item)
        d = os.path.join(dst, item)
        if symlinks and os.path.islink(s):
            # recreate the link itself rather than copying its target
            if os.path.lexists(d):
                os.remove(d)
            os.symlink(os.readlink(s), d)
            try:
                st = os.lstat(s)
                mode = stat.S_IMODE(st.st_mode)
                os.lchmod(d, mode)
            except (AttributeError, OSError):
                pass  # lchmod not available
        elif os.path.isdir(s):
            copytree(s, d, symlinks, ignore)
        else:
            shutil.copy2(s, d)
Same behavior as shutil.copytree, with symlinks and ignore parameters
Creates the destination directory structure if nonexistent
Will not fail if dst already exists
The destination directory, named by dst, must not already exist; it will be created as well as missing parent directories.
I think your best bet is to os.walk the second and all subsequent directories, copy2 each directory and file, and do an additional copystat for directories. After all, that's precisely what copytree does, as explained in the docs. Or you could copy and copystat each directory/file and use os.listdir instead of os.walk.
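The os.walk approach described above could be sketched roughly as follows. `merge_with_walk` is a hypothetical name; the sketch assumes Python 3.2+ for `exist_ok`, and it does not handle symlinked directories, which os.walk lists but does not descend into by default.

```python
import os
import shutil

def merge_with_walk(src, dst):
    """Copy the tree under src into dst, creating or reusing directories as needed."""
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = dst if rel == os.curdir else os.path.join(dst, rel)
        os.makedirs(target, exist_ok=True)  # tolerate directories that already exist
        shutil.copystat(root, target)       # carry over directory metadata
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target, name))
    return dst
```

Files that exist only in the destination are left alone; files with the same relative path are overwritten.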
This is inspired from the original best answer provided by atzz, I just added replace file / folder logic. So it doesn't actually merge, but deletes the existing file/ folder and copies the new one:
import shutil
import os

def copytree(src, dst, symlinks=False, ignore=None):
    for item in os.listdir(src):
        s = os.path.join(src, item)
        d = os.path.join(dst, item)
        if os.path.exists(d):
            try:
                shutil.rmtree(d)
            except Exception as e:
                # rmtree fails on plain files, so fall back to unlink
                print(e)
                os.unlink(d)
        if os.path.isdir(s):
            shutil.copytree(s, d, symlinks, ignore)
        else:
            shutil.copy2(s, d)
    #shutil.rmtree(src)
Uncomment the rmtree to make it a move function.
Here is my pass at the problem. I modified the source code for copytree to keep the original functionality, but now no error occurs when the directory already exists. I also changed it so it doesn't overwrite existing files but rather keeps both copies, one with a modified name, since this was important for my application.
import shutil
import os

def _copytree(src, dst, symlinks=False, ignore=None):
    """
    This is an improved version of shutil.copytree which allows writing to
    existing folders and does not overwrite existing files but instead appends
    a ~1 to the file name and adds it to the destination path.
    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()
    if not os.path.exists(dst):
        os.makedirs(dst)
        shutil.copystat(src, dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        i = 1
        while os.path.exists(dstname) and not os.path.isdir(dstname):
            parts = name.split('.')
            file_name = ''
            file_extension = parts[-1]
            # make a new file name inserting ~1 between name and extension
            for j in range(len(parts)-1):
                file_name += parts[j]
                if j < len(parts)-2:
                    file_name += '.'
            suffix = file_name + '~' + str(i) + '.' + file_extension
            dstname = os.path.join(dst, suffix)
            i += 1  # try the next number if this name is taken too
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                _copytree(srcname, dstname, symlinks, ignore)
            else:
                shutil.copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files (must precede OSError, its base class)
        except shutil.Error as err:
            errors.extend(err.args[0])
        except (IOError, os.error) as why:
            errors.append((srcname, dstname, str(why)))
    try:
        shutil.copystat(src, dst)
    except OSError as why:
        # can't copy file access times on Windows
        if getattr(why, 'winerror', None) is None:
            errors.append((src, dst, str(why)))
    if errors:
        raise shutil.Error(errors)
Here is a version that expects a pathlib.Path as input.
import os
import shutil

# Recursively copies the content of the directory src to the directory dst.
# If dst doesn't exist, it is created, together with all missing parent directories.
# If a file from src already exists in dst, the file in dst is overwritten.
# Files already existing in dst which don't exist in src are preserved.
# Symlinks inside src are copied as symlinks, they are not resolved before copying.
def copy_dir(src, dst):
    dst.mkdir(parents=True, exist_ok=True)
    for item in os.listdir(src):
        s = src / item
        d = dst / item
        if s.is_dir():
            copy_dir(s, d)
        else:
            shutil.copy2(str(s), str(d))
Note that this function requires Python 3.6, which is the first version of Python where os.listdir() supports path-like objects as input. If you need to support earlier versions of Python, you can replace listdir(src) by listdir(str(src)).
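Another way to sidestep the os.listdir() version question is to iterate with Path.iterdir(), which yields Path objects directly. The following is a sketch of that alternative, not the answer's original code; `copy_dir_iter` is a hypothetical name.

```python
import shutil
from pathlib import Path

def copy_dir_iter(src: Path, dst: Path) -> None:
    """Recursively copy the contents of src into dst, creating dst if needed."""
    dst.mkdir(parents=True, exist_ok=True)
    for s in src.iterdir():
        d = dst / s.name
        if s.is_dir():
            copy_dir_iter(s, d)
        else:
            # str() keeps this working where shutil does not accept Path objects
            shutil.copy2(str(s), str(d))
```

As with the function above, existing files in dst are overwritten and files that exist only in dst are preserved.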
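import glob
import os
import shutil

def copy_dir(source_item, destination_item):
    if os.path.isdir(source_item):
        # the original called an undefined make_dir(); os.makedirs is assumed here
        os.makedirs(destination_item, exist_ok=True)
        sub_items = glob.glob(os.path.join(source_item, '*'))
        for sub_item in sub_items:
            copy_dir(sub_item, os.path.join(destination_item, os.path.basename(sub_item)))
    else:
        shutil.copy(source_item, destination_item)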
Here is a version inspired by this thread that more closely mimics distutils.file_util.copy_file.
updateonly is a bool; if True, only files with modification dates newer than the existing files in dst are copied, unless they are listed in forceupdate, in which case they are copied regardless.
ignore and forceupdate expect lists of filenames or folder/filenames relative to src and accept Unix-style wildcards similar to glob or fnmatch.
The function returns a list of files copied (or that would be copied if dryrun is True).
import os
import shutil
import fnmatch
import stat
import itertools

def copyToDir(src, dst, updateonly=True, symlinks=True, ignore=None, forceupdate=None, dryrun=False):

    def copySymLink(srclink, destlink):
        if os.path.lexists(destlink):
            os.remove(destlink)
        os.symlink(os.readlink(srclink), destlink)
        try:
            st = os.lstat(srclink)
            mode = stat.S_IMODE(st.st_mode)
            os.lchmod(destlink, mode)
        except (AttributeError, OSError):
            pass  # lchmod not available

    fc = []
    if not os.path.exists(dst) and not dryrun:
        os.makedirs(dst)
        shutil.copystat(src, dst)
    if ignore is not None:
        ignorepatterns = [os.path.join(src, *x.split('/')) for x in ignore]
    else:
        ignorepatterns = []
    if forceupdate is not None:
        forceupdatepatterns = [os.path.join(src, *x.split('/')) for x in forceupdate]
    else:
        forceupdatepatterns = []
    srclen = len(src)
    for root, dirs, files in os.walk(src):
        fullsrcfiles = [os.path.join(root, x) for x in files]
        t = root[srclen+1:]
        dstroot = os.path.join(dst, t)
        fulldstfiles = [os.path.join(dstroot, x) for x in files]
        excludefiles = list(itertools.chain.from_iterable([fnmatch.filter(fullsrcfiles, pattern) for pattern in ignorepatterns]))
        forceupdatefiles = list(itertools.chain.from_iterable([fnmatch.filter(fullsrcfiles, pattern) for pattern in forceupdatepatterns]))
        for directory in dirs:
            # build paths from the directory currently being walked, not from src
            fullsrcdir = os.path.join(root, directory)
            fulldstdir = os.path.join(dstroot, directory)
            if os.path.islink(fullsrcdir):
                if symlinks and dryrun is False:
                    copySymLink(fullsrcdir, fulldstdir)
            else:
                if not os.path.exists(fulldstdir) and dryrun is False:
                    os.makedirs(fulldstdir)
                    shutil.copystat(fullsrcdir, fulldstdir)
        for s, d in zip(fullsrcfiles, fulldstfiles):
            if s not in excludefiles:
                if updateonly:
                    go = False
                    if os.path.isfile(d):
                        srcdate = os.stat(s).st_mtime
                        dstdate = os.stat(d).st_mtime
                        if srcdate > dstdate:
                            go = True
                    else:
                        go = True
                    if s in forceupdatefiles:
                        go = True
                    if go is True:
                        fc.append(d)
                        if not dryrun:
                            if os.path.islink(s) and symlinks is True:
                                copySymLink(s, d)
                            else:
                                shutil.copy2(s, d)
                else:
                    fc.append(d)
                    if not dryrun:
                        if os.path.islink(s) and symlinks is True:
                            copySymLink(s, d)
                        else:
                            shutil.copy2(s, d)
    return fc
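The way the ignore and forceupdate patterns are joined onto src and matched with fnmatch can be seen in isolation in the sketch below. `select_matches` is a hypothetical helper written for illustration, not part of the function above.

```python
import fnmatch
import os

def select_matches(paths, patterns, base):
    """Return the paths matching any pattern, with patterns given relative to base."""
    full_patterns = [os.path.join(base, *p.split('/')) for p in patterns]
    matched = []
    for pattern in full_patterns:
        # fnmatch.filter keeps the entries of paths that match the pattern
        matched.extend(fnmatch.filter(paths, pattern))
    return matched
```

One thing worth knowing: fnmatch does not treat path separators specially, so a pattern such as `*.pyc` also matches files in subdirectories once it is joined onto the base path.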
The previous solution has an issue: src may overwrite dst without any notification or exception.
I added a predict_error function to predict errors before copying. copytree is mainly based on Cyrille Pontvieux's version.
Using predict_error to find all errors up front is best, unless you like seeing exceptions raised one after another when executing copytree until all errors are fixed.
import os
import shutil
import stat

def predict_error(src, dst):
    if os.path.exists(dst):
        src_isdir = os.path.isdir(src)
        dst_isdir = os.path.isdir(dst)
        if src_isdir and dst_isdir:
            pass
        elif src_isdir and not dst_isdir:
            yield {dst:'src is dir but dst is file.'}
        elif not src_isdir and dst_isdir:
            yield {dst:'src is file but dst is dir.'}
        else:
            yield {dst:'already exists a file with same name in dst'}
    if os.path.isdir(src):
        for item in os.listdir(src):
            s = os.path.join(src, item)
            d = os.path.join(dst, item)
            for e in predict_error(s, d):
                yield e

def copytree(src, dst, symlinks=False, ignore=None, overwrite=False):
    '''
    Overwrites when src and dst are both files,
    but never replaces a file with a folder or vice versa.
    '''
    if not overwrite:
        errors = list(predict_error(src, dst))
        if errors:
            raise Exception('copy would overwrite some file, error detail:%s' % errors)
    if not os.path.exists(dst):
        os.makedirs(dst)
        shutil.copystat(src, dst)
    lst = os.listdir(src)
    if ignore:
        excl = ignore(src, lst)
        lst = [x for x in lst if x not in excl]
    for item in lst:
        s = os.path.join(src, item)
        d = os.path.join(dst, item)
        if symlinks and os.path.islink(s):
            if os.path.lexists(d):
                os.remove(d)
            os.symlink(os.readlink(s), d)
            try:
                st = os.lstat(s)
                mode = stat.S_IMODE(st.st_mode)
                os.lchmod(d, mode)
            except (AttributeError, OSError):
                pass  # lchmod not available
        elif os.path.isdir(s):
            # pass overwrite along so subdirectories follow the same policy
            copytree(s, d, symlinks, ignore, overwrite)
        else:
            if not overwrite:
                if os.path.exists(d):
                    continue
            shutil.copy2(s, d)
import os
import shutil

# The def line and the assignment of h were missing from this snippet;
# they are reconstructed here from the call at the bottom and the os.chdir(h) below.
def copydir(src, dst):
    h = os.getcwd()  # remember the working directory so it can be restored
    src = r"{}".format(src)
    if not os.path.isdir(dst):
        print("\n[!] No Such directory: ["+dst+"] !!!")
        exit(1)
    if not os.path.isdir(src):
        print("\n[!] No Such directory: ["+src+"] !!!")
        exit(1)
    if "\\" in src:
        c = "\\"
        tsrc = src.split("\\")[-1:][0]
    else:
        c = "/"
        tsrc = src.split("/")[-1:][0]
    os.chdir(dst)
    if os.path.isdir(tsrc):
        print("\n[!] The Directory Is already exists !!!")
        exit(1)
    try:
        os.mkdir(tsrc)
    except OSError:
        print("\n[!] Error: In[ {} ]\nPlease Check Your Directory Path !!!".format(src))
        exit(1)
    os.chdir(h)
    files = []
    for i in os.listdir(src):
        files.append(src+c+i)
    if len(files) > 0:
        for i in files:
            if not os.path.isdir(i):
                shutil.copy2(i, dst+c+tsrc)
    print("\n[*] Done ! :)")

# raw strings, so that "\f" is not read as a form-feed escape
copydir(r"c:\folder1", r"c:\folder2")
I couldn't edit Boris Dalstein's answer above, so here is an improved version of that code.
EDIT: the improvements made:
The input args can be str paths or pathlib.Path objects. Type hints will help.
If the source is a directory, that directory is created in the destination as well.
Types are declared for local variables, so the IDE gives no warnings.
import os
import pathlib
import shutil
from typing import Union

# Recursively copies the content of the directory src to the directory dst.
# If dst doesn't exist, it is created, together with all missing parent directories.
# If a file from src already exists in dst, the file in dst is overwritten.
# Files already existing in dst which don't exist in src are preserved.
# Symlinks inside src are copied as symlinks, they are not resolved before copying.
def copy_dir(source: Union[str, pathlib.Path], destination: Union[str, pathlib.Path]):
    source_path: pathlib.Path
    destination_path: pathlib.Path
    if isinstance(source, str):
        source_path = pathlib.Path(source)
    elif isinstance(source, pathlib.Path):
        source_path = source
    if isinstance(destination, str):
        destination_path = pathlib.Path(destination)
    elif isinstance(destination, pathlib.Path):
        destination_path = destination
    destination_path.mkdir(parents=True, exist_ok=True)
    if source_path.is_dir():
        destination_path = destination_path.joinpath(source_path.name)
        destination_path.mkdir(parents=True, exist_ok=True)
    for item in os.listdir(source_path):
        s: pathlib.Path = source_path / item
        d: pathlib.Path = destination_path / item
        if s.is_dir():
            # pass the parent directory: copy_dir itself appends s.name,
            # so passing d here would nest each subdirectory twice
            copy_dir(s, destination_path)
        else:
            shutil.copy2(str(s), str(d))
I would assume the fastest and simplest way would be to have Python call the system commands.
Example:
import os
cmd = '<command line call>'
os.system(cmd)
Tar and gzip up the directory, then unzip and untar it in the desired place.
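As one possible concrete form of this idea, the sketch below shells out to `cp` through subprocess instead of os.system. It is only an illustration under stated assumptions: `copy_with_cp` is a hypothetical name, it relies on a POSIX `cp` being on the PATH (so it will not work on a stock Windows install), and it uses plain `cp -R` rather than the tar/gzip round trip.

```python
import os
import subprocess

def copy_with_cp(src, dst):
    """Merge the contents of src into dst by shelling out to POSIX cp."""
    os.makedirs(dst, exist_ok=True)
    # "src/." tells cp to copy the directory's contents rather than the directory itself
    subprocess.run(["cp", "-R", os.path.join(src, "."), dst], check=True)
    return dst
```

Passing the arguments as a list avoids shell quoting problems with unusual path names, and check=True raises subprocess.CalledProcessError if cp fails.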