Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
–
–
Immediate Issue: Shell Quoting vs Python Quoting
In an unquoted context in shell,
\;
prevents
;
from being parsed as a command separator. When invoking a command with an argument list from Python without
shell=True
, however, there
is
no shell to interpret and remove that backslash, and that extra quoting makes behavior equivalent to passing
'\;'
in shell -- meaning the argument passed to
find
isn't the exact string
;
it's expecting, and that you get a syntax error.
However, the original code is incorrect (in security-impacting ways!) on other counts as well; one of the below solutions should be used.
Smallest Change: Using
find
Safely
Personally, if you're willing to be a little flexible on the output names (ie.
testmkv-27311.mkv.avi
), I would write this more like the following:
subprocess.call([
'find', '/',
'-name', 'testmkv-27311.mkv',
'-exec',
'ffmpeg',
'-i', '{}',
'-vcodec', 'copy',
'-acodec', 'copy',
'-codec:v', 'libx264',
'-profile:v', 'high',
'-preset', 'slow',
'-b:v', '500k',
'-maxrate', '500k',
'-bufsize', '1000k',
'-vf', 'scale=-1:480',
'-threads', '0',
'-codec:a', 'libfdk_aac',
'-b:a', '128k',
'{}.mp4',
Notably:
We are not invoking a shell, and thus avoiding shell-injection-style security vulnerabilities.
Quoting is all done purely with Python syntax -- no nested shell quoting is used or required.
POSIX does not require find -exec
to support {}
as a substring (rather than a complete argument). As such, this answer is (like the code in the question) not as portable as might be hoped.
Alternate Approach: Native Python File Searching
That said, there isn't really a compelling reason to use find
at all, when the Python standard library can do searches for you, and will make it easy to update the names yourself:
def convert(filename_in, filename_out):
subprocess.call([
'ffmpeg',
'-i', filename_in,
'-vcodec', 'copy',
'-acodec', 'copy',
'-codec:v', 'libx264',
'-profile:v', 'high',
'-preset', 'slow',
'-b:v', '500k',
'-maxrate', '500k',
'-bufsize', '1000k',
'-vf', 'scale=-1:480',
'-threads', '0',
'-codec:a', 'libfdk_aac',
'-b:a', '128k',
filename_out,
target = 'testmkv-27311.mkv'
for root, dir, files in os.walk('/'):
if target in files:
filename_in = os.path.join(dir, target)
filename_out = filename_in[:-3]+'.mp4'
convert(filename_in, filename_out)
Alternate Safe Approach: Hardcoded Script Iterating Over Arguments
As yet another approach to doing this securely (not having find
modify or generate code from names, which allows those names to be parsed as code by the shell), you could have your bash script iterate over its arguments:
bash_script=r'''
for filename; do
ffmpeg -i "$filename" \
-vcodec copy \
-acodec copy \
-codec:v libx264 \
-profile:v high \
-preset slow \
-b:v 500k \
-maxrate 500k \
-bufsize 1000k \
-vf scale=-1:480 \
-threads 0 \
-codec:a libfdk_aac \
-b:a 128k \
"${filename%.mkv}.mp4}"
subprocess.call([
'find', '/',
'-name', 'testmkv-27311.mkv',
'-exec', 'bash', '-c', bash_script, '_',
'{}', '+'])
In this case, the _
is the placeholder for $0
in the script; subsequent arguments (the filenames found by find
) are passed in $1
, $2
, etc; and for filename; do
(which defaults to for filename in "$@"; do
) will iterate over them.
Using -exec ... {} +
passes as many as filenames as possible to each command, rather than invoking the command once per filename; this reduces the number of shells invoked, and thus enhances performance.
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.