If you cannot guarantee what information an FTP server might choose to return from its listing
command, how are you going to tell directories from normal files, an essential step in downloading
entire trees of files from the server?
The answer, shown in recursedl.py, is to try a cwd() into every name that nlst() returns and,
if you succeed, conclude that the entity is a directory. This sample program does not do any actual
downloading; instead, to keep things simple, it just prints the directories it visits to the screen.
    import os, sys
    from ftplib import FTP, error_perm

    def walk_dir(f, dirpath):
        original_dir = f.pwd()
        try:
            f.cwd(dirpath)
        except error_perm:
            return  # ignore non-directories and ones we cannot enter
        print(dirpath)
        names = f.nlst()
        for name in names:
            walk_dir(f, dirpath + '/' + name)
        f.cwd(original_dir)  # return to cwd of our caller

    f = FTP('ftp.kernel.org')
    f.login()
    walk_dir(f, '/pub/linux/kernel/Historic/old-versions')
    f.quit()
This sample program will run a bit slowly (there are, it turns out, quite a few files in the old-versions directory on the Linux Kernel Archive), but within a few dozen seconds you should see the resulting directory tree displayed on the screen:
    root@erlerobot:~/Python_files# python recursedl.py
    /pub/linux/kernel/Historic/old-versions
    /pub/linux/kernel/Historic/old-versions/impure
    /pub/linux/kernel/Historic/old-versions/old
    /pub/linux/kernel/Historic/old-versions/old/corrupt
    /pub/linux/kernel/Historic/old-versions/tytso
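Since recursedl.py only prints directories, it may help to see how the same cwd() test could sort a listing into directories and plain files, which is what a real downloader would need. The following is a minimal sketch, not part of recursedl.py: the helper names is_directory and split_listing are hypothetical, and it assumes, as recursedl.py does, that nlst() returns bare names rather than full paths. Note that a directory the server refuses to let you enter will be reported as a file.

```python
from ftplib import error_perm


def is_directory(ftp, path):
    """Return True if the server lets us cwd() into path (hypothetical helper)."""
    original_dir = ftp.pwd()
    try:
        ftp.cwd(path)
    except error_perm:
        # Either a plain file or a directory we are not allowed to enter.
        return False
    ftp.cwd(original_dir)  # undo the successful cwd()
    return True


def split_listing(ftp, dirpath):
    """Sort the names inside dirpath into (directories, files).

    Assumes nlst() returns bare names, so each one is joined to dirpath.
    """
    original_dir = ftp.pwd()
    ftp.cwd(dirpath)
    dirs, files = [], []
    try:
        for name in ftp.nlst():
            full_path = dirpath + '/' + name
            if is_directory(ftp, full_path):
                dirs.append(full_path)
            else:
                files.append(full_path)
    finally:
        ftp.cwd(original_dir)  # return to the cwd of our caller
    return dirs, files
```

With a logged-in FTP object f, calling split_listing(f, '/pub/linux/kernel/Historic/old-versions') would return two lists of paths; the files could then be fetched with retrbinary(). Every name costs at least one extra round trip to the server, so this approach shares the slowness noted above.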