When faced with a requirement to automate some file transfers, a common reaction is, “I can roll my own solution with a simple batch / VB / etc. script, right?” At first glance, it seems simple enough. However, the devil (as always) is in the details – and there are a LOT of details. This article will highlight a few of the more common pitfalls of automating FTP sessions and their solutions.
Lack of Standard Directory Listing Format
One of the details that turns what sounds like a one-hour FTP automation project into weeks of headaches is the lack of a standard directory listing format with FTP servers.
Almost all FTP automation workflows eventually include the need to determine what files and folders are on the server, and to retrieve information about those files and folders. In practice, the only dependable way to do this is to retrieve and parse a directory listing. That is where things get tricky: there is no universally supported standard format in which an FTP server provides this information.
Here are three examples of directory listings returned from different FTP servers:
AS/400 FTP Server Directory Listing
24576 07.12.94 10:55:06 *DIR subdir/
TTS 512 99-06-12 09:08:00 *FILE file1.txt
TTS 256512 09-12-07 08:30:00 *FILE file2.txt
TTS 128256512 10-01-14 22:20:00 *FILE file3.txt
UNIX FTP Server Directory Listing
drw-rw---- 1 administrator root 0 Nov 5 05:20 subdir
-rw-rw---- 1 administrator domain users 512 Jun 12 1999 file1.txt
-rw-rw---- 1 administrator sys 256512 Dec 7 08:30 file2.txt
-rw-rw---- 1 1161 1232 128256512 Jan 14 22:20 file3.txt
Novell FTP Server Directory Listing
total 0
d [R----F--] supervisor 512 May 09 17:00 subdir
- [RWCE-FM-] stan 512 Jun 12 1999 file1.txt
- [RWCE-FM-] kyle 256512 Dec 07 08:30 file2.txt
- [RWCE-FM-] eric 128256512 Jan 14 10:20 file3.txt
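If that were not complicated enough, the format returned by the same server can vary depending on the circumstances. For example, many servers return a different date format depending on how old the file is: a recently modified file (typically within the last six months or so) is listed with a time of day and no year, while an older file is listed with a year and no time. Compare file1.txt and file2.txt in the UNIX listing above.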
The number of permutations is nearly limitless, which means writing code to parse that information manually is complex, time-consuming, and prone to error.
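To give a feel for the effort involved, here is a minimal Python sketch that parses only the UNIX-style lines shown above. The function name `parse_unix_line` and the `today` parameter are illustrative assumptions, not part of any product or library. The sketch assumes that a listing with a time and no year refers to the most recent such date, and it ignores time zones, symbolic link targets, and edge cases such as February 29.

```python
import re
from datetime import datetime

# One common UNIX-style LIST layout. Owner and group are matched as a single
# loose field because a group name may itself contain spaces ("domain users").
UNIX_LINE = re.compile(
    r"^(?P<perms>[-dl][-rwxsStT]{9})\s+"
    r"(?P<links>\d+)\s+"
    r"(?P<owner_group>.+?)\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<year_or_time>\d{4}|\d{1,2}:\d{2})\s+"
    r"(?P<name>.+)$"
)

MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def parse_unix_line(line, today):
    """Return a dict describing one listing line, or None if it does not match."""
    match = UNIX_LINE.match(line.strip())
    if match is None or match["month"] not in MONTHS:
        return None

    month, day = MONTHS[match["month"]], int(match["day"])
    stamp = match["year_or_time"]
    if ":" in stamp:
        # Recent file: the server sent a time but no year, so guess the year.
        hour, minute = (int(part) for part in stamp.split(":"))
        modified = datetime(today.year, month, day, hour, minute)
        if modified > today:
            # A date in the future is assumed to belong to the previous year.
            modified = modified.replace(year=today.year - 1)
    else:
        # Older file: the server sent a year but no time of day.
        modified = datetime(int(stamp), month, day)

    return {
        "name": match["name"],
        "size": int(match["size"]),
        "is_dir": match["perms"].startswith("d"),
        "modified": modified,
    }
```

Even this much code handles just one of the three formats. Passing it a line from the Novell or AS/400 listing returns `None`, so each additional server type needs its own pattern and its own date rules.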
RFC 3659 was published in 2007, updating the original FTP protocol to include a type of standardized, machine-readable directory listing (among other improvements). Unfortunately, this new feature has not been universally adopted, so it is not possible to count on using it in production environments.
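For comparison, the machine-readable entries defined by RFC 3659 (returned by the MLSD and MLST commands) are a list of `name=value` facts separated by semicolons, followed by a single space and the file name. The Python sketch below parses one such entry. The function name `parse_mlsx_entry` and the sample line are illustrative assumptions, and only the `size` and `modify` facts are converted.

```python
from datetime import datetime, timezone


def parse_mlsx_entry(line):
    """Split one RFC 3659 style entry into (file name, dict of facts)."""
    # Facts cannot contain spaces, so the first space ends the fact list.
    facts_part, separator, name = line.partition(" ")
    if not separator:
        raise ValueError("entry has no space between facts and file name")

    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue  # the fact list ends with a trailing ';'
        key, _, value = fact.partition("=")
        facts[key.lower()] = value  # fact names are case-insensitive

    if "size" in facts:
        facts["size"] = int(facts["size"])
    if "modify" in facts:
        # Timestamps are YYYYMMDDHHMMSS in UTC; optional fractions are dropped.
        stamp = facts["modify"][:14]
        facts["modify"] = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(
            tzinfo=timezone.utc
        )
    return name, facts


# Hypothetical entry, loosely based on file2.txt from the listings above.
name, facts = parse_mlsx_entry("type=file;size=256512;modify=20091207083000; file2.txt")
```

The parsing is trivial compared with the free-form listings, which is exactly the point of the extension. The catch remains that a client must still fall back to the older formats whenever the server does not offer these commands.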
Lack of Standard FTP Protocol Implementation
Directory listings are not the only area where FTP servers differ from one another. The implementation of the FTP protocol itself can vary in unpredictable ways. The specification most servers follow today, RFC 959, was published in 1985, and the protocol has gone through numerous changes and additions since that time. FTP vendors made their own interpretations of how that standard should be implemented, and some made choices that directly contradict the standard, often to add a proprietary feature to distinguish their product in the marketplace or to meet some other business need.
Your FTP client automation code may get wildly different results when connecting to different servers, requiring endless patches and conditional code to handle all the exceptions as they arise. This turns a simple, straightforward VB script into hundreds of lines of hard-to-maintain spaghetti code.
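As one small illustration of that conditional code, a client can ask the server which optional extensions it supports by sending the FEAT command (defined in RFC 2389) and then branching on the reply. The Python sketch below parses such a reply. The function names and the sample reply are illustrative assumptions, and real replies vary from server to server.

```python
def parse_feat(response):
    """Map each advertised feature name to its parameter string."""
    features = {}
    for line in response.splitlines():
        # Feature lines are indented by one space; the first and last lines
        # of the reply start with the reply code and are skipped.
        if not line.startswith(" "):
            continue
        name, _, params = line.strip().partition(" ")
        if name:
            features[name.upper()] = params
    return features


def choose_listing_command(features):
    """Prefer the machine-readable listing when the server advertises it."""
    # Servers that implement MLSD and MLST advertise both under "MLST".
    return "MLSD" if "MLST" in features else "LIST"


# Hypothetical reply text, for illustration only.
sample_reply = (
    "211-Extensions supported:\n"
    " MLST type*;size*;modify*;\n"
    " SIZE\n"
    " MDTM\n"
    "211 End"
)
command = choose_listing_command(parse_feat(sample_reply))
```

This covers only the well-behaved case. Some servers reject FEAT outright, and others advertise a feature but implement it differently, so every branch like this one tends to grow further special cases over time.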
Changing Requirements Create Work
Everything discussed so far applies even if you are dealing with only one protocol, FTP. While the variations between one FTP server and another can be significant, the differences between an FTP server and an SFTP server (for example) are huge: despite the similar name, SFTP is an entirely separate protocol that runs over SSH. What if a new business requirement means the FTP workflow you so carefully automated now needs to use SFTP instead? You might be starting from scratch!
Don’t Attempt To Reinvent The Wheel
The good news about all of this is that products are readily available that take care of all this complexity for you. For example, Robo-FTP from Serengeti Systems Incorporated provides a simple script interface that lets you concentrate on your business logic. Their engineers have spent over a decade testing and re-testing to make sure Robo-FTP automatically handles all of the variations that remote servers might throw at it (including FTP, FTPS, SFTP, HTTP, and HTTPS) without any extra effort on your part.
For example, this Robo-FTP script logs onto an FTP server, retrying the logon up to three times, and downloads all .txt files that have been modified in the last 10 minutes.
LOOPCOUNT 3
:LogonLoop
FTPLOGON "my_ftp_site.com" /timeout=60
IFERROR= $ERROR_SUCCESS GOTO Operation1
;; Try again for up to 3 attempts
LOOPTO LogonLoop
STOP
:Operation1
SET temp = %datetime
DATETIMESUB temp 10 /minute
SET ten_min_ago = temp
FTPGETFILE "*.txt"
IFERROR= $ERROR_NO_FILE_FOUND GOTO Disconnect
IFDATETIME> %sitefiledatetime ten_min_ago GOTO Operation2
GOTO Operation1
:Operation2
;; File has been modified within last 10 minutes, so download it
RCVFILE %sitefile
GOTO Operation1
:Disconnect
FTPLOGOFF
EXIT
The key here is that in about 20 lines of scripting (not including comments), you have an automated workflow that will work with any server. A roll-your-own solution might be many times as long. Worse, even if you get it working on one test server, it may not work at all when connecting to the production server. Once you put in the extra work to get it working with that server, you may still be right back at the drawing board if a requirement arises to automate the same workflow with yet another server.
Clearly, the cost of purchasing a solution that handles these details for you is easily eclipsed by the cost and frustration of trying to handle all those possible variations on your own.