iXora Custom Software Development Blog

Read | Practice | Advance

Fetch File Names from SFTP Location using Shell (bash) Programming


Sometimes we need to download files from an SFTP location. I had a requirement to download data files from an SFTP location on a Linux platform, as part of an ETL process. The target SFTP location contains both previously processed files and the latest (unprocessed) files, but we only need to download the unprocessed ones. Before downloading, we have to identify which files are unprocessed. Each data file name contains a date-stamp, and based on that date-stamp we can identify those files. So we first fetch all the file names and then download only the files we need. This post explains how to implement that.
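As a rough illustration of the date-stamp check, here is a minimal sketch. It assumes file names such as data_20181218.csv that carry an 8-digit YYYYMMDD stamp. That naming pattern, the function name select_unprocessed_files, and the "last processed date" argument are all hypothetical and are not part of the original requirement.

```shell
# Hypothetical helper: print only the file names whose date-stamp is newer
# than the last processed date.
#   $1    : last processed date as YYYYMMDD (assumed format)
#   stdin : one file name per line
select_unprocessed_files() {
    local last_processed="$1"
    local name stamp
    while IFS= read -r name; do
        # Take the first run of 8 digits in the name as the date-stamp.
        stamp=$(printf '%s\n' "$name" | grep -oE '[0-9]{8}' | head -n 1)
        # Skip names that carry no date-stamp at all.
        [ -n "$stamp" ] || continue
        # YYYYMMDD values compare correctly as plain integers.
        if [ "$stamp" -gt "$last_processed" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

It could be fed from a file that holds one file name per line, for example: select_unprocessed_files 20181216 < file_names.txt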

Private/Public key files: First we have to create private and public key files for accessing the SFTP server, so that we do not need to provide a password each time. Linux has a command (ssh-keygen) to create these files: one is the private key file and the other is the public key file. They are created on the local (client) Linux machine. After creating them, the public key file needs to be deployed to the SFTP server. One point to remember: the private key file on the local machine needs appropriate permissions. It should be readable only by its owner (for example, mode 600); if other users or groups can read it, the SSH client will refuse to use it. For more detail, please visit:

http://ipswitchft.force.com/kb/articles/FAQ/How-do-I-create-and-use-a-public-private-key-pair-on-Linux
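The key setup can be sketched roughly as follows. The key path, user name, and host name are placeholders, and the two helper functions are hypothetical names used only for illustration. The ssh-copy-id step assumes the server allows a normal SSH login; on an SFTP-only server, the administrator usually has to add the public key to the account instead.

```shell
# Hypothetical location for the new key pair; replace with your own.
key_file="$HOME/.ssh/etl_sftp_key"

# Create a key pair without a passphrase so a script can use it unattended.
# Produces "$1" (private key) and "$1.pub" (public key).
create_key_pair() {
    ssh-keygen -t rsa -b 4096 -N "" -q -f "$1"
}

# Make the private key readable and writable by its owner only.
protect_private_key() {
    chmod 600 "$1"
}

# Example run (placeholder user and host, not executed here):
#   create_key_pair "$key_file"
#   protect_private_key "$key_file"
#   ssh-copy-id -i "$key_file.pub" -p 22 etl_user@sftp.example.com
```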

Story: Suppose the SFTP location contains all files (including already processed/archived files), but we need to download only the latest (unprocessed) files, for example those in a specific date range, and the file names contain a date-stamp. In that case we first need to fetch all the file names from the SFTP location. Based on those file names we can select which files are eligible for download, and download only those. This avoids downloading unnecessary files, which saves I/O and network bandwidth and improves the performance of our application. To achieve that, we create an SFTP interaction in our bash script.

# Fetch the list of all file names from the SFTP location and store it in a file, one file name per line.
function get_sftp_file_list() {
	local sftp_host="${1}"        # Host name
	local sftp_port="${2}"        # Port number
	local sftp_user="${3}"        # User name
	local sftp_source_path="${4}" # SFTP source directory
	local sftp_identity="${5}"    # Private key file path

	# The lines between "<< !" and the closing "!" are sent to sftp as its commands.
	# The closing "!" must be alone on its line, with no leading or trailing whitespace.
	sftp -P "$sftp_port" -i "$sftp_identity" "$sftp_user@$sftp_host" << ! > all_file_name_list.txt
        cd "$sftp_source_path"
        ls -1
!
}

Above, I created a bash function that uses the sftp command. We should know that sftp has its own set of commands for interacting with the SFTP service. They are not Linux commands, although in a few cases (cd and ls, for example) they look similar, so we should not confuse the two.

The function takes a few inputs as parameters. Those are:

  • Host Name
  • Port Number
  • SFTP User Name
  • SFTP source directory
  • Private key file path

The function needs all of these arguments to access the SFTP location. Inside the function I use the sftp command. (If you need to know more about the sftp command, see the sftp man page: just type man sftp at the Linux command prompt and it will show the details.)
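Because every argument is required, a script may want to check them before calling sftp at all. The following is only a sketch: check_sftp_args is a hypothetical helper that is not part of the original script, and the checks it performs (argument count, non-empty values, readable key file) are assumptions about what is worth validating.

```shell
# Hypothetical pre-check: succeed only when all five values are present
# and non-empty, and the private key file is readable.
check_sftp_args() {
    if [ "$#" -ne 5 ]; then
        echo "usage: check_sftp_args HOST PORT USER SOURCE_PATH IDENTITY_FILE" >&2
        return 1
    fi
    local arg
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "error: empty argument" >&2
            return 1
        fi
    done
    # The fifth argument is the private key file path.
    if [ ! -r "$5" ]; then
        echo "error: cannot read private key file: $5" >&2
        return 1
    fi
    return 0
}
```

A caller could then run the check first and fetch the file list only when it succeeds.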

The command syntax is:

sftp -P "$sftp_port" -i "$sftp_identity" "$sftp_user@$sftp_host" << ! > all_file_name_list.txt
!

The command above is mostly self-explanatory, so I will explain it only briefly.

The block starts with << ! and ends with a line containing only !

This is a here-document: the shell sends the lines between the two markers to sftp as its input, so they run as sftp commands. Inside that block we first change to the source directory:

cd "$sftp_source_path"

Then we can execute

ls -1

This retrieves all file names from the current SFTP directory and stores them in the all_file_name_list.txt file, one file name per line. One point to remember: a few extra lines, such as the echoed cd and ls -1 commands, also end up in that file. You can strip those lines programmatically or ignore them at run time based on predefined logic.
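One possible way to strip the extra lines is sketched below. It assumes the echoed commands appear in the file prefixed with the sftp> prompt, which is how the OpenSSH client commonly echoes commands read from standard input; other clients or versions may differ, so check the actual file first. The function name clean_sftp_listing is hypothetical.

```shell
# Hypothetical helper: print the listing without the echoed sftp commands.
# Assumes every echoed command line starts with the "sftp>" prompt.
#   $1: path to the raw listing file
clean_sftp_listing() {
    grep -v '^sftp>' "$1"
}
```

For example, clean_sftp_listing all_file_name_list.txt > file_names_only.txt would leave only the file names in file_names_only.txt (an arbitrary output name).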

Happy coding!!!
