
Data Collection


Overview

Data from every run are reviewed and released to users only after passing in-house QC. Once the data are available, users are notified by email.  Two data transfer options are available:

(I)  For all users, data can be downloaded from the sFTP server.
  1. User receives the sFTP username and password by email; the unzip password is provided in a separate email.
  2. User downloads data from the sFTP server.  Instructions can be found in the section "How to Download Files from CGS sFTP Server".
  3. To unzip files, user can use WinRAR or 7-Zip.
(II)  For CGS HPCF* users only, data can be transferred directly to user-specified HPCF directories.
  1. User provides the HPCF folder address to CGS.
  2. CGS notifies user by email once the data transfer is complete.

In compliance with the centre’s data protection scheme, analysed data are compressed and encrypted prior to delivery via the sFTP server.  The username and password for the sFTP server and the password to unzip the files are sent in separate emails for added security.  Data can be unzipped using 7-Zip or WinRAR.

Due to limited server disk space, data are kept for only 1 month after delivery.  After that, data are removed from our servers without prior notice.

Please ensure that you keep a copy of the data (analysis results and all intermediate files) securely and clearly identified for future reference.

*For more information about HPCF, please visit the HPCF section.


Data Collection Workflow

[Figure: Data Collection Workflow]


How to Download Files from CGS sFTP Server

Summary

This section shows how to transfer files from the CGS sFTP server using the FileZilla client, the preferred FTP client. Please remember to download the md5sum file in the same folder so that you can verify the integrity of the downloaded file later. Because Wi-Fi connections can be unstable, downloading over Wi-Fi is not recommended.

Procedures

  1. Download and install the FileZilla client (https://filezilla-project.org). Note: please download FileZilla Client and NOT FileZilla Server.
  2. Open FileZilla. Enter the following information into the Quickconnect bar located at the top of the window.
    • Host: Your given host
    • Username: Your given username
    • Password: Your given password
    • Port number: 22
    • (The above information can be obtained from the email sent by the bioinformatics service team)
     [Figure: Connection information]

  3. Click on Quickconnect or press Enter to connect to the server.
     [Figure: Quickconnect]

  4. On first login, click OK to accept the security prompt about an unknown host key.
     [Figure: Security certificate]

  5. Click on the file that you wish to download from the CGS sFTP server (window on the right) and drag it to the destination location on your computer (window on the left). Please note that you need to hold the mouse button down during this drag-and-drop action.
     [Figure: FTP Client]

  6. Upon "dropping" the file onto your computer, you will see that the file transfer is in progress (see screenshot below). Please wait for the file transfer to complete; this will take a while for large files. Note that the number in the brackets denotes how many files are to be transferred.
     [Figure: Download progress]

  7. Once the file transfer has completed successfully, you will see a count in the "Successful transfers" tab (see screenshot below). Note that the number in the brackets denotes how many files have been transferred successfully.
     [Figure: Successful transfers]

  8. You can close the FileZilla client when all files have been downloaded successfully and proceed to verify the downloaded file using md5sum (refer to the next section). Please follow the next section carefully to confirm that the download was successful.
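
For users who prefer a script to a graphical client, the same download can be sketched in Python. This is only an illustration under several assumptions: it relies on the third-party paramiko package (not part of the standard library), the function names `open_sftp` and `download_folder` are made up for this example, and the host, username, password and folder shown in the usage comment are placeholders for the values in your email. Unlike the FileZilla prompt, paramiko rejects unknown host keys by default, so the server's host key must already be trusted on your machine.

```python
import os
import posixpath
import stat


def open_sftp(host, username, password, port=22):
    """Open an SFTP session (hypothetical helper; needs the paramiko package)."""
    import paramiko  # imported here so download_folder can be used without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # unknown host keys are rejected by default
    client.connect(hostname=host, port=port, username=username, password=password)
    return client, client.open_sftp()


def download_folder(sftp, remote_dir, local_dir):
    """Download every regular file in remote_dir and return the local paths.

    Fetching the whole folder also picks up the md5sum file stored
    next to the data. Subfolders are skipped.
    """
    os.makedirs(local_dir, exist_ok=True)
    saved = []
    for entry in sftp.listdir_attr(remote_dir):
        if entry.st_mode is None or not stat.S_ISREG(entry.st_mode):
            continue  # not a regular file
        remote_path = posixpath.join(remote_dir, entry.filename)
        local_path = os.path.join(local_dir, entry.filename)
        sftp.get(remote_path, local_path)
        saved.append(local_path)
    return saved


# Example usage (placeholder values, replace with those from your email):
# client, sftp = open_sftp("your.given.host", "your_username", "your_password")
# download_folder(sftp, "/your/remote/folder", "downloads")
# client.close()
```

The downloaded files should still be verified against the md5sum file before use.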


How to Check md5sum of Downloaded Files

Summary

This section shows how to verify the integrity of the downloaded file by checking its MD5 (Message-Digest algorithm 5) hash value with WinMD5Free.

Procedures

  1. Download and unzip WinMD5Free (http://www.winmd5.com).
  2. Open WinMD5.exe in the unzipped folder.
  3. Click on Browse and choose the zip file you have downloaded from the CGS sFTP server. The MD5 checksum value will be computed and shown in "Current file MD5 checksum value". Please wait patiently, as this process can take a while (up to an hour or more) for large files.
     [Figure: WinMD5Free]

  4. Open the md5sum file downloaded from the CGS sFTP server in Notepad. Copy the MD5 checksum value and paste it into "Original file MD5 checksum value" in WinMD5Free, then click Verify.
     Notepad: [Figure: MD5 Sum]
     WinMD5Free: [Figure: WinMD5Free]

  5. A window will pop up and show “Matched!” if the download from the CGS sFTP server was successful. If “NOT Matched!” is shown, please download the file again from our sFTP server.
     [Figure: MD5 match]

  6. After confirming that the file was downloaded successfully, you can proceed to unzip the file using 7-Zip if needed (refer to the next section).
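
The same check that WinMD5Free performs can also be sketched with Python's standard library, which may be convenient on systems where WinMD5Free is not available. The function names below are illustrative, and the sketch assumes the md5sum file follows the common layout in which the checksum is the first item on the first non-empty line; if your md5sum file looks different, adjust `expected_md5` accordingly.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5sum_path):
    """Return the checksum from an md5sum file.

    Assumption: the checksum is the first whitespace-separated item
    on the first non-empty line.
    """
    with open(md5sum_path, "r", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return line.split()[0].lower()
    raise ValueError("no checksum found in " + md5sum_path)


def verify_download(archive_path, md5sum_path):
    """Return True if the file's MD5 matches the value in the md5sum file."""
    return file_md5(archive_path) == expected_md5(md5sum_path)
```

A result of `True` corresponds to “Matched!” in WinMD5Free; on `False`, download the file again. Reading in chunks keeps memory use low for large files, although the computation can still take a long time.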

 

How to Unzip Password Protected Files Downloaded from CGS sFTP Server

Summary

This section shows how to unzip the password-protected files downloaded from our sFTP server using 7-Zip.
WARNING: Please ensure you have sufficient disk space to hold all the uncompressed data.
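
One way to check the disk space beforehand is sketched below in Python. It assumes the delivered file is a standard .zip archive whose file listing can be read without the password (the listing of a regular password-protected zip normally can be; other archive formats may differ). The function names are illustrative only.

```python
import shutil
import zipfile


def uncompressed_size(zip_path):
    """Total size in bytes of all files in the archive once extracted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_for(zip_path, destination="."):
    """Return True if the destination drive has enough free space."""
    free_bytes = shutil.disk_usage(destination).free
    return uncompressed_size(zip_path) <= free_bytes
```

This only reads the archive's listing and does not extract anything, so no password is needed for the check itself.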


Procedures

  1. Download and install 7-Zip (http://www.7-zip.org).
  2. Right-click the zip file you have downloaded from the CGS sFTP server, then go to 7-Zip and click Extract Here.
     [Figure: 7-Zip]

  3. A window will pop up and ask for the password. Please enter the password that we provided by email and click OK.
     [Figure: Extract with password]

  4. The unzipped folder will be extracted to the same location as the original file.