FTP: connection timeout and auto-resume

This forum contains bug reports from previous beta tests - the issue has remained unresolved, either because it couldn't be reproduced, or couldn't be prevented/fixed

Sir Kill A Lot
Junior Member
Posts: 7
Joined: 2010-06-23, 17:26 UTC

Post by *Sir Kill A Lot » 2010-06-23, 18:16 UTC

This bug affects Total Commander v7.55. Previous versions (e.g. v7.50a) are not affected.

Something changed in the handling of FTP transfers and connection timeouts, which now prevents downloads from finishing on some specially configured FTP servers.

The situation is as follows:
  1. I connect to the FTP server (the control connection is opened).
  2. I download a larger file (a data connection is opened and the transfer starts).
  3. The data transfer finishes.
  4. Because of the firewall configuration of the FTP server, the control connection was closed on the server side due to inactivity (the server's firewall is not aware of the FTP protocol; the connection timeout seems to be <1 min).
    And here the bug kicks in: Total Commander automatically reconnects to the server (which is OK), but then starts to re-download the whole file! The file had actually already been downloaded completely (there was no problem with the data connection; it's the control connection that times out in the background during the transfer).
Although the server supports resuming, a resume is not even attempted in this case.
I've tried changing the configuration setting 'Auto-resume transfer if no data received for 30s', but it makes no difference.
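
The behaviour being asked for here can be sketched as a small decision function. This is only an illustration in Python, not Total Commander's actual code: `plan_after_reconnect` and its return values are hypothetical names, and it assumes the client can learn the remote file size after reconnecting (for example via the FTP `SIZE` command) and that the server accepts `REST` for resuming.

```python
from typing import Optional, Tuple


def plan_after_reconnect(local_size: int, remote_size: Optional[int]) -> Tuple[str, int]:
    """Decide what to do with a download once the control connection is back.

    Returns (action, offset); offset is the byte position to continue from.
    Hypothetical helper for illustration only.
    """
    if remote_size is None:
        # Remote size unknown: nothing to compare against, so start over.
        return ("restart", 0)
    if local_size == remote_size:
        # Every byte already arrived over the data connection.
        return ("done", local_size)
    if 0 < local_size < remote_size:
        # Partial file: continue from what we have (REST <offset>, then RETR).
        return ("resume", local_size)
    # Local file is empty or larger than the remote one: start over.
    return ("restart", 0)


# Sizes taken from the FTP log in this post.
print(plan_after_reconnect(250_745_198, 250_745_198))
print(plan_after_reconnect(250_729_292, 250_745_198))
```

With the sizes from the log, the first call corresponds to the situation in this report (file complete, nothing left to fetch), and the second to a transfer that really was cut short.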

Here is the relevant part of the FTP log (the size of the file to download was 250.745.198 bytes):

----------
Connect to: (23.06.2010 19:33:30)

...

TYPE I
200 Type set to I.
PASV
227 Entering Passive Mode (#,#,#,#,#,#).
RETR file
125 Data connection already open; Transfer starting.
Download: 250.729.292 bytes, 961 kbytes/s
Waiting for server...
OFFLINE9, error=10004
Copied (23.06.2010 19:38:08): ftp://server/pathfile -> C:\temp\file 250.745.198 bytes, 899 kbytes/s
----------
Connect to: (23.06.2010 19:38:08)

... (TC starts download again using RETR)

Sob
Power Member
Posts: 908
Joined: 2005-01-19, 17:33 UTC

Post by *Sob » 2010-06-23, 19:59 UTC

Confirming, at least partially.

When the control connection is silently dropped, TC finishes the download, waits a few seconds for the server and, when no reply comes, just closes the transfer window. It does not detect the disconnection until another command is sent and times out.

But when a TCP reset is sent on the control connection, so that TC knows about the disconnection for sure, then after the transfer finishes TC immediately reconnects and starts downloading the file again from the beginning.
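
The two cases differ in what the client's socket actually sees. As a rough illustration (Python, standard `socket` module; `probe_control_connection` is a made-up helper, not anything from TC): a silently dropped connection simply yields no data until a timeout expires, while a TCP reset surfaces right away as an error.

```python
import socket


def probe_control_connection(sock: socket.socket, timeout: float = 2.0) -> str:
    """Wait for the server's next reply and classify what happened.

    Returns one of: "reply", "closed", "reset", "silent".
    """
    sock.settimeout(timeout)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        # Nothing arrived at all: the connection may have been dropped silently.
        return "silent"
    except ConnectionResetError:
        # The peer (or a firewall in between) answered with a TCP reset.
        return "reset"
    # An empty read means the peer closed the connection in an orderly way.
    return "reply" if data else "closed"
```

Only in the "reset" case does the client know for sure that the connection is gone, which matches the difference in behaviour described above.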

ghisler(Author)
Site Admin
Posts: 37417
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) » 2010-06-23, 20:31 UTC

2Sir Kill A Lot: Thanks for your report.
2Sob: Thanks for your analysis.

Unfortunately it seems that you are right: TC doesn't handle this special situation correctly. However, it seems to work as it should when you download the file in the background. Can you confirm this? I couldn't find a server with this problem, so I simulated it in the debugger. Therefore I cannot say whether my analysis is correct or not.

I'm currently preparing TC 7.55a (to be released probably next week), so I will write a fix for your problem too. Please let me know as soon as possible whether the background transfer works for you (also for more than 1 file).
Author of Total Commander
http://www.ghisler.com

Post by *Sob » 2010-06-23, 21:47 UTC

You're right, no problem in the background. Whether I reset the control connection, the data connection, or both, nothing stops the background transfer; it always reconnects and continues where it stopped. Multiple files in the queue are fine too.

I also don't have a server with this problem, but I use a different kind of simulation. I put a router running MikroTik RouterOS between the client and the server. In wcx_ftp.ini I set KeepAliveTransfer=1 with a 10-second period to get some trigger packets. Then I added the following rule to ROS's firewall:

/ip firewall filter 
add chain=forward src-address=<clientip> dst-address=<serverip> \
    protocol=tcp dst-port=21 action=reject reject-with=tcp-reset disabled=yes
After I start a transfer in TC, I can kill the connection any time I want by simply enabling this rule. :)

Post by *Sir Kill A Lot » 2010-06-23, 23:29 UTC

Yes, I can confirm that background downloads work with that FTP server, even with multiple files!

(btw, it's a Windows Server 2003 with IIS 6 and I think Windows Firewall is used)

Post by *ghisler(Author) » 2010-06-24, 09:36 UTC

Thanks for your feedback! Any idea why you get disconnected? Is that done by the server or the firewall?

Post by *Sir Kill A Lot » 2010-06-24, 17:08 UTC

The disconnection is done by the Windows Firewall on this same server.
It seems like the Firewall terminates connections which are inactive for a period of time (about 1 minute).

Since the control connection is inactive for the duration of the file transfer, it gets terminated by the Firewall. It's not a problem with the FTP server itself (the FTP server knows that the data connection belongs to this control connection; a more intelligent Firewall could know this too).

Too bad I have no idea how those connection timeouts of Windows Firewall can be configured/deactivated.

Post by *Sob » 2010-06-24, 18:58 UTC

It's hard to believe that it could be done by Windows Firewall (Microsoft's own one). A timeout of about 1 minute is insane; it can't be the default value. And I didn't find any option to set a custom value, so the default value is the only one possible (but since Windows is a very complex system, I'm open to corrections about this statement :).

If you can do some testing on the server, download netcat (http://joncraton.org/files/nc111nt.zip), make it listen on some other port (netcat -l -p 3333), telnet to it from another computer and leave it hanging. You can run a packet sniffer alongside it to be sure that no packets are being sent over the established connection to keep it alive.

Try to connect both from the usual location where you use FTP and from the same subnet as the server, to rule out other possible firewalls on the way.

I have one such established connection hanging for around half an hour now and it's still holding.
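
For anyone who prefers a script to netcat plus telnet, the same idle test can be sketched in Python. This is an assumption-laden illustration, not a replacement for the tools above: `serve_echo_once` and `survives_idle` are hypothetical helper names, and unlike plain `netcat -l` the listener echoes data back so the client can tell whether the connection still works after the wait.

```python
import socket
import time


def serve_echo_once(listener: socket.socket) -> None:
    """Run on the server: accept one connection and echo back whatever arrives."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break  # client closed the connection
            conn.sendall(data)


def survives_idle(host: str, port: int, idle_seconds: float,
                  reply_timeout: float = 5.0) -> bool:
    """Run on the client: connect, stay silent for idle_seconds, then test the link."""
    with socket.create_connection((host, port), timeout=reply_timeout) as sock:
        time.sleep(idle_seconds)  # send nothing, so no packet keeps the connection alive
        try:
            sock.sendall(b"ping\n")
            return sock.recv(1024) == b"ping\n"
        except OSError:
            # Timeout, reset, or broken pipe: the connection did not survive.
            return False
```

On the server one would bind a listening socket to the test port (3333 in the example above) and pass it to `serve_echo_once`; on the client, something like `survives_idle("<serverip>", 3333, 90)` covers the suspected timeout of about one minute.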

Post by *Sir Kill A Lot » 2010-06-25, 01:56 UTC

I couldn't reproduce the timeout using netcat or a custom TCP server.

I can only reproduce it using the FTP connection. Actually the timeout seems to be around 80 seconds. E.g.: I connect to the server using telnet -> wait 85s -> send another command -> no response anymore

If the Windows Firewall is temporarily disabled, the problem no longer occurs (instead, the FTP command timeout is triggered after the default of 2 minutes). Btw, I'm using a Firewall rule allowing all traffic for the application inetinfo.exe.
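
The telnet check described above (connect, wait about 85 s, send another command, see whether anything comes back) can be sketched with Python's standard `ftplib`. `control_survives_idle` is a hypothetical helper name; the sketch assumes the server answers `NOOP` in some way before login, and it treats any reply, even an error code, as proof that the control connection is still alive.

```python
import ftplib
import time


def control_survives_idle(host: str, port: int = 21, idle_seconds: float = 85.0,
                          timeout: float = 10.0) -> bool:
    """Open an FTP control connection, idle, then check that the server still replies."""
    ftp = ftplib.FTP()
    try:
        ftp.connect(host, port, timeout=timeout)  # also reads the 220 greeting
        time.sleep(idle_seconds)                  # no commands, no data connection
        try:
            ftp.voidcmd("NOOP")
        except ftplib.Error:
            pass  # the server answered with a non-2xx code, but it did answer
        return True
    except ftplib.all_errors:
        # Connect failure, timeout, reset, or EOF: no usable control connection.
        return False
    finally:
        ftp.close()
```

Running it once with an idle time below the suspected limit and once above it (for example 60 and 85 seconds) should show where the connection starts to die.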


I've resigned myself to the fact that this is either an uncommon bug or some unfortunate misconfiguration of the server.

But I just want to say thanks for all the help!

Post by *Sob » 2010-06-25, 04:05 UTC

That brings suspicion on my long-time favourite, the "Application Layer Gateway" service. Check with TCPView which process owns the connections, and if it's alg.exe, then you have your offender.
On Windows 7 there's another troublemaker called "stateful ftp packet inspection" (netsh advfirewall show global), but I'm not exactly sure if/how it's related to ALG. And I think Win2003 does not have that, but there may be something similar.
An easy way to check for these things is to change the FTP port from the default 21 to another one; if that helps, then some stupid firewall helper is responsible for the problems.

Post by *Sir Kill A Lot » 2010-06-25, 13:30 UTC

Yes, with the Firewall enabled, all the connections go through alg.exe.

Well, for servers it's never a good idea to rely on such Firewalls anyway. In my case this is a standalone server in a datacenter, directly connected to the Internet. The Windows Firewall is at least better than nothing (most of the time :-)

The problem you are mentioning with Windows 7: I hope this is only a problem when hosting FTP servers, or does it actually affect FTP clients too?

Post by *ghisler(Author) » 2010-06-25, 15:50 UTC

Maybe it's a firewall bug? It may be monitoring the connection only for the LIST command, but not for MLSD. So it thinks that the connection is inactive, and kills it. This is just a thought, I cannot test it myself - but maybe someone else can?

Post by *Sir Kill A Lot » 2010-06-25, 17:03 UTC

ghisler(Author) wrote: Maybe it's a firewall bug? It may be monitoring the connection only for the LIST command, but not for MLSD.
That's an interesting idea, but it does not apply in this case.
To get a data connection, either the PORT or the PASV command is used, afaik. So this information should be sufficient for the firewall to link data connections with control connections (and every FTP-aware firewall should know these commands).
Also, the problematic command is RETR (retrieving a file), which is likely to take more than one minute.
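
To make the PASV point concrete: the 227 reply already carries the address and port of the upcoming data connection, which is all an FTP-aware firewall needs in order to tie it to the control connection. A minimal Python sketch of reading that reply follows; `parse_pasv_reply` is a hypothetical helper, and the address in the usage line is a made-up documentation address, since the log in this thread has the real numbers masked.

```python
import re
from typing import Tuple

# Six comma-separated numbers in parentheses: h1,h2,h3,h4,p1,p2
_PASV_RE = re.compile(r"\((\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})\)")


def parse_pasv_reply(line: str) -> Tuple[str, int]:
    """Extract (host, port) of the data connection from a 227 PASV reply."""
    if not line.startswith("227"):
        raise ValueError("not a 227 reply: %r" % line)
    match = _PASV_RE.search(line)
    if match is None:
        raise ValueError("no address found in: %r" % line)
    numbers = [int(n) for n in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("field out of range in: %r" % line)
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte first, then low byte
    return host, port


print(parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,19,137)."))
```

A firewall that tracks this (and the equivalent PORT argument) knows that traffic to that host and port belongs to the control connection and could keep the latter open for as long as the transfer runs.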

Post by *Sob » 2010-06-25, 17:19 UTC

Try to disable "Application Layer Gateway" service in Windows and it should help. That thing is supposed to help with NAT traversal, but it never did for me. It either did nothing or caused harm when it tried to do something.

About the other problem with Win7: at first I thought about blaming TC, see http://ghisler.ch/board/viewtopic.php?t=25760 ;) It only happened with FTPS, not with regular FTP. But I'm no longer sure whether I also tried turning it off on the server and leaving it on on the client. Turning it off on both was the best solution for me.

Post by *Sir Kill A Lot » 2010-06-27, 23:37 UTC

It seems like FTP connections were the only ones getting hijacked by Application Layer Gateway Service. So today I tried disabling this service like you suggested and all is working normally and those strange connection timeouts seem to be gone now!

Thanks for helping!
