Replace duplicates with hard-links

 
gskoczylas
Junior Member


Joined: 19 Jun 2008
Posts: 5

Posted: Mon May 08, 2017 11:36 am    Post subject: Replace duplicates with hard-links

I have duplicate files on my hard disk (files with the same contents). Total Commander has a nice function to locate duplicate files, after which I can select files to remove from disk. But I do not want to remove them: I want to replace all selected duplicate files with hard links to a single copy.

As far as I know, this is not possible with the current version of Total Commander.
MVV
Power Member


Joined: 03 Aug 2008
Posts: 8055
Location: Russian Federation

Posted: Mon May 08, 2017 1:38 pm    Post subject:

Yes, it is not possible with just TC. You have to use some kind of third-party tool or script.

BTW it is not always safe to replace duplicates with hardlinks: all hardlinks of a file share the same data, so if you change the file through one name, the change shows up under every other name too.
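The aliasing effect can be shown with a minimal Python sketch. The function name and file names are made up for illustration, and it assumes a file system that supports hard links (NTFS, or most Unix file systems):

```python
import os


def demo_hardlink_aliasing(workdir):
    """Write a file, hard-link it, change it via the link, read the original."""
    original = os.path.join(workdir, "original.txt")
    alias = os.path.join(workdir, "alias.txt")

    with open(original, "w", encoding="utf-8") as f:
        f.write("version 1")

    # Both names now refer to the same data on disk.
    os.link(original, alias)

    # An in-place write through one name...
    with open(alias, "w", encoding="utf-8") as f:
        f.write("version 2")

    # ...is visible through the other name as well.
    with open(original, encoding="utf-8") as f:
        return f.read()
```

Called on an empty scratch directory, this returns "version 2". Programs that save by writing a new file and renaming it over the old one break the link instead, so the actual behaviour depends on how each application writes.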
_________________
TCFS2 + TCFS2Tools: Full-screen mode for TC etc (forum)
TOTALCMD.NET: AskParam, CopyTree, NTLinks, Sudo, VirtualPanel…
DrShark
Power Member


Joined: 03 Nov 2006
Posts: 1111
Location: Kyiv, 68/262

Posted: Wed May 10, 2017 6:32 am    Post subject:

This has already been suggested, but it probably won't be added to TC: the feature can be really dangerous if applied to system files.
Suggestion topic, with a link to the findduppe tool, which can convert the files listed in the TC panel after a search for duplicates into links.
Example of a button for the Total Commander button bar (requires lst2str).
Notes on the findduppe + lst2str solution applied to a TC file list:
* it isn't very smart, because it gives no way to choose which group of links an unlinked file will be joined to (there can be 2 or more groups of hardlinks of the same file, each group made from its own parent/initial file);
* it also isn't stable or reliable enough, because lst2str doesn't work when too many files are selected in the TC panel, especially if the files or their paths have long names. In that case the following error appears:
Code:
---------------------------
---== ATTENTION (lst2str) ERROR ==---
---------------------------
Too many files selected (CL limit reached)! Continue? (result would be truncated)
---------------------------
OK   Cancel   
---------------------------

After that error, or instead of it, the tool may simply crash.
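To see the "several groups of hardlinks of the same file" situation from the first note, one can group paths by file identity. This is only a sketch in Python, not how findduppe works; `group_by_link_identity` is a hypothetical helper name:

```python
import os
from collections import defaultdict


def group_by_link_identity(paths):
    """Return lists of paths; each list holds names that are hard links
    of one another (same device and same file index / inode)."""
    groups = defaultdict(list)
    for path in paths:
        st = os.stat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    return list(groups.values())
```

If identical files come back as two or more lists, there are separate link groups, and a tool has to decide which group a still-unlinked copy should join.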
_________________
Android 4.3.1 no root, kernel 08.09.2016; Vista Home Premium SP2 rus 32 bit
TC #149847 Personal licence

Cuz we're all in this together, We're here to make it right
HAL 9000
Senior Member


Joined: 10 Sep 2007
Posts: 381

Posted: Wed Jul 11, 2018 4:30 pm    Post subject: RE:Replace duplicates with hard-links

Well, try
Duplicate File Hard Linker (DFHL)
https://github.com/Hopfengetraenk/DFHL

Just drag 'DFHL.cmd' from
https://github.com/Hopfengetraenk/DFHL/files/2168067/DFHL_2.6.zip
onto the TC button bar and set
? %P
as parameters.

This will then look for duplicates and hardlink them to one file.
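For readers who would rather script it, here is a rough Python sketch of the general idea (group by size, then by SHA-256, then byte-compare before linking). It is not DFHL's implementation; all function names and the ".hardlink-tmp" suffix are made up for illustration, and it assumes all files are on one volume that supports hard links:

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files need little memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(root):
    """Return lists of paths under root that have identical size and digest."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_content = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_content[(size, file_digest(path))].append(path)
    return [sorted(p) for p in by_content.values() if len(p) > 1]


def replace_with_hardlink(keeper, duplicate):
    """Replace duplicate with a hard link to keeper; return True if replaced."""
    if os.path.samefile(keeper, duplicate):
        return False  # already the same file, nothing to do
    if not filecmp.cmp(keeper, duplicate, shallow=False):
        return False  # contents differ, refuse to link
    tmp = duplicate + ".hardlink-tmp"  # hypothetical temporary name
    os.link(keeper, tmp)               # create the new link first...
    os.replace(tmp, duplicate)         # ...then swap it in over the duplicate
    return True
```

Creating the link under a temporary name and then renaming it over the duplicate means the duplicate is never deleted before its replacement exists. As noted earlier in the thread, running something like this over system files is risky.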

_______________________________________

To just copy some folder with hardlinks I use
link shell extension
http://schinagl.priv.at/nt/hardlinkshellext/linkshellextension.html#download

When you've installed it -
right click on source folder -> set Source
right click on destination folder -> paste a/smart copy
Usher
Junior Member


Joined: 11 Mar 2011
Posts: 50

Posted: Thu Jul 12, 2018 11:51 am    Post subject: Re: Replace duplicates with hard-links

HAL 9000 wrote:
Duplicate File Hard Linker (DFHL)
https://github.com/Hopfengetraenk/DFHL
I can't see any link to download an exe there.

HAL 9000 wrote:
https://github.com/Hopfengetraenk/DFHL/files/2168067/DFHL_2.6.zip
By default this exe won't run under Windows XP. The developers claim that:
DFHL readme.md wrote:
The tool runs in Windows NT 4.0 / 2000 / XP and 2003 Server and requires a NTFS file system to run on.

So it should be compiled to run on older Windows versions as well.

In general, I can't recommend this tool. It seems to NOT support Unicode (partially, at least), so you may have problems when dealing with any accented character in path or file name.

Short tests were run in a directory with 33687 (>32 K) files totalling 93102588422 bytes (>86 GiB). Some files or paths have Polish or Russian names; the system was Windows XP SP3 with Polish language settings. DFHL was used for listing files only, with no /l (link) option specified.

1. Run DFHL 2.0 (old version) under Windows XP set to Polish.
- Doesn't display any Polish or Russian character, just ends lines in such places.
- Seems to support only 32 K files; for larger directories it ends with a crash.

2. Run DFHL 2.6 (use editbin to change the required Windows version)
- Displays any Polish or Russian character as a question mark.

So, even if DFHL makes links properly, you will have to check all links with other tools, as the file listing from DFHL won't be helpful.

Some more remarks:
- DFHL can skip files smaller than 1 KiB, but you can't change that limit.
- You can't select which file will be kept and which one will be replaced with a link. I prefer to keep the copy with the older timestamp, but another user may prefer to keep files in a certain "master" directory. DFHL only allows limiting linking to files with the same timestamp, so it's definitely not an option for removing duplicate downloads.
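The two selection policies from the last remark (prefer a "master" directory, otherwise keep the oldest copy) could look roughly like this in Python. DFHL has no such option; `choose_keeper` and `master_dir` are hypothetical names used only for this sketch:

```python
import os


def choose_keeper(paths, master_dir=None):
    """Pick which of several identical files to keep.

    Files below master_dir (if given) win; ties go to the oldest
    modification time, then to the alphabetically first path.
    """
    master = os.path.abspath(master_dir) if master_dir else None

    def outside_master(path):
        if master is None:
            return True
        try:
            return os.path.commonpath([master, os.path.abspath(path)]) != master
        except ValueError:  # e.g. paths on different Windows drives
            return True

    # False sorts before True, so files inside master_dir come first.
    return min(paths, key=lambda p: (outside_master(p), os.stat(p).st_mtime, p))
```

The other files in the group would then be replaced with links to the returned path.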
_________________
Regards from Poland
Andrzej P. Wozniak
HAL 9000
Senior Member


Joined: 10 Sep 2007
Posts: 381

Posted: Fri Jul 13, 2018 8:00 am    Post subject: Re: Replace duplicates with hard-links

Wow, first of all thanks for replying, for taking a deeper look and for testing DFHL.
About the history of DFHL:
I should say that I just asked 'Hans Schmidts', who had made some enhancements to the source code, for the sources, converted them from Visual Studio 6 to Visual Studio 2015... and put them on GitHub.
Before receiving the sources for v2.5 I did, just for fun, a reverse-engineering project, using IDA 7 to decompile version 2.5 and the sources of v2.0 to restore the names and class structures.
Nice to see what is possible.
However, once I got the sources I discarded that project. Still, it gave me a rough overview of the sources, with all their classes and functions.

One oddity of this source is that it includes its own copy of the Windows 'CreateHardLink' API (in 'Hardlink.cpp') that can be used instead of just calling the API. (It is meant for old systems like Windows NT, whose kernel could already do hard links but which had no real user API for them.)

Usher wrote:
HAL 9000 wrote:
Duplicate File Hard Linker (DFHL)
https://github.com/Hopfengetraenk/DFHL
I can't see any link to download exe there.

There is that pink/blue line (the language bar); above it, in the middle, there is a link "3 releases", just between branches and contributors.
There you can download the exe.

But yes, sorry, old problem: this exe won't run on Windows XP and below.
Mainly because of the OperatingSystemVersion that the linker sets in the PE header.

->Optional Header
Magic: 0x010B (HDR32_MAGIC)
MajorLinkerVersion: 0x0E
MinorLinkerVersion: 0x00 -> 14.00
...
MajorOperatingSystemVersion: 0x0006
MinorOperatingSystemVersion: 0x0000 -> 6.00
MajorImageVersion: 0x0000
MinorImageVersion: 0x0000 -> 0.00
MajorSubsystemVersion: 0x0006
MinorSubsystemVersion: 0x0000 -> 6.00
Win32VersionValue: 0x00000000
SizeOfImage: 0x0005F000
SizeOfHeaders: 0x00000400
...

I may take care of that in future builds. As a quick hack, open the exe in a hex editor, look for 'PE' near the beginning and then watch for the two 06 00 00 00 sequences shown below:

00000060  74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20  t be run in DOS
00000070  6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00  mode....$.......
00000080  50 45 00 00 4C 01 05 00 CC 79 3E 5B 00 00 00 00  PE..L...Ìy>[....
00000090  00 00 00 00 E0 00 02 01 0B 01 0E 00 00 3C 04 00  ....à........<..
000000A0  00 90 01 00 00 00 00 00 07 7D 00 00 00 10 00 00  .........}......
000000B0  00 50 04 00 00 00 40 00 00 10 00 00 00 02 00 00  .P....@.........
000000C0  06 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00  ................

Change these two 06 bytes to 04 and save the file. Now the Windows XP loader should at least get further. Well, that's what editbin does.
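The same patch can be scripted instead of done by hand in a hex editor. Below is a minimal Python sketch that works on the file contents as bytes. The function names are made up; the offsets are the standard PE optional-header positions of MajorOperatingSystemVersion (40) and MajorSubsystemVersion (48), which match the dump above (optional header at 0x98, fields at 0xC0 and 0xC8). It leaves the header checksum field alone.

```python
import struct

OS_VERSION_OFFSET = 40         # MajorOperatingSystemVersion in the optional header
SUBSYSTEM_VERSION_OFFSET = 48  # MajorSubsystemVersion in the optional header


def optional_header_offset(image):
    """Locate the PE optional header inside the raw file contents."""
    if bytes(image[:2]) != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if bytes(image[pe_offset:pe_offset + 4]) != b"PE\0\0":
        raise ValueError("PE signature not found")
    return pe_offset + 4 + 20  # skip signature and COFF file header


def read_versions(image):
    """Return ((os_major, os_minor), (subsystem_major, subsystem_minor))."""
    opt = optional_header_offset(image)
    os_ver = struct.unpack_from("<HH", image, opt + OS_VERSION_OFFSET)
    sub_ver = struct.unpack_from("<HH", image, opt + SUBSYSTEM_VERSION_OFFSET)
    return os_ver, sub_ver


def patch_versions(image, major, minor=0):
    """Return a copy of image with both version fields set to major.minor."""
    patched = bytearray(image)
    opt = optional_header_offset(patched)
    struct.pack_into("<HH", patched, opt + OS_VERSION_OFFSET, major, minor)
    struct.pack_into("<HH", patched, opt + SUBSYSTEM_VERSION_OFFSET, major, minor)
    return bytes(patched)
```

A possible use: read the exe with `open(path, "rb").read()`, call `patch_versions(data, 4)`, and write the result to a new file so the original stays untouched.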

Usher wrote:

1. Run DFHL 2.0 (old version) under Windows XP set to Polish.
- Doesn't display any Polish or Russian character, just ends lines in such places.
- Seems to support only 32 K files, for larger directories ends with a crash.

Okay, proper character encoding/decoding is always a source of hidden bugs, as is proper exception handling. As far as I saw, DFHL strictly uses Unicode strings and APIs. What is probably missing is decoding the strings that are read in and encoding the strings that are written out.


DFHL has some exception handling, but only for C++ exceptions; a null pointer or access violation is not trapped by that and just leads to a crash. I'm working on adding some of these nasty MS __try/__except blocks at critical points so the program keeps going.
... and of course I'll try to find the bug and fix it.
Please attach a zip file with some sample files to reproduce and test this bug
... or even better, file an issue on the DFHL source code page on GitHub.

Usher wrote:

- You can't select which file will be kept and which one will be replaced with link. I prefer to keep a copy with older timestamp, bot another user may prefer to keep files in a certain "master" directory.

Before hardlinking, files are checked to be identical, so it should not matter much which of the two files is picked. Okay, regarding file fragmentation, and assuming that old files are less fragmented than new ones, it may matter which file DFHL picks.
Usher
Junior Member


Joined: 11 Mar 2011
Posts: 50

Posted: Fri Jul 13, 2018 1:14 pm    Post subject: Re: Replace duplicates with hard-links

HAL 9000 wrote:
Some oddity about this source is that it included a copy of the Windows 'CreateHardLink' API in ('Hardlink.cpp') that can be uses instead of just calling this API. (It is though for old systems like Windows NT whose kernel already had the ability to do hard links but there was no really user API for it)
Yes, that is clearly stated in the changelog:
DFHL changelog wrote:
Changes from Version 1.0 to Version 1.1
* Added Support for Windows NT 4.0, missing Hardlink API was created


HAL 9000 wrote:
There is that pink/blue line (language bar) above in the middle it there is a link "3 releases". Just in between with branches and contributor. There you can Dl the exe.
In most cases the link is repeated in readme.md, so I've never learnt what's hidden below "release". My bad.

HAL 9000 wrote:
Well that's what editbin does.
Editbin has some more options taken from the linker, e.g. it may clear or recalculate checksums when saving changes.

HAL 9000 wrote:
As far as I saw DFHL uses strictly unicode Strings and Api's. However what is probably missing is to also decode strings that are read and encode strings that are written out.
It seems to be a limitation of console output. If you want to keep Unicode file and directory names, you should always log them to a (UTF-16) text file. That way only the "bytes saved" statistics are sent to stdout, to let the user know that the tool is still working.
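A small Python sketch of that split, with full names going to a UTF-16 log and only a plain summary going to stdout. `report_links` and its arguments are made-up names, and the log line format is just an example:

```python
import sys


def report_links(pairs, log_path, out=sys.stdout):
    """Log each (kept_path, linked_path, bytes_saved) triple to a UTF-16
    text file and print only a summary line to out."""
    total_saved = 0
    count = 0
    # The "utf-16" codec writes a byte order mark, so editors detect the encoding.
    with open(log_path, "w", encoding="utf-16", newline="\r\n") as log:
        for kept, linked, saved in pairs:
            log.write(f"{linked} -> {kept}\n")
            total_saved += saved
            count += 1
    # Only ASCII reaches the console, so its code page does not matter.
    print(f"{count} files linked, {total_saved} bytes saved", file=out)
    return total_saved
```

This prints a single summary at the end; a long run would probably want to print the running total every so often instead.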

HAL 9000 wrote:
... or even better file an issue on the DFHL-Source codepage on GitHub.
You mean: Sign up and file an issue, right?

HAL 9000 wrote:
Okay regarding file fragmentation and assuming that old file are less fragmented than new ones it is maybe important which file DFHL picks.
That's not what I mean. Some installers don't preserve original timestamps, some developers change timestamps with every release even for third party libs they use, no matter whether the files are really changed or recompiled – and you can only guess what's happened, as the file with the newer timestamp may have the same size and older or missing version number.

And a question about other possible options - How to deal with NTFS compressed or sparse files?
_________________
Regards from Poland
Andrzej P. Wozniak
All times are GMT - 6 Hours