Decoding mhtml files
I have some MHTML files created with Chrome on Android (with the offline-save option, Chrome creates a single file that contains all the files and images of one web page). But I'm unable to decode these files to save some of the PNG files in them. I tried the MHT plugin (MhtUnPack), but it decodes only some of the files and not the PNG images. I also tried TC's MIME decode function, which decodes all the files but loses the file names, and the PNG files are not usable/viewable (decode error).
Any help with this task?
Here is an example MHTML file:
https://mega.nz/#!0tIiTLST!pALswG0cd1hv07Dd8Z7cVCFAhjsUNWCFeFTy8Q8IEdk
Thank you
PNG files are not encoded here; they are stored in binary form.
You can "recover" them using an appropriate program, e.g. a hex editor.
Just search for the PNG header:
Code:
‰PNG
or better, its hex value:
Code:
89504E47
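If you would rather not scroll through a hex editor, the same search can be scripted. Below is a minimal Python sketch, not a finished tool. It looks for the full 8-byte PNG signature (89 50 4E 47 0D 0A 1A 0A) instead of only the first four bytes, which cuts down on false hits. The function name `find_png_offsets` is made up for this example.

```python
# Full PNG file signature: 0x89 "PNG" CR LF 0x1A LF
PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def find_png_offsets(data):
    """Return the byte offsets of every PNG signature found in `data`."""
    offsets = []
    pos = data.find(PNG_SIGNATURE)
    while pos != -1:
        offsets.append(pos)
        # Continue searching one byte past the current hit.
        pos = data.find(PNG_SIGNATURE, pos + 1)
    return offsets
```

To try it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `find_png_offsets`; the offsets it returns are the positions you would otherwise jump to in the hex editor.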
I tried a couple of MHT decoders, including some TC plugins, but none of them were able to extract the PNG files in your MHT file.
So I wrote a little PowerShell script:
Code:
# Note: the [Type]::new() syntax requires PowerShell 5.0 or later.
$mhtFile = "c:\Temp\MHT\File1.mht"
$outputFolder = "c:\Temp\MHT"

# Codepage 28591 (ISO-8859-1) gives a 1-to-1 char-to-byte mapping,
# so the binary data survives the round trip through a string.
$Encoding = [Text.Encoding]::GetEncoding(28591)
$streamIn = [System.IO.StreamReader]::new($mhtFile, $Encoding)
$BinaryString = $streamIn.ReadToEnd()
$streamIn.Close()

# \x89\x50\x4E\x47 = byte 0x89 followed by "PNG", the start of the file header
$PNGRegex = [Regex] '\x89\x50\x4E\x47'
# \x49\x45\x4E\x44 = "IEND", the type of the last chunk; a 4-byte CRC follows it
$PNGENDRegex = [Regex] '\x49\x45\x4E\x44'

$PNGMatches = $PNGRegex.Matches($BinaryString)
$PNGENDMatches = $PNGENDRegex.Matches($BinaryString)
$MatchCount = $PNGMatches.Count
Write-Output "Total number of matches: $MatchCount"

# Pairs the n-th header with the n-th IEND; assumes both lists line up.
for ($counter = 0; $counter -lt $MatchCount; $counter++)
{
    $start = $PNGMatches[$counter].Index
    $end = $PNGENDMatches[$counter].Index
    Write-Output $counter
    Write-Output $start
    Write-Output $end
    # +8 covers the 4 bytes of "IEND" plus the 4-byte CRC after it
    $len = ($end - $start) + 8
    $tmpPNG = $BinaryString.Substring($start, $len)
    $streamOut = [System.IO.StreamWriter]::new("$outputFolder\ExtractedPNGs_$counter.png", $false, $Encoding)
    $streamOut.Write($tmpPNG)
    $streamOut.Close()
}
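One weak spot of matching headers against "IEND" strings is that the bytes "IEND" can also turn up inside image data or text chunks, which would shift the pairing. A possible alternative, sketched here in Python as an illustration only, is to walk the PNG chunk chain (4-byte big-endian length, 4-byte type, data, 4-byte CRC) from the signature until the IEND chunk is reached. The names `png_length` and `extract_pngs` are invented for this sketch, and the CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_length(data, start):
    """Return the length of the PNG beginning at `start`, or None if the
    chunk chain runs off the end of `data` before an IEND chunk."""
    if data[start:start + 8] != PNG_SIGNATURE:
        return None
    pos = start + 8
    while pos + 12 <= len(data):
        (size,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        pos += 12 + size  # length field + type + data + CRC
        if pos > len(data):
            return None
        if chunk_type == b"IEND":
            return pos - start
    return None


def extract_pngs(data):
    """Return a list of (offset, png_bytes) for every complete PNG in `data`."""
    images = []
    pos = data.find(PNG_SIGNATURE)
    while pos != -1:
        length = png_length(data, pos)
        if length is None:
            # Not a complete PNG; keep looking after this signature.
            pos = data.find(PNG_SIGNATURE, pos + 1)
            continue
        images.append((pos, data[pos:pos + length]))
        pos = data.find(PNG_SIGNATURE, pos + length)
    return images
```

Each returned pair can then be written out in binary mode, for example under a name built from the zero-padded offset so that the files sort in the order they appear in the MHT file.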
You can use some ripping software designed for games.
Jaeder Naub V2.0.1 worked for me, as did WinHex (Disk Tools -> File Recovery By Type). Both extracted 44 images. Naub is rather confusing to use: select the PNG format in the ripping options, load the packed file, then press Scan. The extracted files will appear in a subdirectory relative to where Naub is. The sort order of the filenames may need to be fixed by padding the offsets with leading zeros.
#148174 Personal license
Running Total Commander v8.52a
ZoSTeR wrote:
I tried a couple of MHT decoders, including some TC plugins, but none of them were able to extract the PNG files in your MHT file.
So I wrote a little PowerShell script:
Code:
$mhtFile = ...
Thank you. I tried to run the script on my Win7 machine but received an error:
Method invocation failed because [System.IO.StreamReader] doesn't contain a method named 'new'.
j7n wrote:
You can use some ripping software designed for games.
Jaeder Naub V2.0.1 worked for me, as did WinHex (Disk Tools -> File Recovery By Type). Both extracted 44 images. The program is rather confusing. Select PNG format in ripping options, Load the packed file, then press Scan. Extracted files will appear in a subdirectory relative to where Naub is. The sorting of the filenames may need to be fixed by prepending the offsets with zeros.
Yes, it works. Now I have to check whether I can also restore the filenames of the PNG files.
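On restoring the filenames: each part of an MHTML file normally has its own MIME headers, and the Content-Location header holds the URL the resource was saved from. A rough Python sketch of one way to use that follows; it is untested against the example file. Given the offset where a PNG starts, it looks back to the preceding boundary line and takes the last path segment of the Content-Location URL. The function name `guess_name` and the fallback argument are made up for this example, and folded (multi-line) headers are not handled.

```python
import posixpath
import re
from urllib.parse import unquote, urlsplit

CONTENT_LOCATION = re.compile(
    rb"^Content-Location:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)


def guess_name(data, offset, fallback):
    """Guess a filename for the MIME part whose body starts at `offset`."""
    # The part's headers sit between the preceding boundary line
    # (which starts with "--") and the body that begins at `offset`.
    boundary = data.rfind(b"\n--", 0, offset)
    headers = data[max(boundary, 0):offset]
    match = CONTENT_LOCATION.search(headers)
    if not match:
        return fallback
    url = match.group(1).decode("latin-1")
    name = posixpath.basename(unquote(urlsplit(url).path))
    # Replace characters that are not allowed in Windows filenames.
    name = re.sub(r'[<>:"/\\|?*]', "_", name)
    return name or fallback
```

Two images may share the same basename, so it is probably safer to prefix the result with the offset or a counter before saving.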