Extract URLs from text and HTML page
Moderators: white, Hacker, petermad, Stefan2
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Extract URLs from text and HTML page
Hi everyone,
A question: is it possible to extract URLs from text or from an HTML page with Total Commander or other tools?
Thanks
- ghisler(Author)
- Site Admin
- Posts: 48173
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Re: Extract URLs from text and HTML page
You can view the html file with F3, and then copy the URL (or all of them together) via right click.
Author of Total Commander
https://www.ghisler.com
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
I did not explain myself well.
From a web page or text that contains code and other content, I want to extract only the URLs starting with the prefix http, https, or ftp.
Nothing else interests me.
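For readers who prefer a script, the requirement above (keep only URLs that start with http, https, or ftp) can be sketched in Python. This is a minimal sketch, not a complete URL parser: the pattern, the `extract_urls` helper, and the sample text are assumptions made for illustration, and the pattern simply stops at whitespace, quotes, and angle brackets.

```python
import re

# Assumed pattern: a scheme (http, https or ftp), "://", then a run of
# characters that are not whitespace, quotes or angle brackets.
URL_PATTERN = re.compile(r"""\b(?:https?|ftp)://[^\s"'<>]+""", re.IGNORECASE)

def extract_urls(text):
    """Return every http/https/ftp URL found in text, in order of appearance."""
    # Strip punctuation that usually belongs to the sentence, not the URL.
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(text)]

# Hypothetical sample mixing HTML and plain text.
sample = ('See <a href="https://www.ghisler.com">TC</a> or '
          'ftp://example.org/file.zip and http://example.com/a?b=1.')

for url in extract_urls(sample):
    print(url)
```

Because the pattern works on raw text, it treats HTML source and plain text alike; it does not resolve relative links.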
Re: Extract URLs from text and HTML page
Alexisback wrote: ↑2018-10-28, 16:31 UTC
Is it possible to extract url from a text and an html page with Total Commander or other tools?
Yes, there are tools for that.
Here with TC, for example, this > "[WCX] RegXtract - String Extractor with RegEx - RegXtract packer plug-in"
viewtopic.php?f=6&t=38638
2milo1012
I, OTOH, would use a text editor or a script, or both, e.g. with EmEditor or PSPad. There are many examples of that on the g00gle pages.
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
Stefan2 wrote: ↑2018-10-30, 13:09 UTC
Yes, there are tools for that.
Here with TC, for example, this > "[WCX] RegXtract - String Extractor with RegEx - RegXtract packer plug-in"
viewtopic.php?f=6&t=38638
2milo1012
I, OTOH, would use a text editor or a script, or both, e.g. with EmEditor or PSPad. There are many examples of that on the g00gle pages.
Thanks.
I will try it and see what I can do, even though I do not know regular expressions.
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
I use Notepad++.
Is it possible to integrate it?
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
This works.
Thanks
In Notepad++, in the Replace menu (CTRL+H) you can do the following:
Find:
.*?(http\:\/\/www\.[a-zA-Z0-9\.\/\-]+)
Replace:
$1\n
Options: check "Regular expression" and ". matches newline"
This will return a list of all your links. There are two issues though:
The regex you provided for matching URLs is far from being generic enough to match any URL. If it is working in your case, that's fine, else check this question.
It will leave the text after the last matched URL intact. You have to delete it manually.
https://stackoverflow.com/questions/19717092/regex-filter-links-from-a-document
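The two issues mentioned above can be reproduced outside Notepad++. The following is a rough Python sketch, assuming that `re.DOTALL` stands in for the ". matches newline" option and that `r"\1\n"` stands in for the `$1\n` replacement; the sample text is made up for illustration.

```python
import re

# The Find pattern from the post, unchanged; it only matches "http://www." links.
FIND = r".*?(http\:\/\/www\.[a-zA-Z0-9\.\/\-]+)"

# Hypothetical input: two matching links, one https link, and trailing text.
text = ("intro http://www.example.com/a text\n"
        "more http://www.test.org/b-c and https://www.example.net/d tail")

# Same idea as the Replace dialog: keep the captured link, add a newline.
replaced = re.sub(FIND, r"\1\n", text, flags=re.DOTALL)
print(repr(replaced))

# Collecting the capture groups directly avoids the leftover tail.
links = re.findall(FIND, text, flags=re.DOTALL)
print(links)
```

In this sketch the text after the last match (including the https link, which the pattern does not cover) stays in `replaced`, while `links` holds only the two captured URLs.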
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
The problem with Notepad++ is that regular expressions cannot be saved.
It would take a tool with a database to save the snippets.
Does something like this exist?
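As one possible do-it-yourself answer, named regex snippets can be kept in a small SQLite database. This is only a sketch under stated assumptions: the `snippets` table and the `save_snippet` and `load_snippet` helpers are hypothetical names invented here, not part of any existing tool, and the in-memory database would need to be replaced by a file path to keep snippets between runs.

```python
import re
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snippets (name TEXT PRIMARY KEY, pattern TEXT NOT NULL)"
)

def save_snippet(name, pattern):
    """Store a regex under a name, rejecting patterns that do not compile."""
    re.compile(pattern)  # raises re.error early if the pattern is invalid
    conn.execute(
        "INSERT OR REPLACE INTO snippets VALUES (?, ?)", (name, pattern)
    )
    conn.commit()

def load_snippet(name):
    """Return the compiled regex stored under name, or raise KeyError."""
    row = conn.execute(
        "SELECT pattern FROM snippets WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return re.compile(row[0])

save_snippet("urls", r"(?:https?|ftp)://[^\s\"'<>]+")
found = load_snippet("urls").findall("a http://example.com b ftp://example.org/x")
print(found)
```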
Re: Extract URLs from text and HTML page
Alexisback wrote: ↑2018-10-30, 12:50 UTC
in a web page or text that contains code and more I want to extract only the url
starting with the prefix http or https or ftp
everything else does not interest me
Just as ghisler(Author) answered you, Lister can do this. Here are the detailed steps:
1- Put the cursor on the web page file.
2- Press <F3> to open the file with Lister.
3- From the Options menu, select "5 HTML text (strip tags)" (TC usually auto-detects the file contents and preselects that option).
4- Right-click on a link or on white space and select Copy URL or Copy all URLs.
5- Go to your text editor and paste.
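For comparison, a script can do something roughly similar to "Copy all URLs" by reading the link targets out of the HTML tags. This is a minimal sketch using Python's standard `html.parser` module; it is not how Lister works internally, and the `LinkCollector` class, the `copy_all_urls` helper, and the sample page are assumptions made for illustration. Only `href` values starting with http, https, or ftp are kept.

```python
from html.parser import HTMLParser

WANTED_PREFIXES = ("http://", "https://", "ftp://")

class LinkCollector(HTMLParser):
    """Collect href values that start with one of the wanted prefixes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "href" and value and value.startswith(WANTED_PREFIXES):
                self.urls.append(value)

def copy_all_urls(html):
    """Return the absolute http/https/ftp link targets found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls

# Hypothetical page with one relative link that should be skipped.
page = ('<p><a href="https://www.ghisler.com">TC</a> '
        '<a href="page2.html">next</a> '
        '<a href="ftp://example.org/f.zip">file</a></p>')
print(copy_all_urls(page))
```

Unlike a plain regex over the source, this only sees URLs that appear in `href` attributes, so links written as bare text in the page are not collected.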
-
- Junior Member
- Posts: 80
- Joined: 2016-10-26, 20:04 UTC
Re: Extract URLs from text and HTML page
ts4242 wrote: ↑2018-10-30, 20:06 UTC
Just as ghisler(Author) answered you, Lister can do this. Here are the detailed steps:
1- Put the cursor on the web page file.
2- Press <F3> to open the file with Lister.
3- From the Options menu, select "5 HTML text (strip tags)" (TC usually auto-detects the file contents and preselects that option).
4- Right-click on a link or on white space and select Copy URL or Copy all URLs.
5- Go to your text editor and paste.
Thank you.