Universal EOL in Regular Expressions.

Post by *olesio »

Hello. Sorry for posting this question here, but there are many experienced
Delphi programmers on this forum, so maybe someone can help me. I wrote to the
author of TRegExpr and posted on two Polish Delphi forums, but have had no
answers yet, so I decided to ask here. I made a simple program that downloads
the HTML code from http://www.money.pl/pieniadze/nbp/srednie/index.html
and parses it using the RegExpr module (code below). It works fine.
But if the authors of the website edit the page with some Unix editor, the EOL
characters will change from $0D$0A to $0A. So how should I set HtmlRE to
"catch" lines with any EOL characters? I tried $, but this does not work.
I read in the help that I must use the modifier "m", so I tried adding the
following line: R.ModifierStr := 'mgsrix'; but this did not help. So how can I
match a universal end of line ($0D$0A, $0A or $0D)? Is this possible
in the RegExpr module? I have read the help file, but my English, as you
may see, is not too good :( Thanks in advance for any help. Here is my function:

Code:

// Requires the RegExpr unit in the uses clause.
function ExtractHtml(const AInputString: string): string;
const
  Values: array[1..3] of string = ('1 EUR', '1 USD', '1 CHF');
var
  I: Integer;
  R: TRegExpr;
  HtmlRE: string;
begin
  Result := '';
  R := TRegExpr.Create;
  try
    for I := Low(Values) to High(Values) do
    begin
      // The line endings are hard-coded as CRLF (\x0D\x0A).
      HtmlRE := '<td>' + Values[I] + '</td>\x0D\x0A' +
        '<td>(\S+)</td>\x0D\x0A<td>(\S+)</td>\x0D\x0A<td class="';
      R.Expression := HtmlRE;
      if R.Exec(AInputString) then
        repeat
          // Match[2] is the second captured value in the row.
          Result := Result + R.Match[2] + #13#10;
        until not R.ExecNext;
    end;
  finally
    R.Free;
  end;
end;

Best regards: olesio

Post by *Alextp »

Can you try to catch all EOLs by their hex codes?
Something like (code1|code2|code3)?
I don't really use TRegExpr myself
(I prefer DIRegEx).
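
A minimal sketch of the alternation idea suggested above, assuming the RegExpr unit (TRegExpr) is available to the compiler. The program name and the sample strings are hypothetical and only illustrate one pattern matching all three line-ending styles; they are not taken from the real page.

```pascal
program EolAlternationDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  RegExpr;

const
  // Hypothetical inputs: the same two cells separated by CRLF, LF and CR.
  Samples: array[1..3] of string = (
    '<td>a</td>'#13#10'<td>b</td>',
    '<td>a</td>'#10'<td>b</td>',
    '<td>a</td>'#13'<td>b</td>');
var
  R: TRegExpr;
  I: Integer;
begin
  R := TRegExpr.Create;
  try
    // Group 1 is the line ending, group 2 is the content of the second cell.
    R.Expression := '<td>a</td>(\x0D\x0A|\x0A|\x0D)<td>(\S+)</td>';
    for I := Low(Samples) to High(Samples) do
      if R.Exec(Samples[I]) then
        WriteLn('Sample ', I, ': matched, second cell = ', R.Match[2])
      else
        WriteLn('Sample ', I, ': no match');
  finally
    R.Free;
  end;
end.
```

Note that the parentheses around the alternation create a capture group of their own, so every group after it shifts by one in R.Match[].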

Post by *olesio »

Thank you for the answer and the hint. The code below works fine:

Code:

function ExtractHtml(const AInputString: string): string;
const
  Values: array[1..3] of string = ('1 EUR', '1 USD', '1 CHF');
  // Matches any line ending: CRLF, LF or CR.
  EOL = '(\x0D\x0A|\x0A|\x0D)';
var
  I: Integer;
  R: TRegExpr;
  HtmlRE: string;
begin
  Result := '';
  R := TRegExpr.Create;
  try
    for I := Low(Values) to High(Values) do
    begin
      // Every line break uses the EOL group, including the one between the two values.
      HtmlRE := '<td>' + Values[I] + '</td>' + EOL +
        '<td>(\S+)</td>' + EOL + '<td>(\S+)</td>' + EOL + '<td class="';
      R.Expression := HtmlRE;
      if R.Exec(AInputString) then
        repeat
          // Groups 1, 3 and 5 are line endings; group 4 is the second value.
          Result := Result + R.Match[4] + #13#10;
        until not R.ExecNext;
    end;
  finally
    R.Free;
  end;
end;

Best regards: olesio