Szathmáry László's homepage @ DEIK | Nim2 / regexp examples

See also std/re

(1) get a group from a string
import re

text = "Asian.lst"
result = re.search(r"(.*)\.lst", text)
if result:
    filename = result.group(1)
    print(filename)  # Asian

import std/re

let text = "Asian.lst"
let pattern = re"(.*)\.lst"
var matches: array[1, string]
if match(text, pattern, matches):
  let filename = matches[0]
  echo filename  # Asian

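Warning! match() is anchored at the beginning of the string: the pattern must match starting at the first character, but it does not have to consume the whole string (end the pattern with $ to require that). To look for a match anywhere in the string, use contains() or find(). See also the 2nd example below.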
The type of pattern is Regex. Note that Nim's extended raw string literals support the syntax re"[abc]" as a short form for re(r"[abc]").
The size of the matches array must match the number of capture groups (parentheses) in your pattern.
array[1, string] holds the capture groups (index 0 = first group, unlike Python's group(1)).
match() fills the array in place and returns a bool.
No need for a result object: captures land directly in matches.
See the docs
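The index difference comes from Python reserving group(0) for the whole match. As a minimal Python sketch (the two-group pattern is made up for illustration and is not part of the original example), Python's groups() tuple is indexed from 0 and therefore lines up with the Nim matches array:

```python
import re

# Hypothetical two-group pattern, used only to illustrate group numbering.
result = re.search(r"(.*)\.(lst|txt)", "Asian.lst")
assert result is not None

print(result.group(0))     # Asian.lst  (the whole match)
print(result.group(1))     # Asian      (first capture group)
print(result.groups())     # ('Asian', 'lst')
print(result.groups()[0])  # Asian      (same index as matches[0] in Nim)
```

With two groups, the Nim side would need an array[2, string].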

(2) match a string against a regexp
import re

text = "Asian.lst"
result = re.search(r"ian", text)
if result:
    print("contains 'ian'")

import std/re

let text = "Asian.lst"

# V1
echo match(text, re"ian")      # false: "ian" does not match at the start of "Asian.lst"
echo match(text, re".*ian.*")  # true, full match

# V2
echo contains(text, re"ian")   # true: "ian" is found somewhere inside it
if contains(text, re"ian"):
  echo "contains 'ian'"        # (printed on the screen)

match() checks whether the pattern matches starting at the beginning of the string.
contains() checks whether it matches anywhere in the string.
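Python splits the same distinction across three functions, which may help when deciding which Nim proc to reach for. This is a minimal sketch; the comparison with Nim's contains() in the comment is a loose analogy, not an exact equivalence.

```python
import re

text = "Asian.lst"

# re.search(): the pattern may match anywhere (compare Nim's contains()).
found_anywhere = re.search(r"ian", text) is not None

# re.match(): the pattern must match at position 0, but may stop early.
anchored_miss = re.match(r"ian", text) is not None
anchored_prefix = re.match(r"Asian", text) is not None

# re.fullmatch(): the pattern must consume the whole string.
whole_miss = re.fullmatch(r"Asian", text) is not None
whole_hit = re.fullmatch(r".*ian.*", text) is not None

print(found_anywhere, anchored_miss, anchored_prefix, whole_miss, whole_hit)
# True False True False True
```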

(3) read a file line by line and match a regexp against each line

import re

f1 = open("input.txt", "r")
p = re.compile(r"dog", re.IGNORECASE)

for line in f1:
    line = line.rstrip("\n")
    if p.search(line):
        print(line)

f1.close()

import std/re

let f = open("input.txt", fmRead)
#let pattern = re"dog"  # not enough, case-sensitive
let pattern = re(r"dog", {reIgnoreCase})  # case-insensitive

for line in f.lines:
  if contains(line, pattern):
    echo line

f.close()

input.txt:

yo
dogdog
a dog is here
a cat is here
Snoop Doggy Dog
pussycat

output:

dogdog
a dog is here
Snoop Doggy Dog

The type of pattern is Regex.
When you write re"dog", this actually calls re(r"dog"), where re() is a function that returns a Regex object.

(4) replace the first, then all the occurrences of a substring in a string

import re

text = "a dog and a dog"
text = re.sub(r"d.g", "cat", text, count=1)
print(text)  # a cat and a dog
text = re.sub(r"d.g", "cat", text)
print(text)  # a cat and a cat

import std/re

let text = "a dog and a dog"
echo replace(text, re"d.g", "cat")  # a cat and a cat

Unfortunately, std/re's replace() has no count parameter, so it always replaces every occurrence.
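One way to limit the replacement without a count parameter is to locate the first match and splice the string by hand. The idea is sketched in Python below, where it is not actually needed (re.sub() has count), because the splice carries over to any regex API that reports match bounds. The helper name replace_first is hypothetical. In Nim, std/re's findBounds() returns the first and last index of a match, so the same approach should be possible there; that mapping is an assumption and is not shown on the original page.

```python
import re

def replace_first(text: str, pattern: str, repl: str) -> str:
    """Replace only the first match of pattern in text (hypothetical helper)."""
    m = re.search(pattern, text)
    if m is None:
        return text  # nothing to replace
    # Keep everything before the match, insert repl, keep everything after.
    return text[:m.start()] + repl + text[m.end():]

print(replace_first("a dog and a dog", r"d.g", "cat"))  # a cat and a dog
print(replace_first("no match here", r"d.g", "cat"))    # no match here
```

Unlike re.sub(), this sketch inserts repl literally and does not expand backreferences such as \1.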

(5) find all the occurrences of a substring in a string

import re

text = '<a href="ad1">sdqs</a>' + '<a href="ad2">sds</a><a href=ad3>qs</a>'

m = re.findall(r'href="?(.*?)"?>', text)
print(m) # ['ad1', 'ad2', 'ad3']

import std/re

proc findAllPy(text: string, pattern: Regex): seq[string] =
  # In the pattern, only 1 capture group is supported.
  var capture: array[1, string]
  var start = 0
  while start < text.len:
    # find() returns the start index of the next match, or -1
    let m = find(text, pattern, capture, start)
    if m < 0: break
    result.add(capture[0])
    # Resume one character after the match start, so overlapping
    # matches are possible (Python's findall() never overlaps).
    start = m + 1

let
  text = """<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>"""
  pattern = re"""href="?(.*?)"?>"""

let li1 = findAll(text, pattern)
echo li1
# @["href=\"ad1\">", "href=\"ad2\">", "href=ad3>"]

let li2 = findAllPy(text, pattern)
echo li2  # @["ad1", "ad2", "ad3"]

Nim's findAll() differs from Python's findall(). Unfortunately, findAll() returns the full match, not the capture group. However, we can write a custom function that behaves similarly to Python.

The current findAllPy() implementation has a limitation: in the pattern, you can only have 1 capture group. If you want to support multiple capture groups, then the function must be extended.
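For reference, this is how Python's findall() changes its return value with the number of capture groups, which is presumably the behavior an extended findAllPy() would have to mimic. The zero-group and two-group patterns below are illustrative assumptions and do not come from the original example.

```python
import re

text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>'

# No capture group: a list of full matches (what Nim's findAll() returns).
no_group = re.findall(r'href="?.*?"?>', text)
print(no_group)    # ['href="ad1">', 'href="ad2">', 'href=ad3>']

# One capture group: a list of that group's text.
one_group = re.findall(r'href="?(.*?)"?>', text)
print(one_group)   # ['ad1', 'ad2', 'ad3']

# Two capture groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'href="?(.*?)"?>(.*?)</a>', text)
print(two_groups)  # [('ad1', 'sdqs'), ('ad2', 'sds'), ('ad3', 'qs')]
```

A multi-group Nim version would need a larger capture array and a nested result type, for example seq[seq[string]].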