Using Regular Expressions in batch files


Is it possible to write regular expressions into a batch file so that it runs a find and replace on all files?

I'm currently converting docx to txt using pandoc commands in a batch file, and I want the batch file to also run a find and replace on the converted files to clean them up, rather than having to open each one in Notepad++ and run the few find and replaces manually.

I'm trying to turn a 5-10 minute job into a 2 minute max job for work, really, but I'm fairly unfamiliar with coding.

Trying to run the following:

FIND: \r\n
REPLACE: \r\n\r\n

FIND: STUDENT: [^\r\n]{1,200}
REPLACE: STUDENT:

FIND: _[A-z].{1,300}
REPLACE: [nothing] (with ". matches newline" ticked)

FIND: -{2,2000}
REPLACE: [nothing]

I'd appreciate it if anyone could help out or at least point me in the right direction. I have no experience with this aside from editing existing batch files to make them do what I want.

Best Answer

You could use JREPL.BAT to perform the find/replace on a single file, and then package that within a FOR loop to apply it to "all" files.

JREPL.BAT is a regular expression command line text processor. It is pure script (hybrid JScript/batch) that runs on any Windows machine from XP onward, no 3rd party exe files required. Full documentation is available from the command line via jrepl /?, or jrepl /?? for paged help.
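
As a minimal sketch of the single-file case, the script below applies just one of the question's rules (removing runs of hyphens) to one file. The file name notes.txt is hypothetical, and the sketch assumes JREPL.BAT sits in the current folder or somewhere on PATH. /F names the file to read and /O - writes the result back over it, so try it on a copy first.

```batch
@echo off
rem Hypothetical file name: substitute one of your converted text files.
rem Assumption: JREPL.BAT is in this folder or reachable via PATH.
rem Deletes every run of two or more hyphens and updates the file in place.
call jrepl "-{2,2000}" "" /f "notes.txt" /o -
```

Checking one rule on one file like this makes it easier to confirm each expression does what you expect before combining them.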

I'm not entirely sure how to interpret your 3rd replace value, but you should be able to fix any misinterpretation on my part.

The code below uses the /T (translate) option, which allows you to specify a series of find/replace pairs. I chose to delimit the expressions with |. The first argument contains the delimited list of search terms, and the second argument contains the delimited list of replacement terms.

The /X option enables escape sequences in the replacement expressions.

The /M option activates multi-line mode so you can search across line breaks.

The /F option specifies the source file, and /O - causes the result to overwrite the original file (update in place).

I have added ^ line continuation for readability.

From the command line:

for %F in (*.txt) do @jrepl "\r\n|STUDENT: [^\r\n]{1,200}|_[A-z].{1,300}|-{2,2000}"^
                            "\r\n\r\n|STUDENT:||" /m /x /t "|" /f "%F" /o -

From a batch script:

@echo off
for %%F in (*.txt) do (
  call jrepl "\r\n|STUDENT: [^\r\n]{1,200}|_[A-z].{1,300}|-{2,2000}"^
             "\r\n\r\n|STUDENT:||" /m /x /t "|" /f "%%F" /o -
)
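
Since the question describes converting docx to txt with pandoc in the same batch file, the two steps could be combined roughly as follows. This is only a sketch under assumptions: the asker's actual pandoc command is not shown, so the -t plain and -o options here are a guess at it, and both pandoc and JREPL.BAT are assumed to be reachable via PATH. The jrepl arguments are the same as in the script above, written on one line.

```batch
@echo off
setlocal
rem Assumption: pandoc and JREPL.BAT are both reachable via PATH.
rem Assumption: "-t plain" stands in for whatever pandoc options you already use.
rem Each docx in the current folder is converted to a txt file with the same
rem base name, and that txt file is then cleaned up in place by JREPL.
for %%F in (*.docx) do (
  pandoc "%%F" -t plain -o "%%~nF.txt"
  call jrepl "\r\n|STUDENT: [^\r\n]{1,200}|_[A-z].{1,300}|-{2,2000}" "\r\n\r\n|STUDENT:||" /m /x /t "|" /f "%%~nF.txt" /o -
)
endlocal
```

Looping over the docx files rather than over *.txt means only the files converted in this run are modified, which avoids re-processing text files that were already cleaned.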