Notepad++: deleting everything except 40 values and putting each on its own line with a regex

find and replace, notepad, regex

I am using Notepad++ and have the following data embedded in a large HTML file. I want to extract the values that sit just before the closing </ix:nonNumeric> tag at the end of each line and put each value on its own line, so that the output is:


00891906
1.12.13
30.11.14
30.11.14
Company accounts
Private Limited Company

etc.

There is more data, but if I can get a regex to do this, I will be able to work the rest out. Once it is working, I will use Batch Replace on a directory to apply it to a number of .txt files. Thanks.

            `<ix:hidden>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:NameAuthor" order="1" tupleRef="XBRLDocumentAuthorGrouping_Group45" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL"></ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:DescriptionOrTitleAuthor" order="2" tupleRef="XBRLDocumentAuthorGrouping_Group45" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL"></ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:UKCompaniesHouseRegisteredNumber" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">00891906</ix:nonNumeric>
                <ix:nonNumeric contextRef="CountriesHypercube_FY_30_11_2014_Set1" name="ns7:CountryFormationOrIncorporation" format="ixt2:nocontent" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL" />
                <ix:nonNumeric contextRef="CurrenciesHypercube_FY_30_11_2014_Set2" name="ns7:PrincipalCurrencyUsedInBusinessReport" format="ixt2:nocontent" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL" />
                <ix:nonNumeric contextRef="EntityOfficersHypercube_FY_30_11_2014_Set3" name="ns5:NameDirectorSigningAccounts" format="ixt2:nocontent" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL" />
                <ix:nonNumeric contextRef="cfwd_30_11_2014" name="ns7:StartDateForPeriodCoveredByReport" format="ixt2:datedaymonthyear" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">1.12.13</ix:nonNumeric>
                <ix:nonNumeric contextRef="cfwd_30_11_2014" name="ns7:EndDateForPeriodCoveredByReport" format="ixt2:datedaymonthyear" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">30.11.14</ix:nonNumeric>
                <ix:nonNumeric contextRef="cfwd_30_11_2014" name="ns7:BalanceSheetDate" format="ixt2:datedaymonthyear" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">30.11.14</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:EntityAccountsType" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">Company accounts</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:LegalFormOfEntity" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">Private Limited Company</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:DescriptionPeriodCoveredByReport" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">FY</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:EntityTrading" format="ixt2:booleantrue" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">true</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:EntityDormant" format="ixt2:booleanfalse" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">false</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns5:AccountsPreparedUnderHistoricalCostConventionInAccordanceWithFRSSE" format="ixt2:booleantrue" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">true</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns5:CompanyExemptFromPreparingCashFlowStatementUnderFRS1" format="ixt2:booleanfalse" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">false</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns5:AccountsHaveBeenPreparedInAccordanceWithProvisionsSmallCompaniesRegime" format="ixt2:booleantrue" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">true</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns5:RelatedPartyTransactionExemptionBeingClaimed" format="ixt2:booleanfalse" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">false</ix:nonNumeric>
                <ix:nonNumeric contextRef="FY_30_11_2014" name="ns6:CompanyHasActedAsAnAgentDuringPeriod" format="ixt2:booleanfalse" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">false</ix:nonNumeric>
                <ix:nonNumeric contextRef="SharesHypercube_FY_30_11_2014_Set4" name="ns7:DescriptionShareType" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">Ordinary</ix:nonNumeric>
                <ix:nonFraction contextRef="SharesHypercube_FY_30_11_2014_Set4" name="ns5:ParValueShare" unitRef="GBP" decimals="INF" format="ixt2:numdotdecimal" scale="0" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">1.00000</ix:nonFraction>
            <ix:tuple name="ns7:XBRLDocumentAuthorGrouping" tupleID="XBRLDocumentAuthorGrouping_Group45" /></ix:hidden>
            <ix:references>
            <link:schemaRef xlink:href="http://www.xbrl.org/uk/gaap/core/2009-09-01/uk-gaap-full-2009-09-01.xsd" xlink:type="simple" /></ix:references>
            <ix:resources>
            <xbrli:unit id="GBP"><xbrli:measure>iso4217:GBP</xbrli:measure></xbrli:unit><xbrli:unit id="USD"><xbrli:measure>iso4217:USD</xbrli:measure>`

Best Answer

Based on the example you gave, the following regex will work in the Replace dialog with Search Mode set to "Regular expression":

Find what: .+?(<.+?>)(.+?)(<.+?>)

Replace with: \2\r

This will give the following result with your data:

VARIABLE 1
VARIABLE 2
VARIABLE 3
 randomrandom random randomrandom random randomrandom  random random randomrandom random randomrandom random

Only the last line is left unfiltered, but it can be removed manually.
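
If doing this outside Notepad++ is an option, the same idea (keep only the text that sits between an opening <ix:nonNumeric ...> tag and its closing tag) can be sketched in Python. This is an alternative technique, not the Notepad++ replace above, and it is only a minimal sketch: the pattern, the `extract_values` helper, and the abridged `sample` string are assumptions based on the lines shown in the question, and it deliberately skips empty and self-closing elements.

```python
import re

# Hypothetical pattern, derived from the sample lines in the question:
#   <ix:nonNumeric\b   an opening ix:nonNumeric tag
#   [^<>]*(?<!/)>      its attributes, refusing self-closing tags that end in "/>"
#   ([^<]+)            the non-empty text we want to keep
#   </ix:nonNumeric>   the matching closing tag
VALUE_RE = re.compile(r"<ix:nonNumeric\b[^<>]*(?<!/)>([^<]+)</ix:nonNumeric>")


def extract_values(html):
    """Return the text content of every non-empty ix:nonNumeric element."""
    return [value.strip() for value in VALUE_RE.findall(html)]


# Abridged copy of a few lines from the question (some attributes omitted).
sample = """
<ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:NameAuthor" order="1"></ix:nonNumeric>
<ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:UKCompaniesHouseRegisteredNumber" xmlns:ix="http://www.xbrl.org/2008/inlineXBRL">00891906</ix:nonNumeric>
<ix:nonNumeric contextRef="CountriesHypercube_FY_30_11_2014_Set1" name="ns7:CountryFormationOrIncorporation" format="ixt2:nocontent" />
<ix:nonNumeric contextRef="cfwd_30_11_2014" name="ns7:StartDateForPeriodCoveredByReport" format="ixt2:datedaymonthyear">1.12.13</ix:nonNumeric>
<ix:nonNumeric contextRef="FY_30_11_2014" name="ns7:EntityAccountsType">Company accounts</ix:nonNumeric>
"""

# One value per line, as requested in the question.
print("\n".join(extract_values(sample)))
# 00891906
# 1.12.13
# Company accounts
```

To cover a directory of .txt files, the same function could be called on the contents of each file found with `pathlib.Path(...).glob("*.txt")`. For anything more irregular than the lines shown here, a real XML parser would be more robust than a regex.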