Excel – Create table, by conditionally copying rows from another table – possible in MS-Excel

microsoft excelmicrosoft-excel-2007worksheet-function

Every week, a new Excel file is generated by a CRM system, which has the date-stamp in it's file-name, but placed in a folder at a fixed location (a sharepoint URL).

This file is quite large, and usually contains anywhere between 25-30 thousand rows. Out of this, I need to create a new table, relevant to my purpose.

Rows relevant to my purpose, must match one (or more) keywords/key-phrases, from a list of 30-40 such keywords/key-phrases. This list of keywords/key-phrases, however also grows, but slowly, s.a. adding new keywords once every few months.

Is there some way to automate this manual, tedious and somewhat error-prone task ?

Best Answer

Because three different columns (assumed below to be B, D and F) need to be checked, an OR function looks appropriate:

=OR(ISNUMBER(MATCH(B1,keyarray,0)),
    ISNUMBER(MATCH(D1,keyarray,0)),
    ISNUMBER(MATCH(F1,keyarray,0)))

so that a match for any of the three columns will return TRUE. Place this in a ‘helper’ column in the first row occupied by data (say Row 1, or adjust B1,D1 and F1 above accordingly).

MATCH checks to see whether, for example, the value of B1 exists in keyarray, where that is the given name for the range that contains the list of keywords/key-phrases (~30-40 in number) – need not be in same sheet or workbook but if not the full path must be specified and recommended that the 'other' workbook is open when applying the formula to a new batch of data.

0 forces an exact match only (alternatively -1 for smallest value that is greater than or equal to B1, or 1 for largest.)

MATCH returns the location of the found value in the array (otherwise, with parameter 0, #N/A). That is numeric so ISNUMBER tests for a number (any number) – to exclude #N/A results.

Hence provided any of B1, D1 or F1 is in keyarray the result will be TRUE – otherwise FALSE.

To copy the formula down over 25-30 thousand rows conveniently where some contain merged cells, place something (say ‘end’) in the intersection of the last occupied row and the ‘helper’ column (to prevent expanding the size of the spreadsheet unnecessarily). Copy the cell containing the formula, select the cell immediately below it then Ctrl+Shift+Down/Paste to fill the helper column downwards for all occupied rows, without continuing past last occupied row and with overwriting ‘end’.

Filter on ‘helper’ column for TRUE, select all occupied columns, copy and paste into new sheet/workbook. Delete blank rows in new sheet/workbook and save. (May choose to delete ‘helper’ column from source also.)

Ensure that when occasionally adding items to keyarray the named range covers the additions.

Related Question