MATLAB: Problem with regular expressions

regexp, regular expression

Given the string:
str='""A_3_1"": [""choice_0"", ""choice_1"", ""choice_2"", ""choice_3""], ""A_2_1"": [""choice_1"", ""choice_2""]'
I want to use regexp to extract the numbers that follow the word "choice", grouped separately for the two keys A_3_1 and A_2_1.
The output for A_3_1 should be:
[0 1 2 3]
and the output for A_2_1 should be:
[1 2]

Best Answer

  • Here is one way. It is not especially robust: it assumes that each "choice" number is a single digit, and that the capital letter 'A' appears only at the start of each key. However, it should at least give you a rudimentary algorithm that works, as a starting point.
    % Locate the start of each key. An extra 'A' is appended so that the last
    % segment also has an end marker.
    indices = regexp([str,'A'],'A');
    numberIndices = numel(indices) - 1;
    % Preallocate the cell array that holds one vector per key
    values = cell(1,numberIndices);
    for ni = 1:numberIndices
        % Extract the substring belonging to this key
        substr = str(indices(ni):indices(ni+1)-1);
        % Find where each "choice_" string begins
        choiceIdx = regexp(substr,'choice_');
        % The digit sits 7 characters after the start of 'choice_'. Convert the
        % digits to numbers; each vector of values is stored in a cell array.
        values{ni} = str2num(substr(choiceIdx+7)')';
    end
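
If the choice numbers might have more than one digit, one possible variation is to let regexp capture the digits with the 'tokens' option instead of indexing a fixed offset. The following is only a sketch under some assumptions: keys always look like A_<digits>_<digits>, each list is enclosed in square brackets with no nested brackets, and the variable names groups, keys and nums are illustrative, not part of the original answer.

```matlab
% Same input as in the question (the doubled double quotes are kept as-is).
str = '""A_3_1"": [""choice_0"", ""choice_1"", ""choice_2"", ""choice_3""], ""A_2_1"": [""choice_1"", ""choice_2""]';

% Capture each key (assumed form: A_<digits>_<digits>) together with the
% bracketed list that follows it.
groups = regexp(str, '(A_\d+_\d+)[^\[]*\[([^\]]*)\]', 'tokens');

keys   = cell(1, numel(groups));
values = cell(1, numel(groups));
for k = 1:numel(groups)
    keys{k} = groups{k}{1};
    % Capture the digits after every "choice_" inside this list.
    nums = regexp(groups{k}{2}, 'choice_(\d+)', 'tokens');
    % Each token is a 1x1 cell; flatten them, then convert to a numeric row vector.
    values{k} = str2double([nums{:}]);
end
```

For the string in the question, values{1} is [0 1 2 3] and values{2} is [1 2], with the matching key names in keys. Using str2double on the captured tokens also avoids str2num, which evaluates its input as MATLAB code.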