MATLAB: Reading a huge text file using textscan()

MATLAB, matrix array, memory, text file, textscan

I have a .txt file with about 3,235K lines and 25 space-delimited columns. I use the following code to read the file:
fread = fopen('data.txt', 'r');
formatSpec ='%f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f';
C = textscan(fread, formatSpec);
A = [C{:}];
fclose(fread)
My problem is that it only reads the first 249K lines: A is a 249000×25 double.
This is the output when I type 'memory':
Maximum possible array: 55317 MB (5.800e+10 bytes) *
Memory available for all arrays: 55317 MB (5.800e+10 bytes) *
Memory used by MATLAB: 859 MB (9.002e+08 bytes)
Physical Memory (RAM): 8101 MB (8.494e+09 bytes)
Is there any way I can read the whole file?

Best Answer

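  • The file itself may be the problem. There could be a non-numeric line in it, just after line 249,000, that is stopping textscan: when textscan cannot convert a field, it stops and returns only what it has read up to that point.
    Try this to see whether the rest of the file can be read once that line is skipped: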
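    C = textscan(fread, formatSpec);                    % reads until the first line it cannot parse
    A = [C{:}];
    fseek(fread,0,0);                                   % origin 0 = current position, so the file position is unchanged
    C = textscan(fread, formatSpec, 'HeaderLines',1);   % skip one line, then continue reading
    A = [A; C{:}];                                      % append the new rows below the first block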
    You can also use fgets or fgetl to see what the next lines are, and then change the subsequent textscan calls to make them work with your file.
    NOTE: By naming your file ID fread, you are inadvertently ‘shadowing’ (or ‘overshadowing’) the built-in fread function. This could be a problem if you need that function later in your code. Rename your file ID ‘fidin’ or some such instead to avoid the problem.
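
If the file may contain more than one such line, the idea above can be repeated in a loop. The following is only a sketch under stated assumptions, not a tested solution for the original data: it builds a small 3-column demo file in a temporary location (the file name, the contents, and the variable names fidin, block and skipped are all made up for illustration), calls textscan until the end of the file, and uses fgetl to record whatever remains of the line where parsing stopped. For the file in the question, ncols would be 25 and fname would be 'data.txt'.

```matlab
% Demo file with one stray text line (hypothetical data, for illustration only).
fname = [tempname '.txt'];
fidout = fopen(fname, 'w');
fprintf(fidout, '1 2 3\n4 5 6\nsome stray text\n7 8 9\n');
fclose(fidout);

ncols = 3;                              % 25 for the file in the question
formatSpec = repmat('%f', 1, ncols);    % one %f per column, whitespace delimited
fidin = fopen(fname, 'r');              % 'fidin' avoids shadowing the fread function
A = zeros(0, ncols);                    % numeric rows collected so far
skipped = {};                           % text consumed after each parsing stop

while ~feof(fidin)
    C = textscan(fidin, formatSpec);    % stops at the first field it cannot convert
    n = min(cellfun(@numel, C));        % number of complete rows in this block
    if n > 0
        block = zeros(n, ncols);
        for k = 1:ncols
            block(:, k) = C{k}(1:n);    % drop a partial last row, if there is one
        end
        A = [A; block];                 % append the complete rows
    end
    if ~feof(fidin)
        % textscan stopped before the end of the file: consume the rest of
        % the current line, keep it for inspection, and try again.
        skipped{end+1} = fgetl(fidin);
    end
end
fclose(fidin);
delete(fname);                          % remove the demo file
```

After the loop, A holds every row that parsed completely and skipped shows what was discarded, which should make it easier to decide how to adapt formatSpec. If parsing stops in the middle of a row, the numbers already read from that row are dropped as well. Growing A inside the loop is acceptable when only a few lines are bad; if many are, preallocating would be the better choice.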