MATLAB: Converting serial date numbers back to calendar dates in string format

date string, missing data, nan, sequential, serial date number

Hello all, I have been trying to analyze wind measurement data. The dataset contains the measurement time, the wind direction, and wind speeds at different heights. The measurements have gaps, and I would like to fill these gaps with NaN. To do that, I started by generating a time vector with a constant timestep, intending to merge it with the measured times so that the missing observations become NaN. However, I could not get this to work. Would you please help me? Here is the MATLAB code:
ds1b=dataset('file','1201b.txt')
time_initial=ds1b.Time(:);
time_initialmat=datenum(time_initial)*86400;
time_final=ds1b.Time(end);
time_finalmat=datenum(time_final)*86400;
time_range=time_finalmat-time_initialmat;
calltime(1,1)=time_initial
for i =1:tim_range
time_initialmat= time_initialmat+1;
calltime(i,1)=datestr(time_initialmat/86400, ...
'ConvertFrom','datenum','Format','yyyy-MM-dd HH:mm:ss'));
end
auxtime=dataset(calltime,'VarNames','Time');
res=join(dsb1, auxtime, 'Type', 'rightouter', 'MergeKeys', true);

Best Answer

  • Here is some code that reads in the complete file (including the header strings), identifies the step size, locates any missing measurements, and then creates a numeric array in which the missing measurements are replaced with NaNs. It uses the following functions:
    • both fgetl and textscan to read the file data
    • regexp to split the header strings
    • datenum to convert the date strings to serial date numbers
    • diff to calculate all timesteps, and then mode to get the likely sampling timestep
    • ismember to locate the positions of the non-missing measurements.
    Note that this is fully vectorized code, which is typically faster and more concise than using loops to solve this kind of problem.
    % read file data:
    fid = fopen('1201b.txt','rt');
    str = fgetl(fid);
    hdr = regexp(str,'(?<=")\w+(?="(\s|$))','match');
    fmt = ['%q',repmat('%f',1,numel(hdr)-1)];
    C = textscan(fid,fmt,'CollectOutput',true);
    fclose(fid);
    % identify step and create complete list of sample times:
    fkr = 24*60*60; % precision = seconds
    dtn = round(fkr * datenum(C{1},'yyyy-mm-dd HH:MM:SS'));
    stp = mode(diff(dtn)); % step size
    dtv = dtn(1):stp:dtn(end);
    % allocate output values:
    out = nan(numel(dtv),numel(hdr)-1);
    idx = ismember(dtv,dtn);
    out(idx,:) = C{2};
    And we can check the first ten rows of out in the command window:
    >> out(1:10,:)
    ans =
    276 279 19.00 20.34 20.02 19.97 19.54 18.95 18.50 17.97
    274 277 19.39 20.19 19.57 19.76 19.69 19.20 17.99 17.67
    279 280 19.39 19.69 19.97 19.86 19.89 19.35 17.84 17.72
    277 277 19.30 20.19 19.62 20.12 20.25 19.96 18.65 18.07
    278 278 19.39 19.99 19.57 20.37 20.45 20.61 17.94 17.72
    278 279 19.00 19.84 19.57 20.07 20.35 19.76 16.98 17.62
    271 277 18.52 18.78 19.52 20.27 20.70 19.60 17.99 17.32
    NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    267 280 19.78 18.28 18.37 18.86 19.29 18.09 16.33 15.57
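    The question title also asks how to get from serial date numbers back to date strings, which the code above does not show. A minimal sketch of that step follows, assuming made-up sample times and values rather than the original file; the names str, val and tstr are illustrative only. It repeats the datenum, mode and ismember steps on a small synthetic series and then calls datestr on the complete time axis:

    ```matlab
    % Synthetic 10-minute series with two missing samples (00:30 and 00:40).
    fkr = 24*60*60;                                % seconds per day
    str = {'2012-01-01 00:00:00'; '2012-01-01 00:10:00'; ...
           '2012-01-01 00:20:00'; '2012-01-01 00:50:00'; ...
           '2012-01-01 01:00:00'};
    val = [5.1; 5.4; 5.0; 6.2; 6.0];               % made-up wind speeds
    % date strings -> whole seconds, as in the answer's code:
    dtn = round(fkr * datenum(str,'yyyy-mm-dd HH:MM:SS'));
    stp = mode(diff(dtn));                         % most common step, in seconds
    dtv = (dtn(1):stp:dtn(end)).';                 % complete time axis (column)
    % fill the measured values in; the gaps remain NaN:
    out = nan(numel(dtv),1);
    out(ismember(dtv,dtn)) = val;
    % whole seconds -> serial date numbers -> date strings:
    tstr = datestr(dtv/fkr,'yyyy-mm-dd HH:MM:SS'); % one row of characters per sample
    ```

    Here tstr is a character array with one row per entry of dtv, so cellstr(tstr) can be used if a cell array of strings is more convenient to pair with out.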
    Note that this method uses mode to identify the timestep. This works as long as the majority of the timesteps equal the true sampling interval. If too many measurements are missing, mode may return the wrong step, and you will have to identify the timestep size in some other way.
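    One possible "other way", offered only as a sketch and not as part of the original answer, is to take the greatest common divisor of all the timesteps. This assumes that every timestamp lies exactly on a regular grid of whole seconds; a single jittered timestamp would shrink the result towards 1 second, so it is worth checking the value before using it. The sample times below are hypothetical:

    ```matlab
    % Hypothetical sample times in whole seconds, with more gaps than regular steps.
    dtn = [0; 600; 2400; 3000; 4800; 6600];
    d   = diff(dtn);               % 600 1800 600 1800 1800
    stp_mode = mode(d);            % the most common step is a gap, not the sampling interval
    % gcd accepts two inputs at a time, so fold it over the step list:
    stp = d(1);
    for k = 2:numel(d)
        stp = gcd(stp, d(k));
    end
    dtv = (dtn(1):stp:dtn(end)).'; % complete time axis built from the gcd step
    ```

    With these values mode picks 1800 while the gcd gives 600, and every measured time appears in dtv. If the sampling interval is known from the instrument documentation, setting stp directly is simpler and safer than estimating it.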