I have two data sets. One is 10 GB and the other is 2 TB. Both are txt files.
Say the small data set has the variables timestamp, ID and x, and the big one has timestamp, ID and y. The number of unique IDs and timestamps is much higher in the big data set.
For each observation in the small data set, I want to find the row in the big data set with the same timestamp (to the millisecond) and the same ID, and then copy its value of y into the small data set.
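To make the goal concrete, here is a minimal Python sketch of the lookup on toy data. It is only an illustration under some assumptions that may not match the real files: both files are tab-delimited with a header row, the columns are named timestamp, ID, x and y, timestamps are formatted identically in both files, and `copy_y` is a name made up for this example. It keeps the small data's keys in memory and streams through the big data once, which is exactly the full scan I would like to avoid.

```python
import csv
import io


def copy_y(small_lines, big_lines, delimiter="\t"):
    """Copy y from the big data onto the small data, matching on (timestamp, ID).

    Hypothetical helper for illustration only; assumes header rows and the
    column names timestamp, ID, x, y.
    """
    # Load the small data and collect the keys we need to look up.
    small_rows = list(csv.DictReader(small_lines, delimiter=delimiter))
    wanted = {(row["timestamp"], row["ID"]) for row in small_rows}

    # Stream the big data one row at a time, keeping only matching keys.
    found = {}
    for row in csv.DictReader(big_lines, delimiter=delimiter):
        key = (row["timestamp"], row["ID"])
        if key in wanted:
            found[key] = row["y"]

    # Attach y to each small row; rows without a match get None.
    for row in small_rows:
        row["y"] = found.get((row["timestamp"], row["ID"]))
    return small_rows


# Toy stand-ins for the two txt files.
small = io.StringIO(
    "timestamp\tID\tx\n"
    "2020-01-01 09:30:00.001\tA\t1.5\n"
    "2020-01-01 09:30:00.002\tB\t2.5\n"
)
big = io.StringIO(
    "timestamp\tID\ty\n"
    "2020-01-01 09:30:00.001\tA\t10\n"
    "2020-01-01 09:30:00.001\tC\t20\n"
    "2020-01-01 09:30:00.003\tB\t30\n"
)

result = copy_y(small, big)
for row in result:
    print(row["timestamp"], row["ID"], row["x"], row["y"])
```

In this toy example the first row of the small data gets y from the matching row, and the second row gets no value because the big data has no row with that timestamp and ID.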
Is it possible to find the corresponding rows without reading all 2 TB of data?