Linux – save these documents on a dying machine from oblivion


First, a confession: no, I didn't do the backups I should have.

Second, the situation:

I have a Dell XPS 9550 with a solid state disk running Fedora 25.

I was working on a file and tried to save it, only to be told I was trying to save to a read-only filesystem. It turns out my filesystem is read-only now and there are I/O errors all over the place.

I was able to save some of the files by emailing them to myself through an open web browser, but that crashed and I'm unable to relaunch it. But I still have files of interest open in an editor. I can't seem to save the files anywhere, but I can copy their contents. If only I could find a way to exfiltrate the file contents, I could save myself months of work.

But there are some horrible limitations. I attempted to insert a USB drive, but no device appears to represent it, and the mount command dies with a segfault. I can attempt to ssh to another computer, but I get "bus error" and it dies. ping, dmesg, ifconfig, none of these work. But I do have vim and less and ls and can spawn new bash instances.

No lynx, no firefox, no google-chrome. There's no DVD drive.

Basically it seems my SSD has died. Or maybe the whole motherboard. I have documents of great value still in memory, I have an IP address and network connection, I can run a few random commands and have 3500 more on the path that I could try.

cat and gcc seem to work. I can write to files in /tmp. I have a running ipython instance that still seems to work.

So… what I've tried so far has failed. But I feel like there are still a thousand possibilities. What am I not considering? How could I possibly get these files off of my dying computer?

There must be a way.

UPDATE: New stuff:

  • I lost my network connection due to my own stupidity.
  • I wrote a Python script to replace cp and cp -r
  • Unless I find some way to create a /dev entry for the SD card, or for USB drives, then my best bets for getting data out seem to be the screen and possibly the speakers/audio cable.
  • I'm writing a script to try reading files and output which ones are readable.
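
A minimal sketch of what such a readability scan could look like, assuming Python 3 is available. The function name `find_readable` and the chunk size are hypothetical, and this is not the script actually written on the dying machine. It reads every file to the end, because on a failing disk an I/O error can surface partway through a file rather than at open time.

```python
import os

def find_readable(root, chunk_size=65536):
    """Return (readable, unreadable) lists of file paths under root.

    A file counts as readable only if every byte can be read without an
    OSError, since I/O errors on a failing disk may appear mid-file.
    """
    readable, unreadable = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read and discard the whole file in chunks.
                    while fh.read(chunk_size):
                        pass
            except OSError:
                unreadable.append(path)
            else:
                readable.append(path)
    return readable, unreadable
```

Calling `find_readable("/home")` would return two lists; only the first is worth packing into an archive.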

Suggestions still very welcome!

UPDATE 2: Newer stuff:

  • On the dying computer I wrote a Python script that will read a file bit by bit and try to convey those bits by flashing the screen one color or another. Right now it's trying to do a two-bit code where red, green, blue, and white all represent a two-bit pair. This isn't working that well, though, so I might just switch to two colors and do one bit at a time.
  • On my other laptop (the trusty old Thinkpad that I gave up for this hot new XPS) I wrote a script that reads from the webcam using the OpenCV Python library. The idea is to have it decode the codes sent by the other computer. The trouble is that the framerate from the camera is something like 15 frames per second, which means that even with a perfect, errorless transfer my maximum data rate would be 30 bits per second (two bits per frame), i.e. 225 bytes per minute. That's 324k per day.
  • On the dying XPS I can use tar to pack the desired files into a single archive, which is 1.7 MB. Unfortunately, gzip, bzip2, xz, lzop and the other usual compression utilities are unavailable. BUT using Python's zlib module I can compress this file down to 820KB. Given that size, I could probably get this thing sent over in a couple of days.
  • Because this transfer method will likely be very error prone, I'm going to implement Hamming codes on the XPS to add some error correction as I transmit the data.
  • Likely there will be complications because that's what happens, but at least it seems somehow feasible to get this data out!
  • Since this is still a pretty sucky way of sending data, I looked more into USB serial drivers. The modules I've tried to load (usb-serial-simple, usb-debug, safe-serial) give I/O errors. I don't think USB serial support is built in to the kernel, either, because there are no /dev/ttyUSB* devices present.
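
For the Hamming-code idea mentioned above, here is a hedged sketch of the classic Hamming(7,4) code in Python: 4 data bits become 7 transmitted bits, and any single flipped bit per codeword can be corrected. The function names are hypothetical and the post does not say which Hamming variant was actually implemented.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (int 0..15) as a 7-bit codeword (list of 0/1).

    Bit positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4.
    """
    d = [(nibble >> i) & 1 for i in (3, 2, 1, 0)]  # d1..d4, MSB first
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]  # parity over positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming74_decode(bits):
    """Correct at most one flipped bit and return the 4 data bits as an int."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit, or 0
    if syndrome:
        b[syndrome - 1] ^= 1
    return (b[2] << 3) | (b[4] << 2) | (b[5] << 1) | b[6]
```

The cost is that every byte takes 14 transmitted bits instead of 8, so this trades raw throughput for fewer retransmissions.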

Thanks for everyone's suggestions thus far—I know this isn't even a well-defined question since you guys don't know in advance which programs/files can be read or not. Still open to better suggestions than this video approach!

UPDATE 3: Newest stuff

  • I got a PS3 Eye webcam and, after disabling its automatic gain and exposure, am successfully reading data off of the XPS, albeit at an errorful 1 byte per second. This is a great success—the first data exfiltrated! But the rate is too slow to get my 820KB out in any sort of reasonable time, and the error rate is too high.

[Image: one-bit transmission with clock]

  • The problem is that writing to the terminal is too slow. The screen updates aren't anything like instantaneous, thanks (I think) to the slowness of the urxvt terminal emulator that I have access to.
  • I discovered that I have access to a Rust compiler on the XPS. I rewrote the transmission script using Rust to see if that would improve the terminal refresh speed, but it didn't help.
  • Because I'm unlikely to be able to increase the framerate, I'll have to try to increase how much data I get per frame. My current approach looks something like this:

[Image: grid transmission]

The right half is still a clock signal, flashing on and off to mark the arrival of new frames. But the left is now a grid where each cell is marked by a red square in the corner, and then the green cell to the right and down from the red square is flashed on and off to indicate a bit. The red squares should let the receiving computer calibrate where the cells are located. I haven't got any data across this way yet, but it's what I'm working on.
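
To make the grid idea concrete, here is a small sketch of how a byte stream could be split into per-frame grids of on/off cells and reassembled on the receiving side. The 8x8 grid size and the function names are assumptions for illustration only; the drawing of the red calibration squares and the clock half of the screen is left out.

```python
def bytes_to_frames(data, rows=8, cols=8):
    """Split data into frames, each a rows x cols grid of 0/1 cells."""
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    per_frame = rows * cols
    bits += [0] * (-len(bits) % per_frame)  # zero-pad the final frame
    frames = []
    for start in range(0, len(bits), per_frame):
        chunk = bits[start:start + per_frame]
        frames.append([chunk[r * cols:(r + 1) * cols] for r in range(rows)])
    return frames

def frames_to_bytes(frames, length):
    """Rebuild the first `length` bytes from decoded frames."""
    bits = [cell for frame in frames for row in frame for cell in row]
    out = bytearray()
    for i in range(0, length * 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

With these assumed dimensions each frame carries 8 bytes. The receiver needs to know the original length, because the last frame is padded with zeros.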

  • Someone suggested that I look into writing QR codes instead of these ad hoc color patterns. I'm going to look into that, too, and perhaps implement that instead of this grid approach. The error correction would be a nice win, as well as being able to use standard libraries to decode.
  • I learned that I have access to libasound (the ALSA sound library), but not to the header files associated with it (alsa/asoundlib.h or whatever). If anyone knows how to make use of a shared library without the headers, or can help me write just the right header to let me produce audio output, then I could have an audio-based way of getting the files out.
  • Alternately, if someone could help me manipulate the USB devices without access to libusb then maybe I could do something with that?

Moving forward!

UPDATE 4: audio output produced!

User Francesco Noferi has done some great work helping me utilize the ALSA library mentioned in the previous update. The C compiler had a problem, but using the Rust compiler I was able to use the FFI to call directly into libasound. I've now played a bunch of my data over audio and it sounds like music to my ears! Still need to get a real communication channel established, but I'm feeling very hopeful. At this point my job is basically to implement a modem, so if anybody has any guidance on good ways to do that I'm all ears. Ideally modulation that's easy to implement by hand and demodulation for which there's an existing library I can use. Since this can go directly over an audio cable and not through the phone network, theoretically we can do much better than 56kbps or whatever the standard was back in the day, but in practice who knows what we'll get.

Thanks to everybody following along here and at /r/techsupportmacgyver and at /r/rust contributing so many excellent suggestions. Going to get this "modem" implemented soon and then I'll finish this up with an epilogue. I think I might put my code up somewhere for other desperate folks to make use of in the future—maybe even a repository of weird exfiltration tools that are easy to type into a dying machine by hand? We'll see what happens.

UPDATE 5: It took me a long time wrestling with ALSA and my cheap StarTech USB audio capture device (no built-in line-in on the receiving laptop), and many false starts trying to roll my own transmission protocol, but finally, on the advice of some ham radio enthusiast friends of mine, I implemented the RTTY line protocol running at 150 baud, which in practice gives me maybe about 10 bytes per second. It's not super fast but it's fairly reliable. And I'm very nearly done transferring my 820KB file, verified using CRC32 checksums (using the crc32 functionality from Python's zlib module, which I have access to). So I'm declaring victory, and want to give my thanks once again! I'll spend some more time finding further files that are readable and which I can transfer, but the foundation is in place. It's been fun working with you all!
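
As a rough illustration of the compress-then-verify step, the sketch below uses only Python's zlib module. The helper names and the idea of checksumming fixed-size parts are assumptions; the actual scripts are not shown in this post. The negative checksums in the transcript are consistent with a signed 32-bit result, which is what Python 2's `zlib.crc32` returns; Python 3 returns an unsigned value, so `to_signed32` converts between the two forms for comparison.

```python
import zlib

def part_checksums(data, part_size):
    """CRC32 of each part_size-byte slice of data, as unsigned 32-bit ints."""
    return [zlib.crc32(data[i:i + part_size]) & 0xFFFFFFFF
            for i in range(0, len(data), part_size)]

def to_signed32(value):
    """Map an unsigned CRC32 to the signed form that Python 2 prints."""
    return value - (1 << 32) if value >= (1 << 31) else value

def compress_bytes(data):
    """Compress with zlib at the highest level (raw zlib stream, not gzip)."""
    return zlib.compress(data, 9)
```

Running `part_checksums` on both machines and comparing the lists shows which parts arrived intact, so only the damaged parts need to be sent again.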


On the dying machine:

$ tar cf ./files
$ ./ ./files.tar 9999999
Part 1 checksum: -1459633665
$ ./ ./files.tar
$ ./ ./files.tar.z 9999999
Part 1 checksum: -378365928
$ ./transmit_rust/target/debug/transmit ./files.tar.z
Transmitting files.tar.gz over audio using RTTY
Period size: 2048
Sample rate: 44100
Samples per bit: 294
Sending start signal.
Transmitting data.
nread: 2048
nread: 2048
nread: 2048
nread: 208
Transmission complete. Sending hold signal.
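
The real transmitter was written in Rust and is not reproduced here. As a hedged Python sketch of the same kind of framing and FSK modulation, the code below uses the figures that appear in the transcripts: 44100 Hz sample rate, 150 baud (294 samples per bit), a 915 Hz space tone, a 1085 Hz mark tone, 3 start bits and 2 stop bits. Sending the data bits least-significant first, with start bits as space and stop bits as mark, is my assumption about the framing.

```python
import math

SAMPLE_RATE = 44100
BAUD = 150
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD  # 294, matching the transmit log
SPACE_HZ, MARK_HZ = 915.0, 1085.0      # tones for bit 0 and bit 1
START_BITS, STOP_BITS = 3, 2

def frame_byte(byte):
    """Start bits (0), 8 data bits LSB first (assumed), stop bits (1)."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] * START_BITS + data + [1] * STOP_BITS

def modulate(payload, amplitude=0.8):
    """Return float samples in [-1, 1], keeping phase continuous across bits."""
    samples = []
    phase = 0.0
    for byte in payload:
        for bit in frame_byte(byte):
            step = 2 * math.pi * (MARK_HZ if bit else SPACE_HZ) / SAMPLE_RATE
            for _ in range(SAMPLES_PER_BIT):
                samples.append(amplitude * math.sin(phase))
                phase += step
    return samples
```

Each byte costs 13 bit periods under this framing, so 150 baud works out to at most about 11.5 bytes per second, in line with the roughly 10 bytes per second reported above.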

On the rescue machine:

$ minimodem --rx -8 --rx-one -R 44100 -S 915 -M 1085 --startbits 3 \
            --stopbits 2 --alsa=1 150 -q > ./files.tar.z
$ ./ ./files.tar.z
Part 1 checksum: -378365928
$ ./ ./files.tar.z
$ ./ ./files.tar
Part 1 checksum: -1459633665


Best Answer

Here's an example libasound program with just enough definitions to get basic 2-channel 44.1 kHz WAV output going without the headers.

EDIT: I'm actually not sure whether dumping the data straight out as WAV would work, since noise during recording could easily corrupt it, but you can probably do something like encoding the bits as a high-frequency sine wave, which is more reliable.

EDIT2: If aplay is present and works, you can also use that: just write a program that outputs raw audio and pipe it into aplay or anything else that can play audio.

EDIT3: Modified it to not use any headers at all.

If linking with -lasound fails, add -L/path/where/libasound/is/located.

    gcc alsa_noheader.c -lasound
    cat stuff.wav | ./a.out

typedef unsigned int uint;
typedef unsigned long ulon;

int printf(char*, ...);
void* malloc(long);
long read(int fd, void* buf, ulon count);

int snd_pcm_open(void**, char*, int, int);
ulon snd_pcm_hw_params_sizeof();
int snd_pcm_hw_params_any(void*, void*);
int snd_pcm_hw_params_set_access(void*, void*, int);
int snd_pcm_hw_params_set_format(void*, void*, int);
int snd_pcm_hw_params_set_channels(void*, void*, uint);
int snd_pcm_hw_params_set_rate_near(void*, void*, uint*, int*);
int snd_pcm_hw_params(void*, void*);
int snd_pcm_hw_params_get_period_size(void*, ulon*, int*);
long snd_pcm_writei(void*, void*, ulon);
int snd_pcm_prepare(void*);
int snd_pcm_drain(void*);
int snd_pcm_close(void*);

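int main(int argc, char* argv[])
{
    void* pcm;
    void* params;

    uint rate;
    int nchannels;
    ulon frames;
    void* buf;
    int bufsize;
    long nread;

    /* first 0 = playback stream, second 0 = default (blocking) mode */
    if (snd_pcm_open(&pcm, "default", 0, 0) < 0)
    {
        printf("E: cannot open default PCM device\n");
        return 1;
    }
    params = malloc(snd_pcm_hw_params_sizeof());
    snd_pcm_hw_params_any(pcm, params);

    /* 3 = rw_interleaved */
    snd_pcm_hw_params_set_access(pcm, params, 3);

    /* 2 = 16-bit signed little endian */
    snd_pcm_hw_params_set_format(pcm, params, 2);

    /* 2 channels */
    nchannels = 2;
    snd_pcm_hw_params_set_channels(pcm, params, nchannels);

    /* sample rate (unsigned, to match the uint* parameter) */
    rate = 44100;
    snd_pcm_hw_params_set_rate_near(pcm, params, &rate, 0);

    snd_pcm_hw_params(pcm, params);
    snd_pcm_hw_params_get_period_size(params, &frames, 0);

    /* one frame = nchannels samples of 2 bytes each */
    bufsize = frames * nchannels * 2;
    buf = malloc(bufsize);

    /* read file from stdin; the parentheses make nread the byte count */
    while ((nread = read(0, buf, bufsize)) > 0)
    {
        /* write only the frames actually read */
        if (snd_pcm_writei(pcm, buf, nread / (nchannels * 2)) == -32)
        {
            /* -32 = -EPIPE, which ALSA returns on underrun */
            printf("W: underrun\n");
            snd_pcm_prepare(pcm);
        }
    }

    /* let queued audio finish playing before closing */
    snd_pcm_drain(pcm);
    snd_pcm_close(pcm);

    return 0;
}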