Recently at work a couple of customers mentioned that while backing up files on Linux, symlinks and FIFO (named pipe) files were not being skipped. Trying to back up a FIFO is what we call a bad idea. Very, very bad. One customer reported that most of his filesystems were XFS. Sure enough, after some digging around and testing, I discovered this most useful bit of information:
XFS does not support dirent::d_type
What the?! In other words, a line such as this will always be false:
de->d_type == DT_FIFO
Jungle Disk uses readdir_r() to list files and check their types. Uh oh. That means that on XFS partitions, everything looks like a regular file! You can see this for yourself using code like this:
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <dirent.h>
#include <sys/stat.h>
using namespace std;
int main(int argc, char *argv[])
{
    dirent holdde;
    dirent *de;
    struct stat st;
    // Print the numeric value of each d_type constant for reference.
    cout << "DT_BLK = " << DT_BLK << endl;
    cout << "DT_CHR = " << DT_CHR << endl;
    cout << "DT_DIR = " << DT_DIR << endl;
    cout << "DT_FIFO = " << DT_FIFO << endl;
    cout << "DT_LNK = " << DT_LNK << endl;
    cout << "DT_REG = " << DT_REG << endl;
    cout << "DT_SOCK = " << DT_SOCK << endl;
    cout << "DT_UNKNOWN = " << DT_UNKNOWN << endl;
    // List the directory and show the raw d_type reported for each entry.
    DIR *dir = opendir("/home/vmuser/xfs");
    if (dir == NULL)
    {
        perror("opendir");
        return EXIT_FAILURE;
    }
    while (readdir_r(dir, &holdde, &de) == 0 && de)
    {
        cout << de->d_name << " [" << (int)de->d_type << "] " << endl;
    }
    closedir(dir);
    // lstat() reports the real file type whether or not d_type is filled in.
    if (lstat("/home/vmuser/xfs/TESTFIFO", &st) == 0 && S_ISFIFO(st.st_mode))
        cout << "XFS FIFO detected: " << st.st_mode << endl;
    // S_ISFIFO() compares the whole type field; masking with S_IFIFO alone
    // would also match other types that share that bit.
    if (lstat("/home/vmuser/ext3/TESTFIFO", &st) == 0 && S_ISFIFO(st.st_mode))
        cout << "EXT3 FIFO detected: " << st.st_mode << endl;
    return EXIT_SUCCESS;
}
So what’s the solution? Good old, reliable stat() (or lstat() when symlinks need to be detected, since stat() follows them). That function does work with XFS on Linux. Unfortunately, this means you will have to make an extra call for each file after listing the dir.
Well, obviously everything looks like DT_UNKNOWN, not like regular files.
This is not an XFS issue either: most filesystems do not support the d_type field (NFS, ext2/ext3 by default, etc.), so an app that relies on that field holding something besides DT_UNKNOWN is simply a buggy app.
Thirdly, one only needs to stat() when d_type is indeed DT_UNKNOWN, and then there are still many ways to avoid it.
And lastly, readdir_r is usually a bug, too (readdir is already threadsafe, per dirfd).
NFS, ext2 and ext3 absolutely DO fill d_type correctly, at least under Linux, as do CIFS, ext4 and all other file systems known to man.
readdir() isn’t threadsafe when accessing the same directory stream from different threads.
Thanks for this post, it was a great help to me.
I’m the author of Select-o-Magic 3000 and I recently ran into this problem when I was building a home media server using XFS.
An interesting thing I found is that when reading an XFS file system over a Samba NFS, d_type is filled in correctly, at least under Ubuntu v10.04 LTS.