Patchwork Corrupt .drv files

login
register
mail settings
Submitter Ludovic Courtès
Date July 11, 2016, 9:49 p.m.
Message ID <87twfv26cr.fsf_-_@gnu.org>
Download mbox | patch
Permalink /patch/13738/
State New
Headers show

Comments

Ludovic Courtès - July 11, 2016, 9:49 p.m.
(Moving to guix-devel.)

ludo@gnu.org (Ludovic Courtès) skribis:

> Andreas Enge <andreas@enge.fr> skribis:
>
>> guix archive: error: build failed: error parsing derivation `/gnu/store/k49lwfwgs8wcamys5qzn8c5n2zk0prc1-tcl8.6.4-src.tar.xz.drv': expected string `Derive(['
>
> It looks like the store on this machine is corrupt.

Indeed, the daemon doesn’t attempt to atomically write files coming from
an add-to-store RPC, which includes .drv files.

So I think that if you pull the plug before the .drv has been flushed to
disk but after the .drv has been marked as valid in the SQLite database
(which is likely to happen in a timely fashion because SQLite does the
‘fdatasync’ dance appropriately), then you end up with a truncated .drv
file.

(This is acknowledged by the comment in
‘LocalStore::registerValidPaths’, which can call ‘sync’, a
sledgehammer.)

The attached patch should fix it.  I think the performance overhead of
the extra ‘fdatasync’ should be OK but I haven’t made any measurements.

Thanks,
Ludo’.
Andreas Enge - July 11, 2016, 10:04 p.m.
On Mon, Jul 11, 2016 at 11:49:08PM +0200, Ludovic Courtès wrote:
> > Andreas Enge <andreas@enge.fr> skribis:
> >
> >> guix archive: error: build failed: error parsing derivation `/gnu/store/k49lwfwgs8wcamys5qzn8c5n2zk0prc1-tcl8.6.4-src.tar.xz.drv': expected string `Derive(['
> >
> > It looks like the store on this machine is corrupt.
> 
> Indeed, the daemon doesn’t attempt to atomically write files coming from
> an add-to-store RPC, which includes .drv files.
> 
> So I think that if you pull the plug before the .drv has been flushed to
> disk but after the .drv has been marked as valid in the SQLite database
> (which is likely to happen in a timely fashion because SQLite does the
> ‘fdatasync’ dance appropriately), then you end up with a truncated .drv
> file.

I do not think that this was the problem. I opened the .drv files with vim,
and they did not contain ASCII characters. Also, the file command marked
them as binary data instead of text files. "guix gc --verify" passed, however.

There were quite a few corrupted .drv files, and I ended up doing a complete
"guix gc" instead of "guix gc -d ..." on every file, and this solved the
problem.

Thanks for your help with this,

Andreas
Ludovic Courtès - July 12, 2016, 8:59 a.m.
Andreas Enge <andreas@enge.fr> skribis:

> On Mon, Jul 11, 2016 at 11:49:08PM +0200, Ludovic Courtès wrote:
>> > Andreas Enge <andreas@enge.fr> skribis:
>> >
>> >> guix archive: error: build failed: error parsing derivation `/gnu/store/k49lwfwgs8wcamys5qzn8c5n2zk0prc1-tcl8.6.4-src.tar.xz.drv': expected string `Derive(['
>> >
>> > It looks like the store on this machine is corrupt.
>> 
>> Indeed, the daemon doesn’t attempt to atomically write files coming from
>> an add-to-store RPC, which includes .drv files.
>> 
>> So I think that if you pull the plug before the .drv has been flushed to
>> disk but after the .drv has been marked as valid in the SQLite database
>> (which is likely to happen in a timely fashion because SQLite does the
>> ‘fdatasync’ dance appropriately), then you end up with a truncated .drv
>> file.
>
> I do not think that this was the problem. I opened the .drv files with vim,
> and they did not contain ASCII characters. Also, the file command marked
> them as binary data instead of text files.

So did they contain random bytes?

That could be a hard disk problem or something equally fishy.

Nevertheless I think the problem I described above really exists and is
worth mitigating.

Thanks,
Ludo’.
Andreas Enge - July 12, 2016, 9:03 a.m.
On Tue, Jul 12, 2016 at 10:59:04AM +0200, Ludovic Courtès wrote:
> So did they contain random bytes?

I do not know, since I did not look at the files with a hex editor
(we should package hexedit...).

Andreas
Kei Yamashita - July 12, 2016, 5:30 p.m.
Andreas Enge <andreas@enge.fr> writes:

> On Tue, Jul 12, 2016 at 10:59:04AM +0200, Ludovic Courtès wrote:
>> So did they contain random bytes?
>
> I do not know, since I did not look at the files with a hex editor
> (we should package hexedit...).
>
> Andreas
>
>

Just popping in...
I was considering packaging hexedit! Now I know there is a demand for it
In the meantime, you can try using "M-x hexl-mode" in Emacs if you
haven't already.
Andreas Enge - July 14, 2016, 3:17 p.m.
On Tue, Jul 12, 2016 at 01:30:43PM -0400, Kei Kebreau wrote:
> I was considering packaging hexedit! Now I know there is a demand for it

Definitely, it is a nice little tool.

> In the meantime, you can try using "M-x hexl-mode" in Emacs if you
> haven't already.

Thanks for the tip! No, I had not...

Andreas

Patch

diff --git a/nix/libstore/local-store.cc b/nix/libstore/local-store.cc
index 347e8a7..80566ea 100644
--- a/nix/libstore/local-store.cc
+++ b/nix/libstore/local-store.cc
@@ -1391,7 +1391,7 @@  Path LocalStore::addToStoreFromDump(const string & dump, const string & name,
                 StringSource source(dump);
                 restorePath(dstPath, source);
             } else
-                writeFile(dstPath, dump);
+                writeFileAtomically(dstPath, dump);
 
             canonicalisePathMetaData(dstPath, -1);
 
diff --git a/nix/libutil/util.cc b/nix/libutil/util.cc
index c077544..e33824b 100644
--- a/nix/libutil/util.cc
+++ b/nix/libutil/util.cc
@@ -272,6 +272,22 @@  void writeFile(const Path & path, const string & s)
     writeFull(fd, s);
 }
 
+void writeFileAtomically(const Path & path, const string & s)
+{
+    auto tmpPath = path + ".tmp";
+    AutoCloseFD fd = open(tmpPath.c_str(), O_WRONLY | O_TRUNC | O_CREAT, 0666);
+    if (fd == -1)
+        throw SysError(format("opening file '%1%'") % tmpPath);
+    writeFull(fd, s);
+#if _POSIX_SYNCHRONIZED_IO > 0
+    if (fdatasync(fd) != 0)
+        throw SysError(format("flushing file '%1%'") % tmpPath);
+#endif
+    fd.close();
+    if (rename(tmpPath.c_str(), path.c_str()) != 0)
+        throw SysError(format("renaming '%1%' to '%2'") % tmpPath % path);
+}
+
 
 string readLine(int fd)
 {
diff --git a/nix/libutil/util.hh b/nix/libutil/util.hh
index e84d64d..829f90b 100644
--- a/nix/libutil/util.hh
+++ b/nix/libutil/util.hh
@@ -86,6 +86,9 @@  string readFile(const Path & path, bool drain = false);
 /* Write a string to a file. */
 void writeFile(const Path & path, const string & s);
 
+/* Same, but do it atomically.  */
+void writeFileAtomically(const Path & path, const string & s);
+
 /* Read a line from a file descriptor. */
 string readLine(int fd);