bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: dd hangs when attempting to access a tape drive


From: Curt Blank
Subject: Re: dd hangs when attempting to access a tape drive
Date: Sat, 22 Oct 2005 12:00:55 -0500
User-agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)

Ok, it appears to definitely be a blocking factor issue.

One thing I should mention, when the read was hanging while doing the strace it was hung on the read(0, 0x8055000, 32), all I saw while cat'ing out the log file while it was hanging was read(0 at the end.

I padded the "if" file out to 512 bytes and issued a:

dd if=b.b of=/dev/nst0

and it worked. I then issued a read (after rewinding of course):

dd if=/dev/nst0

and it worked. Note: no bs or count. I then issued a:

dd if=/dev/nst0 bs=32 count=1

and it hung.

So I'm starting to think it might not be dd problem in and of itself. My perl script and my C program both failed writing if the data being written was not padded out to 1 device block size, i.e. 512 bytes, but neither hung they just error'd out.

So, dd hangs on a read if bs=32 is used, my perl script does not hang it works if I issue a -s 32 for s 32 byte read, it reads and outputs the 32 bytes just fine. I've attached the perl script so you can see the read, it ain't rocket science. I haven't tried it in C, don't know if it's worth the time.

Curt Blank wrote:
The logs are attached. as you can see below the read never completed and the write outright failed. I was able to get dd to exit the read by unloading the tape via the front panel of the library after 15 some minutes.

Originally I said dd hangs, it appears to only hang on the read and fails on the write. I was only dealing with reading initially and hadn't really tried writing until the next day. I was up almost 24 hours when this problem was discovered so that's a bit foggy.

I don't understand the invalid arguments for nst0 on the write, mt and tar like it.

# strace -o dd-read.log dd if=/dev/nst0 bs=32 count=1
dd: reading `/dev/nst0': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 1137.03 seconds, 0.0 kB/s
#

# strace -o dd-write.log dd if=a.a of=/dev/nst0 bs=32 count=1
dd: writing `/dev/nst0': Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.001387 seconds, 0.0 kB/s
#



Curt Blank wrote:

I'll do that first thing tomorrow, my backup is about to kick off at 02:00. I had hoped I provided a good amount of information to get started, and thanks. I found a way to label tapes without dd. My first attempt with perl could read existing labels but could not write them. I then moved to C and discovered what I think might have something to do with the problem.

At first I could not write the tapes with C either, the fclose kept failing and the data definitely was not on the tape. Then I got to thinking about blocking factor, so I padded the write to 512 bytes, the block size reported in the boot log for the tape drives, and wahla, it worked. So then I went back to perl and tried the same thing, bingo, worked there to.

So then I went to dd, I know you're hoping, but nada, still hung, I left it for 3300+ seconds according to dd's report and the only way I could get the command prompt back was to either force the drive to unload at the front panel or power down the library. I forgot which, I was trying so many things and I didn't take notes. The tape drive showed Active the whole 3300+ seconds. I was starting to wondering if it was spinning out to EOT but there's no way I can see that while it's occurring.

So I think it might have something to do with block size. I set bs to 512 but the data in the file "if" used was only 31 bytes. I'll probably try by padding the "if" file to 512 and see what happens.

But and here's the catch, this used to work under SuSE 8.1 with a 2.4 kernel, reading and writing with a bs of 32. It's only as of Wednesday and SuSE 9.3 with the 2.6 kernel that the problem started, so something changed somewhere. Don't get me wrong, I'm not complaining, just trying to figure it out.

Thanks,

-Curt Blank

Jim Meyering wrote:

Thanks for the detailed report.
To narrow it down a little more, please run dd via strace, e.g.,

  strace -o log dd if=file.name of=/dev/nst0 bs=32 count=1

Then, at least, we'll have a good idea which syscall is causing
the trouble.

On general principle, a process that cannot be killed
via `kill -9' suggests a kernel bug.  But even if that's
the case, we may want to make dd avoid triggering it.

Curtis J Blank <address@hidden> wrote:

I've got a problem were dd hangs attempting to access a tape drive, this used to work fine. The only change is going to the 2.6 kernel, SuSE's 9.3 2.6.11.4-21.9 to be exact. The dd command is (this is all done as root):

dd if=file.name of=/dev/nst0 bs=32 count=1
or
dd if=/dev/nst0 bs=32 count=1

If it's done command line you cannot ^C out of it, and command line or in a cron job you cannot kill -9 the process, the only way to get rid of it
is to reboot. mt can access the drive just fine, tar can read and write
the drive just fine.



...





------------------------------------------------------------------------

execve("/bin/dd", ["dd", "if=a.a", "of=/dev/nst0", "bs=32", "count=1"], [/* 77 
vars */]) = 0
uname({sys="Linux", node="bnfsrvr", ...}) = 0
brk(0)                                  = 0x8053000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=71427, ...}) = 0
old_mmap(NULL, 71427, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3)                                = 0
open("/lib/tls/librt.so.1", O_RDONLY)   = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\35"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=40673, ...}) = 0
old_mmap(NULL, 33384, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x4002a000
madvise(0x4002a000, 33384, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x40031000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x40031000
close(3)                                = 0
open("/lib/tls/libc.so.6", O_RDONLY)    = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0pO\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1394943, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40033000
old_mmap(NULL, 1150108, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x40034000
madvise(0x40034000, 1150108, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x40147000, 16384, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x113000) = 0x40147000
old_mmap(0x4014b000, 7324, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4014b000
close(3)                                = 0
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220H\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=93182, ...}) = 0
old_mmap(NULL, 70104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x4014d000
madvise(0x4014d000, 70104, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x4015b000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd000) = 0x4015b000
old_mmap(0x4015d000, 4568, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4015d000
close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x4015f000
mprotect(0x40147000, 4096, PROT_READ)   = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0x4015f6c0, limit:1048575, 
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, 
useable:1}) = 0
munmap(0x40018000, 71427)               = 0
set_tid_address(0x4015f708)             = 4084
rt_sigaction(SIGRTMIN, {0x401517f0, [], SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x401513f0, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=RLIM_INFINITY, rlim_max=RLIM_INFINITY}) = 0
_sysctl({{CTL_KERN, KERN_VERSION}, 2, 0xbfffee5c, 31, (nil), 0}) = 0
brk(0)                                  = 0x8053000
brk(0x8074000)                          = 0x8074000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No 
such file or directory)
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2528, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40018000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2528
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x40018000, 4096)                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such 
file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=208464, ...}) = 0
mmap2(NULL, 208464, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40160000
close(3)                                = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=21544, ...}) = 0
mmap2(NULL, 21544, PROT_READ, MAP_SHARED, 3, 0) = 0x40193000
close(3)                                = 0
futex(0x4014a12c, FUTEX_WAKE, 2147483647) = 0
close(0)                                = 0
open("a.a", O_RDONLY|O_LARGEFILE)       = 0
_llseek(0, 0, [0], SEEK_CUR)            = 0
close(1)                                = 0
open("/dev/nst0", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 1
rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGQUIT, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGPIPE, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGUSR1, {0x8049d60, [], 0}, NULL, 8) = 0
clock_gettime(CLOCK_REALTIME, {1129997187, 845688000}) = 0
read(0, "xxxxxx 2005-10-20T20:31:34-0500\n", 32) = 32
write(1, "xxxxxx 2005-10-20T20:31:34-0500\n", 32) = -1 EINVAL (Invalid argument)
write(2, "dd: ", 4)                     = 4
write(2, "writing `/dev/nst0\'", 19)    = 19
write(2, ": Invalid argument", 18)      = 18
write(2, "\n", 1)                       = 1
close(0)                                = 0
close(1)                                = 0
clock_gettime(CLOCK_REALTIME, {1129997187, 847075000}) = 0
write(2, "1+0 records in\n", 15)        = 15
write(2, "0+0 records out\n", 16)       = 16
write(2, "0 bytes (0 B) copied", 20)    = 20
write(2, ", 0.001387 seconds, 0.0 kB/s\n", 29) = 29
exit_group(1)                           = ?


------------------------------------------------------------------------

execve("/bin/dd", ["dd", "if=/dev/nst0", "bs=32", "count=1"], [/* 76 vars */]) 
= 0
uname({sys="Linux", node="bnfsrvr", ...}) = 0
brk(0)                                  = 0x8053000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=71427, ...}) = 0
old_mmap(NULL, 71427, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40018000
close(3)                                = 0
open("/lib/tls/librt.so.1", O_RDONLY)   = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\35"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=40673, ...}) = 0
old_mmap(NULL, 33384, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x4002a000
madvise(0x4002a000, 33384, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x40031000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x40031000
close(3)                                = 0
open("/lib/tls/libc.so.6", O_RDONLY)    = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0pO\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1394943, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40033000
old_mmap(NULL, 1150108, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x40034000
madvise(0x40034000, 1150108, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x40147000, 16384, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x113000) = 0x40147000
old_mmap(0x4014b000, 7324, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4014b000
close(3)                                = 0
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220H\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=93182, ...}) = 0
old_mmap(NULL, 70104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x4014d000
madvise(0x4014d000, 70104, MADV_SEQUENTIAL|0x1) = 0
old_mmap(0x4015b000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd000) = 0x4015b000
old_mmap(0x4015d000, 4568, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4015d000
close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x4015f000
mprotect(0x40147000, 4096, PROT_READ)   = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0x4015f6c0, limit:1048575, 
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, 
useable:1}) = 0
munmap(0x40018000, 71427)               = 0
set_tid_address(0x4015f708)             = 3495
rt_sigaction(SIGRTMIN, {0x401517f0, [], SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x401513f0, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=RLIM_INFINITY, rlim_max=RLIM_INFINITY}) = 0
_sysctl({{CTL_KERN, KERN_VERSION}, 2, 0xbfffee8c, 31, (nil), 0}) = 0
brk(0)                                  = 0x8053000
brk(0x8074000)                          = 0x8074000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No 
such file or directory)
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2528, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40018000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2528
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x40018000, 4096)                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such 
file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=208464, ...}) = 0
mmap2(NULL, 208464, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40160000
close(3)                                = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=21544, ...}) = 0
mmap2(NULL, 21544, PROT_READ, MAP_SHARED, 3, 0) = 0x40193000
close(3)                                = 0
futex(0x4014a12c, FUTEX_WAKE, 2147483647) = 0
close(0)                                = 0
open("/dev/nst0", O_RDONLY|O_LARGEFILE) = 0
_llseek(0, 0, 0xbfffeff0, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGQUIT, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGPIPE, {0x804a2b0, [], 0}, NULL, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL}, 8) = 0
rt_sigaction(SIGUSR1, {0x8049d60, [], 0}, NULL, 8) = 0
clock_gettime(CLOCK_REALTIME, {1129995853, 361458000}) = 0
read(0, 0x8055000, 32)                  = -1 EIO (Input/output error)
write(2, "dd: ", 4)                     = 4
write(2, "reading `/dev/nst0\'", 19)    = 19
write(2, ": Input/output error", 20)    = 20
write(2, "\n", 1)                       = 1
close(0)                                = 0
close(1)                                = 0
clock_gettime(CLOCK_REALTIME, {1129996990, 393378000}) = 0
write(2, "0+0 records in\n", 15)        = 15
write(2, "0+0 records out\n", 16)       = 16
write(2, "0 bytes (0 B) copied", 20)    = 20
write(2, ", 1137.03 seconds, 0.0 kB/s\n", 28) = 28
exit_group(1)                           = ?


------------------------------------------------------------------------

_______________________________________________
Bug-coreutils mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/bug-coreutils

#!/usr/bin/perl
#
        $Version="1.0-1";

use Getopt::Std;
use Errno;
use Fcntl;

our ($opt_b, $opt_D, $opt_l, $opt_s, $opt_f, $opt_W);
getopts('b:Df:l:s:W');

$Size = $opt_s;
$TapeDrive = $opt_f;
$Label = $opt_l;
$WRITE = $opt_W;
$DEBUG = $opt_D;
$BLOCK = $opt_b;

if ($Size eq "" || $TapeDrive eq "") { exit -1 };
if ($WRITE && $Label eq "") { exit -1 };

if ($BLOCK = " ") { $BLOCK = 512 };
if ($DEBUG) { print " block size [$BLOCK]\n" };

if (! $WRITE) {
        open(Tdrive, "< $TapeDrive") or die "couldn't open tape drive for 
read\n";
        $length = read(Tdrive, $Label, $Size) or die "error reading tape drive 
\n";
        close(Tdrive) or die "error closing tape drive after read\n";
        print "$Label";
        if ($DEBUG) { print " length $length : read successful\n" };
        exit 0;
} else {
        $Label = substr($Label, 0, $Size);
        $pos = $Size;
        while ($pos <= $BLOCK) {
                $Label = $Label . " ";
                $pos = $pos +1
        }
        $LabelLen = length($Label);
        if ($DEBUG) { print " $LabelLen labeling with $Label\n" };
        sysopen(Tdrive,  "$TapeDrive", O_WRONLY|O_TRUNC|O_CREAT) or die 
"couldn't open tape drive for write\n";
        $length = syswrite(Tdrive, $Label, $LabelLen) or die "error writing 
tape drive\n";
        close(Tdrive) or die "error closing tape drive after write\n";
        if ($DEBUG) { print " length $length write successful\n" };
        exit 0;
}

if ($DEBUG) { print "oh-oh\n" };
exit -1;
#


reply via email to

[Prev in Thread] Current Thread [Next in Thread]