bug-ddrescue
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Suggested feature + implementation: randomize order of bad sector access


From: Stephen Stair
Subject: Suggested feature + implementation: randomize order of bad sector access in retry phase
Date: Fri, 17 Jun 2022 11:25:31 -0700

I've implemented a feature to optionally randomize the order of sector
accesses in the retry phase
It operates by allocating a list of 64k sector addresses, first scanning the
bad sector list to add entries to the list, then randomizing the order of
the list and fetching each sector in the list in sequence, repeating this
until it reaches the end of the volume.
It scans in forward/backwards order depending on the pass although this is
less relevant for a random sweep.
It's also not a completionist so canceling and resuming may skip sectors,
but again, less relevant for a random sweep.
The decision was made that skipping sectors is preferable to retrying
sectors, as some HDDs may contain sectors that cause the control interface
to fail when read.

The theory behind this feature is that perhaps the subtle differences in
head position due to seeking to a sector from different offsets may improve
the chances of getting a different read of the data and succeeding in
recovery faster than it would otherwise.
It was a feature that I and some of my friends were interested in having so
hopefully it finds some use elsewhere too.

I've been running this code on my system and it seems to work well, but it
hasn't been exhaustively tested so I make no guarantees. 
You are welcome to use / modify / distribute / relicense the code/patch
without limits.

Thanks,
-Stephen

Unified diff follows:

Common subdirectories: ddrescue-1.26-original/doc and
ddrescue-1.26-patch/doc
diff -u ddrescue-1.26-original/main.cc ddrescue-1.26-patch/main.cc
--- ddrescue-1.26-original/main.cc      2022-01-21 04:48:37.000000000 -0800
+++ ddrescue-1.26-patch/main.cc 2022-06-17 10:35:15.887395900 -0700
@@ -125,6 +125,7 @@
                "  -t, --truncate                 truncate output file to
zero size\n"
                "  -T, --timeout=<interval>       maximum time since last
successful read\n"
                "  -u, --unidirectional           run all passes in the same
direction\n"
+               "  -U, --random-retry-order       randomize the order
sectors are tried in retry\n"
                "  -v, --verbose                  be verbose (a 2nd -v gives
more)\n"
                "  -w, --ignore-write-errors      make fill mode ignore
write errors\n"
                "  -x, --extend-outfile=<bytes>   extend outfile size to be
at least this long\n"
@@ -842,6 +843,7 @@
     { 't', "truncate",             Arg_parser::no  },
     { 'T', "timeout",              Arg_parser::yes },
     { 'u', "unidirectional",       Arg_parser::no  },
+    { 'U', "random-retry-order",   Arg_parser::no  },
     { 'v', "verbose",              Arg_parser::no  },
     { 'V', "version",              Arg_parser::no  },
     { 'w', "ignore-write-errors",  Arg_parser::no  },
@@ -920,6 +922,7 @@
       case 't': o_trunc = O_TRUNC; break;
       case 'T': rb_opts.timeout = parse_time_interval( arg, pn ); break;
       case 'u': rb_opts.unidirectional = true; break;
+      case 'U': rb_opts.randomize_retry_order = true; srand( time( NULL )
); break;
       case 'v': if( verbosity < 4 ) ++verbosity; break;
       case 'V': show_version(); return 0;
       case 'w': fb_opts.ignore_write_errors = true; break;
diff -u ddrescue-1.26-original/rescuebook.cc
ddrescue-1.26-patch/rescuebook.cc
--- ddrescue-1.26-original/rescuebook.cc        2022-01-21
04:48:37.000000000 -0800
+++ ddrescue-1.26-patch/rescuebook.cc   2022-06-17 10:35:15.893897100 -0700
@@ -579,8 +579,9 @@
     first_post = true;
     snprintf( msgbuf + msglen, ( sizeof msgbuf ) - msglen, "%d %s",
               pass - first_pass + 1, forward ? "(forwards)" : "(backwards)"
);
-    int retval = forward ? fcopy_errors( msgbuf, pass, resume ) :
-                           rcopy_errors( msgbuf, pass, resume );
+    int retval = randomize_retry_order ? randcopy_errors( msgbuf, pass,
resume, forward ) :
+                                         forward ? fcopy_errors( msgbuf,
pass, resume ) :
+                                                   rcopy_errors( msgbuf,
pass, resume );
     if( retval != -3 ) return retval;
     resume = false;
     if( !unidirectional ) forward = !forward;
@@ -663,6 +664,89 @@
   return -3;
   }

+// Return values: 1 I/O error, 0 OK, -1 interrupted, -2 mapfile error.
+// Read the damaged areas one sector at a time in a random pattern
+//
+int Rescuebook::randcopy_errors( const char * const msg, const int pass,
+                              const bool resume, const bool forward )
+  {
+  long long pos = forward ? 0 : LLONG_MAX - hardbs();
+  bool block_found = false;
+  int have_sectors = 0;
+
+  if( resume && domain().includes( current_pos() ) )
+    {
+    Block b( current_pos() + ( forward ? 0 : -1 ), 1 );
+    if( forward ) find_chunk( b, Sblock::bad_sector, domain(), hardbs() );
+    if( !forward ) rfind_chunk( b, Sblock::bad_sector, domain(), hardbs()
);
+    if( b.size() > 0 ) pos = forward ? b.pos() : b.end();              //
resume at this entire block
+    }
+
+  if( !randomize ) randomize = new Randomize_details();
+
+  while( pos >= 0 )
+    {
+    // Generate a list of sector numbers to test, up to the maximum size of
the random sector list.
+    have_sectors = 0;
+    while( pos >= 0 && have_sectors < randomize->max_sectors )
+    {
+      Block b( pos, hardbs() );
+      if( forward ) find_chunk( b, Sblock::bad_sector, domain(), hardbs()
);
+      if( !forward ) rfind_chunk( b, Sblock::bad_sector, domain(), hardbs()
);
+      if( b.size() <= 0 ) { pos = -1; break; }                 // no more
blocks
+      block_found = true;
+      // Translate all the sectors in this block into the sector location
list
+      if( forward )
+        {
+        if( pos < b.pos() ) pos = b.pos();
+        while( pos < b.end() && have_sectors < randomize->max_sectors )
+          {
+          randomize->sector_list[ have_sectors++ ] = pos;
+          pos += hardbs();
+          }
+        }
+      else
+        {
+        if( pos >= b.end() ) pos = b.end() - hardbs();
+        while( pos >= b.pos() && have_sectors < randomize->max_sectors )
+          {
+          randomize->sector_list[ have_sectors++ ] = pos;
+          pos -= hardbs();
+          }
+        }
+    }
+
+    // Randomly reorder the sector list
+    // For portability just use the built-in libc rand.
+    for( int sector_index = 0; sector_index < (have_sectors-1);
sector_index++ )
+      {
+      // Swap places between [sector_index] and a random index >=
sector_index
+      int swap_with = rand() % (have_sectors - sector_index) +
sector_index;
+      if( sector_index != swap_with )
+        {
+        long long temp = randomize->sector_list[sector_index];
+        randomize->sector_list[sector_index] =
randomize->sector_list[swap_with];
+        randomize->sector_list[swap_with] = temp;
+        }
+      }
+
+    // Iterate over the randomized sector list and try to copy sector
contents
+    for( int sector_index = 0; sector_index < have_sectors; sector_index++
)
+      {
+      int copied_size = 0, error_size = 0;
+      Block b( randomize->sector_list[sector_index], hardbs() );
+      const int retval = copy_and_update( b, copied_size, error_size, msg,
+                                          retrying, pass, true );
+      if( retval ) return retval;
+      update_rates();
+      if( error_size > 0 && pause_on_error > 0 ) do_pause_on_error();
+      if( !update_mapfile( odes_ ) ) return -2;
+      }
+    }
+  if( !block_found ) return 0;
+  return -3;
+  }
+

 // Return true if slow read.
 //
diff -u ddrescue-1.26-original/rescuebook.h ddrescue-1.26-patch/rescuebook.h
--- ddrescue-1.26-original/rescuebook.h 2022-01-21 04:48:37.000000000 -0800
+++ ddrescue-1.26-patch/rescuebook.h    2022-06-17 10:35:15.898397900 -0700
@@ -42,6 +42,22 @@
     }
   };

+class Randomize_details
+  {
+public:
+  const int max_sectors = 65536;
+  long long * sector_list;
+
+  Randomize_details()
+    {
+      sector_list = new long long[max_sectors];
+    }
+  ~Randomize_details()
+    {
+      delete[] sector_list;
+    }
+
+  };

 struct Rb_options
   {
@@ -78,6 +94,7 @@
   bool sparse;
   bool try_again;
   bool unidirectional;
+  bool randomize_retry_order;
   bool verify_on_error;

   Rb_options()
@@ -91,7 +108,8 @@
       noscrape( false ), notrim( false ), reopen_on_error( false ),
       reset_slow( false ), retrim( false ), reverse( false ),
       same_file( false ), simulated_poe( false ), sparse( false ),
-      try_again( false ), unidirectional( false ), verify_on_error( false )
+      try_again( false ), unidirectional( false ), randomize_retry_order(
false ),
+      verify_on_error( false )
       {}

   bool operator==( const Rb_options & o ) const
@@ -120,6 +138,7 @@
                simulated_poe == o.simulated_poe &&
                sparse == o.sparse && try_again == o.try_again &&
                unidirectional == o.unidirectional &&
+               randomize_retry_order == o.randomize_retry_order &&
                verify_on_error == o.verify_on_error ); }
   bool operator!=( const Rb_options & o ) const
     { return !( *this == o ); }
@@ -155,6 +174,7 @@
   Sliding_average sliding_avg;         // variables for show_status
   bool first_post;                     // first read in current pass
   bool first_read;                     // first read overall
+  Randomize_details * randomize;

   void change_chunk_status( const Block & b, const Sblock::Status st );
   void do_pause_on_error();
@@ -182,6 +202,8 @@
   int copy_errors();
   int fcopy_errors( const char * const msg, const int pass, const bool
resume );
   int rcopy_errors( const char * const msg, const int pass, const bool
resume );
+  int randcopy_errors( const char * const msg, const int pass, const bool
resume, const bool forward );
+
   bool update_rates( const bool force = false );
   void show_status( const long long ipos, const char * const msg = 0,
                     const bool force = false );
Common subdirectories: ddrescue-1.26-original/testsuite and
ddrescue-1.26-patch/testsuite




reply via email to

[Prev in Thread] Current Thread [Next in Thread]