JavaScript

Adding pause, resume and filter features to File Walker

In the last tutorial we learned how to create a Node.js module using the exports statement and converted our FileWalker class into a module. Now we’ll extend the FileWalker class further by adding pause and resume functionality. We’ll also add two methods, filterDir and filterFile, to include or exclude directories and files.

The following code is the skeleton of our FileWalker module:

const fs = require('fs');
const path = require('path');
const EventEmitter = require('events');

class FileWalker extends EventEmitter {
 constructor () {}
 filterDir   () {}
 filterFile  () {}
 start       () {}
 pause       () {}
 resume      () {}
 next        () {}
}

Let’s start with the constructor method. The constructor is a special method that runs automatically when an object is created from a class. For example, var walker = new FileWalker(path,true); creates a new FileWalker object; the path and true values are passed to the constructor, so you don’t need to call the constructor separately. Let’s explore each method one by one.

Constructor

The FileWalker constructor method accepts two parameters:

  • entry: the initial path of the directory that you want to scan recursively
  • debug: enables debug output on the console. Any truthy value turns it on.

constructor (entry,debug){
 super();
 this.isPaused = false;
 this.queue = [];
 this.debug = debug ? true : false;
 this.filter_dir = () => false;
 this.filter_file = () => false;
 this.start(entry); 
}

Properties set by the constructor

We won’t change the values of these properties directly.

this.isPaused holds the current status of our Duplicate File Finder app. The default value is false; it becomes true while the app is paused. To change its value we’ll use the pause() and resume() methods.

this.queue is an array. It stores the output of readdir from the start() method for later processing.

this.debug controls debug output: when its value is true, debugging messages are printed on the console. To enable debugging for your app, see the following code:

let debug = true;
let path = '/a/dir/path';
let walker = new FileWalker(path,debug);
//let walker = new FileWalker(path); //no debugging

this.filter_dir is a function that returns false by default. It is called for every directory, and a directory is not processed if the function returns true. You can set the filter_dir function using the filterDir() method. For example:

let walker = new FileWalker(path);
walker.filterDir( (dir,stat) => {
 //this directory will not be scanned
 if (dir == '/do/not/scan/path'){
  return true;
 }
 return false;
/* or one liner code
 return dir == '/do/not/scan/path';
*/
} )

this.filter_file is similar to this.filter_dir, except that it is called for every file.

this.start(entry) calls the start method to start the app with the entry (the initial directory path).
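Because the constructor already calls start(), it may look as if a filter registered afterwards arrives too late. It does not: the fs.lstat callback only runs on a later turn of the event loop, after the synchronous code that created the walker has finished. The following is a minimal sketch of that ordering. MiniWalker is a hypothetical stand-in (not part of walker.js), and os.tmpdir() is used only as an arbitrary directory that is assumed to exist.

```javascript
const fs = require('fs');
const os = require('os');

// MiniWalker is a hypothetical stand-in that keeps only the parts of
// FileWalker needed to show the ordering.
class MiniWalker {
  constructor(entry) {
    this.filter_dir = () => false;  // default: nothing is skipped
    this.result = 'pending';
    // The callback is queued; it cannot run before the constructor returns.
    fs.lstat(entry, (err, stat) => {
      if (err) { this.result = 'error'; return; }
      this.result = this.filter_dir(entry, stat) ? 'skipped' : 'scanned';
      console.log(entry, '->', this.result);
    });
  }
  filterDir(fn) { this.filter_dir = fn; }
}

const mini = new MiniWalker(os.tmpdir());
mini.filterDir(() => true);   // registered after construction, still applied
console.log(mini.result);     // still 'pending' at this point
```

This is why the filter functions can be provided immediately after creating the instance, but should not be registered later from a timer or another callback.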

filterDir (fn) Include or exclude folders

The FileWalker’s filterDir method accepts one parameter:

fn: a function, optionally provided immediately after creating the new FileWalker instance.

filterDir (fn){
 this.filter_dir = fn;
}

You can use this method to include or exclude directories. For example, suppose you want to scan the path /usr/bell but not its sub-directories:

let myPath = '/usr/bell';
let walker = new FileWalker(myPath);
walker.filterDir(function(dir,stat){
 if (dir == myPath){
  return false;
 }
 return true;
/* or one liner code
 return dir != myPath;
*/
});

Note: return true tells the app not to scan the directory, and return false tells the app to scan it.
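Comparing the whole path works for a single directory. To skip directories by name wherever they appear, the callback can look at the last path segment instead. The sketch below is only an illustration: skipNoisyDirs and the chosen names (node_modules, hidden directories) are assumptions, not part of the FileWalker module.

```javascript
const path = require('path');

// Hypothetical filter: skip hidden directories and any directory named
// node_modules. Returning true means "do not scan", as in filterDir.
function skipNoisyDirs(dir, stat) {
  const name = path.basename(dir);
  return name === 'node_modules' || name.startsWith('.');
}

console.log(skipNoisyDirs('/usr/bell/node_modules')); // true
console.log(skipNoisyDirs('/usr/bell/.git'));         // true
console.log(skipNoisyDirs('/usr/bell/src'));          // false
```

It would be registered with walker.filterDir(skipNoisyDirs). Keep in mind that the initial path passes through the same filter as every other directory.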

You can also use the Node.js stat object to filter directories further:

let myPath = '/usr/bell';
let walker = new FileWalker(myPath);
walker.filterDir( (dir,stat) => {
 let mt = Math.round(stat.mtimeMs);
 if (mt > 1536356096483)
  return true;
 return false;
/* or two liner code
 let mt = Math.round(stat.mtimeMs);
 return mt > 1536356096483;
*/ 
});

In the above example, we read the directory’s modification time in milliseconds, round it using Math.round(), and compare it against a fixed timestamp in milliseconds, so a directory modified after that time will not be scanned.
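A hard-coded timestamp is easy to get wrong. One possible alternative, sketched below, derives the cutoff from "now minus N days". The helper name skipRecentlyModified and its parameters are hypothetical, and the stat objects in the demo are plain stand-ins that only carry mtimeMs.

```javascript
// Hypothetical helper: build a filterDir callback that skips directories
// modified within the last `days` days. `now` can be injected for testing.
function skipRecentlyModified(days, now = Date.now()) {
  const cutoff = now - days * 24 * 60 * 60 * 1000;
  return (dir, stat) => stat.mtimeMs > cutoff;   // true = do not scan
}

const dayMs = 24 * 60 * 60 * 1000;
const now = Date.UTC(2018, 8, 7);                // fixed clock for the demo
const filter = skipRecentlyModified(7, now);

console.log(filter('/a', { mtimeMs: now - 1 * dayMs }));   // true
console.log(filter('/b', { mtimeMs: now - 30 * dayMs }));  // false
```

With a real walker this would be used as walker.filterDir(skipRecentlyModified(7)).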

filterFile (fn) Include or exclude files

The filterFile method works much like the filterDir method. The only difference is that filterFile filters files, while filterDir filters directories (folders). The filterFile method accepts one parameter:

fn: a function, optionally provided immediately after creating the new FileWalker instance.

filterFile (fn){
 this.filter_file = fn;
}

The following example demonstrates how to filter files by size:

walker.filterFile((file,stat)=>{
 //include only files smaller than 500 KB (500000 bytes)
 if (stat.size < 500000){
  return false;
 }
 return true;
/* or one liner code
 return stat.size >= 500000;
*/
});
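Size is only one of the things a file filter can test. As a further illustration, the sketch below combines the file extension with the size. The helper makeImageFilter, the extension list and the stub stat objects are assumptions made for the example.

```javascript
const path = require('path');

// Hypothetical filter factory: keep only image files of at most maxBytes.
function makeImageFilter(maxBytes) {
  const allowed = new Set(['.jpg', '.jpeg', '.png', '.gif']);
  return (file, stat) => {
    const ext = path.extname(file).toLowerCase();
    return !allowed.has(ext) || stat.size > maxBytes;  // true = exclude
  };
}

const imageFilter = makeImageFilter(500000);
console.log(imageFilter('/pics/cat.JPG', { size: 1200 }));     // false (kept)
console.log(imageFilter('/pics/notes.txt', { size: 1200 }));   // true
console.log(imageFilter('/pics/huge.png', { size: 900000 }));  // true
```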

start (path) Starting up the app

The start method is the heart of our FileWalker class. It is responsible for all fs processing: it reads a directory and its sub-directories, and includes or excludes directories and files by executing the filter_dir and filter_file functions set up through the filterDir and filterFile methods.

The start method accepts one parameter:

entry: the path to process.

start (entry) {
 fs.lstat(entry, (err,stat) => {
  if (err){
   this.debug&&console.log('Error stat: '+entry);
   this.emit('error',err,entry,stat);
   this.next();
   return this;
  }

  if (stat.isFile()){
   if (this.filter_file(entry,stat)){
    this.debug&&console.log('filterFile: '+entry);
    return this.next();
   }
   this.debug&&console.log('File: '+entry);
   this.emit('file',entry,stat);
   this.next();
  }
  
  else if (stat.isDirectory()){
   if (this.filter_dir(entry,stat)){
    this.debug&&console.log('filterDir: '+entry);
    return this.next();
   }
   this.debug&&console.log('Dir: '+entry);
   this.emit('dir',entry,stat);
   fs.readdir(entry, (err,files) => {
    if (err){
     this.debug&&console.log('Error readdir: '+entry);
     this.emit('error',err,entry,stat);
     //keep walking; without this call the walk would stop here
     return this.next();
    }

    Array.prototype.push.apply(this.queue, files.map( file => {
     return path.join(entry,file);
    }));

    this.next();
   });
  }
  
  else {
   this.debug&&console.log('unknown or inaccessible: '+entry);
   this.emit('unknown',entry,stat);
   this.next();
  }
 });
}

fs.lstat(entry, (err,stat) => ...) reads the stat of the entry. If an error occurs, this.debug&&console.log('Error stat: '+entry); prints a message on the console when the debug option is enabled, and then this.emit('error',err,entry,stat); emits the error event so that the error can be handled further. For example:

let walker = new FileWalker(path); 
walker.on('error', (err,entry,stat) => {
 //do something
});
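Registering an error listener is more than a convenience. A Node.js EventEmitter throws the error itself when an error event is emitted and no listener is attached. The sketch below shows both cases with plain EventEmitter objects; the error message and path are made up for the demonstration.

```javascript
const EventEmitter = require('events');

const silent = new EventEmitter();
let threw = false;
try {
  silent.emit('error', new Error('lstat failed'));   // no listener registered
} catch (e) {
  threw = true;                                      // the emitter threw the error
}

const handled = new EventEmitter();
const seen = [];
handled.on('error', (err, entry) => seen.push(entry + ': ' + err.message));
handled.emit('error', new Error('lstat failed'), '/missing/path');

console.log(threw);   // true
console.log(seen);    // [ '/missing/path: lstat failed' ]
```

Since FileWalker extends EventEmitter, the same rule applies to it, so a walker without an error listener would crash on the first unreadable entry.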

this.next(); calls the next method, which retrieves the next path (entry) from the queue for processing.

Next, stat.isFile() returns true if the entry is a file. this.filter_file(entry, stat) then returns either true or false; if true, we skip this file, do not emit the file event, and move on to the next() entry.

stat.isDirectory() returns true if the entry is a directory. this.filter_dir(entry, stat) then returns true or false; we do not process the directory if it returns true. If it returns false, the dir event fires and we retrieve the entries of the directory using the fs.readdir method. Again, if an error occurs while reading the directory, the error event fires.

When we read a directory we receive file names, not their full paths, so we use the path.join method to join the current directory path with each retrieved name. We store the whole result in the queue array for later processing and call the next() method to process the next entry.
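The queueing step can be tried in isolation. In the sketch below the directory path and file names are invented sample values standing in for what readdir would return; the spread form of push has the same effect as the Array.prototype.push.apply call in start().

```javascript
const path = require('path');

const queue = ['/usr/bell/readme.txt'];   // entries already waiting
const dir = '/usr/bell/docs';             // directory that was just read
const names = ['a.txt', 'b.txt'];         // sample of what readdir returns

// Turn the bare names into full paths and append them to the queue.
queue.push(...names.map(name => path.join(dir, name)));

console.log(queue);
```

The printed separators depend on the platform, because path.join uses the separator of the operating system it runs on.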

Since our main focus is on directories and files, we do not process any other entry types and emit them as the unknown type.
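One consequence of using fs.lstat is that a symbolic link is reported as the link itself rather than its target, so it is neither a file nor a directory and ends up in the unknown branch. A listener for the unknown event could name the type along the following lines. The describeUnknown helper and the stub objects are hypothetical; the is...() methods are the ones Node.js provides on fs.Stats.

```javascript
// Hypothetical helper for an 'unknown' listener: name the entry type.
function describeUnknown(stat) {
  if (stat.isSymbolicLink()) return 'symlink';
  if (stat.isFIFO()) return 'fifo';
  if (stat.isSocket()) return 'socket';
  if (stat.isBlockDevice() || stat.isCharacterDevice()) return 'device';
  return 'other';
}

// Stand-in for fs.Stats so the demo does not need real links or devices.
const stub = kind => ({
  isSymbolicLink: () => kind === 'symlink',
  isFIFO: () => kind === 'fifo',
  isSocket: () => kind === 'socket',
  isBlockDevice: () => false,
  isCharacterDevice: () => kind === 'device',
});

console.log(describeUnknown(stub('symlink')));  // symlink
console.log(describeUnknown(stub('device')));   // device
```

With a real walker it would be attached as walker.on('unknown', (entry, stat) => console.log(entry, describeUnknown(stat))).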

next() Retrieve next path from the queue

next(){
 if (this.isPaused) {
  this.emit('pause');
  this.debug&&console.log('isPaused');
  return this;
 }
 let nextEntry = this.queue.shift();
 if (!nextEntry){
  this.emit('done');
  this.debug&&console.log('Done');
  return this;
 }
 this.start(nextEntry);
}

The next method emits the pause event if the app is paused and does not execute any further code.

this.queue.shift() returns the next entry; if no entries are left in the queue, the done event fires.
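The interplay of start() and next() is easier to see without callbacks. The following is a synchronous sketch of the same queue-driven loop, not the module's actual code: walkSync is a hypothetical function, and the small directory tree is created in the system temp folder only for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a small throwaway tree: root/a.txt and root/sub/b.txt
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'fw-'));
fs.writeFileSync(path.join(root, 'a.txt'), 'a');
fs.mkdirSync(path.join(root, 'sub'));
fs.writeFileSync(path.join(root, 'sub', 'b.txt'), 'b');

// Synchronous sketch of the loop: shift one entry, handle it, repeat.
function walkSync(entry) {
  const queue = [entry];
  const files = [];
  while (queue.length > 0) {
    const current = queue.shift();            // first in, first out
    const stat = fs.lstatSync(current);
    if (stat.isFile()) {
      files.push(current);
    } else if (stat.isDirectory()) {
      for (const name of fs.readdirSync(current)) {
        queue.push(path.join(current, name));
      }
    }
  }
  return files;
}

const found = walkSync(root).map(file => path.relative(root, file));
console.log(found);
fs.rmSync(root, { recursive: true, force: true });  // clean up the demo tree
```

Because entries leave the queue in the order they were added, everything in a directory is visited before anything in its sub-directories, which is why a.txt is reported before sub/b.txt.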

pause() – Pausing the app

pause(){
 if (this.isPaused === true){
  this.debug&&console.log('isPaused failed. App was already paused');
  return this; 
 }
 this.isPaused = true;
}

This method changes the value of this.isPaused to true, so the next method will not retrieve the next entry from the queue. For example:

let walker = new FileWalker(path);
setTimeout( () => {
 walker.pause();
}, 10);

resume() – Resuming the app

resume(){
 if (this.isPaused === false){
  this.debug&&console.log('Resume failed. App was already resumed');
  return;
 }
 this.isPaused = false;
 this.debug&&console.log('Resume');
 this.next();
}

The resume method changes the value of this.isPaused to false and then calls this.next() to retrieve the next entry. For example:

let walker = new FileWalker(path);
setTimeout( () => {
 walker.pause();
}, 10);

setTimeout( () => {
 walker.resume();
}, 50);
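The timers above depend on how long the scan takes. An alternative is to drive pause and resume from events. The sketch below uses QueueRunner, a hypothetical class that keeps only the isPaused, queue, next, pause and resume logic and replaces the file system work with setImmediate; the item event and the seen array exist only for the demonstration.

```javascript
const EventEmitter = require('events');

// QueueRunner is a hypothetical stand-in for FileWalker without any fs calls.
class QueueRunner extends EventEmitter {
  constructor(items) {
    super();
    this.isPaused = false;
    this.queue = items.slice();
    this.seen = [];
    setImmediate(() => this.next());          // start asynchronously
  }
  pause() { this.isPaused = true; }
  resume() {
    if (!this.isPaused) return;
    this.isPaused = false;
    this.next();
  }
  next() {
    if (this.isPaused) { this.emit('pause'); return; }
    const item = this.queue.shift();
    if (item === undefined) { this.emit('done'); return; }
    this.seen.push(item);
    this.emit('item', item);
    setImmediate(() => this.next());          // stands in for lstat/readdir
  }
}

const runner = new QueueRunner(['a', 'b', 'c']);
runner.on('item', item => { if (item === 'a') runner.pause(); });
runner.once('pause', () => setTimeout(() => runner.resume(), 20));
runner.on('done', () => console.log(runner.seen.join(',')));
```

In this sketch resume() is only called after the pause event has fired. That looks like the safest moment, because no earlier step is still in flight at that point, so next() cannot end up running twice in parallel.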

Using FileWalker Module

Let’s import the FileWalker module by providing its path and create a new FileWalker object, as shown below:

let Walker = require('./walker.js');
let walker = new Walker('D:\\',true);

walker.filterDir((dir,stat) => {
 //do not scan following folder
 return dir == 'D:\\brainBell';
});

walker.filterFile((file,stat)=>{
 //exclude files greater than 500kb
 return stat.size > 500000;
});

walker.on('error', (err,entry,stat) => {
 //an error occurred
})

walker.on('dir', (dir,stat) => {
 //a directory found
})

walker.on('file', (file,stat) => {
 //a file found
})

walker.on('pause', () =>{
 //app has been paused
})

walker.on('done', () => {
 //walker finished its work
})

setTimeout( () => {
 walker.pause();
}, 10);

setTimeout( () => {
 walker.resume();
}, 100);

How to run FileWalker on your pc

Download the walker.js and trywalker.js files into the same folder, e.g. d:\fw, and run node trywalker at the command prompt:

D:\fw>node trywalker

Downloads

  1. walker.js
  2. trywalker.js

In the next tutorial we'll learn about Node.js child processes by running our FileWalker module in the background (as a child process).