parbash

Modified version of BASH intended for text processing on computer clusters
Download

parbash Ranking & Summary

Advertisement

  • Rating:
  • License:
  • GPL
  • Price:
  • FREE
  • Publisher Name:
  • parbash Team
  • Publisher web site:
  • http://cloud-dev.blogspot.com/
  • Operating Systems:
  • Mac OS X
  • File Size:
  • 4.5 MB

parbash Tags


parbash Description

Modified version of BASH intended for text processing on computer clusters parbash is an open source extension to the BASH scripting language to enable scalable text processing over large data using distributed or multicore systems. parbash makes use of familiar shell text processing commands to process large files by transparently distributing the processing pipeline over multiple processors or multiple machines using Apache Hadoop, a map-reduce middleware for computer clusters.The main task of parbash is to manage the tedious interface gap between hadoop (map-reduce) and bash while letting the programmer focus on writing text processing scripts.parbash is best suited for computationally expensive processing, especially when it must be done over multi-gigabyte (or larger) files. At the time of writing, processing smaller files with hadoop is not practical due to large overhead in hadoop framework.Detailed installation and usage instructions are available here and here.


parbash Related Software